*** rpittau|afk is now known as rpittau | 07:02 | |
qwebirc64230 | hi | 07:44 |
qwebirc64230 | i'm running the ansible-playbook setuphost.yml | 07:45 |
qwebirc64230 | but getting an error: no host found | 07:45 |
qwebirc64230 | though i have set up the host ip in the user-config yaml file. | 07:45 |
qwebirc64230 | anything else that i can check for this issue | 07:47 |
jrosser | qwebirc64230: you can paste any interesting output to paste.opendev.org if you want someone to look at it | 07:49 |
jrosser | also you can check that the inventory is parsed properly with the script openstack-ansible/scripts/inventory-manage.py | 07:50 |
qwebirc64230 | [root@tstosdep01 scripts]# ./inventory-manage.py --list-host +----------------------------------------------+----------+--------------------------+---------------+----------------+----------------+--------------------------+ | container_name | is_metal | component | physical_host | tunnel_address | ansible_host | container_types | +----------------------------------------------+------- | 07:56 |
qwebirc64230 | i could see the containers | 07:57 |
qwebirc64230 | i could see the containers details | 07:57 |
qwebirc64230 | but the ip addresses are different from the ones in the openstack-user-config file | 07:57 |
qwebirc64230 | 172.29.236.14 | 07:57 |
qwebirc64230 | in my openstack user config yaml i have the CIDR mentioned: 10.0.160.0/24 | 08:02 |
qwebirc64230 | but if i run ./inventory-manage.py --list-host i can see the container ip addresses are in a different range: 172.29.236.14 | 08:03 |
qwebirc64230 | still, if i run the setup host playbook i get an error: no matching host found | 08:03 |
jrosser | can you paste the openstack user config file to paste.opendev.org? | 08:13 |
qwebirc64230 | done | 08:14 |
jrosser | you'll need to share the url here | 08:15 |
qwebirc64230 | https://paste.opendev.org/show/807754/ | 08:20 |
qwebirc64230 | my infra hosts are in a different subnet, like 10.0.16.0/22 | 08:24 |
qwebirc64230 | and my CIDRs are in a different subnet | 08:24 |
qwebirc64230 | maybe that is causing the issue | 08:24 |
jrosser | sorry i have a meeting for a while - back later | 08:32 |
qwebirc64230 | no worries take ur time | 08:33 |
*** zbr is now known as Guest2544 | 09:09 | |
jrosser | qwebirc64230: i don't think you need cidr_networks defined inside global_overrides | 09:23 |
jrosser | also you should think about reserving IP addresses in your subnets with used_ips | 09:23 |
jrosser | because it is coming up with addresses in 172.29.236.x i would guess that this is a host you've previously done a deployment on and have now changed the networking to use 10.x addresses instead | 09:24 |
jrosser | the only way that 172.29.236.x addressing can be used is if it is (or used to be) set up that way in openstack_user_config | 09:25 |
jrosser | there are state files kept in /etc/openstack_deploy, and those are read to display the output when you use the inventory-manage script | 09:26 |
jrosser | that's why I think you have an older config and your current one somehow mixed up together | 09:26 |
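For reference, a minimal sketch (not from the log) of how that state on the deploy host can be inspected; the filenames shown are the usual OSA ones, but names and JSON keys may differ between releases, and removing the state is only sensible on an environment that has not actually been deployed yet:

```sh
# Inspect the generated inventory state that inventory-manage.py reads.
ls -l /etc/openstack_deploy/openstack_inventory.json \
      /etc/openstack_deploy/openstack_hostnames_ips.yml

# See which management addresses have already been assigned to containers
# (key name may differ by release).
grep -o '"container_address": "[^"]*"' /etc/openstack_deploy/openstack_inventory.json | sort -u

# Only on a fresh, not-yet-deployed environment: remove the stale state so
# the inventory is regenerated from the current openstack_user_config.yml.
# rm /etc/openstack_deploy/openstack_inventory.json \
#    /etc/openstack_deploy/openstack_hostnames_ips.yml
```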
qwebirc64230 | is it compulsory to provide the used IPs? | 09:35 |
qwebirc64230 | ideally it should pick the free ip addresses within the CIDR | 09:36 |
qwebirc64230 | i have removed the generated files, which are the ip address and inventory ones | 09:37 |
qwebirc64230 | and now the playbook picks up my CIDR range | 09:37 |
qwebirc64230 | so should the infra hosts be in the CIDR range specified at the top, or can i have a different subnet for my infra hosts? | 09:39 |
*** sshnaidm|afk is now known as sshnaidm | 09:45 | |
jrosser | qwebirc64230: it does not know which IPs you have used for routers, other hardware, or ssh bastions in those subnets, which are out of the scope of OSA | 09:47 |
jrosser | it does pick free IP addresses in the subnets, but if you want the first 10(?) IPs to be reserved for default gateway / vrrp / whatever.... then that's what used_ips is for | 09:48 |
jrosser | qwebirc64230: you can have one subnet that is used for ssh (and by extension ansible) to access your hosts, like your 10.0.16.x | 09:49 |
jrosser | and you can use completely different ones for mgmt / storage / tunnel, however you like | 09:50 |
jrosser | this is all very flexible and there's no particularly right or wrong answer | 09:50 |
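As an illustration only (the mgmt subnet is the one mentioned above; the other subnets and the reservation ranges are made up), an openstack_user_config.yml fragment with cidr_networks at the top level and used_ips reserving addresses OSA must not hand out could look roughly like this:

```yaml
---
cidr_networks:
  container: 10.0.160.0/24   # mgmt network the containers get addresses from
  tunnel: 10.0.161.0/24      # hypothetical overlay/tunnel subnet
  storage: 10.0.162.0/24     # hypothetical storage subnet

used_ips:
  # Keep the gateway, VRRP addresses and any other non-OSA hosts out of the
  # pool OSA assigns container IPs from; each entry is an IP or "start,end".
  - "10.0.160.1,10.0.160.20"
  - "10.0.161.1,10.0.161.20"
  - "10.0.162.1,10.0.162.20"
```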
qwebirc64230 | can i use 1 subnet for all of management, storage and tunneling | 09:50 |
qwebirc64230 | is that possible? | 09:51 |
jrosser | most deployments make the mgmt network the same subnet as the physical hosts, but that certainly is not a fixed rule and there are plenty that don't | 09:51 |
jrosser | you can perhaps collapse it all down to one network but i think there is much more risk of that not working, as generally no-one does that | 09:52 |
jrosser | highly untested | 09:52 |
qwebirc64230 | should the deployment node where i have my ansible placed be in the same subnet as the mgmt network? | 09:52 |
jrosser | it does not have to be | 09:52 |
jrosser | it needs to be able to ssh to the bare metal hosts on the IP addresses you specify in openstack_user_config | 09:52 |
jrosser | you can make this be the mgmt network if you want | 09:52 |
jrosser | or you can have a subnet dedicated to ssh / bare metal access - totally flexible here | 09:53 |
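For example, the per-host ip entries in openstack_user_config.yml are the addresses Ansible reaches the bare metal hosts on, so they can sit in a separate subnet (the 10.0.16.x ssh network mentioned above) from the container mgmt CIDR. A hypothetical sketch; the host names and addresses are illustrative:

```yaml
shared-infra_hosts:
  infra1:
    ip: 10.0.16.11   # ssh/ansible address of the bare metal host
  infra2:
    ip: 10.0.16.12
  infra3:
    ip: 10.0.16.13
```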
qwebirc64230 | got it | 09:53 |
jrosser | for example in my deployments PXEboot and ssh happen on a network that's not the mgmt network, they're separate | 09:54 |
jrosser | this can be set up to fit around how you want to manage the hosts | 09:54 |
qwebirc64230 | cool | 10:14 |
opendevreview | Andrew Bonney proposed openstack/ansible-role-python_venv_build master: Add distro/arch to requirements file path https://review.opendev.org/c/openstack/ansible-role-python_venv_build/+/801738 | 10:46 |
jrosser | oh no they did it again GPG error: https://packages.erlang-solutions.com/ubuntu focal Release: The following signatures were invalid: BADSIG D208507CA14F4FCA Erlang Solutions Ltd. <packages@erlang-solutions.com> | 11:27 |
evrardjp | good morning | 11:31 |
evrardjp | mmm, that's not fun jrosser... :( | 11:31 |
evrardjp | Did they propose a new gpg signature file? | 11:31 |
jrosser | they do it every time :/ | 11:31 |
jrosser | no, when they release new packages they snafu up their repo | 11:32 |
evrardjp | at some point, I recall, I wanted to catch up with the system-provided rabbitmq, so that we wouldn't have to care about those anymore... | 11:32 |
evrardjp | I suppose that didn't happen? :) | 11:32 |
jrosser | oh well we just got into a big big mess here with that on a bionic -> focal upgrade | 11:32 |
jrosser | so sticking with the same version is most preferable | 11:33 |
evrardjp | :D | 11:33 |
evrardjp | understandable | 11:33 |
* jrosser lunchtime | 11:34 | |
evrardjp | I was checking our osa integrated repo, and I am surprised (well, considering I am old!) to see a collection requirements file.. | 11:34 |
jrosser | oh yes we went in fully on collections | 11:35 |
evrardjp | I see we now have in tree roles, out of tree roles, collections | 11:35 |
jrosser | it decouples the ansible version from the module version and is very nice | 11:35 |
jrosser | we can use up to date openstack modules without needing bleeding-edge ansible | 11:35 |
evrardjp | oh that's interesting, but what's the difference compared to having it in tree via a submodule/subtree? | 11:36 |
evrardjp | I don't see ceph-ansible using collections | 11:36 |
evrardjp | (and will we move all roles to collections?) | 11:36 |
evrardjp | is there a documented "future" state of collections for OSA, so I can understand it a bit better? | 11:37 |
jrosser | hmm? | 11:50 |
jrosser | so far this is just for ansible modules | 11:50 |
jrosser | we install ansible-base which doesn't give you any modules at all | 11:51 |
evrardjp | ok | 11:51 |
evrardjp | is there a plan to move all our roles into a certain collection? | 11:51 |
jrosser | so then everything that's actually needed, like openstack / rabbitmq / mysql etc, has to be specified in the collection requirements file | 11:51 |
evrardjp | ok so that's not really using galaxy for the collections, it's just manually done, did I get that right? | 11:52 |
jrosser | it uses the galaxy cli to install them, but the locations are git repos rather than the published collections | 11:53 |
jrosser | https://github.com/openstack/openstack-ansible/blob/4d6c3a2ec743e149505e5b9c936dacee6d6d4379/scripts/get-ansible-collection-requirements.yml#L54-L62 | 11:53 |
jrosser | that's because the service that sits behind ansible galaxy is not sufficiently reliable for our CI use case | 11:54 |
evrardjp | haha, same when we moved to use galaxy for a-r-r, if you remember ;) | 11:54 |
jrosser | soooooo many job failures in the past that we switched to installing from git repos hosted elsewhere | 11:54 |
jrosser | the collections documentation discourages installation from git sources, but tbh real life has said otherwise | 11:55 |
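The requirements file linked above follows the standard ansible-galaxy format, just with git URLs instead of published collection names. A shortened, illustrative example (the versions here are placeholders, not the values actually pinned in the repo):

```yaml
collections:
  - name: https://opendev.org/openstack/ansible-collections-openstack
    version: master        # placeholder; a branch, tag or SHA can be pinned
    type: git
  - name: https://github.com/ansible-collections/community.general
    version: 3.3.0         # placeholder version
    type: git
```

Installation then goes through the normal CLI, e.g. `ansible-galaxy collection install -r requirements.yml`.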
jrosser | evrardjp: would be good to have you reviewing code again :) | 11:58 |
evrardjp | jrosser: I sadly don't have time for that! But I can bring more people to the table, which isn't bad either :) | 12:13 |
evrardjp | jrosser: yes I am not surprised about the "do not install from git sources". But I am more puzzled nowadays about how we managed to make ansible more complex than it should be ... | 12:14 |
* evrardjp shrugs | 12:14 | |
jrosser | yeah, i find the "roles in collections" thing kind of hard to understand | 12:15 |
jrosser | too used to things just being simple and in git repos i guess | 12:15 |
evrardjp | well, I will be honest and explain why I am here: I feel things are too complex, and I want to help simplify. Maybe not me directly, but at least indirectly. But for that I need proper input from the community ;) | 12:18 |
evrardjp | Are you still using ceph-ansible jrosser ? | 12:19 |
jrosser | we are, though we deploy outside of OSA framework | 12:19 |
evrardjp | I am wondering if we shouldn't be simple like them... finally admit an ansible.cfg in our repo ;) | 12:19 |
evrardjp | yeah that's what City Network is doing too | 12:19 |
evrardjp | I find it "good enough" :) | 12:20 |
jrosser | imho ceph-ansible is not a good example | 12:20 |
evrardjp | oh ? | 12:20 |
evrardjp | do you have a better example? | 12:20 |
jrosser | looked at from a step back "is this stuff going to change randomly and break my stuff?" -> probably | 12:20 |
evrardjp | (Fun story, I dug up one old documentation of an "ideal wrapper" that I wanted to bring in RAX long ago) | 12:21 |
evrardjp | well, that's not linked to the structure, that's linked to the content ;) | 12:21 |
jrosser | there's a lot of tech-debt to address in general though | 12:23 |
jrosser | so we can keep busy with things related to SSL, the secrets store, and finally losing openstack-ansible-tests <- very very close now | 12:24 |
jrosser | would be interesting to hear what you think is too complex | 12:25 |
evrardjp | it's the management of x clouds from a CI perspective | 12:26 |
evrardjp | I would be happy to hear what you do there :) | 12:26 |
jrosser | you mean getting toward CD? | 12:29 |
evrardjp | correct | 12:31 |
evrardjp | well, CD is relatively easy: Just run the plays in prod | 12:31 |
evrardjp | proper testing of the plays in a pre-prod environment matching prod is always a challenge... | 12:32 |
evrardjp | I was seeing this problem at RAX, but we structured things differently than CN, so.... | 12:32 |
evrardjp | I am just wondering how the rest of the people are doing | 12:32 |
evrardjp | (one of the technical annoyances is managing multiple repos with multiple sources for multiple environments, and simplifying this sounds key in our case) | 12:33 |
jrosser | from an OSA perspective i think that one of the toughest things is that it's a toolbox | 12:38 |
jrosser | we run a quite large pre-prod environment which is as close to production as we can make it | 12:39 |
jrosser | though it's on an internal network rather than the internet, which makes that harder | 12:39 |
jrosser | but it is a very large overhead, in fact right now we're rebuilding it to address divergence between it and the prod environment | 12:40 |
jrosser | for multiple environments we create a virtualised deploy host per environment | 12:47 |
jrosser | and those are reproducible, i.e. we can destroy one and bring it back including all the state if necessary | 12:47 |
jrosser | ultimately i guess there still needs to be some sort of manifest for that, the OSA SHA, the role SHA, the collection versions, overrides for all our forked repos (doh!) | 12:50 |
opendevreview | Satish Patel proposed openstack/openstack-ansible-os_neutron master: Change OVN metadata protocol to https https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/802135 | 12:51 |
spatel | jrosser do you want me to add a centos-8-stream job in this place also? - https://github.com/openstack/openstack-ansible/blob/master/zuul.d/jobs.yaml#L485-L548 | 12:54 |
jrosser | spatel: for the OVN job? | 12:55 |
spatel | for ovs and ovn both | 12:56 |
jrosser | for those i think they only need to go in the os_neutron repo | 12:56 |
spatel | ok i have already added stream in the os_neutron repo, just need to commit; this is what i'm going to add - https://paste.opendev.org/show/807765/ | 12:57 |
spatel | That is all we need, right? i think we should add an ovn job for stream as well | 12:59 |
jrosser | oh, well at the moment we are only running ovn jobs on focal though? | 13:01 |
jrosser | i would make small patches that do one thing, like switching the existing centos-8 OVS jobs to stream | 13:02 |
jrosser | then if you want to add centos jobs for OVN make that a separate one, there might be tons of stuff to fix there | 13:02 |
spatel | got it | 13:03 |
spatel | i will keep testing stream in the lab for ovn and then push it to CI | 13:03 |
jrosser | looks like the calico job is broken | 13:03 |
spatel | let me just add ovs at present | 13:03 |
jrosser | sure | 13:04 |
spatel | I have noticed the calico job, i will try to see what is going on.. looks like tempest is complaining | 13:04 |
jrosser | yeah, also that's still on bionic | 13:04 |
jrosser | it really should be focal but i'm guessing there is some error in the neutron services | 13:05 |
opendevreview | Satish Patel proposed openstack/openstack-ansible-os_neutron master: Adding centos-8-stream job for ovs https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/802701 | 13:05 |
spatel | i will take a look at that... | 13:06 |
jrosser | if you want to just switch the centos-8 jobs over to stream there is no need to keep the old ones | 13:06 |
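For what it's worth, switching such a job usually amounts to pointing it at the stream nodeset in the repo's zuul config. A hypothetical fragment only — the job and parent names below are illustrative, not the actual OSA job names:

```yaml
- job:
    name: openstack-ansible-deploy-aio_ovs-centos-8-stream   # hypothetical name
    parent: openstack-ansible-deploy-aio                     # hypothetical parent
    nodeset: centos-8-stream
```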
anskiy | ovn with stream is a little bit broken now | 13:07 |
spatel | it turned out to be an easy fix for the ovn-metadata patch - https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/802135/2/templates/neutron_ovn_metadata_agent.ini.j2 | 13:07 |
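The change linked above templates the OVN metadata agent configuration; the relevant knobs are the standard neutron metadata-agent options, roughly as below (illustrative values, not the exact patch content):

```ini
[DEFAULT]
# Point the OVN metadata agent at nova metadata over TLS instead of plain http.
nova_metadata_protocol = https
nova_metadata_host = <internal VIP>
nova_metadata_port = 8775
```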
spatel | jrosser should i remove centos-8 then? | 13:07 |
spatel | i thought we would keep it for some time, but if you want i can remove it and just keep stream | 13:07 |
jrosser | part of the work to do for the next release is to remove centos-8 support | 13:08 |
jrosser | so may as well start | 13:08 |
jrosser | but like i said yesterday, keep an eye on what is happening for wallaby | 13:09 |
jrosser | for example see here, stable/wallaby is broken for OVN jobs right now https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/798881 | 13:10 |
spatel | jrosser this is broken because of the hostname and this patch will fix it https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/802134 | 13:11 |
jrosser | ok so the thing preventing that merging is calico | 13:11 |
spatel | yes only calico is holding it back.. i can take a look at that, otherwise we can set it non-voting | 13:12 |
anskiy | I'm currently experiencing a problem with the OVN service name on stream, like, there is no ovn-central | 13:12 |
jrosser | anskiy: is this a real deployment or in an AIO? | 13:14 |
anskiy | it's non-aio lab in VMs | 13:14 |
jrosser | if you are able to reproduce this in an AIO it would be super useful | 13:14 |
evrardjp | instead of making it non-voting, it's probably a good idea to call on the maintainers to fix it... I am pretty sure some OSA users are using calico, and they don't want it to break. | 13:14 |
jrosser | anskiy: then we can compare it really directly with what is happening in our CI and get it fixed quickly | 13:15 |
jrosser | and it would also give us a data point: if it does work in AIO but not multinode, then it points to different kinds of bugs | 13:15 |
anskiy | jrosser: well, I kinda have a fix, but it's not complete because the path on rpm-based dists is /etc/sysconfig/<something>, not /etc/default: https://paste.opendev.org/show/807766/ | 13:16 |
anskiy | so I've just manually symlinked it to continue my tests :) | 13:16 |
spatel | jrosser i am removing centos-8 and re-committing patch | 13:17 |
spatel | let's remove what is going to die anyway in a few months | 13:17 |
opendevreview | Satish Patel proposed openstack/openstack-ansible-os_neutron master: Adding centos-8-stream job for ovs https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/802701 | 13:18 |
spatel | jrosser what is the calico build for? does it do something different? sorry, i have little knowledge around that :) | 13:29 |
spatel | jrosser looks like some metadata service is failing for the calico build based on the logs - curl http://169.254.169.254/latest/meta-data/public-ipv4' failed, exit status: 52 | 13:37 |
jrosser | spatel: calico is an alternative network driver | 13:40 |
jrosser | https://docs.projectcalico.org/about/about-calico | 13:40 |
spatel | Maybe it's the haproxy all-SSL vip issue - https://b2cd002d267de9376201-ea96d6532bea611ab21e2fe90ffd8bb3.ssl.cf2.rackcdn.com/802134/4/check/openstack-ansible-deploy-aio_metal_calico-ubuntu-bionic/b6c1e88/logs/etc/host/calico/felix.cfg.txt | 13:40 |
jrosser | oh that's definitely possible | 13:40 |
spatel | maybe it doesn't know what protocol to use | 13:40 |
spatel | reading this and not sure it has an option for https vs http - https://docs.projectcalico.org/reference/resources/felixconfig | 13:42 |
spatel | Gotta go will be back in 1 hour | 13:43 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible master: haproxy: decrease check frequency for letsencrypt back ends https://review.opendev.org/c/openstack/openstack-ansible/+/802716 | 14:22 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible master: haproxy: decrease check interval for letsencrypt back ends https://review.opendev.org/c/openstack/openstack-ansible/+/802716 | 14:25 |
*** rpittau is now known as rpittau|afk | 16:21 | |
dmsimard | noonedeadpunk: the delegate_to stuff sent me down quite the rabbithole | 17:35 |
dmsimard | the two main issues being that ansible returns the data about delegate_to differently depending on whether it's a loop task or not, and then there's the potential for a task to be delegated to multiple hosts at the same time (i.e., with_items: {{ some_host_group }} and then delegate_to: {{ item }}) | 17:36 |
dmsimard | I have a workaround for the loop thing but the part about potentially multiple hosts being delegated to for a single task makes it tricky | 17:38 |
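For context, the pattern being described is something like the following hypothetical task, where each loop item delegates to a different host, so a single task's results end up associated with several delegated hosts (the group name comes from the example above and is illustrative):

```yaml
- name: run one command per host in a group, delegating each iteration
  ansible.builtin.command: uptime
  delegate_to: "{{ item }}"
  with_items: "{{ groups['some_host_group'] }}"
```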
*** prometheanfire is now known as Guest2621 | 19:34 | |
opendevreview | David Moreau Simard proposed openstack/openstack-ansible master: DNM: Test ara 1.5.7rc2 with --diff https://review.opendev.org/c/openstack/openstack-ansible/+/696634 | 23:04 |