*** spatel has quit IRC | 00:03 | |
*** macz_ has quit IRC | 00:46 | |
*** tosky has quit IRC | 00:54 | |
*** openstackgerrit has quit IRC | 00:58 | |
*** rfolco has joined #openstack-ansible | 01:10 | |
*** rfolco has quit IRC | 01:28 | |
*** cshen has joined #openstack-ansible | 01:45 | |
*** cshen has quit IRC | 01:50 | |
*** spatel has joined #openstack-ansible | 03:11 | |
*** cshen has joined #openstack-ansible | 03:45 | |
*** cshen has quit IRC | 03:50 | |
*** openstackgerrit has joined #openstack-ansible | 04:40 | |
openstackgerrit | Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: Fix caps issue to enable powertools repo https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 | 04:40 |
openstackgerrit | Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: Fix caps issue to enable powertools repo https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 | 04:42 |
openstackgerrit | Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: Fix caps issue to enable powertools repo https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 | 04:47 |
openstackgerrit | Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: CentOS 8.3 Fix caps issue to enable powertools repo https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 | 04:53 |
*** evrardjp has quit IRC | 05:33 | |
*** evrardjp has joined #openstack-ansible | 05:33 | |
*** cshen has joined #openstack-ansible | 05:45 | |
*** cshen has quit IRC | 05:50 | |
*** cshen has joined #openstack-ansible | 06:25 | |
*** cshen has quit IRC | 06:30 | |
openstackgerrit | Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: Add support of CentOS 8.3 for aio https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 | 06:40 |
*** rgogunskiy has joined #openstack-ansible | 06:55 | |
*** miloa has joined #openstack-ansible | 07:10 | |
*** spatel has quit IRC | 07:20 | |
noonedeadpunk | mornings | 07:22 |
*** pto_ has joined #openstack-ansible | 07:32 | |
*** pto has quit IRC | 07:33 | |
*** gyee has quit IRC | 07:37 | |
*** pto_ has quit IRC | 07:45 | |
*** pto has joined #openstack-ansible | 07:45 | |
*** luksky has joined #openstack-ansible | 08:04 | |
*** cshen has joined #openstack-ansible | 08:09 | |
*** rpittau|afk is now known as rpittau | 08:10 | |
*** pcaruana has joined #openstack-ansible | 08:15 | |
*** andrewbonney has joined #openstack-ansible | 08:16 | |
jrosser | morning | 08:16 |
jrosser | i guess we are going to need to backport https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765839 | 08:17 |
noonedeadpunk | yeah, looks like we do | 08:33 |
*** sep has left #openstack-ansible | 08:41 | |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible master: Ensure kuryr repo is available within CI images https://review.opendev.org/c/openstack/openstack-ansible/+/765765 | 08:41 |
jrosser | ^ thats the blocker for zun i think | 08:42 |
noonedeadpunk | I don't think depends-on will work here | 08:42 |
noonedeadpunk | because we clone tests repo with run_tests? | 08:42 |
noonedeadpunk | or we check if we're in ci there? | 08:42 |
noonedeadpunk | (can't actually recall) | 08:42 |
jrosser | iirc there is a symlink perhaps, but it could be broken | 08:42 |
noonedeadpunk | will see | 08:43 |
jrosser | https://github.com/openstack/openstack-ansible/blob/master/run_tests.sh#L67-L88 | 08:45 |
noonedeadpunk | oh | 08:45 |
noonedeadpunk | I'm a bit afraid of merging kuryr on master... | 08:48 |
jrosser | yes i saw your comment | 08:49 |
jrosser | this is a bit tricky | 08:49 |
noonedeadpunk | I'm not sure about how exactly kuryr works, but couldn't this result in broken db migrations or smth like this? | 08:49 |
jrosser | the zun stuff wont work without https://opendev.org/openstack/kuryr/commit/d36befada61e1376479536c0e62d1e769eee846c | 08:50 |
noonedeadpunk | won't there be some migration problems when we downgrade the version of kuryr? | 08:51 |
noonedeadpunk | just cherry-picked it to victoria | 08:51 |
jrosser | awesome thanks | 08:51 |
jrosser | i'm not sure - but afaik kuryr is a network driver for docker | 08:52 |
jrosser | *neutron driver for docker | 08:52 |
andrewbonney | Yeah, as far as I can see it's relatively separate given they're intending to remove it in a future release | 08:52 |
andrewbonney | Fwiw the changes to master since branching on kuryr look mostly packaging/testing related to date | 08:53 |
jrosser | the only reason that andrewbonney's patch targets master is that there is no backport of the fix yet merged | 08:53 |
jrosser | so really depends how much we want to merge the zun stuff now, or wait for the kuryr backport | 08:54 |
noonedeadpunk | yeah, I got that. just thinking how safe would be to do rc on master. considering it's just rc - I guess that should be ok... | 08:54 |
noonedeadpunk | (considering we most likely will forget to bump it back) | 08:55 |
noonedeadpunk | looking through code quickly didn't find anything that could break things while downgrading, so agree, let's merge it | 08:55 |
jrosser | we had an etherpad for V didnt we? | 08:56 |
noonedeadpunk | I think we were writing to the ptg one | 08:56 |
noonedeadpunk | https://etherpad.opendev.org/p/osa-wallaby-ptg | 08:56 |
noonedeadpunk | oh, btw, we need to backport that to U https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/754122/9/vars/redhat.yml | 08:58 |
noonedeadpunk | in more nasty way to support centos 7 as well... | 08:58 |
jrosser | looks like the depends-on has worked in the linters job 2020-12-08 08:50:54.517944 | ubuntu-bionic | ++ sudo /usr/bin/pip3 install 'bindep>=2.4.0' tox 'virtualenv<20.2.2' | 08:59 |
jrosser | but then sadly ARA has bombed out | 08:59 |
jrosser | sqlite3.OperationalError: no such table: playbooks | 09:00 |
noonedeadpunk | but linters overall succeeded | 09:00 |
jrosser | yes, should we make a tiny change to the patch to cancel the job and re-try | 09:01 |
noonedeadpunk | 765765 is still passing? | 09:01 |
*** rfolco has joined #openstack-ansible | 09:03 | |
noonedeadpunk | so let it run maybe | 09:03 |
jrosser | oh you are right - i thought that the ara error would be failing it | 09:04 |
jrosser | so for https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/754122/9/vars/redhat.yml | 09:04 |
jrosser | do we need a nasty backport to U for centos7 or just switch the job to centos 8 on U | 09:04 |
jrosser | it won't cherry-pick in gerrit but it seems to be clean when i do the whole patch here | 09:05 |
noonedeadpunk | well, we need to ensure that things will work against centos 7 as well? | 09:05 |
noonedeadpunk | and on centos7 mod_proxy_uwsgi is needed | 09:06 |
noonedeadpunk | and we can't just use ternary here | 09:06 |
noonedeadpunk | (as empty element in list makes package module fail) | 09:06 |
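The failure mode noonedeadpunk describes can be sketched like this (hypothetical tasks, not the actual role code): a ternary that yields an empty string leaves a blank element in the package list, which the package module rejects, while filtering empty elements out avoids the error.

```yaml
# Broken sketch: on CentOS 8 the ternary yields '' and the package
# module fails on the empty list element.
- name: Install apache modules (broken)
  package:
    name:
      - httpd
      - "{{ (ansible_distribution_major_version == '7') | ternary('mod_proxy_uwsgi', '') }}"

# Workaround sketch: drop falsy (empty) elements before passing the
# list to the package module.
- name: Install apache modules (filtered)
  package:
    name: "{{ ['httpd',
               (ansible_distribution_major_version == '7') | ternary('mod_proxy_uwsgi', '')]
              | select() | list }}"
```

Jinja's argument-less `select()` keeps only truthy elements, so the empty string is dropped on CentOS 8 while `mod_proxy_uwsgi` survives on CentOS 7.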
jrosser | so you'd like a 7 and 8 job on U? | 09:06 |
noonedeadpunk | no, just centos 8 job, but, if we just drop mod_proxy_uwsgi - we will break migration path for centos 7? | 09:07 |
noonedeadpunk | as on U we still support 7? | 09:07 |
*** rfolco has quit IRC | 09:07 | |
jrosser | we can split out redhat-7.yml and redhat-8.yml vars files on U as part of the backport | 09:08 |
noonedeadpunk | well, upgrade will work since package will be present... | 09:08 |
jrosser | kind of ugly too.... | 09:08 |
*** pcaruana has quit IRC | 09:11 | |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone stable/ussuri: Move openstack-ansible-uw_apache centos job to centos-8 https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/765928 | 09:12 |
noonedeadpunk | maybe like this? ^ | 09:13 |
jrosser | yes looks reasonable, needs the zuul jobs for 7 to stay though | 09:14 |
jrosser | i always forget we can build the vars inline like that | 09:15 |
noonedeadpunk | it's nasty as well though :( | 09:15 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone stable/ussuri: Move openstack-ansible-uw_apache centos job to centos-8 https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/765928 | 09:16 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-os_keystone master: Remove centos-7 conditional packages https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/765931 | 09:19 |
noonedeadpunk | and we need https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 as well.... | 09:25 |
noonedeadpunk | except I'm not sure about https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906/5/vars/redhat-8.yml | 09:26 |
* jrosser wonders where the zuul job for that is | 09:27 | |
jrosser | thats testing > 8.3? sounds odd | 09:28 |
noonedeadpunk | yeah | 09:28 |
noonedeadpunk | And I think that check of the kernel version should work here | 09:29 |
noonedeadpunk | ah, it has 4.18.0 kernel... | 09:32 |
jrosser | i expect those vars have just been copied straight over from debian/ubuntu | 09:35 |
jrosser | so it's probably reasonable to have a different version for centos | 09:35 |
*** pto has quit IRC | 09:35 | |
*** pto has joined #openstack-ansible | 09:36 | |
jrosser | all these variables we have like cinder_git_project_group: ... do they do anything any more | 09:39 |
jrosser | or just history from repo_build? | 09:39 |
*** SiavashSardari has joined #openstack-ansible | 09:40 | |
noonedeadpunk | iirc history from repo_build | 09:41 |
noonedeadpunk | let me check bump script just in case | 09:42 |
noonedeadpunk | no, can't find anything | 09:50 |
noonedeadpunk | I guess we can drop it now | 09:50 |
SiavashSardari | morning | 09:50 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-os_aodh master: Remove centos-7 conditional packages https://review.opendev.org/c/openstack/openstack-ansible-os_aodh/+/765934 | 09:51 |
*** rfolco has joined #openstack-ansible | 09:51 | |
SiavashSardari | noonedeadpunk do we have a plan for backporting 'Explicitly use rabbitmq collection' in rabbitmq_server repo to ussuri branch? | 09:52 |
SiavashSardari | of course with all the dependencies in role requirements, etc. | 09:53 |
noonedeadpunk | Hi. I think no | 09:53 |
noonedeadpunk | We use ansible 2.9 in U and we won't change that | 09:53 |
noonedeadpunk | we probably can use collections, but not sure there were any changes in collection compared to module in 2.9 | 09:54 |
SiavashSardari | I checked rabbitmq collection and it says it requires ansible 2.9+ so I thought maybe it's not a bad idea | 09:54 |
noonedeadpunk | the problem in collections with 2.9 is that you can't install them from git, and ansible galaxy is soooooooo unstable | 09:55 |
SiavashSardari | oh, didn't know that. | 09:56 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-os_ceilometer master: Remove centos-7 conditional configuration https://review.opendev.org/c/openstack/openstack-ansible-os_ceilometer/+/765956 | 09:57 |
SiavashSardari | but we use openstack collection with galaxy. should that change too? | 09:58 |
noonedeadpunk | iirc we needed some change that was present only in collection | 09:58 |
noonedeadpunk | well we can backport, but honestly it's pretty much work without any feasible profit | 09:59 |
noonedeadpunk | we have a fixed rabbit version which works with the built-in module | 09:59 |
noonedeadpunk | and see no reason to do work that we can avoid doing :) | 10:00 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Add support of CentOS 8.3 for aio https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 | 10:02 |
*** pto has quit IRC | 10:02 | |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Ensure kuryr repo is available within CI images https://review.opendev.org/c/openstack/openstack-ansible/+/765765 | 10:03 |
SiavashSardari | I got your point. I'm going to add some rabbitmq shovels for our internal use, and I wanted to use collections on U release. I guess I use ansible modules till next upgrade. thank you | 10:06 |
jrosser | SiavashSardari: on ussuri we really took a quite conservative approach to collections, pretty much to try things out. At that time the openstack modules had just been moved to a collection and we wanted to get at the updated modules there | 10:07 |
jrosser | though for Victoria we have switched completely by going from ansible to ansible-base and complete use of collections | 10:08 |
noonedeadpunk | we don't even have USER_COLLECTION_FILE for U | 10:09 |
jrosser | in particular we were very much stuck with rabbitmq upgrades because the newer version of rabbit we wanted changed things which broke the module inside ansible | 10:09 |
jrosser | that was a reason for needing to move to the collection | 10:09 |
noonedeadpunk | I think we got this change merged at the end of the day :) | 10:10 |
SiavashSardari | Thanks for the explanation. | 10:10 |
SiavashSardari | qq: I've never come across USER_COLLECTION_FILE in OSA. what is that? | 10:11 |
noonedeadpunk | you can define a set of collections that will be installed by the bootstrap-ansible script | 10:12 |
jrosser | SiavashSardari: https://github.com/openstack/openstack-ansible/commit/ef1061a021a9b557d3dfb7f6e632b078e81e2f08 | 10:12 |
noonedeadpunk | by default it's the user-collection-requirements.yml file in the openstack-ansible path, but it can be adjusted with the $USER_COLLECTION_FILE env var | 10:12 |
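That file uses the standard ansible-galaxy collection requirements format; a hypothetical example (the collection names here are illustrative, not taken from the repo):

```yaml
# user-collection-requirements.yml (illustrative contents)
collections:
  - name: community.rabbitmq
    version: 1.0.0
  - name: ansible.posix
```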
jrosser | that will be in master/victoria | 10:12 |
noonedeadpunk | yeah :) | 10:12 |
SiavashSardari | That is a great idea, if we had it in U, that would help me a lot. for now I use ansible-collection-requirement but definitely looking forward to the next release to take advantage of that. | 10:15 |
*** lkoranda has joined #openstack-ansible | 10:15 | |
SiavashSardari | jrosser: Thanks for the link | 10:16 |
noonedeadpunk | uh, how annoying centos is... | 10:21 |
noonedeadpunk | they seem to have exactly the same kernel, but just modules are merged | 10:22 |
noonedeadpunk | whaaat | 10:22 |
jrosser | thats not just an artefact of the CI node is it? | 10:24 |
noonedeadpunk | nope... `Red Hat Enterprise Linux 8.3 is distributed with the kernel version 4.18.0-240.` https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/8.3_release_notes/index#enhancement_kernel | 10:25 |
noonedeadpunk | well. 8.2 had 4.18.0-193 | 10:25 |
noonedeadpunk | I'm not sure we can catch this with version test | 10:26 |
noonedeadpunk | and probably doing spatel's way with checking distro is more reliable then... | 10:26 |
*** pto has joined #openstack-ansible | 10:27 | |
*** ygk_12345 has joined #openstack-ansible | 10:51 | |
*** ygk_12345 has left #openstack-ansible | 10:52 | |
noonedeadpunk | and what a bad timing for all of this each time... | 10:53 |
noonedeadpunk | I feel like we might have a cross-dependency between 765839 and 765906 | 10:54 |
noonedeadpunk | oh no.... http://paste.openstack.org/show/800825/ | 10:57 |
noonedeadpunk | that's on centos 8.3 | 10:59 |
noonedeadpunk | well, it might be pretty ok, probably this was happening before, but legacy install (without wheel build) also fails... | 11:01 |
*** SiavashSardari has quit IRC | 11:18 | |
*** lkoranda has quit IRC | 11:24 | |
jrosser | noonedeadpunk: for 765839 shall i make the centos-8 job nv then we merge it? | 12:00 |
jrosser | otherwise we are in difficulty | 12:00 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-tests master: Bump virtualenv to version prior to 20.2.2 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765839 | 12:05 |
noonedeadpunk | yeah good choice as CI is much faster for 765839 | 12:09 |
jrosser | so the wheel build failure for systemd-python, is that something *else* we need to fix (!) ? | 12:11 |
noonedeadpunk | yeah.... | 12:11 |
noonedeadpunk | maybe it was just some floating thing... | 12:11 |
noonedeadpunk | but I'm afraid it's not | 12:13 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-tests master: Return centos-8 jobs to voting https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765986 | 12:13 |
*** fanfi has joined #openstack-ansible | 12:48 | |
*** fanfi has quit IRC | 13:02 | |
*** rgogunskiy has quit IRC | 13:11 | |
admin0 | utility container .. TASK [python_venv_build : Install python packages into the venv] almost hangs and takes ages .. is that normal ? | 13:13 |
admin0 | i am building 3 clusters in parallel, i see the same behaviour .. especially in 21.2.0 .. i don't think it was this long in 21.1.0 | 13:14 |
*** pto has quit IRC | 13:14 | |
*** pto has joined #openstack-ansible | 13:14 | |
*** pto has quit IRC | 13:19 | |
*** spatel has joined #openstack-ansible | 13:19 | |
*** pto has joined #openstack-ansible | 13:19 | |
*** owalsh has quit IRC | 13:20 | |
*** rfolco has quit IRC | 13:22 | |
*** rfolco has joined #openstack-ansible | 13:22 | |
jrosser | admin0: it's not normal, but without some debug it's hard to say | 13:23 |
jrosser | it will be probably building the wheels, and there is a log for that in the repo server container | 13:23 |
*** miloa has quit IRC | 13:26 | |
*** tosky has joined #openstack-ansible | 13:44 | |
admin0 | jrosser, in utility container -- python_venv_build : Install python packages into the venv - the repo is built in utility ? | 13:46 |
jrosser | the python wheels are built in the repo container | 13:47 |
jrosser | things are being installed into a venv in the utility container, like the openstack client | 13:48 |
jrosser | but the stuff that goes into that venv is actually compiled on the repo container | 13:48 |
admin0 | aah .now clear | 13:49 |
jrosser | it is running the python_venv_build ansible role to do that | 13:50 |
jrosser | so TASK [python_venv_build : Install python packages into the venv] almost hangs and takes ages .. is that normal ? | 13:50 |
jrosser | ^ with that my first debugging would be to go look at the log in the repo container | 13:50 |
openstackgerrit | Marc Gariépy proposed openstack/openstack-ansible-os_horizon master: Add ability to configure ALLOWED_HOSTS for horizon. https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/765998 | 13:52 |
spatel | jrosser: centos 8.3 is out and its failing to build AIO at https://opendev.org/openstack/openstack-ansible-lxc_hosts/src/branch/master/tasks/lxc_cache_preparation.yml#L85 | 14:03 |
spatel | look like issue of epel-lxc_host.repo | 14:03 |
spatel | I am debugging to see what is going on | 14:03 |
jrosser | you've made a patch though - or something different? | 14:04 |
jrosser | i am also just running some stuff in the 8.3 VM | 14:04 |
spatel | I did patch here https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 | 14:04 |
spatel | but now i am stuck at next step which i am debugging and will commit patch if i find something | 14:05 |
spatel | it would be good to have Zuul job for 8.3 | 14:06 |
jrosser | spatel: it sadly doesnt work like | 14:07 |
jrosser | that | 14:07 |
jrosser | one day it is 8.2 the next it is 8.3 already in our jobs | 14:07 |
jrosser | so everything is broken completely | 14:07 |
spatel | jrosser: looks like i was having an issue last night but right now it's working, maybe something was funky about the repo last night | 14:10 |
spatel | jrosser: +1 | 14:10 |
jrosser | noonedeadpunk: i have reproduced the failure to build systemd-python | 14:11 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Cleanup group_vars https://review.opendev.org/c/openstack/openstack-ansible/+/766001 | 14:16 |
noonedeadpunk | jrosser: from the logs it looked like something I have no idea how to fix, except symlinking the module from system python... | 14:17 |
jrosser | yeah i'm just trying to find what is going on | 14:18 |
jrosser | in expansion of macro ‘LIBSYSTEMD_VERSION’ <- not even finding where that is defined right now | 14:18 |
noonedeadpunk | maybe it gets defined during the wheels build? | 14:19 |
noonedeadpunk | I think the wheel builds were failing previously as well, but then they got installed properly somehow... | 14:19 |
jrosser | it's from libsystemd-dev headers i think | 14:22 |
jrosser | spatel: have you got any centos8 not 8.3 hosts around? | 14:23 |
spatel | yes i have centos 8.3 in lab | 14:23 |
jrosser | no, i want earlier 8.2 or something | 14:24 |
spatel | i have 8.2 also | 14:24 |
jrosser | ok can you try `pkg-config --modversion libsystemd` | 14:24 |
spatel | doing it on 8.2 | 14:24 |
spatel | http://paste.openstack.org/show/800838/ | 14:25 |
jrosser | if thats an OSA install can you try the same in the repo container? | 14:26 |
spatel | same error inside containers | 14:27 |
mgariepy | haproxy_endpoint set state failed to connect on master now ? | 14:29 |
jrosser | spatel: if you don't mind installing systemd-devel package it should give a proper output | 14:30 |
spatel | doing it now | 14:30 |
-spatel- [root@infra-lxb-1 ~]# pkg-config --modversion libsystemd | 14:31 | |
-spatel- 239 (239-41.el8_3) | 14:31 | |
jrosser | oh hrrm will that have installed the 8.3 version of that package? | 14:32 |
* jrosser curses centos | 14:32 | |
spatel | but you can get version info using systemctl --version | 14:33 |
spatel | why do you want to use pkg-config ? | 14:33 |
jrosser | the wheel build is compiling C code | 14:34 |
jrosser | and C build systems make heavy use of pkg-config to find out 'things' like versions and required linker flags about system libraries | 14:35 |
spatel | hmm | 14:35 |
jrosser | and that is where centos8.3 is currently going all wrong | 14:35 |
*** cshen has quit IRC | 14:35 | |
jrosser | i want to compare the output from that pkg-config command from 8.3 with something earlier | 14:36 |
spatel | ok | 14:36 |
jrosser | because if you take the output you just gave `239 (239-41.el8_3)` and try to do some primitive version compare on that string it's going to not work | 14:36 |
jrosser | when i do the same on an ubuntu bionic box i get a plain version like `237` | 14:37 |
spatel | got it | 14:37 |
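A version parse that keeps only the leading integer would sidestep the decorated CentOS string — a minimal sketch (illustrative, not code from any of the patches discussed):

```python
import re


def libsystemd_major(pkgconfig_output: str) -> int:
    """Return the leading major version from `pkg-config --modversion` output.

    Handles both a plain version like '237' (Ubuntu bionic) and the
    decorated CentOS 8.3 form '239 (239-41.el8_3)'.
    """
    match = re.match(r"\s*(\d+)", pkgconfig_output)
    if not match:
        raise ValueError(f"unparseable version: {pkgconfig_output!r}")
    return int(match.group(1))


print(libsystemd_major("237"))                 # -> 237
print(libsystemd_major("239 (239-41.el8_3)"))  # -> 239
```

A naive string or tuple comparison on `239 (239-41.el8_3)` would fail or sort wrongly, which is exactly the problem jrosser points at above.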
openstackgerrit | Marc Gariépy proposed openstack/openstack-ansible-os_horizon master: Add ability to configure ALLOWED_HOSTS for horizon. https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/765998 | 14:39 |
spatel | jrosser: my playbook also failing at TASK [python_venv_build : Build wheels for the packages to be installed into the venv] | 14:57 |
jrosser | for keystone? | 14:57 |
spatel | yes | 14:57 |
jrosser | yes that is where it is also failing ehre | 14:57 |
spatel | http://paste.openstack.org/show/800842/ | 14:57 |
mgariepy | https://blog.centos.org/2020/12/future-is-centos-stream/ << more fun to come. | 14:58 |
jrosser | spatel: sure, noonedeadpunk has already posted the underying error before http://paste.openstack.org/show/800825/ | 14:59 |
jrosser | this is why i want the pkg-config output | 14:59 |
spatel | jrosser: ok let me know when you cut patch | 15:00 |
noonedeadpunk | whaaat - another fedora? | 15:01 |
mgariepy | If you are using CentOS Linux 8 in a production environment, and are concerned that CentOS Stream will not meet your needs, we encourage you to contact Red Hat about options. | 15:02 |
noonedeadpunk | yeah, was really reading this | 15:02 |
noonedeadpunk | Well, this feels totally like IBM influence | 15:02 |
spatel | If you are using CentOS Linux 8 in a production environment, and are concerned that CentOS Stream will not meet your needs, we encourage you to contact Red Hat about options. | 15:03 |
noonedeadpunk | As well as the point to drop CentOS support | 15:03 |
spatel | yike... | 15:03 |
jrosser | this is just too much | 15:03 |
jrosser | as it breaks all the branches all the time | 15:03 |
jrosser | nothing is stable ever again | 15:03 |
mgariepy | yep. it sounds a lot like IBM.. | 15:03 |
mgariepy | let's drop centos and add arch. ;p | 15:04 |
noonedeadpunk | it's like - how are we supposed to have RHEL when we're having centos for free which is as stable as rhel... | 15:04 |
noonedeadpunk | at least arch has nice docs lol. But not sure if anybody using it in prod? | 15:04 |
mgariepy | i use that in prod as on my laptop lol | 15:05 |
mgariepy | but i would never do servers with it hahaha | 15:05 |
mgariepy | i am not insane (at least i don't think i am) | 15:05 |
spatel | This is good news for Ubuntu community.. | 15:06 |
mgariepy | i don't think it's a good news for ubuntu but it's a really bad one for centos | 15:07 |
noonedeadpunk | +1 | 15:07 |
spatel | IBM wants more $$ (no free donuts) | 15:08 |
noonedeadpunk | pretty much afraid that ubuntu may follow the pattern | 15:08 |
ThiagoCMC | BTW, I've never ever used CentOS, for me, it was always unstable AH, not to mention super hard to install and maintain things and a dozen of repos. Debian has thousands of packages ready-to-go, no need to add third party repos. | 15:10 |
spatel | Time to fire up gentoo lab | 15:10 |
ThiagoCMC | Debian is rock solid, Ubuntu will keep the LTS release as-is, I bet. | 15:10 |
mgariepy | let;s fork rhel to eyebeeemOS | 15:10 |
ThiagoCMC | Let | 15:11 |
ThiagoCMC | Let's forget about RH-based distros lol | 15:11 |
jrosser | this is a serious point though - i've spent pretty much all my day today trying to figure out WTF is going on with a distro i don't even use | 15:12 |
jrosser | this is not sustainable if it's going to change all the time | 15:12 |
spatel | agreed, anyway next year it will be over | 15:13 |
spatel | i don't think people will use CentOS stream in production | 15:13 |
mgariepy | what's the point to work on it now if it's all over in a year? | 15:13 |
spatel | Still some folks like me using it :) | 15:13 |
spatel | may be next year i have to decided which way to go | 15:14 |
mgariepy | well. yep but. hey install this, then switch your os to something else. | 15:14 |
mgariepy | before upgrading. | 15:14 |
ThiagoCMC | If OSA focus only on Debian/Ubuntu, I'm super happy! Debian is awesome, even with systemd. lol | 15:14 |
jrosser | on a positive note i think i have a gross workaround for the keystone systemd_python build error | 15:15 |
mgariepy | LOL | 15:15 |
kleini | systemd is great, maybe sometimes buggy | 15:15 |
spatel | Once i install it i am not going to touch for next few years.. | 15:15 |
jrosser | spatel: do you have an AIO at the point keystone failed? | 15:15 |
spatel | yes jrosser | 15:16 |
jrosser | spatel: can you stick LIBSYSTEMD_VERSION="239" into /etc/environment inside the repo container and re-run the keystone playbook? | 15:16 |
spatel | ok | 15:17 |
jrosser | this may/may not work :/ | 15:17 |
spatel | running playbook | 15:18 |
spatel | jrosser: now it failed at different point - http://paste.openstack.org/show/800843/ | 15:20 |
jrosser | thats odd - did the python_venv_build step work though? | 15:21 |
jrosser | could do with a big chunk of the log if you can paste it | 15:22 |
noonedeadpunk | spatel: can you run with -e venv_rebuild=true? | 15:23 |
spatel | k | 15:23 |
jrosser | noonedeadpunk: i am trying to take advantage of this https://github.com/systemd/python-systemd/blob/d08f8dd0f4607a72f1d5497467a2f0cf5a8ee5d4/setup.py#L24-L28 | 15:24 |
noonedeadpunk | smart move | 15:24 |
noonedeadpunk | had no idea they were thinking about such a situation in advance | 15:25 |
mgariepy | since 2016. | 15:25 |
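The linked setup.py lines show why the workaround works: python-systemd prefers a LIBSYSTEMD_VERSION environment override before falling back to pkg-config. Roughly this pattern — a sketch of the idea, not the actual setup.py code:

```python
import os
import subprocess


def get_libsystemd_version() -> str:
    """Prefer an explicit LIBSYSTEMD_VERSION override (the workaround
    discussed above) over asking pkg-config, whose CentOS 8.3 output
    is decorated with the package release string."""
    override = os.environ.get("LIBSYSTEMD_VERSION")
    if override:
        return override
    # Fall back to asking pkg-config, as the build normally would.
    return subprocess.check_output(
        ["pkg-config", "--modversion", "libsystemd"], text=True
    ).strip()


# With the override exported (e.g. via /etc/environment in the repo
# container), the decorated pkg-config output is never consulted.
os.environ["LIBSYSTEMD_VERSION"] = "239"
print(get_libsystemd_version())  # -> 239
```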
spatel | noonedeadpunk: -e venv_rebuild=true did magic.. | 15:28 |
noonedeadpunk | lol | 15:28 |
spatel | jrosser: are you going to patch repo using LIBSYSTEMD_VERSION="239" ? | 15:30 |
jrosser | well, it's about figuring out a way to do that now which isn't breaking everything else! | 15:30 |
spatel | +1 | 15:30 |
jrosser | we don't generally drop environment variables much | 15:31 |
noonedeadpunk | we can do that on openstack_hosts for centos 8 only... and eventually I think we can just better parse output with regexp? | 15:32 |
jrosser | that would be good - you can guarantee that the version will change somehow :) | 15:33 |
spatel | why not just doing if distro==CentOS8 do systemctl --version | head -n1 | awk '{print $2}' otherwise systemctl --version | 15:37 |
noonedeadpunk | otherwise we shouldn't do anything actually :p | 15:37 |
spatel | :) | 15:37 |
spatel | Anyway this regex is for only 1 year.. not sure what centOS stream version will look like. | 15:39 |
jrosser | actually thats a good point - do we have code already somewhere that needs the systemd version | 15:39 |
jrosser | this var may already exist | 15:39 |
noonedeadpunk | hm, might be... can't recall exactly where we might have it | 15:39 |
jrosser | we have some tasks here to copy https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/common-tasks/os-nspawn-container-setup.yml#L16-L29 | 15:41 |
noonedeadpunk | just lineinfile instead of the set_fact yeah... | 15:42 |
jrosser | i forget how we manage /etc/environment, is it copied from the host into the lxc? | 15:44 |
*** macz_ has joined #openstack-ansible | 15:44 | |
mgariepy | global_environment_variables | 15:44 |
*** tosky has quit IRC | 15:45 | |
mgariepy | https://github.com/openstack/openstack-ansible-openstack_hosts/blob/master/defaults/main.yml#L147 | 15:45 |
mgariepy | the same template exist for lxc_container_create | 15:47 |
mgariepy | template/var .. | 15:47 |
mgariepy | https://github.com/openstack/openstack-ansible-openstack_hosts/blob/master/templates/environment.j2 | 15:47 |
noonedeadpunk | oh well, we should jsut update template https://opendev.org/openstack/openstack-ansible-openstack_hosts/src/branch/master/templates/environment.j2 | 15:47 |
mgariepy | and this one: https://github.com/openstack/openstack-ansible-lxc_container_create/blob/master/templates/environment.j2 | 15:48 |
*** macz_ has quit IRC | 15:48 | |
admin0 | spatel, in your config, exact run .. 2020-12-08 15:47:57.748 48729 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Bridge br-lbaas for physical network lbaas does not exist. Agent terminated! -- is the first error i encounter on the compute nodes | 15:48 |
spatel | i think we copy /etc/environment file from host to container right? | 15:48 |
mgariepy | no not really, it's generated the same way tho. | 15:48 |
*** macz_ has joined #openstack-ansible | 15:48 | |
noonedeadpunk | isn't openstack_hosts also run against lxc containers? | 15:48 |
admin0 | do i need to create a blank br-lbaas in compute nodes also ? | 15:48 |
spatel | admin0: don't create br-lbaas on compute nodes | 15:49 |
admin0 | ok | 15:49 |
spatel | compute will use br-vlan to tunnel br-lbaas-mgmt vlan traffic | 15:49 |
admin0 | what to do to that error message that causes neutron-linuxbridge-agent to die ? | 15:50 |
noonedeadpunk | so these might be conflicting blocks ?? | 15:50 |
jrosser | admin0: you should not have a physical network 'lbaas', that suggests you still have config for the flat network in place | 15:52 |
noonedeadpunk | I think we might need to drop this from lxc_container_create | 15:52 |
jrosser | i have the start of a patch which i will push shortly for openstack_hosts | 15:52 |
jrosser | even if it needs some improvement | 15:52 |
admin0 | jrosser, type is raw .. i directly copied the blocks from https://satishdotpatel.github.io//openstack-ansible-octavia/ | 15:53 |
jrosser | ok, so look in the neutron config file that has been templated out and see what you have | 15:54 |
*** cshen has joined #openstack-ansible | 15:55 | |
admin0 | hmm. i think i accidently had mgariepy linuxbridge-override in my config | 15:56 |
admin0 | fixing .. | 15:56 |
*** gyee has joined #openstack-ansible | 15:58 | |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 | 15:58 |
jrosser | hrrm that still has the delegate_to physical_host in it | 16:00 |
jrosser | also i saw that we still use the 8.1 container base image but i think there was trouble moving newer than that? | 16:00 |
noonedeadpunk | yep | 16:01 |
jrosser | not really sure about this patch tbh - hack at it if you think it needs changing | 16:02 |
noonedeadpunk | but, we run an upgrade of all packages in the image before storing it and creating any container from it | 16:02 |
admin0 | ok .. its running .. now in the octavia logs, i see Failed to establish a new connection: [Errno 113] No route to host .. this is the container trying to contact the amphora instance | 16:03 |
spatel | Yes, i have seen 8.1 but we do run upgrade so now i am seeing 8.3 version in containers | 16:03 |
admin0 | as per the example, i see br-vlan.27 and the patch is there .. | 16:03 |
admin0 | so br-lbaas is patched to .27 .. and the amphora instance has the .27 vlan ports | 16:03 |
jrosser | admin0: inside the octavia container do you see eth14 with an IP you expect? | 16:04 |
spatel | can you post full brctl show output? | 16:04 |
admin0 | sure one moment please | 16:04 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 | 16:05 |
noonedeadpunk | oh btw | 16:05 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 16:05 |
openstack | Meeting started Tue Dec 8 16:05:46 2020 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:05 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:05 |
*** openstack changes topic to " (Meeting topic: openstack_ansible_meeting)" | 16:05 | |
openstack | The meeting name has been set to 'openstack_ansible_meeting' | 16:05 |
noonedeadpunk | We don't have new bugs that are worth discussing, so | 16:06 |
noonedeadpunk | #topic office hours | 16:06 |
*** openstack changes topic to "office hours (Meeting topic: openstack_ansible_meeting)" | 16:06 | |
noonedeadpunk | well, and here I think we're on the same page, but just to sum up what went wrong during the last week | 16:07 |
noonedeadpunk | 1. CentOS 8.3 was released and broke master and ussuri | 16:07 |
noonedeadpunk | 2. a new virtualenv was also released, which broke our linter jobs | 16:08 |
admin0 | i have not added vlan 27 on the router with .1 as the gateway ? should i do that ? | 16:08 |
noonedeadpunk | and that at the very time when we were absolutely ready to make an rc | 16:08 |
admin0 | spatel, https://gist.github.com/a1git/4368656babd5d74753eb1ce3e5c2bc83 | 16:09 |
noonedeadpunk | 766030 sounds like something that should work, except we probably need to squash it with 765906? | 16:09 |
noonedeadpunk | what do you think jrosser? | 16:09 |
jrosser | oh right yes, it's not going to work on its own | 16:11 |
noonedeadpunk | Well, apart from this I don't have many topics to discuss... Looking forward to fixing the gates and merging the zun stuff to be able to branch | 16:14 |
noonedeadpunk | oh, well, I also pushed a slightly scary thing - https://review.opendev.org/c/openstack/openstack-ansible/+/766001 | 16:14 |
noonedeadpunk | it left me pretty frustrated because of the inconsistency between roles and their behaviour | 16:14 |
noonedeadpunk | I think we should also move service_region and package_state to role defaults | 16:15 |
jrosser | oh nice cleanup there | 16:15 |
noonedeadpunk | eventually I found that the octavia role was creating my internal uri with http while all other internal urls were set to https, as I had openstack_service_internaluri_proto: https in overrides... | 16:17 |
noonedeadpunk | and cant stop myself lol | 16:18 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 | 16:18 |
admin0 | i added .1 on the router for vlan 27 .. and also added one ip of that range on br-lbaas eth14 .. now the errors are gone, i still cannot ping but it says Mark READY in DB for amphora: 1808c89b-2eeb-4f34-b6ab-b1a9536fab49 with compute id 1a018808-1467-4744-9acc-f1b3cbc0d717 | 16:18 |
noonedeadpunk | I tried to check that removed stuff is not used anywhere and equal to defaults, but worth double checking | 16:18 |
admin0 | the log of non-stop connection errors is not there any more | 16:18 |
admin0 | but since i cannot ping, i dunno if its working or not working | 16:18 |
jrosser | noonedeadpunk: there was also the vars we talked about from the openstack_services.yml file | 16:19 |
jrosser | https://review.opendev.org/c/openstack/openstack-ansible/+/766001 is passing apart from the centos things so it's not so bad to go in an rc | 16:20 |
noonedeadpunk | ah, indeed, git_project_group | 16:21 |
noonedeadpunk | well, the scariest thing with 766001 is that it's not checking all roles.... | 16:21 |
noonedeadpunk | oh, btw, worth saying: I pushed deprecation patches for the galera_client role https://review.opendev.org/q/topic:%22osa%252Fdeprecate_galera_client%22+(status:open%20OR%20status:merged) | 16:26 |
noonedeadpunk | and pushed patches to revive the monasca repos https://review.opendev.org/q/topic:%22osa%252Frevive_monasca%22+(status:open%20OR%20status:merged) | 16:26 |
noonedeadpunk | mensis volunteered to submit the required fixes to make the role functional - he has a working role for U, so I see no reason not to revive the repo | 16:27 |
jrosser | ok so we need this to merge now i think https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765839 | 16:29 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Remove *_git_project_group variables https://review.opendev.org/c/openstack/openstack-ansible/+/766039 | 16:31 |
noonedeadpunk | any extra vote?:) | 16:31 |
*** nurdie has joined #openstack-ansible | 16:31 | |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-tests master: Return centos-8 jobs to voting https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765986 | 16:34 |
jrosser | ^ that should give an early indicator of whether we have fixed the wheel build | 16:34 |
admin0 | brb -- food | 16:38 |
prometheanfire | I imagine this is fine, but just wanted an ack openstack-ansible-galera_client is going away? | 16:42 |
prometheanfire | (offhand, what is it replaced by, upstream role)? | 16:42 |
jrosser | the galera client and galera server ansible roles were always in a circular dependency fight with each other | 16:43 |
noonedeadpunk | prometheanfire: we moved client side to galera_server role | 16:43 |
jrosser | as they both defined the version to install, so it was best to merge the two together | 16:43 |
prometheanfire | yarp | 16:43 |
noonedeadpunk | well, thinking about this now - we probably could set version in the integrated repo :p | 16:43 |
noonedeadpunk | but whatever :) | 16:44 |
jrosser | if we tested them with the new infra jobs, yes | 16:44 |
jrosser | but with functional jobs not so much | 16:44 |
noonedeadpunk | yeah, then we'd need to bump it in the tests repo as well... agree | 16:44 |
prometheanfire | also offhand, when is the .1 release? iirc that was when the upgrades were 'supported' | 16:45 |
noonedeadpunk | iirc there's not so much to upgrade for U->V | 16:45 |
noonedeadpunk | or rather we just implement it all at once, since we have upgrade jobs nowadays | 16:45 |
noonedeadpunk | so we see a broken upgrade path at once | 16:46 |
prometheanfire | ya, my distro based upgrade went easily, no big migration to run for the main projects | 16:46 |
jrosser | prometheanfire: you do install_method=distro? | 16:47 |
prometheanfire | oh, I mean my gentoo packaging stuff | 16:48 |
prometheanfire | forgot about that install method :P | 16:48 |
jrosser | oh phew - thought you meant OSA distro :) | 16:48 |
prometheanfire | as far as osa gentoo stuff I may work on it soon (a month or three, whenever the new nuc11 stuff comes out) since I'm rebuilding my lab | 16:50 |
prometheanfire | once that is done I can stop packaging most/many of the openstack things on gentoo and point people to that | 16:50 |
prometheanfire | then again, I don't think it'll ever be officially supported, I'm not the only gentoo user, but I am about the only dev | 16:51 |
*** gshippey has joined #openstack-ansible | 16:55 | |
noonedeadpunk | #endmeeting | 16:56 |
*** openstack changes topic to "Launchpad: https://launchpad.net/openstack-ansible || Weekly Meetings: https://wiki.openstack.org/wiki/Meetings/openstack-ansible || Review Dashboard: http://bit.ly/osa-review-board-v3" | 16:56 | |
openstack | Meeting ended Tue Dec 8 16:56:54 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:56 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/openstack_ansible_meeting/2020/openstack_ansible_meeting.2020-12-08-16.05.html | 16:56 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/openstack_ansible_meeting/2020/openstack_ansible_meeting.2020-12-08-16.05.txt | 16:56 |
openstack | Log: http://eavesdrop.openstack.org/meetings/openstack_ansible_meeting/2020/openstack_ansible_meeting.2020-12-08-16.05.log.html | 16:56 |
prometheanfire | oh, I was in a meeting, lol | 16:58 |
noonedeadpunk | np here :) | 16:59 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 | 17:02 |
jrosser | i messed up the squash of those and put the delegate_to back in | 17:02 |
jrosser | danger of editing in the gerrit UI | 17:02 |
noonedeadpunk | I'm a bit lost with octavia client certificates. It seems we rotate the certificates each time we run the role. But should octavia be able to talk to the amphoras with new certificates? I guess not? | 17:04 |
noonedeadpunk | As it makes auth with amphora? | 17:04 |
noonedeadpunk | *with SSL to amphora | 17:04 |
noonedeadpunk | or I got it wrong? | 17:04 |
noonedeadpunk | Just seeing some weird stuff here and trying to understand if it's related or not | 17:05 |
jrosser | it'll be to do with the CA i guess? | 17:10 |
jrosser | if the certificates can be validated then it doesnt necessarily matter if they change | 17:11 |
jrosser | this https://github.com/openstack/openstack-ansible-os_octavia/blob/master/tasks/octavia_certs.yml#L97-L104 | 17:12 |
*** owalsh has joined #openstack-ansible | 17:13 | |
noonedeadpunk | ok, I see what has happened here | 17:14 |
noonedeadpunk | not cool pattern https://opendev.org/openstack/openstack-ansible-os_octavia/src/branch/master/defaults/main.yml#L428 | 17:14 |
noonedeadpunk | in case you auth as a non-root user (or have ldap and run via sudo) this will differ between users... | 17:15 |
noonedeadpunk | well, it's configurable so whatever:) | 17:16 |
noonedeadpunk | so yeah, I got new CA and everything... | 17:17 |
jrosser | what on earth is that for! | 17:22 |
jrosser | shouldnt these default to /etc/openstack-deploy? | 17:23 |
noonedeadpunk | I'd say it should. Well, not /etc/openstack_deploy, but OSA_CONFIG_DIR | 17:24 |
noonedeadpunk | But I think a caveat might be if you choose a setup_host different from localhost | 17:24 |
jrosser | this certainly needs improving | 17:26 |
*** tosky has joined #openstack-ansible | 17:26 | |
jrosser | i have just looked on a deploy host here and those files are in a place not covered by backup/version control | 17:26 |
noonedeadpunk | they are not, yes :p | 17:26 |
noonedeadpunk | I was pretty sure I will need to respawn all amphoras now... | 17:27 |
noonedeadpunk | but finally found valid certs... | 17:27 |
jrosser | i guess for the time being octavia_cert_dir can be overridden and the files put in a better place | 17:27 |
jrosser | +/- permissions of course, i'd expect some of those to have quite restrictive mode | 17:28 |
jrosser | yes the private keys are 0600 which makes it difficult | 17:29 |
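For the time being, a user_variables.yml override along the lines jrosser suggests keeps the CA material somewhere covered by backup/version control. The target path here is hypothetical — anywhere persistent works:

```yaml
# Hypothetical path - any backed-up location works, as long as the
# 0600 private keys jrosser mentions are handled carefully there.
octavia_cert_dir: "/etc/openstack_deploy/octavia_certs"
```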
noonedeadpunk | I just find it super hard to change the default, as it obviously will break existing deployments (unless we do some symlinking with upgrade scripts) | 17:29 |
jrosser | unless we have a one-cycle task that moves the files | 17:29 |
noonedeadpunk | honestly - we run osa with root only, so whatever permissions are on the deploy host... | 17:29 |
openstackgerrit | Merged openstack/openstack-ansible-tests master: Bump virtualenv to version prior to 20.2.2 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765839 | 17:29 |
noonedeadpunk | but in case of octavia_cert_setup_host != localhost it is really tricky... | 17:30 |
spatel | fyi, successfully deployed aio using centos 8.3 | 17:31 |
noonedeadpunk | I hope we will cover that with SSL topic as well though | 17:31 |
noonedeadpunk | will need to recheck 766030 | 17:31 |
jrosser | i'm not sure i really understand the use case for cert_setup_host != localhost | 17:31 |
jrosser | unless you don't allow the CA key off a specific host | 17:32 |
noonedeadpunk | no idea, but since we do allow this at the moment, I can expect somebody might be using it... | 17:32 |
* jrosser heads out for a bit | 17:32 | |
mgariepy | https://zuul.opendev.org/t/openstack/build/5bf01bb6687a41e0970220c461489297/log/job-output.txt#10807 | 17:40 |
mgariepy | noonedeadpunk, have you seen that ? | 17:40 |
mgariepy | somewhere else ? i'm not sure what can trigger that but my horizon patch fails there on multiple os/checks | 17:41 |
*** rpittau is now known as rpittau|afk | 17:43 | |
noonedeadpunk | nope, not really | 17:47 |
*** cloudnull has quit IRC | 17:47 | |
*** cloudnull has joined #openstack-ansible | 17:47 | |
noonedeadpunk | but anyway we need to merge https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 before anything will pass centos | 17:50 |
mgariepy | done | 18:03 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Trigger service restart on cert change https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/766062 | 18:05 |
kleini | I just tried 21.2.0 in staging and noticed that /etc/ceph for Glance, Cinder volume and computes is not created any more. Maybe ceph-client is not running. I am using ceph config from files. | 18:27 |
kleini | Does anybody have a pointer, what may have changed? | 18:27 |
spatel | folks, do you know what is going on with my rabbitMQ - http://paste.openstack.org/show/800861/ | 18:33 |
spatel | Look like some bad compute node crying.. | 18:37 |
admin0 | spatel, it seems to work .. but the operating status shows: offline | 18:45 |
spatel | admin0: look like time to check octavia logs etc.. and see if you find something interesting. | 18:46 |
spatel | did you see amphora vm on compute? | 18:46 |
kleini | https://opendev.org/openstack/openstack-ansible-ceph_client/src/branch/master/tasks/main.yml#L19 <- this is not backported to ussuri | 18:47 |
kleini | ^^^ answering my own question | 18:47 |
admin0 | yes .. lb is working fine .. just that that status shows offline | 18:51 |
*** andrewbonney has quit IRC | 18:51 | |
*** nurdie has quit IRC | 18:54 | |
spatel | very odd, it should be online | 18:58 |
jrosser | kleini: if you can find the patches (git blame?) hit the cherry-pick button In gerrit | 19:00 |
spatel | operation basic.publish caused a channel exception not_found: no exchange 'reply_92897d7bc2ad495c8688964cf8433a6b' in vhost '/nova' | 19:00 |
spatel | getting these errors - do you think it's a good idea to restart the rabbit cluster? | 19:01 |
jrosser | admin0: from the octavia container you should be able to curl the api endpoint in the amphora, can you check that? | 19:01 |
*** frickler has joined #openstack-ansible | 19:01 | |
admin0 | jrosser, spatel this is what I see: https://pasteboard.co/JE0QxcL.png | 19:07 |
openstackgerrit | Marcus Klein proposed openstack/openstack-ansible-ceph_client stable/ussuri: Allow to proceed with role if ceph_conf_file is set https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/765952 | 19:08 |
admin0 | when i curl the floating IP. i get round-robin between web1 web2 web1 web2 .. so i know its working fine | 19:08 |
spatel | Do you have spare_amphora_pool_size in octavia.conf ? | 19:08 |
spatel | just curious i had that setting and it was causing strange issue | 19:09 |
admin0 | so in the hypervisor, ( i only have 1 to test this) there are 5 instances, 2x web servers, 1 cirros and the 2x amphoras | 19:09 |
spatel | Are you using SINGLE or ACTIVE-STANDBY? | 19:10 |
admin0 | none.. whatever is the default | 19:10 |
spatel | maybe the LB is up but the octavia health manager isn't able to pull stats (guessing) | 19:10 |
spatel | octavia_spare_amphora_pool_size: 0 | 19:11 |
spatel | octavia_loadbalancer_topology: SINGLE | 19:11 |
admin0 | spare_amphora_pool_size is set to 1 | 19:11 |
admin0 | how does the octavia-health-manager pull stats | 19:12 |
spatel | that option is very bad and was causing issues in my setup | 19:12 |
admin0 | so set pool_size to 1 ? | 19:12 |
admin0 | i meant 0 | 19:12 |
spatel | spare_amphora_pool_size = 0 (then rebuild the LB again) | 19:12 |
admin0 | ok | 19:12 |
admin0 | and another question is .. how does the health manager connect to amphora ? via its lb ip ? | 19:12 |
openstackgerrit | Marcus Klein proposed openstack/openstack-ansible-ceph_client stable/ussuri: Allow to proceed with role if ceph_conf_file is set https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/765952 | 19:12 |
admin0 | i mean its own lbaas pool ip ? | 19:12 |
spatel | via the br-lbaas-mgmt ip | 19:13 |
masterpe | We did not have any problems with spare_amphora_pool_size. In our setup we have it set to 5. | 19:13 |
admin0 | i am using whatever the default amphora image it downloads .. | 19:13 |
admin0 | have not created my own | 19:13 |
spatel | if you do (ip netns list) you will see your VIP inside namespace | 19:13 |
spatel | I am using same default amphora image provided by OSA | 19:14 |
kleini | jrosser: https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/765952 | 19:14 |
admin0 | spatel, the ip netns list( where ? ) | 19:14 |
admin0 | inside the default amphora image ? | 19:14 |
spatel | inside amphora | 19:14 |
admin0 | aah .. ok | 19:14 |
admin0 | let me check how to connect to it | 19:14 |
spatel | the amphora VM creates the public vip inside a namespace to isolate it from the br-lbaas-mgmt IP | 19:14 |
spatel | ssh to br-lbaas-mgmt ip (you need to upload ssh-key) | 19:15 |
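Putting spatel's two hints together as a sketch — the key path and addresses are placeholders, and the `ubuntu` login user and `amphora-haproxy` namespace name are assumptions about the default amphora image:

```
# From the octavia container, over the lb-mgmt network:
ssh -i <octavia-ssh-key> ubuntu@<amphora-mgmt-ip>

# Inside the amphora: the VIP lives in its own network namespace,
# separate from the lb-mgmt address:
sudo ip netns list
sudo ip netns exec amphora-haproxy ip addr show
```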
admin0 | for whatever reasons, i cannot even ping the default amphora ip | 19:16 |
spatel | VIP ip? | 19:16 |
spatel | you can't ping because ICMP is not allowed; only the vip ports will be open, like port 80 or 443 | 19:17 |
admin0 | i was trying to ping the default amphora instance that comes up first under user octavia in service tenant | 19:17 |
admin0 | which i cannot .. it only has the lbaas ip range . | 19:17 |
*** yann-kaelig has joined #openstack-ansible | 19:17 | |
admin0 | the other one i created via gui has a vip | 19:17 |
jrosser | admin0: this is the time to take a breath and recap how this works | 19:17 |
admin0 | :D | 19:17 |
admin0 | ok | 19:17 |
spatel | :) | 19:18 |
jrosser | the public vip is for the loadbalanced service | 19:18 |
jrosser | the interface on the lbaas network connects the amphora to the backend octavia container via br-lbaas and eth14 | 19:19 |
jrosser | octavia backend service connects over that lbaas network to the amphora, with https for monitoring and config | 19:19 |
jrosser | you should be able to see evidence of that in the various octavia service journals on the controller | 19:20 |
jrosser | you should also be able to curl the backend api endpoint of the amphora from inside the octavia container | 19:20 |
jrosser | these are verification steps that your lbaas network is all running as it should | 19:20 |
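jrosser's verification steps, as concrete commands. This is a sketch: the amphora IP is a placeholder and the systemd unit names are my assumption about how the octavia services are deployed here.

```
# On the controller, watch the octavia services talk to the amphora:
journalctl -u octavia-health-manager -u octavia-worker -f

# From inside the octavia container, confirm the amphora API port is
# reachable over br-lbaas/eth14 - even a TLS handshake error here
# proves the network path itself works:
curl -vk https://<amphora-lbaas-mgmt-ip>:9443/
```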
*** yann-kaelig has quit IRC | 19:21 | |
mgariepy | how do i run aio_metal tests for horizon repo ? all the metal test fails.. https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/765998 | 19:21 |
spatel | jrosser: is this guide still valid to restart rabbitMQ cluster - https://docs.openstack.org/openstack-ansible/rocky/admin/maintenance-tasks/rabbitmq-maintain.html | 19:21 |
jrosser | admin0: does this make sense? if you have some status error for the lb it will be to do with that backend network amphora<>octavia communication | 19:22 |
admin0 | does the default amphora image allow ping ? | 19:23 |
spatel | admin0: no | 19:23 |
spatel | ICMP is blocked by design | 19:23 |
admin0 | i mean from the octavia container -> amphora instance | 19:23 |
jrosser | curl the api | 19:23 |
admin0 | sorry, when you say api, exactly what do you mean :( | 19:23 |
masterpe | admin0: If you enable it in the acl it is. | 19:23 |
masterpe | But it is not meant to do that. | 19:24 |
spatel | You can ping from the amphora --> octavia container using the br-lbaas subnet (you can't ping the lb vip) | 19:24 |
admin0 | i am unable to ping from container -> amphora | 19:24 |
admin0 | but i heard that is by design | 19:24 |
spatel | masterpe: i have been told octavia deploys its security rules from python code (you can't modify them via security groups) | 19:25 |
admin0 | ok .. so the amphora instance that is under octavia/service acts as a network node for all other tenant-created load balancers ? | 19:25 |
masterpe | I have changed the security group in the service project to allow icmp | 19:25 |
spatel | masterpe: admin0 sorry i found you can't ICMP (i just verified, its not allowing me to ICMP) | 19:27 |
admin0 | i see, 9443 is the port that is open | 19:27 |
spatel | you can telnet to port 22 and verify | 19:27 |
admin0 | i only see 9443 open in the firewall from octavia/service for the amphora image | 19:27 |
admin0 | so let me check if octaviacontainer can reach this port | 19:27 |
masterpe | Yes, tcp works better to verify. | 19:27 |
spatel | telnet 10.62.7.156 22 | 19:27 |
spatel | works for me but ICMP not | 19:28 |
jrosser | this is what runs in the amphora https://docs.openstack.org/octavia/latest/contributor/api/haproxy-amphora-api.html | 19:28 |
jrosser | it should be on port 9443 (i think?) and you should be able to curl https://your-amphora-on-lbaas-mgmt-ip:9443/info or something similar | 19:29 |
johnsom | You would have to have the right TLS cert and keys for that curl to work. | 19:30 |
jrosser | indeed, though i think at the moment just verifying that the network is even connected is challenging | 19:30 |
jrosser | so a valid failure there is information :) | 19:31 |
johnsom | You can use s_client: openssl s_client -connect <amphora lb-mgmt-net IP>:9443 | 19:31 |
johnsom | That should dump the server certificate information. You will want to control-c out of it as it is waiting for the client to send certificate information | 19:32 |
jrosser | this is also all kind of well logged though too | 19:32 |
jrosser | so it should be reasonably easy to see success/failure in the service journal | 19:32 |
johnsom | True | 19:33 |
spatel | johnsom: Lets allow ICMP :) we should open story or future request. | 19:33 |
admin0 | recall that i had to manually add the 10.62.0.x range IP to br-lbaas in the container .. i had put in a wrong subnet, so it was seeing one amphora but not able to connect to the other one | 19:34 |
johnsom | Yeah, we can add an option to turn that on. By default the security is locked down on the amphora as they really are just a black box. | 19:34 |
ThiagoCMC | I'm curious about Octavia... Is it possible to run Octavia instances in LXC instead of QEMU? I wanna try it but I really don't want the network traffic of the LBaaS to pass through virtualization (it doesn't make sense to me)... | 19:34 |
admin0 | i am going to remove this lb, add another one and see what it does | 19:34 |
admin0 | if operators (like me) want, i can easily see i can add the icmp rules to the amphora instances logged in as octavia/service | 19:35 |
johnsom | ThiagoCMC Yes, we had a working proof of concept with LXC, but nova-lxc is now dead so we never merged it. That said, it's a rare case that you have enough load that the service VM is a bottleneck. It is very optimized. | 19:36 |
ThiagoCMC | johnsom, it feels sad that both nova-lxc and nova-lxd are gone... :-( | 19:38 |
ThiagoCMC | I'll give it a try then! | 19:38 |
admin0 | except "operating stats" offline .. everything else seems to work perfectly fine | 19:38 |
admin0 | operating-status | 19:39 |
ThiagoCMC | The nova-lxd was awesome, especially these days that LXD supports both LXC and QEMU side by side... OpenStack could be a Libvirt-free cloud! | 19:39 |
johnsom | https://review.opendev.org/c/openstack/octavia/+/636066 was the working LXD patch. Really I did it to try it out and to see if I could find something more stable than nova was at the time | 19:39 |
johnsom | admin0 Yeah, operating status OFFLINE and no stats means you have a problem on the lb-mgmt-net, the UDP traffic over 5555 from the amphora instance to the controllers is not working. The health manager is not receiving the heartbeat packets. | 19:41 |
admin0 | udp is not enabled by default in the firewalls its creating | 19:41 |
admin0 | let me check if i can add 5555 udp and see if it solves that | 19:42 |
johnsom | It's outbound, so the SGs on the amphora won't show it as a rule, it's open by default. There should be a rule on the controller SGs however that allows it. | 19:42 |
masterpe | admin0: did you solve the issue that the vlan interface was not joined in the bridge on the compute node? | 19:43 |
johnsom | I'm pretty sure OSA has that right, maybe people have deployed with it. Unless something has changed since I last looked at the repo. | 19:43 |
johnsom | maybe->many | 19:43 |
admin0 | in the default sec group created, octavia_sec_grp, the only thing allowed is tcp 9443 | 19:43 |
*** ianychoi__ has quit IRC | 19:44 | |
admin0 | in the other amphora where i created for load balancing http, i see 80 and 1025 | 19:44 |
johnsom | admin0 Yeah, that is the wrong SG | 19:44 |
admin0 | there are no other sec groups | 19:44 |
admin0 | its the default, octavia_sec_group ( for the default amphora) and some uuid for the one i created | 19:44 |
johnsom | It's the SG on the controller port | 19:44 |
admin0 | its the secgroup under service/octavia user isn't it ? | 19:45 |
admin0 | there is no 5555 or udp in any of the sec groups created | 19:45 |
johnsom | I doubt it. It's probably under an OSA admin account | 19:45 |
johnsom | SGs under the octavia account are really only for amphora side/service VMs | 19:46 |
admin0 | all lbs created, appear as instance under service/octavia with the security group also managed from there | 19:46 |
admin0 | under admin, it just appears on neutron as lbaas | 19:46 |
admin0 | and under service/octavia, there are 3 sec groups .. default, the one for octavia and the one for my lb .. none of them has any udp in it | 19:47 |
admin0 | or tcp port 555 | 19:47 |
admin0 | 5555 | 19:47 |
johnsom | admin0 Right, that is what I said, it's a SG outside the octavia account. Probably the one called lbaas | 19:47 |
johnsom | It's been over a year since I have used OSA to deploy, so I'm a bit rusty on the setup there. | 19:48 |
jrosser | am i right in thinking that would only be for OVS? | 19:48 |
admin0 | johnsom, not under admin | 19:48 |
admin0 | i can confirm that all lbaas related instances/security groups are under service/octavia | 19:48 |
johnsom | OVS and linux bridge, but I don't know if the OVN stuff has been setup in OSA yet | 19:48 |
ThiagoCMC | johnsom, thanks! | 19:48 |
jrosser | for admin0 case the bridge on the compute node comes straight over to the controller as a vlan provider network with linuxbridge | 19:49 |
johnsom | admin0 Look for a port in neutron called something similar to o-hm0 | 19:49 |
spatel | I keep two openrc files, one for admin and one octavia.rc for LB operations | 19:50 |
johnsom | That would be the controller port, the SG on that port is the one that will have UDP/5555 open. | 19:50 |
admin0 | its not there | 19:50 |
spatel | so i can see status of amphora | 19:50 |
admin0 | maybe i need to add udp/5555 manually | 19:50 |
admin0 | and it will fix itself | 19:50 |
admin0 | you sure its udp/5555 ? | 19:50 |
johnsom | admin0 no, it's setup by OSA | 19:50 |
admin0 | johnsom, telling you bro .. its not there | 19:51 |
admin0 | at least not there on 21.1.0 tag i am using | 19:51 |
*** luksky has quit IRC | 19:51 | |
jrosser | admin0: use the diagram from spatel's blog as your reference, that's what you have built | 19:52 |
jrosser | if you are trying to find this UDP traffic start at the source, with tcpdump on the compute node, and follow it logically through | 19:52 |
masterpe | admin0: the octavia_sec_grp in my case has ingress tcp 9443, ingress tcp 22 and ingress icmp. | 19:53 |
admin0 | i followed his blog exactly .. even vlan27 .. | 19:53 |
admin0 | and so far its working .. only issue i see is the offline status and was trying to understand why | 19:53 |
johnsom | https://github.com/openstack/openstack-ansible-os_octavia/blob/master/defaults/main.yml#L365 | 19:53 |
johnsom | So OSA is adding the rule directly via iptables | 19:53 |
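So rather than a neutron SG, the allow rule johnsom points at should show up in the controller's own firewall. A sketch of how to confirm it (the host is a placeholder; run on the node carrying the octavia health manager):

```
# Look for the health-manager allow rule in the host firewall:
ssh root@<octavia-host> 'iptables -S | grep 5555'

# And confirm something is actually listening for the UDP heartbeats:
ssh root@<octavia-host> 'ss -lunp | grep 5555'
```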
spatel | This is what i have - http://paste.openstack.org/show/800864/ | 19:54 |
admin0 | spatel, i have exactly what you have | 19:54 |
johnsom | jrosser +1 , It's extremely unlikely it is an SG or iptables rule issue. Like I said, many people have deployed this over years and this part hasn't changed. | 19:55 |
johnsom | spatel Yeah, that looks good for a load balancer with a VIP on port 80 | 19:55 |
admin0 | so you guys see online status .. with the same security group, i see it as offline | 19:56 |
admin0 | means something else | 19:56 |
spatel | admin0: enable debug on octavia side and you will see something in logs why its offline | 19:56 |
admin0 | i will maybe tcpdump all outgoing packets and see | 19:56 |
admin0 | from the container | 19:56 |
spatel | admin0: yes i am seeing online | 19:57 |
johnsom | Are you seeing stats increment when you hit the VIP endpoint? | 19:57 |
johnsom | If not, you have a networking problem on the lb-mgmt-net. If they are incrementing, check that your LB isn't disabled via admin state down. | 19:57 |
masterpe | admin0: you are able to connect to the Lbaas interface on port 22? | 19:58 |
admin0 | johnsom, this is what i see https://pasteboard.co/JE0QxcL.png .. i do not see stats | 19:58 |
*** nurdie has joined #openstack-ansible | 19:58 | |
johnsom | Yeah, horizon doesn't display them, use the CLI | 19:58 |
admin0 | masterpe, the lbaas has an internal IP and a vip .. the VIP i understood is for the load balancer .. i tried ssh to the internal ip, but there is no security group rule | 19:58 |
admin0 | let me enable ssh and try again | 19:58 |
johnsom | openstack loadbalancer stats show <lb ID> | 19:59 |
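johnsom's check as a small loop — generate traffic against the VIP, then see whether the counters move (the LB id and VIP are placeholders; counters that never move despite traffic point at the lb-mgmt-net, per his diagnosis above):

```
LB_ID=<lb-id>
for i in 1 2 3; do curl -s http://<vip>/ > /dev/null; done
openstack loadbalancer stats show "$LB_ID"
# bytes_in / total_connections staying at 0 despite the curls
# => heartbeats from the amphora are not reaching the health manager
```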
masterpe | For ssh there is an OSA variable: octavia_ssh_enabled | 20:00 |
masterpe | But I think you can also add it to the octavia_sec_grp SG | 20:00 |
spatel | http://paste.openstack.org/show/800865/ | 20:01 |
spatel | admin0: you need to upload ssh-key using octavia account (i mostly do that using horizon) | 20:02 |
admin0 | that is done | 20:02 |
johnsom | You will need to enable ssh via the octavia config (or OSA setting that setting) and then build a new amphora. Otherwise the SSH key won't be there. | 20:02 |
admin0 | i have ssh enable as well | 20:02 |
*** nurdie has quit IRC | 20:02 | |
admin0 | ok .. so even my lb is working , curl VIP round-robins between web1 and web2 .. my stats are all zero | 20:03 |
spatel | I have downloaded the amphora image and stuck my password in it so i don't use an SSH key :) so i have the freedom to ssh from any host | 20:03 |
admin0 | what happens if i delete the default amphora image | 20:04 |
admin0 | will it create a new one with the key ? | 20:04 |
spatel | you don't need to delete image at this point, just upload ssh-key and see if you can ssh in | 20:05 |
*** luksky has joined #openstack-ansible | 20:05 | |
admin0 | so i added ssh and icmp to the security groups .. but from the octavia_container, i cannot ping or even ssh .. all i can do that returns something is: openssl s_client -connect <amphora-ip>:9443 | 20:05 |
spatel | In my case i have downloaded the qcow2 image separately and edited that image to add my password and uploaded it to glance (that way i don't need an ssh key, i can just ssh root@amphora-ip) | 20:06 |
*** nurdie has joined #openstack-ansible | 20:08 | |
spatel | tonight i will add all these troubleshooting steps to my blog so it would be easy for future users. | 20:09 |
admin0 | https://gist.githubusercontent.com/a1git/671a00479386859f7f09697d522d778d/raw/1687a5a003972c78720a8680c37883f4ae4306d8/gistfile1.txt == this is what i started to see in hypervisor now | 20:10 |
admin0 | johnsom, is there a manual way to validate that the container can talk to port 5555 and that the firewall is not preventing the stats ? | 20:11 |
spatel | admin0: i am seeing same error in my neutron logs | 20:12 |
spatel | I thought this is me only, but now you are also seeing that | 20:12 |
admin0 | still everything seem to work .. and seems to be octavia related | 20:12 |
spatel | (nf_tables): CHAIN_USER_DEL failed (Device or resource busy): chain neutronARP-tap2cdc4ac7-46 | 20:12 |
admin0 | this is a complete greenfield env .. just set it up to do octavia only | 20:12 |
admin0 | i can add your ssh keys if interested to explore it | 20:13 |
spatel | worth opening bug to see what community thinking about it | 20:13 |
admin0 | i am going to try putting octavia service in debug mode, and also run tcpdump to check what ports its trying to connect but not able to | 20:13 |
admin0 | so that the offline-status can be online, and the stats are there | 20:14 |
johnsom | admin0 If you enter the container for your octavia health manager process, you can tcpdump -nli <interface name> udp port 5555. You should see one packet every ten seconds | 20:15 |
admin0 | ' container for your octavia health manager process ' -- where is this ? | 20:16 |
jrosser | admin0: are you on ubuntu focal for your compute node? | 20:16 |
admin0 | yep | 20:16 |
johnsom | on your controller(s) | 20:16 |
johnsom | If you have more than one controller in your OSA deployment, it may be longer than ten seconds for one controller to get a packet as it rotates through them randomly | 20:17 |
spatel | In my setup i am seeing 5555 udp ping | 20:17 |
admin0 | johnsom, i only have 1 controller for this, and it has a single container called: c1_octavia_server_container-bd64dff3 | 20:17 |
jrosser | it looks like your error on the compute node is this https://review.opendev.org/c/openstack/neutron/+/765408 | 20:17 |
-spatel- 12:16:58.123393 IP 10.62.7.156.55886 > 10.62.7.76.5555: UDP, length 292 | 20:17 | |
-spatel- 12:17:14.647583 IP 10.62.7.143.55887 > 10.62.7.76.5555: UDP, length 291 | 20:17 | |
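The heartbeat check johnsom suggests, sketched end-to-end. The container name is taken from admin0's deployment and the interface name (eth14 appears later in this log) is deployment-specific; 5555/udp is the default Octavia health manager port:

```shell
# on the controller, attach to the octavia container
lxc-attach -n c1_octavia_server_container-bd64dff3

# listen for amphora heartbeats on the lb-mgmt-net interface; expect
# roughly one packet per amphora every 10 seconds (the default
# heartbeat interval), as in spatel's capture above
tcpdump -nli eth14 udp port 5555
```

Seeing nothing here while the LB passes traffic confirms the lb-mgmt-net (or the route back to the health manager) is broken.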
admin0 | wait .. i should have more containers ? | 20:18 |
johnsom | admin0 Ok, jump in that container. | 20:18 |
johnsom | OSA used to have one container for each process, but maybe that has changed? | 20:18 |
johnsom | One for API, one for worker, one for health manager, and one for house keeping processes | 20:19 |
spatel | admin0: octavia-health proc run inside same octavia-server container (there will be 4 daemon) | 20:19 |
spatel | in short everything will run inside single octavia container | 20:19 |
johnsom | Ok, so that has changed since I have deployed. That is fine, they are low-load processes | 20:19 |
admin0 | https://gist.github.com/a1git/3f6fcbe73f9dff34ecc84d4413bb440b | 20:19 |
admin0 | that is what it has | 20:20 |
spatel | This is what i have - http://paste.openstack.org/show/800866/ | 20:20 |
johnsom | Yep, they are all there | 20:20 |
johnsom | What do you have for "ip link"? | 20:20 |
spatel | jrosser: that bug isn't merged yet right? I am seeing same error all over the place | 20:21 |
admin0 | https://gist.github.com/a1git/b4fa23d9011d3d35ae65ca11ba7f19a9 | 20:21 |
jrosser | spatel: correct, bad neutron bug | 20:21 |
admin0 | i manually added the 10.63.0.3/16 as that is the range i gave for lbaas-mgmt | 20:22 |
jrosser | i would suggest forking the neutron repo, cherry pick that and override to point to your own git repo | 20:22 |
spatel | jrosser: that would be a good idea until it merges to victoria | 20:22 |
jrosser | spatel: it's also ussuri, and from an OSA perspective we can't do anything until that merges in the real neutron repo | 20:23 |
admin0 | i am hitting the load balancer, but tcpdump -nli any udp port 5555 produces nothing at all | 20:23 |
jrosser | though as an end user there are all the hooks for you to work around locally | 20:23 |
johnsom | Yeah, ok, so the lb-mgmt-net is broken somehow. | 20:23 |
spatel | johnsom: if lb-mgmt-net is broken then octavia should keep deleting and re-creating amphora right? | 20:24 |
johnsom | My guess is, since you manually added 10.63.0.3/16, the amphora on the lb-mgmt-net interface don't have a route back to this address. | 20:25 |
johnsom | spatel No, due to other nova issues, we don't start the failover clock until we have received at least one heartbeat. | 20:25 |
admin0 | its in the same L2 | 20:25 |
jrosser | it does seem to be a big red flag that you've got an exception on the compute node to do with security groups and a known neutron bug which hasn't had its fix merged yet | 20:26 |
johnsom | Can you paste the output of a "openstack server list" for the amphora service VM? | 20:26 |
admin0 | that i have to do as service/octavia | 20:27 |
admin0 | one moment | 20:27 |
johnsom | jrosser Yeah, could be. | 20:28 |
*** cshen has quit IRC | 20:43 | |
admin0 | ok .. question . i have octavia_management_net_subnet_cidr: 10.62.0.0/21 in user_variables, but in user_config, under lbaas: 172.29.232.0/22 -- do they have to be the same ? | 20:46 |
admin0 | coz i followed first rackspace, then mgariepy and then spatel, i think i may have this inconsistency | 20:48 |
admin0 | thank you johnsom for pointing it out | 20:48 |
admin0 | so checking if mgariepy and spatel have the same, or diff .. and if they should match | 20:48 |
*** cshen has joined #openstack-ansible | 20:48 | |
jrosser | they should match, one is the config for the OSA containers and their interfaces | 20:50 |
spatel | lbaas and octavia_management_net_subnet_cidr should be same | 20:50 |
jrosser | the other is for the neutron provider network, and these must line up | 20:50 |
admin0 | aha .. heartbeat issue should be solved after this .. | 20:50 |
admin0 | let me work on it .. | 20:50 |
admin0 | then the remaining will be that error that both spatel and i see | 20:50 |
spatel | https://review.opendev.org/c/openstack/neutron/+/765408 | 20:52 |
spatel | we need to wait for that code to merge; before that we have to do some hacks to roll out the patch | 20:52 |
spatel | admin0: in my block i have used same network lbaas: 172.27.40.0/24 / octavia_management_net_subnet_allocation_pools: 172.27.40.200-172.27.40.250 | 20:53 |
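The two settings being compared look roughly like this, using the values from spatel's example. The exact file layout is a sketch (OSA splits these across `openstack_user_config.yml` and `user_variables.yml`, and the subnet CIDR line is inferred from the allocation pool spatel quoted):

```yaml
# openstack_user_config.yml - the lbaas network the OSA containers attach to
cidr_networks:
  lbaas: 172.27.40.0/24

# user_variables.yml - the neutron subnet octavia creates on that same L2;
# it must be carved out of the same CIDR as the container network above
octavia_management_net_subnet_cidr: 172.27.40.0/24
octavia_management_net_subnet_allocation_pools: 172.27.40.200-172.27.40.250
```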
admin0 | mine got mixed up following multiple ways to get it done | 20:53 |
admin0 | but i have some clarity now | 20:53 |
spatel | :) | 20:53 |
admin0 | finally | 20:54 |
spatel | admin0: This is how you learn, if it works in first shot then you won't get chance to learn underlying technology :) | 20:56 |
spatel | After that try Senlin also for fun | 20:56 |
spatel | my rabbitMQ isn't happy on production, may be i will rebuild whole cluster tomorrow, its too late to poke that beast.. | 21:00 |
admin0 | so if the ip range and mgmt range are to be the same, then .. in a 3 controller setup, there will be 3 ips assigned to controllers .. that openstack will have no idea of .. and the chance of the same ip being assigned to an amphora is there | 21:07 |
admin0 | because as far as i can recall, the containers use the ip other than the reserved ips | 21:07 |
admin0 | and because these containers are created automatically, we cannot predict these reserved ips | 21:07 |
spatel | you can use used_ips to reserve some ips | 21:09 |
spatel | i kept some IP for future use and rest for lbaas-mgmt-net | 21:09 |
admin0 | you mean manually editing the eth14 ip to make it fall inside the reserved range .. means manual changes to the inventory file | 21:10 |
admin0 | user config has this now: lbaas: 10.62.0.0/21 . reserved: - "10.62.0.1,10.62.0.50" user variables have: octavia_management_net_subnet_allocation_pools: 10.62.0.51-10.62.7.250 | 21:11 |
admin0 | but when the lxc is created, it used 10.62.7.19 | 21:11 |
admin0 | it means out of 1000s of ips, the chance that one of the 3 amphorae created will match a container ip is there, unless that container ip is manually specified somehow via ansible | 21:12 |
admin0 | i can manually change the container ip to make it within .50 and edit the inventory .. unless a better way exists | 21:13 |
spatel | This is my production example | 21:16 |
spatel | lbaas: 10.62.0.0/21 | 21:17 |
spatel | used_ips: - "10.62.0.1,10.62.6.255" | 21:17 |
spatel | octavia_management_net_subnet_allocation_pools: 10.62.4.1-10.62.7.254 | 21:17 |
spatel | from /21 subnet i gave 10.62.4.1-10.62.7.254 to amphora | 21:18 |
admin0 | yes, now the lxc container also need an ip right, which is picked outside of 10.62.0.1,10.62.6.255 .. .. | 21:18 |
admin0 | so there is a chance it will overlap with your allocation_pool | 21:18 |
spatel | octavia_management_net_subnet_allocation_pools: this pool will be controlled by DHCP so don't use those IPs anywhere manually | 21:19 |
admin0 | that is what i am saying | 21:19 |
spatel | yes.. | 21:19 |
admin0 | the ip picked at random by ansible for the lxc , the dhcp will not know about it | 21:19 |
spatel | yup | 21:19 |
admin0 | so there is a chance that if you dont change your lxc container ip manually and fix the inventory, an amphora can overlap with the container | 21:20 |
admin0 | causing a possible service outage due to both ips being in the same vlan | 21:20 |
spatel | possible | 21:20 |
jrosser | you should make these ranges mutually exclusive | 21:20 |
admin0 | right | 21:20 |
admin0 | or just edit the inventory and fix the container .. | 21:21 |
jrosser | in openstack_user_config ensure that the range you give to neutron is included in used_ips | 21:21 |
jrosser | and put at least the opposite of that in octavia_management_net_subnet_allocation_pools | 21:21 |
jrosser | well you know what i mean..... | 21:22 |
admin0 | yep | 21:22 |
spatel | I need to fix my range also :) 10.62.4.1-10.62.7.254 this is overlapped | 21:22 |
admin0 | isn't it also possible to just manually pass an IP to a container ? | 21:22 |
jrosser | it is possible but ugly | 21:22 |
jrosser | best to just chop a small range off the front of the subnet and let the OSA inventory allocate from that | 21:22 |
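The overlap admin0 is worried about can be checked mechanically. A small sketch, not part of OSA, just illustrating why the `used_ips` range reserved for containers and the octavia DHCP allocation pool must be mutually exclusive (the sample ranges are spatel's from above):

```python
import ipaddress

def ranges_overlap(start_a, end_a, start_b, end_b):
    """True if the inclusive IP ranges [start_a, end_a] and [start_b, end_b] intersect."""
    a1, a2 = ipaddress.ip_address(start_a), ipaddress.ip_address(end_a)
    b1, b2 = ipaddress.ip_address(start_b), ipaddress.ip_address(end_b)
    # two inclusive ranges intersect iff the larger start is <= the smaller end
    return max(a1, b1) <= min(a2, b2)

# used_ips kept for OSA containers vs. the octavia DHCP pool:
# spatel's original ranges intersect, so an amphora could collide with a container
print(ranges_overlap("10.62.0.1", "10.62.6.255", "10.62.4.1", "10.62.7.254"))  # True
# chopping the container range off the front of the /21 makes them disjoint
print(ranges_overlap("10.62.0.1", "10.62.3.255", "10.62.4.1", "10.62.7.254"))  # False
```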
spatel | jrosser: what is the status of CentOS 8.3 builds, are you seeing them passing? | 21:23 |
jrosser | it is late, i am not currently looking at them much | 21:23 |
spatel | i am thinking to upgrade my 8.2 to 8.3 | 21:23 |
admin0 | spatel, https://lists.centos.org/pipermail/centos-announce/2020-December/048208.html | 21:24 |
spatel | we had very long discussion this morning about that :) | 21:25 |
admin0 | hehe | 21:25 |
spatel | i will test ubuntu in my lab to see how we can shift from centos8 | 21:26 |
admin0 | i like centos .. just don't like being helpless when in upgrade and cannot fix the inter-dependency issues .. or the package being a bit too older than required | 21:28 |
admin0 | ok . testing octavia again | 21:29 |
spatel | I never used ubuntu in my entire career :) | 21:30 |
*** nurdie has quit IRC | 21:32 | |
admin0 | ok .. so i nuked the default amphora, re-created the octavia lxc container with the correct ip range | 21:42 |
admin0 | but the amphora is not coming up again .. is there a way to kickstart it ? | 21:42 |
admin0 | the logs do not show any attempt .. i already ran the playbooks twice .. | 21:43 |
spatel | make sure the lbaas-mgmt-net is there + check compute nova logs etc | 21:48 |
*** pto_ has joined #openstack-ansible | 21:52 | |
*** pto has quit IRC | 21:54 | |
*** ebbex has quit IRC | 21:55 | |
*** ebbex has joined #openstack-ansible | 21:55 | |
*** pto_ has quit IRC | 22:05 | |
*** pto has joined #openstack-ansible | 22:05 | |
*** luksky has quit IRC | 22:18 | |
*** spatel has quit IRC | 22:19 | |
*** cshen has quit IRC | 22:31 | |
*** luksky has joined #openstack-ansible | 22:32 | |
admin0 | jrosser, this is actually breaking neutron now ( and without the agent, nova fails to spawn any new vms) | 22:45 |
*** cshen has joined #openstack-ansible | 22:51 | |
admin0 | looks like this https://bugs.launchpad.net/neutron/+bug/1887281 | 22:53 |
openstack | Launchpad bug 1887281 in neutron "[linuxbridge] ebtables delete arp protect chain fails" [Medium,Fix released] - Assigned to Lukas Steiner (steinerlukas) | 22:53 |
jrosser | admin0: yes, that is the bug i posted the fix for earlier | 22:57 |
jrosser | from an OSA perspective we can't do anything until that merges | 22:57 |
admin0 | how do i cherry pick it | 22:57 |
admin0 | when might it merge | 22:57 |
jrosser | but there are all the hooks you need to override your version | 22:57 |
admin0 | and can it be back-merged to old tags ? | 22:57 |
admin0 | aah | 22:57 |
admin0 | oh yes .. i forgot about those | 22:57 |
jrosser | it is in neutron, not OSA, so it is up to the neutron team to merge it | 22:57 |
jrosser | then when OSA makes a point release it will be automatic for OSA upgrades to get that | 22:58 |
admin0 | will it see 21.3.0 ? | 22:58 |
jrosser | what i would do here is fork the neutron repo to your github or something, and make a branch off of stable/ussuri | 22:58 |
jrosser | then cherry pick the patch from gerrit onto the tip of that branch | 22:59 |
admin0 | i use tags .. on 21.1.0 | 22:59 |
jrosser | no | 22:59 |
jrosser | that is an OSA version and has no meaning for neutron | 22:59 |
admin0 | ok | 22:59 |
jrosser | when you have done that there is an example here https://docs.openstack.org/openstack-ansible/latest/user/source-overrides/index.html | 23:00 |
jrosser | use the stuff under "Overriding other upstream projects source code" but switch it for neutron instead of glance | 23:00 |
jrosser | and point to your patched repo | 23:00 |
jrosser | this is neutron itself you need to patch, not os_neutron ansible role or openstack-ansible | 23:01 |
admin0 | when i click the cherry pick ussuri, i get this url https://review.opendev.org/c/openstack/neutron/+/765408 | 23:02 |
admin0 | is that the url i need to point to ? | 23:02 |
jrosser | no, like i say, you need to fork the whole neutron repo yourself on github | 23:03 |
jrosser | standard github things, not gerrit | 23:03 |
jrosser | or wherever you would need to host your modified version of neutron | 23:03 |
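The workflow jrosser describes, sketched end-to-end. The fork URL, branch name, and gerrit fetch ref are placeholders (gerrit's "download" box on the review page gives the exact ref), and the two override variables follow the pattern in the source-overrides doc linked above:

```shell
# fork openstack/neutron on github, then locally:
git clone https://github.com/<your-fork>/neutron
cd neutron
git checkout -b ussuri-ebtables-fix origin/stable/ussuri

# cherry-pick the fix from gerrit review 765408 onto the tip of the branch
git fetch https://review.opendev.org/openstack/neutron <refs/changes/...>
git cherry-pick FETCH_HEAD
git push origin ussuri-ebtables-fix

# then in /etc/openstack_deploy/user_variables.yml point OSA at the fork:
#   neutron_git_repo: https://github.com/<your-fork>/neutron
#   neutron_git_install_branch: ussuri-ebtables-fix
# and re-run the os-neutron playbooks
```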
*** gshippey has quit IRC | 23:03 | |
admin0 | oh | 23:04 |
admin0 | but if i wait and its merged, since it seems to break prod.. would it alone be enough to make a 21.3.0 tag ? | 23:05 |
admin0 | or if i wait it out, how does it make its way to our osa release ? | 23:06 |
*** rfolco has quit IRC | 23:06 | |
jrosser | every ~2 weeks we release a new tag of OSA which picks up the SHA of the tip of whatever the corresponding stable/<release> branches are for all of nova/neutron/keystone/... | 23:07 |
jrosser | this really is what an OSA release is | 23:07 |
jrosser | all of the git SHA are moved forward together and validated as a set in CI | 23:07 |
jrosser | admin0: take a look at this https://github.com/openstack/openstack-ansible/commit/1973fee1f3c172ad65a7440864092be237a01b10 | 23:10 |
jrosser | that is a patch which "makes a release" of OSA on a stable branch | 23:10 |
jrosser | it is nothing more than updating a whole heap of git repo SHA | 23:10 |
*** kukacz has quit IRC | 23:14 | |
*** kukacz has joined #openstack-ansible | 23:16 | |
*** rfolco has joined #openstack-ansible | 23:21 | |
*** cshen has quit IRC | 23:28 | |
*** nurdie has joined #openstack-ansible | 23:40 | |
admin0 | sorry my irc got disconnected | 23:41 |
admin0 | so if i wait 2 weeks, if its merged by then, it will be in the next release | 23:41 |
admin0 | thanks jrosser . | 23:41 |
admin0 | will check/learn tomorrow on how to do it manually from you ..late for today | 23:42 |
*** nurdie has quit IRC | 23:42 | |
*** rfolco has quit IRC | 23:53 | |
*** rfolco has joined #openstack-ansible | 23:53 | |
*** rfolco has quit IRC | 23:58 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!