noonedeadpunk | mornings | 08:52 |
---|---|---|
jrosser | good morning | 08:53 |
jrosser | upgrade bug question in #openstack fwiw | 08:53 |
* noonedeadpunk just joined the channel | 08:54 | |
noonedeadpunk | Was it same as a bug report? https://bugs.launchpad.net/openstack-ansible/+bug/2048842 ? | 08:54 |
* noonedeadpunk checking eavesdrop | 08:55 | |
jrosser | blues11> I was trying to upgrade from 'Zed' to 'Antelope' using OpenStack ansible upgrade script and get the below error https://paste.openstack.org/show/bMGGLkzALkRFYvvGbmab/ When I traced the error I noticed that the monitoring service was down, we noticed that a variable in ceph conf(located in monitor lxc container) is not expanded properly and I reckon this caused ceph monitor service down Has anyone had similar issues | 08:55 |
jrosser | or any clue about this? | 08:55 |
jrosser | i think thats the same as the bug you linked | 08:57 |
noonedeadpunk | yeah | 08:57 |
noonedeadpunk | it feels like we've landed most of bugfixes? | 08:59 |
jrosser | i think pretty much | 09:01 |
jrosser | maybe we are close to some point releases? | 09:02 |
noonedeadpunk | yeah exactly what I was thinking about | 09:07 |
noonedeadpunk | Well, this 1 potentially good to have - it can break Ironic CI (or better say Swift) | 09:08 |
noonedeadpunk | https://review.opendev.org/c/openstack/openstack-ansible/+/904941 | 09:08 |
noonedeadpunk | This is Magnum Octavia backport: https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/904510 | 09:10 |
noonedeadpunk | Other than that it looks like we're good to go | 09:11 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: WIP - Bootstrapping playbook https://review.opendev.org/c/openstack/openstack-ansible-ops/+/902178 | 09:27 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 09:28 |
jrosser | ^ i had some better idea about how to put this together in CI | 09:31 |
noonedeadpunk | I wonder if it's worth to add `/` to the end of src? As I guess you want to copy only some files from there? | 09:39 |
noonedeadpunk | rather than the directory altogether? | 09:39 |
noonedeadpunk | As at this point there will already be /etc/openstack_deploy (it's also done at pre-stage, but will be earlier anyway) | 09:40 |
jrosser | somehow it runs no zuul job at all | 09:40 |
jrosser | openstack_deploy is made with pre playbook? i did not spot that | 09:42 |
andrewbonney | jrosser: looks like there's a mis-quoting on L23 of the playbook | 09:42 |
noonedeadpunk | jrosser: jobs.yaml | 09:42 |
noonedeadpunk | not .yml | 09:42 |
noonedeadpunk | IIRC zuul was picky about that | 09:42 |
jrosser | argh :) | 09:42 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 09:43 |
noonedeadpunk | yeah, quotes are wrong as well:) | 09:43 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 09:44 |
* jrosser goes to get coffee | 09:44 | |
jrosser | brain not engaged yet today | 09:44 |
noonedeadpunk | still nothing, huh | 09:47 |
noonedeadpunk | I would expect at least something to be scheduled or error out as a gerrit comment tbh | 09:47 |
jrosser | yeah usually you get some description of whats wrong | 09:48 |
noonedeadpunk | not saying I don't see anything wrong.... | 09:48 |
noonedeadpunk | but I definitely do like the idea :) | 09:49 |
opendevreview | Merged openstack/openstack-ansible-os_magnum stable/zed: Add missing magnum octavia client configuration https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/904510 | 11:23 |
noonedeadpunk | I in fact can't reproduce magnum upgrade issue with tls scenario :( | 11:25 |
noonedeadpunk | But indeed `No image found with ID fedora-coreos-latest _image_get ` is in glance logs | 11:28 |
noonedeadpunk | Though it might be fine when client tries to resolve name to ID.... | 11:29 |
jrosser | oh well you might need to use uuid there | 11:30 |
jrosser | i see similar for the cluster api stuff and have not looked actually where that is problematic | 11:31 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: WIP - Bootstrapping playbook https://review.opendev.org/c/openstack/openstack-ansible-ops/+/902178 | 11:35 |
Tadios | o/, Hasnt this been fixed already? After a fresh deployment, specifically after the os-neutron-install playbook, the cpu usage on the infra nodes spike up and i have "ValueError: non-zero flags not allowed in calls to send() on <class 'eventlet.green.ssl.GreenSSLSocket'>" error on my neutron-server service https://paste.openstack.org/show/822819/ | 11:52 |
Tadios | basically performing the following will resolve the issue, ansible neutron_server -m shell -a "systemctl restart neutron-server", i am on 2023.1 | 11:52 |
noonedeadpunk | Tadios: frankly speaking - was not aware of this issue | 11:56 |
Tadios | noonedeadpunk: oh i thought it was somehow related to this bug https://bugs.launchpad.net/openstack-ansible/+bug/2027854 | 11:59 |
noonedeadpunk | ahhhhhhhhhhhhhhhhhhhhhhhhhhhhh | 11:59 |
* noonedeadpunk short on memory :D | 11:59 | |
jrosser | noonedeadpunk: you are right about problems with my os_magnum patch, openstack-ansible pre-osa-aio.yml does all the bootstrapping before the os_magnum pre.yaml playbook gets run /o\ | 12:00 |
noonedeadpunk | Tadios: IIRC this was actually related to the OVS bug itself, that has been fixed in 2.17.3 | 12:01 |
noonedeadpunk | jrosser: it's easy to fix though? | 12:01 |
noonedeadpunk | Like you need not to copy directory, but just content of directory? | 12:01 |
jrosser | but it is too late | 12:01 |
jrosser | bootstrap-ansible and bootstrap-aio have already happened | 12:01 |
noonedeadpunk | ah, you mean that | 12:01 |
noonedeadpunk | for u-c-r | 12:02 |
noonedeadpunk | Ah! | 12:02 |
noonedeadpunk | I think you can pass some Zuul var to prevent this | 12:02 |
Tadios | noonedeadpunk: anything i need to do on my end when deploying? i pretty much cloned and start deployment from 27.3.0 | 12:03 |
noonedeadpunk | And that is Rocky/Centos? | 12:04 |
noonedeadpunk | Nah, unlikely.... | 12:04 |
Tadios | noonedeadpunk: ubuntu 22.04 | 12:04 |
noonedeadpunk | or well. I meant that probably it does not depend on OS then | 12:05 |
noonedeadpunk | so this fix should have been included in 27.3.0 | 12:05 |
Tadios | ya that's why am confused | 12:05 |
noonedeadpunk | But, the thing is that upper constraints were not updated. So inside neutron venv likely still old ovs python package is installed | 12:07 |
noonedeadpunk | So this can be the root-cause of the issue | 12:07 |
noonedeadpunk | OSA follows generic upper-constraints, and I'm not sure about really good way to workaround this, except fork the repo and edit the version of ovs in u-c | 12:08 |
noonedeadpunk | As our proposals to update ovs version there were rejected | 12:08 |
jrosser | i do wonder if we should implement some constraints override mechanism | 12:11 |
noonedeadpunk | Tadios: so that is the point of your interest I believe: https://opendev.org/openstack/requirements/src/branch/stable/2023.1/upper-constraints.txt#L185 | 12:12 |
noonedeadpunk | and then you can override repo of requirements here: https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/all/source_git.yml#L19-L20 | 12:13 |
Tadios | noonedeadpunk: what do i do here exactly, https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/all/source_git.yml#L19-L20 am confused can i set ovs to a specific version here? | 12:15 |
jrosser | Tadios: you should fork the openstack/requirements repo on github | 12:15 |
jrosser | patch it as you need | 12:16 |
jrosser | and then override those variables to point to your fork | 12:16 |
Tadios | jrosser: ohh, got it, Thank you. | 12:16 |
jrosser | this is truly terrible user experience unfortunately, but as noonedeadpunk says our attempts to fix this directly in the requirements repo have not been successful | 12:17 |
Tadios | why does restarting neutron-server fix the issue though? | 12:17 |
jrosser | thats an interesting question | 12:18 |
jrosser | Tadios: by any chance were you able to reproduce this in an all-in-one build? | 12:19 |
Tadios | jrosser: no i faced this issue twice, and they are both on multinode deployment | 12:19 |
jrosser | so you've not tried, or it's OK in AIO? | 12:20 |
Tadios | no i haven't tried | 12:20 |
jrosser | ok cool | 12:20 |
noonedeadpunk | jrosser: looking into current code, I don't think there's a variable available to skip bootstrap at pre-stage | 12:25 |
jrosser | its just unlucky the order zuul runs the playbooks | 12:26 |
jrosser | parent first then child job | 12:26 |
noonedeadpunk | there's one in scripts/gate-check-commit.sh | 12:26 |
noonedeadpunk | but not in zuul | 12:26 |
jrosser | actually maybe thats the answer - make a new parent job that doesnt have the pre-playbooks | 12:27 |
noonedeadpunk | I think it should be easy to add another condition here: https://opendev.org/openstack/openstack-ansible/src/branch/master/zuul.d/playbooks/pre-osa-aio.yml#L43 | 12:27 |
jrosser | yeah but then i actually do want to run it | 12:27 |
noonedeadpunk | you will? | 12:28 |
* jrosser confused | 12:28 | |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/branch/master/zuul.d/playbooks/run.yml#L23 | 12:28 |
noonedeadpunk | or well... | 12:28 |
noonedeadpunk | the problem of new job is that you'd need to have huuuge list of required-projects | 12:31 |
jrosser | naah they aggregate | 12:32 |
jrosser | oh sorry yes i see | 12:32 |
jrosser | so - maybe another question is why do we have pre-osa-aio.yml as a zuul pre playbook | 12:35 |
noonedeadpunk | so it was made initially to reduce our timeout limit in zuul | 12:36 |
jrosser | do we break anything by moving that from pre-run: to run: in the base job config | 12:36 |
noonedeadpunk | as timeout is calculated for run step separately from all pre-post steps iirc | 12:36 |
jrosser | then a child job pre: playbook would go first | 12:36 |
noonedeadpunk | and a failure in a pre step also triggers a retry of the job, whereas a failure in run is just a failure | 12:37 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add support for extra Python packages inside OSA venv https://review.opendev.org/c/openstack/openstack-ansible/+/905221 | 14:17 |
opendevreview | Merged openstack/openstack-ansible stable/2023.2: Skip installing curl for EL https://review.opendev.org/c/openstack/openstack-ansible/+/904845 | 14:23 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Add zuul job which does not run pre- playbooks. https://review.opendev.org/c/openstack/openstack-ansible/+/905250 | 16:43 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 16:45 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Add zuul job which does not run pre- playbooks. https://review.opendev.org/c/openstack/openstack-ansible/+/905250 | 16:47 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 17:14 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 18:01 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Return back /healtcheck URI verification https://review.opendev.org/c/openstack/openstack-ansible/+/904941 | 19:09 |
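
The workaround discussed in the log (fork openstack/requirements, edit the ovs pin in upper-constraints.txt, then point OSA at the fork) could be scripted roughly as below. This is a minimal sketch and not something posted in the channel: `pin_constraint` is a hypothetical helper, the version strings in the usage example are made up, and the only assumption about the file is the `name===version` line format (optionally followed by a `;marker`) that upper-constraints.txt uses.

```python
import re


def pin_constraint(constraints_text: str, package: str, version: str) -> str:
    """Return constraints_text with `package` pinned to `version`.

    Hypothetical helper: assumes lines of the form `name===version`,
    optionally followed by an environment marker such as `;python_version...`.
    """
    # Require `===` right after the name so that e.g. `ovs` does not
    # accidentally match `ovsdbapp`.
    pattern = re.compile(
        rf"^(?P<name>{re.escape(package)})===(?P<ver>[^;\s]+)(?P<rest>.*)$",
        re.IGNORECASE,
    )
    out = []
    replaced = False
    for line in constraints_text.splitlines():
        match = pattern.match(line)
        if match:
            # Keep the original name and any trailing marker, swap the version.
            out.append(f"{match.group('name')}==={version}{match.group('rest')}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        raise KeyError(f"{package} not found in constraints")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up versions, for illustration only.
    sample = "netaddr===1.0.0\novs===1.0.0\novsdbapp===1.0.0\n"
    print(pin_constraint(sample, "ovs", "1.0.1"), end="")
```

After patching the file in the fork, the requirements repo variables linked from source_git.yml in the log would still need to be overridden to point at that fork, as jrosser describes.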