Thursday, 2024-09-05

noonedeadpunkdoes anybody recall how we fixed `ModuleNotFoundError: No module named 'packaging'` for ceph-ansible?07:47
noonedeadpunkas I bet I saw that on master and we did patch that... but can't find it07:48
noonedeadpunkhttps://opendev.org/openstack/openstack-ansible-openstack_hosts/commit/434602a59e8d503bf4c0cbf47a358b1d822777aa07:50
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts stable/2023.2: Ensure python3-packaging is installed for distros  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/92811907:51
jrossero/ morning08:27
noonedeadpunko/08:45
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server master: Include feature flags enablement only during upgrades  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/92812409:13
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server master: Include feature flags enablement only during upgrades  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/92812409:18
jrosserdo we have actual breakage on the horizon role `Failed to set module mpm_event to enabled`09:38
jrosseri see that on two jobs here https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/92795309:38
jrosserwhich suggests that we're not running horizon on the metal jobs elsewhere, which might not be ideal09:38
grauzikasHello, I've been testing OSA for some time and noticed an issue when reinstalling everything. I use these steps: 1: remove instances, 2: remove all ceph pools and OSDs, 3: remove all lxc containers, 4: remove from all nodes every package I could find, like haproxy, keepalived, ovs, ovn, remove openstack directories and so on, then on compute nodes remove libvirtd and other stuff, 5: wipe drives on ceph osd nodes and so on, 09:47
grauzikasremove the /etc/openstack_deploy dir on the deploy node. When rerunning the playbooks I always hit the issue that the OSA playbook can't start the libvirtd socket service, and then I need to manually connect to the compute nodes, restart these services and rerun the playbook: https://paste.openstack.org/show/b8JnnDlhujF7JevNnJVg/ then everything works. Maybe something is missing in the playbook?09:47
jrosserthe playbook is not tested against anything except freshly installed systems so it is quite possible that something is not accounted for with a manually cleaned up node09:49
jrossergrauzikas: `"Unable to start service libvirtd.socket: Job failed. See \"journalctl -xe\" for details.\n"` <-  did you do this?09:50
jrosserif you have some good PXEboot setup it is simpler to just make fresh nodes each time09:54
grauzikashttps://paste.openstack.org/show/b7PIh00yuFDOQ6L7fGBO/09:57
grauzikasno, I don't have pxe, so I'm simply trying to remove everything that osa applied10:00
grauzikasso now waiting till it reaches nova, then crashes, then restarting these services and running again :)10:02
grauzikasor maybe next time I will append a restart of the services to the playbook10:03
*** rambo is now known as rambo241210:18
rambo2412Hi All This is regarding Ussuri to Victoria upgrade which I am discussing for last couple of days. I am checking each of the playbook to forecast the expected impact. now I am checking rabbitmq_install.yml playbook.10:20
rambo2412I can see there are 2 main tasks (we have 3 rabbitmq hosts). As I interpret it, first the rabbitmq service is stopped on host2 and host3, and then the upgrade is done for all three hosts sequentially, starting with host1. So I need to confirm: will there be any time when all three rabbitmq services are down?10:22
noonedeadpunkyes, there will be10:29
rambo2412okay thanks for confirmation.10:34
noonedeadpunkbut it usually doesn't cause too much trouble, as oslo.messaging ensures reconnects to rabbitmq once it's up10:45
vicentHi! I am trying OSA 29.0.2 with the services on metal and I seem to have a problem with nova-compute not finding nova-rootwrap. https://paste.openstack.org/show/bSEP8iyuNc7PNSOADyx7/ Any idea of what could be wrong?11:28
noonedeadpunkhey11:55
noonedeadpunkdarn good question11:56
noonedeadpunkugh, I don't have any sandbox handy with caracal to check on that12:07
noonedeadpunkvicent: don't you accidentally have distro install path?12:07
vicentnoonedeadpunk: I did install_method: source, I didn't install the distro packages. On the service unit, the path points to the venv.12:11
vicent# grep -i execstart /etc/systemd/system/nova-compute.service 12:11
vicentExecStart = /openstack/venvs/nova-29.0.2/bin/nova-compute12:11
noonedeadpunkah, yes true-true12:11
noonedeadpunkI kind of really wonder if the issue is in not finding nova-rootwrap or not12:28
noonedeadpunkah12:30
noonedeadpunkvicent: what `exec_dirs` you have for /etc/nova/rootwrap.conf ?12:30
vicentI am now reinstalling the OS, but IIRC, the nova venv was included there12:38
noonedeadpunkas eventually the folder is a symlink, so in case you'd manually upgrade package inside venv - it could bring in unpatched version of rootwrap.conf12:44
noonedeadpunkthere are also other cases where such thing can happen iirc12:45
noonedeadpunkie - issues in venv build12:45
vicentI get the result after a clean install following the deploy guide https://docs.openstack.org/project-deploy-guide/openstack-ansible/2024.1/overview.html13:15
vicentno upgrade at all13:15
noonedeadpunkhuh, and no failures during nova deployment?13:18
noonedeadpunkalso - you get that error while creating a VM, right?13:19
vicentnoonedeadpunk: no errors on deployment. And yes, while creating a VM. Or if I add lvm on images_type, I get that on the nova logs after deployment. Similar error.13:20
noonedeadpunkas you're reinstalling the OS I assume that right now there's no possibility to check anything on the host? if so - once you complete the redeployment, can you share whether the issue re-occurred?13:20
vicentI got the OS reinstalled. Now I am installing OSA, so I can check stuff on the host.13:21
noonedeadpunk++13:21
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server master: Add retries for feature flags check  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/92813313:22
vicentnoonedeadpunk: Same error. And the venv is on the exec_dirs13:41
vicent$ sudo grep exec_dirs /etc/nova/rootwrap.conf13:42
vicentexec_dirs = /openstack/venvs/nova-29.0.2/bin,/sbin,/usr/sbin,/bin,/usr/bin,/usr/local/bin,/usr/local/sbin13:42
vicentIt is like the nova-compute service doesn't have the venv activated13:46
noonedeadpunkjust in case - `sudo` is present on host?13:51
noonedeadpunkcan you share please your OS so I could try to reproduce?13:51
vicentUbuntu 22.04.4, and sudo is present13:52
noonedeadpunkhuh13:52
noonedeadpunkso it's just metal installation? anything specific to keep in mind when spawning sandbox? Like some torage driver or smth like that?13:53
noonedeadpunk*storage13:53
noonedeadpunkAs we've just this week upgraded one of our regions on 22.04 to 29.0.213:54
noonedeadpunkand haven't seen anything weird so far, except indeed things being slower than on 2023.113:55
vicentIt is an ml2/ovn + sriov Openstack. I have some pci passthrough configuration. But I don't think that's relevant. I reduced the instance to not use those devices.13:56
noonedeadpunkyeah, as it's failing somewhere on storage allocation13:56
noonedeadpunkand you're trying to use LVM, right?13:56
noonedeadpunkor?13:56
noonedeadpunkin terms of `images_type`?13:57
vicentLVM was also failing, so I just increased the root fs size. Single node. https://github.com/openstack/openstack-ansible/blob/master/etc/openstack_deploy/env.d/aio_metal.yml.example13:57
vicentI have this13:58
noonedeadpunkoh, so that is aio as well? o_O13:58
vicentand no_containers: true on the host13:58
vicentNo, I didn't follow the AIO, just that conf to set the services to metal13:59
noonedeadpunk++13:59
vicentShould I try 29.0.1?14:03
noonedeadpunkthat should not actually matter much (I guess)14:04
noonedeadpunkI think 29.0.2 includes a CVE fix though14:05
noonedeadpunkhttps://security.openstack.org/ossa/OSSA-2024-002.html14:05
* noonedeadpunk spawning a sandbox14:08
noonedeadpunkvicent: hm, do you have failure of VM creation when you catch the error?14:10
noonedeadpunkAs according to comment in nova code - it might be "expected" failure for $reasons: https://opendev.org/openstack/nova/src/commit/cd4e58173a1533878eccc6efabbda0560dfde613/nova/virt/libvirt/imagebackend.py#L57-L8014:11
noonedeadpunkthough it's not an OSError....14:12
noonedeadpunkbut still `FailedToDropPrivileges`14:12
noonedeadpunkvicent: just in case to verify - /etc/sudoers.d/nova_sudoers do have correct path to nova as well?14:19
vicentYes, the nova_sudoers have the correct file14:21
vicentIt seems like the nova service is not running on the virtualenv.14:21
noonedeadpunkwell according to paste it does14:22
noonedeadpunkas stack trace is totally from inside of venv14:22
noonedeadpunkoh14:22
vicentIf I run the command that fails manually on the virtualenv, it works fine14:22
noonedeadpunkI think there should be smth more for service to look for binaries in expected folder14:22
noonedeadpunknot for me actually14:23
noonedeadpunkactually I have weird output14:24
noonedeadpunkhttps://paste.openstack.org/show/bXZTKy3TOgXuXk8UiJ2V/14:24
noonedeadpunkah, I should change user...14:26
vicent$ sudo -u nova bash -c '. /openstack/venvs/nova-29.0.2/bin/activate ; echo $PATH'14:26
vicent/openstack/venvs/nova-29.0.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games14:26
noonedeadpunkyeah14:26
vicentWhere do you see on the paste that the service is running on the virtualenv?14:27
noonedeadpunkso one possibility is that this command is not run as `nova` user14:27
noonedeadpunkin stack trace you have like `/openstack/venvs/nova-29.0.2/lib/python3.10/site-packages/nova/compute/manager.py`14:28
vicentWorks fine:14:28
vicent$ sudo  bash -c '. /openstack/venvs/nova-29.0.2/bin/activate ; echo $PATH'14:28
vicent/openstack/venvs/nova-29.0.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games14:28
noonedeadpunkvicent: it did not work for me, because the venv path was not in secure_path for sudo14:29
noonedeadpunkso if I run as root I get `command not found`14:29
noonedeadpunkbut as nova user it works14:29
vicent$ grep -i execstart /etc/systemd/system/nova-compute.service 14:29
vicentExecStart = /openstack/venvs/nova-29.0.2/bin/nova-compute14:29
vicentThis doesn't mean that the venv is activated AFAIK14:30
vicentright?14:30
noonedeadpunkit should not be activated I assume14:30
noonedeadpunkas you have `Defaults:nova secure_path="/openstack/venvs/nova-28.1.0.dev87/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"`14:30
noonedeadpunkwhich should adjust PATH for the nova user if sudo is involved14:31
noonedeadpunk(in /etc/sudoers.d/nova_sudoers)14:31
vicentbut that is not enough to modify PATH env:14:31
vicent$ sudo -u nova bash -c 'echo $PATH'14:31
vicent/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games14:31
noonedeadpunkbut can you try setting that secure_path globally in sudoers? I really wonder if it's that nova does not run as "nova" somehow....14:31
noonedeadpunkvicent: this command is very different14:32
vicentThe service unit explicitly runs as nova14:32
vicentI can see it on the process list14:32
noonedeadpunksecure_path - Path used for every command run from sudo. If you don't trust the people running sudo to have a sane PATH environment variable you may want to use this. Another use is if you want to have the “root path” be separate from the “user path”.14:33
noonedeadpunkso if secure_path is set - once you try to use `sudo` to run command - sudo will ignore PATH and look inside secure_path14:33
noonedeadpunkideally14:33
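A rough sketch of the secure_path check described above, assuming sudoers text is fed on stdin. The helper name `secure_path_has_dir` is hypothetical (it is not part of OSA or sudo). It only looks at simple `Defaults ... secure_path=` lines and does not work out which Defaults scope sudo would actually apply.

```shell
# Hypothetical helper: read sudoers text on stdin and succeed if directory $1
# is listed in any secure_path setting found there.
secure_path_has_dir() {
    # Extract the (optionally quoted) value of secure_path from Defaults lines,
    # split it on ':' and look for an exact match of the directory.
    sed -n 's/^Defaults.*secure_path[[:space:]]*=[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}.*$/\1/p' \
        | tr ':' '\n' \
        | grep -Fxq -- "$1"
}

# Usage on a real host (needs root to read the file):
#   sudo cat /etc/sudoers.d/nova_sudoers \
#       | secure_path_has_dir /openstack/venvs/nova-29.0.2/bin && echo "venv bin is in secure_path"
```

When sudo honours that setting, it resolves `nova-rootwrap` from the listed directories rather than from the caller's PATH, which is why the venv does not need to be activated.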
vicentaha, I was missing that knowledge14:34
noonedeadpunkjust try to `su nova; sudo nova-rootwrap`14:34
noonedeadpunkand it should find the binary14:35
vicentyeah, failing: "sudo: nova-rootwrap: command not found"14:35
noonedeadpunkoh14:35
noonedeadpunkit works for me14:35
noonedeadpunkeven without activating venv14:35
noonedeadpunkok, that's interesting14:35
noonedeadpunkare you sure you don't have any issues in sudoers files ?:)14:36
noonedeadpunkvisudo -c ?14:36
vicentLooks fine to me: https://paste.openstack.org/show/bnQwhBfuyMaNtPHXKYBL/14:36
noonedeadpunkI wonder if you for some reason don't have the include of /etc/sudoers.d/14:37
noonedeadpunkie I have that https://paste.openstack.org/show/b4at0HPTbJ0kzAsaauR7/14:37
noonedeadpunkand `@includedir /etc/sudoers.d` is last line in /etc/sudoers14:38
vicentYeah! That's it! I don't have that include. Probably my organization removed that :S14:39
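A minimal sketch of how the missing include could be detected, assuming the contents of /etc/sudoers are piped on stdin. The helper name `has_includedir` is hypothetical. It accepts both the `@includedir` spelling quoted above and the older `#includedir` spelling, and it does not check which directory is included.

```shell
# Hypothetical helper: read sudoers text on stdin and succeed if it contains
# an includedir directive ("@includedir DIR" or the older "#includedir DIR").
# A line such as "# includedir ..." (with a space after '#') is a plain comment
# and is deliberately not matched.
has_includedir() {
    grep -Eq '^[#@]includedir[[:space:]]+[^[:space:]]+'
}

# Usage on a real host (needs root to read the file):
#   sudo cat /etc/sudoers | has_includedir || echo "no includedir line: /etc/sudoers.d is ignored"
```

Without such a line, drop-in files like /etc/sudoers.d/nova_sudoers are never read, so their `secure_path` setting has no effect.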
noonedeadpunkwe had same issue when just `template` /etc/sudoers so random crap...14:39
noonedeadpunkok, cool, revived my rusty memories of how rootwrap works :D14:40
noonedeadpunkthough it not being present after os re-setup is weird14:40
vicentI think there is some pxe magic and customizations in my lab14:41
noonedeadpunkah14:41
noonedeadpunkmight be that we should ensure presence of includes somewhere in openstack_hosts...14:43
vicentnova-rootwrap manually works now14:43
vicentAnd the vm gets created!14:44
vicentMany thanks noonedeadpunk!14:44
vicentI think devstack makes sure the include is there. I have another machine with devstack and I could see it.14:55
fungiwarehouse (pypi) is removing some expensive xmlrpc api methods: https://mail.python.org/archives/list/pypi-announce@python.org/message/5VOX33ARFQUYKIMKM5NS7PM7Z6ZNCSJY/15:18
fungithe only match i found in codesearch was this: https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/get-pypi-pkg-version.py#L3515:19
fungisomeone might want to rethink that routine if it's of critical importance15:19
*** rambo is now known as Guest263620:10
*** jonher_ is now known as jonher22:34
