Monday, 2024-07-29

<opendevreview> Jens Harbott proposed openstack/project-config master: Drop project definitions for some x/* repos  https://review.opendev.org/c/openstack/project-config/+/925075  04:56
*** tobias-urdin|pto is now known as tobias-urdin  06:31
<fungi> just a heads up, i'm going to take the opportunity to catch a film over lunch, so i'll be disappearing between 14:45 utc and about 17:15 utc  13:11
<fungi> pip 24.2 was just released: https://pip.pypa.io/en/stable/news/  14:40
<fungi> i can see quite a few things in the changelog that could end up subtly breaking prior behavior assumptions, so be on the lookout  14:42
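If 24.2 does break an assumption somewhere, one stopgap is pinning the installer itself until jobs are validated against the new release; a minimal sketch, with the pin location in any given job being an assumption:

    # pin pip below the new release while investigating behavior changes
    python -m pip install 'pip<24.2'
    python -m pip --version  # confirm the pin took effect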
<fungi> heading out now, back in a couple of hours  14:46
<opendevreview> Merged openstack/project-config master: Add openstack/os-test-images project under glance  https://review.opendev.org/c/openstack/project-config/+/925043  14:50
<clarkb> have fun!  15:03
<clarkb> if I can get at least one review on the stack starting at https://review.opendev.org/c/openstack/project-config/+/925029 I'll go ahead and start the process of shutting down that cloud today  15:04
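That stack winds the provider down before removing it entirely. A rough way to confirm a provider has drained, assuming the nodepool CLI is at hand (the provider name is taken from the changes below):

    # any nodes still allocated in the provider being shut down?
    nodepool list | grep linaro
    # any images still registered for it?
    nodepool image-list | grep linaro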
<clarkb> separately I have received my repaired laptop so I need to spend some time testing that the repairs corrected the issue and reloading a functional os on it  15:04
<corvus> clarkb: +2  16:04
<frickler> corvus: clarkb: please also check https://review.opendev.org/c/openstack/project-config/+/925075 this has caused some config errors  16:25
<opendevreview> Merged openstack/project-config master: Set linaro cloud's max servers to 0  https://review.opendev.org/c/openstack/project-config/+/925029  16:30
<opendevreview> Merged openstack/project-config master: Use the osuosl mirror for deb packages in image builds  https://review.opendev.org/c/openstack/project-config/+/925048  16:30
<clarkb> thanks. I've approved that change to clean up the config errors frickler  16:42
<opendevreview> Merged openstack/project-config master: Drop project definitions for some x/* repos  https://review.opendev.org/c/openstack/project-config/+/925075  16:45
<clarkb> there are three nodes stuck in deleting in linaro. I think I remember that nodepool handles this more gracefully now if we proceed towards complete provider removal. Is that recollection correct corvus?  17:45
<clarkb> if so I can approve the next change in the stack  17:46
<clarkb> in the meantime /me starts reinstalling this laptop  17:58
<corvus> clarkb: not sure what will happen here, but if it goes wrong ping me and i can help  18:08
<opendevreview> Merged openstack/project-config master: Remove labels and diskimages from the linaro cloud  https://review.opendev.org/c/openstack/project-config/+/925049  18:50
<carloss> o/ fungi https://review.opendev.org/c/openstack/project-config/+/924430 manila-unmaintained-core group was created last week. Could you please add manila-stable-maint group to the manila-unmaintained-core? :)  18:54
<fungi> carloss: sure, i guess they're going to seed that group but then add people outside manila-stable-maint to it as well? (otherwise there was little point in creating a wholly new group in gerrit)  18:56
<carloss> yep  18:57
<fungi> carloss: done  19:03
<carloss> fungi: thank you!  19:10
<fungi> clarkb: the image/label removal change deployed, but `nodepool image-list` is still showing images there (14 deleting, 6 ready). i'll check it again in a bit and see if those counts decrease  19:56
<fungi> looks like it just transitioned to 15 deleting, 5 ready so it's at least wanting to delete them now  19:56
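A convenience sketch for watching those counts instead of rerunning the command by hand (assumes the linaro entries are the only ones of interest):

    # refresh the count of images in deleting state every minute
    watch -n 60 'nodepool image-list | grep linaro | grep -c deleting'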
<clarkb> fungi: ack thanks  19:58
<clarkb> I suspect that if we have things stuck in deleting we run the nodepool purge command or whatever it is, after both nodes and images should be deleted but haven't been  19:59
<clarkb> then we can land the last change  19:59
<clarkb> fungi: https://zuul-ci.org/docs/nodepool/latest/operation.html#erase this command. Though now that I've said it I think we may want to remove the provider from the config first (so that we don't accidentally create anything new) then run that command to get the last bits out  20:00
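Per those docs the command is `nodepool erase PROVIDER`, so the order of operations described above would look roughly like this (provider name assumed):

    # 1. first merge the config change removing the provider entirely,
    #    so nothing new can be created there
    # 2. then clear the stuck node and image records for it out of zookeeper:
    nodepool erase linaro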
<fungi> yeah, i haven't seen the total node count fall yet, just more images transitioning from ready to deleting  20:03
<clarkb> ya the nodes are stuck. They are old ones and I think the easiest thing is to just nodepool erase the provider since the cloud is going away  20:04
<johnsom> Hi OpenDev colleagues. I'm starting to look at being able to test SR-IOV code in tempest. To do this I need an instance that uses the "igb" network driver. This is available in recent distros such as Ubuntu 24.04 and Fedora. Thoughts on setting this up?  20:04
<clarkb> johnsom: we have 24.04 nodes available  20:20
<clarkb> does it also require special hardware or just the driver?  20:21
<clarkb> if it is just the new enough kernel to have the driver then you should be all set  20:21
<johnsom> Just the driver, it simulates SR-IOV  20:22
<johnsom> It's the version of qemu that is running the instance. It has to be new enough to have the igb driver in it  20:22
<clarkb> you should be able to run your job on 24.04. You may run into problems with python3.12 support there as that is slowly making its way into openstack but we have the nodes available  20:23
<johnsom> Ok, so the trick is how to boot an instance in zuul with the igb network driver instead of the virtio (my guess) driver.  20:25
<johnsom> This for example: https://zuul.opendev.org/t/openstack/build/23390a2df40e448c97361a357c101cc3/log/zuul-info/host-info.controller.yaml#347  20:26
<clarkb> can't you just modprobe it?  20:31
<clarkb> I think I'm not understanding why you're expecting zuul needs to do anything. if this is an emulation driver it shouldn't be attached to any hardware, right? so modprobe when you need it and then it should be available?  20:32
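A sketch of what that suggestion amounts to; the rest of the discussion explains why it doesn't work (the device model lives in qemu on the hypervisor, not in the guest kernel):

    # load the driver in the test node's kernel and look for a matching nic
    sudo modprobe igb
    ip -br link  # no igb-backed interface appears unless qemu exposes one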
<fungi> er, sorry, i meant i haven't seen the total image count fall yet  20:33
<clarkb> but for real hardware the driver would be selected by udev rules based on the hardware  20:33
<clarkb> I think  20:33
<fungi> at this point we're just at 20 images in deleting state, none in ready  20:33
<clarkb> and we don't control what virtual devices the underlying clouds attach to our instances  20:33
<johnsom> QEMU needs to be configured to use the qemu igb instead of virtio. It's outside the instance kernel. Once qemu is using the correct driver, then you can go to the kernel and enable the VFs.  20:35
<clarkb> we don't control qemu  20:35
<johnsom> https://www.irccloud.com/pastebin/vGgMKrR4/  20:36
<johnsom> https://www.irccloud.com/pastebin/9shGFGT5/  20:36
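For context, qemu's emulated igb device model is selected on the hypervisor's command line, which is why nothing inside the guest can switch to it; a minimal sketch (treat the exact flags as an assumption):

    # hypervisor-side invocation; guests cannot alter this
    qemu-system-x86_64 \
        -netdev user,id=net0 \
        -device igb,netdev=net0 \
        ...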
<clarkb> whether the qemu in question is the one controlling our test instance VM or the one nested in the test node, we don't control it  20:36
<clarkb> if you are talking about the qemu at the "base" layer controlling the instance VM I suspect there isn't a whole lot we can do. For the nested qemu you should have control to configure it however you like within the job  20:37
<johnsom> The qemu that runs the VM that devstack will be installed into.  20:37
<clarkb> ya unfortunately I don't think that is something we can control and one of the clouds involved isn't doing qemu at all  20:38
<johnsom> Yeah, I know one is on Xen  20:39
<clarkb> assuming you are going to be testing things in a nested VM why do we care about the host hardware at all? Can't you test it with a virtual device? I guess the issue is you want to test the passthrough of pci which means virtual devices aren't very valid?  20:42
<clarkb> but ya I think there are a few issues. The first is qemu isn't universal, second is that clouds tend to update their hypervisors more slowly so won't have the options available, and finally they may prefer to lock that stuff down and not expose it anyway  20:43
<johnsom> My intent is to have a few VFs available to devstack such that we can test that nova and neutron are working correctly when "direct" ports are assigned to a VM. Right now we aren't regularly testing any of it.  20:46
<fungi> that seems like it violates the usual attempt to isolate test workloads from underlying implementation details of the providers where they run  20:50
<fungi> it probably needs a third-party ci where the lowest hypervisor layer has a specific kernel and configuration  20:51
<fungi> if one of our generic donors is able to guarantee those parameters, it's possible we could add a custom label limited to that provider (or providers)  20:52
<fungi> similar to the labels for nested virt acceleration support  20:53
<johnsom> It is just a qemu configuration setting when the VM is booted up.  20:53
<fungi> is there an openstacky way for a normal cloud user to request that through the nova api? or does it need a custom flavor?  20:53
<fungi> i.e. is there a cloud provider you have an account in where you are able to boot instances that meet that requirement?  20:55
<johnsom> Yeah, there is an "OpenStack way" with nova, let me find it.  20:55
<Clark[m]> Sorry back to matrix as I'm trying to do laptop stuff but even if there is an openstacky way I doubt any of our clouds run a new enough hypervisor based on what you said before.  20:56
<johnsom> You set hw_vif_model=igb as an image property  20:57
<Clark[m]> And even if they did they may not want to expose hardware devices directly for security reasons  20:57
<johnsom> It is not pass through, it's an emulated driver  20:57
<johnsom> It is documented here: https://docs.openstack.org/glance/latest/admin/useful-image-properties.html (sorry, the page doesn't have direct links)  20:59
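Per that glance documentation, the property rides on the image, so setting it looks something like this with the openstack CLI (the image name is a placeholder):

    # instances booted from this image get an emulated igb nic
    openstack image set --property hw_vif_model=igb my-image
    openstack image show my-image -f value -c properties  # verify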
<fungi> ah, so it's a setting on the image, not the instance itself?  21:02
<Clark[m]> If it isn't pass through why can't you use it at the nested level? (This goes back to my original confusion)  21:02
<johnsom> They are all essentially tap drivers (oversimplifying a lot), it's just what features they emulate for the kernel inside the VM that differentiates them.  21:02
<johnsom> Well, that is how nova gets these extra settings passed in. I find it odd, but....  21:03
<johnsom> What I understand you are proposing is: get a 24.04 nodepool instance, then have it spawn a VM inside it with the setting enabled, then install devstack inside that VM and run the tempest tests there. Correct?  21:05
<clarkb> johnsom: no, you have the cloud hypervisor, the nodepool VM, and then any VMs that the devstack cloud boots  21:09
<clarkb> if this is all emulation why can't you emulate between the nodepool VM and the VMs that the devstack cloud boots  21:10
<johnsom> The setting has to be on the nodepool VM.  21:11
<johnsom> Such that nova and neutron in devstack can access the virtual SR-IOV capabilities of the igb driver.  21:12
<clarkb> right so can't you modprobe that driver and then create an emulated device?  21:12
<johnsom> no, qemu has to expose it.  21:14
<johnsom> Running modprobe from devstack would give you nothing as the nic is virtio by default in the nodepool VM.  21:15
<clarkb> but linux allows you to have virtual network interfaces too, completely detached from any hardware  21:16
<clarkb> I guess I was hoping you could just create an interface of that type and then have devstack VMs attach to them since it is all emulated anyway apparently  21:16
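A sketch of the kind of software-only interface clarkb has in mind; as johnsom explains next, it doesn't help here because nothing kernel-side emulates SR-IOV VFs:

    # a veth pair exists purely in the kernel, no hardware behind it
    sudo ip link add veth0 type veth peer name veth1
    sudo ip link set veth0 up && sudo ip link set veth1 up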
<johnsom> Nope, there is no emulated SR-IOV driver in the linux kernel.  21:17
<johnsom> The devstack level kernel would see it as an actual igb nic, but in reality it is qemu emulating one.  21:18
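Enabling VFs on such an emulated igb nic would then use the ordinary sysfs interface, just as on real hardware (the interface name is an assumption):

    # inside the guest that sees the emulated igb nic
    cat /sys/class/net/ens4/device/sriov_totalvfs              # VFs the device offers
    echo 4 | sudo tee /sys/class/net/ens4/device/sriov_numvfs  # create 4 VFs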
<clarkb> then ya you may need to do the extra level of nesting, though I have no idea if that is feasible  21:24
<johnsom> Yeah, RAM might become an issue.  21:28
<clarkb> I'm almost to the point where I'll feel functional on the repaired laptop (it makes a lot of noise now unfortunately, but it seems like lenovo doesn't consider that to be an issue)  21:50
<clarkb> fungi: re the image cleanup we can probably give it until tomorrow then plan to land the last cleanup change and finally run the erase command. I don't think we're in a super hurry but I also want to get it done  21:59
<clarkb> and now to update the meeting agenda  21:59
<fungi> sgtm  22:01
<clarkb> we also need to land https://review.opendev.org/c/opendev/base-jobs/+/922653 so that we can move forward on cleaning up those images from nodepool  22:09
<clarkb> but other things keep popping up and distracting me  22:10
<clarkb> I've just updated the meeting agenda. Anything need to be added/removed/edited?  22:13
<fungi> i can't think of anything  22:27
<fungi> i went ahead and approved 922653 too  22:27
<opendevreview> Merged opendev/base-jobs master: Drop CentOS 8 Stream nodesets and sanity checks  https://review.opendev.org/c/opendev/base-jobs/+/922653  22:30
<clarkb> digging in more, it looks like libvirt 9.3.0 or newer is necessary for igb?  22:37
<clarkb> I think that is also included in centos 9 stream so potentially usable in the openmetal cloud?  22:37
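A quick check of whether a given hypervisor host clears that bar (the 9.3.0 requirement is as stated above, not independently confirmed):

    # on the hypervisor host
    libvirtd --version
    virsh version  # also reports the hypervisor (qemu) version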
<clarkb> johnsom: do you know if you can set those flags on vm boot as an alternative to being an image setting? It's kinda weird that would be limited to the image. Also any idea if you set it on either an image or vm and the backing cloud doesn't support it, if that gets silently ignored or will it explode?  22:38
<clarkb> sean-k-mooney may know but isn't in here  22:40
<clarkb> agenda has been sent  23:43
<clarkb> you can tell I'm on the laptop again because my typing is worse here  23:43
