*** akekane__ is now known as abhishekk | 05:26 | |
gibi | dmitriis, sean-k-mooney: I made a fix for the func test pci device issue in https://review.opendev.org/c/openstack/nova/+/829248/4/nova/compute/resource_tracker.py let me know what you think. I'm happy to split that patch up if needed | 07:30 |
---|---|---|
*** akekane__ is now known as abhishekk | 07:42 | |
*** ricolin is now known as Guest1112 | 07:59 | |
*** ricolin_ is now known as ricolin | 07:59 | |
*** akekane_ is now known as abhishekk | 08:11 | |
opendevreview | ribaudr proposed openstack/python-novaclient master: Microversion 2.91: Support specifying destination host to unshelve https://review.opendev.org/c/openstack/python-novaclient/+/831651 | 10:25 |
opendevreview | OpenStack Release Bot proposed openstack/os-vif stable/yoga: Update .gitreview for stable/yoga https://review.opendev.org/c/openstack/os-vif/+/831664 | 10:50 |
opendevreview | OpenStack Release Bot proposed openstack/os-vif stable/yoga: Update TOX_CONSTRAINTS_FILE for stable/yoga https://review.opendev.org/c/openstack/os-vif/+/831670 | 10:51 |
opendevreview | OpenStack Release Bot proposed openstack/os-vif master: Update master for stable/yoga https://review.opendev.org/c/openstack/os-vif/+/831678 | 10:51 |
opendevreview | OpenStack Release Bot proposed openstack/osc-placement stable/yoga: Update .gitreview for stable/yoga https://review.opendev.org/c/openstack/osc-placement/+/831695 | 10:51 |
opendevreview | OpenStack Release Bot proposed openstack/osc-placement stable/yoga: Update TOX_CONSTRAINTS_FILE for stable/yoga https://review.opendev.org/c/openstack/osc-placement/+/831698 | 10:51 |
opendevreview | OpenStack Release Bot proposed openstack/osc-placement master: Update master for stable/yoga https://review.opendev.org/c/openstack/osc-placement/+/831701 | 10:51 |
opendevreview | OpenStack Release Bot proposed openstack/python-novaclient stable/yoga: Update .gitreview for stable/yoga https://review.opendev.org/c/openstack/python-novaclient/+/831704 | 10:51 |
opendevreview | OpenStack Release Bot proposed openstack/python-novaclient stable/yoga: Update TOX_CONSTRAINTS_FILE for stable/yoga https://review.opendev.org/c/openstack/python-novaclient/+/831705 | 10:51 |
opendevreview | OpenStack Release Bot proposed openstack/python-novaclient master: Update master for stable/yoga https://review.opendev.org/c/openstack/python-novaclient/+/831706 | 10:51 |
*** __ministry1 is now known as __ministry | 11:05 | |
*** dasm|off is now known as dasm|rover | 12:08 | |
opendevreview | Merged openstack/os-vif stable/yoga: Update .gitreview for stable/yoga https://review.opendev.org/c/openstack/os-vif/+/831664 | 12:55 |
opendevreview | Merged openstack/os-vif stable/yoga: Update TOX_CONSTRAINTS_FILE for stable/yoga https://review.opendev.org/c/openstack/os-vif/+/831670 | 12:57 |
opendevreview | ribaudr proposed openstack/python-novaclient master: Microversion 2.91: Support specifying destination host to unshelve https://review.opendev.org/c/openstack/python-novaclient/+/831651 | 13:09 |
sean-k-mooney | by the way am i the only one who sees this when they run commands | 13:22 |
sean-k-mooney | /usr/lib/python3/dist-packages/secretstorage/dhcrypto.py:15: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead | 13:22 |
sean-k-mooney | from cryptography.utils import int_from_bytes | 13:22 |
sean-k-mooney | /usr/lib/python3/dist-packages/secretstorage/util.py:19: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead | 13:22 |
sean-k-mooney | from cryptography.utils import int_from_bytes | 13:22 |
sean-k-mooney | im seeing it for nova-manage and also osc | 13:22 |
sean-k-mooney | i assume secretstorage is managed by uc | 13:25 |
sean-k-mooney | i wonder if that has been addressed there yet | 13:26 |
gibi | sean-k-mooney: I see that in a fairly recent devstack from master too so I think this is still happening | 13:54 |
sean-k-mooney | gibi: ack i think this needs to be updated in secretstorage but i dont even know what we use that for | 14:12 |
sean-k-mooney | like sure it provides a way for securely storing passwords | 14:13 |
sean-k-mooney | https://pypi.org/project/SecretStorage/ | 14:13 |
sean-k-mooney | but why would we need dbus integration for nova-manage | 14:13 |
sean-k-mooney | like we are not going to be pulling things from the gnome keyring | 14:13 |
*** ricolin_ is now known as ricolin | 14:13 | |
sean-k-mooney | im guessing this is really from oslo | 14:14 |
sean-k-mooney | actually i dont see it as a direct dep in any openstack project where that would make sense | 14:15 |
sean-k-mooney | https://codesearch.opendev.org/?q=secretstorage&i=nope&literal=nope&files=&excludeFiles=&repos= | 14:15 |
sean-k-mooney | based on strace it looks like its coming from keyring | 14:19 |
sean-k-mooney | it might be a side effect of stevedore and the entrypoint scan it does | 14:21 |
sean-k-mooney | gibi: ok so this is from python-keystoneclient | 14:26 |
sean-k-mooney | https://opendev.org/openstack/python-keystoneclient/commit/5939541bc771e1205394b05e757d7b23b3aca862 | 14:26 |
opendevreview | Merged openstack/python-novaclient master: Update master for stable/yoga https://review.opendev.org/c/openstack/python-novaclient/+/831706 | 14:32 |
tobias-urdin | how does nova track the availability of vgpu resources through placement? Or maybe phrased like: how does nova update placement with availability of vgpu resources with multiple "flavors" when using custom traits? | 14:32 |
tobias-urdin | i've tried to dig through nova scheduler and resource tracker code to understand how it calculates where there is availability | 14:33 |
sean-k-mooney | tobias-urdin: that has changed a little in the last release or two | 14:34 |
sean-k-mooney | but basically you list the mdev type and the parent gpu pci address in the config | 14:34 |
sean-k-mooney | then nova will look at the available count and create a resource provider per card you listed with an inventory of VGPU for each rp/card | 14:35 |
sean-k-mooney | we recently added support for generic mdevs so now you can use a different resource class if you prefer | 14:35 |
sean-k-mooney | so you could have CUSTOM_NVIDIA_LARGE instead of VGPU | 14:36 |
sean-k-mooney | the generic mdev support is intended for things that are not GPUs too | 14:36 |
sean-k-mooney | so in terms of capacity if you are not using any trait requests | 14:37 |
sean-k-mooney | the compute agent as part of the update_available_resource periodic task (and init_host) will read the capacity info from sysfs and translate that into RPs that are child RPs of the compute node RP, with one RP per physical gpu card | 14:38 |
sean-k-mooney | if you just ask for resources:VGPU=1 in the flavor | 14:38 |
sean-k-mooney | then we will not filter on any mdev type or trait and will just select host with free VGPU inventory | 14:39 |
sean-k-mooney | which is fine if all your gpus are the same | 14:39 |
sean-k-mooney | if you have different ones configured you should use traits or the new generic mdev feature instead to differentiate | 14:39 |
sean-k-mooney | tobias-urdin: is there anything in particular that you wanted to know beyond that overview | 14:40 |
tobias-urdin | so let's say i have two cards 0000:3b:00.0 and 0000:af:00.0 that is NVIDIA A10 cards, I enable VFs on those and get like 30 each let's say on 0000:3b:x.x 0000:af:x.x and assign enabled_mdev_types=nvidia-1, nvidia-2 and [mdev_nvidia-1]/device_addresses=<all VF for 0000:3b:x.x> (and same for 0000:af card) - assign those CUSTOM_NVIDIA_1 and | 14:41 |
tobias-urdin | CUSTOM_NVIDIA_2 traits, flavors with resources:VGPU=1 and trait:CUSTOM_NVIDIA_1=required etc | 14:41 |
tobias-urdin | how is placement populated for all those <computenode>_pci_0000_3b_x_x RPs to know how many are left when it's a 1:1 mapping in that RP per PCI dev addr | 14:42 |
* tobias-urdin reading through above | 14:42 | |
ade_lee_ | sean-k-mooney, hey -- can you take a look at the failing experimental fips job on https://review.opendev.org/c/openstack/tempest/+/831607 | 14:51 |
ade_lee_ | sean-k-mooney, this time its on centos-9 | 14:51 |
ade_lee_ | sean-k-mooney, dansmith : https://zuul.opendev.org/t/openstack/build/0a4f8346b89f4bbfa92135dbdbf811f9 | 14:52 |
sean-k-mooney | so rescue and temp url | 14:52 |
ade_lee_ | sean-k-mooney, ack - what does that point to? | 14:54 |
dansmith | check the cinder logs? | 14:54 |
sean-k-mooney | rescue failed because of this | 14:55 |
sean-k-mooney | Waiting for libvirt event about the detach of device vdb with device alias virtio-disk1 from instance 7323f68a-b4dc-4630-b4fd-bd7a7f69d4f4 is timed out. | 14:55 |
sean-k-mooney | so that looks like the intermittent libvirt volume detach issue | 14:55 |
sean-k-mooney | ya it is internal error: unable to execute QEMU command 'device_del': Device virtio-disk1 is already in the process of unplug. | 14:56 |
sean-k-mooney | that will be fixed by a new qemu soon we hope | 14:56 |
dansmith | seems like a lot of these centos job fails are qemu/libvirt related | 14:56 |
dansmith | which is pretty disturbing :/ | 14:57 |
sean-k-mooney | if it makes you feel better those also fail on rhel downstream | 14:57 |
dansmith | not really :) | 14:57 |
* sean-k-mooney it makes me feel worse but who knows | 14:57 | |
gibi | the centos jobs are running with newer libvirt and qemu than the ubuntu jobs so we see the new failure modes there first | 14:58 |
sean-k-mooney | currently yes although it previously was the other way around | 14:58 |
sean-k-mooney | gibi: dansmith what is more disturbing to me is this is using the pc machine type | 14:59 |
sean-k-mooney | not q35 | 14:59 |
dansmith | hmm | 14:59 |
sean-k-mooney | gibi: so this is partly related to the fact that even with the event based case we still retry | 15:00 |
sean-k-mooney | but fundamentally qemu is taking a long time to detach | 15:00 |
sean-k-mooney | which it should not, the wait for sshable/pingable tempest change might help | 15:01 |
sean-k-mooney | if this is because the os is not ready | 15:01 |
sean-k-mooney | but this is happening a lot lately | 15:01 |
sean-k-mooney | what do the ObjectTempUrlTest tests do | 15:02 |
sean-k-mooney | are they swift related? im not familiar with them | 15:02 |
sean-k-mooney | GET https://149.202.163.165:8080/v1/AUTH_ab7063290b7341eeb77f5198d9e09903/tempest-TestContainer-735377846/tempest-TestObject-1768297629 | 15:03 |
sean-k-mooney | that looks like possibly swift to me | 15:03 |
sean-k-mooney | ade_lee_: in any case the rescue failure does not look fips related | 15:05 |
tobias-urdin | sean-k-mooney: any input on above? | 15:07 |
ade_lee_ | sean-k-mooney, ack - I didn't think it was, but unfortunately , it is blocking the fips patches. Do we have any workarounds/possible fixes ? I'll ask the swift folks about the swift issues. | 15:09 |
ade_lee_ | sean-k-mooney, is there a BZ /launchpad to track these libvirt/qemu issues? | 15:10 |
gibi | sean-k-mooney: yeah, the base case retry is something we can remove when we switch to qemu 6.2 as a minimum | 15:10 |
gibi | (or something around 6.2 Im not sure) | 15:10 |
sean-k-mooney | we probably could make it conditional on the version before we raise our minimum | 15:15 |
gibi | yeah, good point | 15:15 |
sean-k-mooney | tobias-urdin: oh i missed your follow ups | 15:15 |
sean-k-mooney | ill read back one sec | 15:16 |
sean-k-mooney | ade_lee_: why is it blocking? | 15:16 |
sean-k-mooney | the job is non voting right | 15:16 |
sean-k-mooney | i think we could proceed with this failure unless it reliably fails every time in the fips job? | 15:17 |
ade_lee_ | sean-k-mooney, fair enough -- maybe what we do then is change the job to be for centos-9 - and then merge it | 15:18 |
sean-k-mooney | basically what i would hope is when we fix this normally it would be fixed for fips | 15:19 |
*** efried1 is now known as efried | 15:24 | |
ade_lee_ | sean-k-mooney, ack - I'll update to centos 9 - and then ping for reviews. do we have any sense of when it will be fixed ? its showing up in glance reviews, cinder etc .. | 15:26 |
ade_lee_ | sean-k-mooney, a BZ will be super helpful so I can track things | 15:26 |
sean-k-mooney | so we kind of do have one for qemu and there are a few cix issues | 15:30 |
sean-k-mooney | we dont have a single one for nova for example because its not really a nova issue | 15:30 |
sean-k-mooney | we think that some of the recent bugfixes in qemu and libvirt will help | 15:30 |
opendevreview | ribaudr proposed openstack/python-novaclient master: Microversion 2.91: Support specifying destination host to unshelve https://review.opendev.org/c/openstack/python-novaclient/+/831651 | 15:33 |
tobias-urdin | sean-k-mooney: no hurry, let me know when you have a second :) | 16:19 |
sean-k-mooney | tobias-urdin: sorry im in a meeting downstream which is why i did not respond | 16:19 |
opendevreview | Merged openstack/python-novaclient stable/yoga: Update .gitreview for stable/yoga https://review.opendev.org/c/openstack/python-novaclient/+/831704 | 16:34 |
opendevreview | Merged openstack/python-novaclient stable/yoga: Update TOX_CONSTRAINTS_FILE for stable/yoga https://review.opendev.org/c/openstack/python-novaclient/+/831705 | 16:34 |
* Uggla had a hard time with pep8 "E127 continuation line over-indented for visual indent". :) | 17:21 | |
Uggla | sean-k-mooney, fyi now the unshelve to host is complete with client part as well. | 17:28 |
opendevreview | Merged openstack/osc-placement stable/yoga: Update .gitreview for stable/yoga https://review.opendev.org/c/openstack/osc-placement/+/831695 | 17:56 |
opendevreview | Merged openstack/osc-placement stable/yoga: Update TOX_CONSTRAINTS_FILE for stable/yoga https://review.opendev.org/c/openstack/osc-placement/+/831698 | 17:57 |
sean-k-mooney | Uggla: ack ill try and review what you have probably monday | 18:12 |
Uggla | sean-k-mooney, no hurries I think. | 18:33 |
sean-k-mooney | are all patches in https://review.opendev.org/q/topic:bp%252Funshelve-to-host | 18:51 |
sean-k-mooney | if so ill add that to my review-list bookmark folder for monday | 18:51 |
opendevreview | Ade Lee proposed openstack/nova master: Test setting the nova job to centos-9-stream https://review.opendev.org/c/openstack/nova/+/831844 | 20:13 |
*** dasm|rover is now known as dasm|off | 23:16 |
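
As a rough illustration of the vGPU overview given between 14:34 and 14:39, the sketch below models one child resource provider per physical GPU, each with a VGPU inventory and optional custom traits, and selects the providers that still have free inventory. The class and function names are hypothetical and this is not nova or placement code. The only part taken from placement itself is the capacity formula, (total - reserved) * allocation_ratio, which is an assumption about how free inventory would be counted here.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceProvider:
    """Hypothetical stand-in for a per-GPU child resource provider."""
    name: str
    total: int                      # VGPU inventory reported for this card
    used: int = 0                   # VGPUs already allocated to instances
    reserved: int = 0
    allocation_ratio: float = 1.0
    traits: set = field(default_factory=set)

    def free(self) -> int:
        # Capacity formula assumed from placement: (total - reserved) * ratio.
        capacity = int((self.total - self.reserved) * self.allocation_ratio)
        return capacity - self.used


def candidates(providers, amount, required_traits=frozenset()):
    """Return names of providers with enough free VGPU and all required traits."""
    return [
        rp.name
        for rp in providers
        if rp.free() >= amount and set(required_traits) <= rp.traits
    ]
```

With no required traits this behaves like a flavor that only asks for resources:VGPU=1: any provider with free inventory qualifies. Passing a trait such as CUSTOM_NVIDIA_1 narrows the result to the cards tagged with it, which is the differentiation suggested for hosts with mixed GPU types.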