*** tkajinam has joined #openstack-nova | 00:00 | |
*** zhanglong has joined #openstack-nova | 00:46 | |
*** Liang__ has joined #openstack-nova | 01:12 | |
*** spatel has joined #openstack-nova | 01:26 | |
*** spatel has quit IRC | 01:30 | |
*** zzzeek has quit IRC | 01:42 | |
openstackgerrit | Wenping Song proposed openstack/nova-specs master: Support vGPU in nova and cyborg interaction https://review.opendev.org/750116 | 01:42 |
---|---|---|
*** zzzeek has joined #openstack-nova | 01:45 | |
*** zhanglong has quit IRC | 01:52 | |
*** iurygregory has quit IRC | 02:03 | |
*** suryasingh has joined #openstack-nova | 02:15 | |
*** jamesdenton has quit IRC | 02:30 | |
*** zzzeek has quit IRC | 02:30 | |
*** zzzeek has joined #openstack-nova | 02:30 | |
*** jamesdenton has joined #openstack-nova | 02:30 | |
*** mkrai has joined #openstack-nova | 02:54 | |
*** sapd1_x has joined #openstack-nova | 03:20 | |
*** euclidsun has joined #openstack-nova | 03:24 | |
*** mkrai has quit IRC | 03:26 | |
*** mkrai has joined #openstack-nova | 03:27 | |
*** psachin has joined #openstack-nova | 03:39 | |
*** euclidsun has quit IRC | 03:46 | |
*** spatel has joined #openstack-nova | 04:18 | |
*** links has joined #openstack-nova | 04:21 | |
*** euclidsun has joined #openstack-nova | 04:26 | |
*** euclidsun has left #openstack-nova | 04:26 | |
*** evrardjp has quit IRC | 04:33 | |
*** evrardjp has joined #openstack-nova | 04:33 | |
*** ratailor has joined #openstack-nova | 04:34 | |
*** vishalmanchanda has joined #openstack-nova | 04:38 | |
*** euclidsun has joined #openstack-nova | 05:24 | |
*** euclidsun has quit IRC | 05:31 | |
*** sapd1_x has quit IRC | 05:33 | |
*** brinzhang has joined #openstack-nova | 05:36 | |
*** jsuchome has joined #openstack-nova | 05:43 | |
*** spatel has quit IRC | 05:56 | |
*** songwenping_ has quit IRC | 06:05 | |
*** swp20 has joined #openstack-nova | 06:05 | |
*** zzzeek has quit IRC | 06:08 | |
*** zzzeek has joined #openstack-nova | 06:09 | |
*** mkrai has quit IRC | 06:13 | |
*** mkrai_ has joined #openstack-nova | 06:13 | |
*** ralonsoh has joined #openstack-nova | 06:26 | |
*** zzzeek has quit IRC | 06:32 | |
*** zzzeek has joined #openstack-nova | 06:35 | |
*** ralonsoh_ has joined #openstack-nova | 06:55 | |
*** ralonsoh has quit IRC | 06:57 | |
*** Yumeng has joined #openstack-nova | 07:01 | |
*** slaweq has joined #openstack-nova | 07:08 | |
*** mkrai_ has quit IRC | 07:09 | |
*** mkrai has joined #openstack-nova | 07:10 | |
*** nightmare_unreal has joined #openstack-nova | 07:11 | |
*** tesseract has joined #openstack-nova | 07:12 | |
*** mkrai has quit IRC | 07:15 | |
*** sapd1_x has joined #openstack-nova | 07:31 | |
bauzas | good morning Nova | 07:35 |
gibi | bauzas: good morning | 07:37 |
*** rcernin has quit IRC | 07:37 | |
*** owalsh has quit IRC | 07:48 | |
*** sapd1_x has quit IRC | 07:49 | |
*** sapd1_x has joined #openstack-nova | 07:49 | |
*** owalsh has joined #openstack-nova | 07:52 | |
*** iurygregory has joined #openstack-nova | 08:07 | |
*** damien_r has quit IRC | 08:27 | |
*** damien_r has joined #openstack-nova | 08:28 | |
*** mkrai has joined #openstack-nova | 08:28 | |
brinzhang | bauzas, gibi: good morning | 08:29 |
brinzhang | bauzas, gibi: two backport patches hope you can review, trivial changes https://review.opendev.org/#/c/749681/ , https://review.opendev.org/#/c/749701/ | 08:30 |
*** derekh has joined #openstack-nova | 08:31 | |
*** jaosorior has joined #openstack-nova | 08:44 | |
*** k_mouza has joined #openstack-nova | 08:52 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Use UUID as vif and network_id in vif tests https://review.opendev.org/748722 | 08:53 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Support SRIOV interface attach and detach https://review.opendev.org/740995 | 08:53 |
*** zzzeek has quit IRC | 08:58 | |
*** zzzeek has joined #openstack-nova | 08:59 | |
openstackgerrit | Liang Fang proposed openstack/nova master: Add volume local cache support https://review.opendev.org/663542 | 09:02 |
elod | gibi: if you will have time: https://review.opendev.org/#/c/750068/ :) (Stein release patch) | 09:04 |
*** Liang__ has quit IRC | 09:06 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Make PCI claim NUMA aware during live migration https://review.opendev.org/748453 | 09:09 |
gibi | elod: done | 09:10 |
gibi | thanks | 09:10 |
*** stephenfin has joined #openstack-nova | 09:11 | |
gibi | stephenfin: hi! the nova-multi-cell failure in the vtpm resize patch seems to be relevant | 09:12 |
stephenfin | ack, looking | 09:13 |
gibi | https://zuul.opendev.org/t/openstack/build/7326bc59092346b18fe9858774606321/log/compute1/logs/screen-n-cpu.txt?severity=4#7741 | 09:13 |
*** jangutter_ has joined #openstack-nova | 09:15 | |
*** jangutter has quit IRC | 09:18 | |
elod | gibi: thx! \o/ | 09:18 |
luyao | stephenfin: Hi, not sure whether you saw my message last week, just a kind reminder for vpmem-enhancement https://review.opendev.org/#/q/topic:bp/vpmem-enhancement+(status:open+OR+status:merged) . I moved the patch 'improve orphans tracking' to the end of the patch sequence (we don't have any big concern on the other 3 patches I think), I redefined those orphans in the updated patch since the previous version involved the | 09:24 |
luyao | bug #1879878, and I didn't notice you have fixed it. | 09:24 |
openstack | bug 1879878 in OpenStack Compute (nova) "VM become Error after confirming resize with Error info CPUUnpinningInvalid on source node " [Medium,In progress] https://launchpad.net/bugs/1879878 - Assigned to Stephen Finucane (stephenfinucane) | 09:24 |
stephenfin | luyao: ack, will take a look this afternoon | 09:25 |
luyao | stephenfin: Thank you in advance. :) | 09:25 |
bauzas | gosh, morning hell here | 09:34 |
*** ralonsoh_ is now known as ralonsoh | 09:45 | |
openstackgerrit | Wenping Song proposed openstack/nova-specs master: Support vGPU in nova and cyborg interaction https://review.opendev.org/750116 | 09:47 |
*** xek has joined #openstack-nova | 09:50 | |
*** xek has quit IRC | 09:56 | |
nightmare_unreal | live-migration force node is not working with v2.67. In v2.68 force live migration was completely removed, but it should work with v2.67 | 09:59 |
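For reference, the kind of request nightmare_unreal describes can be sketched as below. This is a minimal, hedged sketch rather than a reproduction of their setup: it only builds the `os-migrateLive` action body and the microversion header, and sends nothing. It assumes the `force` flag is accepted from microversion 2.30 through 2.67 and gone from 2.68 on, which matches the message above as far as the upper bound goes; the lower bound is my understanding of the compute API history. The helper name `build_live_migrate_request`, the server ID and the host name are hypothetical.

```python
import json


def parse_microversion(value):
    """Turn a string such as "2.67" into a comparable (major, minor) tuple."""
    major, minor = value.split(".")
    return int(major), int(minor)


def build_live_migrate_request(server_id, host, microversion="2.67", force=False):
    """Build (path, headers, body) for a live-migration action request.

    Nothing is sent over the network; the caller decides how to POST it.
    """
    version = parse_microversion(microversion)
    action = {"host": host, "block_migration": "auto"}
    if force:
        # Assumption: "force" is only valid in microversions 2.30 through 2.67.
        if not (2, 30) <= version <= (2, 67):
            raise ValueError(
                "force is not available at microversion %s" % microversion
            )
        action["force"] = True
    headers = {
        "Content-Type": "application/json",
        # Microversion header understood by the compute API.
        "OpenStack-API-Version": "compute %s" % microversion,
    }
    path = "/servers/%s/action" % server_id
    return path, headers, json.dumps({"os-migrateLive": action})


if __name__ == "__main__":
    # Hypothetical server ID and target host, for illustration only.
    print(build_live_migrate_request("SERVER_ID", "compute-2", "2.67", force=True))
```

Comparing the body and header produced here with what the client actually sends (for example from its debug output) is one way to check whether the 2.67 request really carries `force`.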
* bauzas is about to cry... | 10:10 | |
bauzas | why don't we have good api reference for https://docs.openstack.org/api-ref/network/v2/index.html#subnets ? | 10:10 |
*** k_mouza has quit IRC | 10:11 | |
noonedeadpunk | hi everyone. any chance you know why nova-compute may fail on centos7 for master and ussuri that way? https://zuul.opendev.org/t/openstack/build/6add842202c34390959d5fd0bd6fc83b/log/logs/host/nova-compute.service.journal-15-37-20.log.txt#2729-2768 | 10:11 |
noonedeadpunk | centos8, debian,ubuntu feel ok with the same setup path and configs | 10:13 |
bauzas | gibi: do you know how I can know the API reference for getting the subnets information ? | 10:13 |
bauzas | gibi: looking at https://docs.openstack.org/neutron/latest/admin/config-routed-networks.html#example | 10:13 |
bauzas | gibi: it looks to me you can get the related segment of a subnet | 10:14 |
bauzas | openstack subnet show my_subnet --c segment_id | 10:14 |
bauzas | but i guess it's a neutron extension | 10:14 |
bauzas | ahah, that's the reference which is confusing | 10:17 |
bauzas | https://docs.openstack.org/api-ref/network/v2/index.html?expanded=show-subnet-details-detail#show-subnet-details | 10:17 |
*** tobias-urdin has joined #openstack-nova | 10:25 | |
*** psachin has quit IRC | 10:25 | |
*** k_mouza has joined #openstack-nova | 10:26 | |
*** zzzeek has quit IRC | 10:27 | |
*** zzzeek has joined #openstack-nova | 10:30 | |
*** tosky has joined #openstack-nova | 10:33 | |
gibi | bauzas: sorry, I was afk | 10:33 |
*** kaisers1 has quit IRC | 10:37 | |
*** dtantsur|afk is now known as dtantsur | 10:38 | |
*** jawad_axd has joined #openstack-nova | 10:50 | |
*** kaisers has joined #openstack-nova | 10:53 | |
*** ratailor has quit IRC | 10:57 | |
*** k_mouza has quit IRC | 11:05 | |
*** rcernin has joined #openstack-nova | 11:06 | |
*** k_mouza has joined #openstack-nova | 11:09 | |
*** vishalmanchanda has quit IRC | 11:17 | |
*** lee1 has joined #openstack-nova | 11:26 | |
lee1 | kashyap: morning, random question, have you had to debug device detach issues before through libvirt and the guestOS and if so do you have any tips? | 11:27 |
*** lee1 is now known as lyarwood | 11:27 | |
lyarwood | kashyap: context is https://bugs.launchpad.net/nova/+bug/1882521 | 11:27 |
openstack | Launchpad bug 1882521 in OpenStack Compute (nova) "Failing device detachments on Focal" [Critical,Confirmed] - Assigned to Lee Yarwood (lyarwood) | 11:27 |
kashyap | Ack; back here in a min :) | 11:27 |
lyarwood | kashyap: I can reproduce while running the full suite of tests and I'm pretty sure it's just an issue of the guestOS (cirros) not being able to process the request but I just want to prove it somehow | 11:28 |
lyarwood | ack np | 11:28 |
kashyap | lyarwood: lee1 was your nick too, I suppose? | 11:29 |
* kashyap is reading; and morning/afternoon | 11:29 | |
kashyap | lyarwood: I vaguely recall triaging some device detach issues; but I forget the details. Gimme a few | 11:32 |
kashyap | lyarwood: Are you also implying that this is not reproducible with non-CirrOS guests? | 11:32 |
kashyap | Okay, you say as much in #5 | 11:34 |
kashyap | "Each time this has been hit however it appears that the Guest OS (cirros) isn't able to react to the ACPI request to detach the disk device. " | 11:34 |
lyarwood | kashyap: yeah that's my feeling at the moment, I'm looking to prove it now | 11:34 |
lyarwood | kashyap: just trying to work out how to capture the moment libvirt / QEMU signal the guestOS to detach the device | 11:35 |
kashyap | lyarwood: Right, configure the debug log filters, it should definitely give us some clues | 11:35 |
lyarwood | kashyap: and then work out how to capture that in the guest, AFAICT dmesg doesn't list it | 11:35 |
kashyap | `journalctl`? | 11:35 |
lyarwood | kashyap: cirros doesn't have systemd | 11:37 |
kashyap | Darn, I keep forgetting | 11:37 |
*** k_mouza has quit IRC | 11:37 | |
lyarwood | ah it's using acpid | 11:37 |
*** k_mouza has joined #openstack-nova | 11:39 | |
sean-k-mooney1 | i need to try and find time to test alpine as a cirros alternative | 11:40 |
kashyap | lyarwood: Fedora doesn't do it? | 11:40 |
kashyap | (As in, 'acpid' daemon) | 11:40 |
sean-k-mooney1 | fedora uses systemd for udev im not cure if that will handle acpid too | 11:41 |
sean-k-mooney1 | *sure | 11:41 |
*** sean-k-mooney1 is now known as sean-k-mooney | 11:41 | |
*** k_mouza has quit IRC | 11:44 | |
kashyap | sean-k-mooney: 'systemd' can handle some ACPI events; not all - https://wiki.archlinux.org/index.php/Power_management#ACPI_events | 11:45 |
kashyap | On my Fedora laptop I see: | 11:45 |
kashyap | $> systemctl | grep -i acpi sys-devices-platform-thinkpad_acpi-leds-tpacpi::kbd_backlight.device loaded active plugged /sys/devices/platform/thinkpad_acpi/leds/tpacpi::kbd_backlight | 11:46 |
kashyap | systemd-backlight@leds:tpacpi::kbd_backlight.service loaded active exited Load/Save Screen Backlight Brightness of leds:tpacpi::kbd_backlight | 11:46 |
sean-k-mooney | lyarwood: so i have been suggesting we should look into using alpine instead of cirros going forward in the gate. it still does not use systemd but it's one of the lightest weight distros i know of and unlike cirros it's still maintained regularly | 11:46 |
kashyap | (So some Thinkpad-related ACPI events are handled) | 11:46 |
sean-k-mooney | kashyap: that sounds like a lenovo extension | 11:46 |
*** k_mouza has joined #openstack-nova | 11:46 | |
sean-k-mooney | rather than generic support | 11:46 |
kashyap | lyarwood: Back to your original question - yeah, we need to find the "event" (IIRC, DEVICE_DELETED - need to double-check) that libvirt sends to the guest OS | 11:47 |
lyarwood | kashyap: do you know what that actually maps to in terms of what the guestOS sees? | 11:47 |
lyarwood | kashyap: an ACPI event right but any idea what type etc? | 11:47 |
kashyap | lyarwood: Not off the top of my head, perhaps Michal from libvirt might know; he worked on the 'udev' integration | 11:47 |
lyarwood | kashyap: could you ask and I'll work out a way of capturing that within the guestOS itself | 11:48 |
kashyap | lyarwood: Yeah, just asked; he's AFK. I'm checking w/ a couple of others | 11:48 |
lyarwood | sean-k-mooney: tbh it's a little silly that we are using it in CI and running nodes with such little resource as well tbh | 11:49 |
sean-k-mooney | lyarwood: well we don't have enough disk/ram to use something much heavier | 11:49 |
sean-k-mooney | not without reducing concurrency at least | 11:49 |
sean-k-mooney | cirros made sense when it was actively maintained and updated | 11:50 |
*** k_mouza has quit IRC | 11:51 | |
*** rcernin has quit IRC | 11:54 | |
kashyap | lyarwood: Do you have access to the guest? If so - is this present in it: /sys/module/pci_hotplug? | 11:59 |
*** sapd1_x has quit IRC | 12:00 | |
sean-k-mooney | kashyap: cirros uses a stripped down ubuntu 18.04 kernel so it may not be | 12:01 |
lyarwood | kashyap: yeah that's there, I assume I can enable that | 12:01 |
lyarwood | kashyap: debug that is | 12:02 |
*** xek has joined #openstack-nova | 12:02 | |
lyarwood | and yeah was just reading https://blog.chrishowie.com/2019/09/19/hot-swapping-virtio-disks-on-qemu/ so it's a PCI hot remove with virtio-blk that makes sense | 12:03 |
sean-k-mooney | yep it is | 12:03 |
sean-k-mooney | that's why i was asserting that virtio-scsi or q35 might help | 12:03 |
kashyap | lyarwood: So I learn that's the part (the /sys/module/pci_hotplug) which is responsible for hotplug/hotunplug events | 12:04 |
sean-k-mooney | virtio-scsi would be the simplest thing to enable | 12:04 |
lyarwood | sean-k-mooney: well if the guestOS can't process the request to detach I don't think changing the underlying bus is going to help tbh | 12:04 |
kashyap | lyarwood: So I just chatted w/ a couple of QEMU devs; and it seems notoriously difficult to detect this. Way too low-level ... | 12:05 |
sean-k-mooney | lyarwood: well it wont be a pci hotplug anymore | 12:05 |
sean-k-mooney | lyarwood: it will be a scsi detach | 12:05 |
lyarwood | sean-k-mooney: true but the guest would still need to handle the SCSI command (?) to detach | 12:05 |
gibi | stephenfin: fyi, I have a question in https://review.opendev.org/#/c/746945/6/nova/tests/functional/libvirt/test_pci_sriov_servers.py@a370 | 12:05 |
sean-k-mooney | yes probably but i think that would be more reliable | 12:05 |
kashyap | lyarwood: A snippet: | 12:06 |
kashyap | <kashyap> Hiya, a random question: on monitor command 'device_del' (for device detach), would you happen to know how exactly does it manifest in the guest? | 12:06 |
kashyap | Answer (from Igor): guest gets SCI interrupt, next thing it reads status from GPE block and calls appropriate AML handler (it's all done within guest kernel) | 12:06 |
kashyap | Answer 2 (from DanPB): "you'll get <insert hand waving> an ACPI unplug event something in the guest needs to respond to this event for it to complete" | 12:06 |
*** xek has quit IRC | 12:08 | |
*** rcernin has joined #openstack-nova | 12:10 | |
jangutter_ | kashyap: on physical hw I've hotplugged and unplugged SATA/SCSI/USB devices for ages, but I've NEVER done so with a PCIe device. | 12:11 |
*** jangutter_ is now known as jangutter | 12:11 | |
*** rcernin has quit IRC | 12:11 | |
sean-k-mooney | gibi: stephenfin can i get your eyes on this https://review.opendev.org/#/c/738432/ | 12:11 |
*** rcernin has joined #openstack-nova | 12:11 | |
sean-k-mooney | i want to get that bug fix merged before m3 if we can so we can backport it to train | 12:12 |
kashyap | jangutter: Yeap, noted | 12:12 |
kashyap | lyarwood: So Jiri from libvirt also suggests to get the communication w/ QEMU monitor | 12:12 |
sean-k-mooney | gibi: stephenfin im also hoping to get https://review.opendev.org/#/q/topic:bug/1888395+(status:open+OR+status:merged) merged soon but im going to address artom's nits now | 12:13 |
lyarwood | kashyap: yeah tracking that, I see the DEVICE_DELETED events | 12:16 |
lyarwood | kashyap: I've used https://www.kernel.org/doc/html/latest/firmware-guide/acpi/debug.html to enable ACPI debug for the ACPI_PCI_COMPONENT | 12:17 |
lyarwood | kashyap: within the guestos | 12:17 |
lyarwood | kashyap: lets see if that helps | 12:17 |
kashyap | lyarwood: So I just posted #9 | 12:17 |
kashyap | To copy/paste my point-1 from there: | 12:17 |
kashyap | "- DEVICE_DELETED is the event that QEMU sends to libvirt, *once* the device was removed by the guest, so that libvirt can clean-up. So if we see DEVICE_DELETED that means the device was successfully detached from QEMU's point of view (therefore, from the guest's PoV, too)" | 12:17 |
lyarwood | kashyap: right sorry I'm just working out how to instrument things in CI at the moment | 12:18 |
kashyap | lyarwood: Are you using a new kernel rebuilt with it? | 12:18 |
lyarwood | kashyap: detach works correctly in the env at the moment | 12:18 |
lyarwood | kashyap: I'm just trying to figure out what I need to capture during a run to show things are delayed in the guestos | 12:19 |
lyarwood | kashyap: and yeah 5.3.0-26-generic is the kernel | 12:19 |
*** k_mouza has joined #openstack-nova | 12:20 | |
kashyap | lyarwood: So, Igor (KVM/QEMU dev) says: "You'd could watch for udev events as indirect result of unplug events for specific device subsystem" | 12:20 |
*** rcernin has quit IRC | 12:21 | |
lyarwood | kashyap: I don't think cirros is using udev tbh | 12:21 |
kashyap | lyarwood: Nod; I've actually snipped out his first part where he admits he isn't familiar w/ 'acpid' | 12:22 |
lyarwood | https://git.busybox.net/busybox/tree/util-linux/acpid.c it's not even the old version I was used to tbh | 12:23 |
*** k_mouza has quit IRC | 12:24 | |
kashyap | lyarwood: I'm curious if your test with slightly "better resources" for the guest fixes it | 12:25 |
*** mkrai has quit IRC | 12:25 | |
kashyap | lyarwood: Also can you tell what's the buggy guest configuration? If you don't mind posting the guest XML... | 12:25 |
lyarwood | kashyap: I still saw a few failures | 12:25 |
kashyap | So it's not the resources allocated to the guest | 12:26 |
lyarwood | kashyap: that was in reference to the host guest running openstack FWIW | 12:26 |
lyarwood | kashyap: correct | 12:26 |
lyarwood | kashyap: CI nodes run with 1 vCPU and 8GB of RAM at the moment | 12:26 |
lyarwood | kashyap: the instances have 1 vCPU and 128MB of RAM | 12:27 |
kashyap | lyarwood: BTW, haven't we "proved" that it is the guest OS that is buggy when you can't reproduce it w/ other guest OSes? :) | 12:27 |
kashyap | (Thx for the guest config) | 12:27 |
lyarwood | kashyap: I'd just like to capture the actual events to prove it | 12:27 |
kashyap | Nod. Seems notoriously difficult so far from my interactions | 12:28 |
kashyap | lyarwood: I guess your approach w/ this rebuilt kernel w/ ACPI debug is to reproduce the prob and watch for output in 'dmesg'? | 12:28 |
*** mkrai has joined #openstack-nova | 12:29 | |
lyarwood | kashyap: yeah, I shouldn't need to rebuild the kernel | 12:30 |
lyarwood | kashyap: I just need to work out a way of providing command line args to the instances | 12:30 |
lyarwood | kashyap: and then capture their console logs on failure | 12:31 |
* kashyap bbiab; break | 12:31 | |
*** jangutter_ has joined #openstack-nova | 12:36 | |
*** jsuchome has quit IRC | 12:36 | |
*** jangutter has quit IRC | 12:38 | |
*** jangutter has joined #openstack-nova | 12:38 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: [Trivial] Replace ref of policy.json to policy.yaml https://review.opendev.org/749821 | 12:39 |
gmann | dansmith: sean-k-mooney gibi policy file default change is ready now- https://review.opendev.org/#/c/748059/9 | 12:39 |
*** jangutter_ has quit IRC | 12:41 | |
gibi | gmann: ack | 12:43 |
openstackgerrit | sean mooney proposed openstack/nova master: add functional regression test for bug #1888395 https://review.opendev.org/747454 | 12:45 |
openstack | bug 1888395 in OpenStack Compute (nova) "shared live migration of a vm with a vif is broken in train" [High,In progress] https://launchpad.net/bugs/1888395 - Assigned to sean mooney (sean-k-mooney) | 12:45 |
openstackgerrit | sean mooney proposed openstack/nova master: Set migrate_data.vifs only when using multiple port bindings https://review.opendev.org/742180 | 12:45 |
*** Luzi has joined #openstack-nova | 12:45 | |
*** mkrai has quit IRC | 12:50 | |
*** k_mouza has joined #openstack-nova | 12:57 | |
sean-k-mooney | stephenfin: another one for your review queue https://review.opendev.org/#/q/topic:bug/1860555+(status:open+OR+status:merged) although that is still WIP so lower priority but that might be the cause of our downstream issue | 13:10 |
sean-k-mooney | gmann: im just popping out to grab lunch but ill try and take a look at the policy change when i get back. its not really my area but ill take a look in any case | 13:21 |
gmann | sure, thanks | 13:21 |
*** jangutter_ has joined #openstack-nova | 13:21 | |
kashyap | lyarwood: BTW, can you please link to the latest error logs from upstream? I can't find them here - https://zuul.opendev.org/t/openstack/build/9290c83e18a741a5bdab4e28de5eedb7/log/ | 13:22 |
kashyap | lyarwood: I'm looking for the offending guest QEMU command-line and its guest kernel version | 13:22 |
gibi | gmann: only have a request in the reno https://review.opendev.org/#/c/748059 but overall looks good to me | 13:23 |
*** sapd1_x has joined #openstack-nova | 13:23 | |
gmann | gibi: thanks. updating. | 13:23 |
kashyap | lyarwood: The reason for the above details is because one of the QEMU devs say "lack of CPU time doesn't make sense [as a potential cause], as hot[un]plug events should be processed sooner or later" | 13:23 |
*** jangutte_ has joined #openstack-nova | 13:24 | |
*** jangutter has quit IRC | 13:25 | |
kashyap | lyarwood: I think I should find the logs here (for the latest failing -focal logs): https://review.opendev.org/#/c/734029/ | 13:25 |
lyarwood | kashyap: https://zuul.opendev.org/t/openstack/build/eee0dc94780c4555b376f17c4f50c301 is a recent example | 13:26 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Migrate default policy file from JSON to YAML https://review.opendev.org/748059 | 13:26 |
lyarwood | kashyap: https://zuul.opendev.org/t/openstack/build/eee0dc94780c4555b376f17c4f50c301/log/controller/logs/libvirt/qemu/instance-0000007a_log.txt is the QEMU log for an instance that hit this | 13:27 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Migrate default policy file from JSON to YAML https://review.opendev.org/748059 | 13:27 |
lyarwood | kashyap: 1dec20ff-922e-4bed-a97f-1699f114e74b | 13:27 |
*** jangutter_ has quit IRC | 13:27 | |
kashyap | lyarwood: Thank you; do you have the guest kernel version? (Or the CirrOS version - then I can figure out the kernel version) | 13:27 |
lyarwood | kashyap: pretty sure it's the same as the version I listed earlier | 13:27 |
lyarwood | 5.3.0-26-generic | 13:28 |
kashyap | Ah, okay; was about to guess as much. Thank you | 13:28 |
gmann | gibi: updated, added bug in cmt msg also | 13:28 |
*** xek has joined #openstack-nova | 13:28 | |
lyarwood | kashyap: just modified the cirros image in my test env to use debug ACPI btw | 13:28 |
lyarwood | kashyap: just hacking tempest to dump the console log / dmesg on failure | 13:28 |
kashyap | Ah, cool | 13:29 |
*** jawad_axd has quit IRC | 13:29 | |
*** jangutter has joined #openstack-nova | 13:30 | |
*** jangutte_ has quit IRC | 13:30 | |
*** k_mouza has quit IRC | 13:30 | |
*** jangutter_ has joined #openstack-nova | 13:31 | |
*** jangutter has quit IRC | 13:34 | |
gibi | gmann: thanks, +2 | 13:35 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: [Trivial] Replace ref of policy.json to policy.yaml https://review.opendev.org/749821 | 13:35 |
gmann | gibi: thanks. ^^ this is trivial one to replace the ref of policy.json in doc and test | 13:36 |
gibi | looking | 13:37 |
*** xek has quit IRC | 13:38 | |
gibi | sean-k-mooney: I have a question at https://review.opendev.org/#/c/742180/11 | 13:45 |
*** priteau has joined #openstack-nova | 13:47 | |
* bauzas just discovered today a new world with neutron | 13:48 | |
*** priteau has quit IRC | 13:48 | |
*** priteau has joined #openstack-nova | 13:49 | |
bauzas | sean-k-mooney: sooooo, we build the VIFs once we are in the compute service, right? | 13:49 |
bauzas | well, answering myself | 13:51 |
bauzas | right, only when we add the fixed IP to an instance | 13:51 |
bauzas | which is called either after creating the instance in the compute, or when adding the fixed IP directly to an instance by the API... | 13:52 |
kashyap | lyarwood: So, I just combed through the libvirtd log surrounding the QMP 'device_del' (which does the detach), and here's the little fragment: https://kashyapc.fedorapeople.org/CirrOS_device_detach_issues/libvirtd-log-surrounding-device_del.txt | 14:00 |
bauzas | gibi: sean-k-mooney: question, should we look at the segments if someone asks the API to put a port to an existing instance ? | 14:00 |
bauzas | if so... | 14:00 |
kashyap | lyarwood: It all looks "clean" until here to me: | 14:01 |
kashyap | 2020-09-03 20:01:53.019+0000: 65328: debug : qemuMonitorJSONIOProcessEvent:205 : handle DEVICE_DELETED handler=0x7f0230572840 data=0x55d556edf3c0 | 14:01 |
kashyap | 2020-09-03 20:01:53.019+0000: 65328: debug : qemuMonitorJSONHandleDeviceDeleted:1287 : missing device in device deleted event | 14:01 |
gibi | port will be bound and I guess neutron will fail the binding if there is no segment on the given host | 14:01 |
gibi | as far as I remember interface_attach is a call so the error will propagate back the user | 14:01 |
gibi | bauzas: ^^ | 14:01 |
bauzas | gibi: ok, so Neutron will check it ? | 14:02 |
bauzas | if so, fine | 14:02 |
gibi | I assume, yes | 14:02 |
bauzas | cool | 14:02 |
gibi | as neutron would need to assign an ip | 14:02 |
gibi | during the binding | 14:02 |
bauzas | anyway, we could provide a caveat documentation if no | 14:02 |
bauzas | anyway, today is the last day I'm trying to work on this | 14:03 |
bauzas | gibi: sean-k-mooney: if you have changes you want me to review, lemme know | 14:03 |
bauzas | and then I'll review them tomorrow | 14:03 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: WIP/DNM libvirt: Start emitting DeviceRemovedEvent and DeviceRemovalFailedEvent events https://review.opendev.org/749929 | 14:03 |
*** sapd1_x has quit IRC | 14:03 | |
gibi | bauzas: sriov attach is ready, sean-k-mooney is already +1 on it and the bottom has +2s from stephenfin. series starts here https://review.opendev.org/#/c/741436 | 14:04 |
bauzas | gibi: ack, will look | 14:04 |
gibi | thanks! | 14:04 |
bauzas | gibi: now that I work on some network features, I know better the related files ;) | 14:05 |
gibi | :) | 14:05 |
lyarwood | kashyap: yeah that's long after tempest has stopped waiting for the volume to be detached | 14:07 |
lyarwood | kashyap: let me grab some logs in pastebin | 14:07 |
kashyap | lyarwood: I've got some contextual stuff here: https://kashyapc.fedorapeople.org/CirrOS_device_detach_issues/ | 14:07 |
*** sapd1_x has joined #openstack-nova | 14:07 | |
lyarwood | kashyap: http://paste.openstack.org/show/797545/ - AFAICT n-cpu stops trying to detach the volume much earlier than the libvirtd logs you've posted | 14:16 |
* kashyap clicks | 14:17 | |
kashyap | lyarwood: Okay, I perhaps need to look further up; let me see if I can see this "Unable to detach" thing in the log | 14:18 |
kashyap | lyarwood: I'm stumped - I don't see why that "Unable to detach ..." isn't captured here: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c3a/734029/2/check/devstack-platform-focal/c3ab542/controller/logs/libvirt/libvirtd_log.txt | 14:19 |
kashyap | (Be aware of the above log size: gzip-compressed - 8.2MB; uncompressed - 118MB) | 14:19 |
lyarwood | 721862 2020-09-03 19:58:35.443+0000: 65331: debug : qemuDomainDeleteDevice:128 : Detaching of device virtio-disk1 failed and no event arrived | 14:20 |
lyarwood | ^ kashyap I think that's what we are after | 14:20 |
kashyap | lyarwood: "Huzzah", that's right | 14:20 |
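The hunt through the 118MB libvirtd log above can be sketched as a small scan for the two message types quoted in this exchange. This is a hedged sketch, not the procedure lyarwood or kashyap actually used: the regular expressions are inferred only from the three log lines pasted above, the function name `detach_events` is hypothetical, and nothing here establishes that the two sample lines refer to the same device or instance.

```python
import re
from datetime import datetime

# Layout inferred from the libvirtd debug lines quoted above:
#   <timestamp>: <thread id>: <level> : <function>:<line> : <message>
LOG_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}): "
    r"(?P<tid>\d+): (?P<level>\w+) : "
    r"(?P<func>\w+):(?P<lineno>\d+) : (?P<msg>.*)"
)
DETACH_FAILED = re.compile(
    r"Detaching of device (?P<alias>\S+) failed and no event arrived"
)


def detach_events(lines):
    """Yield (timestamp, kind, detail) for detach-related log lines."""
    for line in lines:
        # search() rather than match() so a leading editor line number is ignored.
        m = LOG_LINE.search(line)
        if m is None:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f%z")
        failed = DETACH_FAILED.search(m.group("msg"))
        if failed:
            yield ts, "detach_failed", failed.group("alias")
        elif m.group("msg").startswith("handle DEVICE_DELETED"):
            yield ts, "device_deleted", m.group("func")


# The two relevant lines as pasted in the channel.
sample = [
    "721862 2020-09-03 19:58:35.443+0000: 65331: debug : "
    "qemuDomainDeleteDevice:128 : Detaching of device virtio-disk1 "
    "failed and no event arrived",
    "2020-09-03 20:01:53.019+0000: 65328: debug : "
    "qemuMonitorJSONIOProcessEvent:205 : handle DEVICE_DELETED "
    "handler=0x7f0230572840 data=0x55d556edf3c0",
]
events = list(detach_events(sample))
gap = (events[1][0] - events[0][0]).total_seconds()
print(events[0][1], events[0][2], "->", events[1][1], "after", gap, "seconds")
```

On a real log the same generator could be fed an open file object, which keeps memory use flat even at the uncompressed size mentioned above. A gap of a few minutes between the two kinds of line would be consistent with lyarwood's remark that the DEVICE_DELETED event shows up long after tempest has stopped waiting.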
*** k_mouza has joined #openstack-nova | 14:29 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Follow up for I67504a37b0fe2ae5da3cba2f3122d9d0e18b9481 https://review.opendev.org/750184 | 14:33 |
*** Luzi has quit IRC | 14:39 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add support for resize and cold migration of emulated TPM files https://review.opendev.org/639934 | 14:50 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Expand generic reproducer for bug #1879878 https://review.opendev.org/750186 | 14:50 |
openstack | bug 1879878 in OpenStack Compute (nova) "VM become Error after confirming resize with Error info CPUUnpinningInvalid on source node " [Medium,In progress] https://launchpad.net/bugs/1879878 - Assigned to Stephen Finucane (stephenfinucane) | 14:50 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Set 'old_flavor', 'new_flavor' on source before resize https://review.opendev.org/750187 | 14:50 |
gmann | stephenfin: are you planning the xenapi removal for Victoria release? if so i can review your tempest patch on priority (as that will block the nova side change) otherwise after Focal migration work. | 14:54 |
stephenfin | gmann: Yes, I was hoping to | 14:54 |
stephenfin | I think it's in merge conflict though | 14:54 |
* stephenfin looks | 14:54 | |
gmann | yeah. | 14:55 |
stephenfin | okay, resolved that. docstring conflict | 14:56 |
stephenfin | gibi: replied at https://review.opendev.org/#/c/746945/6/nova/tests/functional/libvirt/test_pci_sriov_servers.py@a370 | 14:58 |
gibi | thanks, looking | 14:59 |
*** mkrai has joined #openstack-nova | 15:01 | |
*** jangutter has joined #openstack-nova | 15:02 | |
*** jangutter has quit IRC | 15:02 | |
*** jangutter has joined #openstack-nova | 15:03 | |
*** jangutter_ has quit IRC | 15:05 | |
*** k_mouza has quit IRC | 15:07 | |
*** jangutter has quit IRC | 15:07 | |
*** jangutter has joined #openstack-nova | 15:08 | |
*** k_mouza has joined #openstack-nova | 15:11 | |
*** k_mouza has quit IRC | 15:21 | |
*** k_mouza has joined #openstack-nova | 15:24 | |
*** martinkennelly has joined #openstack-nova | 15:26 | |
*** links has quit IRC | 15:34 | |
*** k_mouza has quit IRC | 15:39 | |
*** ircuser-1 has joined #openstack-nova | 15:42 | |
*** k_mouza has joined #openstack-nova | 15:47 | |
sean-k-mooney | bauzas: technically yes but the port binding would fail | 15:52 |
sean-k-mooney | ah gibi aready said that | 15:52 |
bauzas | all cool then | 15:52 |
sean-k-mooney | gibi so regarding https://review.opendev.org/#/c/742180/11/nova/tests/functional/regressions/test_bug_1888395.py i was thinking of using stephen's series eventually to enable the migration testing | 15:55 |
sean-k-mooney | gibi: once the sriov migration functional test series merges then that regression test can be updated | 15:55 |
gibi | sean-k-mooney: yeah that would be nice | 15:55 |
gibi | I read through stephenfin's series today and I'm +2 almost all the way | 15:56 |
sean-k-mooney | im not sure if i need all the patches by the way. i was hoping to get this merged before his series to avoid conflicts on backport but im hoping both merge in victoria | 15:57 |
sean-k-mooney | all the patches in stephens series that is | 15:57 |
gibi | sean-k-mooney: my -1 on https://review.opendev.org/#/c/742180/ is about the question if we break SRIOV live migration if there is no multi portbinding | 15:58 |
sean-k-mooney | yes so sriov live migration requires multiple port bindings | 15:58 |
sean-k-mooney | it was only meant to work if the backend supported that | 15:58 |
gibi | then the question is, will it fail cleanly? | 15:58 |
sean-k-mooney | yes it will | 15:59 |
*** mkrai has quit IRC | 15:59 | |
sean-k-mooney | we check if multiple port bindings are supported in the conductor | 15:59 |
sean-k-mooney | and fail the migration if not | 15:59 |
sean-k-mooney | gibi: https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L250-L255 | 16:00 |
gibi | cool | 16:00 |
gibi | I'm dropping my -1 then | 16:00 |
sean-k-mooney | :) any other concerns? | 16:00 |
gibi | nope | 16:01 |
sean-k-mooney | should we be worried about all those gate timeouts? | 16:01 |
gibi | I haven't checked the gate this afternoon | 16:01 |
sean-k-mooney | i'm seeing sqlalchemy errors in the unit tests which are unrelated | 16:02 |
gibi | I'm leaving for today... | 16:02 |
gibi | o/ | 16:03 |
sean-k-mooney | o/ | 16:03 |
*** openstackgerrit has quit IRC | 16:11 | |
*** k_mouza has quit IRC | 16:34 | |
*** xek has joined #openstack-nova | 16:41 | |
*** ralonsoh has quit IRC | 16:44 | |
*** k_mouza has joined #openstack-nova | 16:45 | |
*** martinkennelly has quit IRC | 16:46 | |
*** sapd1_x has quit IRC | 16:57 | |
*** derekh has quit IRC | 17:03 | |
*** tesseract has quit IRC | 17:04 | |
*** dtantsur is now known as dtantsur|afk | 17:18 | |
*** k_mouza has quit IRC | 17:22 | |
*** nightmare_unreal has quit IRC | 17:40 | |
*** zzzeek has quit IRC | 18:20 | |
*** zzzeek has joined #openstack-nova | 18:24 | |
*** openstackgerrit has joined #openstack-nova | 18:30 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: fakelibvirt: Use versionutils to set min versions found in the driver https://review.opendev.org/749707 | 18:30 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Bump MIN_{LIBVIRT,QEMU}_VERSION and NEXT_MIN_{LIBVIRT,QEMU}_VERSION https://review.opendev.org/746981 | 18:30 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Remove MIN_LIBVIRT_FILE_BACKED_DISCARD_VERSION https://review.opendev.org/746982 | 18:30 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Remove MIN_{LIBVIRT,QEMU}_NATIVE_TLS_VERSION https://review.opendev.org/746983 | 18:30 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Remove MIN_LIBVIRT_BETTER_SIGKILL_HANDLING https://review.opendev.org/746984 | 18:30 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Remove MIN_LIBVIRT_VIDEO_MODEL_VERSIONS https://review.opendev.org/746985 | 18:30 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Remove MIN_{LIBVIRT,QEMU}_PMEM_SUPPORT https://review.opendev.org/746986 | 18:30 |
*** zzzeek has quit IRC | 18:50 | |
*** zzzeek has joined #openstack-nova | 18:52 | |
openstackgerrit | Merged openstack/nova stable/ussuri: Add note and daxio version to the vPMEM document https://review.opendev.org/749681 | 19:00 |
*** priteau has quit IRC | 19:04 | |
*** gregwork has joined #openstack-nova | 19:27 | |
*** ociuhandu has joined #openstack-nova | 19:37 | |
*** ociuhandu_ has joined #openstack-nova | 19:42 | |
*** ociuhandu has quit IRC | 19:45 | |
*** ociuhandu_ has quit IRC | 20:19 | |
openstackgerrit | sean mooney proposed openstack/nova master: add functional regression test for bug #1888395 https://review.opendev.org/747454 | 20:22 |
openstack | bug 1888395 in OpenStack Compute (nova) "shared live migration of a vm with a vif is broken in train" [High,In progress] https://launchpad.net/bugs/1888395 - Assigned to sean mooney (sean-k-mooney) | 20:22 |
openstackgerrit | sean mooney proposed openstack/nova master: Set migrate_data.vifs only when using multiple port bindings https://review.opendev.org/742180 | 20:22 |
*** ociuhandu has joined #openstack-nova | 20:35 | |
openstackgerrit | Merged openstack/nova stable/train: Removed the host FQDN from the exception message https://review.opendev.org/749609 | 20:35 |
openstackgerrit | Merged openstack/nova stable/ussuri: resolve ResourceProviderSyncFailed issue https://review.opendev.org/749668 | 20:35 |
openstackgerrit | Merged openstack/nova stable/ussuri: Set different VirtualDevice.key https://review.opendev.org/749418 | 20:35 |
openstackgerrit | Merged openstack/nova stable/train: tests: Add reproducer for bug #1889633 https://review.opendev.org/748254 | 20:35 |
openstack | bug 1889633 in OpenStack Compute (nova) train "Pinned instance with thread policy can consume VCPU" [High,In progress] https://launchpad.net/bugs/1889633 - Assigned to Stephen Finucane (stephenfinucane) | 20:35 |
openstackgerrit | Merged openstack/nova stable/train: Add checks for volume status when rebuilding https://review.opendev.org/748558 | 20:35 |
*** ociuhandu has quit IRC | 20:39 | |
*** zzzeek has quit IRC | 20:42 | |
*** zzzeek has joined #openstack-nova | 20:44 | |
*** k_mouza has joined #openstack-nova | 21:11 | |
*** xek has quit IRC | 21:14 | |
openstackgerrit | sean mooney proposed openstack/nova master: use os-brick connector fixture form ServersTestBase https://review.opendev.org/750215 | 21:16 |
*** slaweq has quit IRC | 21:19 | |
*** slaweq has joined #openstack-nova | 21:23 | |
*** slaweq has quit IRC | 21:27 | |
openstackgerrit | sean mooney proposed openstack/nova master: add using_multiple_port_bindings property to livemigrate_data https://review.opendev.org/750217 | 21:42 |
openstackgerrit | sean mooney proposed openstack/nova master: add using_multiple_port_bindings property to LiveMigrateData https://review.opendev.org/750217 | 21:44 |
*** k_mouza has quit IRC | 21:48 | |
*** hoonetorg has quit IRC | 22:15 | |
*** hoonetorg has joined #openstack-nova | 22:28 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Migrate default policy file from JSON to YAML https://review.opendev.org/748059 | 22:32 |
*** rcernin has joined #openstack-nova | 22:41 | |
*** tosky has quit IRC | 23:16 | |