*** mlavalle has quit IRC | 00:05 | |
*** macz_ has quit IRC | 00:09 | |
*** k_mouza has joined #openstack-nova | 00:14 | |
*** k_mouza has quit IRC | 00:19 | |
*** dustinc has quit IRC | 00:30 | |
*** swp20 has joined #openstack-nova | 00:38 | |
*** sapd1_x has quit IRC | 00:47 | |
openstackgerrit | norman shen proposed openstack/nova master: Saving security group to info_cache https://review.opendev.org/c/openstack/nova/+/786348 | 00:48 |
---|---|---|
*** swp20 has quit IRC | 00:56 | |
*** yangkai has joined #openstack-nova | 01:07 | |
*** __ministry has joined #openstack-nova | 01:21 | |
*** swp20 has joined #openstack-nova | 01:45 | |
*** macz_ has joined #openstack-nova | 02:10 | |
*** macz_ has quit IRC | 02:14 | |
openstackgerrit | norman shen proposed openstack/nova-specs master: Speed up server details https://review.opendev.org/c/openstack/nova-specs/+/791620 | 02:16 |
*** k_mouza has joined #openstack-nova | 02:34 | |
*** k_mouza has quit IRC | 02:39 | |
*** rcernin has quit IRC | 02:42 | |
*** rcernin has joined #openstack-nova | 02:50 | |
openstackgerrit | norman shen proposed openstack/nova-specs master: Speed up server details https://review.opendev.org/c/openstack/nova-specs/+/791620 | 03:12 |
*** mkrai has joined #openstack-nova | 03:18 | |
openstackgerrit | norman shen proposed openstack/nova-specs master: Speed up server details https://review.opendev.org/c/openstack/nova-specs/+/791620 | 03:38 |
*** psachin has joined #openstack-nova | 03:39 | |
*** ociuhandu has joined #openstack-nova | 03:40 | |
*** ociuhandu has quit IRC | 03:44 | |
*** macz_ has joined #openstack-nova | 03:46 | |
*** macz_ has quit IRC | 03:51 | |
*** sapd1_x has joined #openstack-nova | 04:29 | |
*** ratailor has joined #openstack-nova | 05:00 | |
*** ralonsoh has joined #openstack-nova | 05:36 | |
*** sapd1_x has quit IRC | 05:38 | |
*** macz_ has joined #openstack-nova | 05:47 | |
*** macz_ has quit IRC | 05:52 | |
*** supamatt has joined #openstack-nova | 05:58 | |
*** swp20 has quit IRC | 06:19 | |
*** slaweq has joined #openstack-nova | 06:33 | |
*** k_mouza has joined #openstack-nova | 06:35 | |
*** vishalmanchanda has joined #openstack-nova | 06:35 | |
*** k_mouza has quit IRC | 06:39 | |
*** nightmare_unreal has joined #openstack-nova | 06:45 | |
*** dklyle has quit IRC | 06:47 | |
*** mkrai has quit IRC | 07:06 | |
*** jawad_axd has quit IRC | 07:07 | |
*** jawad_axd has joined #openstack-nova | 07:07 | |
*** sapd1_x has joined #openstack-nova | 07:09 | |
*** jawad_axd has quit IRC | 07:11 | |
*** ociuhandu has joined #openstack-nova | 07:15 | |
*** lucasagomes has joined #openstack-nova | 07:16 | |
*** ociuhandu has quit IRC | 07:16 | |
*** andrewbonney has joined #openstack-nova | 07:17 | |
*** rcernin has quit IRC | 07:20 | |
*** ralonsoh has quit IRC | 07:24 | |
*** ralonsoh has joined #openstack-nova | 07:24 | |
*** mkrai has joined #openstack-nova | 07:27 | |
*** rpittau|afk is now known as rpittau | 07:33 | |
*** jawad_axd has joined #openstack-nova | 07:37 | |
*** tosky has joined #openstack-nova | 07:38 | |
*** rcernin has joined #openstack-nova | 07:47 | |
*** macz_ has joined #openstack-nova | 07:48 | |
*** ociuhandu has joined #openstack-nova | 07:52 | |
*** macz_ has quit IRC | 07:52 | |
*** rcernin has quit IRC | 07:54 | |
*** ChanServ has quit IRC | 07:54 | |
*** ChanServ has joined #openstack-nova | 07:54 | |
*** services. sets mode: +o ChanServ | 07:54 | |
*** derekh has joined #openstack-nova | 07:55 | |
*** rcernin has joined #openstack-nova | 07:56 | |
*** swp20 has joined #openstack-nova | 08:01 | |
*** ociuhandu has quit IRC | 08:02 | |
*** ociuhandu has joined #openstack-nova | 08:05 | |
*** ociuhandu has quit IRC | 08:07 | |
*** ociuhandu has joined #openstack-nova | 08:07 | |
*** rcernin has quit IRC | 08:09 | |
*** mkrai has quit IRC | 08:10 | |
*** rcernin has joined #openstack-nova | 08:10 | |
*** rcernin has quit IRC | 08:19 | |
*** ociuhandu has quit IRC | 08:20 | |
*** rcernin has joined #openstack-nova | 08:20 | |
*** ociuhandu has joined #openstack-nova | 08:20 | |
openstackgerrit | Daniel Bengtsson proposed openstack/nova master: Use the new type HostDomainOpt. https://review.opendev.org/c/openstack/nova/+/788240 | 08:23 |
*** rcernin has quit IRC | 08:25 | |
*** k_mouza has joined #openstack-nova | 08:30 | |
*** ociuhandu has quit IRC | 08:30 | |
*** ociuhandu has joined #openstack-nova | 08:31 | |
*** ociuhandu has quit IRC | 08:31 | |
*** ociuhandu has joined #openstack-nova | 08:31 | |
*** k_mouza has quit IRC | 08:34 | |
*** rcernin has joined #openstack-nova | 08:38 | |
*** rcernin has quit IRC | 08:44 | |
*** ignaziocassano has joined #openstack-nova | 08:49 | |
ignaziocassano | Ignazio Cassano <ignaziocassano@gmail.com> | 08:50 |
ignaziocassano | 10:48 (1 minute ago) | 08:50 |
ignaziocassano | to openstack-discuss | 08:50 |
ignaziocassano | Hello Guys, | 08:50 |
ignaziocassano | on train centos7 I am facing live migration issue only for some instances (not all). | 08:50 |
ignaziocassano | The error reported is: | 08:50 |
ignaziocassano | 2021-05-19 08:45:57.096 142537 ERROR nova.compute.manager [-] [instance: b18450e8-b3db-4886-a737-c161d99c6a46] Live migration failed.: libvirtError: Unable to read from monitor: Connection reset by peer | 08:50 |
ignaziocassano | some instances migrate without errors. I tried stopping and restarting libvirtd and nova-compute, but that did not solve it | 08:51 |
ignaziocassano | For instances where the migration failed, if I stop them and start them on another node, then migrate them back to the original node and migrate again, it works | 08:52 |
ignaziocassano | Sorry, the version is stein | 08:55 |
*** lyarwood has joined #openstack-nova | 08:56 | |
*** rcernin has joined #openstack-nova | 08:56 | |
stephenfin | ignaziocassano: Have you looked into the libvirt logs directly or investigated syslog? | 08:58 |
stephenfin | kashyap can correct me if I'm wrong, but that error usually implies the connection between QEMU and libvirt has died | 08:58 |
* kashyap blinks and looks | 08:59 | |
kashyap | stephenfin: Yes, you're right | 08:59 |
kashyap | But the underlying problem could be anywhere ... and needs more details to debug | 09:00 |
kashyap | ignaziocassano: Are you migrating including your storage? (I.e. "block migration"?) | 09:00 |
ignaziocassano | kashyap the storage is shared on netapp nfs | 09:01 |
ignaziocassano | no errors on openvswitch agent | 09:01 |
*** rcernin has quit IRC | 09:01 | |
*** rcernin has joined #openstack-nova | 09:02 | |
ignaziocassano | kashyap: I presume the connection between QEMU and libvirt has died for some instances, but I do not know how I can verify it | 09:03 |
kashyap | ignaziocassano: Right; that's the reason. So one way to debug this is to obtain libvirt debug log filters that track interactions b/n QEMU and libvirt | 09:04 |
*** k_mouza has joined #openstack-nova | 09:04 | |
kashyap | ignaziocassano: You can get it this way: | 09:04 |
kashyap | On the relevant compute nodes (both source and dest): | 09:04 |
kashyap | (1) Set the log output file: $> virt-admin daemon-log-outputs "1:file:/var/log/libvirt/libvirtd.log" | 09:04 |
kashyap | (2) Configure the dynamic filters: | 09:04 |
kashyap | $> virt-admin daemon-log-filters "1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util" | 09:04 |
ignaziocassano | kashyap: on messages it reports: 19:22:29 podvc-kvm04 libvirtd: 2021-05-16 17:22:29.562+0000: 166295: error : qemuMonitorIO:718 : internal error: End of file from qemu monitor | 09:05 |
kashyap | And then re-run the migration of the affected instances | 09:05 |
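The two `virt-admin` steps above have to be repeated on both the source and the destination compute node, so it can help to generate them from one place. The following is a minimal sketch: it only builds and prints the command lines and does not talk to libvirt. The helper names `build_debug_commands` and `render` are hypothetical; the log path and filter string are copied from the messages above.

```python
import shlex

# Values copied from the steps above; nothing here contacts libvirt.
LOG_OUTPUT = "1:file:/var/log/libvirt/libvirtd.log"
LOG_FILTERS = (
    "1:libvirt 1:qemu 1:conf 1:security "
    "3:event 3:json 3:file 3:object 1:util"
)


def build_debug_commands(output=LOG_OUTPUT, filters=LOG_FILTERS):
    """Return the two virt-admin invocations as argv lists (hypothetical helper)."""
    return [
        ["virt-admin", "daemon-log-outputs", output],
        ["virt-admin", "daemon-log-filters", filters],
    ]


def render(commands):
    """Quote each argv list so it can be pasted into a shell."""
    return [" ".join(shlex.quote(arg) for arg in argv) for argv in commands]


if __name__ == "__main__":
    for line in render(build_debug_commands()):
        print(line)
```

Keeping the filter string as a single argument avoids the stray line-continuation problem seen in the pasted command.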
kashyap | ignaziocassano: Yeah; that's also "normal" / "expected"; doesn't tell much still | 09:05 |
kashyap | ignaziocassano: BTW, can you quickly try this: | 09:06 |
kashyap | $> journalctl -u libvirtd -l --since=yesterday -p err | 09:06 |
ignaziocassano | error : qemuDomainObjBeginJobInternal:6825 : Timed out during operation: cannot acquire state change lock (held by monitor=remo | 09:08 |
kashyap | Ugh, this one... | 09:08 |
ignaziocassano | unfortunately I cannot try to move further vm because they are in production and I must inform my customers | 09:09 |
*** ociuhandu has quit IRC | 09:10 | |
ignaziocassano | it is very strange because for some instances it works fine | 09:10 |
kashyap | ignaziocassano: So this is one of the most painful errors to debug from libvirt. I recently wrote a response on a Red Hat bugzilla | 09:10 |
*** ociuhandu has joined #openstack-nova | 09:10 | |
kashyap | Let me get that for you (beware: no easy solution here :-() | 09:10 |
ignaziocassano | kashyap, I read some red hat suggestions, but they include the instance restart :-( | 09:11 |
kashyap | ignaziocassano: Yes, you're not wrong, afraid. | 09:12 |
kashyap | ignaziocassano: These "cannot acquire state change lock" errors are notorious; and it could be due to QEMU getting hung, which in turn could be caused by stuck I/O | 09:12 |
*** ociuhandu has quit IRC | 09:13 | |
*** ociuhandu has joined #openstack-nova | 09:13 | |
ignaziocassano | I did not see any nfs stale on my compute node | 09:14 |
*** vishalmanchanda has quit IRC | 09:14 | |
kashyap | ignaziocassano: Can you post the affected guest log from /var/log/libvirt/qemu/instance-YYYY.log? | 09:15 |
kashyap | It _might_ sometimes have a clue | 09:16 |
openstackgerrit | liuzhuangzhuang proposed openstack/nova master: Fix RBD timeout https://review.opendev.org/c/openstack/nova/+/786588 | 09:16 |
kashyap | And also this (if the output is long, use a paste-bin): `journalctl -u libvirtd -r` | 09:16 |
*** mnasiadka has quit IRC | 09:17 | |
*** coreycb has quit IRC | 09:19 | |
ignaziocassano | kashyap, unfortunately since the affected vm remained paused on source and destination, I had to destroy them so the log files contain only: | 09:20 |
ignaziocassano | 2021-05-19 08:11:19.800+0000: initiating migration | 09:20 |
ignaziocassano | 2021-05-19 08:12:30.446+0000: shutting down, reason=destroyed | 09:20 |
kashyap | Hm; that won't help us much, afraid | 09:21 |
ignaziocassano | I could try on destination host if there is something else | 09:21 |
admin0 | hi guys .. i have a strange issue: openstack server show $uuid shows its host on hypervisor => h7 . on the hypervisor, hostname, hostname -f and virsh hostname returns h7 , but during migration of this instance ( ceph backed ) to say h9 or h10, it says h7 host not found .. .. how can i address/solve this ? | 09:22 |
ignaziocassano | yes | 09:22 |
ignaziocassano | I post it on openstack pastebin | 09:22 |
ignaziocassano | :q! | 09:22 |
admin0 | https://gist.github.com/a1git/c22ec0c17aaa9dcf95fd7485eb76af2f looks like my hostnames cycled from h7, h7. h7.openstack.local .. so now i am unable to migrate anything off of the hosts | 09:22 |
*** damien_r has joined #openstack-nova | 09:23 | |
ignaziocassano | kashyap: http://paste.openstack.org/show/805476/ | 09:24 |
kashyap | ignaziocassano: Ha, see this: | 09:25 |
kashyap | --- | 09:25 |
kashyap | 2021-05-19T08:12:30.397606Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:08.0/virtio-blk' | 09:25 |
kashyap | 2021-05-19T08:12:30.399542Z qemu-kvm: load of migration failed: Input/output error | 09:25 |
kashyap | 2021-05-19 08:12:31.022+0000: shutting down, reason=crashed | 09:25 |
kashyap | --- | 09:26 |
kashyap | I'm pretty sure I've seen that 'virtio-blk' error before | 09:26 |
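Spotting those lines by eye in a long guest log is tedious, so a small scanner can help. This is a rough sketch only: the patterns are derived from the three log lines quoted above, the function name `find_migration_failures` is hypothetical, and other QEMU failure messages are not covered.

```python
import re

# Patterns taken from the pasted log lines; this list is not exhaustive.
FAILURE_PATTERNS = [
    re.compile(
        r"error while loading state for instance \S+ of device '(?P<device>[^']+)'"
    ),
    re.compile(r"load of migration failed: (?P<reason>.+)"),
    re.compile(r"shutting down, reason=(?P<shutdown>crashed)"),
]


def find_migration_failures(lines):
    """Return one dict per log line that matches a known failure pattern."""
    hits = []
    for line in lines:
        for pattern in FAILURE_PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append(match.groupdict())
                break
    return hits


if __name__ == "__main__":
    sample = [
        "2021-05-19T08:12:30.397606Z qemu-kvm: error while loading state "
        "for instance 0x0 of device '0000:00:08.0/virtio-blk'",
        "2021-05-19T08:12:30.399542Z qemu-kvm: load of migration failed: "
        "Input/output error",
        "2021-05-19 08:12:31.022+0000: shutting down, reason=crashed",
    ]
    for hit in find_migration_failures(sample):
        print(hit)
```

A guest that was destroyed by hand logs `reason=destroyed` instead, which this sketch deliberately ignores.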
*** rcernin has quit IRC | 09:26 | |
ignaziocassano | kashyap: so do you think the instance was already not working properly before the live migration ? | 09:27 |
*** mkrai has joined #openstack-nova | 09:27 | |
kashyap | ignaziocassano: See this bug (comments are complex; browse it from the bottom): https://bugs.launchpad.net/nova/+bug/1761798 | 09:27 |
openstack | Launchpad bug 1761798 in OpenStack Compute (nova) "live migration intermittently fails in CI with "VQ 0 size 0x80 Guest index 0x12c inconsistent with Host index 0x134: delta 0xfff8"" [Medium,Confirmed] | 09:27 |
kashyap | ignaziocassano: No, not really; see this bit from the logs: | 09:28 |
kashyap | [quote] | 09:28 |
kashyap | we get this "guest index inconsistent" error when the migrated RAM is inconsistent with the migrated 'virtio' device state. And a common case is where a 'virtio' device does an operation after the vCPU is stopped and after RAM has been transmitted. | 09:28 |
kashyap | [/quote] | 09:28 |
kashyap | (From my comment#11) | 09:29 |
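The numbers in that bug title can be checked by hand. The sketch below assumes the ring indices are 16-bit counters, so the difference wraps modulo 2^16, and that the state is rejected when the difference exceeds the queue size. Both assumptions are consistent with the values in the title (size 0x80, guest index 0x12c, host index 0x134, delta 0xfff8), but this is an illustration and not QEMU's code; the function names are hypothetical.

```python
def vq_delta(guest_index, host_index):
    """Difference between guest and host ring indices, wrapped to 16 bits."""
    return (guest_index - host_index) & 0xFFFF


def is_inconsistent(queue_size, guest_index, host_index):
    """Assumed check: a delta larger than the queue size cannot be valid."""
    return vq_delta(guest_index, host_index) > queue_size


if __name__ == "__main__":
    # Values from the bug title: the host index is ahead of the guest index.
    print(hex(vq_delta(0x12C, 0x134)))
    print(is_inconsistent(0x80, 0x12C, 0x134))
```

With the guest index 8 entries behind the host index, the wrapped delta is far larger than the 0x80-entry queue, which fits the description of device state that moved on after RAM was transmitted.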
ignaziocassano | kashyap: my English is poor but I did not read a conclusion. | 09:30 |
ignaziocassano | kashyap: do you think post_copy and autoconverge can help ? | 09:31 |
kashyap | ignaziocassano: Your English is not poor; there's no conclusion yet, as the problem is complex. | 09:31 |
kashyap | ignaziocassano: They can help if your guest is doing I/O faster than your migration can keep up | 09:31 |
*** mkrai has quit IRC | 09:33 | |
*** whoami-rajat has joined #openstack-nova | 09:33 | |
ignaziocassano | So, the solution at this time is to wait for someone to fix the bug, right ? | 09:33 |
kashyap | ignaziocassano: Not really; as upstream QEMU claims these problems should not occur with newer QEMU releases | 09:34 |
kashyap | But if you can generate a reproducer that'll help. But usually reproducers in this case are difficult to come by. | 09:34 |
ignaziocassano | kashyap: let me know if I am wrong: the new qemu version comes with centos 8, right ? | 09:35 |
*** coreycb has joined #openstack-nova | 09:35 | |
kashyap | ignaziocassano: Yes. Or even a potentially updated QEMU on CentOS7 (if you're running older ones; I haven't checked) | 09:35 |
ignaziocassano | kashyap: I am running the latest available in the centos 7 repos. I cannot migrate to centos 8 before upgrading to train, but with train there is another live migration bug that causes packet loss during the migration | 09:37 |
*** vishalmanchanda has joined #openstack-nova | 09:38 | |
ignaziocassano | In stein I solved it with Sean Mooney's workaround to force legacy port binding. On train I have not found any workaround yet | 09:38 |
kashyap | I see; I don't know about that problem, afraid | 09:39 |
*** mnasiadka has joined #openstack-nova | 09:39 | |
gibi | sean-k-mooney: hi! If you have time then please check https://bugs.launchpad.net/nova/+bug/1928922 Is the observation correct? If yes then why neutron sends vif-plugged before nova even requested the plug? | 09:41 |
openstack | Launchpad bug 1928922 in OpenStack Compute (nova) "evacuation tests in nova-live-migration post hook fails with VirtualInterfaceCreateException due to vif-plugged event received before nova starts waiting for it." [Medium,New] - Assigned to Balazs Gibizer (balazs-gibizer) | 09:41 |
ignaziocassano | kashyap: thanks, I will keep an eye on the bug you mentioned | 09:42 |
*** ociuhandu has quit IRC | 09:45 | |
*** ociuhandu has joined #openstack-nova | 09:47 | |
*** macz_ has joined #openstack-nova | 09:49 | |
*** ignaziocassano has quit IRC | 09:52 | |
openstackgerrit | Stephen Finucane proposed openstack/nova stable/train: Reproduce bug 1897528 https://review.opendev.org/c/openstack/nova/+/792116 | 09:52 |
openstack | bug 1897528 in OpenStack Compute (nova) "32bit pci domain number is not supported" [High,In progress] https://launchpad.net/bugs/1897528 - Assigned to Balazs Gibizer (balazs-gibizer) | 09:52 |
openstackgerrit | Stephen Finucane proposed openstack/nova stable/train: Ignore PCI devices with 32bit domain https://review.opendev.org/c/openstack/nova/+/792117 | 09:52 |
*** ociuhandu has quit IRC | 09:53 | |
*** macz_ has quit IRC | 09:54 | |
*** k_mouza has quit IRC | 09:59 | |
*** k_mouza has joined #openstack-nova | 10:05 | |
*** mkrai has joined #openstack-nova | 10:12 | |
*** dtantsur|afk is now known as dtantsur | 10:16 | |
sean-k-mooney | gibi: that should be fixed now | 10:19 |
sean-k-mooney | gibi: i have not read it fully but there was a race with the neutron dhcp server | 10:20 |
sean-k-mooney | gibi: what release was this | 10:20 |
gibi | sean-k-mooney: this is from master | 10:27 |
gibi | fairly recent master | 10:27 |
sean-k-mooney | i believe to stop the race you need to set a neutron config option | 10:28 |
sean-k-mooney | gibi: i think https://review.opendev.org/c/openstack/nova/+/770745 should help with the issue | 10:30 |
sean-k-mooney | but you would still need to enable https://review.opendev.org/c/openstack/neutron/+/790702/1/neutron/conf/common.py | 10:30 |
sean-k-mooney | [nova]/live_migration_events=true | 10:31 |
sean-k-mooney | gibi: i have not fully looked at the bug yet so ill do that shortly and try and confirm if its the same thing | 10:31 |
sean-k-mooney | gibi: but im pretty sure its https://bugzilla.redhat.com/show_bug.cgi?id=1930432 | 10:31 |
openstack | bugzilla.redhat.com bug 1930432 in openstack-nova "Nova evacuate fails due to timeout waiting for a network-vif-plugged event for instance" [Medium,Verified] - Assigned to smooney | 10:31 |
gibi | does live_migration_events affect evacuation? | 10:31 |
*** tesseract has joined #openstack-nova | 10:32 | |
sean-k-mooney | kind of. the patch has 2 fixes: first, it only sends events from the l2 agent to nova, and second, it fixes the filtering in neutron to allow processing the port if it has any port binding for the current host. previously it only did that for the active port binding, which was wrong | 10:33 |
sean-k-mooney | gibi: are you able to reproduce this reliably | 10:34 |
*** ociuhandu has joined #openstack-nova | 10:34 | |
gibi | sean-k-mooney: nope, it is random and seemingly infrequent | 10:34 |
sean-k-mooney | ok i was going to suggest changing the default of that config option. since its meant to be removed in Y, it should default to true in Xena anyway | 10:35 |
sean-k-mooney | and then using a depends on patch to test | 10:35 |
*** tbachman has quit IRC | 10:35 | |
sean-k-mooney | but if its infrequent we might not see a difference | 10:35 |
*** tbachman has joined #openstack-nova | 10:35 | |
gibi | looking at the live migration fix, I think we could have a similar race during evacuation causing the event to arrive too early | 10:36 |
sean-k-mooney | gibi: one cause of this in the past was that during evac in the ci we were previously just stopping the nova compute agent, not the neutron l2 agent, so when we did the port update it would respond | 10:37 |
sean-k-mooney | e.g. the source agent would say yep its already wired | 10:37 |
gibi | sean-k-mooney: I confirmed that the q-agt is dead on the source host during this run | 10:38 |
sean-k-mooney | i think we fixed that in our job | 10:38 |
gibi | yes, it is fixed in the test | 10:38 |
sean-k-mooney | ok good | 10:38 |
sean-k-mooney | um, did we also ensure the job is not swapped to ovn | 10:38 |
sean-k-mooney | i believe we did | 10:38 |
gibi | checking... | 10:38 |
sean-k-mooney | it looks like ml2/ovs | 10:41 |
gibi | we only held back nova-next on ovs https://review.opendev.org/c/openstack/nova/+/776944 but nova-live-migration moved to OVN as far as I see | 10:41 |
sean-k-mooney | really? the build you linked to has the screen-q-agt.txt | 10:41 |
sean-k-mooney | i guess it has not run since the default change maybe? | 10:43 |
gibi | hm it run last week | 10:43 |
sean-k-mooney | this was from friday yes | 10:44 |
gibi | let me check a more recent run... | 10:44 |
gibi | hm a todays run also has q-agt.txt | 10:44 |
gibi | but I don't see in the zuul config where we set the Q_AGENT option to be openvswitch for this job | 10:45 |
sean-k-mooney | yep also looking at it | 10:45 |
sean-k-mooney | is this still zuul v2? | 10:45 |
sean-k-mooney | no? https://github.com/openstack/nova/blob/master/.zuul.yaml#L53-L87 | 10:46 |
sean-k-mooney | that looks like its v3 | 10:46 |
sean-k-mooney | ok well that is a different mystery to look at after | 10:46 |
sean-k-mooney | maybe it has something to do with neutron-trunk: true but that seems odd if it does | 10:48 |
gibi | yeah, that is the only odd thing in the job config | 10:48 |
sean-k-mooney | do you think the previous live_migration_events patch would fix evacuate? | 10:48 |
sean-k-mooney | we were planning to backport that to train as the race has existed basically since neutron has been a thing | 10:49 |
*** yangkai has quit IRC | 10:49 | |
gibi | I don't think it fixes the evacuate as we see the error on master _after_ the live migration event patch merged | 10:50 |
gibi | but the logic behind the failure can be the same | 10:50 |
sean-k-mooney | gibi: well its disabled by default | 10:50 |
sean-k-mooney | because it needs the nova patch to be present to enable it | 10:51 |
gibi | also it only changes _get_neutron_events_for_live_migration() which is fairly live migration specific | 10:51 |
gibi | ohh you talk about the neutron fixes | 10:51 |
sean-k-mooney | yes | 10:52 |
gibi | we can try that | 10:52 |
gibi | as you said we want that to default to true anyhow | 10:52 |
sean-k-mooney | the nova patch was just so that the neutron fix would not break live migration | 10:52 |
sean-k-mooney | and because we needed the nova fix the neutron one defaulted to false for last cycle | 10:52 |
sean-k-mooney | but now it can go to true since the nova patch is present and backported | 10:53 |
gibi | lets switch it to true on master | 10:54 |
sean-k-mooney | would you like me to submit a patch to change the default and note the evacuation bug you filed as a related bug? or will i leave that to you | 10:54 |
gibi | you have a bit more context on that neutron flag so if you have time please propose the switch | 10:54 |
sean-k-mooney | ok ill go do that shortly i have a docs meeting at the top of the hour so ill likely do it after that | 10:55 |
*** ociuhandu has quit IRC | 10:58 | |
*** ociuhandu has joined #openstack-nova | 10:58 | |
gibi | sean-k-mooney: thanks | 10:59 |
gibi | sean-k-mooney: it seems the devstack default OVN change has been reverted https://review.opendev.org/c/openstack/devstack/+/791104 | 11:00 |
sean-k-mooney | oh ok that would explain why we are seeing ml2/ovs | 11:01 |
sean-k-mooney | things exploded? | 11:01 |
gibi | based on the revert commit message yes there are extra things to fix | 11:01 |
sean-k-mooney | im really happy we did not make the default change during feature freeze then :) | 11:01 |
gibi | yes :) | 11:02 |
gibi | good judgement | 11:02 |
*** mkrai has quit IRC | 11:04 | |
*** ociuhandu has quit IRC | 11:09 | |
*** links has joined #openstack-nova | 11:21 | |
*** sapd1_x has quit IRC | 11:21 | |
*** f0o has joined #openstack-nova | 11:22 | |
* bauzas goes afk for the afternoon if you look at me (getting a new electric car ;) | 11:23 | |
gibi | bauzas: congrats! | 11:24 |
gibi | is it a tesla? | 11:24 |
bauzas | gibi: nope, for my wife, a Peugeot 208 | 11:24 |
bauzas | I already have a PHEV (Skoda Superb iV) for me :) | 11:25 |
gibi | sounds cool :) | 11:26 |
*** lpetrut has joined #openstack-nova | 11:29 | |
lyarwood | elod: https://review.opendev.org/c/openstack/nova/+/788720 - thoughts on landing this series? | 11:39 |
*** ociuhandu has joined #openstack-nova | 11:39 | |
elod | lyarwood: sorry, yes, I started to look it yesterday, but want to see the "whole picture" first o:) | 11:42 |
elod | lyarwood: it is (again) a bit too big for my taste for stable, but it's more or less clean if I'm not mistaken.... | 11:43 |
lyarwood | elod: yeah agreed it's pretty large but while it's clean I thought it would be useful | 11:46 |
lyarwood | elod: it's something we wanted downstream for Wallaby either way | 11:46 |
lyarwood | elod: so if we can squeeze it in early this cycle it would be great :) | 11:47 |
*** ociuhandu has quit IRC | 11:48 | |
elod | lyarwood: understood :) I'll try to go through the patches today | 11:48 |
*** macz_ has joined #openstack-nova | 11:50 | |
lyarwood | many thanks | 11:50 |
gibi | elod: as far as I remember it is a clean cherry-pick all the way. I made a mistake when I first tried to cherry-pick it, as I used some old unmerged version of a patch | 11:52 |
*** macz_ has quit IRC | 11:54 | |
sean-k-mooney | bauzas: i have been debating about getting an electric car for a while but 1 i like my mini and 2 i drive maybe 5000 km per year so its hard to justify spending more than a few grand on a car | 12:04 |
elod | gibi: that part is OK then :) just have to think whether the series is valid for backport or not o:) (by looking at the code and patches yesterday it was a bit "featurish" for me, but probably there's no risk to backport... but haven't looked all of the patches yet) | 12:08 |
*** ociuhandu has joined #openstack-nova | 12:16 | |
*** ociuhandu has quit IRC | 12:21 | |
*** ratailor has quit IRC | 12:22 | |
*** ociuhandu has joined #openstack-nova | 12:26 | |
gibi | elod: it does not change any external interfaces except the two new config options, but those can be removed, if you wish, from the backport. There is also no externally visible behavior change except the fix of the failure | 12:32 |
gibi | I guess you feel it is a feature because we started using a different mechanism to talk to libvirt regarding the attachment: we went from polling to waiting for events. And waiting for events needed extra preparation in the code | 12:33 |
sean-k-mooney | i dont think its a feature, its just a refactoring | 12:35 |
gibi | sean-k-mooney: yes, it needed a sizeable refactoring to properly fix that bug | 12:35 |
sean-k-mooney | the behavior before and after, modulo bugs, is identical from a blackbox perspective | 12:35 |
gibi | yepp | 12:35 |
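The difference between the two mechanisms mentioned above, polling versus waiting for events, can be illustrated generically. This is a minimal sketch using Python's `threading` module and is not nova's or libvirt's code; `wait_by_polling` and `wait_for_event` are hypothetical names. The "extra preparation" in the event case is that the waiter has to exist before the operation that triggers the event is started.

```python
import threading
import time


def wait_by_polling(is_done, timeout=1.0, interval=0.01):
    """Repeatedly ask whether the operation finished until the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_done():
            return True
        time.sleep(interval)
    return is_done()


def wait_for_event(event, timeout=1.0):
    """Block until someone signals the event, or until the timeout expires."""
    return event.wait(timeout)


if __name__ == "__main__":
    # Create the event before starting the operation that will signal it.
    done = threading.Event()
    threading.Timer(0.05, done.set).start()
    print(wait_for_event(done))
    print(wait_by_polling(done.is_set))
```

From the outside both calls give the same answer, which matches the point that the behavior is identical from a blackbox perspective.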
*** Luzi has joined #openstack-nova | 12:40 | |
elod | yes... the refactor... o:) thanks for the details, it is useful for me to hear other opinions as well! | 12:43 |
lyarwood | https://review.opendev.org/c/openstack/nova/+/790660 - core reviews on this bugfix would be appreciated if anyone has time this week | 12:44 |
lyarwood | and the series below it sorry | 12:44 |
gibi | lyarwood: added to my queue | 12:44 |
lyarwood | thanks | 12:44 |
*** _erlon_ has joined #openstack-nova | 12:56 | |
*** ricolin has joined #openstack-nova | 13:08 | |
*** psachin has quit IRC | 13:10 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Add same_subtree field to RequestLevelParams https://review.opendev.org/c/openstack/nova/+/791503 | 13:18 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Bump min placement microversion to 1.36 https://review.opendev.org/c/openstack/nova/+/791504 | 13:18 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Support same_subtree in allocation_canadidate query https://review.opendev.org/c/openstack/nova/+/791505 | 13:18 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Support the new port resource_request format https://review.opendev.org/c/openstack/nova/+/787208 | 13:18 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Transfer RequestLevelParams from ports to scheduling https://review.opendev.org/c/openstack/nova/+/791506 | 13:18 |
*** dasp has quit IRC | 13:24 | |
*** dasp has joined #openstack-nova | 13:25 | |
sean-k-mooney | ... https://twitter.com/freenodestaff https://fuchsnet.ch/freenode-resign-letter.txt | 13:28 |
*** ricolin has quit IRC | 13:29 | |
*** macz_ has joined #openstack-nova | 13:30 | |
*** ociuhandu has quit IRC | 13:31 | |
*** ociuhandu has joined #openstack-nova | 13:32 | |
sean-k-mooney | os maybe moving to irc.libera.chat i guess although we will have to see how the topic plays out in the mailing list | 13:33 |
*** macz_ has quit IRC | 13:35 | |
kashyap | Yeah; this whole episode seems bizarre | 13:36 |
*** ociuhandu has quit IRC | 13:36 | |
*** Luzi has quit IRC | 13:39 | |
sean-k-mooney | its not the first time that this type of thing has happened. its similar to the whole mysql vs mariadb split after the sale to oracle. when legal entities associated with open source communities are sold and the acquiring company decides to put their stamp on the community, it can often result in a split | 13:41 |
sean-k-mooney | kashyap: in this case it sounds like the new owner of freenode LTD wanted to monetise it in some way, and as a result the people that actually ran it but were not employees rightly dont want to have their work or the community monetised | 13:42 |
sean-k-mooney | reading between the lines there are also privacy concerns and perhaps a lawful intercept or us law element to this too | 13:44 |
kashyap | I see. I haven't read their story in full, but it's annoying | 13:49 |
*** tosky has quit IRC | 13:56 | |
*** tosky has joined #openstack-nova | 14:00 | |
*** sapd1_x has joined #openstack-nova | 14:00 | |
*** raildo has joined #openstack-nova | 14:10 | |
*** ociuhandu has joined #openstack-nova | 14:19 | |
*** alexe9191 has joined #openstack-nova | 14:23 | |
*** ociuhandu has quit IRC | 14:24 | |
alexe9191 | Good day everyone:) . A while ago i had an issue with the nova scheduler in rocky and I was advised to use the placement api instead of filters. However, we are depending on the instance extra specs aggregate filter to match certain hosts properties. Like for instance, SSD. | 14:24 |
alexe9191 | I am wondering if such an option is available for placement? and if so, how to configure nova to use it? it's not clear in the documentation. | 14:24 |
sean-k-mooney | alexe9191: there is a replacement yes | 14:31 |
sean-k-mooney | https://docs.openstack.org/nova/latest/reference/isolate-aggregates.html | 14:31 |
sean-k-mooney | alexe9191: its not a direct replacement but it achieves the same goal using traits in a slightly better way | 14:31 |
sean-k-mooney | alexe9191: i say its not a direct replacement because you need to update your flavors | 14:32 |
alexe9191 | I am assuming I also need to update the hosts one by one to add that trait to each resource class? | 14:33 |
alexe9191 | and also update the flavor to use trait instead of for instance hw:cpu_policy='dedicated' ? | 14:34 |
alexe9191 | or quota:disk_write? something like that | 14:34 |
sean-k-mooney | no you would have to create a new flavor with the trait e.g. trait:CUSTOM_SSD=required as an extra spec | 14:36 |
sean-k-mooney | then resize the instance to the new flavor so it uses that trait | 14:36 |
*** belmoreira has joined #openstack-nova | 14:37 | |
alexe9191 | but that trait need to be saved to the hypervisor provider, correct? or would it be inherited from the aggregate the host is sitting in? | 14:38 |
alexe9191 | Creating new flavors is not the issue. I am wondering if I have to update 800x hypervisors with the new CUSTOM_SSD trait and all other custom traits that i need? | 14:39 |
*** ociuhandu has joined #openstack-nova | 14:39 | |
alexe9191 | ah I see. I think you can add the trait to the aggregate as well? | 14:40 |
sean-k-mooney | you add it to the aggregate yes | 14:40 |
sean-k-mooney | you do also have to add it to the compute node RP | 14:40 |
sean-k-mooney | alexe9191: traits live on the resource provider not the inventories | 14:41 |
sean-k-mooney | so you would put CUSTOM_SSD on all the compute nodes with an SSD | 14:41 |
sean-k-mooney | alexe9191: you can continue to use the old filter by the way | 14:42 |
alexe9191 | ah the --property in the flavor you mean? that would be translated to a trait? | 14:42 |
sean-k-mooney | alexe9191: no | 14:42 |
sean-k-mooney | so there are two ways to do it sorry | 14:43 |
sean-k-mooney | you can tag your flavor with a required trait and tag each host rp with the custom trait | 14:43 |
sean-k-mooney | or you can use aggregates https://docs.openstack.org/nova/latest/reference/isolate-aggregates.html | 14:43 |
*** ociuhandu has quit IRC | 14:44 | |
alexe9191 | I'd actually rather use the aggregates because that means i don't have to update each host and also update each new host. The documentation is however very confusing. | 14:44 |
sean-k-mooney | you still need to update each host | 14:44 |
*** ociuhandu has joined #openstack-nova | 14:44 | |
sean-k-mooney | so there are 3 parts | 14:44 |
sean-k-mooney | first you update the hosts' resource providers to have the custom traits | 14:45 |
sean-k-mooney | second you update the image or flavor to request the required trait | 14:45 |
sean-k-mooney | third you enable the placement prefilter and update the aggregate to enforce that only instances that request a given trait can land on that aggregate | 14:45 |
sean-k-mooney | alexe9191: when you add a trait to a host you can land on that host whether you request it or not | 14:46 |
sean-k-mooney | when you add a trait to the image or flavor as required you guarantee that you will land on a host with that trait | 14:46 |
alexe9191 | I thought the placement prefilter was enabled by default? do I need to explicitly enable it? | 14:46 |
sean-k-mooney | when you add the required trait to the aggregate you will guarantee you will only land in that aggregate if you asked for the required trait | 14:47 |
sean-k-mooney | alexe9191: its disabled by default so you have to enable it yes | 14:47 |
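
The three parts described above can be sketched as a small toy model. This is not nova code: the `Host` class, the `can_land` function and the `isolation_enabled` flag are hypothetical names used only to illustrate the rules as stated in the conversation (traits on the host, required traits on the flavor or image, and required traits on the aggregate once the isolation prefilter is enabled).

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for a compute node resource provider."""
    name: str
    # Traits set on the compute node resource provider.
    traits: set = field(default_factory=set)
    # Traits marked as required on the aggregate(s) the host belongs to.
    aggregate_required: set = field(default_factory=set)


def can_land(host, requested, isolation_enabled=True):
    """Return True if a request with `requested` required traits may use `host`."""
    # Rule 1: every trait required by the flavor/image must be on the host.
    if not requested <= host.traits:
        return False
    # Rule 2 (only with the isolation prefilter enabled): the request must ask
    # for every trait the host's aggregate requires, otherwise it is kept out.
    if isolation_enabled and not host.aggregate_required <= requested:
        return False
    return True


ssd = Host("ssd-node", traits={"CUSTOM_SSD"}, aggregate_required={"CUSTOM_SSD"})
plain = Host("plain-node")

for host in (ssd, plain):
    for wanted in (set(), {"CUSTOM_SSD"}):
        print(host.name, sorted(wanted), can_land(host, wanted))
```

With the flag off, a request that asks for nothing can still land on the SSD host, which is the "whether you request it or not" case; with it on, only requests that ask for CUSTOM_SSD reach the isolated aggregate.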
alexe9191 | Apologies for the question but it's not mentioned anywhere in nova.conf nor do I see a placement filter in the scheduler/filter source tree. | 14:49 |
alexe9191 | How would I enable it? | 14:49 |
sean-k-mooney | its covered in the doc | 14:49 |
sean-k-mooney | https://docs.openstack.org/nova/latest/reference/isolate-aggregates.html | 14:49 |
sean-k-mooney | you enable https://docs.openstack.org/nova/latest/configuration/config.html#scheduler.enable_isolated_aggregate_filtering | 14:49 |
alexe9191 | aha. I thought about placement as a whole not that specific pre-filter. | 14:50 |
sean-k-mooney | alexe9191: placement filters are what we call prefilters they are not implemented in scheduler/filters since they work differently | 14:50 |
sean-k-mooney | pre-filters modify the query we pass to placement before the normal filters run to be more strict | 14:51 |
sean-k-mooney | alexe9191: they are implemented in https://github.com/openstack/nova/blob/master/nova/scheduler/request_filter.py | 14:51 |
alexe9191 | Thank you:) | 14:52 |
alexe9191 | That being said. I am wondering if there are any known bugs in scheduler in nova rocky? | 14:52 |
sean-k-mooney | most of them are disabled by default although some are always on and we will be moving more to on by default as we continue | 14:52 |
sean-k-mooney | in the scheduler in general probably | 14:52 |
sean-k-mooney | in this code not that im directly aware of that would affect your usecase | 14:53 |
alexe9191 | Allow me to be more specific. | 14:53 |
alexe9191 | We have availability zones and we have aggregates to group some hosts that have certain properties like I mentioned. | 14:53 |
alexe9191 | For some reason, the scheduling works just fine till it stops. And then the solution to that would be removing one host from the aggregates and re-adding it. And then scheduling would work again. | 14:54 |
sean-k-mooney | that is odd | 14:54 |
alexe9191 | It seems to fail mostly on the availability zone filter or the extra spec filter. The other day it said that property defined in the flavor does not match the metadata on the aggregate. But it does | 14:54 |
*** jawad_axd has quit IRC | 14:55 | |
sean-k-mooney | well you can disable that if you enable https://github.com/openstack/nova/blob/stable/rocky/nova/scheduler/request_filter.py#L63 | 14:55 |
sean-k-mooney | alexe9191: you do not have the isolated aggregate feature in rocky that we were discussing | 14:55 |
sean-k-mooney | but you have access to using placement for AZs | 14:55 |
sean-k-mooney | instead of the az filter | 14:55 |
alexe9191 | Indeed. This is why I am investigating moving what I can move to the placement. Also for performance reasons. | 14:56 |
sean-k-mooney | alexe9191: https://docs.openstack.org/nova/latest/admin/availability-zones.html#availability-zones-with-placement | 14:56 |
sean-k-mooney | alexe9191: that is still not enabled by default upstream but that is because we meant to do it in victoria and we got distracted | 14:57 |
sean-k-mooney | alexe9191: ill be enabling it by default this cycle and deprecating the AZ filter for removal in the Y release | 14:57 |
alexe9191 | may I ask what issues have been faced when it was enabled by default? | 14:57 |
sean-k-mooney | none that im aware of | 14:58 |
sean-k-mooney | we have not enabled it by default yet | 14:58 |
alexe9191 | ah I misread. Apologies. | 14:58 |
sean-k-mooney | i submitted https://review.opendev.org/c/openstack/nova/+/745605 | 14:59 |
alexe9191 | enabling this pre-filter will not have an effect on the extraspecs aggregate I assume? As I intend to slowly migrate the users to new flavors. | 14:59 |
sean-k-mooney | to do it last year but i got distracted | 14:59 |
sean-k-mooney | alexe9191: correct it will not | 14:59 |
sean-k-mooney | what it will do is map all AZs to placement aggregates and then include the aggregate uuid in the query | 15:00 |
sean-k-mooney | so if you ask for AZ X it will tell placement to only look at hosts that are in AZ X by mapping those hosts to a placement aggregate and stating the candidate hosts must be a member_of the AZ | 15:01 |
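
As a rough illustration of the "narrow the query" idea described above, the sketch below builds an allocation candidates query string with a `member_of` restriction. The `resources`, `required` and `member_of` query parameters are part of the placement API; the helper name `allocation_candidates_query`, the AZ-to-aggregate lookup table and the UUID in it are hypothetical, and nova's real prefilter code works on its own request objects rather than on strings.

```python
from urllib.parse import urlencode

# Hypothetical mapping from AZ name to the placement aggregate UUID mirroring it.
AZ_TO_AGGREGATE = {"az-x": "11111111-2222-3333-4444-555555555555"}


def allocation_candidates_query(resources, required_traits=(), az=None):
    """Build a placement-style query string, optionally limited to one AZ."""
    params = [
        ("resources", ",".join(f"{rc}:{amount}" for rc, amount in sorted(resources.items()))),
    ]
    if required_traits:
        params.append(("required", ",".join(sorted(required_traits))))
    if az is not None:
        # Only hosts that are members of the aggregate backing the AZ are considered.
        params.append(("member_of", "in:" + AZ_TO_AGGREGATE[az]))
    # Keep ':' and ',' unescaped so the result stays readable.
    return "/allocation_candidates?" + urlencode(params, safe=":,")


print(allocation_candidates_query({"VCPU": 1, "MEMORY_MB": 512}, ["CUSTOM_SSD"], az="az-x"))
```

Because the restriction travels with the query, hosts outside the AZ never reach the Python filters at all, so the filters have fewer hosts to check.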
*** lpetrut has quit IRC | 15:01 | |
*** nightmare_unreal has quit IRC | 15:01 | |
alexe9191 | That is the desired behaviour. | 15:01 |
sean-k-mooney | we have been very careful to ensure that you can opt into the new placement feature gradually | 15:02 |
sean-k-mooney | placement should never break any of the existing filters | 15:02 |
*** rpittau is now known as rpittau|bbl | 15:03 | |
sean-k-mooney | but it should narrow down the hosts passed to it so the filters have to check fewer hosts by moving more of the filtering to placement and sql instead of python | 15:03 |
*** dklyle has joined #openstack-nova | 15:03 | |
alexe9191 | Excellent. | 15:04 |
alexe9191 | The mentioned bug about the scheduling, is it fixed in releases after rocky? | 15:05 |
*** jawad_axd has joined #openstack-nova | 15:05 | |
alexe9191 | Does not seem to be: https://bugs.launchpad.net/nova/+bug/1677217 | 15:05 |
openstack | Launchpad bug 1677217 in OpenStack Compute (nova) " AggregateImagePropertiesIsolation filter return unwanted compute nodes" [Low,Triaged] | 15:05 |
*** macz_ has joined #openstack-nova | 15:06 | |
sean-k-mooney | alexe9191: no its not really a bug | 15:06 |
sean-k-mooney | alexe9191: its a limitation of the design of the filters | 15:06 |
alexe9191 | I see that there is a "workaround" so to speak... | 15:07 |
sean-k-mooney | they cannot easily work the way they wanted it to which is why we built the placement feature to prevent it | 15:07 |
sean-k-mooney | yes you can set the key to something in all images | 15:07 |
sean-k-mooney | then it will see the key and enforce the rules | 15:07 |
sean-k-mooney | but that is basically the best we can do in the existing filters | 15:08 |
*** mlavalle has joined #openstack-nova | 15:08 | |
sean-k-mooney | alexe9191: there is an out of tree filter that kind of fixes it https://opendev.org/x/nfv-filters/src/branch/master | 15:08 |
sean-k-mooney | but that is no longer maintained | 15:09 |
alexe9191 | Excellent. I am gonna try that on a subset of hosts and see what the results are. I wish there was a way to define the trait in the nova.conf file of the compute host though instead of adding it to each host one by one. | 15:09 |
alexe9191 | thanks for the tip:) | 15:09 |
sean-k-mooney | alexe9191: there is but not in rocky | 15:09 |
alexe9191 | aw? | 15:09 |
sean-k-mooney | in ussuri or wallaby we added a provider.yaml file that allows you to add traits and resource providers via a file | 15:10 |
sean-k-mooney | alexe9191: https://docs.openstack.org/nova/latest/admin/managing-resource-providers.html | 15:10 |
sean-k-mooney | https://specs.openstack.org/openstack/nova-specs/specs/ussuri/approved/provider-config-file.html | 15:10 |
sean-k-mooney | so ya it was completed in victoria https://specs.openstack.org/openstack/nova-specs/specs/victoria/implemented/provider-config-file.html | 15:11 |
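
For a feel of what such a file carries, the sketch below builds the data for a provider config that adds one custom trait to the local compute node. The key layout (`meta`, `providers`, `identification`, `traits.additional`, and the `$COMPUTE_NODE` placeholder) is an assumption based on the documentation linked above and should be checked against it for your release; the function name `provider_config` is hypothetical, and the real file is written in YAML, with JSON used here only to print the structure.

```python
import json


def provider_config(traits, identification_uuid="$COMPUTE_NODE"):
    """Return the assumed provider config structure as a plain dict."""
    for trait in traits:
        # Custom traits are expected to carry the CUSTOM_ prefix.
        if not trait.startswith("CUSTOM_"):
            raise ValueError(f"not a custom trait: {trait}")
    return {
        "meta": {"schema_version": "1.0"},
        "providers": [
            {
                # Assumed placeholder meaning "the compute node this file is on".
                "identification": {"uuid": identification_uuid},
                "traits": {"additional": list(traits)},
            }
        ],
    }


print(json.dumps(provider_config(["CUSTOM_SSD"]), indent=2))
```

The same file could then be shipped to every SSD host by configuration management, instead of tagging each resource provider by hand.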
alexe9191 | This is good news! | 15:12 |
kashyap | gibi: How did you unpack Sam Su's email? For me in Mutt it only shows up as " Error: unable to create OpenSSL subprocess!" | 15:12 |
kashyap | gibi: Thanks for quoting it in full! | 15:12 |
gibi | kashyap: I guess that it was sent from outlook so I forwarded it to my work email, and outlook was able to extract the content from the smime | 15:14 |
kashyap | Ah, I see | 15:14 |
kashyap | Thx :) | 15:14 |
gibi | artom: hi! are you OK if we make the decision about the meeting schedule change tomorrow on the meeting? | 15:16 |
*** Alon_KS has quit IRC | 15:17 | |
artom | gibi, totally, it was my plan as well, and is in fact on the agenda :) | 15:17 |
gibi | artom: awesome :) | 15:18 |
*** _erlon_ has quit IRC | 15:18 | |
*** Alon_KS has joined #openstack-nova | 15:22 | |
*** dave-mccowan has joined #openstack-nova | 15:41 | |
*** jawad_axd has quit IRC | 15:41 | |
*** ociuhandu has quit IRC | 15:45 | |
*** lucasagomes has quit IRC | 15:59 | |
*** k_mouza has quit IRC | 16:14 | |
*** hamalq has joined #openstack-nova | 16:25 | |
*** cgoncalves has quit IRC | 16:32 | |
*** cgoncalves has joined #openstack-nova | 16:33 | |
*** tesseract has quit IRC | 16:43 | |
*** rpittau|bbl is now known as rpittau | 16:45 | |
*** cgoncalves has quit IRC | 16:52 | |
*** cgoncalves has joined #openstack-nova | 16:53 | |
*** derekh has quit IRC | 17:02 | |
*** ricolin_ has joined #openstack-nova | 17:03 | |
*** ralonsoh has quit IRC | 17:11 | |
*** links has quit IRC | 17:11 | |
*** gyee has joined #openstack-nova | 17:12 | |
*** dtantsur is now known as dtantsur|afk | 17:25 | |
*** rpittau is now known as rpittau|afk | 17:37 | |
*** ociuhandu has joined #openstack-nova | 17:46 | |
*** ociuhandu has quit IRC | 17:52 | |
*** k_mouza has joined #openstack-nova | 18:00 | |
* stephenfin finishes for the evening o/ | 18:01 | |
*** k_mouza has quit IRC | 18:05 | |
*** andrewbonney has quit IRC | 18:05 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/wallaby: Test SRIOV port move operations with PCI conflicts https://review.opendev.org/c/openstack/nova/+/790710 | 18:17 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/wallaby: Update SRIOV port pci_slot when unshelving https://review.opendev.org/c/openstack/nova/+/790711 | 18:17 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/wallaby: Neutron fixture: don't clobber profile and vif_details if empty https://review.opendev.org/c/openstack/nova/+/792233 | 18:17 |
artom | me: Damn, I'm getting KeyError on ['pci_slot'] in my func tests, clearly there's a patch that fixed that in the fixtures and I need to find it and backport it | 18:18 |
*** bbowen has quit IRC | 18:18 | |
artom | also me: wrote the patch himself in master, forgot it even existed | 18:18 |
*** vishalmanchanda has quit IRC | 18:22 | |
*** fyx has joined #openstack-nova | 18:30 | |
sean-k-mooney | gibi: by the way now that we have provider.yaml we could have bandwidth or pps inventories created using it too right instead of having to modify nova | 18:34 |
sean-k-mooney | *neutron | 18:34 |
sean-k-mooney | im not necessarily saying we want to do that but i was just thinking about the ovn case it might be nice to enable guaranteed minimum bandwidth with ovn by having nova via provider.yaml report inventories of bandwidth | 18:36 |
*** belmoreira has quit IRC | 18:36 | |
sean-k-mooney | as i replied on your spec though the way i would expect the inventories to be configured would be using pseudo agent binding by adding the config info to the external-ids column in the chassis table of the ovn-southdb for the given host | 18:37 |
sean-k-mooney | and have the ml2 driver report the inventories as normal | 18:37 |
sean-k-mooney | just pointing out that since provider.yaml exists that might be another option | 18:37 |
*** sapd1_x has quit IRC | 18:56 | |
*** whoami-rajat has quit IRC | 19:19 | |
*** bbowen has joined #openstack-nova | 19:26 | |
*** dklyle has quit IRC | 19:34 | |
*** dklyle has joined #openstack-nova | 19:52 | |
*** k_mouza has joined #openstack-nova | 20:15 | |
*** MrClayPole_ has joined #openstack-nova | 20:15 |