opendevreview | Merged openstack/nova stable/wallaby: Invalidate provider tree when compute node disappears https://review.opendev.org/c/openstack/nova/+/811807 | 00:45 |
---|---|---|
*** EugenMayer7 is now known as EugenMayer | 02:30 | |
*** dasm|ruck is now known as dasm|ruck|off | 05:13 | |
*** prometheanfire is now known as Guest0 | 06:23 | |
opendevreview | anguoming proposed openstack/nova master: Add catching InstanceNotFound exception when call live_migration_abort https://review.opendev.org/c/openstack/nova/+/840429 | 07:59 |
opendevreview | anguoming proposed openstack/nova master: Add catching InstanceNotFound exception when call live_migration_abort https://review.opendev.org/c/openstack/nova/+/840429 | 08:09 |
opendevreview | anguoming proposed openstack/nova master: Add catching InstanceNotFound exception when call live_migration_abort https://review.opendev.org/c/openstack/nova/+/840429 | 08:10 |
opendevreview | anguoming proposed openstack/nova master: Add catching InstanceNotFound exception when call live_migration_abort https://review.opendev.org/c/openstack/nova/+/840429 | 08:13 |
opendevreview | anguoming proposed openstack/nova master: Add catching InstanceNotFound exception when call live_migration_abort https://review.opendev.org/c/openstack/nova/+/840429 | 08:23 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Allow claiming PCI PF if child VF is unavailable https://review.opendev.org/c/openstack/nova/+/838555 | 08:57 |
gibi | melwitt, sean-k-mooney: ^^ thanks for noting the missing test coverage, I extended the patch now | 08:57 |
EugenMayer | Hello. Today an instance just shut down. I checked the audit log and see that it was stopped by 'nobody', meaning the user UUID is '-'. What could that mean? | 09:55 |
bauzas | gmann: as you prefer, I just want to make sure that both the LP name, the spec name and the link to the LP BP in the spec are the same :) | 10:24 |
bauzas | gmann: if you modify the LP name tho, the link in the yoga spec won't work, but that's fine | 10:24 |
opendevreview | Merged openstack/nova-specs master: Amend unified limits spec to explain "API limit" enforcement https://review.opendev.org/c/openstack/nova-specs/+/829413 | 10:56 |
opendevreview | anguoming proposed openstack/nova master: Add catching InstanceNotFound exception when call live_migration_abort https://review.opendev.org/c/openstack/nova/+/840429 | 11:11 |
gibi | fyi reported a gate failure https://bugs.launchpad.net/neutron/+bug/1971563 it is not super frequent (two hits in 14 days) but I needed a bug number to recheck ;) | 11:17 |
sean-k-mooney | ok so that's in ml2/ovs with iptables | 11:24 |
sean-k-mooney | both the dhcp agent and the l2 agent have to set provisioning completed for the neutron server to send the event so maybe only one of those completed in time | 11:26 |
gibi | yeah I tagged neutron in the bug as I only see that the port state goes from ACTIVE -> ACTIVE on neutron side after nova started waiting for the vif-plug but I don't see the notification sending in the neutron server logs | 11:27 |
sean-k-mooney | Provisioning for port b6dc2b79-ed38-4907-86e2-bdff1c5a9b9f completed by entity L2. | 11:27 |
sean-k-mooney | so ya l2 agent completed wiring it up but not dhcp agent | 11:27 |
sean-k-mooney | May 03 16:47:04.331996 ubuntu-focal-ovh-bhs1-0029531414 neutron-dhcp-agent[90791]: INFO neutron.agent.dhcp.agent [None req-89d4fdba-e0f4-4778-b944-ed87f5102ff1 None None] DHCP configuration for ports {'b6dc2b79-ed38-4907-86e2-bdff1c5a9b9f'} is completed | 11:29 |
sean-k-mooney | https://zuul.opendev.org/t/openstack/build/518f8641b9a7419391b0f99f795f26bd/log/controller/logs/screen-q-dhcp.txt#4145 | 11:29 |
sean-k-mooney | so ya the dhcp agent thinks it completed but we did not see the event on the neutron server marking provisioning complete by the dhcp agent so this looks like a bug in the dhcp agent | 11:31 |
sean-k-mooney | normally it completes first so perhaps there is a bug where it won't send the event if the port is already active or something like that | 11:32 |
gibi | sean-k-mooney: thanks for looking into this, could you please add these details to the bug | 11:39 |
sean-k-mooney | sure just responding to a review comment and i'll add them | 11:39 |
gibi | thank you | 11:40 |
sean-k-mooney | gibi: actually the dhcp agent has marked it as complete | 11:55 |
sean-k-mooney | in the neutron server log | 11:55 |
sean-k-mooney | gibi: neutron did send the event | 12:02 |
sean-k-mooney | INFO neutron.notifiers.nova [-] Nova event matching ['req-17e4fee0-ad06-4350-af09-1db0d331d6b5'] response: {'server_uuid': '3a81145d-d263-4e1d-8ec3-faf38fed34f2', 'name': 'network-vif-plugged', 'status': 'completed', 'tag': 'b6dc2b79-ed38-4907-86e2-bdff1c5a9b9f', 'code': 200} | 12:02 |
gibi | sean-k-mooney: timing doesn't add up | 12:05 |
gibi | nova plugged the vif at May 03 16:48:41.237538 | 12:05 |
sean-k-mooney | we receive it at 16:47:22 | 12:05 |
sean-k-mooney | ya so this is probably because of the dhcp agent race | 12:06 |
gibi | neutron should not send the plugged event _before_ nova plugs the vif | 12:06 |
sean-k-mooney | i bet we dont have the config option set in neutron | 12:06 |
sean-k-mooney | gibi: there was a race in neutron where it would not wait for both the dhcp and l2 agent to finish | 12:06 |
sean-k-mooney | it was fixed by https://review.opendev.org/c/openstack/neutron/+/766277 | 12:07 |
sean-k-mooney | although hum | 12:07 |
sean-k-mooney | that was for live migration | 12:08 |
gibi | this is not a live migration | 12:08 |
gibi | this is evacuate | 12:08 |
sean-k-mooney | the same could happen there | 12:08 |
gibi | and I still don't get it. Can neutron send a vif-plugged event _before_ nova even plugs the vif via os-vif? | 12:08 |
sean-k-mooney | let me check if that is enabled or not | 12:08 |
sean-k-mooney | gibi: its simple | 12:09 |
sean-k-mooney | the port was active on the host we are evacuating from | 12:09 |
sean-k-mooney | so it thinks the l2 agent is finished doing its work | 12:09 |
sean-k-mooney | so when the dhcp agent responds it sends the event | 12:09 |
sean-k-mooney | that is probably what is happening here | 12:09 |
sean-k-mooney | that is what happened for live migration | 12:10 |
sean-k-mooney | i expect the same behavior for evacuate | 12:10 |
sean-k-mooney | gibi: https://zuul.opendev.org/t/openstack/build/518f8641b9a7419391b0f99f795f26bd/log/controller/logs/etc/neutron/neutron_conf.txt#1322-1334 | 12:11 |
sean-k-mooney | its disabled | 12:11 |
sean-k-mooney | we should try enabling that and see if it fixes the problem | 12:11 |
sean-k-mooney | gibi: the fix in neutron is based on the presence of migrating_to in the port profile | 12:12 |
gibi | I thought the expected sequence would be: 1) nova binds the port to the target host 2) nova plugs the vif on the target host 3) neutron agents plug the other end of the vif on the target host 4) neutron sends the vif-plugged event to nova | 12:13 |
gibi | but based on what you said 3) and 4) happen before 2) | 12:14 |
sean-k-mooney | right but what actually happens is the l2 agent on the source host says the port is already bound, the dhcp agent says the dhcp configuration is correct, and then neutron sends the event | 12:14 |
gibi | I see, | 12:14 |
sean-k-mooney | https://review.opendev.org/c/openstack/neutron/+/766277/10/neutron/agent/rpc.py | 12:14 |
sean-k-mooney | adds filtering so that we only consider updates from the host that migrating_to points to | 12:15 |
gibi | OK | 12:15 |
gibi | based on the comment in the config, the live_migration_events flag should be removed in neutron in Zed already | 12:16 |
sean-k-mooney | well it will be removed in zed and always enabled | 12:16 |
sean-k-mooney | but i dont know if they have done that yet | 12:16 |
gibi | OK | 12:16 |
sean-k-mooney | https://github.com/openstack/neutron/blob/master/neutron/conf/common.py#L182-L199= | 12:17 |
sean-k-mooney | still there | 12:17 |
gibi | OK, I will push a patch to enable that flag in the hybrid plug job | 12:17 |
sean-k-mooney | ack ralonsoh do you have patches to remove https://github.com/openstack/neutron/blob/master/neutron/conf/common.py#L182-L199= | 12:17 |
sean-k-mooney | and always make that the correct behavior | 12:18 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Enable live_migration_events in nova-ovs-hybrid-plug https://review.opendev.org/c/openstack/nova/+/840446 | 12:23 |
gibi | sean-k-mooney: ^^ | 12:23 |
sean-k-mooney | ack just realised i'm meant to be on a call | 12:24 |
sean-k-mooney | ill take a look after | 12:24 |
gibi | thanks | 12:24 |
gibi | btw this error was pretty infrequent so we might not know if this fixes it | 12:24 |
sean-k-mooney | ok ya if it's what i think it is, it's a race that we occasionally lose | 12:26 |
sean-k-mooney | normally the event arrives after we start waiting but not always | 12:26 |
ralonsoh | sean-k-mooney, I'll do it now | 12:34 |
sean-k-mooney | ralonsoh: thanks no rush but better to land that earlier than late in the cycle | 12:35 |
ralonsoh | sean-k-mooney, btw, just to confirm: the value will be True now | 12:35 |
sean-k-mooney | yes | 12:35 |
ralonsoh | perfect | 12:35 |
sean-k-mooney | well you are removing the option yes | 12:35 |
ralonsoh | sean-k-mooney, https://review.opendev.org/c/openstack/neutron/+/840448 | 12:45 |
ralonsoh | I'll wait until your reviews | 12:46 |
sean-k-mooney | ralonsoh: it looks good to me but one nit | 12:47 |
ralonsoh | sure | 12:47 |
sean-k-mooney | you do not have a release note for this | 12:47 |
ralonsoh | right, it deserves one | 12:47 |
ralonsoh | I'll add it | 12:47 |
sean-k-mooney | +0 while you address that but otherwise +1 | 12:48 |
sean-k-mooney | i do want to also see the ci run on this too but it should be fine | 12:48 |
ralonsoh | perfect | 12:49 |
gmann | bauzas: ah, good point on yoga spec link. in that case, let me keep it same name then and in detail i can mention what all things this BP is targeting. I will update zed proposed spec file | 13:09 |
bauzas | gmann: ack ok | 13:09 |
*** dasm|ruck|off is now known as dasm|ruck | 13:43 | |
opendevreview | Rico Lin proposed openstack/nova-specs master: Add vIOMMU device support for libvirt driver https://review.opendev.org/c/openstack/nova-specs/+/840310 | 13:47 |
sean-k-mooney | gibi: so rodolfo has a patch to make this the default but do we want to proceed with https://review.opendev.org/c/openstack/nova/+/840446 anyway and perhaps backport that to the relevant branches? | 14:11 |
gibi | I haven't seen hits of this bug on stable. do we have the hybrid job on stable? | 14:11 |
sean-k-mooney | i think artom is adding it. we had it before yoga too since that was the devstack default | 14:12 |
sean-k-mooney | gibi: https://github.com/openstack/nova/blob/stable/yoga/.zuul.yaml#L650= | 14:14 |
gibi | then I think it makes sense to land this now and backport it | 14:15 |
gibi | then we can drop the flag from master when ralonsoh's patch lands | 14:15 |
sean-k-mooney | this https://review.opendev.org/c/openstack/nova/+/828413/2 and https://review.opendev.org/c/openstack/nova/+/828418 will be adding it to xena and wallaby | 14:16 |
sean-k-mooney | cool | 14:16 |
opendevreview | Rico Lin proposed openstack/nova-specs master: Add vIOMMU device support for libvirt driver https://review.opendev.org/c/openstack/nova-specs/+/840310 | 16:04 |
ricolin | sean-k-mooney: Thanks for your very detailed review, just updated the spec accordingly :) | 16:05 |
opendevreview | Merged openstack/nova-specs master: Re-propose remove tenant_id https://review.opendev.org/c/openstack/nova-specs/+/837789 | 16:34 |
opendevreview | Ghanshyam proposed openstack/nova-specs master: Re-propose allow Project admin to list allowed hypervisors https://review.opendev.org/c/openstack/nova-specs/+/833165 | 16:44 |
gmann | gibi: dansmith ^^ as you reviewed it in Yoga cycle. re-proposing the spec. | 16:45 |
*** whoami-rajat__ is now known as whoami-rajat | 17:25 | |
*** Guest0 is now known as prometheanfire | 19:47 |
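
The race discussed between 11:24 and 12:15 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not neutron's actual code: the class `PortProvisioning`, the method `report`, the flag `filter_by_target` and the host names are all hypothetical. It assumes, as described in the log, that the plugged event is sent once both the L2 and DHCP entities have reported completion, and that the fix ignores L2 reports coming from any host other than the one `migrating_to` points to.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Entities that must both report before the event is sent (per the log).
REQUIRED_ENTITIES = frozenset({"L2", "DHCP"})


@dataclass
class PortProvisioning:
    """Hypothetical toy model of per-port provisioning state."""
    port_id: str
    migrating_to: Optional[str] = None  # target host from the port profile
    completed: Set[str] = field(default_factory=set)
    events: List[str] = field(default_factory=list)

    def report(self, entity: str, host: str,
               filter_by_target: bool = True) -> bool:
        """Record a completion report; return True if an event was sent."""
        # Assumed behaviour of the fix: drop L2 reports that do not come
        # from the host named in migrating_to.
        if (filter_by_target and entity == "L2"
                and self.migrating_to is not None
                and host != self.migrating_to):
            return False
        self.completed.add(entity)
        if REQUIRED_ENTITIES <= self.completed:
            self.events.append("network-vif-plugged")
            self.completed.clear()
            return True
        return False


# Without filtering, the source host's L2 report plus DHCP fires too early.
racy = PortProvisioning("port-1", migrating_to="dest")
racy.report("L2", "source", filter_by_target=False)
early = racy.report("DHCP", "controller", filter_by_target=False)

# With filtering, the event waits for the L2 agent on the target host.
fixed = PortProvisioning("port-1", migrating_to="dest")
fixed.report("L2", "source")
fixed.report("DHCP", "controller")
on_time = fixed.report("L2", "dest")
```

In this model `early` is `True` before anything was plugged on the target host, which mirrors the timing mismatch gibi noticed, while in the filtered case the event is only produced by the final call.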