*** wolverineav has joined #openstack-nova | 00:01 | |
*** brault has joined #openstack-nova | 00:03 | |
*** wolverineav has quit IRC | 00:05 | |
*** jmlowe has quit IRC | 00:07 | |
*** brault has quit IRC | 00:07 | |
*** itlinux has joined #openstack-nova | 00:08 | |
openstackgerrit | Hongbin Lu proposed openstack/nova-specs master: [WIP] Support scheduling VM's NICs to different PFs https://review.openstack.org/626055 | 00:19 |
---|---|---|
*** hongbin has quit IRC | 00:21 | |
openstackgerrit | sean mooney proposed openstack/nova-specs master: Add spec for sriov live migration https://review.openstack.org/605116 | 00:24 |
*** mriedem has quit IRC | 00:27 | |
*** itlinux has quit IRC | 00:28 | |
*** _alastor_ has joined #openstack-nova | 00:28 | |
*** itlinux has joined #openstack-nova | 00:28 | |
*** itlinux has quit IRC | 00:29 | |
*** macza has quit IRC | 00:32 | |
*** mlavalle has quit IRC | 00:32 | |
*** fragatina has joined #openstack-nova | 00:40 | |
*** sapd1 has joined #openstack-nova | 00:45 | |
*** ileixe has joined #openstack-nova | 00:50 | |
*** ileixe has quit IRC | 00:51 | |
*** ileixe has joined #openstack-nova | 00:53 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Remove "API Service Version" upgrade check https://review.openstack.org/615348 | 00:54 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Drop old service version check compat from _delete_while_booting https://review.openstack.org/623589 | 00:54 |
*** gyee has quit IRC | 00:55 | |
*** itlinux has joined #openstack-nova | 01:12 | |
*** wolverineav has joined #openstack-nova | 01:18 | |
*** sapd1 has quit IRC | 01:20 | |
*** wolverineav has quit IRC | 01:23 | |
*** igordc has quit IRC | 01:29 | |
*** _alastor_ has quit IRC | 01:34 | |
*** tiendc has joined #openstack-nova | 01:34 | |
*** igordc has joined #openstack-nova | 01:38 | |
*** igordc has quit IRC | 01:42 | |
*** colby_ has quit IRC | 01:45 | |
openstackgerrit | zhaodan7597 proposed openstack/nova master: fix a bug, when creating a vmware instance from a volume,and it goes to error state, the volume still in "in use" state. https://review.openstack.org/571112 | 01:47 |
*** tetsuro has quit IRC | 01:47 | |
openstackgerrit | zhaodan7597 proposed openstack/nova master: Bug description: when creating a vmware instance from a volume is failed, vm goes to error state, the volume is in "in use" state, and after deleting the vm, the state of the volume is still in "in use" and can't be deleted. https://review.openstack.org/571112 | 01:54 |
openstackgerrit | Merged openstack/nova master: Make [cinder]/catalog_info no longer require a service_name https://review.openstack.org/620738 | 01:59 |
*** sapd1 has joined #openstack-nova | 02:04 | |
*** Dinesh_Bhor has joined #openstack-nova | 02:08 | |
*** sapd1 has quit IRC | 02:09 | |
*** cfriesen has quit IRC | 02:12 | |
openstackgerrit | zhaodan7597 proposed openstack/nova master: Catch the not found exception when power off an bfv wmware instance failed https://review.openstack.org/571112 | 02:15 |
*** colby_ has joined #openstack-nova | 02:17 | |
*** igordc has joined #openstack-nova | 02:22 | |
*** mhen has quit IRC | 02:22 | |
*** mhen has joined #openstack-nova | 02:23 | |
*** Dinesh_Bhor has quit IRC | 02:23 | |
*** wolverineav has joined #openstack-nova | 02:33 | |
*** wolverineav has quit IRC | 02:34 | |
*** wolverin_ has joined #openstack-nova | 02:34 | |
*** Dinesh_Bhor has joined #openstack-nova | 02:39 | |
*** itlinux has quit IRC | 02:40 | |
*** igordc has quit IRC | 02:43 | |
*** psachin has joined #openstack-nova | 02:58 | |
*** mrsoul has quit IRC | 03:00 | |
*** brinzhang has joined #openstack-nova | 03:04 | |
*** igordc has joined #openstack-nova | 03:19 | |
*** Bhujay has joined #openstack-nova | 03:20 | |
*** tbachman has quit IRC | 03:20 | |
openstackgerrit | zhaodan7597 proposed openstack/nova master: Catch the not found exception when powering off a wmware instance https://review.openstack.org/571112 | 03:30 |
*** tbachman has joined #openstack-nova | 03:30 | |
*** Bhujay has quit IRC | 03:38 | |
*** erlon has quit IRC | 03:44 | |
*** Dinesh_Bhor has quit IRC | 03:46 | |
openstackgerrit | Brin Zhang proposed openstack/nova-specs master: Support admin to specify project to create snapshot https://review.openstack.org/616843 | 03:46 |
*** hongbin has joined #openstack-nova | 03:51 | |
*** igordc has quit IRC | 03:52 | |
*** dklyle has quit IRC | 04:02 | |
*** david-lyle has joined #openstack-nova | 04:02 | |
*** brault has joined #openstack-nova | 04:05 | |
*** wolverin_ has quit IRC | 04:06 | |
*** sapd1 has joined #openstack-nova | 04:07 | |
*** takashin has joined #openstack-nova | 04:09 | |
*** brault has quit IRC | 04:09 | |
*** wolverineav has joined #openstack-nova | 04:13 | |
*** wolverineav has quit IRC | 04:13 | |
gmann | gibi: i replied on this, please check if that makes sense - https://review.openstack.org/#/c/625002/ | 04:13 |
*** slaweq has quit IRC | 04:13 | |
*** udesale has joined #openstack-nova | 04:14 | |
*** wolverineav has joined #openstack-nova | 04:14 | |
*** macza has joined #openstack-nova | 04:17 | |
*** wolverineav has quit IRC | 04:18 | |
*** Bhujay has joined #openstack-nova | 04:33 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Use renamed template 'integrated-gate-py3' https://review.openstack.org/626088 | 04:34 |
*** janki has joined #openstack-nova | 04:48 | |
*** vabada has quit IRC | 05:05 | |
*** evrardjp_ has joined #openstack-nova | 05:05 | |
*** evrardjp has quit IRC | 05:07 | |
*** Dinesh_Bhor has joined #openstack-nova | 05:18 | |
*** ileixe has quit IRC | 05:25 | |
*** ileixe has joined #openstack-nova | 05:27 | |
*** hongbin has quit IRC | 05:40 | |
*** udesale has quit IRC | 05:45 | |
*** brinzhang has quit IRC | 05:45 | |
*** udesale has joined #openstack-nova | 05:49 | |
*** macza has quit IRC | 05:58 | |
*** evrardjp has joined #openstack-nova | 06:01 | |
*** evrardjp_ has quit IRC | 06:03 | |
*** ratailor has joined #openstack-nova | 06:04 | |
*** ratailor has quit IRC | 06:04 | |
*** licanwei has joined #openstack-nova | 06:04 | |
*** ratailor has joined #openstack-nova | 06:05 | |
openstackgerrit | zhaodan7597 proposed openstack/nova master: Catch the not found exception when powering off a wmware instance https://review.openstack.org/626095 | 06:05 |
*** ratailor has quit IRC | 06:06 | |
openstackgerrit | zhaodan7597 proposed openstack/nova master: Catch the not found exception when powering off a wmware instance https://review.openstack.org/626095 | 06:08 |
*** sridharg has joined #openstack-nova | 06:09 | |
openstackgerrit | zhaodan7597 proposed openstack/nova master: Catch the not found exception when powering off a wmware instance https://review.openstack.org/571112 | 06:10 |
*** wolverineav has joined #openstack-nova | 06:12 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Adds view builders for keypairs controller https://review.openstack.org/347289 | 06:18 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Fix 500 error while passing 4-byte unicode data https://review.openstack.org/407514 | 06:19 |
*** wolverineav has quit IRC | 06:21 | |
*** dims has quit IRC | 06:27 | |
*** fragatina has quit IRC | 06:27 | |
*** fragatina has joined #openstack-nova | 06:28 | |
*** dims has joined #openstack-nova | 06:28 | |
*** dims has quit IRC | 06:33 | |
*** dims has joined #openstack-nova | 06:34 | |
*** mgagne has quit IRC | 06:35 | |
*** fragatina has quit IRC | 06:35 | |
*** fragatina has joined #openstack-nova | 06:36 | |
*** mgagne has joined #openstack-nova | 06:39 | |
*** ianw is now known as ianw_pto | 06:43 | |
*** Dinesh_Bhor has quit IRC | 06:56 | |
*** fragatina has quit IRC | 07:01 | |
*** fragatina has joined #openstack-nova | 07:01 | |
*** liuyulong has joined #openstack-nova | 07:07 | |
*** Dinesh_Bhor has joined #openstack-nova | 07:10 | |
*** brault has joined #openstack-nova | 07:12 | |
*** brault has quit IRC | 07:14 | |
*** wolverineav has joined #openstack-nova | 07:20 | |
*** wolverineav has quit IRC | 07:20 | |
*** wolverineav has joined #openstack-nova | 07:20 | |
*** ratailor has joined #openstack-nova | 07:21 | |
*** dpawlik has joined #openstack-nova | 07:21 | |
*** dpawlik has quit IRC | 07:25 | |
*** vabada has joined #openstack-nova | 07:27 | |
*** ileixe has quit IRC | 07:29 | |
*** imacdonn has quit IRC | 07:29 | |
*** imacdonn has joined #openstack-nova | 07:29 | |
openstackgerrit | zhaodan7597 proposed openstack/nova master: Catch the not found exception when powering off a vmware instance https://review.openstack.org/571112 | 07:31 |
*** dpawlik has joined #openstack-nova | 07:36 | |
*** dpawlik has quit IRC | 07:36 | |
*** dpawlik_ has joined #openstack-nova | 07:36 | |
*** ccamacho has joined #openstack-nova | 07:40 | |
*** moshele has joined #openstack-nova | 07:42 | |
*** wolverineav has quit IRC | 07:46 | |
*** sapd1 has quit IRC | 07:49 | |
*** slaweq has joined #openstack-nova | 08:00 | |
*** takashin has left #openstack-nova | 08:00 | |
*** Dinesh_Bhor has quit IRC | 08:03 | |
*** pcaruana has joined #openstack-nova | 08:03 | |
openstackgerrit | Yongli He proposed openstack/nova-specs master: add 'show-server-group' spec https://review.openstack.org/612255 | 08:05 |
*** sapd1 has joined #openstack-nova | 08:06 | |
*** markvoelker has joined #openstack-nova | 08:10 | |
openstackgerrit | Martin Midolesov proposed openstack/nova master: vmware:add support for the hw_video_ram image property https://review.openstack.org/564193 | 08:12 |
openstackgerrit | wingwj proposed openstack/nova master: Fix a broken-link in nova doc https://review.openstack.org/626110 | 08:12 |
*** liuyulong has quit IRC | 08:15 | |
*** Bhujay has quit IRC | 08:19 | |
openstackgerrit | wingwj proposed openstack/nova master: Fix a broken-link in nova doc https://review.openstack.org/626113 | 08:22 |
*** sahid has joined #openstack-nova | 08:23 | |
*** helenafm has joined #openstack-nova | 08:24 | |
*** pcaruana has quit IRC | 08:24 | |
openstackgerrit | wingwj proposed openstack/nova master: Fix a broken-link in nova doc https://review.openstack.org/626110 | 08:26 |
*** pcaruana has joined #openstack-nova | 08:33 | |
*** bhagyashris has joined #openstack-nova | 08:34 | |
*** brault has joined #openstack-nova | 08:40 | |
*** brault has quit IRC | 08:41 | |
*** pcaruana has quit IRC | 08:41 | |
openstackgerrit | Merged openstack/nova master: Remove legacy request spec compat code from API https://review.openstack.org/614309 | 08:43 |
gibi | gmann: hi! regarding https://review.openstack.org/#/c/625002/ do we assume that every dependency-related code change is fully covered by unit and functional tests? | 08:46 |
*** jogo has joined #openstack-nova | 08:46 | |
*** rcernin has quit IRC | 08:51 | |
*** alexchadin has joined #openstack-nova | 08:55 | |
*** Bhujay has joined #openstack-nova | 09:03 | |
*** Bhujay has quit IRC | 09:04 | |
*** Bhujay has joined #openstack-nova | 09:05 | |
*** Bhujay has quit IRC | 09:06 | |
*** Dinesh_Bhor has joined #openstack-nova | 09:06 | |
*** Bhujay has joined #openstack-nova | 09:06 | |
*** Bhujay has quit IRC | 09:07 | |
*** Bhujay has joined #openstack-nova | 09:08 | |
*** Bhujay has quit IRC | 09:09 | |
*** Bhujay has joined #openstack-nova | 09:09 | |
*** rodolof has joined #openstack-nova | 09:10 | |
*** wolverineav has joined #openstack-nova | 09:10 | |
*** Bhujay has quit IRC | 09:10 | |
*** Bhujay has joined #openstack-nova | 09:11 | |
*** moshele has quit IRC | 09:13 | |
*** wolverineav has quit IRC | 09:15 | |
gibi | gmann: left a reply in https://review.openstack.org/#/c/625002/ | 09:16 |
*** priteau has joined #openstack-nova | 09:23 | |
*** Bhujay has quit IRC | 09:24 | |
*** sapd1 has quit IRC | 09:34 | |
*** sapd1 has joined #openstack-nova | 09:35 | |
*** licanwei has quit IRC | 09:35 | |
*** derekh has joined #openstack-nova | 09:37 | |
*** rodolof has quit IRC | 09:39 | |
*** rodolof has joined #openstack-nova | 09:40 | |
*** brault has joined #openstack-nova | 09:45 | |
*** Bhujay has joined #openstack-nova | 09:48 | |
*** Bhujay has quit IRC | 09:51 | |
*** ttsiouts has joined #openstack-nova | 09:58 | |
*** bhagyashris has quit IRC | 10:00 | |
*** maciejjozefczyk has quit IRC | 10:04 | |
*** maciejjozefczyk has joined #openstack-nova | 10:05 | |
*** maciejjozefczyk has quit IRC | 10:06 | |
*** erlon has joined #openstack-nova | 10:06 | |
*** maciejjozefczyk has joined #openstack-nova | 10:08 | |
*** maciejjozefczyk has quit IRC | 10:08 | |
*** Dinesh_Bhor has quit IRC | 10:09 | |
*** Bhujay has joined #openstack-nova | 10:11 | |
*** Bhujay has quit IRC | 10:12 | |
*** Bhujay has joined #openstack-nova | 10:12 | |
*** Bhujay has quit IRC | 10:13 | |
*** Bhujay has joined #openstack-nova | 10:14 | |
*** ttsiouts has quit IRC | 10:16 | |
*** ttsiouts has joined #openstack-nova | 10:16 | |
*** erlon_ has joined #openstack-nova | 10:17 | |
*** erlon has quit IRC | 10:20 | |
*** ttsiouts has quit IRC | 10:21 | |
*** Bhujay has quit IRC | 10:24 | |
*** ttsiouts has joined #openstack-nova | 10:25 | |
*** psachin has quit IRC | 10:29 | |
*** yan0s has joined #openstack-nova | 10:34 | |
*** maciejjozefczyk has joined #openstack-nova | 10:49 | |
*** maciejjozefczyk has quit IRC | 10:51 | |
*** rodolof has quit IRC | 10:51 | |
*** rodolof has joined #openstack-nova | 10:51 | |
*** maciejjozefczyk has joined #openstack-nova | 10:54 | |
*** maciejjozefczyk has quit IRC | 10:57 | |
*** avolkov has joined #openstack-nova | 11:03 | |
*** rodolof has quit IRC | 11:03 | |
*** ccamacho has quit IRC | 11:05 | |
*** maciejjozefczyk has joined #openstack-nova | 11:05 | |
*** maciejjozefczyk has quit IRC | 11:07 | |
*** dpawlik_ has quit IRC | 11:09 | |
*** dpawlik has joined #openstack-nova | 11:09 | |
*** sapd1 has quit IRC | 11:10 | |
*** dpawlik has quit IRC | 11:13 | |
*** udesale has quit IRC | 11:13 | |
*** dpawlik has joined #openstack-nova | 11:13 | |
*** ccamacho has joined #openstack-nova | 11:14 | |
*** dpawlik has quit IRC | 11:14 | |
*** dpawlik has joined #openstack-nova | 11:14 | |
*** maciejjozefczyk has joined #openstack-nova | 11:15 | |
*** dpawlik has quit IRC | 11:16 | |
*** dpawlik has joined #openstack-nova | 11:17 | |
*** dpawlik has quit IRC | 11:17 | |
*** sapd1 has joined #openstack-nova | 11:17 | |
*** dpawlik has joined #openstack-nova | 11:17 | |
*** rodolof has joined #openstack-nova | 11:20 | |
*** Bhujay has joined #openstack-nova | 11:21 | |
*** ralonsoh has joined #openstack-nova | 11:27 | |
*** ttsiouts has quit IRC | 11:27 | |
*** rodolof has quit IRC | 11:52 | |
*** rodolof has joined #openstack-nova | 11:53 | |
*** erlon_ has quit IRC | 11:59 | |
*** tbachman_ has joined #openstack-nova | 12:02 | |
*** jonher_ has joined #openstack-nova | 12:05 | |
*** rodolof has quit IRC | 12:05 | |
*** tbachman has quit IRC | 12:05 | |
*** tbachman_ is now known as tbachman | 12:05 | |
*** dpawlik has quit IRC | 12:06 | |
*** jonher has quit IRC | 12:08 | |
*** jonher_ is now known as jonher | 12:08 | |
*** tbachman_ has joined #openstack-nova | 12:08 | |
*** tbachman has quit IRC | 12:10 | |
*** tbachman_ is now known as tbachman | 12:10 | |
*** erlon_ has joined #openstack-nova | 12:15 | |
*** ttsiouts has joined #openstack-nova | 12:20 | |
*** ratailor has quit IRC | 12:23 | |
*** tiendc has quit IRC | 12:27 | |
*** hogepodge has quit IRC | 12:29 | |
*** hogepodge has joined #openstack-nova | 12:30 | |
*** janki has quit IRC | 12:32 | |
*** wolverineav has joined #openstack-nova | 12:47 | |
*** dpawlik has joined #openstack-nova | 12:50 | |
*** wolverineav has quit IRC | 12:51 | |
*** ttsiouts has quit IRC | 12:54 | |
*** ttsiouts has joined #openstack-nova | 12:57 | |
*** sapd1 has quit IRC | 13:03 | |
*** alex_xu has quit IRC | 13:03 | |
*** mriedem has joined #openstack-nova | 13:06 | |
*** dave-mccowan has joined #openstack-nova | 13:08 | |
mriedem | sean-k-mooney: you might be able to triage this https://bugs.launchpad.net/nova/+bug/1809095 | 13:10 |
openstack | Launchpad bug 1809095 in OpenStack Compute (nova) "Wrong representor port was unplugged from OVS during cold migration" [Undecided,New] | 13:10 |
*** maciejjozefczyk has quit IRC | 13:11 | |
sean-k-mooney | I'll take a look. from the title it sounds like it's related to os-vif and Mellanox hardware offload | 13:11 |
*** maciejjozefczyk has joined #openstack-nova | 13:12 | |
gibi | mriedem: I left a hint about the serialization problem in https://review.openstack.org/#/c/582417/6/nova/compute/rpcapi.py@736 | 13:22 |
mriedem | melwitt: this sounds very similar to a thing you fixed about caching HostState values globally per scheduler worker https://bugs.launchpad.net/nova/+bug/1809061 | 13:24 |
openstack | Launchpad bug 1809061 in OpenStack Compute (nova) "KeyError when booting multi-stagger-instances" [Undecided,New] | 13:24 |
*** markvoelker has quit IRC | 13:24 | |
*** janki has joined #openstack-nova | 13:26 | |
mriedem | gibi: replied, thanks i'll mess with that | 13:30 |
*** helenafm has quit IRC | 13:34 | |
mriedem | sean-k-mooney: stephenfin: you may also enjoy https://bugs.launchpad.net/nova/+bug/1809040 but i'm not really sure what to do about it, | 13:35 |
openstack | Launchpad bug 1809040 in OpenStack Compute (nova) "pci device lost when error in the configuration file " [Undecided,New] | 13:35 |
mriedem | basically, they goofed their pci passthrough whitelist config during an upgrade, | 13:35 |
mriedem | and lost all the pci device inventory that was previously discovered and assigned to a given vm once they rebooted the vm | 13:35 |
mriedem | maybe they can cold migrate their way out of it | 13:36 |
*** yan0s has quit IRC | 13:37 | |
mriedem | pci devices are only assigned during claims right? | 13:37 |
*** maciejjozefczyk has quit IRC | 13:38 | |
mriedem | frickler: you should have your queens/pike releases now | 13:39 |
stephenfin | mriedem: Just finishing up a rather lengthy email. I'll take a look soon as that's done | 13:39 |
stephenfin | mriedem: But yeah, only during claims to the best of my recollection | 13:39 |
frickler | mriedem: yes, I saw the notification on the bug report, thank you | 13:41 |
*** helenafm has joined #openstack-nova | 13:44 | |
mgariepy | ha, hello :) | 13:44 |
*** mmethot has quit IRC | 13:45 | |
mgariepy | the cold migrate might work, I guess. i don't tend to migrate vms with pci passthrough tho.. and resize isn't really an option since it only allows you to select a new flavor, not the same old one. | 13:45 |
mriedem | cold migrate is just resize without a new flavor | 13:46 |
mriedem | the pci passthrough whitelist / framework is a fickle beast | 13:47 |
*** mmethot has joined #openstack-nova | 13:47 | |
mriedem | given the "inventory" in nova is the intersection of what's in the config and what's on the host | 13:47 |
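The "intersection" point above can be illustrated with a small sketch. This is not nova's implementation: `HostPciDevice`, `tracked_inventory`, the whitelist shape, and the sample IDs are all hypothetical names chosen for the example. It only shows why a mistyped whitelist entry makes a device drop out of the tracked inventory even though the hardware is still on the host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostPciDevice:
    """Hypothetical record for a device found on the host."""
    address: str
    vendor_id: str
    product_id: str


def tracked_inventory(host_devices, whitelist):
    """Return the host devices that match at least one whitelist entry.

    Each whitelist entry is a dict of field name -> required value
    (an assumed, simplified shape for illustration only).
    """
    def matches(dev, spec):
        return all(getattr(dev, key) == value for key, value in spec.items())

    return [
        dev for dev in host_devices
        if any(matches(dev, spec) for spec in whitelist)
    ]


host = [
    HostPciDevice("0000:05:00.0", "10de", "1b80"),
    HostPciDevice("0000:06:00.0", "8086", "10fb"),
]

# Correct whitelist: only the first device is both configured and present.
inventory = tracked_inventory(host, [{"vendor_id": "10de", "product_id": "1b80"}])

# Mistyped product id: nothing matches, so the tracked inventory is empty
# even though the hardware has not changed.
broken = tracked_inventory(host, [{"vendor_id": "10de", "product_id": "ffff"}])
```

With the mistyped entry the result is an empty list, which corresponds to the situation in the bug report where the previously discovered devices disappear after a config error.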
*** moshele has joined #openstack-nova | 13:48 | |
mgariepy | my use case is more like: 1 vm / host with all the resources, (pci passthrough ram and cpu), resize doesn't really work in that case. | 13:50 |
mgariepy | i noticed that the pci devices are re-created in the db, how is the link made to the computes ? | 13:51 |
mgariepy | pci passthrough is fun. it gave me quite a few issues lately. | 13:51 |
mriedem | so you can't cold migrate the vm because there are no other available hosts with the same pci device? | 13:52 |
mgariepy | no there isn't | 13:52 |
*** yan0s has joined #openstack-nova | 13:52 | |
mriedem | hmm, i wonder if you could trick the resize to same host though by creating a private duplicate flavor with some bogus extra spec like foo=bar | 13:54 |
mgariepy | in nova.pci_devices why isn't it linked to the compute node id ? | 13:54 |
mriedem | you'd have to of course enable this option https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.allow_resize_to_same_host | 13:54 |
mgariepy | when recreating the device | 13:54 |
mriedem | i'm really not the person that can answer low level questions about how the pci device code works within nova, | 13:55 |
mriedem | hopefully sean-k-mooney and/or stephenfin could help there | 13:55 |
mriedem | iow i have to look all of the code up every time i need to investigate it | 13:56 |
mriedem | the PciDevice object does have a compute_node_id field | 13:56 |
sean-k-mooney | mriedem: sorry was reading bug. what is the context | 13:56 |
mgariepy | as far as i'm concerned, if i have a compute node, and the device is the same at the same pci address, it should be ""undeleted"" instead of created again. | 13:56 |
mgariepy | sean-k-mooney, https://bugs.launchpad.net/nova/+bug/1809040 | 13:57 |
openstack | Launchpad bug 1809040 in OpenStack Compute (nova) "pci device lost when error in the configuration file " [Undecided,New] | 13:57 |
mgariepy | sean-k-mooney, in nova.pci_devices why isn't it linked to the compute node id ? | 13:58 |
mgariepy | https://paste.ubuntu.com/p/Pn76QVmwqr/ | 13:58 |
mriedem | they are linked to compute node 23 there | 13:59 |
sean-k-mooney | mgariepy: they are | 13:59 |
mgariepy | yep, but the issue is that if i remove the passthrough config from nova.conf, it gets the deleted_at but if I re-add it, it creates a new one. | 14:00 |
mgariepy | in the same host, same address. etc.. | 14:00 |
sean-k-mooney | yes | 14:00 |
sean-k-mooney | that is expected | 14:00 |
sean-k-mooney | the id field is an auto-incrementing field and the uuid is randomly generated if the device is not found in the database | 14:01 |
mgariepy | wouldn't be better to re-use the old entry if all the info matches? | 14:01 |
sean-k-mooney | mgariepy: if we have already deleted it no | 14:02 |
stephenfin | mgariepy: For what it's worth, that also confused me but it is expected | 14:02 |
mgariepy | the 2 first are the ""original"" one, then the 2 other are the new one created. | 14:02 |
*** Bhujay has quit IRC | 14:02 | |
mriedem | it seems that things break down on reboot here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L5350 | 14:02 |
*** Bhujay has joined #openstack-nova | 14:02 | |
mriedem | pci_manager.get_instance_pci_devs(instance) must be returning [] | 14:02 |
sean-k-mooney | mgariepy: we do it this way because you could have pulled the card and installed a different one in the same slot | 14:02 |
mgariepy | yes. | 14:02 |
mgariepy | then the product id would have changed | 14:03 |
sean-k-mooney | that could break the guest if we just blindly reused it | 14:03 |
mriedem | and i think that's probably [] because i think instance.pci_devices is set during a resource claim, which doesn't happen on reboot | 14:03 |
sean-k-mooney | no the product ids could be the same but if they were nics | 14:03 |
*** Bhujay has quit IRC | 14:03 | |
sean-k-mooney | that were used for pf passthrough the mac would have changed | 14:03 |
openstackgerrit | Merged openstack/nova master: Fix a broken-link in nova doc https://review.openstack.org/626110 | 14:03 |
*** Bhujay has joined #openstack-nova | 14:04 | |
mgariepy | yes for a nic, that's true. | 14:04 |
mgariepy | that's my testbed , the other system i have uses graphic card :D | 14:04 |
mgariepy | haha | 14:04 |
sean-k-mooney | even for gpus only the vendor id and product id are recorded | 14:04 |
sean-k-mooney | not the subvendor id | 14:04 |
*** Bhujay has quit IRC | 14:05 | |
sean-k-mooney | so all GTX 1080s have the same vendor id and product id but an EVGA or asus one has different subvendor ids | 14:05 |
*** Bhujay has joined #openstack-nova | 14:05 | |
sean-k-mooney | now they should be identical but not all such products are. | 14:05 |
sean-k-mooney | well actually the clock speed/ram could change | 14:06 |
mgariepy | yeah but the driver would manage that for you. | 14:06 |
*** Bhujay has quit IRC | 14:06 | |
*** Bhujay has joined #openstack-nova | 14:07 | |
sean-k-mooney | it may but the point is if we detect the device was removed we cannot trust that it is the same and can't reuse the entry | 14:07 |
sean-k-mooney | we do not recreate them on every reboot | 14:07 |
sean-k-mooney | we only do it if the agent does a pci scan and did not find them | 14:08 |
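The behaviour described above (rows soft-deleted when a scan no longer finds the device, then brand-new rows created when it reappears) can be sketched roughly as follows. This is a toy model under stated assumptions, not nova's PCI manager: `ToyPciDeviceTable`, its `sync` method, and the row layout are hypothetical. Compute node 23 is taken from the paste discussed earlier.

```python
import itertools
import uuid


class ToyPciDeviceTable:
    """Hypothetical stand-in for a soft-deleting pci_devices table."""

    def __init__(self):
        self._ids = itertools.count(1)  # auto-incrementing id
        self.rows = []

    def sync(self, compute_node_id, scanned_addresses):
        """Reconcile live rows for one compute node with a PCI scan result."""
        live = {
            row["address"]: row
            for row in self.rows
            if row["compute_node_id"] == compute_node_id and not row["deleted"]
        }
        # Soft-delete rows whose device was not found by the scan.
        for address, row in live.items():
            if address not in scanned_addresses:
                row["deleted"] = True
        # Anything scanned but not already live gets a fresh row:
        # new id, new random uuid. Deleted rows are never reused.
        for address in sorted(scanned_addresses):
            if address not in live:
                self.rows.append({
                    "id": next(self._ids),
                    "uuid": str(uuid.uuid4()),
                    "compute_node_id": compute_node_id,
                    "address": address,
                    "deleted": False,
                })


table = ToyPciDeviceTable()
table.sync(23, {"0000:05:00.0"})  # device whitelisted and present
table.sync(23, set())             # whitelist broken: scan finds nothing
table.sync(23, {"0000:05:00.0"})  # whitelist fixed: same slot, new row
```

After the three syncs the table holds two rows for the same address: the original one marked deleted and a new one with a different id and uuid. Anything that still references the original row, such as an instance's device assignment, does not automatically point at the replacement.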
mgariepy | yeah, i never had issue with reboot before :D haha | 14:08 |
mgariepy | anyway at least now i know, and i'll be more careful next time. | 14:09 |
sean-k-mooney | so out of interest, the device was allocated when it was removed? | 14:10 |
sean-k-mooney | what was the state of the vm | 14:10 |
stephenfin | mriedem: Yeah, I'm not actually sure how else we could resolve that besides a cold migrate/rebuild. It's a mismatch between two sources of truth: what the libvirt driver is finding on the host (based on the whitelist) and what the instance is saying it's using | 14:10 |
mgariepy | as long as you don't restart it's ok | 14:11 |
sean-k-mooney | mgariepy: i would have assumed you would have migrated the vm off the host before upgrade, but if you didn't, have you tried doing an openstack server reboot --hard to try and fix the issue | 14:11 |
mgariepy | when you restart, the libvirt config generated doesn't have the pci devices but boots anyway | 14:11 |
*** Bhujay has quit IRC | 14:12 | |
sean-k-mooney | right ok then the only way to fix that is likely to shelve and unshelve | 14:12 |
sean-k-mooney | e.g. to free up the node and its pci device and then recreate the vm with its data on the same node | 14:13 |
mgariepy | the reboot --hard doesn't work, since i guess the data is pulled from the DB and my ""new"" device is not allocated | 14:13 |
mgariepy | yeah. | 14:13 |
mgariepy | i'll have my client to rebuild his cluster. | 14:13 |
mgariepy | sean-k-mooney, i do inplace upgrades, it's not a big deal ;P | 14:14 |
sean-k-mooney | mgariepy: well you could also manually fix this in the db without too much hassle but i guess | 14:14 |
sean-k-mooney | on a larger cluster it may be more complicated however | 14:14 |
sean-k-mooney | mgariepy: i have done that but usually i create a test env first. | 14:15 |
mgariepy | i don't like updating the db. sometimes it comes back to bite me .. | 14:15 |
*** eharney has joined #openstack-nova | 14:15 | |
*** munimeha1 has joined #openstack-nova | 14:15 | |
sean-k-mooney | mgariepy: yes it can. it can be less painful than redeploying the cluster however | 14:15 |
sean-k-mooney | unless you just meant the application in the cluster. | 14:16 |
mgariepy | it's a "special" contributed cluster part of another one | 14:16 |
sean-k-mooney | deleting the vm and recreating it would have the same effect but i assume this happened on all nodes so you would have to delete all the vms with passthrough and recreate them | 14:16 |
mgariepy | he runs some kind of hpc cluster on kubernetes on the vms | 14:16 |
sean-k-mooney | so to be clear. you use openstack to spawn 1 vm per host that uses all the host resources then they use kubernetes to run a distributed hpc application on the vms | 14:17 |
mgariepy | anyway, not a big deal, i'll be more careful next time. i just messed up the nova.conf pci config on the upgrade | 14:18 |
mgariepy | yep. | 14:18 |
mgariepy | haha :D | 14:18 |
sean-k-mooney | that seems overly complicated but if it works it works i guess :) | 14:18 |
*** maciejjozefczyk has joined #openstack-nova | 14:18 | |
mgariepy | it's shared, and this way the client doesn't really have to deal with the hardware... | 14:18 |
mgariepy | and have some other benefit like access to some storage and so on. | 14:19 |
sean-k-mooney | yep i totally get why you would do it, it's just you have at least 3 layers of orchestration there | 14:19 |
mgariepy | yep, not all the same person do all 3. | 14:20 |
sean-k-mooney | openstack orchestrating the vms, kubernetes orchestrating the hpc cluster and spark or whatever orchestrating the hpc jobs on the cluster | 14:20 |
mgariepy | probably slurm, but i'm not 100% sure. | 14:21 |
mgariepy | anyway, thanks a lot for your time and help. | 14:21 |
*** moshele has quit IRC | 14:21 | |
sean-k-mooney | if you deploy your openstack with kubernetes you can make it nice and inception-y | 14:21 |
sean-k-mooney | no worries. are you ok with me closing https://bugs.launchpad.net/nova/+bug/1809040 | 14:21 |
openstack | Launchpad bug 1809040 in OpenStack Compute (nova) "pci device lost when error in the configuration file " [Undecided,New] | 14:21 |
mgariepy | the question is ,will I be able to remove the physical server at some point. | 14:22 |
mgariepy | yep | 14:22 |
stephenfin | sean-k-mooney: Perhaps we could add a nova-compute start up check to see if there are unrecognized PCI devices attached to running instances and fail to start if so? | 14:25 |
stephenfin | sean-k-mooney: Though I guess by then the old PCI devs in the manager would have been marked deleted and new ones created | 14:25 |
stephenfin | Unless we did it realllly early, but that would involve duplicating a lot of logic | 14:25 |
stephenfin | really early = before the PCI manager stuff kicks off | 14:26 |
sean-k-mooney | stephenfin: well i was going to suggest in the bug that if someone wanted to retarget the bug to allow a hard reboot to fix it then it would be fine | 14:26 |
jaypipes | jackding: will try my best | 14:26 |
mgariepy | stephenfin, there are already something like that : https://paste.ubuntu.com/p/GVJQqMSTrM/ | 14:26 |
sean-k-mooney | stephenfin: e.g. if you had a vm with a passthrough device in the flavor alias we would revalidate that we have claimed it and fix it on hard reboot | 14:28 |
sean-k-mooney | i was going to triage it as wontfix and low priority | 14:29 |
mgariepy | have any of you used passthrough and seen: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1808412 | 14:30 |
openstack | Launchpad bug 1808412 in linux (Ubuntu) "4.15.0 memory allocation issue" [Undecided,Confirmed] | 14:30 |
mgariepy | when i start the vm with pci_passthrough ,it does pre-allocate the ram, and get stuck at some point. | 14:30 |
mriedem | stephenfin: rebuild won't fix it b/c we don't do a resource claim on rebuild | 14:30 |
*** liuyulong has joined #openstack-nova | 14:31 | |
mgariepy | shelve/unshelve works | 14:32 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix a broken-link in nova doc https://review.openstack.org/626113 | 14:35 |
*** ralonsoh has quit IRC | 14:38 | |
mriedem | yeah that would work, didn't think about that, but that will do a new instance resource claim on unshelve | 14:38 |
mriedem | how are you pinning it back to the same host though? | 14:38 |
mriedem | or is that the only option for that instance? | 14:38 |
*** yan0s has quit IRC | 14:38 | |
mriedem | s/instance/flavor/ | 14:38 |
mgariepy | i don't have another spot for it haha | 14:38 |
mgariepy | otherwise migrate should be better | 14:39 |
mgariepy | as shelve is not shelving ephemeral drive. | 14:39 |
*** maciejjozefczyk has quit IRC | 14:41 | |
mriedem | mgariepy: are you ok with https://bugs.launchpad.net/nova/+bug/1809040/comments/4 ? | 14:42 |
openstack | Launchpad bug 1809040 in OpenStack Compute (nova) "pci device lost when error in the configuration file " [Undecided,New] | 14:42 |
sean-k-mooney | sorry had to pop away for a bit. i have changed my mind. i think i will set it to triaged and low instead of wontfix and low and state that the logic is working as designed but that we should be able to correct the issue with a hard reboot and we should fix that | 14:42 |
*** mlavalle has joined #openstack-nova | 14:42 | |
sean-k-mooney | mriedem: mgariepy stephenfin does ^ sound reasonable | 14:43 |
*** maciejjozefczyk has joined #openstack-nova | 14:43 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: compute: Reject migration requests when source is down https://review.openstack.org/623489 | 14:43 |
stephenfin | sean-k-mooney: Sounds fair | 14:44 |
mriedem | avolkov was working a fix at one point https://review.openstack.org/#/c/426243/ | 14:44 |
stephenfin | mriedem: Could you send this docs followup patch on its way? https://review.openstack.org/#/c/614322/ | 14:44 |
mriedem | done | 14:45 |
sean-k-mooney | mriedem: that occurred to me as well but i was not sure how involved that would be. e.g. updating the db from the deleted value. also not sure how safe it would be in some cases such as if the device was changed | 14:46 |
sean-k-mooney | ill link it in the bug for context too | 14:46 |
mriedem | there is also https://bugs.launchpad.net/nova/+bug/1633120 | 14:46 |
openstack | Launchpad bug 1633120 in OpenStack Compute (nova) "Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new instance" [Undecided,Confirmed] | 14:46 |
mriedem | all the same issue, | 14:46 |
mriedem | change pci whitelist and restart nova compute blows away allocated pci devices | 14:47 |
sean-k-mooney | you know when we start tracking these devices in placement we are not going to be able to just blow away the resource provider anymore if there are allocations against it | 14:48 |
mriedem | that was my reply on mgariepy's bug, | 14:48 |
mriedem | long-term when pci device inventory and allocation is in placement, this shouldn't be a problem, | 14:48 |
mriedem | because even if nova-compute tried to remove device inventory, if it's in-use by a consumer (instance) the inventory delete request will fail with a 409 | 14:49 |
mriedem | where is the code that removes / deletes devices on restart of nova-compute? shouldn't that be smarter and not delete those devices which are assigned to an instance? | 14:49 |
*** hongbin has joined #openstack-nova | 14:50 | |
mriedem | as we can see in https://paste.ubuntu.com/p/Pn76QVmwqr/ pci device 8 is allocated to an instance but was deleted anyway | 14:50 |
sean-k-mooney | mriedem: well we will still need to resource track to handle the pci device assignment aspect and we will need to be able to correlate it back to the rp | 14:50 |
*** udesale has joined #openstack-nova | 14:50 | |
mriedem | so in 4 years when pci devices are handled with placement, this will not suck as bad anymore yeah? | 14:50 |
mriedem | can't we just not delete allocated pci devices? | 14:51 |
sean-k-mooney | mriedem: actually i think it will suck more | 14:51 |
mriedem | obviously that is some kind of referential constraint | 14:51 |
sean-k-mooney | we could delay deleting the entry until it is deallocated | 14:52 |
sean-k-mooney | if the admin changes the whitelist to prevent a device from being used we would still want to stop new instances from using it | 14:52 |
sean-k-mooney | but it would be reasonable to assume that an existing instance could continue to use it | 14:52 |
mriedem | i don't know how you delay that | 14:53 |
mriedem | without new logic / data modeling on the pci device record to say, "pending delete" or something once it's no longer allocated | 14:54 |
mriedem | but by then you'd have to see if it's back in the pci whitelist | 14:54 |
sean-k-mooney | this is where we figure out the available devices https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L119 | 14:54 |
mriedem | i would think a simple solution is fail nova-compute restart if you tried scrubbing an entry from the whitelist for an allocated device | 14:54 |
mriedem | jaypipes: have you ever heard of this craziness? ^ | 14:54 |
*** maciejjozefczyk has quit IRC | 14:55 | |
* jaypipes reads back | 14:55 | |
mriedem | jaypipes: if you change the pci whitelist config to remove an already allocated device, and restart nova-compute, compute deletes the pci device record that is still allocated to an instance | 14:55 |
mriedem | if you then add it back to the whitelist and restart, nova-compute creates a new 'available' pci device record and the scheduler can try to use it | 14:56 |
sean-k-mooney | so we could just change the if from self.dev_filter.device_assignable(dev) to self.dev_filter.device_assignable(dev) or is_allocated(dev) where is_allocated is a new function | 14:56 |
*** idlemind has joined #openstack-nova | 14:56 | |
mriedem | which the hypervisor will reject | 14:56 |
mriedem | you can also lose the old device assigned to the old instance if you reboot the instance | 14:56 |
*** jonher_ has joined #openstack-nova | 14:56 | |
mriedem | sean-k-mooney: maybe? i don't know what devices_json is | 14:57 |
mriedem | is that from the db or the config? | 14:57 |
*** ralonsoh has joined #openstack-nova | 14:57 | |
*** awaugama has joined #openstack-nova | 14:57 | |
mriedem | god even finding the object code to see where pci device records are deleted is hard | 14:58 |
*** jonher- has joined #openstack-nova | 14:59 | |
mriedem | oh it's in save() i should have known! | 14:59 |
*** jonher has quit IRC | 14:59 | |
*** jonher- is now known as jonher | 14:59 | |
mriedem | https://github.com/openstack/nova/blob/master/nova/objects/pci_device.py#L244 | 14:59 |
sean-k-mooney | so this is suspicious https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L166-L181 | 14:59 |
mriedem | idk, seems like a very easy solution to avoid screwing up the db state would just be raise an exception here https://github.com/openstack/nova/blob/master/nova/objects/pci_device.py#L244 if self.instance_uuid is not None | 15:01 |
mriedem | i.e. you can't remove/delete an allocated pci device | 15:01 |
*** jaypipes has quit IRC | 15:01 | |
*** erlon_ has quit IRC | 15:01 | |
*** erlon has joined #openstack-nova | 15:01 | |
*** jonher_ has quit IRC | 15:02 | |
sean-k-mooney | yes but it looks like https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L166-L181 was maybe working around that | 15:02 |
mriedem | that was added 6 years ago https://github.com/openstack/nova/commit/4855239497050c9ee03fed627c5f41d6b59eddc6 | 15:04 |
mriedem | and clearly it sucks | 15:04 |
gibi | mriedem: FYI my wall of text as a review guide for the bandwidth series http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001129.html | 15:04 |
sean-k-mooney | yes im reading the original commit currently | 15:04 |
*** dpawlik has quit IRC | 15:04 | |
*** dpawlik has joined #openstack-nova | 15:05 | |
mriedem | gibi: ack thanks | 15:05 |
*** dpawlik has quit IRC | 15:06 | |
*** dpawlik has joined #openstack-nova | 15:06 | |
sean-k-mooney | mriedem: if we simply replace existed.status = 'removed' with "continue" in the except clause, that may be enough. | 15:06 |
*** jonher_ has joined #openstack-nova | 15:07 | |
sean-k-mooney | mriedem: i lean more and more to it's working how it was intended but what was intended is dumb and we should fix it | 15:07 |
mriedem | mgariepy: did you see this warning in the nova-compute logs? https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L169 | 15:07 |
mriedem | sean-k-mooney: yeah i don't understand that logic at all | 15:08 |
mriedem | i guess it's saying the hypervisor is no longer reporting that device so we need to forcefully remove it? | 15:08 |
sean-k-mooney | mriedem: the original code comment was lost at some point from the set_hvdevs function | 15:09 |
sean-k-mooney | "Devices should not be hot-plugged when assigned to a guest, | 15:09 |
sean-k-mooney | but possibly the hypervisor has no such guarantee. The best | 15:09 |
sean-k-mooney | we can do is to give a warning if a device is changed | 15:09 |
sean-k-mooney | or removed while assigned. | 15:09 |
sean-k-mooney | " | 15:09 |
mriedem | oh but the new_devs are filtered through the whitelist | 15:09 |
mriedem | in device_assignable | 15:09 |
mriedem | https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L120 | 15:09 |
mriedem | so it's not that the hypervisor is no longer reporting the devices, it's that the whitelist changed | 15:10 |
mriedem | which is exactly the bug | 15:10 |
sean-k-mooney | yes | 15:10 |
*** jonher has quit IRC | 15:10 | |
*** jonher_ is now known as jonher | 15:10 | |
sean-k-mooney | so if we change the filtering to allow allocated devices in addition to the whitelist it will be fine | 15:10 |
*** dpawlik has quit IRC | 15:11 | |
*** alexchadin has quit IRC | 15:11 | |
*** cfriesen has joined #openstack-nova | 15:11 | |
sean-k-mooney | when the device is deallocated from the guest it will be removed from the available devices on the next periodic sync | 15:11 |
sean-k-mooney | as it is no longer in the whitelist or allocated | 15:11 |
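The filtering change described above could be sketched roughly as follows. This is a minimal illustration, not nova's actual code: `Device`, `is_allocated`, `devices_to_track`, and `schedulable` are hypothetical names, and the whitelist is reduced to a plain set of PCI addresses for the sake of the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Device:
    """Hypothetical stand-in for a tracked PCI device record."""
    address: str
    instance_uuid: Optional[str] = None  # set while assigned to a guest


def is_allocated(dev: Device) -> bool:
    """Hypothetical helper: True while the device is assigned to an instance."""
    return dev.instance_uuid is not None


def devices_to_track(devices: Iterable[Device], whitelist: Set[str]) -> List[Device]:
    """Keep devices that are whitelisted OR still allocated to an instance."""
    return [d for d in devices if d.address in whitelist or is_allocated(d)]


def schedulable(devices: Iterable[Device], whitelist: Set[str]) -> List[Device]:
    """Only whitelisted, unallocated devices may be offered to new instances."""
    return [d for d in devices if d.address in whitelist and not is_allocated(d)]
```

With this shape, a device dropped from the whitelist stays tracked while an instance holds it, is never offered to new instances, and falls out of the tracked set on the next sync after it is deallocated.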
stephenfin | mriedem: Not to distract you now, but did you make a mistake on https://github.com/openstack/nova/commit/cdf8ba5acb ? You've said it fixes https://bugs.launchpad.net/nova/+bug/1784579 but that bug is for live migration, not compute service restart which is what your commit addresses | 15:12 |
openstack | Launchpad bug 1784579 in OpenStack Compute (nova) queens "unable to live migrate instance after update to queens" [Medium,Confirmed] | 15:12 |
stephenfin | mriedem: I ask because I found a similar bug which does deal with the compute service restart https://bugs.launchpad.net/nova/+bug/1738373 | 15:13 |
openstack | Launchpad bug 1738373 in OpenStack Compute (nova) "nova-compute cannot restart if _init_host failed" [Undecided,In progress] - Assigned to Xiao Gong (gongxiao) | 15:13 |
mgariepy | mriedem, https://paste.ubuntu.com/p/GVJQqMSTrM/ | 15:13 |
mgariepy | yes i did see the warning. | 15:13 |
mriedem | mgariepy: yup and that matches the instance uuid in https://paste.ubuntu.com/p/Pn76QVmwqr/ | 15:14 |
mriedem | for the deleted pci device | 15:14 |
*** jaypipes has joined #openstack-nova | 15:15 | |
sean-k-mooney | mriedem: mgariepy ill write this up in the bug and i might go fix it. that said im going on vacation today so i might not get to it until january | 15:15 |
sean-k-mooney | if others want to work on it in the interim feel free. | 15:15 |
mriedem | sean-k-mooney: i just did https://bugs.launchpad.net/nova/+bug/1633120 | 15:16 |
openstack | Launchpad bug 1633120 in OpenStack Compute (nova) "Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new instance" [Undecided,Confirmed] | 15:16 |
*** cfriesen has quit IRC | 15:17 | |
sean-k-mooney | mriedem: ok in that case ill close https://bugs.launchpad.net/nova/+bug/1809040 as a duplicate of https://bugs.launchpad.net/nova/+bug/1633120 | 15:17 |
openstack | Launchpad bug 1633120 in OpenStack Compute (nova) "duplicate for #1809040 Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new instance" [Undecided,Confirmed] | 15:17 |
mriedem | sean-k-mooney: already did | 15:17 |
mriedem | stephenfin: yes bug 1784579 is about os-vif port binding failed errors right? | 15:18 |
openstack | bug 1784579 in OpenStack Compute (nova) queens "unable to live migrate instance after update to queens" [Medium,Confirmed] https://launchpad.net/bugs/1784579 | 15:18 |
jaypipes | mriedem: apologies. internet down for last 15 minutes in Sarasota... last thing I got from you was "is that from the db or the config" | 15:18 |
sean-k-mooney | mriedem: in that case ill pay sean-k-mooney 1 million dollars :) | 15:18 |
mriedem | jaypipes: oh just this terrible code https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L177 | 15:19 |
stephenfin | mriedem: Yup, but it's to do with live migration and the fix is only for the service startup code path | 15:19 |
mriedem | sean-k-mooney: if you want, throw up a patch that changes ^ to a continue, fix whatever test hits that and i'll +2 | 15:19 |
stephenfin | At least, assuming I'm reading it right. I'll do some digging but just wanted to sanity check it before I dived down the rabbit hole :) | 15:19 |
mriedem | stephenfin: the live migration fails because of the port binding failures | 15:19 |
jaypipes | mriedem: I think you meant this code: https://github.com/openstack/nova/blob/a1dba961f0018a4995d208a290f4a859ce295840/nova/pci/manager.py#L1-L357 | 15:20 |
mriedem | i see what you did there | 15:20 |
jaypipes | u like that? | 15:20 |
sean-k-mooney | jaypipes: your not wronge | 15:21 |
mriedem | stephenfin: comment 2 | 15:21 |
mriedem | "To summarize, it looks like the pre_live_migration method on the destination host fails to plug vifs and you end up with the "binding_failed" error, which is raised and makes the source live_migration method fail as expected. The failure is on the dest host. As a result, the info cache is updated with "binding_failed" which causes the source compute restart to fail here:" | 15:21 |
jaypipes | sean-k-mooney: but you are. | 15:21 |
jaypipes | wronge that is. | 15:21 |
jaypipes | sean-k-mooney: :P | 15:21 |
sean-k-mooney | jaypipes: it is currently more functional than cyborge's pci passthrough support at least ... | 15:22 |
mriedem | stephenfin: so no i didn't fix the original reason for the port binding failure in pre_live_migration, because that could have been for any number of reasons (neutron agent was down on the dest host?) | 15:22 |
mriedem | i fixed a symptom of that failure, which was nova-compute failed to restart after that failure | 15:22 |
mriedem | as the commit message says, "Admittedly this isn't the smartest thing and doesn't attempt | 15:22 |
mriedem | to recover / fix the instance networking info" | 15:22 |
stephenfin | mriedem: I'm missing something. Why make changes to 'ComputeManager.init_host' (via '_init_instance') in that commit? The exception was being seen in the live migration flow | 15:22 |
stephenfin | ahhhhh | 15:23 |
mriedem | 1. live migration fails, port binding failed - that gets saved in the info cache | 15:23 |
*** dpawlik has joined #openstack-nova | 15:23 | |
mriedem | 2. restart source compute - that blows up because it wasn't handling binding_failed vif types in the os-vif conversion code | 15:23 |
mriedem | i handle #2 | 15:23 |
mriedem | #1 is sort of out of my control | 15:23 |
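The pattern in step 2 (tolerate a `binding_failed` VIF at service startup instead of letting it abort the whole restart) might look something like the sketch below. It is an illustration under stated assumptions, not the merged patch: `VifConversionError`, `convert_vif`, `init_instances`, and the dict-based info cache are all hypothetical stand-ins.

```python
import logging
from typing import Dict, List, Tuple

LOG = logging.getLogger(__name__)

# Assumption: only 'binding_failed' is treated as unconvertible here.
UNCONVERTIBLE_VIF_TYPES = {"binding_failed"}


class VifConversionError(Exception):
    """Hypothetical exception for a VIF that cannot be converted."""


def convert_vif(vif: Dict[str, str]) -> Dict[str, str]:
    """Convert a cached VIF dict, refusing types that cannot be plugged."""
    if vif["type"] in UNCONVERTIBLE_VIF_TYPES:
        raise VifConversionError(
            f"unsupported vif type {vif['type']!r} for port {vif['id']}")
    return {"id": vif["id"], "plugin": vif["type"]}


def init_instances(
    info_cache: Dict[str, List[Dict[str, str]]],
) -> Tuple[Dict[str, List[Dict[str, str]]], List[str]]:
    """Plug VIFs per instance; log and skip instances with a bad binding."""
    plugged: Dict[str, List[Dict[str, str]]] = {}
    skipped: List[str] = []
    for instance_uuid, vifs in info_cache.items():
        try:
            plugged[instance_uuid] = [convert_vif(v) for v in vifs]
        except VifConversionError as exc:
            # One broken instance should not stop the service from starting.
            LOG.warning("Skipping VIF plugging for %s: %s", instance_uuid, exc)
            skipped.append(instance_uuid)
    return plugged, skipped
```

As in the discussion, this only handles the symptom: the stale `binding_failed` entry stays in the cache until something refreshes it.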
stephenfin | Your fix would inadvertently resolve https://bugs.launchpad.net/nova/+bug/1738373 so | 15:23 |
openstack | Launchpad bug 1738373 in OpenStack Compute (nova) "nova-compute cannot restart if _init_host failed" [Undecided,In progress] - Assigned to Xiao Gong (gongxiao) | 15:23 |
mriedem | i mean, we probably shouldn't be saving off busted port binding information when pre_live_migration fails, | 15:24 |
mriedem | since that overwrites the previously good port binding information from the source host | 15:24 |
mriedem | i would have to dig into where we save off the bad port binding information | 15:25 |
stephenfin | Yup, there's a related fix (also for live migration) that you worked on which looks more involved https://bugs.launchpad.net/nova/+bug/1783917 | 15:25 |
openstack | Launchpad bug 1783917 in OpenStack Compute (nova) "live migration fails with NovaException: Unsupported VIF type unbound convert '_nova_to_osvif_vif_unbound'" [High,Fix released] - Assigned to Matt Riedemann (mriedem) | 15:25 |
stephenfin | Wait, wrong link? | 15:25 |
* stephenfin checks (ETOOMUCHTABS) | 15:25 | |
*** yan0s has joined #openstack-nova | 15:25 | |
mriedem | ^ was a regression in rocky | 15:26 |
*** tbachman has quit IRC | 15:26 | |
mriedem | so i suppose my fix should have been related to bug 1784579 | 15:26 |
openstack | bug 1784579 in OpenStack Compute (nova) queens "unable to live migrate instance after update to queens" [Medium,Confirmed] https://launchpad.net/bugs/1784579 | 15:26 |
mriedem | not closes it | 15:26 |
stephenfin | Probably, yeah | 15:27 |
stephenfin | OK, there's just a lot of overlap on these and I'm just trying to unravel it. We also have https://bugzilla.redhat.com/show_bug.cgi?id=1578028 (which I'm filing upstream now) which seems related too | 15:27 |
openstack | bugzilla.redhat.com bug 1578028 in openstack-nova "ovaException: Unsupported VIF type unbound convert '_nova_to_osvif_vif_unbound'" [Urgent,Assigned] - Assigned to nova-maint | 15:27 |
*** dpawlik has quit IRC | 15:27 | |
stephenfin | That os_vif conversion code could probably do with some tweaks. I'll see what I can do | 15:28 |
mriedem | looks like i need to backport https://review.openstack.org/#/c/595317/ as well - i had marked the bug for queens but forgot to keep going i guess | 15:29 |
sean-k-mooney | stephenfin mriedem fixed that in rocky rc phase | 15:29 |
mriedem | stephenfin: sorting out where we save off bogus port binding failed information during pre_live_migration is probably worthwhile | 15:30 |
mriedem | i'm sure there is probably some update db decorator involved that automatically does it | 15:30 |
sean-k-mooney | mriedem: i think it's related to the network update events we get from neutron | 15:31 |
mriedem | oh yeah that might have been it | 15:31 |
stephenfin | mriedem: Ack. I just need to get a reproducer. Easier said than done | 15:31 |
mriedem | sean-k-mooney: b/c we'll get a network-changed event for that i believe | 15:32 |
stephenfin | sean-k-mooney: I assume you're referring to https://bugs.launchpad.net/nova/+bug/1783917 which I think is different to https://bugzilla.redhat.com/show_bug.cgi?id=1578028 | 15:32 |
openstack | Launchpad bug 1783917 in OpenStack Compute (nova) "live migration fails with NovaException: Unsupported VIF type unbound convert '_nova_to_osvif_vif_unbound'" [High,Fix released] - Assigned to Matt Riedemann (mriedem) | 15:32 |
openstack | bugzilla.redhat.com bug 1578028 in openstack-nova "ovaException: Unsupported VIF type unbound convert '_nova_to_osvif_vif_unbound'" [Urgent,Assigned] - Assigned to nova-maint | 15:32 |
stephenfin | The latter is an issue all the way back to newton, assuming that BZ information is correct. Couldn't possibly be a Rocky regression | 15:33 |
mriedem | stephenfin: https://bugzilla.redhat.com/show_bug.cgi?id=1578028 is newton, so yes | 15:33 |
mriedem | https://review.openstack.org/#/c/595317/1/nova/network/os_vif_util.py wouldn't fix that since it's specifically handling vif_type of 'binding_failed' | 15:33 |
mriedem | not 'unbound' | 15:33 |
sean-k-mooney | unbound is the vif type before we set the host in the binding profile | 15:34 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: Handle binding_failed vif plug errors on compute restart https://review.openstack.org/626218 | 15:34 |
stephenfin | mriedem: Aye, the dumb fix would be to handle unbound. Reading that bug, sahid suggested refreshing the info_cache but I'm thinking we can't do that on service startup due to the cost? | 15:34 |
sean-k-mooney | stephenfin: well https://review.openstack.org/#/c/591607/ will help with that | 15:35 |
*** ccamacho has quit IRC | 15:35 | |
mriedem | stephenfin: there is a lot of conversation about that in https://review.openstack.org/#/c/587498/ | 15:35 |
stephenfin | mriedem++ | 15:35 |
mriedem | this https://review.openstack.org/#/c/603844/ would also be related | 15:35 |
mriedem | as a forced recovery action | 15:36 |
*** erlon_ has joined #openstack-nova | 15:36 | |
mriedem | sean correctly pointed out i should have used related/partial bug https://review.openstack.org/#/c/587498/3//COMMIT_MSG@35 | 15:37 |
mriedem | i must have missed that | 15:37 |
sean-k-mooney | https://bugzilla.redhat.com/show_bug.cgi?id=1645316 needs https://review.openstack.org/#/c/591607/ | 15:37 |
openstack | bugzilla.redhat.com bug 1645316 in openstack-nova "Nova fails to attach both interfaces to VM after hypervisor reboot" [High,On_dev] - Assigned to smooney | 15:37 |
mriedem | stephenfin: i think the heal conversation is here https://review.openstack.org/#/c/587498/1/nova/compute/manager.py@956 | 15:38 |
mriedem | i have to come back on https://review.openstack.org/#/c/591607/ | 15:38 |
mriedem | too much gd stuff going on | 15:38 |
mriedem | sean-k-mooney: are you working on that pci device bogus removal patch? | 15:38 |
mriedem | or should i? | 15:38 |
stephenfin | mriedem: Yeah, sorry. Let me draft a bug report and a patch and we can discuss later. As you were | 15:39 |
sean-k-mooney | am i just booted up my sriov env so ill make the change now and see if that works | 15:39 |
mriedem | stephenfin: bug report for the unbound thing? yeah i guess just refer to these other 20 bugs for binding_failed and it's the same issue | 15:39 |
*** erlon has quit IRC | 15:39 | |
stephenfin | Yeah, maybe I should just make the existing bug more generic and re-open it | 15:40 |
mriedem | stephenfin: btw, PS1 of my patch would have handled unbound https://review.openstack.org/#/c/587498/1/nova/network/os_vif_util.py | 15:40 |
mriedem | since it raised a new UnsupportedVifTypeConversion exception for anything we can't convert | 15:40 |
*** fragatina has quit IRC | 15:41 | |
mriedem | and handled in init_host https://review.openstack.org/#/c/587498/1/nova/compute/manager.py | 15:41 |
mriedem | eric convinced me to do something more targeted | 15:41 |
stephenfin | mriedem: Any idea why you changed? | 15:41 |
stephenfin | ah | 15:41 |
mriedem | https://review.openstack.org/#/c/587498/1/nova/network/os_vif_util.py | 15:42 |
mriedem | so your fix is just do the same pattern but for vif_type=unbound | 15:42 |
openstackgerrit | Martin Midolesov proposed openstack/nova master: vmware:add support for the hw_video_ram image property https://review.openstack.org/564193 | 15:42 |
stephenfin | yup | 15:42 |
*** cfriesen has joined #openstack-nova | 15:42 | |
canori01 | mriedem: Is it possible to force the resize of an instance to the same host? I have allow_resize_to_same_host, but it seems like it still tries to pick different hosts | 15:43 |
mriedem | stephenfin: btw it goes back to newton https://review.openstack.org/#/c/350595/ | 15:43 |
mriedem | canori01: i know you can force a cold migrate to a specific host with a newer microversion but i'm not sure if that applies to resize as well | 15:44 |
stephenfin | backporting fun \o/ | 15:44 |
mriedem | https://developer.openstack.org/api-ref/compute/?expanded=migrate-server-migrate-action-detail#migrate-server-migrate-action | 15:44 |
mriedem | canori01: ^ host param was added in 2.56 | 15:44 |
* stephenfin has no idea how mriedem keeps all that context/these conversation threads in his head | 15:45 | |
mriedem | canori01: ah looks like that won't work for cold migrate at least https://github.com/openstack/nova/blob/master/nova/compute/api.py#L3542 | 15:45 |
mriedem | and you can't specify a host for resize https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/servers.py#L806 | 15:46 |
mriedem | https://developer.openstack.org/api-ref/compute/?expanded=migrate-server-migrate-action-detail,resize-server-resize-action-detail#resize-server-resize-action | 15:46 |
*** jmlowe has joined #openstack-nova | 15:47 | |
mriedem | canori01: so the answer is no, and the scheduler is likely picking another host because one is available even though you could use the original host, but maybe weighers or something is picking the other host | 15:47 |
mriedem | canori01: or the source host is filtered out because of bug https://bugs.launchpad.net/nova/+bug/1790204 | 15:48 |
openstack | Launchpad bug 1790204 in OpenStack Compute (nova) "Allocations are "doubled up" on same host resize even though there is only 1 server on the host" [Medium,Triaged] | 15:48 |
*** brault has quit IRC | 15:48 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Create RequestGroup from neutron port https://review.openstack.org/625941 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Include requested_resources to allocation candidate query https://review.openstack.org/625942 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Transfer port.resource_request to the scheduler https://review.openstack.org/567268 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Extend RequestGroup object for mapping https://review.openstack.org/619527 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Calculate RequestGroup resource provider mapping https://review.openstack.org/616239 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Fill the RequestGroup mapping during schedule https://review.openstack.org/619528 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Pass resource provider mapping to neutronv2 api https://review.openstack.org/616240 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Recalculate request group - RP mapping during re-schedule https://review.openstack.org/619529 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Send RP uuid in the port binding https://review.openstack.org/569459 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Test boot with more ports with bandwidth request https://review.openstack.org/573317 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Reject interface attach with QoS aware port https://review.openstack.org/570078 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Reject networks with QoS policy https://review.openstack.org/570079 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Remove port allocation during detach https://review.openstack.org/622421 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Refactor PortResourceRequestBasedSchedulingTestBase https://review.openstack.org/624080 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Record requester in the InstancePCIRequest https://review.openstack.org/625310 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Add pf_interface_name tag to passthrough_whitelist https://review.openstack.org/625311 | 15:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Ensure that bandwidth and VF are from the same PF https://review.openstack.org/623543 | 15:49 |
sean-k-mooney | damn that's a long patch chain | 15:49 |
gibi | sean-k-mooney: sorry, it is "just" 17 patches long :) | 15:50 |
sean-k-mooney | gibi: no complaint from me. it's much better than one big one. | 15:51 |
gibi | sean-k-mooney: and I have to do a rebase soon as the second half is in merge conflict :/ | 15:51 |
sean-k-mooney | im also looking forward to this feature too | 15:51 |
gibi | sean-k-mooney: yeah, I tried to make small steps | 15:51 |
gibi | sean-k-mooney: I've just posted a mail to the ML about the status and some summary about the implementation if you like long mails :) | 15:51 |
ShilpaSD | gibi: Hi | 15:51 |
gibi | ShilpaSD: hi | 15:52 |
mriedem | gibi: are you going to kill me? https://review.openstack.org/#/c/625942/2 | 15:52 |
ShilpaSD | gibi: one question, for notification, can we have multi-valued for driver in configuration file? | 15:52 |
gibi | mriedem: you are right, so I'm not going to kill anybody :) | 15:52 |
gibi | ShilpaSD: that is coming from oslo, let me dig the doc for it | 15:53 |
ShilpaSD | gibi: Are you talking about https://docs.openstack.org/oslo.messaging/latest/reference/notifier.html | 15:54 |
gibi | ShilpaSD: https://docs.openstack.org/oslo.messaging/ocata/opts.html#oslo_messaging_notifications.driver | 15:54 |
gibi | yes | 15:54 |
mriedem | funny the config options are not in the oslo.messaging docs | 15:55 |
gibi | it says multi-valued | 15:55 |
mriedem | bnemec: ^ | 15:55 |
mriedem | ooo but they're in our docs https://docs.openstack.org/nova/latest/configuration/config.html#oslo-messaging-notifications | 15:55 |
*** Bhujay has joined #openstack-nova | 15:57 | |
*** macza has joined #openstack-nova | 15:57 | |
gibi | mriedem: I think it is in the official too https://docs.openstack.org/oslo.messaging/latest/configuration/opts.html#oslo_messaging_notifications.driver | 15:58 |
*** Bhujay has quit IRC | 15:58 | |
*** Bhujay has joined #openstack-nova | 15:58 | |
*** Bhujay has quit IRC | 15:59 | |
*** Bhujay has joined #openstack-nova | 16:00 | |
*** cdent has joined #openstack-nova | 16:01 | |
*** ccamacho has joined #openstack-nova | 16:01 | |
*** Bhujay has quit IRC | 16:01 | |
*** fragatina has joined #openstack-nova | 16:01 | |
*** macza has quit IRC | 16:01 | |
melwitt | mriedem: thanks, taking a look | 16:01 |
*** Bhujay has joined #openstack-nova | 16:01 | |
mriedem | gibi: ah yes good call | 16:02 |
ShilpaSD | gibi: thanks, actually i want this config option to be set for masakari for notifications as Default:'', so looking at how to declare this in conf | 16:02 |
*** Bhujay has quit IRC | 16:02 | |
*** Bhujay has joined #openstack-nova | 16:03 | |
mriedem | [oslo_messaging_notifications]/driver=x | 16:03 |
mriedem | see http://logs.openstack.org/44/603844/11/check/tempest-full/ee8b609/controller/logs/etc/nova/nova_conf.txt.gz | 16:04 |
*** Bhujay has quit IRC | 16:04 | |
mriedem | if you want multi-valued, you just specify additional driver entries | 16:04 |
*** Bhujay has joined #openstack-nova | 16:04 | |
*** Bhujay has quit IRC | 16:05 | |
*** Bhujay has joined #openstack-nova | 16:06 | |
ShilpaSD | mriedem: means as comma separated? | 16:06 |
*** Bhujay has quit IRC | 16:07 | |
*** Bhujay has joined #openstack-nova | 16:07 | |
*** Bhujay has quit IRC | 16:08 | |
nicolasbock | Hi, good morning. I have a quick (hopefully) question: I am trying to find all key pairs. `openstack keypair list` only shows me keypairs associated with the current user. So I started digging through the Nova DB and stumbled upon a `key_pairs` table, which mysteriously is empty though. Where are those keypairs stored? | 16:09 |
*** Bhujay has joined #openstack-nova | 16:09 | |
nicolasbock | Thanks already! | 16:09 |
*** _alastor_ has joined #openstack-nova | 16:10 | |
*** Bhujay has quit IRC | 16:10 | |
*** Bhujay has joined #openstack-nova | 16:10 | |
*** Bhujay has quit IRC | 16:11 | |
*** Bhujay has joined #openstack-nova | 16:12 | |
sahid | nicolasbock: are you sure you're using the right database? | 16:13 |
mriedem | ShilpaSD: no i think separate lines | 16:13 |
mriedem | ListOpt is comma-separated | 16:13 |
mriedem | MultiOpt is multiple entries for the same key | 16:13 |
nicolasbock | sahid: Yes, that thought occurred to me as well | 16:13 |
mriedem | they are similar | 16:13 |
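The distinction between the two option styles can be illustrated with a small stand-alone sketch. It does not use oslo.config; `read_values`, `as_multi`, and `as_list` are hypothetical helpers, and the driver names are only sample values.

```python
from collections import defaultdict
from typing import Dict, List

# Multi-valued style: the same key repeated on separate lines.
CONF_TEXT = """\
[oslo_messaging_notifications]
driver = messagingv2
driver = log
"""


def read_values(text: str) -> Dict[str, List[str]]:
    """Collect every value seen for each 'section/key', in file order."""
    values: Dict[str, List[str]] = defaultdict(list)
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, _, value = line.partition("=")
        values[f"{section}/{key.strip()}"].append(value.strip())
    return dict(values)


def as_multi(values: List[str]) -> List[str]:
    """Multi-valued option: every occurrence of the key contributes a value."""
    return list(values)


def as_list(values: List[str]) -> List[str]:
    """List-style option: one line, split on commas (this sketch uses the last line)."""
    return [item.strip() for item in values[-1].split(",")]
```

Here `as_multi(read_values(CONF_TEXT)["oslo_messaging_notifications/driver"])` yields both drivers, whereas a list-style option would expect a single line such as `key = a, b`.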
*** Bhujay has quit IRC | 16:13 | |
nicolasbock | I'll double check | 16:13 |
ShilpaSD | mriedem: ok, thanks | 16:13 |
mriedem | nicolasbock: key pairs are in the api db | 16:13 |
*** Bhujay has joined #openstack-nova | 16:13 | |
mriedem | key pair info per instance is in the instance_extra table in the cell db | 16:14 |
nicolasbock | Thanks mriedem ! | 16:14 |
mriedem | stephenfin: sean-k-mooney; btw, i tried to summarize some stuff on that port binding failed bug https://bugs.launchpad.net/nova/+bug/1784579/comments/13 | 16:14 |
openstack | Launchpad bug 1784579 in OpenStack Compute (nova) queens "unable to live migrate instance after update to queens" [Medium,In progress] - Assigned to Matt Riedemann (mriedem) | 16:14 |
*** Bhujay has quit IRC | 16:14 | |
sean-k-mooney | mriedem: so i made the continue change and there were no unit test failures with tox -e py27 -- "pci|PCI|hvdevs|update_devices_from_hypervisor_resources" so im going to write a new one | 16:15 |
*** moshele has joined #openstack-nova | 16:15 | |
stephenfin | mriedem: :D https://bugs.launchpad.net/nova/+bug/1784579/comments/14 | 16:15 |
*** Bhujay has joined #openstack-nova | 16:15 | |
sean-k-mooney | ill run the full set to be sure but im guessing there was no test code | 16:15 |
*** itlinux has joined #openstack-nova | 16:15 | |
stephenfin | Probably should have agreed on who was doing that. Oh well | 16:15 |
mriedem | sean-k-mooney: i'm not at all surprised there was missing test coverage for that code | 16:16 |
*** Bhujay has quit IRC | 16:16 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Handle unbound vif plug errors on compute restart https://review.openstack.org/626228 | 16:16 |
*** mdbooth has quit IRC | 16:16 | |
*** Bhujay has joined #openstack-nova | 16:16 | |
stephenfin | mriedem: There's the fix for the latest issue | 16:16 |
sean-k-mooney | mriedem: looks like there are some tests but none that assert that behavior | 16:17 |
*** Bhujay has quit IRC | 16:17 | |
sahid | nicolasbock: i just checked on my devstack i can list the keypairs, i used "nova_api" database | 16:18 |
*** Bhujay has joined #openstack-nova | 16:18 | |
*** moshele has quit IRC | 16:19 | |
*** Bhujay has quit IRC | 16:19 | |
*** Bhujay has joined #openstack-nova | 16:19 | |
melwitt | jackding: review runways are only for approved blueprint implementations, not spec reviews unfortunately (please see instructions on the etherpad), so I'm removing the specs from the queue FYI | 16:20 |
mriedem | stephenfin: comments inline | 16:20 |
*** Bhujay has quit IRC | 16:20 | |
nicolasbock | sahid: Thanks, I found them! | 16:21 |
*** Bhujay has joined #openstack-nova | 16:21 | |
nicolasbock | I was looking in the `nova` database before | 16:21 |
nicolasbock | Thanks for the help sahid and mriedem | 16:21 |
sean-k-mooney | melwitt: for blueprints/specs we are meant to list them for open discussion in the nova team meeting instead, right, to highlight them | 16:21 |
mriedem | melwitt: before adding https://blueprints.launchpad.net/nova/+spec/handling-down-cell back into the runway, we should probably know if tssurya is even around | 16:22 |
mriedem | because i don't think she is and there are -1s on the api change | 16:22 |
mriedem | so there isn't much point in it being in a runway slot | 16:22 |
melwitt | mriedem: oh, right | 16:22 |
*** Bhujay has quit IRC | 16:22 | |
*** Bhujay has joined #openstack-nova | 16:22 | |
*** wolverineav has joined #openstack-nova | 16:23 | |
*** Bhujay has quit IRC | 16:23 | |
canori01 | mriedem: so pinning my flavors to the az's like you suggested yesterday worked fine. So instances don't leave their hypervisor's az if I give them a flavor that associates them to a host aggregate. My situation for the boot volumes is that they are ceph rbd backed. However, each az is backed by a different ceph cluster (because we didn't want a ceph problem in one az to affect the others). | 16:24 |
canori01 | Problem I had is that on resize operations, the scheduler sometimes picked a host in another az and the resize would subsequently fail because that host can't access the rbd volume if it's in a different az | 16:24 |
*** Bhujay has joined #openstack-nova | 16:24 | |
melwitt | sean-k-mooney: yeah, that's a way to get visibility by putting them on open discussion agenda | 16:24 |
canori01 | So while the flavor pinning works, I'm wondering how come it doesn't honor the OS-EXT-AZ:availability_zone of the instance when resizing | 16:24 |
*** eharney has quit IRC | 16:24 | |
*** Bhujay has quit IRC | 16:25 | |
*** Bhujay has joined #openstack-nova | 16:25 | |
*** dpawlik has joined #openstack-nova | 16:26 | |
*** moshele has joined #openstack-nova | 16:26 | |
*** Bhujay has quit IRC | 16:26 | |
*** Bhujay has joined #openstack-nova | 16:27 | |
*** wolverineav has quit IRC | 16:27 | |
sean-k-mooney | Cardoe: does the instance have an az set | 16:27 |
mriedem | canori01: as i said before, if the instance is not created with a specific az, the scheduler does not restrict it to that az | 16:27 |
sean-k-mooney | Cardoe: sorry that was for canori01 | 16:28 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Handle unbound vif plug errors on compute restart https://review.openstack.org/626228 | 16:28 |
mriedem | you could set https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.default_schedule_zone to force a default az, but you might not want that | 16:28 |
stephenfin | mriedem: Thanks. Addressed | 16:28 |
*** Bhujay has quit IRC | 16:28 | |
*** Bhujay has joined #openstack-nova | 16:28 | |
mriedem | canori01: alternatively, if each volume is in a specific az, you could set https://docs.openstack.org/nova/latest/configuration/config.html#cinder.cross_az_attach to false and then would need https://review.openstack.org/#/c/469675/ to proxy the volume az to the instance during server create | 16:29 |
mriedem | cross_az_attach=false means the server and root volume have to be in the same az | 16:29 |
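A minimal sketch of the rule mriedem states above, assuming a simplified model. The function name and arguments are hypothetical and this is not nova's actual code; it only illustrates that with cross_az_attach=false the server and root volume must share an az.

```python
def can_attach(server_az, volume_az, cross_az_attach):
    """Hypothetical model of the [cinder]/cross_az_attach rule.

    server_az may be None when the user did not request an az.
    """
    if cross_az_attach:
        # Any server may attach any volume, regardless of az.
        return True
    # With cross_az_attach=false the two azs must match exactly.
    return server_az == volume_az
```

In this model a server created without an az (server_az is None) never matches a volume az, which is the shape of the failure described in bug 1694844.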
*** Bhujay has quit IRC | 16:29 | |
*** Bhujay has joined #openstack-nova | 16:30 | |
mriedem | having said that, https://review.openstack.org/#/c/469675/ is kind of fugly and i would like to work on an alternative fix that is less tightly coupled down that stack of code, but haven't found the time | 16:30 |
sean-k-mooney | mriedem: does that rely on the cinder az matching the nova az | 16:30 |
mriedem | sean-k-mooney: yes | 16:30 |
*** dpawlik has quit IRC | 16:30 | |
mriedem | as can be seen here, it's extremely easy to break cross_az_attach=false today https://review.openstack.org/#/c/467674/ | 16:31 |
*** Bhujay has quit IRC | 16:31 | |
*** janki has quit IRC | 16:31 | |
*** gyee has joined #openstack-nova | 16:31 | |
mriedem | sorrison at nectar is the only deployer i know personally (target uses it also) that uses cross_az_attach and he said their users are just required to always specify an az when creating a server | 16:31 |
mriedem | b/c of bug 1694844 | 16:31 |
openstack | bug 1694844 in OpenStack Compute (nova) "Boot from volume fails when cross_az_attach=False and volume is provided to nova without an AZ for the instance" [Medium,In progress] https://launchpad.net/bugs/1694844 - Assigned to Matt Riedemann (mriedem) | 16:31 |
*** Bhujay has joined #openstack-nova | 16:31 | |
sean-k-mooney | right | 16:32 |
sean-k-mooney | i assume we don't have an api policy/config option to enforce that | 16:32 |
canori01 | mriedem: is the OS-EXT-AZ:availability_zone attribute on the instance different than what the scheduler looks at? When I instantiate from horizon and choose any availability zone, that gets set with that of the host. So if I were to actually choose an az at that time, then it would honor the az when moving/resizing? | 16:32 |
*** Bhujay has quit IRC | 16:32 | |
*** Bhujay has joined #openstack-nova | 16:33 | |
sean-k-mooney | i think that is just the az it is currently scheduled to but i think the scheduler looks at the request spec for the az when picking hosts | 16:33 |
melwitt | mriedem: I agree with your assessment on that bug you linked. the trace shows "(self.host_state_map[host] for host in seen_nodes)" noting self.host_state_map which is before my fix landed, but also noting that's different than what the code was _before_ my fix landed as well. the previous code was "(self.host_state_map[host] for host in seen_nodes if host in self.host_state_map)" which would also avoid the KeyError, which reminds me, | 16:33 |
melwitt | of another fix that they also don't have, judging from that trace https://github.com/openstack/nova/commit/d72b33b986525a9b2c7aa08b609ae386de1d0e89 | 16:34 |
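A small sketch of the difference between the two generator expressions melwitt quotes. The map contents and host names here are made up for illustration; only the two expressions come from the discussion.

```python
# Hypothetical data: host3 was seen but has no entry in the map.
host_state_map = {"host1": "state1", "host2": "state2"}
seen_nodes = ["host1", "host3"]


def unguarded():
    # Shape of the expression in the reported trace: raises KeyError
    # as soon as a seen host is missing from the map.
    return list(host_state_map[host] for host in seen_nodes)


def guarded():
    # Shape of the earlier code: hosts missing from the map are skipped.
    return list(
        host_state_map[host] for host in seen_nodes if host in host_state_map
    )
```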
*** Bhujay has quit IRC | 16:34 | |
mriedem | ah yeah | 16:34 |
*** macza has joined #openstack-nova | 16:34 | |
*** Bhujay has joined #openstack-nova | 16:34 | |
mriedem | ok maybe they just reported their version incorrectly | 16:34 |
mriedem | sean-k-mooney: api policy/config option for what? | 16:35 |
mriedem | [cinder]/cross_az_attach is read in the api | 16:35 |
*** udesale has quit IRC | 16:35 | |
mriedem | canori01: "When, instantiate from horizon and choose any availability zone, that gets set with that of the host. So if I were to actually choose an az at that time, then it would honor the az when moving/resizing?" correct | 16:35 |
*** Bhujay has quit IRC | 16:35 | |
mriedem | canori01: "is the OS-EXT-AZ:availability_zone attribute on the instance different than what the scheduler looks at? " also correct | 16:36 |
*** Bhujay has joined #openstack-nova | 16:36 | |
mriedem | canori01: the instance.availability_zone field in the db is set to whatever az its compute host is in | 16:36 |
mriedem | and instance.availability_zone gets changed as the instance moves around | 16:36 |
mriedem | the scheduler looks at RequestSpec.availability_zone, which is the thing the user requested when they created the server | 16:37 |
*** yan0s has quit IRC | 16:37 | |
mriedem | so request_spec.az is immutable, but instance.az is not | 16:37 |
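A minimal sketch of the distinction mriedem draws above, assuming a toy model. The class names, the HOST_AZ table, and the helper functions are hypothetical and are not nova's objects; they only illustrate an az fixed at create time versus an az that follows the host.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical host -> az mapping.
HOST_AZ = {"cmp1": "az1", "cmp2": "az2"}


@dataclass(frozen=True)
class RequestSpec:
    # What the user asked for at create time; None means "no az requested".
    availability_zone: Optional[str]


@dataclass
class Instance:
    host: str
    availability_zone: str


def candidate_hosts(spec):
    # The scheduler only restricts hosts when the request spec carries an az.
    if spec.availability_zone is None:
        return sorted(HOST_AZ)
    return sorted(h for h, az in HOST_AZ.items() if az == spec.availability_zone)


def move(instance, dest):
    # The instance az tracks whatever az the current host is in.
    instance.host = dest
    instance.availability_zone = HOST_AZ[dest]
```

With no az in the request spec, both hosts remain candidates on a resize, which matches the cross-az resize canori01 reported.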
*** Bhujay has quit IRC | 16:37 | |
mriedem | how we decide which az to use if the instance.host is in multiple azs....idk | 16:37 |
*** Bhujay has joined #openstack-nova | 16:37 | |
mriedem | ^ is a jaypipes feature | 16:38 |
mriedem | oh wait the compute host can be in multiple aggregates but only 1 az right? | 16:38 |
mriedem | i always forget the rules | 16:38 |
*** macza_ has joined #openstack-nova | 16:38 | |
*** macza has quit IRC | 16:38 | |
*** Bhujay has quit IRC | 16:38 | |
*** Bhujay has joined #openstack-nova | 16:39 | |
mriedem | yeah https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#host-aggregates-and-availability-zones "Availability zones are different from host aggregates in that they are explicitly exposed to the user, and hosts can only be in a single availability zone. Administrators can configure a default availability zone where instances will be scheduled when the user fails to specify one." | 16:39 |
mriedem | canori01: so, you could setup a default ceph pool with a default az for users that don't specify a specific az and flavor that isn't tied to a given az | 16:39 |
mriedem | gets kind of hard to manage after awhile probably | 16:40 |
*** Bhujay has quit IRC | 16:40 | |
openstackgerrit | Merged openstack/nova master: Address nits on I08991796aaced2abc824f608108c0c786181eb65 https://review.openstack.org/614322 | 16:40 |
canori01 | mriedem: Perfect, thanks! | 16:40 |
*** Bhujay has joined #openstack-nova | 16:40 | |
canori01 | sean-k-mooney: thank you as well | 16:40 |
*** jistr has quit IRC | 16:41 | |
*** Bhujay has quit IRC | 16:41 | |
*** jistr has joined #openstack-nova | 16:42 | |
*** Bhujay has joined #openstack-nova | 16:42 | |
*** dpawlik has joined #openstack-nova | 16:42 | |
*** Bhujay has quit IRC | 16:43 | |
*** Bhujay has joined #openstack-nova | 16:43 | |
mriedem | stephenfin: +2 | 16:43 |
stephenfin | \o/ | 16:44 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: Handle binding_failed vif plug errors on compute restart https://review.openstack.org/626361 | 16:44 |
*** Bhujay has quit IRC | 16:44 | |
*** Bhujay has joined #openstack-nova | 16:45 | |
jaypipes | mriedem: ask bauzas. | 16:45 |
mriedem | i think bauzas is on permanent PTO | 16:45 |
*** Bhujay has quit IRC | 16:46 | |
*** jistr_ has joined #openstack-nova | 16:46 | |
*** dpawlik has quit IRC | 16:46 | |
*** Bhujay has joined #openstack-nova | 16:46 | |
*** jistr has quit IRC | 16:47 | |
*** Bhujay has quit IRC | 16:47 | |
*** Bhujay has joined #openstack-nova | 16:48 | |
*** helenafm has quit IRC | 16:49 | |
*** Bhujay has quit IRC | 16:49 | |
*** Bhujay has joined #openstack-nova | 16:49 | |
mriedem | seriously though if anyone knows if bauzas is out the rest of the year, or what, it would be nice to know since i thought he was done with downstream fires for awhile | 16:50 |
*** Bhujay has quit IRC | 16:50 | |
bauzas | mriedem: I literally have 2 days left :( | 16:50 |
mriedem | oh so you have been around | 16:51 |
bauzas | mriedem: but I'll commit myself on upstream reviews | 16:51 |
*** Bhujay has joined #openstack-nova | 16:51 | |
*** moshele has quit IRC | 16:51 | |
bauzas | and upstream revision of the placement spec I have | 16:51 |
mriedem | well i can give you a bunch of specs to just blindly approve then | 16:51 |
*** jistr_ has quit IRC | 16:51 | |
bauzas | mriedem: that's reasonable, I'm just discussing with internal folks about begging time for upstream before I leave | 16:51 |
*** jistr has joined #openstack-nova | 16:52 | |
bauzas | I'll just throw my downstream firehose for the next 2 days | 16:52 |
mriedem | bauzas: in berlin you told me you were good to go for upstream again? | 16:52 |
bauzas | mriedem: I was *thinking* to | 16:52 |
*** Bhujay has quit IRC | 16:52 | |
*** Bhujay has joined #openstack-nova | 16:52 | |
bauzas | mriedem: but then someone left us, and more customers are using our OSP12/OSP13 codebase that runs placement :) | 16:52 |
bauzas | which makes me dragged | 16:53 |
mriedem | alright well here is a list: https://review.openstack.org/#/c/393930/ https://review.openstack.org/#/c/612531/ https://review.openstack.org/#/c/616037/ https://review.openstack.org/#/c/609779/ https://review.openstack.org/#/c/603352/ | 16:53 |
*** Bhujay has quit IRC | 16:53 | |
mriedem | melwitt: weren't you also working on a list of specs that looked like they could use some attention before the freeze? | 16:54 |
*** Bhujay has joined #openstack-nova | 16:54 | |
melwitt | mriedem: yes, sent it out like a minute ago | 16:54 |
mriedem | in general, i need reviews on the cross cell resize spec from people not named dan since he's been the only one | 16:54 |
mriedem | and it's a hairy gd monster and if others aren't going to review it it's DOA for stein | 16:54 |
mriedem | bauzas: thoughts on my email yesterday about per-instance live migration timeouts would also be nice | 16:55 |
melwitt | yeah, I know :( I'm reviewing it today | 16:55 |
*** Bhujay has quit IRC | 16:55 | |
bauzas | mriedem: ack | 16:55 |
mriedem | cfriesen: you might chime in on http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001112.html as well | 16:55 |
*** Bhujay has joined #openstack-nova | 16:55 | |
melwitt | mriedem: feel free to add notes and specs that are on your radar that I missed https://etherpad.openstack.org/p/nova-stein-blueprint-spec-freeze | 16:56 |
mriedem | will do | 16:56 |
*** Bhujay has quit IRC | 16:56 | |
*** Bhujay has joined #openstack-nova | 16:57 | |
*** dpawlik has joined #openstack-nova | 16:58 | |
*** Bhujay has quit IRC | 16:58 | |
*** Bhujay has joined #openstack-nova | 16:58 | |
*** Bhujay has quit IRC | 16:59 | |
*** Bhujay has joined #openstack-nova | 17:00 | |
*** Bhujay has quit IRC | 17:01 | |
*** Bhujay has joined #openstack-nova | 17:01 | |
*** Bhujay has quit IRC | 17:02 | |
*** dpawlik has quit IRC | 17:03 | |
*** Bhujay has joined #openstack-nova | 17:03 | |
bauzas | melwitt: thanks for the etherpad | 17:03 |
*** Bhujay has quit IRC | 17:04 | |
*** Bhujay has joined #openstack-nova | 17:04 | |
*** Bhujay has quit IRC | 17:05 | |
*** Bhujay has joined #openstack-nova | 17:06 | |
*** Bhujay has quit IRC | 17:07 | |
*** Bhujay has joined #openstack-nova | 17:07 | |
*** Bhujay has quit IRC | 17:08 | |
*** Bhujay has joined #openstack-nova | 17:09 | |
*** Bhujay has quit IRC | 17:10 | |
*** Bhujay has joined #openstack-nova | 17:10 | |
cfriesen | mriedem: will take a look | 17:11 |
*** Bhujay has quit IRC | 17:11 | |
*** Bhujay has joined #openstack-nova | 17:12 | |
*** Bhujay has quit IRC | 17:13 | |
*** Bhujay has joined #openstack-nova | 17:13 | |
*** moshele has joined #openstack-nova | 17:14 | |
*** ttsiouts has quit IRC | 17:14 | |
*** Bhujay has quit IRC | 17:14 | |
*** Bhujay has joined #openstack-nova | 17:15 | |
*** Bhujay has quit IRC | 17:16 | |
*** Bhujay has joined #openstack-nova | 17:16 | |
*** Bhujay has quit IRC | 17:17 | |
*** Bhujay has joined #openstack-nova | 17:18 | |
*** Bhujay has quit IRC | 17:19 | |
*** Bhujay has joined #openstack-nova | 17:19 | |
mriedem | tl;dr are the compromises worthwhile to move forward | 17:20 |
*** Bhujay has quit IRC | 17:20 | |
*** Bhujay has joined #openstack-nova | 17:21 | |
*** Bhujay has quit IRC | 17:22 | |
*** Bhujay has joined #openstack-nova | 17:22 | |
*** Bhujay has quit IRC | 17:23 | |
*** Bhujay has joined #openstack-nova | 17:24 | |
*** Bhujay has quit IRC | 17:25 | |
*** Bhujay has joined #openstack-nova | 17:25 | |
stephenfin | Bhujay: What is going on with your IRC connection? | 17:26 |
*** Bhujay has quit IRC | 17:26 | |
*** Bhujay has joined #openstack-nova | 17:27 | |
openstackgerrit | Merged openstack/nova master: Address nits on I1f1fa1d0f79bec5a4101e03bc2d43ba581dd35a0 https://review.openstack.org/614323 | 17:27 |
openstackgerrit | Merged openstack/nova master: Fix a broken-link in nova doc https://review.openstack.org/626113 | 17:27 |
*** Bhujay has quit IRC | 17:28 | |
*** Bhujay has joined #openstack-nova | 17:28 | |
*** Bhujay has quit IRC | 17:29 | |
*** Bhujay has joined #openstack-nova | 17:30 | |
*** Bhujay has quit IRC | 17:31 | |
*** Bhujay has joined #openstack-nova | 17:31 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/ocata: Handle binding_failed vif plug errors on compute restart https://review.openstack.org/626369 | 17:32 |
mriedem | hooray for ocata em ^ | 17:32 |
*** Bhujay has quit IRC | 17:32 | |
*** Bhujay has joined #openstack-nova | 17:33 | |
*** Bhujay has quit IRC | 17:34 | |
stephenfin | melwitt: Seeing as you looked at the earlier change, fancy taking a look at https://review.openstack.org/#/c/626228 ? | 17:34 |
*** Bhujay has joined #openstack-nova | 17:34 | |
melwitt | stephenfin: sure, always up for being pinged for reviews | 17:35 |
stephenfin | mriedem: Thanks for reviewing that nit patch (y) | 17:35 |
*** Bhujay has quit IRC | 17:35 | |
mriedem | the docs one? i didn't really, just saw gibi was +2 and it was a rebase | 17:36 |
mriedem | but yw :) | 17:36 |
*** Bhujay has joined #openstack-nova | 17:36 | |
*** Bhujay has quit IRC | 17:37 | |
*** Bhujay has joined #openstack-nova | 17:37 | |
*** Bhujay has quit IRC | 17:38 | |
*** Bhujay has joined #openstack-nova | 17:39 | |
openstackgerrit | Jack Ding proposed openstack/nova-specs master: Select cpu model from a list of cpu models https://review.openstack.org/620959 | 17:39 |
*** fragatina has quit IRC | 17:39 | |
*** Bhujay has quit IRC | 17:40 | |
*** Bhujay has joined #openstack-nova | 17:40 | |
*** Bhujay has quit IRC | 17:41 | |
*** Bhujay has joined #openstack-nova | 17:42 | |
*** Bhujay has quit IRC | 17:43 | |
*** Bhujay has joined #openstack-nova | 17:43 | |
*** derekh has quit IRC | 17:44 | |
*** Bhujay has quit IRC | 17:44 | |
*** Bhujay has joined #openstack-nova | 17:45 | |
*** Bhujay has quit IRC | 17:46 | |
*** Bhujay has joined #openstack-nova | 17:46 | |
*** Bhujay has quit IRC | 17:47 | |
*** dpawlik has joined #openstack-nova | 17:48 | |
*** Bhujay has joined #openstack-nova | 17:48 | |
*** Bhujay has quit IRC | 17:49 | |
*** Bhujay has joined #openstack-nova | 17:49 | |
*** Bhujay has quit IRC | 17:50 | |
*** Bhujay has joined #openstack-nova | 17:51 | |
*** Bhujay has quit IRC | 17:52 | |
*** Bhujay has joined #openstack-nova | 17:52 | |
*** dpawlik has quit IRC | 17:53 | |
*** Bhujay has quit IRC | 17:53 | |
*** Bhujay has joined #openstack-nova | 17:54 | |
*** Bhujay has quit IRC | 17:55 | |
*** Bhujay has joined #openstack-nova | 17:55 | |
*** Bhujay has quit IRC | 17:56 | |
openstackgerrit | Chris Dent proposed openstack/nova master: Redirect user/placement to placement docs https://review.openstack.org/626333 | 17:57 |
*** Bhujay has joined #openstack-nova | 17:57 | |
*** Bhujay has quit IRC | 17:58 | |
*** Bhujay has joined #openstack-nova | 17:58 | |
*** Bhujay has quit IRC | 17:59 | |
*** Bhujay has joined #openstack-nova | 18:00 | |
openstackgerrit | Krzysztof Opasiak proposed openstack/nova master: Fix server IPs with non-unique network names https://review.openstack.org/625371 | 18:02 |
cfriesen | stephenfin: any chance you could take a look at the cpu models spec proposed by Jack ^ ? Basically instead of setting one model in nova.conf the operator could specify a list, and the virt driver would use the first one that provides the requested cpu features. | 18:02 |
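A rough sketch of the selection rule cfriesen summarizes, assuming the spec works as described in that one sentence. The function, the model names, and the feature sets are placeholders for illustration and are not taken from the spec or from any virt driver.

```python
def pick_cpu_model(configured_models, requested_features, model_features):
    """Return the first configured model providing all requested features.

    configured_models: ordered list of model names (operator preference).
    requested_features: set of feature names the guest needs.
    model_features: hypothetical mapping of model name -> set of features.
    """
    for model in configured_models:
        if requested_features <= model_features.get(model, set()):
            return model
    # No configured model satisfies the request.
    return None
```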
openstackgerrit | Krzysztof Opasiak proposed openstack/nova master: Fix server IPs with non-unique network names https://review.openstack.org/625371 | 18:02 |
*** dpawlik has joined #openstack-nova | 18:04 | |
*** Bhujay has quit IRC | 18:05 | |
*** dpawlik has quit IRC | 18:09 | |
*** cezary_zukowski has quit IRC | 18:11 | |
*** wolverineav has joined #openstack-nova | 18:12 | |
*** wolverineav has quit IRC | 18:12 | |
*** wolverineav has joined #openstack-nova | 18:12 | |
*** sahid has quit IRC | 18:13 | |
*** wolverineav has quit IRC | 18:13 | |
*** cdent has quit IRC | 18:15 | |
melwitt | mriedem: I wanted to bring this review to your attention, bug about returning build requests when a marker is specified (I know you love paginating stuff). I'm +2 on it https://review.openstack.org/624870 | 18:16 |
*** wolverineav has joined #openstack-nova | 18:19 | |
openstackgerrit | Merged openstack/nova master: Remove legacy RequestSpec compat code from live migrate task https://review.openstack.org/625705 | 18:22 |
*** amodi has quit IRC | 18:25 | |
*** amodi has joined #openstack-nova | 18:26 | |
*** sridharg has quit IRC | 18:32 | |
openstackgerrit | Tim Rozet proposed openstack/nova master: Fixes race condition with privsep utime https://review.openstack.org/625741 | 18:39 |
*** moshele has quit IRC | 18:46 | |
*** avolkov has quit IRC | 18:53 | |
mriedem | i saw it before, asked andrey to flesh out the commit message, haven't been back | 18:57 |
*** moshele has joined #openstack-nova | 19:06 | |
*** moshele has quit IRC | 19:12 | |
melwitt | ah, ok | 19:16 |
*** tbachman has joined #openstack-nova | 19:24 | |
*** dpawlik has joined #openstack-nova | 19:25 | |
*** tbachman has quit IRC | 19:26 | |
*** moshele has joined #openstack-nova | 19:27 | |
mnaser | friendly bump on this - https://review.openstack.org/#/c/619352/ | 19:28 |
mnaser | simple backport, the changes in the newer branches have merged too | 19:28 |
mriedem | frickler: fyi redo of the queens release https://review.openstack.org/626377 | 19:29 |
*** dpawlik has quit IRC | 19:30 | |
mriedem | duplicate bug of https://review.openstack.org/#/c/567701/ just came through triage, the fix is straight-forward, the patch is mostly a functional test | 19:31 |
*** gouthamr_ is now known as gouthamr | 19:38 | |
*** dpawlik has joined #openstack-nova | 19:41 | |
*** dpawlik has quit IRC | 19:46 | |
*** brault has joined #openstack-nova | 19:47 | |
*** brault has quit IRC | 19:51 | |
*** erlon_ has quit IRC | 19:55 | |
*** wolverineav has quit IRC | 19:58 | |
openstackgerrit | sean mooney proposed openstack/nova master: PCI: do not force remove allcoated devices https://review.openstack.org/626381 | 19:58 |
sean-k-mooney | mriedem: i have no idea why my unit test is not working in ^ | 19:59 |
sean-k-mooney | i'm going to grab dinner but if you have any insight let me know. | 20:00 |
mriedem | ack thanks | 20:01 |
*** david-lyle has quit IRC | 20:04 | |
*** moshele has quit IRC | 20:07 | |
*** markvoelker has joined #openstack-nova | 20:14 | |
*** dklyle has joined #openstack-nova | 20:15 | |
*** markvoelker has quit IRC | 20:19 | |
*** wolverineav has joined #openstack-nova | 20:30 | |
*** jmlowe has quit IRC | 20:31 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Document using service user tokens for long running operations https://review.openstack.org/626388 | 20:33 |
*** wolverineav has quit IRC | 20:34 | |
openstackgerrit | Jack Ding proposed openstack/nova-specs master: Select cpu model from a list of cpu models https://review.openstack.org/620959 | 20:35 |
melwitt | mriedem: re: the ML thread about that, I thought the oslo.messaging heart beat would take care of the long running live migration problem? | 20:39 |
mriedem | the problem isn't rpc | 20:40 |
*** priteau has quit IRC | 20:40 | |
melwitt | oh, the token auth expiring | 20:40 |
mriedem | nova tries to make a rest api request using the users token to cinder, | 20:40 |
mriedem | the token has timed out | 20:40 |
melwitt | I see, ok | 20:40 |
melwitt | yeah, have to have both then. I got the two confused together but they are two different issues | 20:41 |
mriedem | i want to say i heard anecdotes at one point that rax public cloud had 24 hour token timeouts because of stuff like this way back when | 20:41 |
mriedem | the service user token stuff was added by osic, which was rax+intel | 20:41 |
melwitt | yeah, sounds familiar. I feel like we had something similar at yahoo too | 20:43 |
melwitt | *something similar to service user auth | 20:44 |
*** wolverineav has joined #openstack-nova | 20:45 | |
*** wolverineav has quit IRC | 20:50 | |
*** wolverineav has joined #openstack-nova | 20:57 | |
mriedem | long_rpc_timeout probably also deserves some mention somewhere in troubleshooting admin docs, but not sure right now, | 20:58 |
mriedem | in general i've had random thoughts about things that would be good to put into a 'scaling issues' page in the docs | 20:58 |
openstackgerrit | Krzysztof Opasiak proposed openstack/nova master: Fix server IPs with non-unique network names https://review.openstack.org/625371 | 20:58 |
mriedem | but haven't started anything | 20:58 |
melwitt | ++ | 20:59 |
*** wolverineav has quit IRC | 21:09 | |
*** wolverineav has joined #openstack-nova | 21:09 | |
*** wolverineav has quit IRC | 21:14 | |
mriedem | melwitt: i'm going to fix that unnecessary for loop in https://review.openstack.org/#/c/624870/ that Kevin pointed out, then approve | 21:15 |
melwitt | ok, sounds good | 21:15 |
*** brault has joined #openstack-nova | 21:19 | |
*** brault has quit IRC | 21:23 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Exclude build request marker from server listing https://review.openstack.org/624870 | 21:30 |
melwitt | mriedem: so the func test in this change doesn't fail without the change https://review.openstack.org/567701 is that expected based on the commit message? if so, is there no way to demonstrate the bug in the test? | 21:41 |
melwitt | I wasn't sure based on the wording "that is not a regression" | 21:42 |
*** dpawlik has joined #openstack-nova | 21:42 | |
mriedem | been awhile, but the commit message is saying I8d426f2635232ffc4b510548a905794ca88d7f99 didn't introduce a regression | 21:43 |
melwitt | ok, so unrelated to what I'm seeing I think. basically, without the change, somehow AZ is being updated on the instance. I don't yet know how | 21:43 |
mriedem | i'll have to poke at it, i wrote that in may | 21:44 |
*** takashin has joined #openstack-nova | 21:45 | |
* melwitt nods | 21:45 | |
*** wolverineav has joined #openstack-nova | 21:46 | |
*** dpawlik has quit IRC | 21:46 | |
*** awaugama has quit IRC | 21:50 | |
*** wolverineav has quit IRC | 21:51 | |
*** wolverineav has joined #openstack-nova | 21:55 | |
*** wolverineav has quit IRC | 21:55 | |
*** wolverineav has joined #openstack-nova | 21:56 | |
*** dpawlik has joined #openstack-nova | 21:58 | |
melwitt | looking at it myself for curiosity, I'm not finding how AZ could be updated without the fix. weird | 22:00 |
*** dpawlik has quit IRC | 22:02 | |
openstackgerrit | Merged openstack/nova master: Move a generic bridge helper to a linux_net privsep file. https://review.openstack.org/620010 | 22:05 |
*** takashin has left #openstack-nova | 22:06 | |
*** takashin has joined #openstack-nova | 22:07 | |
*** slaweq has quit IRC | 22:13 | |
* melwitt prints | 22:14 | |
mriedem | oh yay, for a looong time we passed potentially the wrong image to move claim during a resize https://github.com/openstack/nova/blob/1249617bdfaa8f4c586159374a4a0b244bbb298a/nova/conductor/tasks/migrate.py#L77 | 22:15 |
mriedem | based on the original image used to create the server, but potentially not the last image used to rebuild the server | 22:15 |
mriedem | gd req spec | 22:15 |
*** markvoelker has joined #openstack-nova | 22:15 | |
*** igordc has joined #openstack-nova | 22:17 | |
mriedem | which wasn't fixed until https://github.com/openstack/nova/commit/984dd8ad6add4523d93c7ce5a666a32233e02e34 inadvertently | 22:17 |
melwitt | hoo boy | 22:17 |
* mriedem hates request spec | 22:18 | |
melwitt | me too :( | 22:18 |
*** rcernin has joined #openstack-nova | 22:19 | |
mriedem | oh this also likely means that if you shelve, unshelve, resize, we're passing the original image used to create the server, not the current image meta (in case that changed) | 22:19 |
*** ralonsoh has quit IRC | 22:22 | |
melwitt | so the AZ is still the original all the way to the end of the live migration. so how is the servers.get API returning the new AZ... the search continues | 22:23 |
mriedem | i think i know | 22:25 |
mriedem | the api code looks up the az from the instance.host | 22:26 |
mriedem | i think | 22:26 |
melwitt | pulled from the DB, the AZ is still the original in instance.availability_zone | 22:26 |
melwitt | ahhhh | 22:26 |
melwitt | must be, it can't be looking at instance.availability_zone | 22:26 |
mriedem | there are other bugs for that behavior | 22:26 |
melwitt | THANKS API | 22:26 |
mriedem | https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/views/servers.py#L185 | 22:26 |
melwitt | le sigh | 22:27 |
mriedem | https://github.com/openstack/nova/blob/master/nova/availability_zones.py#L165 | 22:27 |
mriedem | yeah if instance.host is not None, that code gets the az from the host aggregate, not the instance.az | 22:27 |
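A simplified sketch of the lookup mriedem describes, assuming a plain dict in place of the host aggregate lookup. The function and argument names are hypothetical and this is not the code in nova/availability_zones.py.

```python
def displayed_az(instance_host, instance_az, host_to_az):
    """Hypothetical model of how the API picks the az to show.

    host_to_az stands in for the host aggregate lookup.
    """
    if instance_host is not None:
        # The az comes from the host, not from the stored instance az.
        return host_to_az[instance_host]
    # No host yet (e.g. not scheduled): fall back to the stored value.
    return instance_az
```

This is why the API can report the new az after a live migration while instance.availability_zone in the db still holds the original one.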
mriedem | ala https://review.openstack.org/#/c/582342/ | 22:27 |
melwitt | well, that explains it | 22:28 |
mriedem | i have a f'ing patch for everything | 22:28 |
melwitt | it's true | 22:28 |
mriedem | https://bugs.launchpad.net/nova/+bug/1782539 | 22:28 |
openstack | Launchpad bug 1782539 in OpenStack Compute (nova) "Fail to filter the list of instances by the available zone" [Medium,In progress] - Assigned to huanhongda (hongda) | 22:28 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Pass request_spec from compute to cell conductor on reschedule https://review.openstack.org/582417 | 22:33 |
mriedem | speak-o-the-turd | 22:33 |
mriedem | melwitt: so i'll adjust that test to assert based on the db rather than the api | 22:34 |
melwitt | ok, makes sense | 22:34 |
*** rcernin has quit IRC | 22:36 | |
melwitt | now, to finish reviewing cross-cell-resize | 22:37 |
*** rcernin has joined #openstack-nova | 22:37 | |
*** erlon_ has joined #openstack-nova | 22:43 | |
*** munimeha1 has quit IRC | 22:45 | |
*** rcernin has quit IRC | 22:57 | |
openstackgerrit | Hongbin Lu proposed openstack/nova-specs master: [WIP] Support scheduling VM's NICs to different PFs https://review.openstack.org/626055 | 22:58 |
*** rcernin has joined #openstack-nova | 22:58 | |
*** rcernin has quit IRC | 22:59 | |
tonyb | I'm seeing something 'strange' where placement is always selecting the same host out of the $n available. I *suspect* there is cruft in the DB after a bunch of failed boots but I don't really know | 23:03 |
tonyb | aside from the scheduler and placement logs where should I look? | 23:03 |
melwitt | tonyb: in the past, default scheduler behavior was to pack instances, so as long as a host has capacity, it would be returned again. I'm not sure if that's still the case now though | 23:04 |
melwitt | is that what you're seeing or no? | 23:04 |
mriedem | tonyb: there was a thread in the ops list about this recently, there is an option you can set to randomize the results from placement | 23:05 |
mriedem | https://docs.openstack.org/nova/latest/configuration/config.html#placement.randomize_allocation_candidates | 23:05 |
tonyb | melwitt: Ahh perhaps that's it | 23:05 |
mriedem | ^ goes in the placement config btw, not nova-scheduler | 23:05 |
tonyb | mriedem: Okay. I should have said this is queens but I'll look for packing and/or randomising | 23:06 |
tonyb | Thanks | 23:07 |
*** rcernin has joined #openstack-nova | 23:07 | |
mriedem | looky here | 23:07 |
mriedem | https://docs.openstack.org/nova/queens/configuration/config.html#placement.randomize_allocation_candidates | 23:07 |
tonyb | mriedem: Will do. | 23:08 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Update instance.availability_zone during live migration https://review.openstack.org/567701 | 23:09 |
melwitt | in the olden days, IIRC, lots of operators preferred to maximize cloud utilization and packing is the way to do that. but maybe nowadays people prefer to spread out servers and risk not being able to land larger flavors. that's the history on the default as I understand it | 23:09 |
mriedem | huawei public cloud wants to pack like you've never packed before | 23:09 |
melwitt | :) | 23:09 |
mriedem | they want nova to pack even more, | 23:09 |
mriedem | doing shit like the solver scheduler and what watcher does, | 23:09 |
melwitt | moar efficiency. not surprising | 23:10 |
mriedem | dynamically load balancing for ultimate packitude | 23:10 |
*** mlavalle has quit IRC | 23:13 | |
*** jmlowe has joined #openstack-nova | 23:16 | |
*** mchlumsky has quit IRC | 23:18 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/rocky: Update port device_owner when unshelving https://review.openstack.org/626407 | 23:20 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: Update port device_owner when unshelving https://review.openstack.org/626408 | 23:22 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: Update port device_owner when unshelving https://review.openstack.org/626409 | 23:29 |
*** alex_xu has joined #openstack-nova | 23:32 | |
*** rcernin has quit IRC | 23:38 | |
*** fragatina has joined #openstack-nova | 23:38 | |
openstackgerrit | Hongbin Lu proposed openstack/nova-specs master: [WIP] Support scheduling VM's NICs to different PFs https://review.openstack.org/626055 | 23:40 |
*** rcernin has joined #openstack-nova | 23:41 | |
openstackgerrit | Merged openstack/nova master: Handle unbound vif plug errors on compute restart https://review.openstack.org/626228 | 23:43 |
*** NostawRm has quit IRC | 23:44 | |
mriedem | jackding: looked at https://review.openstack.org/#/c/603844/ again, i still don't like it | 23:45 |
*** wolverineav has quit IRC | 23:45 | |
*** wolverineav has joined #openstack-nova | 23:46 | |
*** erlon_ has quit IRC | 23:46 | |
openstackgerrit | Merged openstack/nova master: Update port device_owner when unshelving https://review.openstack.org/559828 | 23:49 |
openstackgerrit | melanie witt proposed openstack/nova stable/rocky: Handle unbound vif plug errors on compute restart https://review.openstack.org/626410 | 23:49 |
*** wolverineav has quit IRC | 23:50 | |
mriedem | tl;dr i don't want to randomly have to hit the neutron API every time we rebuild/reboot to list ports for a server just to check if something is broken | 23:52 |
*** wolverineav has joined #openstack-nova | 23:55 | |
*** slaweq has joined #openstack-nova | 23:58 | |
*** dpawlik has joined #openstack-nova | 23:59 |