*** hamzy has joined #openstack-nova | 00:02 | |
*** slaweq has quit IRC | 00:03 | |
*** itlinux_ has quit IRC | 00:04 | |
*** artom has joined #openstack-nova | 00:05 | |
*** sapd1_x has joined #openstack-nova | 00:05 | |
openstackgerrit | Merged openstack/python-novaclient master: Blacklist python-cinderclient 4.0.0 https://review.opendev.org/662912 | 00:08 |
---|---|---|
*** tbachman has joined #openstack-nova | 00:13 | |
*** slaweq has joined #openstack-nova | 00:23 | |
*** brinzhang has joined #openstack-nova | 00:26 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Replace 'is comprised of' with 'comprises' https://review.opendev.org/663175 | 00:26 |
*** mriedem_away has quit IRC | 00:30 | |
*** slaweq has quit IRC | 00:33 | |
*** sapd1_x has quit IRC | 00:35 | |
openstackgerrit | Brin Zhang proposed openstack/nova stable/rocky: Replace the invalid index of nova-rocky releasenote https://review.opendev.org/663178 | 00:37 |
*** cmart has quit IRC | 00:41 | |
*** brault has joined #openstack-nova | 00:44 | |
*** brault has quit IRC | 00:49 | |
*** slaweq has joined #openstack-nova | 00:51 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove unnecessary setUp methods https://review.opendev.org/663179 | 00:59 |
alex_xu | The vPMEM spec is ready for review https://review.opendev.org/#/c/601596/, in case anyone is still around for spec review day~ | 01:01 |
*** slaweq has quit IRC | 01:03 | |
*** spsurya has joined #openstack-nova | 01:18 | |
*** guozijn has joined #openstack-nova | 01:20 | |
*** slaweq has joined #openstack-nova | 01:21 | |
*** boxiang has quit IRC | 01:26 | |
*** boxiang has joined #openstack-nova | 01:27 | |
*** Sundar has quit IRC | 01:30 | |
*** slaweq has quit IRC | 01:33 | |
*** claudiub has quit IRC | 01:36 | |
*** guozijn has quit IRC | 01:37 | |
*** hongbin has joined #openstack-nova | 01:47 | |
openstackgerrit | ya.wang proposed openstack/nova-specs master: Add spec for expose-auto-converge-post-copy https://review.opendev.org/651681 | 01:52 |
*** slaweq has joined #openstack-nova | 01:54 | |
*** Sundar has joined #openstack-nova | 01:56 | |
*** Sundar has quit IRC | 02:03 | |
*** slaweq has quit IRC | 02:03 | |
*** lbragstad has quit IRC | 02:14 | |
*** slaweq has joined #openstack-nova | 02:14 | |
*** boxiang_ has joined #openstack-nova | 02:16 | |
*** BjoernT has joined #openstack-nova | 02:17 | |
*** boxiang has quit IRC | 02:18 | |
*** tbachman has quit IRC | 02:18 | |
*** tbachman has joined #openstack-nova | 02:19 | |
*** slaweq has quit IRC | 02:25 | |
*** itlinux has joined #openstack-nova | 02:27 | |
*** Dinesh_Bhor has quit IRC | 02:37 | |
*** slaweq has joined #openstack-nova | 02:41 | |
*** minmin has joined #openstack-nova | 02:42 | |
*** abhishekk has joined #openstack-nova | 02:43 | |
*** slaweq has quit IRC | 02:54 | |
*** markvoelker has joined #openstack-nova | 02:59 | |
*** abhishekk has quit IRC | 03:13 | |
*** slaweq has joined #openstack-nova | 03:15 | |
*** Dinesh_Bhor has joined #openstack-nova | 03:19 | |
*** slaweq has quit IRC | 03:26 | |
*** markvoelker has quit IRC | 03:30 | |
*** slaweq has joined #openstack-nova | 03:42 | |
*** guozijn has joined #openstack-nova | 03:50 | |
*** hongbin has quit IRC | 03:52 | |
*** slaweq has quit IRC | 03:55 | |
*** ricolin has joined #openstack-nova | 03:58 | |
*** gyee has quit IRC | 04:02 | |
*** slaweq has joined #openstack-nova | 04:12 | |
*** threestrands has joined #openstack-nova | 04:15 | |
*** bnemec has quit IRC | 04:23 | |
*** gmann has quit IRC | 04:23 | |
*** slaweq has quit IRC | 04:25 | |
*** bnemec has joined #openstack-nova | 04:25 | |
*** markvoelker has joined #openstack-nova | 04:27 | |
*** gmann has joined #openstack-nova | 04:27 | |
*** pcaruana has joined #openstack-nova | 04:30 | |
*** guozijn has quit IRC | 04:36 | |
*** brinzhang has quit IRC | 04:44 | |
*** brinzhang has joined #openstack-nova | 04:44 | |
*** markvoelker has quit IRC | 05:00 | |
*** takashin has left #openstack-nova | 05:00 | |
*** tkajinam has quit IRC | 05:01 | |
*** slaweq has joined #openstack-nova | 05:02 | |
*** tkajinam has joined #openstack-nova | 05:03 | |
*** itlinux has quit IRC | 05:22 | |
*** frankwang has joined #openstack-nova | 05:27 | |
*** damien_r has joined #openstack-nova | 05:28 | |
*** tkajinam has quit IRC | 05:33 | |
*** guozijn has joined #openstack-nova | 05:36 | |
*** ivve has quit IRC | 05:46 | |
*** sapd1 has joined #openstack-nova | 05:48 | |
*** markvoelker has joined #openstack-nova | 05:56 | |
*** luksky has joined #openstack-nova | 06:00 | |
*** phasespace has quit IRC | 06:02 | |
*** tkajinam has joined #openstack-nova | 06:03 | |
*** minmin has quit IRC | 06:03 | |
*** dpawlik has joined #openstack-nova | 06:10 | |
*** maciejjozefczyk has joined #openstack-nova | 06:12 | |
*** dpawlik has quit IRC | 06:15 | |
*** minmin has joined #openstack-nova | 06:16 | |
*** dpawlik has joined #openstack-nova | 06:22 | |
*** markvoelker has quit IRC | 06:28 | |
*** dtantsur|afk is now known as dtantsur\ | 06:29 | |
*** dtantsur\ is now known as dtantsur | 06:29 | |
*** damien_r has quit IRC | 06:30 | |
*** takamatsu has joined #openstack-nova | 06:32 | |
*** aarents has joined #openstack-nova | 06:48 | |
*** lpetrut has joined #openstack-nova | 06:58 | |
*** liuyulong_ has joined #openstack-nova | 06:58 | |
*** rcernin has quit IRC | 06:59 | |
*** brault has joined #openstack-nova | 07:02 | |
*** factor has joined #openstack-nova | 07:03 | |
*** ivve has joined #openstack-nova | 07:04 | |
*** minmin has quit IRC | 07:07 | |
*** tssurya has joined #openstack-nova | 07:13 | |
*** rpittau|afk is now known as rpittau | 07:14 | |
openstackgerrit | Merged openstack/nova master: Make all functional tests reusable by other projects https://review.opendev.org/657659 | 07:20 |
*** aloga has quit IRC | 07:23 | |
*** aloga has joined #openstack-nova | 07:23 | |
*** tesseract has joined #openstack-nova | 07:25 | |
*** liuyulong_ has quit IRC | 07:26 | |
*** minmin has joined #openstack-nova | 07:27 | |
*** ttsiouts has joined #openstack-nova | 07:27 | |
*** damien_r has joined #openstack-nova | 07:28 | |
*** minmin has quit IRC | 07:35 | |
*** ttsiouts has quit IRC | 07:37 | |
*** helenafm has joined #openstack-nova | 07:41 | |
*** whoami-rajat has joined #openstack-nova | 07:45 | |
*** ralonsoh has joined #openstack-nova | 07:51 | |
*** ttsiouts has joined #openstack-nova | 08:02 | |
*** tetsuro has joined #openstack-nova | 08:05 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Validate requested host/node during servers create https://review.opendev.org/661237 | 08:08 |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Add host and hypervisor_hostname flag to create server https://review.opendev.org/645520 | 08:08 |
*** xek has joined #openstack-nova | 08:09 | |
*** panda|ruck has quit IRC | 08:14 | |
openstackgerrit | Brin Zhang proposed openstack/nova-specs master: Specifying az when restore shelved server https://review.opendev.org/624689 | 08:14 |
*** derekh has joined #openstack-nova | 08:18 | |
*** phasespace has joined #openstack-nova | 08:22 | |
*** markvoelker has joined #openstack-nova | 08:26 | |
*** BjoernT has quit IRC | 08:26 | |
*** panda has joined #openstack-nova | 08:28 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Use SATA bus for cdrom devices when using q35 machine type https://review.opendev.org/663011 | 08:33 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM: Run tempest-full-py3 with q35 machine type https://review.opendev.org/662887 | 08:33 |
*** tkajinam has quit IRC | 08:37 | |
*** threestrands has quit IRC | 08:39 | |
*** tetsuro has quit IRC | 08:46 | |
*** openstackgerrit has quit IRC | 08:47 | |
*** panda is now known as panda|ruck | 08:52 | |
*** openstackgerrit has joined #openstack-nova | 08:54 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Ensure controllers all call super https://review.opendev.org/660950 | 08:54 |
stephenfin | bauzas: Is this clear(er) now? https://review.opendev.org/#/c/660774/3/nova/compute/manager.py | 08:55 |
*** rcernin has joined #openstack-nova | 08:57 | |
*** markvoelker has quit IRC | 08:59 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Add host and hypervisor_hostname flag to create server https://review.opendev.org/645520 | 09:01 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Use SATA bus for cdrom devices when using q35 machine type https://review.opendev.org/663011 | 09:20 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM: Run tempest-full-py3 with q35 machine type https://review.opendev.org/662887 | 09:20 |
*** priteau has joined #openstack-nova | 09:22 | |
kashyap | lyarwood: When you respin, an Ultra-OCD nit (please don't hate me) : s/q35/'q35'/ (or Q35). | 09:25 |
mdbooth | kashyap: We should put quotes round q35? | 09:38 |
kashyap | mdbooth: Not "should", but just a literal of some kind to show that it is a tiny code fragment, if you're using lowercase | 09:38 |
mdbooth | -EPARSE | 09:38 |
mdbooth | Is this in the context of lyarwood 's specific patch, or in general? | 09:39 |
kashyap | In general. | 09:40 |
mdbooth | kashyap: I agree with you: that's an Ultra-OCD nit ;) | 09:40 |
kashyap | mdbooth: I was saying it in the commit message, not in the code -- sorry, I didn't make that clear | 09:40 |
kashyap | Yes, because consistency is better than inconsistency, harmony is better than chaos, etc :D | 09:41 |
kashyap | I mention these (but never -1!) because "you play like you practise". | 09:42 |
lyarwood | kashyap: if I respin I'll change it to Q35 | 09:43 |
kashyap | lyarwood: Merci | 09:43 |
*** markvoelker has joined #openstack-nova | 09:57 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Validate requested host/node during servers create https://review.opendev.org/661237 | 10:09 |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Add host and hypervisor_hostname flag to create server https://review.opendev.org/645520 | 10:09 |
*** lpetrut has quit IRC | 10:09 | |
*** boxiang_ has quit IRC | 10:10 | |
*** boxiang has joined #openstack-nova | 10:10 | |
*** frankwang has quit IRC | 10:13 | |
*** ttsiouts has quit IRC | 10:15 | |
*** ttsiouts has joined #openstack-nova | 10:16 | |
*** ttsiouts has quit IRC | 10:20 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Use SATA bus for cdrom devices when using Q35 machine type https://review.opendev.org/663011 | 10:21 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM: Run tempest-full-py3 with q35 machine type https://review.opendev.org/662887 | 10:21 |
*** lpetrut has joined #openstack-nova | 10:22 | |
*** dave-mccowan has joined #openstack-nova | 10:26 | |
*** markvoelker has quit IRC | 10:29 | |
*** maciejjozefczyk_ has joined #openstack-nova | 10:30 | |
*** maciejjozefczyk has quit IRC | 10:31 | |
*** ivve has quit IRC | 10:31 | |
openstackgerrit | John Garbutt proposed openstack/nova-specs master: Add Unified Limits Spec https://review.opendev.org/602201 | 10:32 |
*** ivve has joined #openstack-nova | 10:33 | |
mdbooth | lyarwood: FYI you can just recheck the DNM tempest change | 10:35 |
mdbooth | Err, no you can't. Thought you were using Depends-On. | 10:35 |
sean-k-mooney | hehe if it's not using Depends-On currently, adding it should rerun it, right? :) | 10:36 |
mdbooth | sean-k-mooney: Can you Depends-On another change in the same repo? It would be really convenient if you could. | 10:36 |
sean-k-mooney | oh the tempest change is no | 10:36 |
sean-k-mooney | nova | 10:36 |
mdbooth | kashyap: https://review.opendev.org/#/c/663011/6..7/nova/virt/libvirt/blockinfo.py ;) | 10:37 |
sean-k-mooney | it can if there is no merge conflict but in repo you should just rebase on top of the other | 10:37 |
mdbooth | sean-k-mooney: More thinking about octopus merges. | 10:37 |
*** brinzhang has quit IRC | 10:38 | |
kashyap | mdbooth: One moment | 10:38 |
mdbooth | sean-k-mooney: i.e. You've got several otherwise independent patches, then a change which requires all of them. | 10:38 |
sean-k-mooney | mdbooth: yep i have done that before and if there are no merge conflicts it works | 10:38 |
*** frankwang has joined #openstack-nova | 10:38 | |
lyarwood | erm it's a DNM Nova change so nope | 10:38 |
lyarwood | ah sean-k-mooney already said | 10:38 |
mdbooth | lyarwood: And I already noticed ;) | 10:39 |
sean-k-mooney | zuul creates a new commit with all of the dependencies as parents | 10:39 |
sean-k-mooney | so if the merge is possible without conflict then zuul will do it, if not it will mark the patch as in merge conflict | 10:39 |
mdbooth | In that case, lyarwood should be able to use Depends-On for the tempest job change | 10:40 |
sean-k-mooney | yes he can | 10:40 |
sean-k-mooney | and then just recheck and it will pick up the latest version each time | 10:41 |
*** ivve has quit IRC | 10:41 | |
kashyap | mdbooth: Heh, thanks | 10:41 |
sean-k-mooney | mdbooth: for that to work however the tempest change would have to not be based on the other change or zuul will skip the depends-on as it's already satisfied | 10:42 |
mdbooth | sean-k-mooney: It would be academic in this case, tbh. I'm more just interested to know if it works :) I doubt lyarwood would bother on that basis. | 10:43 |
sean-k-mooney | i have used it in the past when i needed a bug fix someone else was writing for my own work but they did not want me to rebase my work on top | 10:44 |
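The Depends-On mechanism discussed above works through a footer line in the commit message that names the change to be tested together with this one. The following is a minimal Python sketch of how such footers could be extracted. It is not Zuul's actual parser: the regular expression, the `find_depends_on` helper, and the pairing of the two review URLs in the sample message are assumptions made for illustration only.

```python
import re
from typing import List

# Hypothetical pattern: one "Depends-On: <change URL or id>" footer per line.
# This is an illustration, not the expression Zuul itself uses.
DEPENDS_ON_RE = re.compile(r"^Depends-On:\s*(\S+)\s*$", re.MULTILINE | re.IGNORECASE)


def find_depends_on(commit_message: str) -> List[str]:
    """Return every Depends-On target found in a commit message."""
    return DEPENDS_ON_RE.findall(commit_message)


# Sample commit message for a DNM test change (contents are illustrative).
message = """DNM: Run tempest-full-py3 with q35 machine type

Depends-On: https://review.opendev.org/663011
"""

dependencies = find_depends_on(message)
```

As described in the conversation, a recheck of a change carrying such a footer would pick up the latest revision of the change it depends on, provided the two can be merged without conflict.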
*** ricolin has quit IRC | 10:47 | |
*** guozijn has quit IRC | 10:49 | |
*** boxiang_ has joined #openstack-nova | 10:54 | |
*** boxiang has quit IRC | 10:57 | |
mdbooth | lyarwood: FYI, don't know if you've been doing this locally but your latest patch just completed a full set of py37 unit tests for me | 11:01 |
sean-k-mooney | mdbooth: the nova compute agent does not work properly under py37 the last time i tried it | 11:01 |
sean-k-mooney | it hangs when you try to boot a vm | 11:01 |
sean-k-mooney | i think eventlet is not working properly | 11:02 |
mdbooth | Which agent? | 11:02 |
sean-k-mooney | nova-compute | 11:02 |
mdbooth | The service? | 11:02 |
sean-k-mooney | yes | 11:02 |
mdbooth | Ah, ok | 11:02 |
*** ttsiouts has joined #openstack-nova | 11:03 | |
sean-k-mooney | it stops producing all log output in the journal and the vm never boots | 11:03 |
mdbooth | sean-k-mooney: It's definitely running on RHEL 8 | 11:03 |
sean-k-mooney | at least that is what i was seeing on fedora 29 | 11:03 |
sean-k-mooney | rhel8 is not using py37 | 11:03 |
sean-k-mooney | or at least i thought it was using py36 | 11:04 |
mdbooth | Obviously | 11:04 |
sean-k-mooney | the unit tests do actually run however on py37 as do the functional tests i believe so it seems to only be an issue when we do actual socket io to connect to libvirt | 11:05 |
sean-k-mooney | to be honest i didnt debug it | 11:05 |
lyarwood | mdbooth: I've been watching it using http://zuul.openstack.org/status and https://github.com/kk7ds/openstack-gerrit-dashboard | 11:09 |
*** ociuhandu has joined #openstack-nova | 11:09 | |
lyarwood | I was just bouncing between unit and functional tests failing for stupid reasons | 11:10 |
lyarwood | should be resolved now | 11:10 |
mdbooth | lyarwood: Ack. | 11:10 |
*** claudiub has joined #openstack-nova | 11:12 | |
*** panda|ruck is now known as panda|ruck|eat | 11:19 | |
*** markvoelker has joined #openstack-nova | 11:26 | |
*** abhishekk has joined #openstack-nova | 11:28 | |
*** guozijn has joined #openstack-nova | 11:29 | |
*** ttsiouts has quit IRC | 11:31 | |
*** ttsiouts has joined #openstack-nova | 11:32 | |
*** ttsiouts has quit IRC | 11:36 | |
aspiers | efried: https://review.opendev.org/#/c/638680/ looking a lot better now, thanks for the recheck | 11:38 |
aspiers | although I do wonder what the hell you were doing awake at that time | 11:38 |
*** maciejjozefczyk_ has quit IRC | 11:39 | |
*** maciejjozefczyk_ has joined #openstack-nova | 11:41 | |
*** xek has quit IRC | 11:49 | |
*** xek has joined #openstack-nova | 11:50 | |
*** markvoelker has quit IRC | 12:00 | |
*** gmann has quit IRC | 12:03 | |
*** ricolin has joined #openstack-nova | 12:03 | |
*** ociuhandu_ has joined #openstack-nova | 12:06 | |
*** ociuhandu has quit IRC | 12:06 | |
*** ttsiouts has joined #openstack-nova | 12:10 | |
mdbooth | artom: Looking at https://review.opendev.org/#/c/644881/19/nova/compute/manager.py | 12:27 |
mdbooth | Which neutron call generates the event when we're using OVS hybrid: setup_networks_on_host or migrate_instance_finish? | 12:27 |
* mdbooth suspects the latter | 12:27 | |
*** jaypipes has joined #openstack-nova | 12:28 | |
*** panda|ruck|eat is now known as panda|ruck | 12:30 | |
sean-k-mooney | migrate_instance_finish calls _update_port_binding_for_instance | 12:30 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L2728 | 12:30 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L3186 | 12:31 |
sean-k-mooney | so _update_port_binding_for_instance triggers the event when the port binding is updated | 12:31 |
mdbooth | sean-k-mooney: Thanks. | 12:31 |
*** frankwang has quit IRC | 12:37 | |
*** rcernin has quit IRC | 12:37 | |
*** priteau has quit IRC | 12:41 | |
sean-k-mooney | mdbooth: i added some clarification to https://review.opendev.org/#/c/644881/19/nova/compute/manager.py | 12:45 |
openstackgerrit | Adam Spiers proposed openstack/nova master: Track inventory for new MEM_ENCRYPTION_CONTEXT resource class https://review.opendev.org/662105 | 12:45 |
mdbooth | sean-k-mooney: Thanks | 12:46 |
sean-k-mooney | although i think the comment artom left in the code is sufficient | 12:46 |
mdbooth | artom: I think you're missing a test. | 12:53 |
*** priteau has joined #openstack-nova | 12:54 | |
kashyap | Goddamned Gerrit utterly ****s up formatting when you try to quote-reply to a long message, that has some indentation. | 12:56 |
* kashyap manually cleans it up | 12:57 | |
*** ttsiouts has quit IRC | 13:01 | |
kashyap | [No it's not possible; curses under my breath and moves on.] | 13:01 |
*** ttsiouts has joined #openstack-nova | 13:01 | |
artom | mdbooth, sorry, caught me during daycare taxi | 13:03 |
artom | Looks like sean-k-mooney answered your question | 13:04 |
artom | mdbooth, sean-k-mooney, thanks for the reviews, will address shortly | 13:04 |
sean-k-mooney | artom: i havent asked for any changes so mine is a noop :) | 13:04 |
*** BjoernT has joined #openstack-nova | 13:05 | |
*** ttsiouts has quit IRC | 13:05 | |
*** eharney has joined #openstack-nova | 13:06 | |
openstackgerrit | Merged openstack/nova master: Change the default of notification_format to unversioned https://review.opendev.org/603079 | 13:10 |
*** mriedem has joined #openstack-nova | 13:10 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM: Run tempest-full-py3 with q35 machine type https://review.opendev.org/662887 | 13:12 |
*** sapd1_x has joined #openstack-nova | 13:14 | |
*** abhishekk has quit IRC | 13:15 | |
artom | lyarwood, well, depends what we want to show with it | 13:15 |
artom | If we just want it to pass with only the IDE/SATA CDROM changes, then yeah | 13:16 |
artom | Once we actually fix the PCIE ports in Nova, then that bit should be removed from the DNM patch | 13:16 |
artom | Err, that was for a question asked in downstream IRC | 13:16 |
*** lbragstad has joined #openstack-nova | 13:17 | |
*** brinzhang has joined #openstack-nova | 13:18 | |
*** ttsiouts has joined #openstack-nova | 13:22 | |
lyarwood | artom: okay so moving back here, testing q35 instead of the default machine type in upstream CI | 13:23 |
lyarwood | artom: I personally think that's fine in one job but we still need to cover the default somewhere | 13:23 |
artom | lyarwood, yeah, that was my feeling as well - we can't add an entire new job given the gate resource constraints, and we should keep testing the default as well | 13:24 |
artom | So where do we stick q35? nova-next? | 13:24 |
kashyap | artom: An email needs to be written to discuss on the list | 13:25 |
kashyap | [nova] Test 'q35' machine type in the Gate | 13:25 |
kashyap | Or something like that | 13:25 |
artom | kashyap, sounds like a plan | 13:25 |
artom | Who's doing the needful? | 13:25 |
kashyap | I can do that. But not now | 13:25 |
kashyap | One of the main reasons is that neither upstream QEMU nor a certain major Linux distro "fixes" bugs in, or adds additional features to, the 'pc' machine type | 13:26 |
kashyap | Not least because it's considered legacy (20+ years old), and difficult to extend | 13:26 |
artom | That's an argument for outright switching the default, no? | 13:27 |
*** _erlon_ has joined #openstack-nova | 13:27 | |
kashyap | It is. We've had threads approximately 240 emails long about it. | 13:27 |
artom | Hah :( | 13:27 |
kashyap | Sorry, by "we" as in QEMU and libvirt. | 13:28 |
kashyap | Some fear for backwards compat, that it "might" break something | 13:28 |
artom | Well, I meant in Nova, but in QEMU works too | 13:28 |
kashyap | But we can't ever be paralyzed | 13:28 |
kashyap | artom: Yes: https://review.openstack.org/#/c/631154/ | 13:28 |
kashyap | "Gracefully handle QEMU machine types for guests" | 13:28 |
artom | kashyap, oh, had no idea that existed, thanks | 13:29 |
kashyap | artom: For now, installer tools are supposed to configure the default. | 13:29 |
artom | Will add to the pile of specs to review | 13:29 |
artom | ... which I guess makes sense (installer configuring the machine type) | 13:30 |
kashyap | That's one reason I put on the back burner for this cycle | 13:30 |
yaawang | johnthetubaguy: mriedem Could you please take a look at auto-converge/post-copy spec? I've updated it. https://review.opendev.org/#/c/651681/ | 13:33 |
*** BjoernT_ has joined #openstack-nova | 13:34 | |
*** spatel has joined #openstack-nova | 13:35 | |
*** egonzalez has left #openstack-nova | 13:35 | |
*** BjoernT has quit IRC | 13:37 | |
*** spatel has quit IRC | 13:39 | |
*** rcernin has joined #openstack-nova | 13:41 | |
*** eharney has quit IRC | 13:41 | |
*** BjoernT_ is now known as BjoernT | 13:47 | |
*** spsurya has quit IRC | 13:55 | |
*** phasespace has quit IRC | 13:57 | |
*** eharney has joined #openstack-nova | 13:57 | |
*** mlavalle has joined #openstack-nova | 14:00 | |
*** eharney has quit IRC | 14:02 | |
*** boxiang has joined #openstack-nova | 14:02 | |
*** BjoernT_ has joined #openstack-nova | 14:05 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Remove global state from the FakeDriver https://review.opendev.org/656709 | 14:05 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Enhance service restart in functional env https://review.opendev.org/512552 | 14:05 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Add functional test coverage for bug 1724172 https://review.opendev.org/512553 | 14:05 |
openstack | bug 1724172 in OpenStack Compute (nova) rocky "Allocation of an evacuated instance is not cleaned on the source host if instance is not defined on the hypervisor" [Medium,Confirmed] https://launchpad.net/bugs/1724172 | 14:05 |
efried | cfriesen: Thanks for the update. | 14:06 |
*** BjoernT has quit IRC | 14:08 | |
aspiers | https://review.opendev.org/#/c/638680/ got a +1 from Zuul finally | 14:09 |
*** dpawlik has quit IRC | 14:10 | |
efried | yup, \o/ | 14:10 |
aspiers | is it just me or are we getting some semi-spam +1 reviews? | 14:10 |
efried | happens all the time | 14:11 |
aspiers | ah ok | 14:11 |
efried | stat padders | 14:11 |
aspiers | :-( | 14:13 |
aspiers | kashyap: funny this should just emerge after we were talking about it the other day https://itnext.io/pyramid-of-doom-the-signs-and-symptoms-of-a-common-anti-pattern-c716838e1819 | 14:14 |
kashyap | Hiya | 14:14 |
* kashyap clicks | 14:15 | |
*** jaosorior has joined #openstack-nova | 14:15 | |
kashyap | aspiers: Hehe, "pyramid of doom" | 14:15 |
*** artom has quit IRC | 14:15 | |
*** eharney has joined #openstack-nova | 14:15 | |
gibi | aspiers: they are coming in batches | 14:16 |
gibi | aspiers: https://www.stackalytics.com/?metric=marks&module=nova-group&company=awcloud | 14:17 |
kashyap | aspiers: On "semi-spam +1s", I wonder if they would dare to do that if it were e-mail | 14:17 |
kashyap | Imagine sending random +1s to e-mail based patch workflows. | 14:17 |
aspiers | right | 14:17 |
*** brinzh has joined #openstack-nova | 14:17 | |
aspiers | perhaps someone in the foundation can have a word? | 14:17 |
kashyap | Each Gerrit change is an island, nobody notices it (beside those who do). So who cares who craps on it | 14:17 |
kashyap | aspiers: This is something we have to live with, as a community-based open source project, I'm afraid. | 14:18 |
aspiers | kashyap: not sure I agree | 14:18 |
aspiers | there's nothing to stop us raising complaints about this behaviour | 14:18 |
kashyap | aspiers: Oh, certainly; there was a recent thread about it on the list, too | 14:18 |
kashyap | (Not sure you noticed it) | 14:18 |
*** brinzhang has quit IRC | 14:19 | |
aspiers | I think I saw it | 14:19 |
lyarwood | mdbooth / kashyap / sean-k-mooney ; https://review.opendev.org/#/c/663011/ is ready for another round of reviews btw | 14:19 |
kashyap | aspiers: I was just implying we can't eliminate this problem entirely. | 14:19 |
aspiers | kashyap: sure, but "This is something we have to live with" sounded more like a statement of resignation | 14:19 |
aspiers | We can probably eliminate 80% of spam reviews with small effort | 14:20 |
kashyap | aspiers: Heh, I let my "phrasing guard" down for a bit there. | 14:20 |
aspiers | It's an odd choice of changes to review they picked | 14:20 |
*** guozijn has quit IRC | 14:21 | |
kashyap | lyarwood: Will look in a bit. | 14:21 |
aspiers | https://www.stackalytics.com/?company=awcloud&metric=marks&release=train <- 25 contributors with 100% approval ratio | 14:21 |
*** bnemec has quit IRC | 14:23 | |
*** bnemec has joined #openstack-nova | 14:25 | |
openstackgerrit | Merged openstack/nova master: conf: Remove cells v1 options, group https://review.opendev.org/651310 | 14:27 |
*** lpetrut has quit IRC | 14:28 | |
*** eharney has quit IRC | 14:35 | |
boxiang | https://review.opendev.org/#/c/649963/ need someone to review it, give me some comments, thanks :) | 14:35 |
mriedem | the pci passthrough whitelist config is not mutable so changing it requires a restart, and that means it would be the same for all computes processed by the scheduler yeah? meaning when we create a HostState object per compute node per scheduling request, we're re-parsing that pci passthrough whitelist spec per node per request, right? https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L227 | 14:36 |
mriedem | if you have several hundred nodes to process per request i'd imagine that eventually adds up | 14:37 |
efried | aspiers: I'm a bit fuzzy yet this morning, but I think there's a hole in https://review.opendev.org/#/c/662105/ - please see comments. | 14:39 |
aspiers | efried: OK | 14:40 |
sean-k-mooney | mriedem: that is done on the compute node but yes | 14:40 |
mriedem | sean-k-mooney: this is done in the scheduler | 14:41 |
sean-k-mooney | the pci whitelist is on each of the compute nodes and can be different | 14:41 |
sean-k-mooney | the alias is parsed on the scheduler | 14:41 |
sean-k-mooney | there was a patch to cache this too at one point but i think it was blocked because people didn't feel it was enough of a performance improvement | 14:42 |
openstackgerrit | Dan Smith proposed openstack/nova master: Make nova-next archive using --before https://review.opendev.org/661002 | 14:42 |
sean-k-mooney | mriedem: https://review.opendev.org/#/c/427145/ | 14:43 |
sean-k-mooney | mriedem: not sure if stephenfin still feels its not worth it | 14:44 |
sean-k-mooney | mriedem: but i certainly tried to get it landed 2 years ago :) | 14:44 |
*** BjoernT_ is now known as BjoernT | 14:47 | |
sean-k-mooney | mriedem: by the way https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L227 is loading the pci info from the database not parsing the whitelist | 14:47 |
*** eharney has joined #openstack-nova | 14:48 | |
sean-k-mooney | well that is not quite true, it's constructing the PciDeviceStats object from the compute.pci_device_pools | 14:48 |
mriedem | https://bugs.launchpad.net/nova/+bug/1831758 | 14:49 |
openstack | Launchpad bug 1831758 in OpenStack Compute (nova) "passthrough_whitelist is parsed per compute node per scheduling request even though the whitelist doesn't change between requests" [Low,Confirmed] | 14:49 |
mriedem | constructing PciDeviceStats parses the whitelist | 14:50 |
sean-k-mooney | the whitelist does not have to be set on the scheduler node | 14:50 |
*** artom has joined #openstack-nova | 14:51 | |
sean-k-mooney | and it can be different per compute node, but you are referring to https://github.com/openstack/nova/blob/master/nova/pci/stats.py#L66-L67 | 14:51 |
openstackgerrit | Adrian Chiris proposed openstack/nova stable/rocky: Move get_pci_mapping_for_migration to MigrationContext https://review.opendev.org/661499 | 14:51 |
openstackgerrit | Adrian Chiris proposed openstack/nova stable/rocky: Allow driver to properly unplug VIFs on destination on confirm resize https://review.opendev.org/661500 | 14:51 |
*** lpetrut has joined #openstack-nova | 14:52 | |
sean-k-mooney | mriedem: i guess that makes sense on the compute node as this class is used in the pci manager but we should not need to do it in the scheduler | 14:52 |
*** eharney has quit IRC | 14:54 | |
*** lpetrut has quit IRC | 14:54 | |
*** lpetrut has joined #openstack-nova | 14:55 | |
sean-k-mooney | mriedem: we probably should add a use_whitelist kwarg or something so we can disable this on the scheduler | 14:55 |
*** factor has quit IRC | 14:55 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM dump image_meta when attempting to validate image signatures https://review.opendev.org/663348 | 14:55 |
*** icarusfactor has joined #openstack-nova | 14:55 | |
sean-k-mooney | the pci whitelist should only be used in the compute node and the alias should only be used on the api or scheduler nodes. | 14:56 |
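The `use_whitelist` kwarg idea floated above could look roughly like the following Python sketch. `PciDeviceStatsSketch` is a hypothetical stand-in rather than Nova's real `PciDeviceStats` class, and the JSON shape of the whitelist entries is assumed for illustration. The point is only that a caller which never consults the whitelist can skip parsing it.

```python
import json


class PciDeviceStatsSketch:
    """Hypothetical stand-in for a PCI stats object; not Nova's real class."""

    def __init__(self, pools, raw_whitelist=(), use_whitelist=True):
        self.pools = list(pools)
        # Only pay the parsing cost where the whitelist is actually consulted
        # (the compute node); a scheduler-side caller can opt out.
        if use_whitelist:
            self.whitelist = [json.loads(spec) for spec in raw_whitelist]
        else:
            self.whitelist = None


# Illustrative data; the field names are assumptions for this sketch.
pools = [{"vendor_id": "8086", "product_id": "154d", "count": 2}]
raw = ['{"vendor_id": "8086", "product_id": "154d"}']

compute_side = PciDeviceStatsSketch(pools, raw)
scheduler_side = PciDeviceStatsSketch(pools, raw, use_whitelist=False)
```

Defaulting `use_whitelist` to `True` keeps existing callers unchanged in this sketch, so only the scheduler path would need to opt out.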
*** itlinux has joined #openstack-nova | 14:57 | |
*** itlinux has quit IRC | 14:59 | |
*** sapd1_x has quit IRC | 14:59 | |
efried | can we not save the date stamp of the file? | 15:00 |
efried | or use fsnotify (or whatever that thing is called)? | 15:00 |
*** boxiang has quit IRC | 15:03 | |
*** JamesBenson has joined #openstack-nova | 15:04 | |
*** brinzh has quit IRC | 15:06 | |
*** lennyb has joined #openstack-nova | 15:07 | |
mriedem | we're parsing from the config option value which is not mutable | 15:07 |
*** eharney has joined #openstack-nova | 15:07 | |
sean-k-mooney | mriedem: and as i mentioned on the bug the config value is only used on the compute node and not in the scheduler | 15:08 |
efried | oh, sorry, yeah, I was thinking of something quite different. Maybe I need more coffee | 15:08 |
mriedem | sean-k-mooney: i would not doubt that config mgmt tooling - if it's setting it - is setting it everywhere because people wouldn't know which services care about it | 15:09 |
mriedem | sean-k-mooney: nor do you know what out of tree filters could be doing with that thing in the HostState object | 15:09 |
sean-k-mooney | mriedem: well the value can and typically is different per compute node | 15:09 |
sean-k-mooney | so if they are doing anything with the config value it's probably wrong | 15:10 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: nova-manage: heal port allocations https://review.opendev.org/637955 | 15:10 |
mriedem | yeah that's true | 15:10 |
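The cost raised in this exchange, re-parsing an immutable config value once per compute node per scheduling request, is the kind of work that memoization removes. Below is a minimal Python sketch assuming the raw whitelist is a list of JSON strings. `parse_whitelist`, `HostStateSketch`, and the `PARSE_CALLS` counter are hypothetical names for illustration, not Nova code, and this is not the approach taken in the patch linked above.

```python
import functools
import json

PARSE_CALLS = 0  # counter used only to show how often parsing really happens


@functools.lru_cache(maxsize=None)
def parse_whitelist(raw_specs):
    """Parse a tuple of JSON whitelist entries once per distinct value."""
    global PARSE_CALLS
    PARSE_CALLS += 1
    return tuple(json.loads(spec) for spec in raw_specs)


class HostStateSketch:
    """Hypothetical per-node object built for every scheduling request."""

    def __init__(self, name, raw_specs):
        self.name = name
        # lru_cache needs a hashable key, hence the tuple of strings.
        self.whitelist = parse_whitelist(tuple(raw_specs))


raw = ['{"vendor_id": "8086", "product_id": "154d"}']
hosts = [HostStateSketch("node-%d" % i, raw) for i in range(300)]
```

Because the option is not mutable at runtime, the cached result in this sketch stays valid until the service restarts. The cached dictionaries are shared between all host objects, so callers would have to treat them as read-only.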
mriedem | tssurya: why do you have a -1 on https://review.opendev.org/#/c/623558/ ? | 15:12 |
mriedem | it looks like from your numbers the patch makes a noticeable improvement in scheduling time, | 15:13 |
mriedem | and the services query stuff could be optimized separately later | 15:13 |
tssurya | mriedem: I don't remember exactly but I think I also didn't understand some parts of the _filter_enabled_cell_computes function | 15:15 |
tssurya | also I am pretty sure we have a debug per cell for https://review.opendev.org/#/c/623558/2/nova/scheduler/host_manager.py@790 currently in our deployment which should be fixed. | 15:16 |
tssurya | but yeah could be done in a FUP | 15:16 |
mriedem | debug per cell? | 15:18 |
mriedem | so you're running with this patch in production already? | 15:18 |
mriedem | but changed that warning to a debug level message? | 15:18 |
tssurya | yeah | 15:18 |
mriedem | :/ | 15:18 |
tssurya | wait no | 15:18 |
*** brinzhang has joined #openstack-nova | 15:18 | |
tssurya | I mean we are running it in production already | 15:19 |
tssurya | we are still running it as warning | 15:19 |
tssurya | what I meant was it causes a lot of noise | 15:19 |
mriedem | that would be good feedback on the change (that you're running it in prod) and since when | 15:19 |
*** panda|ruck is now known as panda | 15:19 | |
tssurya | because it prints a message per cell | 15:19 |
tssurya | per scheduling | 15:19 |
tssurya | saying it didn't find anything | 15:19 |
tssurya | so removing that cell | 15:19 |
mriedem | sure, so -1 for that i guess, | 15:19 |
mriedem | but to say "not sure what this does but we've been running with it in prod successfully for months now" isn't really justification for a -1 :) | 15:20 |
tssurya | mriedem: the -1 on the patch was for the extra improvement | 15:20 |
tssurya | fo sure | 15:20 |
tssurya | for* | 15:20 |
mriedem | which extra improvement? the services query? | 15:21 |
tssurya | and because I wasn't convinced what _filter_enabled_cell_computes was doing | 15:21 |
tssurya | mriedem: yes | 15:21 |
tssurya | but I am okay if that needs to be done in a separate patch | 15:21 |
mriedem | ok that's FUP worthy, since this change is already big | 15:21 |
tssurya | plus its ok for us if we don't backport this | 15:21 |
mriedem | did jay answer your questions about _filter_enabled_cell_computes ? | 15:21 |
mriedem | because if not, i wouldn't wait for him since he's gone | 15:22 |
tssurya | mriedem: no | 15:22 |
tssurya | I can remove the -1 | 15:22 |
mriedem | a todo could be added for the services query | 15:22 |
tssurya | also fyi this patch works with https://review.opendev.org/#/c/635532/ and both together give us good performance | 15:22 |
tssurya | belmiro should be putting a blog post soon | 15:23 |
tssurya | so we will link the post to this review | 15:23 |
*** derekh has quit IRC | 15:23 | |
*** derekh has joined #openstack-nova | 15:23 | |
openstackgerrit | Kashyap Chamarthy proposed openstack/nova master: Document mitigation for Intel MDS security flaws https://review.opendev.org/661574 | 15:25 |
*** derekh has quit IRC | 15:25 | |
*** rcernin has quit IRC | 15:26 | |
*** liuyulong has quit IRC | 15:26 | |
mriedem | tssurya: what is your max_placement_results value now? | 15:26 |
*** derekh has joined #openstack-nova | 15:26 | |
*** brinzhang has quit IRC | 15:26 | |
*** gyee has joined #openstack-nova | 15:27 | |
*** jaosorior has quit IRC | 15:27 | |
tssurya | mriedem: 10 still | 15:29 |
*** derekh has quit IRC | 15:29 | |
*** derekh has joined #openstack-nova | 15:29 | |
mriedem | heh | 15:29 |
*** derekh has quit IRC | 15:31 | |
*** derekh has joined #openstack-nova | 15:31 | |
*** ttsiouts has quit IRC | 15:32 | |
*** ttsiouts has joined #openstack-nova | 15:33 | |
*** frankwang has joined #openstack-nova | 15:33 | |
*** derekh has quit IRC | 15:34 | |
*** derekh has joined #openstack-nova | 15:34 | |
*** tssurya has quit IRC | 15:36 | |
*** frankwang has quit IRC | 15:38 | |
*** ttsiouts has quit IRC | 15:38 | |
*** helenafm has quit IRC | 15:39 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Raise if flavor and image disagree on hide_hypervisor_id https://review.opendev.org/663365 | 15:46 |
efried | mriedem: Can we +W https://review.opendev.org/#/c/579897/ now? | 15:47 |
mriedem | jesus look at all of the new review comments i have to parse | 15:51 |
efried | only since your last +2 | 15:53 |
efried | kashyap: you still around? | 15:53 |
mriedem | ok i have it in a tab but in the middle of something | 15:53 |
dansmith | in case you're wondering, | 15:53 |
dansmith | I hate it. | 15:53 |
efried | kashyap: I'd be happy to fix up those couple of typos in https://review.opendev.org/#/c/661574/ if you like. | 15:53 |
*** Sundar has joined #openstack-nova | 15:54 | |
efried | dansmith: necessary evil | 15:54 |
dansmith | not really | 15:54 |
openstackgerrit | Eric Fried proposed openstack/nova master: Document mitigation for Intel MDS security flaws https://review.opendev.org/661574 | 15:55 |
*** itlinux has joined #openstack-nova | 15:56 | |
efried | kashyap: +2 ^ thanks for your patience | 15:56 |
*** _erlon_ has quit IRC | 15:57 | |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: add scope check, see tests catch the change https://review.opendev.org/657823 | 15:59 |
*** dtantsur is now known as dtantsur|afk | 16:01 | |
*** panda is now known as panda|off | 16:02 | |
*** markvoelker has joined #openstack-nova | 16:04 | |
jaypipes | mriedem: for the record, I haven't died or anything. :P | 16:08 |
*** rpittau is now known as rpittau|afk | 16:08 | |
mriedem | jaypipes: i know you're not dead, but i also assume you don't want/need to be pinged for stuff you're not going to have the time to work on | 16:09 |
jaypipes | mriedem: I'd chatted with Surya about that patch series in Denver and stated I'd rebase, fix merge conflicts and push them, which I did. I also stated I wouldn't have time to shepherd them through to completion and she mentioned she would do that. | 16:10 |
mdbooth | lyarwood: Question in your q35 patch: Any reason we can't be explicit about q35 machine type? Are there an unmanageable number of q35 machine types or something? | 16:10 |
jaypipes | mriedem: I'm around, but as you say, not really time to spend. I'm still friendly, though. :) | 16:10 |
mriedem | jaypipes: ok i didn't mean any offense | 16:11 |
mriedem | just trying to not distract you | 16:11 |
jaypipes | no offense taken at all! | 16:11 |
lyarwood | mdbooth: so yeah there are a few and even more downstream iirc | 16:13 |
lyarwood | mdbooth: $ qemu-kvm --machine help | grep q35 | wc -l | 16:14 |
mdbooth | lyarwood: Can you enumerate the problem? | 16:14 |
lyarwood | mdbooth: 11 | 16:14 |
mdbooth | That is... quite annoying | 16:14 |
lyarwood | mdbooth: We should really but that's additional work outside of this bugfix IMHO | 16:15 |
mdbooth | Worthy of a comment? | 16:15 |
*** damien_r has quit IRC | 16:15 | |
lyarwood | mdbooth: sure | 16:15 |
edleafe | jaypipes: "still" friendly? :-P | 16:18 |
jaypipes | edleafe: :( | 16:19 |
mdbooth | artom: You on top of review comments from dansmith ? | 16:19 |
artom | mdbooth, working on it, yeah | 16:19 |
kashyap | efried: Was out for a bike ride; now here for a bit more. | 16:25 |
kashyap | efried: Ah, you've fixed up the typos; thank you! | 16:27 |
*** lpetrut has quit IRC | 16:28 | |
openstackgerrit | Adrian Chiris proposed openstack/nova stable/queens: Move get_pci_mapping_for_migration to MigrationContext https://review.opendev.org/661571 | 16:28 |
openstackgerrit | Adrian Chiris proposed openstack/nova stable/queens: Allow driver to properly unplug VIFs on destination on confirm resize https://review.opendev.org/661572 | 16:28 |
kashyap | stephenfin: O Docs Aficionado, when you can, please deliver this from its misery: https://review.opendev.org/#/c/661574 | 16:31 |
stephenfin | kashyap: Might be better let alex_xu grab it since he's taken a look a few times already | 16:33 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Delete unused get_all_host_states method https://review.opendev.org/663377 | 16:33 |
kashyap | Oh, right. I totally forgot that; #bumblebee's-memory-7-seconds | 16:33 |
*** panda|off has quit IRC | 16:33 | |
kashyap | mdbooth: On the number of machine types -- yes, *each* QEMU release comes with a versioned machine type | 16:34 |
kashyap | Why versioned machine type? They preserve guest ABI. | 16:34 |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: Pass allocations to virt drivers when resizing https://review.opendev.org/589085 | 16:35 |
*** panda has joined #openstack-nova | 16:35 | |
kashyap | mdbooth: And the "naked" 'q35' aliases to the latest versioned machine type. | 16:35 |
*** xek has quit IRC | 16:37 | |
*** panda has quit IRC | 16:39 | |
*** aram1s has joined #openstack-nova | 16:41 | |
sean-k-mooney | kashyap: the versioned machine types also create vendor lock in as they are different between distros and make it hard to migrate from one to another in some cases | 16:41 |
sean-k-mooney | so there are pros and cons. i prefer just q35 but there are reasons to use the versioned form too | 16:42 |
kashyap | I wouldn't call it "vendor lock-in". Each distribution is *supposed* to pick machine types as they see fit. | 16:42 |
kashyap | It's like saying you can install DEBs on an RPM-based system. | 16:42 |
sean-k-mooney | it kind of is when each distro adds the name of the distro into the machine type | 16:43 |
sean-k-mooney | kashyap: well you can | 16:43 |
sean-k-mooney | using alien | 16:43 |
kashyap | Yeah, unfortunately, given the "there are more number of Linux distros than Linux users", we have to live with it... | 16:43 |
sean-k-mooney | but its more complicated | 16:43 |
*** bnemec has quit IRC | 16:44 | |
sean-k-mooney | kashyap: it would be nice if we just had a propatable set defined by qemu or libvirt that all distros used | 16:44 |
kashyap | Sorry, what is a "propatable"? | 16:44 |
sean-k-mooney | portable | 16:45 |
sean-k-mooney | or interoperable | 16:45 |
kashyap | Aaah, I don't know how feasible that is; need to think with a fresh brain :-) | 16:45 |
kashyap | Alright, need to go cook make food. See ya | 16:45 |
*** kashyap has quit IRC | 16:46 | |
sean-k-mooney | i think recent versions of libvirt or qemu have got better at checking compatibility if you use the alias e.g. "q35" | 16:46 |
sean-k-mooney | but i dont know if it will allow q35-centos... to migrate to q35-rhel... even if they are the same | 16:47 |
sean-k-mooney | i know it did not work between centos and ubuntu in the past but if you specifically set them to the same machine type it did | 16:48 |
*** panda has joined #openstack-nova | 16:48 | |
* sean-k-mooney goes to get food | 16:49 | |
aram1s | Hi there! Does anyone know what the 'state' represents for hypervisors? I thought it reflected the state of nova compute service but when I disable it I still see it as up | 16:50 |
aram1s | how can I make it go down? | 16:50 |
aram1s | without explicitly setting it as down. Will it chance based on any other status? | 16:51 |
aram1s | change* | 16:51 |
*** itlinux has quit IRC | 16:56 | |
*** itlinux has joined #openstack-nova | 16:59 | |
melwitt | aram1s: status will also change if you for example, stop the nova compute service | 17:00 |
*** derekh has quit IRC | 17:00 | |
melwitt | other than that, there's the forced_down API https://developer.openstack.org/api-ref/compute/?expanded=update-forced-down-detail#update-forced-down | 17:00 |
stephenfin | mriedem: Could you drop your -2 from these given my comments inline? https://review.opendev.org/#/c/662501/ https://review.opendev.org/#/c/662502/ | 17:01 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Unplug VIFs as part of cleanup of networks https://review.opendev.org/663382 | 17:03 |
*** itlinux has quit IRC | 17:04 | |
*** xek has joined #openstack-nova | 17:05 | |
mriedem | stephenfin: of course the ec2 objects aren't used outside of nova https://review.opendev.org/#/c/662502/ | 17:08 |
mriedem | the ec2 API shim within nova is what's going to be used by the ec2api code | 17:08 |
mriedem | which uses those objects | 17:08 |
stephenfin | It doesn't though | 17:08 |
mriedem | the ec2utils stuff might be different | 17:08 |
*** xek has quit IRC | 17:08 | |
stephenfin | Not used anywhere outside of nova and not used inside nova either | 17:09 |
*** xek has joined #openstack-nova | 17:09 | |
*** itlinux has joined #openstack-nova | 17:09 | |
stephenfin | At least not once I've removed the unused functions from ec2utils | 17:09 |
mriedem | i'd feel a lot more comfortable with this if we could get a tempest run on the ec2api with a dependency on this series | 17:10 |
stephenfin | I can try figure out how to do that | 17:10 |
* stephenfin hopes the ec2api zuul config is set up for a Depends-On | 17:10 | |
mriedem | it should be as easy as adding openstack/nova to https://github.com/openstack/ec2-api/blob/master/.zuul.yaml#L7 and adding a depends-on to your series | 17:11 |
*** JamesBenson has quit IRC | 17:16 | |
aram1s | thanks melwitt! | 17:16 |
*** JamesBen_ has joined #openstack-nova | 17:17 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: ec2: Remove unused functions from 'ec2utils' https://review.opendev.org/662501 | 17:17 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Remove unused ec2 objects https://review.opendev.org/662502 | 17:17 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: ec2: Remove ec2.CloudController https://review.opendev.org/662503 | 17:17 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: ec2: Pre-move cleanup of utils https://review.opendev.org/662504 | 17:17 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: ec2: Move ec2utils functions to their callers https://review.opendev.org/662505 | 17:17 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: api: Remove 'Debug' middleware https://review.opendev.org/662506 | 17:17 |
stephenfin | mriedem: Thanks for the tip. We'll see how this goes https://review.opendev.org/663386 | 17:19 |
* stephenfin -> 🏃 | 17:19 | |
*** ralonsoh has quit IRC | 17:28 | |
*** _hemna has joined #openstack-nova | 17:31 | |
*** ociuhandu has joined #openstack-nova | 17:33 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Convert HostMapping.cells to a dict https://review.opendev.org/663387 | 17:34 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Cache host to cell mapping in HostManager https://review.opendev.org/663388 | 17:34 |
*** ociuhandu_ has quit IRC | 17:36 | |
artom | I feel like I've messed up my method if I need to mock like 42 things when unit testing it :( | 17:38 |
mriedem | your method is too big | 17:38 |
artom | That's never been a problem before | 17:38 |
*** ociuhandu has quit IRC | 17:39 | |
*** maciejjozefczyk_ has quit IRC | 17:46 | |
*** zul has quit IRC | 17:47 | |
*** Li_Liu has joined #openstack-nova | 17:49 | |
edleafe | artom: generally a "unit" doesn't have 42 dependencies. | 17:50 |
artom | 42 is obviously an exaggeration, but yeah, this bit of code is heavily coupled to a lot of things | 17:51 |
*** xek has quit IRC | 17:54 | |
edleafe | artom: yeah, that's more of an integration test then. | 17:54 |
*** tesseract has quit IRC | 17:57 | |
* artom tries to see if he can do a functional test instead | 17:58 | |
*** _hemna has quit IRC | 18:02 | |
artom | I probably can't, because external events | 18:10 |
artom | Pretty sure the NeutronFixture doesn't handle those ;) | 18:10 |
artom | I suppose I can mock that part out... | 18:10 |
dansmith | artom: you understand that the concern is just that you didn't test that those parts are at all connected, and since you don't have a gate test to make mriedem happy, there really needs to be confirmation that they work together, right? | 18:11 |
dansmith | I mean, I'm pretty sure I could have removed the actual change you made and no tests would fail | 18:11 |
artom | dansmith, right, IOW, we need to test that if _uses_hybrid_plug returns true, we actually wait for events in the compute manager | 18:12 |
dansmith | at a minimum, I want to see a list of vif network_info structure passed into that method, and have the method not wait for events because it calls the helper and it examines the vif structures properly (and one for the opposite case if not already covered elsewhere) | 18:12 |
dansmith | artom: that's one piece yes | 18:13 |
dansmith | I'm totally uninterested in discussing the meaning of "unit" and how it relates to this, and only really interested in seeing that it works | 18:14 |
artom | dansmith, what's the other piece? By mocking _uses_hybrid_plug to return False, it's testing that the existing code paths continue to work | 18:14 |
dansmith | artom: the other piece being that you actually called the helper in a compatible way.. if all you do is check that the thing called your mock with who-knows-what and the mock always returns true or false, that doesn't mean it actually works when the parent calls the actual child implementation | 18:15 |
dansmith | by testing only units against mocks, you don't actually confirm that they're compatible with each other | 18:16 |
artom | dansmith, you just put words on why I'm not liking my current mock-42-things approach | 18:16 |
dansmith | the world can't be reduced to tiny methods with only two external interactions each, so sometimes things get messy | 18:17 |
dansmith | I'm sure that method is too complex and refactoring won't help for backportability, so.. reality strikes | 18:18 |
artom | Stupid reality | 18:18 |
*** Sundar has quit IRC | 18:21 | |
*** damien_r has joined #openstack-nova | 18:21 | |
*** zul has joined #openstack-nova | 18:23 | |
edleafe | artom: it does sound like you will be better off with a functional test that shows those parts working together as intended. Otherwise, you're just testing the python mock library :) | 18:23 |
*** damien_r has quit IRC | 18:26 | |
*** weshay has quit IRC | 18:26 | |
*** weshay has joined #openstack-nova | 18:27 | |
sean-k-mooney | dansmith: i just replied to your question on the other move operations https://review.opendev.org/#/c/644881/19/nova/compute/manager.py@4192 i belive they are safe but it would be good for someone else to double check my logic | 18:31 |
dansmith | sean-k-mooney: I bet mriedem has thoughts on that | 18:31 |
sean-k-mooney | do we allow evacuate to be reverted by the way? | 18:32 |
*** itlinux has quit IRC | 18:32 | |
sean-k-mooney | i did not think so but honestly i have never looked | 18:32 |
dansmith | unfortunately we do, within a window | 18:33 |
sean-k-mooney | ok so that is one case that probably needs to be handled | 18:33 |
artom | Wait, evacuate can be reverted? | 18:36 |
artom | Also I just realized that the bug my patch links to is probably irrelevant :/ | 18:36 |
sean-k-mooney | thats the bug the waiting behavior was added to fix | 18:37 |
sean-k-mooney | i think | 18:37 |
dansmith | artom: not revert in the conventional sense, | 18:37 |
artom | sean-k-mooney, I mean https://bugs.launchpad.net/nova/+bug/1813789 | 18:37 |
openstack | Launchpad bug 1813789 in OpenStack Compute (nova) "Evacuate test intermittently fails with network-vif-plugged timeout exception" [Medium,In progress] - Assigned to Artom Lifshitz (notartom) | 18:37 |
sean-k-mooney | artom: yes i know | 18:38 |
dansmith | but if you start an evacuation, it fails, and then the original source comes back alive, you can keep it there, barring bugs and timing and all kinds of crazy | 18:38 |
artom | Even if the symptoms are the same, it has to be a different root cause, because upstream doesn't use hybrid plug | 18:38 |
sean-k-mooney | artom: that was the bug we introduced the wait on revert to fix | 18:38 |
dansmith | which I wish wasn't the case | 18:38 |
artom | dansmith, that couldn't have been on purpose... | 18:38 |
sean-k-mooney | dansmith: im assuming that only works if we have not updated the instance host yet | 18:39 |
dansmith | artom: nothing about evacuate was done with a great sense of purpose, but it was decided not to make it explicitly disallowed (to stay put) | 18:39 |
sean-k-mooney | e.g. we failed before that point | 18:39 |
dansmith | personally I'd prefer that the instant you start an evac, the instance _has_ to move somewhere | 18:39 |
*** kaiokmo has quit IRC | 18:39 | |
sean-k-mooney | and it just happened to work if the host became alive again? | 18:39 |
dansmith | sean-k-mooney: yeah | 18:39 |
sean-k-mooney | ok so its just happening because we dont keep retrying to move the instance after the failure | 18:40 |
*** kaiokmo has joined #openstack-nova | 18:40 | |
sean-k-mooney | and when it fails we dont update the db | 18:40 |
*** ricolin has quit IRC | 18:41 | |
dansmith | it depends on when we fail | 18:41 |
dansmith | .of course | 18:41 |
dansmith | which is why it seems broken to me, | 18:41 |
sean-k-mooney | so i probably know the answer to this but do we really care about that edgecase? | 18:42 |
dansmith | especially because HA stuff that has made a decision to rebuild the instance is easier knowing it has fenced the original node, etc | 18:42 |
sean-k-mooney | i mean we probably should but other stuff is probably broken at that point | 18:42 |
dansmith | sean-k-mooney: IIRC, mriedem (and maybe others) really want to keep the "can keep it on the source if it's not too late" behavior | 18:42 |
sean-k-mooney | dansmith: no im kind of ok with that | 18:43 |
sean-k-mooney | what i meant was for artom's patch | 18:43 |
dansmith | oh, yeah, I think that's worth investigating and fixing if it's related | 18:43 |
sean-k-mooney | do we need to add in extra code to handle that case | 18:43 |
artom | sean-k-mooney, the evacuate case? Maybe in a different patch, but why put it in mine? | 18:44 |
sean-k-mooney | im trying to think how we would even fake that out in a functional test | 18:44 |
*** kaiokmo has quit IRC | 18:44 | |
artom | dansmith, so, turns out I was being too paranoid too soon about mocking things, and it actually works out pretty clean | 18:45 |
sean-k-mooney | artom: just because dansmith asked about the other move operations in yours | 18:45 |
sean-k-mooney | but it could be in another patch | 18:45 |
dansmith | sean-k-mooney: and when mriedem reviews it, that's going to be one of the first things he says :) | 18:45 |
sean-k-mooney | well i dont think artoms patch makes that case any worse but it also might not fix things in that case | 18:46 |
sean-k-mooney | assuming its currently broken in that case | 18:46 |
dansmith | it's a matter of making sure we don't re-have this conversation in a year when someone reports it for evacuate, and maybe fix it a different way | 18:47 |
sean-k-mooney | ya thats fair | 18:47 |
dansmith | also, I think hard reboot used to wait, now doesn't, potentially because we had this convo already and didn't go deep enough | 18:47 |
dansmith | https://review.opendev.org/#/c/553035/ | 18:47 |
artom | But... I thought the whole thing here is that hybrid plug is special in how it wires vifs | 18:48 |
dansmith | heh, about a year ago | 18:48 |
sean-k-mooney | well not quite | 18:48 |
artom | And resize is special in that it leaves in the instance paused on the source until confirm or revert | 18:48 |
sean-k-mooney | some network backends only send events on port bind and we dont bind the ports on hard reboot or rebuild | 18:48 |
dansmith | no, it's not paused on the source | 18:48 |
dansmith | it's off | 18:48 |
artom | So.. how would that be applicable for evacuate, when the instance was never on the dest to begin with? | 18:48 |
artom | dansmith, sorry, off (aka destroyed) | 18:48 |
artom | sean-k-mooney, ah, so some backends might send the event as soon as we change the port binding to a host, regardless if it's wired or not? | 18:49 |
sean-k-mooney | artom: yes | 18:49 |
artom | I'd say that's a bug in those backends | 18:49 |
sean-k-mooney | it was by design | 18:50 |
artom | It shouldn't be sending the event if the thing isn't actually plugged/wired | 18:50 |
sean-k-mooney | not all networking backends can support that or at least could back when this was first introduced | 18:50 |
artom | That doesn't invalidate my argument :) | 18:50 |
sean-k-mooney | odl took 2 years to catch up and introduce a way for odl to send events to neutron which neutron could send to nova | 18:51 |
artom | The idea was for Nova to be sure we have networking for instance by the time it spawns | 18:51 |
artom | So things like DHCP work | 18:51 |
artom | If we get the event before that's happened, that's not for Nova to fix | 18:51 |
sean-k-mooney | yep i had this argument with the odl folk back in 2014 | 18:51 |
artom | Because we'd be acting like networking is ready, when in reality it isn't | 18:51 |
artom | (yey, 5 years late) | 18:52 |
sean-k-mooney | for what it's worth im pretty sure networking-aci still does not send it when the port is wired up and sends it on bind | 18:52 |
sean-k-mooney | and ovn is not doing the right thing either in all cases | 18:52 |
artom | The Cisco thing? | 18:52 |
sean-k-mooney | yes | 18:53 |
artom | That's just one more argument for people not to buy Cisco ;) | 18:53 |
mriedem | "IIRC, mriedem (and maybe others) really want to keep the "can keep it on the source if it's not too late" behavior" | 18:53 |
mriedem | i have no opinion on that | 18:53 |
mriedem | that i'm aware of | 18:53 |
mriedem | i'm also assuming i don't want to cruft up an already complicated change to cover other cases in the same patch if we're doing backports | 18:54 |
mriedem | b/c there is a >0% chance this will introduce a regression somehow | 18:54 |
mriedem | b/c this is all very brittle sounding | 18:54 |
mriedem | and super duper latent | 18:54 |
mriedem | how would one even recreate this hybrid ovs resize thing in the gate? configure devstack for that backend (is that easy?) and then put a sleep in the nova code between migrate_instance_start(finish?) and the call to the virt driver? | 18:55 |
artom | Like osteoporosis | 18:55 |
sean-k-mooney | mriedem: we have patches to recreate. you configure ovs of iptables and then we can race on the revert tests | 18:56 |
sean-k-mooney | *ovs to use the iptables firewall driver | 18:56 |
artom | https://review.opendev.org/#/c/660782/5 specifically | 18:56 |
sean-k-mooney | our downstream ci defaults to iptables firewall still on older releases so that is why we are seeing it downstream. upstream uses the openvswitch conntrack firewall driver by default | 18:57 |
mriedem | why does https://review.opendev.org/#/c/653498/ come after the fix in the series? | 18:58 |
artom | mriedem, because without hybrid plug it's actually irrelevant to the fix :/ | 18:59 |
artom | Well | 18:59 |
artom | I suppose it tests the non-hybrid-plug case | 18:59 |
mriedem | correct | 19:00 |
mriedem | you said in the ML it was the only job we run that is multi-node for the resize revert stuff right? | 19:00 |
mriedem | or would be | 19:00 |
artom | Yeah | 19:00 |
artom | I can move it below | 19:01 |
*** ociuhandu has joined #openstack-nova | 19:02 | |
mriedem | you know, | 19:02 |
sean-k-mooney | artom: actually i think you used to have it below and inverted it in a recent revision | 19:02 |
mriedem | test_server_connectivity_cold_migration_revert in the tempest-slow-py3 job would run resize revert in a multi-node job, | 19:02 |
mriedem | if you removed the skip on the test | 19:02 |
mriedem | and the test is skipped b/c of https://bugs.launchpad.net/nova/+bug/1788403 | 19:02 |
openstack | Launchpad bug 1788403 in OpenStack Compute (nova) "test_server_connectivity_cold_migration_revert randomly fails ssh check" [Medium,Fix released] - Assigned to Matt Riedemann (mriedem) | 19:02 |
mriedem | which sounds very familiar to your issue yeah? | 19:02 |
*** itlinux has joined #openstack-nova | 19:03 | |
artom | It would, except again, upstream gate doesn't use hybrid plug, so it's a different root cause | 19:03 |
dansmith | mriedem: you emotionally scarred me with discussion of that opinion in the past, which is why I remember it | 19:03 |
mriedem | dansmith: logs or it never happened! | 19:04 |
dansmith | mriedem: and yeah, not saying fixing the other operations needs to be in this patch, but if we're undoing not only this wait, but also making previous ones (like the hard reboot) no longer accurate, I think fixing them all in the same way (re-using helpers if they apply, etc) is a good idea | 19:04 |
dansmith | mriedem: I can show you the scars. those' | 19:04 |
dansmith | re good enough yeah? | 19:04 |
mriedem | pictures or it didn't happen! | 19:04 |
* dansmith faxes mriedem copies of his therapy bills | 19:04 | |
artom | Faxes? Damn boomers | 19:05 |
*** priteau has quit IRC | 19:06 | |
dansmith | I know you didn't just confuse me with someone of baby boomer age | 19:06 |
sean-k-mooney | mriedem: https://bugs.launchpad.net/nova/+bug/1788403 is marked as fix released and i think you fixed it by introducing the waits on revert, correct? so the current skip of test_server_connectivity_cold_migration_revert probably should have been removed a while ago | 19:07 |
openstack | Launchpad bug 1788403 in OpenStack Compute (nova) "test_server_connectivity_cold_migration_revert randomly fails ssh check" [Medium,Fix released] - Assigned to Matt Riedemann (mriedem) | 19:07 |
*** priteau has joined #openstack-nova | 19:08 | |
*** ociuhandu has quit IRC | 19:09 | |
* sean-k-mooney get popcorn to watch how that plays out | 19:09 | |
*** amodi has joined #openstack-nova | 19:11 | |
*** panda has quit IRC | 19:19 | |
*** panda has joined #openstack-nova | 19:20 | |
sean-k-mooney | alex_xu: yonglihe i think https://review.opendev.org/#/c/658716/ looks fine. sorry for not getting to it yesterday | 19:21 |
mriedem | artom: dansmith: sean-k-mooney: i've dumped a bit in https://review.opendev.org/#/c/644881/ - i'm going to concentrate on something else now | 19:21 |
sean-k-mooney | now to go review the virtual persistent memory spec | 19:21 |
artom | mriedem, thank you for your dump :D | 19:22 |
mriedem | sean-k-mooney: yeah i was wondering why i didn't remove the skip on test_server_connectivity_cold_migration_revert | 19:22 |
*** lpetrut has joined #openstack-nova | 19:23 | |
sean-k-mooney | mriedem: is the skip in tempest or in nova out of interest? | 19:24 |
mriedem | tempest | 19:24 |
mriedem | i'll post a tempest change to remove the skip and see what happens | 19:24 |
sean-k-mooney | ah so its the skip decorator | 19:24 |
sean-k-mooney | i would expect it to pass since we default to the ovs firewall | 19:25 |
sean-k-mooney | i dont think neutron has a multinode iptables job or any project currently so i doubt we will see failures | 19:25 |
sean-k-mooney | but i could be wrong about that | 19:26 |
*** lpetrut has quit IRC | 19:27 | |
mriedem | https://review.opendev.org/663405 | 19:28 |
sean-k-mooney | i need to go stack a multi node env so ill see if i can run that locally too | 19:30 |
sean-k-mooney | efried: haha you got to it before i did | 19:35 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Drop pre-cinder 3.44 version compatibility https://review.opendev.org/621061 | 19:36 |
sean-k-mooney | efried: i think we have some issues with those db unit tests randomly failing | 19:36 |
sean-k-mooney | its not the first time i have seen them fail on py36 and pass on py27 and py37 | 19:37 |
efried | sean-k-mooney: what are we talking about? | 19:37 |
sean-k-mooney | the hide hypervisor id patch | 19:38 |
efried | sean-k-mooney: https://bugs.launchpad.net/nova/+bug/1823251 ? | 19:38 |
openstack | Launchpad bug 1823251 in OpenStack Compute (nova) "Spike in TestNovaMigrationsMySQL.test_walk_versions/test_innodb_tables failures since April 1 2019 on limestone-regionone" [High,Confirmed] | 19:38 |
efried | sean-k-mooney: You'll be my hero if you can figure that one out. | 19:38 |
sean-k-mooney | and yes i was referring to the failures of those tests | 19:38 |
efried | probably even shiny nickel worthy | 19:39 |
sean-k-mooney | what's strange is they appear to be failing only in limestone-regionone | 19:40 |
* artom needs to do some taxi'ing for amily | 19:41 | |
artom | *family | 19:41 |
artom | Gonna have to finish addressing your feedback later tonight, mriedem | 19:41 |
artom | Moving to test_compute_mgr isn't as trivial as it first sounded :( | 19:42 |
sean-k-mooney | efried: looks like gibi is trying to fix it with https://review.opendev.org/#/c/662434/ | 19:45 |
efried | Oh, okay. | 19:46 |
sean-k-mooney | well its not actually a fix but just a cleanup to help find a fix | 19:46 |
efried | except it didn't work? | 19:46 |
efried | yeah | 19:46 |
sean-k-mooney | well its passing but i need to check if it ran on limestone or not | 19:47 |
efried | It failed once | 19:47 |
*** slaweq has quit IRC | 19:47 | |
efried | http://logs.openstack.org/34/662434/1/check/openstack-tox-py27/3ead2fd/testr_results.html.gz | 19:47 |
*** slaweq has joined #openstack-nova | 19:48 | |
*** artom has quit IRC | 19:48 | |
sean-k-mooney | v2 ran on ovh so the fact that v2 passed does not tell us much | 19:48 |
sean-k-mooney | the one that failed ran on limestone http://logs.openstack.org/34/662434/1/check/openstack-tox-py27/3ead2fd/job-output.txt.gz#_2019-05-31_11_45_41_625314 | 19:49 |
sean-k-mooney | in theory nodepool is building and uploading the same image to all providers so it should be the same but i guess we could check with the infra folks to confirm that | 19:50 |
sean-k-mooney | oh and thats interesting, so the failure also happens on py27 | 19:52 |
*** imacdonn has quit IRC | 19:53 | |
*** imacdonn has joined #openstack-nova | 19:53 | |
*** brault has quit IRC | 19:58 | |
*** _hemna has joined #openstack-nova | 19:59 | |
*** BjoernT has quit IRC | 20:08 | |
*** damien_r has joined #openstack-nova | 20:10 | |
cfriesen | mriedem: I'm hitting something interesting in pike. when deleting a nova-compute service I'm still managing to hit the "Unable to delete resource provider X: Resource provider has allocations": error, even with your check to see if there are instances on the compute host. | 20:13 |
efried | did you check placement for allocations? | 20:13 |
cfriesen | mriedem: this leaves us with a stale resource_provider entry, whose UUID ends up out of sync with the nova compute_nodes table, which causes scheduling to fail when we add the node back with the same name | 20:14 |
cfriesen | efried: checking that now | 20:14 |
sean-k-mooney | cfriesen: there could be allocations related to stale migrations against that node | 20:14 |
efried | is this what nova manage heal allocations is for? | 20:15 |
efried | (however that's spelled) | 20:15 |
sean-k-mooney | im not sure | 20:16 |
cfriesen | efried: I see 14 allocations for the stale node for resource classes 0, 1, 2. (cpu/ram/disk, I think) | 20:16 |
sean-k-mooney | i dont think so but it may be a side effect | 20:16 |
efried | cfriesen: Good to know placement wasn't lying. | 20:17 |
sean-k-mooney | cfriesen: do the allocation consumer ids match any instances or migrations | 20:17 |
*** itlinux has quit IRC | 20:17 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Use SATA bus for cdrom devices when using Q35 machine type https://review.opendev.org/663011 | 20:18 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM: Run tempest-full-py3 with q35 machine type https://review.opendev.org/662887 | 20:18 |
*** itlinux has joined #openstack-nova | 20:19 | |
cfriesen | sean-k-mooney: they match 6 instances that are currently running (they had been running on the node that I deleted) | 20:20 |
mriedem | cfriesen: kvm or ironic service? | 20:21 |
cfriesen | mriedem: kvm | 20:21 |
mriedem | heal_allocations is for adding missing allocations to a provider that has instances on it | 20:21 |
sean-k-mooney | cfriesen: ok so placement still thinks they are running on the wrong node | 20:22 |
cfriesen | this is coming from a customer test, but my way of reproducing is to have instances running, then power off the node uncleanly, and then try to delete the node while the automated tools are evacuating the instances that had been running on it. | 20:22 |
mriedem | as sean-k-mooney said, there are probably allocations from migration records | 20:23 |
sean-k-mooney | heal_allocations would add allocations for the node the vms are now running on, but does it also clean up allocations where the vms are no longer running? | 20:23 |
mriedem | b/c you're deleting during a resize or something | 20:23 |
cfriesen | sean-k-mooney: nope. there are allocations with the same consumer id but two different resource_provider_id | 20:23 |
cfriesen | mriedem: would there be allocations from the evacuate? | 20:24 |
mriedem | https://bugs.launchpad.net/nova/+bug/1793569 | 20:24 |
openstack | Launchpad bug 1793569 in OpenStack Compute (nova) "Add placement audit commands" [Wishlist,Confirmed] | 20:24 |
mriedem | cfriesen: i don't think there would be migration consumer allocations during an evac, | 20:25 |
mriedem | but there could be allocations for the instances that were on that evacuated host | 20:25 |
sean-k-mooney | i dont think there is either but we should claim resources on the destination | 20:25 |
cfriesen | they would have all been live-migrated onto that host not too long before I killed the node | 20:25 |
mriedem | when the old host is brought back, it should clean up those evacuated instance allocations on startup | 20:25 |
mriedem | gibi recently had something merged for evacuate that sounds related https://review.opendev.org/#/c/512623/ | 20:26 |
sean-k-mooney | oh ya i remember looking at this a few days ago but not having time to review | 20:28 |
mriedem | cfriesen: see my note on https://review.opendev.org/#/c/657070/2/nova/scheduler/client/report.py as well | 20:32 |
sean-k-mooney | mriedem: was this the patch where you were asking about the performance of "x in set()" vs "x in dict()" | 20:32 |
mriedem | sean-k-mooney: no | 20:32 |
sean-k-mooney | ok | 20:32 |
mriedem | anyway, we don't have migration consumers for evac | 20:32 |
sean-k-mooney | we dont but we do claim an allocation candidate for the destination node | 20:33 |
*** _hemna has quit IRC | 20:33 | |
sean-k-mooney | but i dont know if we free the allocations of the source node in all cases | 20:33 |
mriedem | correct the scheduler will claim (or conductor if you're forcing the dest host) https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L923 | 20:33 |
mriedem | the source host allocations should be cleaned up when the service is restarted | 20:34 |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: add functional test for admin_password https://review.opendev.org/663422 | 20:34 |
cfriesen | when evacuating an instance, when are the allocations on the old node supposed to be cleaned up? | 20:34 |
mriedem | in here https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L620 | 20:34 |
mriedem | on restart of the service ^ | 20:34 |
cfriesen | okay, so if we never restart the service they'll just stay stale in the allocations table | 20:34 |
mriedem | right, | 20:35 |
sean-k-mooney | cfriesen: when you start up the compute agent on the source node after fixing it, meaning you wont be able to delete the RP until after that | 20:35 |
mriedem | but presumably that node is dead and fenced anyway | 20:35 |
mriedem | so you're not scheduling to it | 20:35 |
cfriesen | yeah...the problem is that those allocations prevent us from deleting the resource_provider on compute node deletion | 20:35 |
sean-k-mooney | mriedem: i think cfriesen was trying to remove the RP before reprovisioning the fixed node | 20:35 |
sean-k-mooney | cfriesen: why are you deleting the compute node record out of interest | 20:36 |
cfriesen | customer is deleting the node. no idea why. testing functionality? | 20:36 |
sean-k-mooney | if you are replacing the node with another and reusing the hostname it should not be needed | 20:36 |
sean-k-mooney | ah ok | 20:36 |
sean-k-mooney | customers always break things | 20:36 |
cfriesen | when we add the new node back in with a different hostname it's all fine | 20:36 |
mriedem | the shitty thing i didn't realize until cdent pointed it out in an open review is that https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L2183 won't raise if deleting the provider fails, which means we'll continue to delete the compute service record | 20:37 |
cfriesen | if we reuse the same hostname we get a different UUID in the compute_node table, but we don't get a new entry in the resource_provider table, so the UUIDs don't match and we never schedule anything to the node | 20:37 |
sean-k-mooney | cfriesen: if you add it with the same host name it would be fine too, just dont delete the compute node record and it will fix everything up when the compute agent starts | 20:37 |
*** BjoernT has joined #openstack-nova | 20:37 | |
cfriesen | sean-k-mooney: the compute node record is deleted by running the "delete service" api | 20:37 |
sean-k-mooney | cfriesen: yes. which you dont need to do to replace the server | 20:38 |
mriedem | cfriesen: https://bugs.launchpad.net/nova/+bug/1829479 could also be related to your issue | 20:38 |
openstack | Launchpad bug 1829479 in OpenStack Compute (nova) "The allocation table has residual records when instance is evacuated and the source physical node is removed" [Undecided,Incomplete] | 20:38 |
sean-k-mooney | and if you dont delete the compute service record you will get the same uuid when the new agent connects | 20:38 |
cfriesen | sean-k-mooney: I suspect you're right as a workaround. but it *should* work in an ideal world. :) | 20:40 |
cfriesen | mriedem: yeah, that looks like what I'm seeing | 20:40 |
sean-k-mooney | cfriesen: we could "fix it" by having the compute node, when it fails to find a compute node record by hostname, first check if there is a placement RP by looking it up by hostname and then reuse the uuid of the RP, which was the old compute node uuid | 20:41 |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: add functional test for admin_password https://review.opendev.org/663422 | 20:41 |
cfriesen | sean-k-mooney: would work but seems kind of kludgy. can we delete allocations when deleting a resource_provider? | 20:42 |
sean-k-mooney | we currently dont allow you to delete inventories if they have allocations, intentionally | 20:43 |
cfriesen | I think this code might even have "cascade=true" | 20:43 |
sean-k-mooney | cfriesen: im pretty sure we intentionally dont use cascade=true | 20:44 |
cfriesen | sean-k-mooney: it's in here, actually | 20:44 |
mriedem | cfriesen: ack i left a comment in https://bugs.launchpad.net/nova/+bug/1829479 | 20:44 |
openstack | Launchpad bug 1829479 in OpenStack Compute (nova) "The allocation table has residual records when instance is evacuated and the source physical node is removed" [Undecided,Incomplete] | 20:44 |
*** dave-mccowan has quit IRC | 20:44 | |
mriedem | cfriesen: we do try to delete the allocations, | 20:45 |
mriedem | the problem in your case is it's doing it by instances.host, | 20:45 |
mriedem | which if you've evacuated those instances, the instances.host points at the new dest compute | 20:45 |
mriedem | but the old allocations are still there | 20:45 |
mriedem | this could be easily recreated in a functional test i think | 20:45 |
mriedem | 1. create server on host1, 2. force down and evacuate from host1, 3. delete compute service for host1, 4. assert provider and allocs still exist for host1 in placement | 20:46 |
mriedem | and 5. restart host1 service should blow up | 20:46 |
mriedem | b/c of the unique constraint on the provider name | 20:46 |
cfriesen | ah, that makes sense | 20:46 |
mriedem | so https://github.com/openstack/nova/blob/653515a45032811b6bc2f1d0fb651472005496ec/nova/scheduler/client/report.py#L2173 is really just not right for (1) in-progress / unconfirmed migrations and (2) evacuated hosts | 20:48 |
cfriesen | agreed | 20:49 |
mriedem | i could start a rolling ball with said functional test to at least recreate the issue so we can throw fixes at it | 20:49 |
sean-k-mooney | cfriesen: just looking at the placement functional tests, they expressly check that deleting an allocated resource provider fails, just an fyi https://github.com/openstack/placement/blob/master/placement/tests/functional/db/test_resource_provider.py#L363-L367 | 20:49 |
cfriesen | sean-k-mooney: yeah, it's valid that it fails when there are allocations, but setting "cascade=false" is supposed to delete those allocations. except it doesn't | 20:50 |
sean-k-mooney | when calling delete_resource_provider | 20:52 |
mriedem | cfriesen: you mean cascade=true | 20:52 |
sean-k-mooney | initially i assumed when you said cascade=true you were referring to the db schema | 20:52 |
cfriesen | bah, yes. cascade=true | 20:52 |
mriedem | looks like that code was written specifically for an ironic case https://review.opendev.org/#/c/428375/6/nova/scheduler/client/report.py@771 | 20:52 |
mriedem | so not really thinking about migrations and evacuates and suh | 20:52 |
mriedem | *such | 20:52 |
openstackgerrit | Matt Riedemann proposed openstack/nova-specs master: Spec to pre-filter disabled computes with placement https://review.opendev.org/657884 | 20:53 |
mriedem | dansmith: got that spec updated ^ | 20:53 |
sean-k-mooney | i would need to check the placement api to confirm but the allocation table has the RP it was allocated from so we should be able to simply look up all the allocations for a given rp then delete them assuming that is exposed at the api level | 20:55 |
sean-k-mooney | which it probably is not | 20:55 |
icarusfactor | Can I have more than one compute_driver for nova.conf or is it limited to only one? | 20:56 |
sean-k-mooney | ya its not | 20:56 |
sean-k-mooney | so we need to add a new endpoint which is /allocations?rp_uuid=XYZ to be able to do this cleanly | 20:57 |
sean-k-mooney | icarusfactor: its one per nova.conf | 20:58 |
sean-k-mooney | icarusfactor: but you can have multiple agents on the same host in some cases | 20:58 |
sean-k-mooney | icarusfactor: are you trying to run ironic and libvirt on the same host? | 20:58 |
icarusfactor | sean-k-mooney, Nifty , I'm running libvirt currently. Was wondering how to do that. | 20:59 |
*** pcaruana has quit IRC | 20:59 | |
sean-k-mooney | its generally not a great idea to do but if you have a second nova.conf that you pass to the second agent you can make it work | 20:59 |
icarusfactor | sean-k-mooney, Just trying and experimenting with different methods , want to use the Stein docker nova driver. | 21:00 |
sean-k-mooney | normally i would recommend running the ironic driver, assuming that is the other one you want to use, on one of your controllers instead of colocating it on a libvirt compute node | 21:00 |
mriedem | sean-k-mooney: right i asked that here https://review.opendev.org/#/c/428375/6/nova/scheduler/client/report.py@771 | 21:01 |
mriedem | but never got a response | 21:01 |
mriedem | we can get the allocations for a given provider with this API https://developer.openstack.org/api-ref/placement/?expanded=list-resource-provider-allocations-detail#list-resource-provider-allocations | 21:02 |
sean-k-mooney | icarusfactor: i had thought that the nova docker driver was no longer maintained | 21:02 |
sean-k-mooney | mriedem: oh off the RP endpoint | 21:03 |
sean-k-mooney | cool ya that seems more sane | 21:03 |
icarusfactor | sean-k-mooney, Humm, that is what I was wondering , the Dashboard now shows the docker option. | 21:03 |
sean-k-mooney | at least that way we cant get out of sync | 21:03 |
*** _hemna has joined #openstack-nova | 21:03 | |
*** brault has joined #openstack-nova | 21:03 | |
sean-k-mooney | icarusfactor: dashboard as in horizon? | 21:03 |
icarusfactor | Seeing it in Stein , I was wanting to try this option. | 21:03 |
icarusfactor | yes | 21:03 |
sean-k-mooney | icarusfactor: could that be from zun? | 21:04 |
mriedem | https://review.opendev.org/#/c/657016/ is my change where cdent pointed out that failure to delete the rp doesn't make us fail to delete the compute service | 21:04 |
mriedem | finally connecting all the dots | 21:04 |
mriedem | i'm slow and dumb | 21:04 |
icarusfactor | sean-k-mooney, Cant answer that. | 21:04 |
sean-k-mooney | icarusfactor: https://github.com/openstack/nova-docker | 21:04 |
sean-k-mooney | icarusfactor: its probably related to zun | 21:05 |
sean-k-mooney | mriedem: oh this is a recent change | 21:05 |
icarusfactor | sean-k-mooney, ok , will check zun out , that helps as I was going /dev/null fast. | 21:06 |
mriedem | sean-k-mooney: which? | 21:06 |
sean-k-mooney | https://review.opendev.org/#/c/657016/ | 21:06 |
sean-k-mooney | its from monday where cdent commented | 21:06 |
mriedem | i think i wrote that at the ptg, | 21:08 |
mriedem | but it's a follow on fix to https://review.opendev.org/#/q/I7b8622b178d5043ed1556d7bdceaf60f47e5ac80 | 21:08 |
mriedem | which isn't so new | 21:08 |
*** brault has quit IRC | 21:08 | |
sean-k-mooney | oh may 3rd not june 3rd | 21:08 |
sean-k-mooney | i misread the commit date | 21:08 |
mriedem | so we used to always orphan the rps when deleting a compute service, | 21:08 |
mriedem | then we tried to delete the rps, but if it failed we still deleted the compute service and didn't try all nodes in the case of ironic, | 21:09 |
mriedem | now we realize that we can still fail to delete the rps b/c of evacs and unconfirmed migrations | 21:09 |
mriedem | so not the worst, but still busted... | 21:09 |
mriedem | meanwhile, let's pile a bunch of new code into nova yay wee!!!! | 21:09 |
sean-k-mooney | hehe well stephen is trying to delete a ton of it too :) | 21:10 |
sean-k-mooney | but im guessing we should update https://github.com/openstack/nova/blob/653515a45032811b6bc2f1d0fb651472005496ec/nova/scheduler/client/report.py#L2173 | 21:10 |
sean-k-mooney | to get the allocations from the RP via the placement api right | 21:10 |
*** priteau has quit IRC | 21:10 | |
mriedem | stephen is deleting the stuff no one has been using for several years sure :) | 21:11 |
sean-k-mooney | and then if we combine that with your change for ironic that might do the right thing | 21:11 |
mriedem | i need to write the functional recreate test first before we start throwing fixes at it, | 21:11 |
mriedem | because there are several issues here i think and i need to start with the tests | 21:11 |
sean-k-mooney | sure but https://review.opendev.org/#/c/657016/2/nova/api/openstack/compute/services.py is calling delete_resource_provider | 21:13 |
mriedem | i know | 21:13 |
mriedem | ^ is no worse off than what we have today, it just handles 1:M ironic nodes, | 21:13 |
sean-k-mooney | so if we fix and test delete_resource_provider for a single provider then we can layer your other fix on top. | 21:14 |
sean-k-mooney | yes | 21:14 |
mriedem | cdent's -1 is that the commit message is asserting that if we fail to delete the rp we won't delete the compute service which is false | 21:14 |
sean-k-mooney | i think they should stay separate patches too because its really two separate problems | 21:14 |
mriedem | so before refactoring delete_resource_provider and how that works for allocations, i would need to think about whether or not those should be separate changes (i think they should for sanity) | 21:15 |
mriedem | right | 21:15 |
mriedem | but right now i have familial duties so can't work on this until tomorrow | 21:15 |
sean-k-mooney | mriedem: night o/ if you want me to try to work on part of this let me know | 21:16 |
sean-k-mooney | i need to go finish some spec reviews from yesterday | 21:16 |
*** tjgresha has quit IRC | 21:18 | |
*** Sundar has joined #openstack-nova | 21:21 | |
*** rcernin has joined #openstack-nova | 21:26 | |
*** JamesBen_ has quit IRC | 21:35 | |
*** _hemna has quit IRC | 21:36 | |
*** damien_r has quit IRC | 21:39 | |
*** aram1s has quit IRC | 21:44 | |
openstackgerrit | Adam Spiers proposed openstack/nova master: Track inventory for new MEM_ENCRYPTION_CONTEXT resource class https://review.opendev.org/662105 | 21:45 |
*** gmann has joined #openstack-nova | 21:45 | |
*** dave-mccowan has joined #openstack-nova | 21:47 | |
*** itlinux has quit IRC | 21:50 | |
*** slaweq has quit IRC | 21:52 | |
*** Sundar has quit IRC | 21:52 | |
*** itlinux has joined #openstack-nova | 21:57 | |
*** whoami-rajat has quit IRC | 22:00 | |
*** slaweq has joined #openstack-nova | 22:03 | |
*** rcernin has quit IRC | 22:04 | |
*** slaweq has quit IRC | 22:08 | |
*** icarusfactor has quit IRC | 22:12 | |
openstackgerrit | Adam Spiers proposed openstack/nova master: Track inventory for new MEM_ENCRYPTION_CONTEXT resource class https://review.opendev.org/662105 | 22:13 |
*** icarusfactor has joined #openstack-nova | 22:13 | |
openstackgerrit | Sundar Nadathur proposed openstack/nova-specs master: Nova Cyborg interaction specification. https://review.opendev.org/603955 | 22:13 |
*** icarusfactor has quit IRC | 22:14 | |
*** icarusfactor has joined #openstack-nova | 22:15 | |
*** itlinux has quit IRC | 22:15 | |
*** icarusfactor has quit IRC | 22:16 | |
*** icarusfactor has joined #openstack-nova | 22:16 | |
*** icarusfactor has quit IRC | 22:18 | |
*** mlavalle has quit IRC | 22:18 | |
*** icarusfactor has joined #openstack-nova | 22:18 | |
*** icarusfactor has quit IRC | 22:19 | |
*** icarusfactor has joined #openstack-nova | 22:20 | |
*** icarusfactor has quit IRC | 22:21 | |
*** icarusfactor has joined #openstack-nova | 22:21 | |
*** icarusfactor has quit IRC | 22:23 | |
*** icarusfactor has joined #openstack-nova | 22:23 | |
*** icarusfactor has quit IRC | 22:26 | |
*** icarusfactor has joined #openstack-nova | 22:26 | |
*** icarusfactor has quit IRC | 22:28 | |
*** icarusfactor has joined #openstack-nova | 22:29 | |
*** icarusfactor has quit IRC | 22:48 | |
*** dave-mccowan has quit IRC | 22:56 | |
*** tkajinam has joined #openstack-nova | 23:00 | |
*** artom has joined #openstack-nova | 23:09 | |
*** JamesBenson has joined #openstack-nova | 23:12 | |
*** itlinux has joined #openstack-nova | 23:16 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Run revert resize tests in nova-live-migration https://review.opendev.org/653498 | 23:26 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Revert resize: wait for events according to hybrid plug https://review.opendev.org/644881 | 23:26 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: [DNM] use iptables in nova-multinode https://review.opendev.org/660782 | 23:26 |
artom | mriedem, dansmith, think I got everything ^^ Don't expect anything else to happen tonight, I'm gone after this anyways, just an FYI | 23:27 |
artom | *I don't expect | 23:27 |
*** JamesBenson has quit IRC | 23:32 | |
*** _hemna has joined #openstack-nova | 23:33 | |
*** claudiub has quit IRC | 23:42 | |
*** luksky has quit IRC | 23:46 | |
*** itlinux has quit IRC | 23:46 | |
*** slaweq has joined #openstack-nova | 23:48 | |
*** brinzhang has joined #openstack-nova | 23:48 | |
*** frankwang has joined #openstack-nova | 23:51 |
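
The recreate steps mriedem lists at 20:46 (create a server on host1, evacuate it, delete the host1 compute service, then restart host1) can be sketched as a toy in-memory model. This is not nova or placement code: `FakePlacement`, `delete_compute_service`, and the `rp-*` identifiers are hypothetical names made up for illustration, and the cleanup is keyed on `instances.host` only because that is how the log describes the current behaviour.

```python
# Toy model only: FakePlacement and delete_compute_service are hypothetical
# names, not nova or placement APIs.


class FakePlacement:
    def __init__(self):
        self.providers = {}      # provider name -> provider uuid
        self.allocations = {}    # consumer uuid -> set of provider uuids

    def create_provider(self, name, uuid):
        if name in self.providers:
            # Stand-in for the unique constraint on the provider name.
            raise ValueError("provider name %s already exists" % name)
        self.providers[name] = uuid

    def allocate(self, consumer, provider_uuid):
        self.allocations.setdefault(consumer, set()).add(provider_uuid)

    def delete_provider(self, name):
        uuid = self.providers[name]
        if any(uuid in rps for rps in self.allocations.values()):
            return False  # stand-in for "Resource provider has allocations"
        del self.providers[name]
        return True


def delete_compute_service(placement, instances, host):
    """Mimic allocation cleanup keyed on instances.host."""
    rp_uuid = placement.providers[host]
    for consumer, instance_host in instances.items():
        if instance_host == host:
            placement.allocations.get(consumer, set()).discard(rp_uuid)
    # The provider delete can fail; in the log the service record is
    # removed regardless of this result.
    return placement.delete_provider(host)


placement = FakePlacement()
placement.create_provider("host1", "rp-1")
placement.create_provider("host2", "rp-2")

# 1. create server on host1
instances = {"server-a": "host1"}
placement.allocate("server-a", "rp-1")

# 2. force down host1 and evacuate: a claim is made on host2 and
#    instances.host now points at host2; the host1 allocation remains.
placement.allocate("server-a", "rp-2")
instances["server-a"] = "host2"

# 3. delete the compute service for host1
provider_deleted = delete_compute_service(placement, instances, "host1")

# 4. provider and allocations for host1 still exist
stale = [c for c, rps in placement.allocations.items() if "rp-1" in rps]

# 5. host1 comes back with a new compute node uuid and tries to create a
#    provider under the same name.
try:
    placement.create_provider("host1", "rp-1-new")
    restart_failed = False
except ValueError:
    restart_failed = True

print(provider_deleted, stale, restart_failed)
```

In this model the evacuated server is skipped by the cleanup loop because its host is already host2, which is the gap mriedem describes at 20:45.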
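
The idea raised around 21:02 to 21:10, looking up allocations from the resource provider side and cleaning them up before deleting the provider, might look roughly like the data-only sketch below. It assumes a simplified per-consumer allocation body of the form `{"allocations": {rp_uuid: {"resources": {...}}}}`; the function name `drop_provider` and the `rp-source` / `rp-dest` identifiers are hypothetical, and no HTTP calls, microversions, or consumer generations are modelled. The point it illustrates is that only the stale provider's entry is removed, so the allocation on the evacuation destination survives.

```python
import copy


def drop_provider(consumer_allocs, stale_rp_uuid):
    """Return a copy of one consumer's allocations without stale_rp_uuid.

    consumer_allocs is assumed to look like
    {"allocations": {rp_uuid: {"resources": {...}}}}.
    """
    updated = copy.deepcopy(consumer_allocs)
    updated["allocations"].pop(stale_rp_uuid, None)
    return updated


# Hypothetical data: an evacuated instance holding allocations on both the
# dead source provider and the destination provider.
resources = {"VCPU": 2, "MEMORY_MB": 4096, "DISK_GB": 20}
before = {
    "allocations": {
        "rp-source": {"resources": dict(resources)},
        "rp-dest": {"resources": dict(resources)},
    },
}

after = drop_provider(before, "rp-source")
print(sorted(after["allocations"]))
```

Deleting the consumer's allocations wholesale would also release the destination host's resources, which is why this sketch edits the mapping per provider instead.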