*** aspiers[m] has quit IRC | 00:00 | |
*** nicolasbock has quit IRC | 00:00 | |
*** slaweq has joined #openstack-nova | 00:13 | |
*** slaweq has quit IRC | 00:24 | |
*** hshiina has joined #openstack-nova | 00:58 | |
*** tetsuro has joined #openstack-nova | 00:59 | |
*** tetsuro has quit IRC | 01:06 | |
*** erlon has quit IRC | 01:10 | |
*** lbragstad has joined #openstack-nova | 01:14 | |
*** slaweq has joined #openstack-nova | 01:14 | |
*** slaweq has quit IRC | 01:24 | |
*** bhagyashris has joined #openstack-nova | 01:28 | |
*** rcernin has quit IRC | 01:36 | |
*** eandersson has quit IRC | 01:45 | |
*** Dinesh_Bhor has joined #openstack-nova | 02:00 | |
*** lbragstad has quit IRC | 02:08 | |
*** slaweq has joined #openstack-nova | 02:11 | |
openstackgerrit | Zhenyu Zheng proposed openstack/nova master: Bump compute service to indicate attach/detach root volume is supported https://review.openstack.org/614750 | 02:20 |
---|---|---|
*** slaweq has quit IRC | 02:24 | |
*** mrsoul has quit IRC | 02:27 | |
*** tetsuro has joined #openstack-nova | 02:34 | |
*** itlinux has quit IRC | 02:36 | |
*** annp has joined #openstack-nova | 02:39 | |
*** mhen has quit IRC | 02:45 | |
*** mhen has joined #openstack-nova | 02:48 | |
*** hongbin has joined #openstack-nova | 02:56 | |
*** slaweq has joined #openstack-nova | 03:12 | |
*** lbragstad has joined #openstack-nova | 03:14 | |
*** slaweq has quit IRC | 03:24 | |
*** edmondsw has quit IRC | 03:34 | |
*** Dinesh_Bhor has quit IRC | 03:34 | |
*** edleafe has quit IRC | 03:34 | |
*** udesale has joined #openstack-nova | 03:45 | |
*** Kevin_Zheng has quit IRC | 03:47 | |
openstackgerrit | Zhenyu Zheng proposed openstack/nova master: WIP per instance serial https://review.openstack.org/619953 | 03:50 |
*** sridharg has joined #openstack-nova | 03:54 | |
*** tetsuro has quit IRC | 04:03 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Add a bug tag for nova doc https://review.openstack.org/619434 | 04:11 |
*** slaweq has joined #openstack-nova | 04:11 | |
*** slaweq has quit IRC | 04:24 | |
*** lbragstad has quit IRC | 04:32 | |
*** hongbin has quit IRC | 04:34 | |
*** tetsuro has joined #openstack-nova | 04:37 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (3) https://review.openstack.org/574104 | 04:43 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (4) https://review.openstack.org/574106 | 04:43 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (5) https://review.openstack.org/574110 | 04:44 |
*** ivve has joined #openstack-nova | 04:44 | |
*** bhagyashris has quit IRC | 04:54 | |
*** Dinesh_Bhor has joined #openstack-nova | 05:09 | |
*** janki has joined #openstack-nova | 05:12 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (6) https://review.openstack.org/574113 | 05:15 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (7) https://review.openstack.org/574974 | 05:15 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (8) https://review.openstack.org/575311 | 05:16 |
*** slaweq has joined #openstack-nova | 05:16 | |
*** bhagyashris has joined #openstack-nova | 05:16 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in libvirt/test_driver.py (7) https://review.openstack.org/571992 | 05:16 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in libvirt/test_driver.py (8) https://review.openstack.org/571993 | 05:16 |
*** slaweq has quit IRC | 05:24 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (9) https://review.openstack.org/575581 | 05:32 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (10) https://review.openstack.org/576017 | 05:32 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (11) https://review.openstack.org/576018 | 05:33 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (12) https://review.openstack.org/576019 | 05:34 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13) https://review.openstack.org/576020 | 05:34 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14) https://review.openstack.org/576027 | 05:34 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15) https://review.openstack.org/576031 | 05:34 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (16) https://review.openstack.org/576299 | 05:35 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (17) https://review.openstack.org/576344 | 05:35 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (18) https://review.openstack.org/576673 | 05:35 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (19) https://review.openstack.org/576676 | 05:36 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (20) https://review.openstack.org/576689 | 05:36 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (21) https://review.openstack.org/576709 | 05:36 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (22) https://review.openstack.org/576712 | 05:37 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Use links to placement docs in nova docs https://review.openstack.org/614056 | 05:39 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: api-ref: Add descriptions for vol-backed snapshots https://review.openstack.org/615084 | 05:40 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Fix best_match() deprecation warning https://review.openstack.org/611204 | 05:40 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in virt/test_block_device.py https://review.openstack.org/566153 | 05:41 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove Placement API reference https://review.openstack.org/614437 | 05:41 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Use oslo_db.sqlalchemy.test_fixtures https://review.openstack.org/609352 | 05:41 |
openstackgerrit | Takashi NATSUME proposed openstack/nova stable/rocky: Remove unnecessary redirect https://review.openstack.org/607400 | 05:42 |
*** ratailor has joined #openstack-nova | 05:59 | |
*** slaweq has joined #openstack-nova | 06:16 | |
*** whoami-rajat has joined #openstack-nova | 06:21 | |
*** slaweq has quit IRC | 06:24 | |
*** diga has joined #openstack-nova | 06:42 | |
*** frippe75 has joined #openstack-nova | 06:50 | |
*** alexchadin has joined #openstack-nova | 06:58 | |
*** Luzi has joined #openstack-nova | 07:00 | |
frippe75 | Trying to deploy Rocky on CentOS 7.5.1804 without success from a nova perspective. Everything looks good but I cannot spawn instances (NoValidHost). It looks like the nova installation is not complete. | 07:03 |
frippe75 | Main issue I find is : There are no compute resource providers in the Placement service nor are there compute nodes in the database. Anyone care to take a look? https://pastebin.com/raw/wPfaY4YP | 07:04 |
frippe75 | I deployed using packstack but have not found a resolution via channel #rdo | 07:04 |
*** adrianc has joined #openstack-nova | 07:06 | |
openstackgerrit | Zhenyu Zheng proposed openstack/nova-specs master: Amend the detach-boot-volume design https://review.openstack.org/619161 | 07:12 |
*** Dinesh_Bhor has quit IRC | 07:14 | |
*** adrianc has quit IRC | 07:17 | |
*** adrianc has joined #openstack-nova | 07:17 | |
openstackgerrit | Zhenyu Zheng proposed openstack/nova master: Per-instance serial number https://review.openstack.org/619953 | 07:21 |
*** Dinesh_Bhor has joined #openstack-nova | 07:28 | |
*** Kevin_Zheng has joined #openstack-nova | 07:31 | |
*** bhagyashris_ has joined #openstack-nova | 07:34 | |
*** bhagyashris has quit IRC | 07:34 | |
*** lpetrut has joined #openstack-nova | 07:34 | |
*** takashin has quit IRC | 07:36 | |
*** takashin has joined #openstack-nova | 07:37 | |
bhagyashris_ | artom: Hi, | 07:45 |
*** moshele has joined #openstack-nova | 07:47 | |
*** slaweq has joined #openstack-nova | 07:49 | |
frippe75 | during nova-compute launch get_available_resource(self, nodename) should get called and populate the resources into nova... But seems not to... | 07:52 |
*** jangutter has joined #openstack-nova | 07:53 | |
*** ociuhandu has joined #openstack-nova | 07:56 | |
*** Dinesh_Bhor has quit IRC | 07:58 | |
*** ociuhandu has quit IRC | 07:58 | |
*** takashin has left #openstack-nova | 08:00 | |
*** ccamacho has quit IRC | 08:01 | |
*** ccamacho has joined #openstack-nova | 08:01 | |
frippe75 | I see a log entry where it "gets created" ... "Compute node record created for openstack.frippe.com:openstack.frippe.com with uuid: 6fd2553b-26bb-466e-bde3-57c753dfa50a" | 08:04 |
frippe75 | But after that I get this all the time in the nova-compute.log "No compute node record for host openstack.frippe.com: ComputeHostNotFound_Remote:" | 08:04 |
frippe75 | Why does it say hostname:hostname when it gets created?? | 08:05 |
*** trident has quit IRC | 08:10 | |
*** trident has joined #openstack-nova | 08:11 | |
*** ralonsoh has joined #openstack-nova | 08:25 | |
*** helenfm has joined #openstack-nova | 08:29 | |
*** xek has joined #openstack-nova | 08:31 | |
*** frippe75 has quit IRC | 08:38 | |
*** Dinesh_Bhor has joined #openstack-nova | 08:39 | |
*** tssurya has joined #openstack-nova | 09:07 | |
openstackgerrit | Zhenyu Zheng proposed openstack/nova master: Bump compute service to indicate attach/detach root volume is supported https://review.openstack.org/614750 | 09:09 |
*** xek_ has joined #openstack-nova | 09:14 | |
*** xek has quit IRC | 09:16 | |
*** fyx_ has joined #openstack-nova | 09:17 | |
*** Dinesh_Bhor has quit IRC | 09:17 | |
*** TheJulia_ has joined #openstack-nova | 09:17 | |
*** spsurya_ has joined #openstack-nova | 09:17 | |
*** k_mouza has joined #openstack-nova | 09:17 | |
*** Dinesh_Bhor has joined #openstack-nova | 09:18 | |
*** tinwood_ has joined #openstack-nova | 09:20 | |
*** Miouge- has joined #openstack-nova | 09:23 | |
*** bauzas_ has joined #openstack-nova | 09:23 | |
*** mgagne_ has joined #openstack-nova | 09:23 | |
*** gary_perkins_ has joined #openstack-nova | 09:24 | |
*** udesale has quit IRC | 09:25 | |
*** spsurya has quit IRC | 09:25 | |
*** zigo has quit IRC | 09:25 | |
*** tinwood has quit IRC | 09:25 | |
*** gary_perkins has quit IRC | 09:25 | |
*** bauzas has quit IRC | 09:25 | |
*** jroll has quit IRC | 09:25 | |
*** dr_gogeta86 has quit IRC | 09:25 | |
*** Miouge has quit IRC | 09:25 | |
*** mgagne has quit IRC | 09:25 | |
*** nicholas has quit IRC | 09:25 | |
*** fyx has quit IRC | 09:25 | |
*** TheJulia has quit IRC | 09:25 | |
*** bauzas_ is now known as bauzas | 09:25 | |
*** gary_perkins_ is now known as gary_perkins | 09:25 | |
*** spsurya_ is now known as spsurya | 09:25 | |
*** TheJulia_ is now known as TheJulia | 09:25 | |
*** fyx_ is now known as fyx | 09:25 | |
*** udesale has joined #openstack-nova | 09:26 | |
*** xek_ is now known as xek | 09:28 | |
*** jroll has joined #openstack-nova | 09:32 | |
*** jaosorior has joined #openstack-nova | 09:38 | |
*** derekh has joined #openstack-nova | 09:38 | |
*** k_mouza has quit IRC | 09:57 | |
*** k_mouza has joined #openstack-nova | 09:57 | |
*** moshele has quit IRC | 10:01 | |
*** bhagyashris_ has quit IRC | 10:01 | |
*** Dinesh_Bhor has quit IRC | 10:02 | |
*** moshele has joined #openstack-nova | 10:05 | |
*** cdent has joined #openstack-nova | 10:13 | |
*** xek has quit IRC | 10:34 | |
*** xek has joined #openstack-nova | 10:34 | |
*** alexchadin has quit IRC | 10:43 | |
*** erlon has joined #openstack-nova | 10:43 | |
*** Dinesh_Bhor has joined #openstack-nova | 10:44 | |
*** sapd1_ has joined #openstack-nova | 10:45 | |
*** zigo has joined #openstack-nova | 10:51 | |
openstackgerrit | Michael Still proposed openstack/nova master: Remove utils.execute() calls from xenapi. https://review.openstack.org/619700 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove utils.execute() from libvirt remotefs calls. https://review.openstack.org/619701 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove utils.execute() from quobyte libvirt storage driver. https://review.openstack.org/619702 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Move nova.libvirt.utils away from using nova.utils.execute(). https://review.openstack.org/619703 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Imagebackend should call processutils.execute directly. https://review.openstack.org/619704 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove final users of utils.execute() in libvirt. https://review.openstack.org/619705 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove the final user of utils.execute() from virt.images https://review.openstack.org/620007 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove utils.execute() from the hyperv driver. https://review.openstack.org/620008 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove utils.execute() from virt.disk.api. https://review.openstack.org/620009 | 10:53 |
openstackgerrit | Michael Still proposed openstack/nova master: Move a generic bridge helper to a linux_net privsep file. https://review.openstack.org/620010 | 10:53 |
*** diga has quit IRC | 10:56 | |
*** adrianc has quit IRC | 10:56 | |
*** udesale has quit IRC | 10:58 | |
*** adrianc has joined #openstack-nova | 11:07 | |
*** sapd1_ has quit IRC | 11:18 | |
*** k_mouza has quit IRC | 11:31 | |
*** dpawlik has joined #openstack-nova | 11:33 | |
*** davidsha_ has joined #openstack-nova | 11:35 | |
*** tbachman has quit IRC | 11:42 | |
*** tinwood_ is now known as tinwood | 11:44 | |
*** k_mouza has joined #openstack-nova | 11:44 | |
*** Dinesh_Bhor has quit IRC | 11:45 | |
*** sambetts_ is now known as sambetts | 11:55 | |
sean-k-mooney | o/ | 11:58 |
cdent | o/ | 12:01 |
sean-k-mooney | cdent: have a good weekend | 12:01 |
*** lpetrut has quit IRC | 12:03 | |
cdent | sean-k-mooney: I made a pact with myself to not do work over the weekend, but that ended up being easy because I seem to have a cold. So I sat in front of the netflix watching junk tv and feeling like junk. | 12:05 |
*** lpetrut has joined #openstack-nova | 12:08 | |
sean-k-mooney | well the cold sucks but the fact you didn't work on your weekend is good. i spent a significant part of the weekend just listening to audio books and trying not to do anything technical | 12:08 |
sean-k-mooney | until like 10 PM sunday when i decided to deploy kubernetes on an old laptop but i almost made it lol | 12:09 |
*** ratailor has quit IRC | 12:10 | |
cdent | I think if I hadn't had the cold I probably wouldn't have made it | 12:11 |
tobias-urdin | oh boy, after reading api-ref for nova when searching for some help, seeing all the red boxes for the proxy apis deprecation almost gave me tears, nova will be so clean in the future | 12:34 |
cdent | relatively speaking :) | 12:39 |
*** N3l1x has joined #openstack-nova | 12:46 | |
*** tetsuro has quit IRC | 12:57 | |
*** dtantsur|afk is now known as dtantsur|mtg | 12:59 | |
*** jaypipes has quit IRC | 13:13 | |
*** k_mouza_ has joined #openstack-nova | 13:19 | |
*** k_mouza has quit IRC | 13:23 | |
*** tbachman has joined #openstack-nova | 13:23 | |
*** tbachman_ has joined #openstack-nova | 13:25 | |
*** takashin has joined #openstack-nova | 13:26 | |
*** tbachman has quit IRC | 13:28 | |
*** tbachman_ is now known as tbachman | 13:28 | |
sean-k-mooney | tobias-urdin: nova will only be clean if we actually remove the proxy apis from the codebase | 13:30 |
sean-k-mooney | tobias-urdin: to do that would require us to increase the minimum microversion we support | 13:31 |
sean-k-mooney | that is not something we have done since we introduced microversions as far as i am aware | 13:31 |
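A minimal sketch of the point above, assuming a toy handler table: raising the minimum supported microversion is what makes version-capped code unreachable and therefore removable. The handler names and version numbers are hypothetical and are not taken from nova.

```python
def parse(version):
    """Turn a 'major.minor' string into a tuple so that 2.9 < 2.10."""
    major, minor = version.split(".")
    return int(major), int(minor)


# Hypothetical handlers: name -> (first microversion, last microversion or None).
# A capped handler stays in the codebase for as long as any supported
# microversion can still reach it.
HANDLERS = {
    "proxy_api": ("2.1", "2.20"),
    "servers": ("2.1", None),
}


def reachable(handlers, min_supported):
    """Return the handlers that some supported microversion can still hit."""
    floor = parse(min_supported)
    live = []
    for name, (_first, last) in handlers.items():
        if last is None or parse(last) >= floor:
            live.append(name)
    return sorted(live)


before = reachable(HANDLERS, "2.1")   # minimum left where it is
after = reachable(HANDLERS, "2.21")   # minimum raised past the cap
```

In this toy model the capped handler only drops out of the reachable set once the minimum moves past its cap, which is the step described above as never having been done.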
cdent | sean-k-mooney: there was talk at summit about doing such a thing | 13:32 |
cdent | but there are _many_ prereqs to make it possible. | 13:32 |
sean-k-mooney | cdent: it would be nice if we could but ya i assumed we would have a lot of work to do first | 13:32 |
sean-k-mooney | cdent: was there any progress on a common way of reporting errors at the summit | 13:33 |
cdent | not really, no | 13:34 |
sean-k-mooney | oh well i knew that was a long shot :) | 13:34 |
* cdent needs to make lunch/breakfast before sched meeting, brb | 13:34 | |
*** gouthamr has quit IRC | 13:34 | |
*** gouthamr has joined #openstack-nova | 13:40 | |
*** tetsuro has joined #openstack-nova | 13:40 | |
tobias-urdin | sean-k-mooney: true, i'm just happy about all the effort on the structure and cleanup of nova :) | 13:40 |
*** dave-mccowan has joined #openstack-nova | 13:46 | |
*** dave-mccowan has quit IRC | 13:50 | |
*** dave-mccowan has joined #openstack-nova | 13:52 | |
*** efried has joined #openstack-nova | 13:53 | |
*** rtjure has joined #openstack-nova | 13:53 | |
*** sapd1_ has joined #openstack-nova | 13:53 | |
*** mriedem has joined #openstack-nova | 13:54 | |
efried | n-sch meeting in 5 minutes in #openstack-meeting-alt | 13:55 |
*** edleafe has joined #openstack-nova | 13:57 | |
*** rtjure has quit IRC | 13:58 | |
*** awaugama has joined #openstack-nova | 13:59 | |
*** _hemna has joined #openstack-nova | 13:59 | |
mriedem | yikun: fyi https://blueprints.launchpad.net/nova/+spec/initial-allocation-ratios is back in a runway slot | 14:04 |
mriedem | https://etherpad.openstack.org/p/nova-runways-stein | 14:05 |
mriedem | dansmith: we should update the topic at some point for the current runways | 14:05 |
*** mmethot has joined #openstack-nova | 14:06 | |
*** lbragstad has joined #openstack-nova | 14:07 | |
*** sapd1_ has quit IRC | 14:12 | |
*** k_mouza has joined #openstack-nova | 14:15 | |
*** cfriesen has joined #openstack-nova | 14:15 | |
*** k_mouza_ has quit IRC | 14:19 | |
*** efried has quit IRC | 14:19 | |
*** efried has joined #openstack-nova | 14:20 | |
*** k_mouza has quit IRC | 14:21 | |
*** k_mouza has joined #openstack-nova | 14:22 | |
*** _alastor_ has joined #openstack-nova | 14:23 | |
*** jroll has quit IRC | 14:24 | |
*** jroll has joined #openstack-nova | 14:25 | |
*** zul has joined #openstack-nova | 14:25 | |
*** tbachman has quit IRC | 14:26 | |
dansmith | mriedem: ack | 14:27 |
*** tbachman has joined #openstack-nova | 14:27 | |
*** ChanServ sets mode: +o dansmith | 14:27 | |
*** dansmith changes topic to "Current runways: io-semaphore-for-concurrent-disk-ops / reshape-provider-tree / initial-allocation-ratios -- This channel is for Nova development. For support of Nova deployments, please use #openstack." | 14:28 | |
*** ChanServ sets mode: -o dansmith | 14:29 | |
alex_xu | mriedem: do you know whether resize works with a numa topology change in the new flavor? I can't find where we parse the numa topo from the new flavor. Trying to figure out whether it is a bug or something we don't support well yet. | 14:29 |
*** eharney has joined #openstack-nova | 14:29 | |
mriedem | alex_xu: i think it's supported...we'll do the move_claim during resize which gets the new_flavor off the migration record in the resource tracker | 14:31 |
mriedem | and the request spec uses the new flavor (i think) when it runs through the scheduler (numa topo filter) during scheduling for the resize | 14:31 |
alex_xu | mriedem: yea...I saw the move claim code also. | 14:32 |
alex_xu | mriedem: but the request spec won't extract the new numa topo from the new flavor. I guess we missed something | 14:32 |
*** bnemec has joined #openstack-nova | 14:33 | |
mriedem | cfriesen might know off the top of his head faster than me | 14:34 |
mriedem | but i thought cold migration worked | 14:34 |
mriedem | alex_xu: you're saying we don't update https://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L55 for the new flavor right? | 14:35 |
*** k_mouza has quit IRC | 14:36 | |
alex_xu | mriedem: yes | 14:36 |
alex_xu | mriedem: we generate new numa_topology in move claim. but we still use old numa_topology in the scheduler | 14:36 |
*** k_mouza has joined #openstack-nova | 14:37 | |
mriedem | now i'm having a hard time finding where we set the new_flavor on the request spec before calling the scheduler | 14:38 |
alex_xu | hah, probably because we don't have that :) | 14:39 |
mriedem | heh, no i think that happens b/c i just wrote a functional regression test that relies on it | 14:40 |
mriedem | https://review.openstack.org/#/c/619123/ | 14:40 |
alex_xu | oh, yea, I misread that. but we still don't parse the numa stuff from the flavor | 14:41 |
*** takamatsu has joined #openstack-nova | 14:43 | |
alex_xu | mriedem: it's late for me, I will dig into it more tomorrow. thanks for the info, good to know that it isn't something we don't support, so it is probably a bug... | 14:43 |
mriedem | alex_xu: https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L316 is where we set the new flavor on reqspec | 14:43 |
mriedem | alex_xu: sure, i'll ping you if i figure something out :) | 14:44 |
mriedem | good night | 14:44 |
mriedem | stephenfin: are you aware of cold migration/resize support for numa topology changes ^ ? | 14:48 |
mriedem | i thought that was all baked in long ago | 14:48 |
*** udesale has joined #openstack-nova | 14:48 | |
kashyap | mriedem: IIRC, he's out for a few more days. | 14:49 |
mriedem | ok i'll wait for cfriesen then | 14:49 |
*** mvkr has quit IRC | 14:50 | |
*** tbachman has quit IRC | 14:50 | |
*** tbachman has joined #openstack-nova | 14:51 | |
*** tetsuro has quit IRC | 14:52 | |
mriedem | dansmith: do you think https://review.openstack.org/#/c/607735/ is worth sending to rocky? | 14:54 |
mriedem | it's a pretty latent issue, so not really sure it's worth it | 14:55 |
*** whoami-rajat has quit IRC | 14:55 | |
dansmith | mriedem: it is, but it's a pretty trivial thing to push back and we know it's a problem for people | 14:55 |
dansmith | we also know we can't push it back any farther, | 14:55 |
dansmith | so it's not like it can go back to kilo or anything | 14:55 |
mriedem | sure | 14:56 |
mriedem | ok | 14:56 |
*** tbachman has quit IRC | 15:00 | |
*** takashin has left #openstack-nova | 15:01 | |
cfriesen | mriedem: alex_xu: pretty sure cold migration *was* working before all the placement stuff, but I seem to remember seeing at least one bug saying it's currently broken. | 15:04 |
cfriesen | we're on pike at the moment and it seems to be working, but we do have a few patches in that area for other features. | 15:05 |
sean-k-mooney | cold migration in what context? | 15:06 |
sean-k-mooney | mriedem: stephenfin is on PTO until next week | 15:06 |
sean-k-mooney | mriedem: he left his znc bouncer running | 15:06 |
mriedem | cfriesen: as alex pointed out, | 15:07 |
mriedem | i don't see the request spec numa topologies field updated before it goes through the scheduler | 15:07 |
sean-k-mooney | mriedem: oh alex_xu was asking about numa topology changes for resize/cold migration. it "should" work upstream | 15:08 |
*** dpawlik has quit IRC | 15:08 | |
mriedem | https://github.com/openstack/nova/blob/master/nova/scheduler/filters/numa_topology_filter.py#L74 | 15:08 |
*** sridharg has quit IRC | 15:08 | |
mriedem | i don't see RequestSpec.numa_topology updated from the new flavor *before* we hit the numa filter | 15:08 |
mriedem | so i don't see how the scheduling is working | 15:09 |
sean-k-mooney | hum so you think it's using the old flavor perhaps | 15:09 |
sean-k-mooney | i can try testing this in an hour or so | 15:09 |
jmlowe | Has anybody had trouble with the placement client in nova compute hanging? | 15:10 |
sean-k-mooney | i just need to get my dev environment running after the weekend | 15:10 |
mriedem | jmlowe: never heard of that | 15:10 |
jmlowe | Specifically nova.compute.resource_tracker._update_available_resource was holding a lock for about 2 min | 15:10 |
*** dpawlik has joined #openstack-nova | 15:10 | |
jmlowe | Sometimes it would do it and sometimes not | 15:11 |
jmlowe | no errors anywhere | 15:11 |
cdent | jmlowe: how many instances on that compute node? | 15:12 |
jmlowe | It did wreak havoc with instance launches | 15:12 |
cdent | also, are you sure it was talking to placement where things were stuck, or just somewhere in _update*? | 15:12 |
cfriesen | mriedem: sorry, had to answer a call from my boss. let me take a quick look | 15:13 |
* cdent passed jmlowe some bourbon | 15:13 | |
jmlowe | I got it down to 2 min to run _update_inventory in nova/scheduler/client/report.py | 15:14 |
mriedem | jmlowe: how many instances on that compute? | 15:15 |
jmlowe | it seems to have made the http put call correctly, but then just waits for tcp timeout | 15:15 |
mriedem | libvirt or vcenter or ironic? | 15:15 |
jmlowe | it's all of my 280 computes, so 0 - 24 | 15:15 |
jmlowe | libvirt | 15:15 |
mriedem | hmm, well there is a lock in the resource tracker when that is called, which will make things in nova-compute slow to a crawl, | 15:16 |
mriedem | but why the placement response would be so slow idk, | 15:16 |
mriedem | have you traced the request via request ID in the placement api logs? | 15:16 |
mriedem | also, which release? | 15:16 |
jmlowe | yes, nothing in placement takes more than a second or so | 15:16 |
jmlowe | queens | 15:16 |
jmlowe | seems like there's a bug in some underlying client library that occasionally does not return from an http call when no timeout is set | 15:18 |
mriedem | hmm, nova is using keystoneauth1 to send the requests to placement | 15:18 |
mriedem | you might be able to enable some debug logging there | 15:18 |
cdent | jmlowe: I think you should make jeremy find and fix this | 15:19 |
* fungi wonders what he broke now | 15:20 | |
cdent | not you fungi | 15:20 |
*** lpetrut has quit IRC | 15:20 | |
jmlowe | I did, then went on a 6 week road trip | 15:20 |
fungi | yay! for once something's not my fault | 15:20 |
jmlowe | then he went | 15:20 |
cdent | The jeremy of which I speak is an old friend (on the order of 30 years) | 15:21 |
cdent | Typical of him. | 15:21 |
cfriesen | mriedem: alex_xu: I think it's this code that updates the flavor on a resize: https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L304-L316 | 15:21 |
mriedem | cfriesen: but that doesn't update the RequestSpec.numa_topology field | 15:22 |
mriedem | the else block | 15:22 |
mriedem | or RequestSpec.pci_requests for that matter | 15:22 |
mriedem | so if resize with a new numa topology works today, it's getting lucky b/c the scheduler picks a host that fits the original topology, and then the RT.move_claim on the compute fits the new requested numa topology | 15:23 |
*** mvkr has joined #openstack-nova | 15:23 | |
mriedem | as far as i can tell anyway | 15:23 |
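The suspected behaviour described above can be sketched as a toy model: the scheduler filters on the stale topology while the compute-side claim checks the new one. Every name here (`Host`, `scheduler_pick`, `move_claim`, `refresh_spec`) is hypothetical and only illustrates the reasoning; none of it is nova code.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_numa_cells: int  # hypothetical stand-in for a host's NUMA capacity


def scheduler_pick(hosts, requested_cells):
    """Toy NUMA filter: keep hosts that fit the topology in the request spec."""
    return [h for h in hosts if h.free_numa_cells >= requested_cells]


def move_claim(host, requested_cells):
    """Toy compute-side claim: succeeds only if the new topology really fits."""
    return host.free_numa_cells >= requested_cells


def resize(hosts, old_cells, new_cells, refresh_spec):
    # refresh_spec=False models the suspicion in the discussion: the
    # scheduler still sees the topology derived from the old flavor.
    spec_cells = new_cells if refresh_spec else old_cells
    candidates = scheduler_pick(hosts, spec_cells)
    # The claim always uses the new flavor's topology.
    return [(h.name, move_claim(h, new_cells)) for h in candidates]


hosts = [Host("small", 1), Host("big", 2)]
stale = resize(hosts, old_cells=1, new_cells=2, refresh_spec=False)
fresh = resize(hosts, old_cells=1, new_cells=2, refresh_spec=True)
```

With the stale spec, the host named `small` passes the filter and only fails at claim time, so the resize succeeds only when the scheduler happens to pick `big` or a retry lands there. With a refreshed spec, `small` is never a candidate.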
sean-k-mooney | mriedem: well we throw away all the host pinning info and recalculate it on the compute node anyway so it is likely working because it gets lucky or we retry | 15:24 |
cdent | jmlowe: do you have load balancers or proxies between n-cpu and placement? | 15:24 |
jmlowe | I do | 15:25 |
cdent | connections perhaps not closing? | 15:25 |
mriedem | cdent: fwiw, that ENABLED_PYTHON3_PACKAGES variable in devstack looks like it should be killed now | 15:25 |
mriedem | "# Special case some services that have experimental | 15:25 |
mriedem | # support for python3 in progress, but don't claim support | 15:25 |
mriedem | # in their classifier" | 15:25 |
cdent | mriedem: it does come into play | 15:25 |
mriedem | yeah i see where it's used | 15:25 |
mriedem | but i'm just thinking it shouldn't exist anymore | 15:25 |
jmlowe | probably, I'd expect the http client to close the connection once it received a 200 response | 15:25 |
mriedem | given the py3 first goal | 15:25 |
mriedem | e.g. neutron isn't in that list | 15:26 |
mriedem | nor keystone | 15:26 |
*** munimeha1 has joined #openstack-nova | 15:26 | |
sean-k-mooney | mriedem: well it either should not exist or default to true right | 15:26 |
cdent | mriedem: yes, I agree that it appears to be an anachronism that shouldn't be used at all, but that it sometimes does get used | 15:26 |
*** moshele has quit IRC | 15:26 | |
cdent | it is currently what makes my fix work | 15:26 |
mriedem | i shall put out the dhellmann signal | 15:26 |
* cdent wonders what shape that takes | 15:26 | |
sean-k-mooney | mriedem: there are some projects that don't run on python3 i believe | 15:26 |
*** Luzi has quit IRC | 15:26 | |
jmlowe | Testing now, but it seems I can mitigate all of this with the timeout setting in the placement section of nova.conf | 15:26 |
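A minimal sketch of why a client-side timeout mitigates the symptom, assuming a peer that accepts the connection and then never answers. It uses only the Python standard library rather than keystoneauth1, and `update_inventory`, `resource_lock`, and the `/inventories` path are hypothetical stand-ins for the locked resource tracker update.

```python
import http.client
import socket
import threading
import time


def start_silent_server():
    """Listen on a free local port and never reply to anything."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(5)  # connections complete in the backlog; nothing reads them
    return srv, srv.getsockname()[1]


resource_lock = threading.Lock()


def update_inventory(port, timeout):
    """Hold the lock for the whole PUT; return (outcome, seconds_held)."""
    start = time.monotonic()
    with resource_lock:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout)
        try:
            conn.request("PUT", "/inventories", body=b"{}")
            conn.getresponse()  # blocks until a reply arrives or the timeout hits
            outcome = "ok"
        except socket.timeout:
            outcome = "timed out"
        finally:
            conn.close()
    return outcome, time.monotonic() - start


server, port = start_silent_server()
outcome, held = update_inventory(port, timeout=0.5)
server.close()
```

Without a timeout the call would sit inside the lock until the TCP stack gave up; with one, the lock is released after roughly the configured interval and the caller can retry.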
mriedem | cdent: a peach? | 15:27 |
mriedem | jmlowe: do you have to set connection timeouts for other nova interactions, like with glance? | 15:28 |
jmlowe | I've never needed them before | 15:28 |
*** tbachman has joined #openstack-nova | 15:29 | |
mriedem | weird. i think in queens we were going through ksa for glance interactions as well | 15:29 |
mriedem | so it should all be the same | 15:29 |
mriedem | from a client perspective i mean, | 15:29 |
mriedem | just different api servers | 15:29 |
jmlowe | I have been seeing the occasional ssl error in neutron clients, so could even be as simple as some sort of bug in haproxy | 15:31 |
cfriesen | mriedem: sean-k-mooney: yeah, I think you might be right, and most of the time it gets fixed up by the claim. ick. | 15:31 |
sean-k-mooney | cfriesen: mriedem we need to do a cleanup/audit of the migration codepaths in general. artom's live migration spec will help but we likely have some cleanup to do for resize/cold migrate too. | 15:33 |
cdent | jmlowe: what's hosting your placement service? mod_wsgi, uwsgi, something else? | 15:34 |
sean-k-mooney | cfriesen: mriedem i know in practice cold migrate and resize "usually" work and result in the numa topology being recalculated but it may be just getting lucky or relying on retries. when we model numa in placement this will change and we will have to correct the behavior for that | 15:35 |
sean-k-mooney | cfriesen: by the way i don't know if you saw my ML post regarding the pci numa affinity policies | 15:36 |
cdent | efried: thanks for the +w on the external placement in nova change. I assume we want to let its child ( https://review.openstack.org/#/c/618215/ ) sit until more things have settled | 15:36 |
cfriesen | sean-k-mooney: glanced at it. haven't had a chance to dig in to it, they've got me focused on other stuff at the moment. | 15:36 |
sean-k-mooney | cfriesen: i did not test all the policies but at least the prefer policy does not work as intended so we will need to fix that | 15:37 |
efried | cdent: Been avoiding that one until "WIP" goes away. | 15:37 |
sean-k-mooney | + add the ability for it to work with neutron sriov ports | 15:37 |
efried | cdent: Basically taking any excuse to defer reviews while I try to get my feet back under me. | 15:38 |
cdent | efried: i've left it wip in expectation of it needing to hold. How do I mark something "i'd like reviews but this isn't done"? But yeah, understand the need to defer. | 15:38 |
cdent | (sometimes its the concept not the implementation that needs the review) | 15:38 |
efried | cdent: If you'd like me to -2 it, I could do that. Or you could -W it. Not sure either would help get it more attention, though. | 15:39 |
mriedem | cdent: should probably send a separate email to the ML about https://review.openstack.org/#/c/617941/ so nova people not paying much attention to placement know what's going on with tests now | 15:39 |
efried | what are we actually waiting on? | 15:39 |
*** dpawlik has quit IRC | 15:39 | |
cdent | mriedem: aye aye | 15:39 |
mriedem | b/c when i rebase and my tests start failing i'm going to wonder wtf | 15:40 |
* cdent nods | 15:40 | |
sean-k-mooney | cdent: mriedem efried is https://review.openstack.org/#/c/599208/ still considered a requirement before the placement extraction is complete | 15:41 |
cdent | yes | 15:41 |
sean-k-mooney | is there a list of what is outstanding beyond that or is that the final feature | 15:42 |
mriedem | https://etherpad.openstack.org/p/BER-placement-extract | 15:42 |
sean-k-mooney | mriedem: cool thanks | 15:42 |
mriedem | bauzas: did you get a functional test written that does a reshape and then schedules to the child provider inventory resource? | 15:43 |
mriedem | cdent: why were these tests removed? https://review.openstack.org/#/c/617941/21/nova/tests/unit/cmd/test_status.py | 15:44 |
mnaser | https://review.openstack.org/#/c/619349/ simple backport if anyone has a second (i'll babysit the rest) | 15:44 |
mriedem | lyarwood: ^ just consider my backport a proxy +2 there | 15:46 |
cdent | mriedem: because the tests uses the rp_objects directly | 15:46 |
cdent | the follow on patch may even remove the status command since it can no longer work | 15:46 |
cdent | there's a fixme added in cmd/status.py | 15:47 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Send RP uuid in the port binding https://review.openstack.org/569459 | 15:47 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Test boot with more ports with bandwidth request https://review.openstack.org/573317 | 15:47 |
mriedem | cdent: ok but the rp objects aren't removed in that patch, so it seemed out of place | 15:47 |
mriedem | we could count rps using the placement api right? | 15:47 |
mriedem | or using the fixture | 15:47 |
cdent | I think that one can still work, but not the inventory-related one | 15:48 |
cdent | I went through so many iterations on that series of changes, I may have lost my place | 15:48 |
cdent | I assumed then (and still do) that we'll need to do some "dynamic tidying" | 15:49 |
mriedem | on that size of change i can see why something would get lost | 15:49 |
mriedem | it just stuck out to me looking at it fresh | 15:49 |
* cdent nods | 15:51 | |
cdent | my original plan was to not remove anything from nova in the first change, and just change the tests, but it proved too confusing when debugging. | 15:52 |
mriedem | anyway, i think the command could count compute resource providers by providers with VCPU inventory as it does today | 15:52 |
mriedem | and i know you have a spec for filtering that way as well | 15:52 |
cdent | I abandoned that spec because people weren't sure that was a sufficient use case, which confused me | 15:52 |
mriedem | heh, well we have a use case right here | 15:52 |
mriedem | nova-status upgrade check already hits the placement REST API to check the version in _check_placement | 15:53 |
mriedem | anywhere | 15:53 |
mriedem | *anyway | 15:53 |
mriedem | we might want to undo https://review.openstack.org/#/c/617941/21/nova/tests/unit/cmd/test_status.py in a follow up | 15:54 |
mriedem | unless we remove that upgrade check, but that's another discussion | 15:54 |
cdent | well, we'll have to at least change the upgrade check, in which case the test will have to change so it does rather go together | 15:54 |
*** dpawlik has joined #openstack-nova | 15:55 | |
mriedem | sure, the test has to exist to change it though | 15:55 |
mriedem | unless you're agreeing with me | 15:55 |
*** janki has quit IRC | 15:55 | |
cdent | I'm mostly agreeing with you, except for the part about it being another discussion | 15:55 |
cdent | if we're going to remove the command, then job done | 15:55 |
mriedem | it depends on what nova-compute does on startup - if it can't create a resource provider b/c placement isn't running or nova-compute isn't configured to talk to it, then we can probably remove the check | 15:56 |
mriedem | otherwise it's useful as a base install verification that you've got computes reporting in and resource providers for those computes | 15:56 |
mriedem | FFU makes me nervous about when we can remove these things... | 15:57 |
mriedem | b/c anyone can FFU from any release to another presumably | 15:57 |
mriedem | having said that, i'm not sure these would work for FFU'ers anyway b/c the placement api would likely be down | 15:58 |
mriedem | dansmith: is that correct? we can expect placement-api to be down during an FFU? | 15:58 |
*** dpawlik has quit IRC | 15:59 | |
sean-k-mooney | cdent: placement does not randomise allocation candidates by default, correct? they are just returned in db order? | 16:00 |
mriedem | https://docs.openstack.org/nova/latest/configuration/config.html#placement.randomize_allocation_candidates | 16:00 |
*** tbachman has quit IRC | 16:01 | |
sean-k-mooney | mriedem: thanks ya it's off by default. i was wondering if that could be related to this ML post http://lists.openstack.org/pipermail/openstack-discuss/2018-November/000209.html | 16:02 |
sean-k-mooney | that said i would have expected the default weigher to kick in and spread the instances | 16:03 |
cdent | mriedem: yeah, I'm wondering if coupling nova-status to placement status is a good/safe idea? If they are supposed to be independently upgraded (for some value of "independent") maybe placement-status should do some kind of check? Except that we expect placement to upgrade first (usually). /me throws hands | 16:04 |
cdent | sean-k-mooney: that config item was left as not random by default so as to encourage/allow pack, what would the weigher be doing? | 16:05 |
cdent | changing the config is certainly something they could at least try | 16:05 |
sean-k-mooney | cdent: the weighers used to spread by default | 16:05 |
cdent | my impression was that most of the system was biased towards packing to save $$ | 16:05 |
sean-k-mooney | cdent: i think they still do | 16:05 |
cdent | probably still worth trying the config setting just to see? | 16:06 |
mriedem | cdent: i left a note to self / open question in the check code so i can maybe come back to it some other time when it comes up, or is a more pressing decision that needs to be made | 16:06 |
sean-k-mooney | ya i was also going to ask them to provide the nova.conf the scheduler is using not the compute node one | 16:07 |
cdent | good point sean-k-mooney | 16:07 |
cdent | mriedem: ✔ | 16:07 |
dansmith | mriedem: not just expect, but require | 16:07 |
mriedem | ok yeah then in that case, nova-status upgrade check will not work during FFU today | 16:09 |
mriedem | it will fail the placement check since the API won't be up | 16:10 |
dansmith | I thought nova-status was to be db-only anyway? I guess not once placement is a different db, eh? | 16:10 |
*** udesale has quit IRC | 16:10 | |
mriedem | there is a check for the minimum placement version | 16:11 |
mriedem | otherwise it looks in the nova_api db yeah | 16:11 |
dansmith | so just have to make the api-bound check graceful I suppose | 16:11 |
mriedem | was discussing if we should change that to hit the placement API now, or just remove the placement-related checks | 16:11 |
mriedem | the upgrade checkers things were written before FFU was really a consideration i think | 16:12 |
mriedem | so i haven't put a ton of thought into it | 16:12 |
dansmith | well, I thought the plan was for that thing to be db-only anyway, so it wouldn't matter, but the scope has definitely gotten larger as we find more things for it to do | 16:12 |
mriedem | e.g. if we start saying that nova requires cinder >= 3.44 to remove our api compat code, presumably we'd want an upgrade check for that | 16:12 |
mriedem | yeah | 16:12 |
dansmith | well, if the api is up and you're not doing an ffu, it'd be nice to get that check, you just need to be super graceful I think | 16:13 |
dansmith | maybe a new category, peer to warning, error, for "manual check" or "unknown" ? | 16:13 |
dansmith | like "I would check this for you, but I can't so be sure you check it" | 16:13 |
mriedem | i was thinking it'd just be a warning | 16:13 |
mriedem | warning is already "this might be a problem, but i'm not sure" | 16:14 |
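Note: the "be graceful, report a warning" idea discussed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not nova's actual nova-status code: the `Code` enum, the `check_placement_version` helper and the `get_version` callable are all hypothetical names, and the minimum version is just a parameter.

```python
import enum
from typing import Callable, Tuple


class Code(enum.Enum):
    """Hypothetical result levels for an upgrade check."""
    SUCCESS = 0
    WARNING = 1
    FAILURE = 2


def check_placement_version(
    get_version: Callable[[], Tuple[int, int]],
    minimum: Tuple[int, int],
) -> Tuple[Code, str]:
    """Compare the reported placement version against a minimum.

    get_version is a hypothetical callable that asks the placement API for
    its maximum supported version and raises ConnectionError when the API
    cannot be reached (for example while it is down during an FFU).
    """
    try:
        found = get_version()
    except ConnectionError as exc:
        # Unreachable API: "might be a problem, but not sure", so warn
        # instead of failing and ask the operator to check manually.
        return Code.WARNING, "placement unreachable (%s); check manually" % exc
    if found < minimum:
        return Code.FAILURE, "placement %d.%d is older than %d.%d" % (found + minimum)
    return Code.SUCCESS, "placement %d.%d is new enough" % found


def _placement_down() -> Tuple[int, int]:
    # Stand-in for a placement API that is not running.
    raise ConnectionError("connection refused")


down_result = check_placement_version(_placement_down, (1, 10))
old_result = check_placement_version(lambda: (1, 4), (1, 10))
ok_result = check_placement_version(lambda: (1, 30), (1, 10))
```

Whether deployment tooling treats a warning differently from a failure is a separate question, as raised in the conversation.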
dansmith | do the deployment things hork on warning? | 16:14 |
dansmith | if not, then cool | 16:14 |
mriedem | do the deployment things run the upgrade checkers... heh | 16:14 |
mriedem | osa runs it | 16:14 |
mriedem | i don't think tripleo does | 16:14 |
dansmith | owalsh_: ? | 16:14 |
mriedem | btw, i ask this at least once every 2 weeks in here :) | 16:14 |
dansmith | you said you didn't know :) | 16:15 |
mriedem | rhetorical | 16:15 |
mriedem | i know tripleo doesn't run it | 16:15 |
dansmith | okay | 16:15 |
mriedem | http://codesearch.openstack.org/?q=nova-status&i=nope&files=&repos= | 16:15 |
mriedem | grenade, kolla-ansible and osa | 16:15 |
mriedem | so i was going to say, | 16:16 |
mriedem | unless/until someone comes along saying, "hey this doesn't work for me during FFU" i don't have much motivation to care about making it graceful in those cases | 16:16 |
dansmith | :/ | 16:17 |
dansmith | does it stack trace now? | 16:17 |
mriedem | if placement is down? | 16:17 |
mriedem | it will return a failure | 16:17 |
dansmith | yeah | 16:17 |
*** mschuppert has joined #openstack-nova | 16:17 | |
dansmith | oh you mean it just reports error instead of your proposed warning? | 16:17 |
mriedem | https://github.com/openstack/nova/blob/master/nova/cmd/status.py#L230 | 16:17 |
mriedem | yes | 16:17 |
mriedem | there is the gauntlet of ksa exceptions, | 16:18 |
mriedem | so without testing it in devstack i'm not exactly sure | 16:18 |
dansmith | okay | 16:18 |
*** itlinux has joined #openstack-nova | 16:18 | |
dansmith | well, if it just reports error that's reasonable enough I think | 16:18 |
mriedem | that likely means any tooling running it during an FFU should ignore the results, which makes me wonder why even run it during FFU | 16:19 |
* cdent strokes chin | 16:19 | |
mriedem | but yeah, i just don't have the energy to figure out what our official process/stance is for upgrade checkers during FFU :) | 16:19 |
mriedem | especially that now it's a community wide goal | 16:20 |
mriedem | and FFU is still very nebulous to me | 16:20 |
dansmith | well, | 16:20 |
dansmith | the checking of things that have to be done before moving on is pretty critical to FFU | 16:20 |
mriedem | my most basic understanding is take all control plane services down, and roll through each release running data migrations and schema migrations | 16:21 |
dansmith | I think everyone right now is just doing it very manually, including the deployment projects | 16:21 |
dansmith | lyarwood might be a good person to ask about this | 16:21 |
dansmith | lyarwood: mschuppert: the question is why tripleo doesn't run nova-status during any upgrade, including ffu, even if just to collect/log the status | 16:22 |
dansmith | and/or I guess: if/do tripleo people use it whilst trying to get a particular N->M transition working, and then just not run it programmatically for everyone, assuming they have the steps perfected? | 16:23 |
lyarwood | dansmith / mriedem ; no reason, I did push an example up for the upgrades team a while ago and asked them to take it forward but I assume they just didn't follow up | 16:23 |
lyarwood | this came up again at PTG, didn't we create tripleo bugs to track this during S? | 16:24 |
*** hshiina has quit IRC | 16:24 | |
* lyarwood looks | 16:24 | |
*** hshiina has joined #openstack-nova | 16:24 | |
lyarwood | https://bugs.launchpad.net/tripleo/+bug/1777060 | 16:24 |
openstack | Launchpad bug 1777060 in tripleo "nova-status should be used during deployment and upgrades" [High,New] - Assigned to Lee Yarwood (lyarwood) | 16:24 |
*** annp has quit IRC | 16:25 | |
dansmith | cool | 16:25 |
*** dave-mccowan has quit IRC | 16:27 | |
mriedem | until we actually have any kind of FFU ci testing it's also hard for me to care a ton about stuff like this | 16:28 |
mriedem | i mean, i don't want to lose sleep over it | 16:28 |
mriedem | when i have so many other things i can lose sleep over | 16:28 |
*** dave-mccowan has joined #openstack-nova | 16:28 | |
dansmith | we could easily just run it during grenade before we bring things back up and log the output right? | 16:29 |
mriedem | we do run nova-status upgrade check during grenade | 16:30 |
dansmith | but not when *everything* is down right? | 16:30 |
dansmith | only during nova-upgrade? | 16:30 |
mriedem | http://git.openstack.org/cgit/openstack-dev/grenade/tree/projects/60_nova/upgrade.sh#n88 | 16:30 |
dansmith | right, | 16:31 |
mriedem | we specifically bring placement up before running the check | 16:31 |
*** gyee has joined #openstack-nova | 16:31 | |
dansmith | right | 16:31 |
dansmith | and other projects before us would be up (i.e. keystone) | 16:31 |
mriedem | yeah i mean i could run it before starting placement | 16:31 |
mriedem | and see it fail | 16:31 |
dansmith | actually verify preupgrade might run with nothing | 16:31 |
dansmith | sorry verify_noapi preupgrade | 16:32 |
dansmith | https://review.openstack.org/620104 | 16:33 |
dansmith | mriedem: anyway, don't lose sleep over it, I'll check in on that later to see how it goes | 16:34 |
mriedem | ack thanks | 16:34 |
*** tbachman has joined #openstack-nova | 16:38 | |
*** tbachman has quit IRC | 16:39 | |
*** tbachman has joined #openstack-nova | 16:40 | |
openstackgerrit | Adam Spiers proposed openstack/nova-specs master: Add spec for libvirt driver launching AMD SEV-encrypted instances https://review.openstack.org/609779 | 16:45 |
*** jangutter has quit IRC | 16:53 | |
mriedem | i guess we never documented anywhere officially that we only support n-1 computes.. | 16:59 |
mriedem | even though it comes up every so often | 16:59 |
sean-k-mooney | computes older than n-1 may work in some cases however we just dont test them | 17:00 |
mriedem | yes i know that. | 17:01 |
mriedem | what i'm asking is, didn't we ever document this because the last time it came up, i thought we said someone would document it. | 17:01 |
mriedem | which is i guess why it didn't get done. | 17:02 |
mriedem | https://docs.openstack.org/nova/rocky/contributor/project-scope.html?highlight=compatibility#upgrade-expectations is about as close as it gets | 17:03 |
mriedem | fuck i hate that banner | 17:03 |
*** whoami-rajat has joined #openstack-nova | 17:03 | |
sean-k-mooney | we dont state it explicitly in https://docs.openstack.org/nova/rocky/user/upgrade.html#rolling-upgrade-process but we do mention n to n+1 a few times | 17:04 |
cdent | oh yeah that banner doth suck | 17:04 |
*** jmlowe has quit IRC | 17:04 | |
sean-k-mooney | e.g. in relation to db changes | 17:04 |
cdent | "someone" has a lot of work on their plate | 17:04 |
mriedem | https://docs.openstack.org/nova/rocky/contributor/process.html?highlight=compatibility#smooth-upgrades | 17:04 |
sean-k-mooney | mriedem: ok so we do say we "only support upgrades between N and N+1 major versions, to reduce technical debt relating to upgrades" | 17:05 |
mriedem | yes, that's good enough for me | 17:06 |
mriedem | the question in -dev and the ML is if that also applies to inter-service compat | 17:06 |
mriedem | e.g. nova and cinder | 17:06 |
mriedem | and i don't think it should | 17:06 |
mriedem | b/c we have versioned REST APIs | 17:06 |
sean-k-mooney | mriedem: right if the rest apis are versioned correctly it should not | 17:07 |
mriedem | which means you shouldn't have to take down your entire cloud to upgrade nova | 17:07 |
mriedem | i.e. you can leave cinder n-2 and upgrade nova and it should work | 17:07 |
mriedem | we don't test it, but it should work | 17:07 |
mriedem | unless otherwise noted as we've dropped some compat | 17:08 |
sean-k-mooney | yes you should be able to do a service wise rolling upgrade | 17:08 |
sean-k-mooney | and you should be able to skip upgrading some services if you dont need to | 17:08 |
sean-k-mooney | i generally parsed the version n controlplane with n-1 agents compatibility to only apply within a single service | 17:09 |
*** mvkr has quit IRC | 17:10 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: [stable-only] Add report_ironic_standard_resource_class_inventory option https://review.openstack.org/620111 | 17:11 |
mriedem | smcginnis: cdent: regarding that question about nova requiring cinder >= rocky, i'll likely drop the API compat we have for cinder < rocky (really queens b/c nova-api checks for cinder 3.44 which was added in queens), with a release note and potentially an upgrade check to look at the service catalog and make sure cinder >= 3.44 is available | 17:13 |
* mriedem adds it to the todo list | 17:14 | |
smcginnis | mriedem: Queens should be a good point. I would think from there we can probably clean up a lot of code. | 17:14 |
mriedem | still need my patches to migrate old bdm attachments, as discussed in berlin, | 17:15 |
mriedem | or do something online when the attachments are used, but i haven't put brain power into that | 17:16 |
mriedem | definitely need https://review.openstack.org/#/c/541420/ for bfv though | 17:16 |
*** helenfm has quit IRC | 17:16 | |
*** imacdonn has quit IRC | 17:18 | |
*** imacdonn has joined #openstack-nova | 17:19 | |
*** KeithMnemonic has joined #openstack-nova | 17:30 | |
KeithMnemonic | mriedem is there a chance to get some cores to review your patch https://review.openstack.org/#/c/614872/1 ? | 17:31 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: [stable-only] Add report_ironic_standard_resource_class_inventory option https://review.openstack.org/620113 | 17:31 |
mriedem | KeithMnemonic: queens needs to go first https://review.openstack.org/#/c/614868/ | 17:32 |
mriedem | but yeah dansmith lyarwood https://review.openstack.org/#/q/I98a2785c07f7af02ad83650c72d9e1868290ece4 | 17:32 |
mriedem | easy backports | 17:32 |
KeithMnemonic | thanks! | 17:33 |
mriedem | yw | 17:34 |
mriedem | thanks for the reminder | 17:34 |
sean-k-mooney | anyone know a better tool to search irc logs than google's site search | 17:35 |
cdent | edleafe made a thing, but I don't know if he made it live | 17:35 |
*** imacdonn has quit IRC | 17:36 | |
edleafe | sean-k-mooney: It's still rough, but you can try https://ircsearch.leafe.com | 17:36 |
*** imacdonn has joined #openstack-nova | 17:36 | |
*** sambetts is now known as sambetts|afk | 17:36 | |
sean-k-mooney | edleafe: does that use a local copy of the logs or does it search eavesdrop.openstack.org | 17:37 |
edleafe | sean-k-mooney: it uses its own elasticsearch database | 17:38 |
openstackgerrit | Adrian Chiris proposed openstack/nova master: add get_pci_request_from_vif to request.py https://review.openstack.org/609166 | 17:39 |
openstackgerrit | Adrian Chiris proposed openstack/nova master: Allow per-port modification of vnic_type and profile https://review.openstack.org/607365 | 17:39 |
openstackgerrit | Adrian Chiris proposed openstack/nova master: Add get_instance_pci_request_from_vif https://review.openstack.org/619929 | 17:39 |
openstackgerrit | Adrian Chiris proposed openstack/nova master: SR-IOV Live migration indirect port support https://review.openstack.org/620115 | 17:39 |
mriedem | use_cow_images=true and force_raw_images=true (defaults) is always confusing | 17:44 |
sean-k-mooney | edleafe: thanks i found the message from bauzas i was looking for but looks like he did not paste the placmetend output publicly https://ircsearch.leafe.com/timeline-middle/%23openstack-nova/2018-10-03T17:18:56 | 17:46 |
adrianc | sean-k-mooney: Hi, added you to the above commits. ive also commented on the related spec: https://review.openstack.org/#/c/605116/ | 17:46 |
sean-k-mooney | adrianc: hi i was looking at the previous version earlier today. im planning to spend tomorrow testing what you have pushed so far | 17:47 |
sean-k-mooney | adrianc: is it in a functional state | 17:47 |
*** jackding has quit IRC | 17:48 | |
edleafe | sean-k-mooney: glad it was useful for you. I wrote it because I was annoyed that I couldn't find info from a conversation | 17:48 |
*** imacdonn has quit IRC | 17:48 | |
sean-k-mooney | edleafe: ya i spent 20 mins looking for it with google's site: feature and did not find it | 17:49 |
adrianc | sean-k-mooney: ive tested with SRIOV MacVtap, however it is required that the neutron mech driver support multiple port bindings | 17:49 |
sean-k-mooney | edleafe: i found it using your search in 90 seconds or so | 17:49 |
sean-k-mooney | adrianc: without the neutron change it should migrate but the port status will be down, correct? | 17:50 |
adrianc | sean-k-mooney: i have a POC patch for neutron sriovnicswitch if you are interested | 17:50 |
edleafe | sean-k-mooney: the fulltext search in elasticsearch is awesome | 17:50 |
sean-k-mooney | adrianc: sure if you have it pushed i can pull it down and test with that also | 17:50 |
adrianc | sean-k-mooney: without it the port will be down and no MAC will be allocated to the VF, ill push it as POC code for neutron | 17:52 |
sean-k-mooney | adrianc: i normally dont have meetings on tuesday so it's the day i set aside to test complicated stuff end to end so i am happy to pull in all the changes you have and try and replicate it locally | 17:52 |
sean-k-mooney | adrianc: you should still have a mac at least in the libvirt xml | 17:52 |
*** derekh has quit IRC | 17:52 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/rocky: Use long_rpc_timeout in select_destinations RPC call https://review.openstack.org/620121 | 17:53 |
sean-k-mooney | adrianc: without the neutron change the only stuff that wont work is the operations performed by the sriovnic agent | 17:53 |
sean-k-mooney | the vf mac is set by nova in the libvirt xml | 17:53 |
sean-k-mooney | that is based on the neutron port and should not be affected by the port's bindings or the port status | 17:54 |
adrianc | sean-k-mooney: yes, you are right, the issue is IIRC, the MAC on the source is not zeroed | 17:54 |
*** imacdonn has joined #openstack-nova | 17:55 | |
adrianc | did it a while back, but i remember it was needed :) | 17:55 |
sean-k-mooney | adrianc: ah when the vf is unbound it keeps the vm mac | 17:55 |
*** moshele has joined #openstack-nova | 17:55 | |
adrianc | ya | 17:55 |
sean-k-mooney | adrianc: that is assuming the vf is rebound to the kernel driver and does not stay bound to vfio-pci | 17:55 |
sean-k-mooney | sorry never mind | 17:56 |
sean-k-mooney | with macvtap its not bound to vfio-pci | 17:56 |
*** k_mouza_ has joined #openstack-nova | 17:56 | |
sean-k-mooney | ok well regarding your question on the spec https://review.openstack.org/#/c/605116/6/specs/stein/approved/libvirt-neutron-sriov-livemigration.rst@111 | 17:57 |
sean-k-mooney | i was leaning towords option 2 | 17:57 |
sean-k-mooney | using a new pci request with an new uuid | 17:57 |
sean-k-mooney | but i was hoping to avoid data model changes | 17:58 |
*** k_mouza has quit IRC | 17:58 | |
sean-k-mooney | ill keep both options in mind when looking at your code. | 17:59 |
adrianc | you will need to keep that request_id somewhere | 17:59 |
dansmith | mriedem: looks like artom answered questions and pushed up a tweak to this and you were +0.9 before.. can you circle back? https://review.openstack.org/#/c/599587/ | 18:00 |
sean-k-mooney | adrianc: i was wondering could we use a uuid5 that is derived from the host+vif port id | 18:00 |
mriedem | dansmith: yeah, was thinking about it in my mental queue earlier | 18:00 |
dansmith | mriedem: ack, thanks | 18:00 |
dansmith | bauzas: I assume you're going to be the +W on that? | 18:01 |
adrianc | sean-k-mooney: imposing a certain logic on the uuid creation doesnt sound like something that will fly, is there a precedent in nova ? | 18:02 |
bauzas | dansmith: mriedem: sorry folks, was on some internal issue | 18:03 |
sean-k-mooney | adrianc: neutron are using uuid5's for generating the placement uuids for bandwidth aware scheduling. | 18:03 |
bauzas | dansmith: and yeah, i was about looking at https://review.openstack.org/#/c/599587/ | 18:03 |
sean-k-mooney | adrianc: i dont know of a precedent in nova for doing the same | 18:04 |
bauzas | mriedem: for the functional test, I didn't have time yet | 18:04 |
* bauzas looks at OSP13 | 18:04 | |
sean-k-mooney | adrianc: but even if we did not generate them deterministically we could put the pci request id in the migration data we pass back | 18:04 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Default zero disk flavor to RULE_ADMIN_API in Stein https://review.openstack.org/603910 | 18:07 |
adrianc | sean-k-mooney: so we extend the LiveMigrateData object, using the same request makes sense as in a way its the same request claimed on a different host. downside is that you have a point in time where you will have a PCI device allocated on the source host and a PCI device claimed on the destination host for the same request ID | 18:08 |
sean-k-mooney | adrianc: this may be a good use case for a uuid5 however. https://docs.python.org/2/library/uuid.html#uuid.uuid5 the namespace uuid would be the neutron port uuid and the name would be the hostname. the address space of UUIDs should be sufficient such that a collision is very unlikely. | 18:09 |
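Note: the uuid5 idea above (namespace = neutron port UUID, name = hostname) can be sketched with the standard library as below. This is only an illustration of the derivation being discussed, not code from nova or neutron; the `pci_request_id` helper, the port UUID and the host names are made up for the example.

```python
import uuid


def pci_request_id(port_id: str, host: str) -> str:
    """Derive a stable request ID from a port UUID and a hostname.

    Hypothetical helper: the port UUID acts as the uuid5 namespace and the
    hostname as the name, so the same (port, host) pair always produces the
    same ID and no extra state has to be stored to recompute it.
    """
    namespace = uuid.UUID(port_id)
    return str(uuid.uuid5(namespace, host))


# Example values, invented for illustration.
port = "2f0d3c5e-6a1b-4c8e-9d47-0b1a2c3d4e5f"
source_id = pci_request_id(port, "compute-1")
dest_id = pci_request_id(port, "compute-2")
```

With this scheme the source and destination hosts get different request IDs for the same port, and either side can recompute both from data it already has.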
sean-k-mooney | adrianc: using the same request id would also work as long as we can clean up | 18:09 |
sean-k-mooney | adrianc: if we could avoid relying on the periodic task that would be better | 18:10 |
sean-k-mooney | adrianc: could we remove the source claim in post migrate | 18:10 |
artom | dansmith, oh hey, thanks for pushing that :) | 18:11 |
*** davidsha_ has quit IRC | 18:13 | |
adrianc | sean-k-mooney: lemme check, p.s https://review.openstack.org/#/c/620123/1 | 18:14 |
*** slaweq_ has joined #openstack-nova | 18:15 | |
mriedem | artom: replies on the previous PS fwiw | 18:17 |
mriedem | i think you're underestimating the claim issue | 18:17 |
mriedem | the claim happens on the dest before live migration starts | 18:17 |
mriedem | and returns to conductor | 18:17 |
mriedem | but it's the source that will need to orchestrate what happens with the claim after a successful or failed live migration | 18:17 |
mriedem | maybe it's as simple as rt.drop_move_claim like you said | 18:18 |
sean-k-mooney | mriedem: artom for what its worth we will have to do the exact same claims dance for the sriov migration as well. the main difference being one is in the pci code and the other is in the numa code | 18:18 |
mriedem | the claims dance is one of my most hated dances | 18:19 |
mriedem | right up there with line dancing | 18:19 |
artom | There are dances you like? | 18:19 |
sean-k-mooney | and river dance if you're an irish person | 18:19 |
mriedem | dude | 18:19 |
mriedem | THE HUMPTY DANCE | 18:19 |
artom | I... Have I been missing out? | 18:19 |
mriedem | dansmith: that reminds me, was it you that didn't get my humpty dance reference while we were strolling through the death camp? | 18:19 |
mriedem | https://www.youtube.com/watch?v=PBsjggc5jHM | 18:20 |
adrianc | sean-k-mooney: to which source claim are you referring ? | 18:20 |
dansmith | mriedem: I knew you were talking about this song, I just didn't get how it related to, uh, mass murder | 18:21 |
mriedem | i don't remember | 18:21 |
artom | mriedem, you can pass the host to drop_move_claim | 18:22 |
adrianc | sean-k-mooney: if you mean removing the free_instance_allocations() in _post_live_migration then yes, but it will be another place we rely on the periodic resource_tracker job | 18:22 |
artom | So even if we call it from the source, we could drop the claim on the... wait, if we call it on the source, we just drop the resources on the source, since the instance is on the dest | 18:23 |
adrianc | artom: Hi, in regards to numa aware live migration, the plan is to converge for stein right ? as the SRIOV live migration will not mean much without it | 18:24 |
sean-k-mooney | adrianc: in option 1 we would have 2 vfs claimed with the same pci request uuid so if we do that i was wondering if we can avoid relying on the periodic heal and proactively release the vf on the source when the migration completes | 18:24 |
sean-k-mooney | adrianc: well the sriov migration should be doable without the numa one | 18:24 |
mriedem | artom: see | 18:25 |
*** ircuser-1 has joined #openstack-nova | 18:25 | |
adrianc | sean-k-mooney: in the PS i am freeing the instance allocation on the source node. | 18:25 |
sean-k-mooney | adrianc: ok ill read what you are currently doing and then ill respond to the question on the spec or update it to match what you have implemented | 18:26 |
adrianc | sean-k-mooney: unless you request dedicated CPUs right ? (previous comment) | 18:27 |
sean-k-mooney | adrianc: yes but we should treat these as separate specs and separate work items but make sure they both work together in the end | 18:28 |
artom | mriedem, wait, so the cell conductor isn't involved at all? It's just superconductor and the source and dest? | 18:28 |
adrianc | sean-k-mooney: i agree, they do not depend, i was just wondering if its planned for stein as well :) | 18:28 |
sean-k-mooney | adrianc: in the simple case of a floating instance with neutron sriov interface there is no numa affinity or numa topology for the guest | 18:28 |
*** jmlowe has joined #openstack-nova | 18:29 | |
sean-k-mooney | adrianc: the numa aware migration is probably more impactful to land in stein than sriov but hopefully both can land | 18:29 |
*** k_mouza has joined #openstack-nova | 18:31 | |
mriedem | artom: yes | 18:31 |
mriedem | superconductor orchestrates everything to find the correct dest host, then kicks things off with an rpc cast to the source | 18:31 |
mriedem | and then source/dest computes just rpc back and forth | 18:31 |
mriedem | there is no reschedule or anything within the cell conductor for live migration | 18:32 |
artom | OK, I need to eat, but I think I'm starting to understand the problem you're explaining, mriedem. Namely: we can't keep a claim context going, so... I guess we'll need to shove it in the migration context, like with cold migration? | 18:32 |
mriedem | i guess? | 18:32 |
mriedem | the instance.migration_context is still a bit of a mystery to me | 18:32 |
mriedem | but i also need to eat | 18:32 |
artom | You and everyone else | 18:32 |
mriedem | dansmith: comments on the numa live migration spec which maybe you can answer, | 18:32 |
mriedem | re: move claims | 18:32 |
artom | I think if Nikola came back today, he'd still know more than all of us combined | 18:32 |
mriedem | on that very hairy part of the code? i agree. | 18:33 |
mriedem | there are also TODOs in there from him about the move claim stuff for reize | 18:33 |
mriedem | *resize | 18:33 |
mriedem | https://github.com/openstack/nova/blob/594c653dc1a312d0364ad24c703e1a9b228133e1/nova/compute/manager.py#L3988 | 18:34 |
mriedem | anyway, turkey leftovers | 18:34 |
*** k_mouza_ has quit IRC | 18:35 | |
*** k_mouza has quit IRC | 18:35 | |
sean-k-mooney | mriedem: when you are back maybe you could weigh in on https://review.openstack.org/#/c/605116/6/specs/stein/approved/libvirt-neutron-sriov-livemigration.rst@111 also. | 18:36 |
*** moshele has quit IRC | 18:40 | |
*** ralonsoh has quit IRC | 18:41 | |
*** adrianc has quit IRC | 18:46 | |
*** tssurya has quit IRC | 18:51 | |
*** slaweq_ has quit IRC | 18:51 | |
*** eandersson has joined #openstack-nova | 18:54 | |
*** dave-mccowan has quit IRC | 19:00 | |
*** dave-mccowan has joined #openstack-nova | 19:01 | |
*** brault has quit IRC | 19:01 | |
*** brault has joined #openstack-nova | 19:04 | |
*** mvkr has joined #openstack-nova | 19:18 | |
*** itlinux has quit IRC | 19:41 | |
*** itlinux has joined #openstack-nova | 20:05 | |
*** erlon has quit IRC | 20:06 | |
*** whoami-rajat has quit IRC | 20:06 | |
*** dave-mccowan has quit IRC | 20:21 | |
*** moshele has joined #openstack-nova | 20:35 | |
*** slaweq_ has joined #openstack-nova | 20:37 | |
*** slaweq_ has quit IRC | 20:37 | |
*** dave-mccowan has joined #openstack-nova | 20:39 | |
*** jmlowe has quit IRC | 20:39 | |
*** moshele has quit IRC | 20:41 | |
*** ivve has quit IRC | 20:56 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Remove ironic/pike note from *_allocation_ratio help https://review.openstack.org/620154 | 20:59 |
*** moshele has joined #openstack-nova | 21:01 | |
*** jmlowe has joined #openstack-nova | 21:12 | |
*** moshele has quit IRC | 21:15 | |
mnaser | so while answering an ML post about all_tenants and friends, i found this TODO since 2015 -- https://github.com/openstack/nova/commit/be41910ac6be28060d9007778fb33766077de59b | 21:15 |
mnaser | do we just drop that part of the code at that point? given it's been uncommented for years now | 21:15 |
*** moshele has joined #openstack-nova | 21:19 | |
artom | mriedem, does the superconductor really reschedule if the live migration fails? I'm looking but I can't find anything in _execute, and in the conductor manager if there's a failure in _live_migrate it just sets an error. | 21:19 |
artom | Not sure it's super relevant to the spec, but for my own personal edification | 21:20 |
mriedem | mnaser: you mean drop it at this point? | 21:21 |
mnaser | mriedem: i think so? i mean it's just dead code for 5 years, do we want to muck around with microversion bumps and blah | 21:22 |
mriedem | mnaser: tbc, you're saying just ditch the commented out code since no one cares enough to change it with a new microversion | 21:23 |
mnaser | yep | 21:23 |
mriedem | shrug | 21:24 |
mriedem | i don't see anyone caring enough to change it | 21:24 |
mriedem | the fact you have to supply the all_tenants parameter to filter on project_id does always confuse me | 21:24 |
mnaser | well | 21:25 |
mriedem | but at least it's documented in the api-ref | 21:25 |
mnaser | the clients work around it now.. | 21:25 |
mnaser | pretty sure you don't need to do that anymore with the cli | 21:25 |
mriedem | https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-list | 21:25 |
mriedem | nova list --tenant will just implicitly add --all-tenants | 21:25 |
mriedem | if that's what you mean by workaround | 21:25 |
mnaser | yeah | 21:26 |
mnaser | https://github.com/openstack/python-openstackclient/blob/master/openstackclient/compute/v2/server.py#L1147-L1153 | 21:26 |
mnaser | same for osc | 21:26 |
mriedem | if you want to push a patch to remove the cruft, fine by me | 21:27 |
mriedem | i might even +2 that | 21:27 |
mriedem | anything to make that method smaller b/c god is it long | 21:27 |
mriedem | bnemec: have you ever heard of requests for something like a PositiveIntOpt or PositiveFloatOpt in oslo.config? we have some options which can be set to 0.0 as the min, but really shouldn't be <= 0. | 21:35 |
mriedem | but we can't really describe that with just min | 21:35 |
dansmith | choices=range(1,1000) ? :P | 21:37 |
bnemec | mriedem: So an opt where min is a < comparison instead of a <=? | 21:37 |
mriedem | something like that | 21:38 |
mriedem | for context https://review.openstack.org/#/c/602804/9/nova/conf/compute.py | 21:38 |
mriedem | initial_cpu_allocation_ratio should never be 0.0 | 21:38 |
dansmith | we really just want a validation function parameter, right? | 21:38 |
mriedem | yeah | 21:38 |
dansmith | we wanted that for something else recently | 21:38 |
openstackgerrit | Mohammed Naser proposed openstack/nova master: Drop cruft code for all_tenants behaviour https://review.openstack.org/620165 | 21:38 |
dansmith | validator=lambda str: ... | 21:38 |
*** xek has quit IRC | 21:39 | |
bnemec | A validator callback seems like something we could do. | 21:41 |
bnemec | Alternatively, in this case min=0.000001 is probably also sane. | 21:41 |
*** awaugama has quit IRC | 21:44 | |
bnemec | You could also create a custom type that did the validation in the constructor. | 21:46 |
bnemec | Subclass Float and put whatever logic you need in there: https://github.com/openstack/oslo.config/blob/master/oslo_config/types.py#L409 | 21:46 |
mriedem | I wasn't sure how kosher subclassing oslo.config opt types was | 21:47 |
bnemec | Then create the opt as Opt(type=MyCustomType, ...). | 21:47 |
bnemec | They're part of the public API so I'd say they're fair game. | 21:47 |
mriedem | ok yeah that's probably cleanest | 21:49 |
bnemec | The danger might be creating a completely new class as a type, which could theoretically be done, but if we ever added to the type API you might get broken. | 21:49 |
bnemec | As long as you inherit from an existing type you should be okay though. | 21:49 |
bnemec | They all descend from a single ABC. | 21:49 |
mriedem | the adam of config opts? | 21:50 |
bnemec | Indeed. | 21:51 |
bnemec | Also, I think I was wrong. You want to do validation in __call__. | 21:51 |
bnemec | That's where we're doing it in the existing types: https://github.com/openstack/oslo.config/blob/master/oslo_config/types.py#L830 | 21:51 |
mriedem | ah yeah https://github.com/openstack/oslo.config/blob/master/oslo_config/types.py#L305 | 21:52 |
bnemec | Yeah, better example. :-) | 21:52 |
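The "validate in `__call__`" idea discussed above can be sketched roughly as follows. This is only an illustration of the pattern: `FloatType` is a hypothetical stand-in written for this example, not the real `oslo_config.types.Float`, and `PositiveFloat` is likewise an invented name. The assumption is simply that a config type is a callable that converts a raw string and raises `ValueError` on bad input.

```python
class FloatType:
    """Hypothetical stand-in for a config type: a callable that converts raw text."""

    def __call__(self, value):
        # float() raises ValueError for text that is not a number.
        return float(value)


class PositiveFloat(FloatType):
    """Hypothetical subclass that only accepts values strictly greater than zero."""

    def __call__(self, value):
        # Let the parent do the conversion, then apply the extra check.
        result = super().__call__(value)
        if result <= 0:
            raise ValueError("Should be greater than 0, got %r" % (value,))
        return result
```

With the real library, the subclass would presumably inherit from the existing Float type and be passed as `type=` when creating the option, as described in the conversation; that wiring is not shown here.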
*** wolverineav has joined #openstack-nova | 21:53 | |
*** moshele has quit IRC | 21:55 | |
*** eharney has quit IRC | 21:56 | |
mriedem | welp https://review.openstack.org/#/c/613126/ kind of blows up https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/initial-allocation-ratios.html#manually-set-placement-allocation-ratios-are-overwritten | 22:03 |
mriedem | how am i not surprised | 22:03 |
mriedem | efried: you'll have some context on ^ | 22:05 |
mriedem | i'm not exactly sure what we should do about it, outside of passing an initial flag to update_provider_tree or something gross like that...although maybe upt can check the provider tree to see if it already has inventory with allocation_ratio set, and if so, don't provide a value unless CONF.*_allocation_ratio is not None | 22:06 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Give drop_move_claim() correct docstring https://review.openstack.org/620170 | 22:09 |
artom | mriedem, ^^ related to the numa live migration spec. I *think* I'm right, and it might clarify the confusion around where we're removing usages. | 22:09 |
mriedem | artom: totally forgot you pinged me earlier, sec | 22:10 |
mriedem | artom: superconductor does not reschedule if live migration fails, no | 22:10 |
mriedem | it reschedules if the pre-checks on the dest/source fail | 22:10 |
artom | Right, sorry, I wasn't being precise enough. Though I can't find it rescheduling *anywhere* | 22:11 |
mriedem | so here conductor asks the scheduler for a host https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L370 | 22:11 |
mriedem | in this try block it does the pre-checks for dest/source https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L385 | 22:12 |
mriedem | if that fails, blacklist the host https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L391 | 22:12 |
artom | *facepalm* | 22:12 |
artom | Got it, sorry. | 22:12 |
mriedem | set host=None and hit the while again | 22:12 |
mriedem | https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L366 | 22:12 |
mriedem | np | 22:12 |
artom | My brain skipped over _find_destination() as an atomic thing | 22:12 |
artom | Thanks for taking the time to walk me through this :) | 22:13 |
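The retry loop walked through above (ask the scheduler for a host, run the pre-checks, exclude the host on failure, loop again) can be sketched as below. This is a simplified illustration and not the nova code: `find_destination`, `select_host`, `check_host`, and `PreCheckError` are all hypothetical names chosen for this example.

```python
class PreCheckError(Exception):
    """Hypothetical error raised when a candidate host fails its pre-checks."""


def find_destination(select_host, check_host):
    """Return the first scheduler-selected host that passes the pre-checks.

    select_host(ignore_hosts) -> host name, or None when nothing is left.
    check_host(host) -> raises PreCheckError if the host is unsuitable.
    """
    attempted_hosts = []
    host = None
    while host is None:
        candidate = select_host(attempted_hosts)
        if candidate is None:
            raise RuntimeError("no valid host found")
        try:
            check_host(candidate)
            host = candidate
        except PreCheckError:
            # Exclude the failed host; host stays None so the loop runs again.
            attempted_hosts.append(candidate)
    return host
```

Note that the loop only covers failures in the pre-checks; once the migration itself has been cast to the source compute, nothing in this sketch (or, per the discussion, in the conductor) reschedules it.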
artom | Now that comment patch, if you have time ;) The patch itself is trivial, but requires thinking to make sure I got it right. | 22:14 |
openstackgerrit | Merged openstack/nova master: Add debug logs when doubling-up allocations during scheduling https://review.openstack.org/617016 | 22:14 |
openstackgerrit | Merged openstack/nova stable/rocky: Default embedded instance.flavor.is_public attribute https://review.openstack.org/619349 | 22:14 |
*** sambetts|afk has quit IRC | 22:14 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Give drop_move_claim() correct docstring https://review.openstack.org/620170 | 22:16 |
*** sambetts_ has joined #openstack-nova | 22:19 | |
*** dave-mccowan has quit IRC | 22:27 | |
*** aspiers has quit IRC | 22:27 | |
*** itlinux has quit IRC | 22:28 | |
mriedem | artom: ok some random comments inline | 22:31 |
mriedem | i guess drop_move_claim called from the source on confirm_resize always confuses me, | 22:31 |
mriedem | because the claim is made on the dest during prep_resize | 22:31 |
efried | mriedem: Sorry, I'm back now. We're trying to figure out how to know whether allocation ratios were defaulted by placement or conf or nova? | 22:31 |
mriedem | efried: yeah - need to determine when initial_*_allocation_ratio config should be used (first time the compute was created) and report that to placement, or if we should set allocation ratios because CONF.*_allocation_ratio is some non-default value, or if we should leave it alone because the admin set some allocation_ratio via the placement API directly | 22:32 |
mriedem | i think we can know the initial case from upt if the provider tree doesn't have those inventory resource class keys in it | 22:33 |
mriedem | otherwise only ever overwrite if CONF.*_allocation_ratio is not None | 22:33 |
efried | "if the provider tree doesn't have those inventory resource class keys" <== I don't think this can happen within upt | 22:34 |
mriedem | on initial startup, won't upt just have a root provider with no inventory on it? | 22:34 |
mriedem | the driver reports the inventory initially | 22:34 |
efried | I don't think so, because of the (legacy) code that's bootstrapping the CPU/MEMORY_MB/DISK_GB inventories before we get to upt | 22:34 |
mriedem | i'm not sure what you're referring to | 22:35 |
efried | I could be wrong, it's possible we've removed that at this point. | 22:35 |
efried | but there was a time when code *outside* of virt driver space would do the initial populate of placement | 22:35 |
mriedem | by calling driver.get_inventory() right? | 22:35 |
mriedem | that no longer happens if the driver implements upt | 22:36 |
efried | get_available_resource | 22:36 |
mriedem | yeah, get_available_resource was before get_inventory | 22:36 |
efried | which *does* still get called I believe. | 22:36 |
mriedem | i think it's upt -> get_inventory -> get_available_resource | 22:36 |
efried | gar doesn't get called in the rt update flow, but it still gets called somewhere else, lemme find it. | 22:36 |
mriedem | https://github.com/openstack/nova/blob/c1de096098344c733565c244163fc3ebf8c35e68/nova/compute/resource_tracker.py#L711 | 22:37 |
*** aspiers has joined #openstack-nova | 22:37 | |
mriedem | we'd create the provider root initially here https://github.com/openstack/nova/blob/c1de096098344c733565c244163fc3ebf8c35e68/nova/compute/resource_tracker.py#L921 | 22:37 |
mriedem | then flush inventory from upt here https://github.com/openstack/nova/blob/c1de096098344c733565c244163fc3ebf8c35e68/nova/compute/resource_tracker.py#L944 | 22:38 |
mriedem | else this https://github.com/openstack/nova/blob/c1de096098344c733565c244163fc3ebf8c35e68/nova/compute/resource_tracker.py#L951 | 22:38 |
mriedem | failing those, this https://github.com/openstack/nova/blob/c1de096098344c733565c244163fc3ebf8c35e68/nova/compute/resource_tracker.py#L961 | 22:38 |
mriedem | which would use the values from gar | 22:38 |
mriedem | https://github.com/openstack/nova/blob/c1de096098344c733565c244163fc3ebf8c35e68/nova/scheduler/client/report.py#L1442 | 22:38 |
mriedem | might be interesting to drop that final path, | 22:39 |
mriedem | looks like maybe only the hyperv driver hasn't implemented an alternative | 22:39 |
mriedem | oh and zvm | 22:40 |
mriedem | forgot that was in tree... | 22:40 |
efried | yeah, I remember looking at this when I was working on https://review.openstack.org/#/c/615705/ to see if I could just real quick implement upt for all the drivers. | 22:40 |
efried | and realizing that was going to be more work than I was ready to undertake at the time. | 22:41 |
efried | but unwinding the gar stuff, that's going to require some synapses I haven't yet explored. | 22:41 |
mriedem | heh, good thing you noted that because i was just thinking about removing that old code path to see what would break | 22:42 |
mriedem | i haven't seen hyper-v run on the latest version of that, but zvm failed | 22:42 |
efried | any case, I think the point is that we can't rely on upt being in the code path for initial population of the root provider inventories. | 22:42 |
efried | yet | 22:42 |
mriedem | no it doesn't need to be though | 22:43 |
efried | if we want upt to be able to decide whether to use initial_*_allocation_ratio it does. | 22:43 |
mriedem | as of https://review.openstack.org/#/c/613126/, if a driver implements upt, it sets the allocation_ratio on the inventory it reports, | 22:43 |
mriedem | otherwise that normalize method in the RT does | 22:43 |
mriedem | _normalize_inventory_from_cn_obj | 22:43 |
efried | right, it sets the allocation ratio based on the non-initial_* conf values. | 22:43 |
mriedem | i think upt can determine if initial allocation ratios can be used though | 22:43 |
efried | how? | 22:43 |
mriedem | if the inventory for a given class is not in the tree, it's initial | 22:44 |
*** cdent has quit IRC | 22:44 | |
efried | Wait, did you just prove (to yourself, at least) that if upt is implemented, it *does* get first crack at the root provider inventory? | 22:45 |
mriedem | yes | 22:45 |
mriedem | i believe so anyway | 22:45 |
efried | should be relatively easy to prove with a func test? | 22:46 |
mriedem | i think the one i wrote here will do it https://review.openstack.org/#/c/613126/4/nova/tests/functional/compute/test_resource_tracker.py | 22:46 |
mriedem | along with the fake driver todo being resolved https://review.openstack.org/#/c/613126/4/nova/virt/fake.py | 22:47 |
mriedem | but yeah https://review.openstack.org/#/c/602804/ really needs to run through the expected / supported scenarios in a functional test, | 22:47 |
mriedem | 1. initial create | 22:47 |
mriedem | 2. overwrite in placement API | 22:47 |
mriedem | 3. overwrite via config | 22:47 |
mriedem | and make sure #2 doesn't get f'ed up when the periodic runs | 22:48 |
efried | so we're talking about changing https://review.openstack.org/#/c/613126/4/nova/virt/libvirt/driver.py and its brethren to have logic like: | 22:49 |
efried | inv = ptree.data(root) | 22:49 |
efried | if CPU not in inv: ratio = CONF.initial_cpu_allocation_ratio or 16.0 | 22:49 |
efried | else: ratio = CONF.cpu_allocation_ratio or 16.0 | 22:49 |
efried | and similar for mem/disk? | 22:49 |
efried | oh, except f'ed up when periodic runs | 22:50 |
mriedem | the "or 16.0" gets removed | 22:50 |
mriedem | CONF.initial_cpu_allocation_ratio defaults to 16.0 | 22:50 |
efried | the second one? | 22:50 |
mriedem | both | 22:50 |
*** wolverineav has quit IRC | 22:51 | |
mriedem | the non-initial else becomes only set the ratio if CONF.cpu_allocation_ratio is not None | 22:51 |
efried | right | 22:51 |
efried | so | 22:51 |
mriedem | in other words, don't change the allocation ratio if it was set externally | 22:51 |
mriedem | and config hasn't changed | 22:51 |
*** wolverineav has joined #openstack-nova | 22:51 | |
efried | if CPU not in inv: ratio = CONF.initial_cpu_allocation_ratio # which defaults to 16.0 | 22:51 |
efried | elif CONF.cpu_allocation_ratio: ratio = CONF.cpu_allocation_ratio | 22:51 |
efried | # else no-op, leave it tf alone | 22:51 |
mriedem | exactly | 22:52 |
efried | I guess technically that last bit would have to be | 22:52 |
efried | else: ratio = data[CPU][ratio] | 22:52 |
efried | so it doesn't get omitted and wind up with the placement default :( | 22:52 |
mriedem | yup | 22:52 |
efried | okay, I can dig it. Assuming we can prove the upt-gets-initial-look thing. Let me take a look at that test you highlighted... | 22:53 |
efried | still looks slightly holey. | 22:55 |
efried | Let me go list all the permutations... | 22:55 |
mriedem | in yikun's patch? | 22:55 |
mriedem | i'm dumping notes in there | 22:55 |
efried | I'll just pastebin 'em for now | 22:55 |
efried | mm, if both values are set we want to end up with the conf one, so my above algo won't quite work | 22:58 |
mriedem | why not? | 22:58 |
efried | you would end up with the CONF.initial | 22:58 |
efried | until the next time update runs. | 22:59 |
efried | works if you reverse the conditions I think. | 22:59 |
efried | if CONF.cpu_allocation_ratio: ratio = CONF.cpu_allocation_ratio | 23:00 |
efried | elif CPU not in inv: ratio = CONF.initial_cpu_allocation_ratio # which defaults to 16.0 | 23:00 |
efried | else: ratio = data[CPU][allocation_ratio] | 23:00 |
mriedem | you should only get initial config if CPU not in inv though | 23:00 |
mriedem | ok i think either would be ok | 23:00 |
openstackgerrit | Michael Still proposed openstack/nova master: Move bridge creation to privsep. https://review.openstack.org/620180 | 23:00 |
efried | but in all cases if CONF.cpu_allocation_ratio is set you want *that* value - i.e. ignore initial_* | 23:01 |
mriedem | well well well, look who it is | 23:01 |
efried | brb | 23:01 |
mriedem | efried: true yeah | 23:01 |
mriedem | ok i got your point now | 23:01 |
*** rcernin has joined #openstack-nova | 23:01 | |
openstackgerrit | Merged openstack/nova stable/queens: Fix NoneType error in _notify_volume_usage_detach https://review.openstack.org/614868 | 23:04 |
efried | mriedem: I think we might strive to embed that logic in a base class helper somehow. | 23:06 |
efried | mriedem: to replace (and subsume) https://review.openstack.org/#/c/613126/4/nova/virt/driver.py@860 | 23:07 |
mriedem | yeah it would be nice to not duplicate it all over the place | 23:08 |
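A shared helper along the lines discussed above might look roughly like this. It is a sketch of the reversed-condition pseudocode from the conversation, not code from any of the linked patches: the function name `pick_allocation_ratio`, its parameters, and the `{resource_class: {"allocation_ratio": ...}}` shape of the inventory dict are assumptions made for illustration. The truthiness check mirrors the pseudocode, so both `None` and `0.0` are treated as "not set".

```python
def pick_allocation_ratio(conf_ratio, initial_ratio, inventory, resource_class):
    """Choose the allocation ratio to report for one resource class.

    conf_ratio: the *_allocation_ratio config value, or None when unset.
    initial_ratio: the initial_*_allocation_ratio config value.
    inventory: existing inventory already on the provider (assumed shape:
        {resource_class: {"allocation_ratio": float, ...}}).
    """
    if conf_ratio:
        # An explicit config value always wins, even on first report.
        return conf_ratio
    if resource_class not in inventory:
        # First time this resource class is reported: use the initial value.
        return initial_ratio
    # Otherwise keep whatever is already there (e.g. set via the placement
    # API) so it is neither overwritten nor reset to a default.
    return inventory[resource_class]["allocation_ratio"]
```

For example, `pick_allocation_ratio(None, 16.0, {"VCPU": {"allocation_ratio": 4.0}}, "VCPU")` keeps the externally set ratio, which is the case the periodic update must not clobber.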
*** slaweq has quit IRC | 23:10 | |
*** slaweq has joined #openstack-nova | 23:11 | |
mriedem | alright with that i'm done | 23:12 |
mriedem | efried: thanks for the sound board | 23:12 |
efried | fo sho | 23:12 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Drop cruft code for all_tenants behaviour https://review.openstack.org/620165 | 23:12 |
*** wolverineav has quit IRC | 23:13 | |
*** mriedem is now known as mriedem_away | 23:13 | |
*** wolverineav has joined #openstack-nova | 23:14 | |
*** wolverineav has quit IRC | 23:21 | |
*** wolverineav has joined #openstack-nova | 23:21 | |
*** slaweq has quit IRC | 23:24 | |
*** itlinux has joined #openstack-nova | 23:28 | |
*** munimeha1 has quit IRC | 23:30 | |
*** dave-mccowan has joined #openstack-nova | 23:36 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Add missing ws seperator between words https://review.openstack.org/618491 | 23:42 |
*** dave-mccowan has quit IRC | 23:44 | |
*** wolverineav has quit IRC | 23:50 | |
*** wolverineav has joined #openstack-nova | 23:52 | |
*** gyee has quit IRC | 23:52 |