*** slaweq has quit IRC | 00:01 | |
*** ociuhandu has quit IRC | 00:02 | |
*** slaweq has joined #openstack-nova | 00:11 | |
*** xek has quit IRC | 00:12 | |
*** slaweq has quit IRC | 00:18 | |
*** brinzhang has joined #openstack-nova | 00:32 | |
openstackgerrit | sean mooney proposed openstack/nova master: add [libvirt]/max_queues config option https://review.opendev.org/695118 | 00:38 |
*** slaweq has joined #openstack-nova | 00:42 | |
*** yaawang has quit IRC | 00:44 | |
*** brinzhang_ has joined #openstack-nova | 00:45 | |
*** slaweq has quit IRC | 00:46 | |
*** brinzhang has quit IRC | 00:49 | |
*** TxGirlGeek has quit IRC | 00:52 | |
*** yaawang has joined #openstack-nova | 00:57 | |
*** slaweq has joined #openstack-nova | 01:01 | |
*** nanzha has joined #openstack-nova | 01:05 | |
*** slaweq has quit IRC | 01:05 | |
*** Liang__ has joined #openstack-nova | 01:08 | |
*** mlavalle has quit IRC | 01:21 | |
*** ociuhandu has joined #openstack-nova | 01:22 | |
*** slaweq has joined #openstack-nova | 01:25 | |
*** slaweq has quit IRC | 01:30 | |
*** ociuhandu has quit IRC | 01:34 | |
*** lifeless has quit IRC | 01:34 | |
mriedem | melwitt: you're going to have to ping me earlier tomorrow about that tempest patch, it's too late for me to dig into that right now | 01:35 |
mriedem | but i'll star it | 01:35 |
melwitt | mriedem: ok, np. it's not urgent but something I thought you might have thoughts on. thanks | 01:36 |
mriedem | also, | 01:40 |
mriedem | your downstream ci people know they can just write tempest plugins for the tests they want to use right? | 01:41 |
mriedem | it doesn't need to be in the main repo | 01:41 |
mriedem | so i don't think it's a great pattern to set with like, "we'd like to test x which is 99% already covered elsewhere" | 01:41 |
mriedem | anyway, i've got to drop but i left that comment on there | 01:43 |
melwitt | I didn't know how tempest plugins work | 01:43 |
mriedem | cinder has one, check it out | 01:43 |
mriedem | like devstack plugins | 01:43 |
mriedem | https://github.com/openstack/cinder-tempest-plugin | 01:43 |
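[Context: a tempest plugin is just a package that exposes a tempest.test_plugins entry point, as in cinder-tempest-plugin above. A minimal sketch, with the package name my_tempest_plugin invented for illustration:]

    # setup.cfg
    [entry_points]
    tempest.test_plugins =
        my_tests = my_tempest_plugin.plugin:MyTempestPlugin

    # my_tempest_plugin/plugin.py
    import os

    from tempest.test_discover import plugins

    class MyTempestPlugin(plugins.TempestPlugin):
        def load_tests(self):
            # Tell tempest where this package's tests live.
            base_path = os.path.split(
                os.path.dirname(os.path.abspath(__file__)))[0]
            full_test_dir = os.path.join(base_path, 'my_tempest_plugin/tests')
            return full_test_dir, base_path

        def register_opts(self, conf):
            # No extra config options in this sketch.
            pass

        def get_opt_lists(self):
            return []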
mriedem | anyway, o/ | 01:43 |
melwitt | ok | 01:44 |
*** mriedem has quit IRC | 01:44 | |
*** mdbooth has quit IRC | 01:49 | |
*** lifeless has joined #openstack-nova | 01:49 | |
*** ozzzo has quit IRC | 01:50 | |
*** Liang__ is now known as LiangFang | 01:54 | |
*** mdbooth has joined #openstack-nova | 01:56 | |
*** awalende has joined #openstack-nova | 01:57 | |
*** awalende has quit IRC | 02:02 | |
*** ileixe has quit IRC | 02:05 | |
*** ileixe has joined #openstack-nova | 02:06 | |
*** lifeless has quit IRC | 02:12 | |
*** macz has joined #openstack-nova | 02:12 | |
*** slaweq has joined #openstack-nova | 02:13 | |
*** igordc has quit IRC | 02:18 | |
*** slaweq has quit IRC | 02:18 | |
mnaser | sean-k-mooney: something in your domain -- i was wondering if libvirt by default makes sure that vcpus for domains live inside the same numa node | 02:22 |
mnaser | i'm playing with new epyc rome hardware and seeing poor memory access times with sysbench, i'm wondering if the reason is that cpu time is flopping back and forth between numa nodes | 02:22 |
*** lifeless has joined #openstack-nova | 02:24 | |
*** macz has quit IRC | 02:24 | |
*** slaweq has joined #openstack-nova | 02:24 | |
*** nanzha has quit IRC | 02:25 | |
*** nanzha has joined #openstack-nova | 02:27 | |
*** slaweq has quit IRC | 02:30 | |
*** ociuhandu has joined #openstack-nova | 02:35 | |
*** slaweq has joined #openstack-nova | 02:35 | |
*** abaindur has quit IRC | 02:36 | |
*** alex_xu has joined #openstack-nova | 02:38 | |
alex_xu | melwitt: hellow, are you still here | 02:38 |
alex_xu | s/hellow/hello/ | 02:39 |
*** ociuhandu has quit IRC | 02:39 | |
*** mkrai has joined #openstack-nova | 02:40 | |
*** slaweq has quit IRC | 02:48 | |
*** gyee has quit IRC | 02:53 | |
*** HagunKim has joined #openstack-nova | 02:54 | |
HagunKim | Hi, How can I see Nova version like 19.0.3, 20.0.1 on my nova-compute host? | 02:55 |
*** slaweq has joined #openstack-nova | 03:00 | |
*** ileixe has quit IRC | 03:00 | |
*** slaweq has quit IRC | 03:04 | |
*** ileixe has joined #openstack-nova | 03:04 | |
*** slaweq has joined #openstack-nova | 03:05 | |
*** slaweq has quit IRC | 03:16 | |
*** macz has joined #openstack-nova | 03:18 | |
*** nanzha has quit IRC | 03:22 | |
*** nanzha has joined #openstack-nova | 03:22 | |
*** macz has quit IRC | 03:22 | |
*** slaweq has joined #openstack-nova | 03:26 | |
*** macz has joined #openstack-nova | 03:28 | |
*** slaweq has quit IRC | 03:31 | |
*** macz has quit IRC | 03:33 | |
*** slaweq has joined #openstack-nova | 03:35 | |
*** slaweq has quit IRC | 03:40 | |
*** mkrai has quit IRC | 03:42 | |
*** mkrai_ has joined #openstack-nova | 03:42 | |
*** ricolin has joined #openstack-nova | 03:43 | |
*** nanzha has quit IRC | 03:53 | |
*** nanzha has joined #openstack-nova | 03:53 | |
*** slaweq has joined #openstack-nova | 03:55 | |
openstackgerrit | Merged openstack/nova master: Remove functional test specific nova code https://review.opendev.org/683609 | 03:58 |
*** slaweq has quit IRC | 04:02 | |
*** liuyulong has quit IRC | 04:05 | |
*** slaweq has joined #openstack-nova | 04:10 | |
*** slaweq has quit IRC | 04:15 | |
*** bhagyashris has joined #openstack-nova | 04:31 | |
*** slaweq has joined #openstack-nova | 04:39 | |
*** slaweq has quit IRC | 04:44 | |
openstackgerrit | Brin Zhang proposed openstack/nova-specs master: Add composable flavor properties https://review.opendev.org/663563 | 04:58 |
*** slaweq has joined #openstack-nova | 05:01 | |
*** sapd1_ has joined #openstack-nova | 05:02 | |
*** bhagyashris has quit IRC | 05:04 | |
*** sapd1 has quit IRC | 05:04 | |
*** slaweq has quit IRC | 05:08 | |
*** brinzhang has joined #openstack-nova | 05:09 | |
*** bhagyashris has joined #openstack-nova | 05:09 | |
*** brinzhang has quit IRC | 05:10 | |
*** brinzhang has joined #openstack-nova | 05:11 | |
*** slaweq has joined #openstack-nova | 05:11 | |
*** brinzhang_ has quit IRC | 05:12 | |
*** threestrands has joined #openstack-nova | 05:18 | |
*** slaweq has quit IRC | 05:20 | |
*** pcaruana has joined #openstack-nova | 05:25 | |
*** ratailor has joined #openstack-nova | 05:27 | |
*** slaweq has joined #openstack-nova | 05:29 | |
*** ociuhandu has joined #openstack-nova | 05:31 | |
*** slaweq has quit IRC | 05:33 | |
*** ociuhandu has quit IRC | 05:35 | |
*** slaweq has joined #openstack-nova | 05:42 | |
*** jmlowe has quit IRC | 05:46 | |
*** slaweq has quit IRC | 05:47 | |
*** links has joined #openstack-nova | 05:48 | |
*** brinzhang_ has joined #openstack-nova | 05:52 | |
*** jraju__ has joined #openstack-nova | 05:54 | |
*** links has quit IRC | 05:54 | |
*** jraju__ has quit IRC | 05:54 | |
*** ileixe has quit IRC | 05:55 | |
*** brinzhang has quit IRC | 05:55 | |
*** awalende has joined #openstack-nova | 05:58 | |
*** ileixe has joined #openstack-nova | 05:59 | |
*** brinzhang has joined #openstack-nova | 06:00 | |
*** awalende has quit IRC | 06:02 | |
*** brinzhang_ has quit IRC | 06:03 | |
*** slaweq has joined #openstack-nova | 06:11 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata https://review.opendev.org/694717 | 06:13 |
*** slaweq has quit IRC | 06:15 | |
*** zhanglong has joined #openstack-nova | 06:20 | |
*** Nick_A has quit IRC | 06:26 | |
*** slaweq has joined #openstack-nova | 06:27 | |
*** slaweq has quit IRC | 06:32 | |
*** larainema has joined #openstack-nova | 06:33 | |
*** slaweq has joined #openstack-nova | 06:39 | |
*** slaweq has quit IRC | 06:44 | |
*** slaweq has joined #openstack-nova | 07:11 | |
*** dpawlik has joined #openstack-nova | 07:14 | |
*** brinzhang has quit IRC | 07:15 | |
*** slaweq has quit IRC | 07:15 | |
*** brinzhang has joined #openstack-nova | 07:16 | |
*** udesale has joined #openstack-nova | 07:25 | |
*** tosky has joined #openstack-nova | 07:35 | |
*** priteau has quit IRC | 07:38 | |
*** ileixe has quit IRC | 07:44 | |
*** mkrai_ has quit IRC | 07:45 | |
*** mkrai_ has joined #openstack-nova | 07:45 | |
*** awalende has joined #openstack-nova | 07:47 | |
*** slaweq has joined #openstack-nova | 07:56 | |
*** dpawlik has quit IRC | 07:56 | |
*** ccamacho has joined #openstack-nova | 07:57 | |
gibi | HagunKim: $ nova-compute --version | 07:57 |
gibi | 20.1.0 | 07:57 |
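[A few equivalent checks, assuming a distro-package or pip install; package names vary by distribution:]

    nova-compute --version            # any nova console script works
    nova-manage --version
    pip show nova | grep ^Version     # pip-based installs
    rpm -q openstack-nova-compute     # RDO/RHEL packaging
    dpkg -s nova-compute              # Debian/Ubuntu packaging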
*** tesseract has joined #openstack-nova | 07:58 | |
*** threestrands has quit IRC | 07:58 | |
HagunKim | gibi: Thanks! | 07:58 |
*** macz has joined #openstack-nova | 07:59 | |
*** abaindur has joined #openstack-nova | 08:02 | |
*** abaindur has joined #openstack-nova | 08:03 | |
*** macz has quit IRC | 08:04 | |
*** ociuhandu has joined #openstack-nova | 08:09 | |
*** tkajinam has quit IRC | 08:17 | |
*** ociuhandu has quit IRC | 08:19 | |
*** dpawlik has joined #openstack-nova | 08:21 | |
bauzas | good morning Nova | 08:28 |
gibi | bauzas: good morning | 08:31 |
bauzas | gibi: just doing a few things now, but if you're OK, I'd like to discuss the audit command in 30 mins | 08:31 |
bauzas | fine with you ? | 08:31 |
gibi | bauzas: sure, ping me | 08:31 |
bauzas | tl;dr: when calling --delete, you haven't used --verbose as well, so I don't see whether we tried to delete the allocation | 08:32 |
bauzas | gibi: ^ | 08:32 |
gibi | bauzas: I can quickly try --verbose + --delete | 08:33 |
bauzas | gibi: would be nice, thanks | 08:33 |
gibi | bauzas: http://paste.openstack.org/show/786387/ | 08:33 |
gibi | the output of --verbose --delete seems to be the same as the output of only --verbose | 08:34 |
bauzas | gibi: weird | 08:34 |
gibi | yepp | 08:34 |
gibi | yesterday I had to leave but I can try to attach a debugger now | 08:35 |
bauzas | looks like the conditional doesn't work | 08:35 |
bauzas | I mean in https://review.opendev.org/#/c/670112/10/nova/cmd/manage.py@2806 | 08:35 |
*** ivve has joined #openstack-nova | 08:36 | |
gibi | bauzas: debugger helped: it is the args definition that is buggy https://review.opendev.org/#/c/670112/10/nova/cmd/manage.py@2887 | 08:37 |
bauzas | argh | 08:38 |
gibi | I guess you are missing some test coverage for --delete :) | 08:38 |
bauzas | but I'm testing it | 08:38 |
bauzas | gibi: https://review.opendev.org/#/c/670112/10/nova/tests/functional/test_nova_manage.py@1417 | 08:39 |
gibi | bauzas: you are calling audit() directly from the test so skipping the arg parser | 08:39 |
bauzas | OH ! | 08:39 |
bauzas | yeah this | 08:39 |
bauzas | how could I test it then ? | 08:39 |
gibi | bauzas: good question. I just quickly checked and the heal_allocation tests do not cover args parsing either | 08:41 |
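[Background on why calling audit() directly skips parsing: nova-manage attaches argparse specs to command methods with a small decorator, and only the real CLI entry point reads them. Simplified from nova/cmd/manage.py; the audit option shown is illustrative:]

    # The decorator only stashes the argparse spec on the function;
    # nothing validates it until the CLI builds the parser.
    def args(*args, **kwargs):
        def _decorator(func):
            func.__dict__.setdefault('args', []).insert(0, (args, kwargs))
            return func
        return _decorator

    class PlacementCommands(object):
        # A mistake in this spec (e.g. a wrong dest=) only surfaces when
        # the parser actually runs; test code calling audit(delete=True)
        # bypasses it entirely.
        @args('--delete', action='store_true', dest='delete', default=False,
              help='Delete the orphaned allocations that were found.')
        def audit(self, verbose=False, delete=False):
            ...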
bauzas | yeah | 08:41 |
*** nanzha has quit IRC | 08:41 | |
gibi | I would not worry about it then in the functional env | 08:44 |
gibi | it seems we don't have the facility there | 08:44 |
gibi | there are some nova-manage tests in https://github.com/openstack/nova/blob/master/gate/post_test_hook.sh where we can cover the args parsing too | 08:45 |
openstackgerrit | Sylvain Bauza proposed openstack/nova stable/train: Don't delete compute node, when deleting service other than nova-compute https://review.opendev.org/695145 | 08:46 |
gibi | I guess this can be replaced with your audit call https://github.com/openstack/nova/blob/master/gate/post_test_hook.sh#L81 | 08:46 |
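[i.e. something like this in gate/post_test_hook.sh, which exercises the real arg parser end to end; exact flags per the patch under review:]

    nova-manage placement audit --verbose --delete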
*** nanzha has joined #openstack-nova | 08:48 | |
*** elod_off is now known as elod | 08:50 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/pike: Only nil az during shelve offload https://review.opendev.org/693839 | 08:50 |
*** ralonsoh has joined #openstack-nova | 08:51 | |
*** rpittau|afk is now known as rpittau | 08:53 | |
*** nanzha has quit IRC | 08:54 | |
*** ociuhandu has joined #openstack-nova | 08:55 | |
*** damien_r has joined #openstack-nova | 08:58 | |
*** nanzha has joined #openstack-nova | 08:58 | |
*** ociuhandu has quit IRC | 08:59 | |
*** ociuhandu has joined #openstack-nova | 09:00 | |
*** ociuhandu has quit IRC | 09:03 | |
*** jawad_axd has joined #openstack-nova | 09:03 | |
*** ociuhandu has joined #openstack-nova | 09:04 | |
*** ociuhandu has quit IRC | 09:11 | |
*** ociuhandu has joined #openstack-nova | 09:11 | |
*** ociuhandu has quit IRC | 09:13 | |
*** tosky has quit IRC | 09:13 | |
*** ociuhandu has joined #openstack-nova | 09:14 | |
*** tosky has joined #openstack-nova | 09:14 | |
*** ociuhandu has quit IRC | 09:15 | |
*** ociuhandu has joined #openstack-nova | 09:16 | |
*** abaindur has quit IRC | 09:17 | |
*** ociuhandu has quit IRC | 09:17 | |
stephenfin | gibi, bauzas: If either of you have bandwidth available this week, I'd appreciate some reviews on the removal of nova-network series https://review.opendev.org/#/q/topic:bp/remove-nova-network-ussuri+status:open | 09:17 |
* stephenfin will be bugging mriedem too since there are a few API removals in there | 09:18 | |
*** ociuhandu has joined #openstack-nova | 09:18 | |
gibi | stephenfin: added to my list, but no hard promises | 09:19 |
stephenfin | ack | 09:19 |
*** ociuhandu has quit IRC | 09:19 | |
*** ociuhandu has joined #openstack-nova | 09:20 | |
bauzas | stephenfin: sure, I want to dig into reviews this week | 09:21 |
*** ociuhandu has quit IRC | 09:26 | |
*** ociuhandu has joined #openstack-nova | 09:27 | |
*** ociuhandu has quit IRC | 09:29 | |
*** ociuhandu has joined #openstack-nova | 09:29 | |
*** ociuhandu has quit IRC | 09:32 | |
*** ociuhandu has joined #openstack-nova | 09:32 | |
*** ociuhandu has quit IRC | 09:34 | |
*** ociuhandu has joined #openstack-nova | 09:35 | |
*** ociuhandu has quit IRC | 09:37 | |
*** ociuhandu has joined #openstack-nova | 09:37 | |
*** ociuhandu has quit IRC | 09:39 | |
*** ociuhandu has joined #openstack-nova | 09:40 | |
*** ociuhandu has quit IRC | 09:43 | |
*** awalende has quit IRC | 09:43 | |
*** ociuhandu has joined #openstack-nova | 09:43 | |
*** dpawlik has quit IRC | 09:45 | |
*** ociuhandu has quit IRC | 09:45 | |
*** ociuhandu has joined #openstack-nova | 09:46 | |
*** rcernin has quit IRC | 09:46 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: docs: Rewrite quotas documentation https://review.opendev.org/667165 | 09:46 |
*** derekh has joined #openstack-nova | 09:47 | |
*** awalende has joined #openstack-nova | 09:48 | |
*** ociuhandu has quit IRC | 09:48 | |
*** ociuhandu has joined #openstack-nova | 09:49 | |
*** mkrai_ has quit IRC | 09:49 | |
*** martinkennelly has joined #openstack-nova | 09:50 | |
*** ociuhandu has quit IRC | 09:50 | |
*** mkrai has joined #openstack-nova | 09:50 | |
*** ociuhandu has joined #openstack-nova | 09:51 | |
*** awalende has quit IRC | 09:52 | |
*** ociuhandu has quit IRC | 09:52 | |
*** awalende has joined #openstack-nova | 09:52 | |
*** priteau has joined #openstack-nova | 09:52 | |
*** ociuhandu has joined #openstack-nova | 09:53 | |
*** ociuhandu has quit IRC | 09:56 | |
*** ociuhandu has joined #openstack-nova | 09:57 | |
*** jaosorior has joined #openstack-nova | 09:57 | |
*** ociuhandu has quit IRC | 09:58 | |
*** ociuhandu has joined #openstack-nova | 09:58 | |
openstackgerrit | Kashyap Chamarthy proposed openstack/nova master: libvirt: Bump MIN_{LIBVIRT,QEMU}_VERSION for "Ussuri" https://review.opendev.org/695056 | 10:03 |
*** nanzha has quit IRC | 10:06 | |
*** priteau has quit IRC | 10:06 | |
*** nanzha has joined #openstack-nova | 10:08 | |
*** mkrai has quit IRC | 10:20 | |
*** dpawlik has joined #openstack-nova | 10:20 | |
*** mkrai has joined #openstack-nova | 10:23 | |
*** dpawlik has quit IRC | 10:24 | |
*** dtantsur|afk is now known as dtantsur | 10:27 | |
openstackgerrit | Liang Fang proposed openstack/nova-specs master: Support volume local cache https://review.opendev.org/689070 | 10:28 |
*** LiangFang has quit IRC | 10:32 | |
*** ricolin has quit IRC | 10:32 | |
*** HagunKim has quit IRC | 10:33 | |
*** xek has joined #openstack-nova | 10:35 | |
*** damien_r has quit IRC | 10:36 | |
*** damien_r has joined #openstack-nova | 10:38 | |
*** brinzhang_ has joined #openstack-nova | 10:38 | |
openstackgerrit | Merged openstack/nova master: Don't delete compute node, when deleting service other than nova-compute https://review.opendev.org/694756 | 10:39 |
*** brinzhang has quit IRC | 10:42 | |
*** mkrai has quit IRC | 10:43 | |
*** zhanglong has quit IRC | 10:46 | |
*** dtantsur is now known as dtantsur|brb | 10:48 | |
*** dpawlik has joined #openstack-nova | 10:51 | |
*** brinzhang has joined #openstack-nova | 11:01 | |
*** brinzhang_ has quit IRC | 11:05 | |
*** PrinzElvis has quit IRC | 11:08 | |
*** bhagyashris has quit IRC | 11:10 | |
*** macz has joined #openstack-nova | 11:12 | |
*** dpawlik has quit IRC | 11:12 | |
*** macz has quit IRC | 11:16 | |
*** ociuhandu has quit IRC | 11:21 | |
*** lpetrut has joined #openstack-nova | 11:26 | |
*** mkrai has joined #openstack-nova | 11:34 | |
*** tbachman has quit IRC | 11:38 | |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: Add a placement audit command https://review.opendev.org/670112 | 11:41 |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: Avoid PlacementFixture silently swallowing kwargs https://review.opendev.org/695180 | 11:41 |
*** brinzhang_ has joined #openstack-nova | 11:45 | |
*** ratailor has quit IRC | 11:46 | |
*** dpawlik has joined #openstack-nova | 11:49 | |
*** brinzhang has quit IRC | 11:49 | |
*** ociuhandu has joined #openstack-nova | 11:51 | |
gibi | stephenfin: I think your patch https://review.opendev.org/#/c/694330/ in devstack-plugin-ceph is causing the nova-live-migration job to fail on stable/pike. Opened a bug here https://bugs.launchpad.net/nova/+bug/1853280 | 11:52 |
openstack | Launchpad bug 1853280 in OpenStack Compute (nova) pike "nova-live-migration job constantly fails on stable/pike" [Undecided,New] | 11:52 |
gibi | lyarwood: ^^ for yesterday's stable/pike issue | 11:52 |
*** dpawlik has quit IRC | 11:53 | |
artom | stephenfin, heya, could I get you to take another look at https://review.opendev.org/#/c/672595/60? | 12:10 |
artom | sean-k-mooney, same question for https://review.opendev.org/#/c/691062/ :) | 12:11 |
*** PrinzElvis has joined #openstack-nova | 12:15 | |
*** brinzhang_ has quit IRC | 12:16 | |
*** mkrai has quit IRC | 12:17 | |
sean-k-mooney | artom: sure, did you fix up the other patches below? if so i'll start from the bottom and work up | 12:17 |
*** dave-mccowan has quit IRC | 12:17 | |
artom | sean-k-mooney, yeah, the series is ready to go - Joe even merged some of the bottom ones | 12:21 |
* artom is happy to see sean-k-mooney not angry at him for making him write unit tests. | 12:21 | |
sean-k-mooney | i have not written them yet so how can i be angry :) | 12:22 |
*** larainema has quit IRC | 12:23 | |
artom | Ah, the anger comes later, I see | 12:23 |
*** pcaruana has quit IRC | 12:24 | |
sean-k-mooney | only after fighting with mocking for about an hour. although what you want me to test should be straightforward so you will likely get a pass | 12:24 |
artom | Yep, I know how "unit testing" some of those "functions" can be | 12:25 |
*** awalende has quit IRC | 12:26 | |
artom | But you made such a nice clean new one... :) | 12:26 |
*** mlavalle has joined #openstack-nova | 12:29 | |
openstackgerrit | Kashyap Chamarthy proposed openstack/nova master: Pick NEXT_MIN libvirt/QEMU versions for "V" release https://review.opendev.org/694821 | 12:30 |
openstackgerrit | Kashyap Chamarthy proposed openstack/nova master: libvirt: Bump MIN_{LIBVIRT,QEMU}_VERSION for "Ussuri" https://review.opendev.org/695056 | 12:30 |
*** mkrai has joined #openstack-nova | 12:36 | |
*** pcaruana has joined #openstack-nova | 12:50 | |
*** awalende has joined #openstack-nova | 12:52 | |
*** mgariepy has joined #openstack-nova | 12:54 | |
*** derekh has quit IRC | 12:55 | |
*** awalende has quit IRC | 12:56 | |
kashyap | stephenfin: Heya, one reason why I also separate out the NEXT_MIN is that it lets us settle on something while we chip away at the bump + resulting clean-up | 12:57 |
*** ociuhandu has quit IRC | 12:58 | |
*** mkrai has quit IRC | 12:59 | |
*** derekh has joined #openstack-nova | 13:02 | |
stephenfin | kashyap: The clean up is all optional though, right? | 13:03 |
stephenfin | I mean, it's just dead code | 13:03 |
kashyap | stephenfin: I mean the immediate clean-up to make the bump work; not the "full clean-up of removing dead constants" | 13:03 |
stephenfin | ah, I figured it would just be test removal | 13:03 |
stephenfin | gibi: I assume the plugin isn't versioned? | 13:04 |
kashyap | :-) Hehe, you've also done part of this clean-up before, maybe you forgot briefly | 13:04 |
gibi | stephenfin: I think it is not branched | 13:04 |
stephenfin | kashyap: yup, but they could all be done after the fact | 13:04 |
gibi | stephenfin: I haven't checked if it is versioned somehow | 13:04 |
*** tbachman has joined #openstack-nova | 13:04 | |
kashyap | stephenfin: Oh, certainly; that's how we do anyway - removing the dead constants after the bump. | 13:05 |
stephenfin | kashyap: so the only thing we'd have to clean up in that patch is anything that assumes the version is == the old minimum | 13:05 |
openstackgerrit | Mark Goddard proposed openstack/nova master: Add functional regression test for bug 1853009 https://review.opendev.org/695012 | 13:05 |
openstack | bug 1853009 in OpenStack Compute (nova) "Ironic node rebalance race can lead to missing compute nodes in DB" [Undecided,In progress] https://launchpad.net/bugs/1853009 - Assigned to Mark Goddard (mgoddard) | 13:05 |
openstackgerrit | Mark Goddard proposed openstack/nova master: Prevent deletion of a compute node belonging to another host https://review.opendev.org/694802 | 13:05 |
openstackgerrit | Mark Goddard proposed openstack/nova master: Clear rebalanced compute nodes from resource tracker https://review.opendev.org/695187 | 13:05 |
openstackgerrit | Mark Goddard proposed openstack/nova master: Invalidate provider tree when compute node disappears https://review.opendev.org/695188 | 13:05 |
openstackgerrit | Mark Goddard proposed openstack/nova master: Fix inactive session error in compute node creation https://review.opendev.org/695189 | 13:05 |
*** priteau has joined #openstack-nova | 13:05 | |
kashyap | stephenfin: That's what I did (or I thought I did) here: https://review.opendev.org/#/c/695056/2 | 13:06 |
kashyap | stephenfin: If you skimmed the commit, don't mistake the listed constants as the ones I've removed -- I just noted that they'll be removed in _future_ patches | 13:06 |
*** ricolin has joined #openstack-nova | 13:07 | |
mnaser | sean-k-mooney: i'm trying to optimize a system to minimize memory latency. i have a total of 8 numa nodes, 4 inside every socket (distances numa<=>numa on same socket is 10, but cross numa in same socket is 12, cross socket+cross numa is 32) | 13:10 |
mnaser | so its really expensive to do cross-numa cross-socket (and even somewhat to do cross-numa in socket too). i have 12 threads inside a NUMA node too | 13:11 |
mnaser | i'm thinking if i give the VM 2 NUMA nodes (instead of none), will that end up increasing performance because hopefully the two pair of NUMA nodes end up being close by (rather than cross socket) | 13:12 |
stephenfin | gibi: okay, I can fix that now | 13:12 |
mnaser | and maybe at least the VM will be aware that it's crossing numa paths | 13:12 |
gibi | stephenfin: no tags, no branches in devstack-plugin-ceph so I don't know how to pin | 13:12 |
gibi | stephenfin: thanks for fixing | 13:12 |
stephenfin | I'm not going to pin. I'm going to simply check if Python 3 is enabled | 13:12 |
gibi | stephenfin: good idea | 13:12 |
sean-k-mooney | mnaser: you're using first gen amd epyc cpus, yes? | 13:12 |
mnaser | sean-k-mooney: second gen :) | 13:13 |
sean-k-mooney | second gen have 1 io die per socket and only 1 numa node per socket | 13:13 |
sean-k-mooney | well by second gen i meant zen2-based epyc cpus | 13:13 |
mnaser | really? i actually have BIOS settings to allow me to change an "NPS" setting which makes me select 1/2/4 | 13:14 |
sean-k-mooney | im not sure if they had a refresh based on zen 1 | 13:14 |
sean-k-mooney | that should be there for zen1 | 13:14 |
mnaser | https://developer.amd.com/wp-content/resources/56745_0.80.pdf see "NUMA Nodes per Socket (NPS)" -- i set that to 4 so i have 8 numa nodes in total | 13:14 |
sean-k-mooney | the recommendation i have given internally is to disable multiple numa nodes per socket for zen | 13:14 |
mnaser | really? i've seen that it has resulted in less than ideal NUMA node latency (but a more consistent one inside a socket) | 13:15 |
sean-k-mooney | hum ok so you are using gen2 | 13:16 |
mnaser | yep | 13:16 |
mnaser | `numactl -N 0 -m 0 sysbench memory --threads=8 --memory-total-size=32G run` with NPS=4 was *WAY* faster than with NPS=1 | 13:16 |
sean-k-mooney | i might need to reasses | 13:16 |
sean-k-mooney | yes it would be | 13:16 |
sean-k-mooney | so basically we were under the impression that gen 2 would only have 1 numa node so we decided not to optimise for the up to 8 numa nodes per socket that can be exposed in gen 1 | 13:17 |
mnaser | gen 2 can even go up to 16 numa nodes, you can make it expose a numa node per each l3 cache | 13:18 |
sean-k-mooney | mnaser: so to your original question, giving the guest 2 numa nodes will improve the guest performance in general | 13:18 |
*** bhagyashris has joined #openstack-nova | 13:18 | |
mnaser | sean-k-mooney: https://softwarefactory-project.io/logs/45/16145/11/check/benchmark-combine/ba8ff3a/phoronix-results-cloud-fedora-30-vexxhost-vexxhost-nodepool-tripleo-0002241618/ in this case, much older hardware vs gen 2 epyc.. look at the difference in memory performance | 13:18 |
sean-k-mooney | so the trade-off with that is it limits the vms significantly | 13:19 |
sean-k-mooney | or rather how you create your flavors | 13:19 |
sean-k-mooney | if you enable all 16 | 13:19 |
stephenfin | gibi: https://review.opendev.org/695191 | 13:20 |
sean-k-mooney | then your vms with a numa topology can only have 3-4 cpus per numa node due to how we do numa in nova | 13:20 |
stephenfin | gibi: Thank God I included https://review.opendev.org/#/c/692374/6/gate/live_migration/hooks/ceph.sh@11 :D | 13:20 |
mnaser | i see, i'm ok with 8 if they are living next to each other (and i guess giving the vm the proper awareness means it will know how to address things) | 13:20 |
sean-k-mooney | e.g. if you expose 1 numa node per ccx it gives you the best performance but you are forced to either have multiple numa nodes for larger vms or use no numa features | 13:21 |
mnaser | i'd rather 8 because it gives me bigger memory slices and more cores (so i can have an entire VM live in a single NUMA node) | 13:21 |
sean-k-mooney | yes but that raises an interesting point | 13:21 |
gibi | stephenfin: deep down you knew that you need to check for python3 enabled :) | 13:21 |
sean-k-mooney | we have talked about breaking the 1:1 mapping between numa nodes in the guest and the host | 13:22 |
sean-k-mooney | if we did that we could decouple this | 13:22 |
gibi | stephenfin: thanks for the fix | 13:22 |
*** dave-mccowan has joined #openstack-nova | 13:22 | |
sean-k-mooney | there are trade-offs but if it was a policy on the guest that would mean you could optimise the hardware config without limiting the slicing | 13:23 |
stephenfin | artom: Yeah, I started looking at it this morning before lunch. ngl, it's tough going /o\ | 13:26 |
mnaser | sean-k-mooney: right, which is odd because when i run `numastat -cv qemu-kvm` -- it seems like the VM lives in a single node | 13:26 |
mnaser | but looking at the performance benchmarks, the numbers are pretty bad | 13:26 |
mnaser | i mean, its not hosting the *entire* VM in a single NUMA node, but you can tell the memory allocated to the VM does live in that node | 13:27 |
mnaser | https://www.irccloud.com/pastebin/9h0uziTB/ | 13:27 |
sean-k-mooney | mnaser: also damn the intel chip got schooled | 13:27 |
mnaser | sean-k-mooney: it did in everything except memory bound stuff, i think qpi is just insanely fast | 13:27 |
sean-k-mooney | qpi was retired, it's now UPI | 13:28 |
*** rouk has quit IRC | 13:28 | |
sean-k-mooney | partly for licensing reasons but also upi is faster than qpi | 13:28 |
artom | stephenfin, I know :( | 13:29 |
sean-k-mooney | i thought the infinity fabric in zen processors was meant to be faster, but i don't think we have ever really seen benchmarks of the two fabrics | 13:29 |
artom | Your efforts are appreciated | 13:29 |
artom | It's a big patch, I tried documenting what I thought was worth it | 13:29 |
mnaser | sean-k-mooney: the numbers i see make it seem pretty slow. | 13:29 |
artom | sean-k-mooney, so actually I think you have enough good points on the whitebox series to warrant a respin, if you don't mind looking at it again after | 13:30 |
sean-k-mooney | mnaser: could you provide us with the output from virsh capabilities by the way | 13:30 |
sean-k-mooney | i would be interested in seeing what data we are feeding to nova for it to make decisions | 13:30 |
bauzas | gibi: I updated the placement audit command + added a FUP for testing the PlacementFixture | 13:31 |
sean-k-mooney | artom: sure ping me when its done | 13:31 |
sean-k-mooney | artom: there was nothing i would block on but they were all minor improvements | 13:31 |
gibi | bauzas: It is on my list for a proper code review because so far I only tried the tool but did not properly read the code | 13:31 |
bauzas | holy shit, pep8 | 13:31 |
bauzas | gibi: sure, no worries | 13:31 |
mnaser | https://www.irccloud.com/pastebin/veCa0dq8/ | 13:31 |
mnaser | sean-k-mooney: ^ | 13:32 |
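[For readers without the paste: the NUMA cells and L3 cache banks discussed below live under the host topology and cache elements of that XML, e.g.:]

    # show just the NUMA cell layout from the capabilities XML
    virsh capabilities | xmllint --xpath '/capabilities/host/topology' -

    # L3 cache banks (the per-CCX regions) are reported under
    # /capabilities/host/cache on recent libvirt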
bauzas | gibi: I provided a functest for verifying the deletion, it should work | 13:32 |
bauzas | gibi: given your notes, I think I can try to have better logs | 13:33 |
gibi | bauzas: the log improvement is not something I will -1 but If you have time then better logs are always welcome | 13:34 |
bauzas | gibi: I think it's important | 13:34 |
bauzas | so I would understand a -1 for this | 13:34 |
*** mriedem has joined #openstack-nova | 13:34 | |
*** macz has joined #openstack-nova | 13:35 | |
sean-k-mooney | i need to jump on a call but im seeing some interesting things in that output, notably the way the cache is reported | 13:35 |
*** damien_r has quit IRC | 13:36 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/train: Don't delete compute node, when deleting service other than nova-compute https://review.opendev.org/695145 | 13:37 |
sean-k-mooney | mnaser: i dont know if you remember but i have spoken about the need to model cpus per cache node and cache nodes per numa node in the past at the denver ptg | 13:37 |
*** dpawlik has joined #openstack-nova | 13:39 | |
sean-k-mooney | mnaser: no one else seems to be on the call so i'll expand on that | 13:39 |
*** macz has quit IRC | 13:39 | |
*** maciejjozefczyk has quit IRC | 13:39 | |
sean-k-mooney | mnaser: cache banks 1 and 2 are both part of numa node 1 in that config | 13:40 |
sean-k-mooney | it looks like the 24 core sku has 3 active cpus per ccx and hence per cache region | 13:40 |
sean-k-mooney | so your 8 core vm needs to be spread across 3 cache regions and 2 numa nodes | 13:40 |
mnaser | sean-k-mooney: it does, when i enable numa node per l3 cache, i end up with 6 threads in a numa node | 13:40 |
sean-k-mooney | oh ya i guess you could fit it on 2 cache regions and 1 numa node if you are using hyperthreads | 13:41 |
sean-k-mooney | mnaser: so rather than selecting cores from the same numa node, if we wanted to optimise in nova we would want to select cores in the same cache region first | 13:42 |
sean-k-mooney | we can actually model the cache topology in libvirt too and expose that to the guest | 13:42 |
sean-k-mooney | mnaser: it would be interesting to see how a 6 core vm fared against the 8 core vm | 13:43 |
*** eharney has quit IRC | 13:43 | |
mnaser | yeah, i read a bit about that. i'm wondering if performance wise, exposing numa node per l3 cache and then doing 2 numa nodes in a VM would result in good overall performance | 13:43 |
*** ociuhandu has joined #openstack-nova | 13:44 | |
mnaser | you'd have 4 cores each sitting in a numa node and hopefully the guest is smart enough to understand the distances involved | 13:44 |
sean-k-mooney | yes although again nova does not today fully understand the hierarchy | 13:44 |
sean-k-mooney | e.g. nova could select 2 numa nodes from different sockets | 13:45 |
sean-k-mooney | ideally it would select 2 from the same socket to minimise the numa distance | 13:45 |
sean-k-mooney | that is actually just an optimization of the current behavior | 13:45 |
sean-k-mooney | rather than a change | 13:45 |
mnaser | sean-k-mooney: right but at least the guest will be aware there is possible latency | 13:45 |
sean-k-mooney | yep | 13:45 |
mnaser | yep, indeed, its not optimal, but its an improvement | 13:45 |
*** bhagyashris has quit IRC | 13:46 | |
*** bhagyashris has joined #openstack-nova | 13:46 | |
sean-k-mooney | so the reason i suggested disabling the multiple numa nodes in zen 1 was our pms insisting vnfs don't understand numa | 13:46 |
sean-k-mooney | after 5+ years of everyone explaining what numa is to vnf vendors i dont really buy that | 13:47 |
mnaser | im wondering which would be better: NPS=4 + no NUMA per L3 + 1 numa node .. vs .. NPS=4 + NUMA per L3 + 2 numa nodes -- im thinking the latter will likely be much faster (or at least the VM will be 'smarter' at understanding the topology.. somewhat) | 13:47 |
mnaser | it might be unpredictable because the NUMA node might have a distance of 32, or 12, or 10 .. but at least it might know there is distance | 13:48 |
*** ociuhandu has quit IRC | 13:48 | |
sean-k-mooney | the latter i think would optimize the memory latency | 13:48 |
mnaser | im gathering some benchmark info | 13:50 |
mnaser | and then ill try the two numa node + l3 per numa | 13:50 |
sean-k-mooney | exposing NUMA per L3 should give the best performance, it just requires your vms to have multiple numa nodes if they have more than 3-4 cores or >32-64G of ram | 13:50 |
mnaser | yep, most of the VMs will be 8 core / 8 gb | 13:50 |
*** mdbooth has quit IRC | 13:51 | |
mnaser | also FWIW those aren't dedicated cores | 13:51 |
*** damien_r has joined #openstack-nova | 13:51 | |
sean-k-mooney | sure, you're just setting hw:numa_nodes=2 but not enabling pinning | 13:51 |
aarents | hi dansmith, can you confirm that the last update is fine for you ? https://review.opendev.org/#/c/670000 | 13:51 |
mnaser | yep, so that way i'm thinking nova should do 4 cores per NUMA, 4gb in each numa node | 13:51 |
sean-k-mooney | if you do that you should also set hw:mem_page_size to either large or small | 13:52 |
sean-k-mooney | mnaser: yes it would | 13:52 |
sean-k-mooney | the reason for doing hw:mem_page_size by the way is you want to enable novas numa aware memory tracking | 13:53 |
*** mdbooth has joined #openstack-nova | 13:53 | |
sean-k-mooney | if you don't want to use hugepages set it to "small" | 13:53 |
sean-k-mooney | that will use 4k pages | 13:53 |
mnaser | sean-k-mooney: i think huge pages is something i'd want, but i think transparent huge pages are enabled right now | 13:53 |
sean-k-mooney | right but you should still set hw:mem_page_size=small if you have a numa topology and are not using explicit hugepages | 13:54 |
sean-k-mooney | if you don't you can get OOM events | 13:54 |
mnaser | i think i'd probably want large anyways, host has 512G memory and plenty of reserved memory | 13:54 |
sean-k-mooney | ya if you don't allow memory oversubscription then there is no reason not to use hugepages | 13:55 |
mnaser | ya nope no oversubscription | 13:55 |
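[The flavor side of this would look roughly as follows; hw:numa_nodes and hw:mem_page_size are real extra specs, the flavor name is illustrative:]

    # 8 vCPU / 8G guest split across 2 virtual NUMA nodes; NUMA-aware
    # memory tracking is enabled by setting any hw:mem_page_size value
    # ("small" = 4k pages, for hosts not using explicit hugepages)
    openstack flavor create --vcpus 8 --ram 8192 --disk 40 numa.8c8g
    openstack flavor set numa.8c8g \
        --property hw:numa_nodes=2 \
        --property hw:mem_page_size=large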
*** liuyulong has joined #openstack-nova | 13:57 | |
mnaser | epyc rome numbers https://www.irccloud.com/pastebin/3bgON0HX/ | 13:58 |
mnaser | intel (but much older hardware) https://www.irccloud.com/pastebin/6M7B4QLY/ | 13:59 |
mnaser | sean-k-mooney: ^ fyi, interesting to see latency numbers | 13:59 |
coreycb | kashyap: do you know if anything has landed for removing cpu flags, similar to cpu_model_extra_flags? | 14:01 |
kashyap | coreycb: Hi...afraid, no; but I've filed a Blueprint for it: | 14:01 |
kashyap | coreycb: https://blueprints.launchpad.net/nova/+spec/allow-disabling-cpu-flags | 14:01 |
coreycb | kashyap: thanks | 14:02 |
coreycb | kashyap: fyi this bug is why I'm asking: https://bugs.launchpad.net/bugs/1853200 | 14:03 |
openstack | Launchpad bug 1853200 in libvirt (Ubuntu) "cpu features hle and rtm disabled for security are present in /usr/share/libvirt/cpu_map.xml" [High,Confirmed] - Assigned to Ubuntu Security Team (ubuntu-security) | 14:03 |
kashyap | coreycb: I know :-( | 14:03 |
kashyap | And I guessed as much | 14:03 |
kashyap | coreycb: One (very valid) 'workaround' is that have QEMU add new "named CPU models" to remove the said flags. | 14:04 |
coreycb | kashyap: ok I'll mention that, thanks | 14:05 |
kashyap | coreycb: For that, upstream QEMU folks must add them...please file a QEMU "RFE" bug on launchpad for it | 14:05 |
sean-k-mooney | mnaser: did you enable cluster on die for the intel system out of interest | 14:06 |
coreycb | kashyap: thanks I'll pass the details on to cpaelzer, he's our qemu maintainer | 14:06 |
kashyap | (Nod) | 14:06 |
sean-k-mooney | mnaser: but yes the amd numbers seem to be much higher | 14:06 |
sean-k-mooney | mnaser: that is quite surprising to be honest | 14:07 |
*** priteau has quit IRC | 14:11 | |
*** priteau has joined #openstack-nova | 14:12 | |
*** priteau has quit IRC | 14:13 | |
*** jaosorior has quit IRC | 14:18 | |
*** bhagyashris has quit IRC | 14:30 | |
*** damien_r has quit IRC | 14:31 | |
*** munimeha1 has joined #openstack-nova | 14:33 | |
*** mmethot has quit IRC | 14:34 | |
*** mmethot has joined #openstack-nova | 14:35 | |
*** nweinber has joined #openstack-nova | 14:37 | |
*** maciejjozefczyk has joined #openstack-nova | 14:38 | |
*** PrinzElvis has quit IRC | 14:38 | |
efried | bauzas, gibi, stephenfin: would one of you please have a look at the nova/cyborg spec update https://review.opendev.org/#/c/684151/ and +A if appropriate? | 14:39 |
bauzas | efried: yeah, I can do | 14:39 |
efried | thanks bauzas | 14:39 |
stephenfin | I didn't review the original so I'll defer to others | 14:39 |
*** eharney has joined #openstack-nova | 14:39 | |
bauzas | I promised to do spec reviews, in particular following the ones we discussed at the PTG | 14:39 |
efried | After all, melissaml and Viere are +1. | 14:39 |
*** jawad_axd has quit IRC | 14:45 | |
efried | dustinc: Did you pick up what I laid down yesterday about compute node conflicts for provider config? http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-11-19.log.html#t2019-11-19T22:19:41 | 14:45 |
aarents | Hi guys I need some review on this https://review.opendev.org/#/c/678016/ gibi already put +2, if you can have a look ? | 14:45 |
*** jawad_axd has joined #openstack-nova | 14:46 | |
bauzas | aarents: /me clicks | 14:46 |
efried | dansmith: TL;DR: since, ironic notwithstanding, it would be insane for one compute node's $name to be the same as another's $uuid, we're just going to punt if there's a conflict across the whole $name+$uuid space. With that in play, it makes sense to treat $COMPUTE_NODE as a "default", which can be overridden by a specific $name/$uuid. And all that means we can detect conflicts immediately (on startup) and fail (to start up) if o | 14:47 |
*** maciejjozefczyk has quit IRC | 14:47 | |
dansmith | buffer overflow | 14:48 |
*** jawad_ax_ has joined #openstack-nova | 14:49 | |
*** bhagyashris has joined #openstack-nova | 14:49 | |
aarents | thks | 14:50 |
*** jawad_axd has quit IRC | 14:50 | |
sean-k-mooney | efried: ill try and review the cyborg spec shortly | 14:51 |
*** jawad_axd has joined #openstack-nova | 14:51 | |
efried | sean-k-mooney: thanks. It's just an update to the existing spec, nothing earth-shaking. | 14:51 |
dansmith | efried: your message was too long and was cut off after "fail to (start up)" | 14:52 |
sean-k-mooney | ya looks short | 14:52 |
efried | dansmith: oh, weird. "...fail (to start up) if one is detected." | 14:52 |
efried | that was all. | 14:52 |
efried | my buffer says I had another 25c | 14:52 |
efried | clearly I need to work on my TL;DRing. | 14:52 |
dansmith | efried: okay, so I think I was (and likely still am) missing some context yesterday and I was hurrying, | 14:53 |
*** maciejjozefczyk has joined #openstack-nova | 14:53 | |
dansmith | but the problem is that ironic nodes specified there may not exist as providers on startup, is that it? | 14:53 |
*** jawad_ax_ has quit IRC | 14:53 | |
efried | that's pretty much the only scenario that makes us have to deal with this, yeah. | 14:53 |
dansmith | so we wanted to make that hard-fail if you provide something that doesn't exist there yeah? | 14:54 |
efried | no | 14:54 |
efried | generically the "problem" is that theoretically you can specify a config by $name and another by $uuid, but those are for the same node, but we wouldn't know that until that node "appeared". | 14:54 |
efried | In practice this shouldn't be possible because for ironic, $name == $uuid. | 14:55 |
*** jawad_axd has quit IRC | 14:55 | |
dansmith | you mean if you had two entries in the list, one by name and one by uuid and they were in fact the same provider? | 14:56 |
*** damien_r has joined #openstack-nova | 14:56 | |
efried | right | 14:56 |
*** damien_r has quit IRC | 14:56 | |
dansmith | that seems like a really tiny detail to be concerned about.. did this come up in some testing or something? | 14:56 |
efried | more code inspection | 14:56 |
dansmith | remind me why we have the by-name option anyway? | 14:57 |
efried | ikr | 14:57 |
dansmith | uuid or the "self" option should really be all we need I would think | 14:57 |
efried | It was because you don't necessarily know the UUID yet in a green field. | 14:57 |
efried | but you want your tripleo to be able to lay down the config a priori. | 14:57 |
dansmith | in which case, non-ironic? in that case you use $COMPUTE_NODE yeah? | 14:57 |
efried | Yeah, I would think so. gibi, can we convince you we don't need identification by name? | 14:58 |
efried | (I think it was gibi who talked us into it) | 14:58 |
dansmith | either way, | 14:58 |
dansmith | I'm not sure we really need to care that much if someone puts a thing in there twice | 14:58 |
efried | right, you kinda f'ed up if that happens, but the code has to do *something* in that case. | 14:59 |
dansmith | *if* we can detect and log an error later that's helpful, but.. | 14:59 |
dansmith | your concern is just not being able to detect that at startup? | 14:59 |
efried | right. | 14:59 |
gibi | efried: I'm short on context. Is it about provider config? | 14:59 |
*** amodi has quit IRC | 14:59 | |
efried | if we can detect it at startup, we can halt the service and force you to fix it. But we don't want to kill the service if it creeps in after we've already started running. | 15:00 |
efried | gibi: yes | 15:00 |
dansmith | efried: if it's hard/impossible to do at start, is clearly wrong config, and we can log an error later I think that's reasonable | 15:00 |
dansmith | efried: no, don't halt.. log an error periodically | 15:00 |
efried | dansmith: ack | 15:00 |
*** dtantsur|brb is now known as dtantsur | 15:00 | |
gibi | efried: so in case of a compute RP the name is known before the compute was ever started but the uuid only known after the compute creates the RP | 15:00 |
dansmith | we should never kill nova-compute after it's hit steady-state | 15:00 |
efried | right | 15:00 |
dansmith | gibi: we have $COMPUTE_NODE for that | 15:00 |
efried | gibi: Right, in that case, we can use the special ``$COMPUTE_NODE`` ... yeah ^ | 15:01 |
gibi | dansmith: I see. | 15:01 |
dansmith | they don't need to know either the name or the uuid for the compute node | 15:01 |
gibi | so instead of allowing RP identification by name in general, we add a specific case for compute node | 15:01 |
dansmith | we have that already | 15:01 |
dansmith | uuid can be a uuid or the special string "$COMPUTE_NODE" | 15:02 |
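[A sketch of the provider config shape being discussed, per the draft spec at the time; field names may differ from what eventually merged:]

    # provider.yaml: $COMPUTE_NODE acts as a default that a specific
    # uuid entry can override
    meta:
      schema_version: '1.0'
    providers:
      - identification:
          uuid: $COMPUTE_NODE
        traits:
          additional:
            - CUSTOM_EXAMPLE_TRAIT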
gibi | so far I'm OK with this. I guess if other RPs need to be identified by name then we will add other symbols for that or we re-think the identify-by-name feature | 15:02 |
dansmith | efried: note that I'm okay with either removing name, or just detecting and logging this case later | 15:02 |
bauzas | gibi: dansmith: I haven't paid attention yet to this convo as I'm dragged into some internal all-hands but when I was looking on how to use the new COMPUTE_NODE trait, I think that we miss the world of nested RPs | 15:02 |
bauzas | unless we have some way to have some traits to be cascaded into child RPs | 15:03 |
bauzas | (and I really want to avoid saying 'inheritance') | 15:03 |
*** damien_r has joined #openstack-nova | 15:03 | |
bauzas | but for example, if you lookup some RP, you need to call the RP and check its trait to know whether the original RP is still related to a compute node | 15:03 |
efried | bauzas: The only rp we're supporting this "wildcard" for right now is the root. If we find we need a way to generically identify its children later, we'll have more designing to do. | 15:04 |
bauzas | you need to call the *root* RP | 15:04 |
bauzas | (sorry for confusing) | 15:04 |
dansmith | is bauzas talking about this same thing? | 15:04 |
efried | almost :P | 15:04 |
gibi | hm, the neutron device RPs (providing bandwidth) also have a stable name generated from the device name. But as I don't have a use case for provider config related to the bandwidth feature I rest my case. | 15:04 |
bauzas | efried: sure, I'm just saying this is only convenient when you're in a flat world | 15:05 |
*** johnthetubaguy has joined #openstack-nova | 15:05 | |
efried | I agree. Predictably-named nested RPs make a case for identifying by name. | 15:05 |
efried | though I think we've talked about the brittleness of relying on names before. | 15:06 |
*** maciejjozefczyk has quit IRC | 15:06 | |
*** damien_r has quit IRC | 15:06 | |
gibi | efried: we have predictably named nested RPs | 1110cf59-cabf-526c-bacc-08baabbac692 | aio:Open vSwitch agent:br-test | 15:06 |
mnaser | sean-k-mooney: sorry, was in a meeting too then, but no that wasnt enabled, but they are much older and also running ddr3 memory for comparison | 15:06 |
*** tbachman has quit IRC | 15:06 | |
mnaser | much lower latency but lower bandwidth | 15:07 |
gibi | efried: but as I said I have no direct use case so I don't want to push for the naming support in provider config | 15:07 |
sean-k-mooney | so that was cross socket latency | 15:07 |
bauzas | efried: FWIW, we stecked a few things at the PTG between gibi, stephenfin and me that were tied to naming conventions :) | 15:07 |
*** davee_ has joined #openstack-nova | 15:07 | |
bauzas | we sketched* | 15:07 |
efried | I know, what I'm saying is that we shouldn't have architected anything to rely on naming conventions, and we shouldn't perpetuate such things. | 15:07 |
efried | but that ship has probably sailed. | 15:07 |
mnaser | sean-k-mooney: yep! | 15:07 |
sean-k-mooney | mnaser: i was trying to figure out if the intel numbers were between sockets or two numa nodes in the same socket | 15:07 |
*** johnthetubaguy has quit IRC | 15:07 | |
bauzas | efried: indeed | 15:07 |
*** damien_r has joined #openstack-nova | 15:07 | |
sean-k-mooney | ya those latency numbers for amd are bad | 15:07 |
mnaser | nope, between two sockets, which is interesting given just how low the cross socket latency is | 15:07 |
efried | In any case, if gibi doesn't have any real, current bw use cases, and bauzas doesn't have any real, current vgpu use cases, I'm fine making the decision to remove identify-by-name to limit scope/complexity. | 15:08 |
gibi | efried: there are things already architectured that way. like the hostname of the compute node is an id neutron uses. and nova uses the name of the device RP from neutron | 15:08 |
mnaser | sean-k-mooney: im looking at the bios for the speed between the two sockets, i learned something about the amd architecture that doesnt seem ideal | 15:08 |
efried | we've removed a lot of bells and whistles for that reason, just to make this go. | 15:08 |
gibi | efried: agree with the limited scope | 15:08 |
*** dpawlik has quit IRC | 15:08 | |
gibi | efried: go for it | 15:08 |
bauzas | efried: it's not really a vgpu thingy | 15:08 |
sean-k-mooney | so that lends more credence to enabling numa per ccx and using multi-numa guests | 15:08 |
gibi | efried: we can always add that later | 15:08 |
bauzas | efried: it's more what gibi said | 15:09 |
efried | otoh, where I started this morning actually handles things including name pretty well IMO | 15:09 |
bauzas | efried: some other projets create their own RPs and we need to know them | 15:09 |
sean-k-mooney | mnaser: the fact the infinity fabric speed used to be tied to the memory speed | 15:09 |
efried | and dustinc has mostly already coded it up | 15:09 |
*** johnthetubaguy has joined #openstack-nova | 15:09 | |
efried | so it's really just a question of test surface | 15:09 |
mnaser | sean-k-mooney: "By default, the BIOS for EPYC 7002 Series processors will run at the maximum allowable clock frequency by the platform. This configuration results in the maximum memory bandwidth for the processor, but in some cases, it may not be the lowest latency. The Infinity Fabric will have a maximum speed of 1467 MHz (lower in some platforms), resulting in a single clock penalty to transfer data from the | 15:09 |
mnaser | memory channels onto the Infinity Fabric to progress through the SoC. To achieve the lowest latency, you can set the memory frequency to be equal to the Infinity Fabric speed. Lowering the memory clock speed also results in power savings in the memory controller, thus allowing the rest of the SoC to consume more power potentially resulting in a performance boost elsewhere, depending on the workload." | 15:09 |
bauzas | efried: for the moment, we infer their existence based on naming conventions but I'm all up for other ideas :-) | 15:09 |
gibi | bauzas: a counter example is shared disk. there we said use the uuid of the placement aggregate instead of the name of the sharing RP in the nova conf. | 15:10 |
sean-k-mooney | mnaser: ya so in gen 1 they were tied of the same clock | 15:10 |
gibi | bauzas: so we are getting better at this | 15:10 |
bauzas | gibi: aaaaaand we discussed 30 mins of any potential issues with it :) | 15:10 |
dansmith | efried: reminder: I'm fine with logging the conflict late when the admin configured it wrong | 15:10 |
*** JamesBenson has joined #openstack-nova | 15:11 | |
sean-k-mooney | so your memory bandwidth was limited by the speed at which the infinity fabric could run | 15:11 |
bauzas | the chicken-and-egg problem | 15:11 |
dansmith | maybe nobody else is :) | 15:11 |
gibi | bauzas: that is what PTG for. isn't it? ;) | 15:11 |
sean-k-mooney | mnaser: they split them to allow you to tune them separately | 15:11 |
bauzas | gibi: that's surely some justification for my expense report, for sure :p | 15:11 |
sean-k-mooney | mnaser: so you can optimise for latency or throughput | 15:11 |
*** tbachman has joined #openstack-nova | 15:12 | |
efried | dansmith: That wfm too. The only question is, in addition to logging the error, we have to *pick one* to use. | 15:12 |
bauzas | "look ! this isn't a very expensive dinner, this was just a brainstorming session in a crowdy area !" | 15:12 |
efried | or, I suppose, refuse to use either. | 15:12 |
dansmith | efried: refuse to use either | 15:12 |
bauzas | "and we ordered snacks meanwhile" | 15:12 |
dansmith | IMHO | 15:12 |
efried | dansmith: ack. I agree. We shall make it so. | 15:12 |
efried | dansmith: Since you're being so nice already, and I've obv got you swapped out of whatever else you would be doing, can we talk about shelve-offloading an instance with a vTPM? | 15:12 |
*** ociuhandu has joined #openstack-nova | 15:13 | |
dansmith | efried: yeah | 15:13 |
efried | Do you have a better idea than "create a glance (or maybe swift?) object for the vdev file"? | 15:13 |
gibi | bauzas: there wasn't too much snack provided during the PTG so _we had to_ order some :) | 15:13 |
*** shilpasd has joined #openstack-nova | 15:14 | |
efried | dansmith: seems like this is a natural for swift, but that adds yet another required component. | 15:14 |
bauzas | "good luck, expenses team, for understanding my bills !" | 15:14 |
dansmith | efried: that would be one place I guess, but yeah, can't really depend on swift being there | 15:14 |
mnaser | sean-k-mooney: interestingly enough looking at the bios the 4-link xGMI max speed is set to 10.667Gbps | 15:15 |
dansmith | efried: I dunno how safe that would really be, not that it's less secure than the alternatives, but.. | 15:15 |
mnaser | "Setting this to a lower speed can save uncore power that can be used to increase core frequency or reduce overall power. It will also decrease cross socket bandwidth and increase cross socket latency." | 15:15 |
kashyap | coreycb: One more on that thing you asked earlier: QEMU upstream has this notion of "versioned CPU models", and we'll be getting new named models with the affected flags turned off. | 15:16 |
mnaser | i wonder why SMC would ship a box with it set to the minimum speed for a server where i dont really care about cpu performance, ill bump that to 18Gbps.. | 15:16 |
efried | dansmith: Well, we're relying on the encrypted-ness of the vdev file already. A main use case is "someone walks away with the disk and we're still okay". | 15:16 |
kashyap | coreycb: I'll update the Ubuntu bug you pointed. | 15:16 |
coreycb | kashyap: that sounds nice. great, thanks | 15:16 |
efried | dansmith: though since you bring it up, I wonder if it would be possible to (ab)use barbican... | 15:17 |
dansmith | efried: sure, but access to that enables an offline attack, and not everyone may have the same security policy for their swift cluster as they do for their hypervisor nodes | 15:17 |
dansmith | efried: swift may be a best-effort, uber-cheap storage bucket for them, whereas hypervisor and glance storage is high security at a premium | 15:17 |
dansmith | efried: do you have a pointer to what glance calls the feature of linking multiple image payloads together? | 15:18 |
efried | oh, is that already a thing? | 15:18 |
*** ociuhandu has quit IRC | 15:19 | |
dansmith | I thought it was, and I thought you made mention of it | 15:19 |
efried | I was just going to, like, make up a metadata key that would store the a unique identifier and stuff the same value on both images to link them. | 15:19 |
dansmith | oh okay, I think there was actually a use case for this feature for something else at some point, but I might be wrong | 15:19 |
sean-k-mooney | dansmith: do you mean the ability to have multiple image locations or something else | 15:19 |
efried | sean-k-mooney: talking about being able to "link" two images together in some logical fashion. | 15:20 |
dansmith | sean-k-mooney: that's not what we're talking about | 15:20 |
*** udesale has quit IRC | 15:20 | |
sean-k-mooney | oh ok like we propose to do for cyborg | 15:20 |
sean-k-mooney | e.g. where the os image has a metadata key that specifies the bitstream image for the fpga | 15:20 |
*** udesale has joined #openstack-nova | 15:20 | |
efried | yeah, that's a close enough use case | 15:21 |
efried | in this case the two images are completely 1:1 | 15:21 |
efried | but otherwise it's a pretty similar model. | 15:21 |
sean-k-mooney | im not sure there is an existing feature for that | 15:21 |
sean-k-mooney | if there was we would have suggested using it for cyborg | 15:22 |
dansmith | it's similar, but I would expect that glance would gain a new container format for that, | 15:22 |
dansmith | whereas this is more "please store this binary blob for me" | 15:22 |
sean-k-mooney | well that is what the bitstream kind of is but sure | 15:22 |
efried | sean-k-mooney: we're talking about the emulated TPM device, which libvirt stores outside of the instance dir. When we shelve-offload, we need to carry that vdev along somehow. We're brainstorming ways to make that happen. | 15:22 |
efried | see https://review.opendev.org/#/c/686804/6/specs/ussuri/approved/add-emulated-virtual-tpm.rst@197 | 15:22 |
sean-k-mooney | ah ok | 15:23 |
dansmith | sean-k-mooney: it's not, if there's some agreed-upon format for containing those bitstreams, even though the bitstreams themselves aren't standard | 15:23 |
*** READ10 has joined #openstack-nova | 15:23 | |
sean-k-mooney | dansmith: well for the bitstream i think the plan was initially to use raw | 15:24 |
sean-k-mooney | but there isnt a standard format | 15:24 |
dansmith | sean-k-mooney: you mean "bare"? | 15:24 |
dansmith | or you mean no additional wrapper around the bitstream? | 15:24 |
sean-k-mooney | i mean image type raw container type bare | 15:24 |
dansmith | yeah, so that doesn't seem like a good idea to me | 15:24 |
dansmith | because nova thinks it can boot those kinds of things | 15:24 |
efried | sean-k-mooney: the spec had originally been proposing to copy the vdev into the instance dir so it would be carried along for live migration, and then unpack it on the other side. But the instance dir is blown away for offload, so that model doesn't carry. | 15:24 |
dansmith | also, the other thing to keep in mind here, | 15:25 |
sean-k-mooney | well the bitstream format is vendor dependent | 15:25 |
sean-k-mooney | it would be nice to have a separate format for it | 15:25 |
dansmith | is that for things like glance multi-store, you have to know that those things are linked together when you move/copy between stores | 15:25 |
dansmith | and, | 15:25 |
dansmith | if you use that snapshot to spawn to a ceph-using compute node, | 15:25 |
dansmith | what do you do with that tpm image? | 15:25 |
dansmith | copy to everyone? ignore? | 15:25 |
sean-k-mooney | efried: right we would need to snapshot it and store it somewhere | 15:26 |
dansmith | "copy to everyone" meaning.. any instance that CoWs from that should get a copy of the tpm | 15:26 |
efried | good question. I'd hope that if you're using a snapshot to create multiple VMs, you would want them to start with a clean TPM, but mebbe not. | 15:26 |
dansmith | efried: but how do you know? | 15:26 |
dansmith | efried: if I snapshot my instance and delete it, then come back a year later for black friday, | 15:27 |
sean-k-mooney | dansmith: by the way this topic in general sounds more like the goal of glare, e.g. a generic artifact store | 15:27 |
dansmith | you don't know if I'm going to create one or a hundred of those | 15:27 |
sean-k-mooney | but i dont think glare is a thing anymore | 15:27 |
efried | dansmith: your point is that we can't differentiate between shelve-offload-unshelve and snapshot-restore? | 15:27 |
dansmith | sean-k-mooney: maybe, not sure glare was really for keeping this kind of thing, vs things like template configs.. but maybe | 15:27 |
dansmith | efried: no we definitely can, I'm saying snapshot-restore has a similar problem if you store the thing in glance or otherwise link it via the snapshot itself, vs. the image record in nova | 15:28 |
sean-k-mooney | so i guess there are different use cases but i would not assume that we want to share a tpm snapshot between multiple instances | 15:28 |
sean-k-mooney | that said i can see a use case for that where i store secure boot keys or something in the tpm snapshot | 15:29 |
dansmith | sean-k-mooney: agree | 15:29 |
efried | sean-k-mooney: I think dansmith is saying that, even if we put that stake in the ground, we can't necessarily know when we go to use a snapshot | 15:29 |
dansmith | right, | 15:29 |
sean-k-mooney | ya | 15:29 |
efried | but dansmith I would think we would only ever restore the vTPM if the operation is unshelve. If it's just "boot with this image" we don't. | 15:29 |
*** TxGirlGeek has joined #openstack-nova | 15:29 | |
dansmith | so it would be better if we have a solution where the linkage is between the instance and the tpm, not the snapshot image and the tpm | 15:29 |
*** bhagyashris has quit IRC | 15:30 | |
efried | yeah, that works. I was actually thinking to use the instance UUID as the link | 15:30 |
dansmith | eh? | 15:30 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: PoC for using COMPUTE_SAME_HOST_COLD_MIGRATE https://review.opendev.org/695220 | 15:30 |
efried | in the metadata of both images, store vtpm_original_instance_uuid=$instance_uuid | 15:30 |
dansmith | that's a bad idea, IMHO | 15:30 |
dansmith | that opens up an attack vector | 15:31 |
efried | on unshelve, I need to make sure that the instance I'm unshelving is $instance_uuid, and I also need to suck in the vtpm image with that $instance_uuid | 15:31 |
dansmith | if I can create an image, then I could create a tpm image for your instance and when you go to unshelve, you use my tpm instead of nothing because you didn't have one before, so I can inject keys and get you to use them potentially | 15:31 |
sean-k-mooney | dansmith: because the uuid can be updated | 15:31 |
sean-k-mooney | we likely need to store this nova side to be safe | 15:32 |
dansmith | if you're going to store the image in a glance image, storing the id of that in sysmeta would be much stronger a link I think | 15:32 |
sean-k-mooney | ya that is what i was thinking | 15:32 |
dansmith | and maybe the sha256 of it too | 15:32 |
efried | dansmith: Seems like that attack vector is open anyway, just narrower... oh, yeah, a signature ought to close it. | 15:32 |
dansmith | efried: it means you have to have the ability to modify the internals of nova and create a malicious image.. it's much narrower | 15:33 |
dansmith | creating images is done via the api.. sysmeta is only at the db layer so requires a much larger exploit | 15:33 |
sean-k-mooney | ya also if the tpm is encrypted | 15:34 |
efried | dansmith: also keep in mind the file is useless without... yeah. | 15:34 |
dansmith | note that I don't *know* that abusing glance in this way is legit | 15:34 |
sean-k-mooney | you not only need to replace the image, it has to be encrypted with the right key | 15:34 |
dansmith | efried: it's not useless, it's still a dictionary attack vector, but I understand | 15:34 |
dansmith | sean-k-mooney: right, but the key is get-able from barbican via the api | 15:35 |
efried | dansmith: so the instance record sticks around when offloaded? | 15:35 |
dansmith | efried: that's the whole point of shelve | 15:35 |
efried | okay | 15:35 |
sean-k-mooney | true | 15:35 |
efried | keep the instance record, blow away the allocations etc. | 15:35 |
efried | and I assume we can unshelve onto any host, doesn't have to be the original. | 15:36 |
sean-k-mooney | yes | 15:36 |
efried | (within the bounds of scheduler restrictions of course) | 15:36 |
sean-k-mooney | we go to the schduler to select the hosts | 15:36 |
dansmith | likely not the original | 15:36 |
dansmith | shelve was intended as "cold storage" so some significant amount of time would have passed before you unshelve, so very unlikely you land on the same host | 15:36 |
dansmith | It will not always be used that way, but.. can't make any assumptions | 15:37 |
efried | okay, cool, I'm going to write this up and let the stakeholders know we've doubled the complexity again. | 15:37 |
efried | And maybe go ask -glance for recommendations on how to store the thingy. | 15:37 |
dansmith | efried: we might ask jroll and the Verizon Media (tm) people what they think about storing the tpm in glance | 15:38 |
dansmith | security-wise | 15:38 |
efried | Right. | 15:38 |
efried | Thanks for the talk dansmith sean-k-mooney | 15:38 |
sean-k-mooney | the other use case for shelve is public cloud, to avoid billing, but ya unless you have a server affinity group or something its very unlikely to land on the same host | 15:39 |
dansmith | efried: already been asking some questions in -glance | 15:39 |
sean-k-mooney | no worries i need to review your spec and code too | 15:39 |
sean-k-mooney | ill try to do that this week | 15:39 |
efried | cool | 15:40 |
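A minimal sketch of the linkage dansmith suggests above, assuming nova's Instance object with its system_metadata dict; the sysmeta keys and helper names are hypothetical, not anything merged:

    import hashlib


    def record_vtpm_link(instance, vtpm_image_id, vtpm_blob):
        # Stash the glance image id and a digest of the blob in system
        # metadata, which is only reachable at the DB layer, not the API.
        instance.system_metadata['vtpm_image_id'] = vtpm_image_id
        instance.system_metadata['vtpm_sha256'] = hashlib.sha256(
            vtpm_blob).hexdigest()
        instance.save()


    def verify_vtpm_blob(instance, image_id, blob):
        # On unshelve, refuse the blob unless both the image id and the
        # digest match what nova recorded at shelve time; an attacker who
        # can only create glance images fails both checks.
        if instance.system_metadata.get('vtpm_image_id') != image_id:
            return False
        return (instance.system_metadata.get('vtpm_sha256') ==
                hashlib.sha256(blob).hexdigest())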
bauzas | mriedem: want me to take care of backporting https://review.opendev.org/#/c/695145/ to stable branches ? | 15:46 |
bauzas | or would you rather do it, with me reviewing ? | 15:46 |
mriedem | can you do both? | 15:46 |
bauzas | both backporting and reviewing ? well, is that a thing ? | 15:46 |
bauzas | we have elod and lyarwood we can bug | 15:47 |
*** shilpasd has quit IRC | 15:47 | |
mriedem | i have things in stable/train you could review | 15:47 |
bauzas | mriedem: sure thing | 15:48 |
mriedem | i guess i don't know if you're just talking about that specific change or reviewing code in general | 15:48 |
*** fgonzales3246 has joined #openstack-nova | 15:48 | |
mriedem | because both would be good | 15:48 |
bauzas | mriedem: no, sorry, was talking about proposing backports of this change | 15:48 |
bauzas | mriedem: cool, I've been committed to reviewing for the last 2 days | 15:49 |
bauzas | anyway, let's just backport this stuff... | 15:49 |
*** READ10 has quit IRC | 15:55 | |
*** ociuhandu has joined #openstack-nova | 15:56 | |
*** tbachman has quit IRC | 15:56 | |
dansmith | jroll: you around? | 15:57 |
efried | I suspect in this case it would be possible to use the keymgr backend to store this blob. But that's really not a good general model. | 15:58 |
efried | this is really a job for swift. And then the responsibility of the admin to make sure their swift is appropriately hardened. | 15:59 |
efried | do we have a precedent for nova talking to swift?? | 16:00 |
efried | I... don't see one. | 16:00 |
dansmith | so, if you have no swift, we what.. lose your tpm on snapshot/shelve or refuse to do it? | 16:00 |
efried | yeah, I would think so. | 16:00 |
dansmith | which? | 16:00 |
efried | refuse | 16:01 |
*** jmlowe has joined #openstack-nova | 16:01 | |
dansmith | how about instead of that, | 16:01 |
efried | I mean, there seems to have been plenty of talk on recent features about "not supported" (you can try it but it won't work right) versus actually blocked | 16:01 |
dansmith | we refuse to ever boot an instance with a tpm if there's no swift endpoint in the catalog? | 16:01 |
dansmith | efried: yeah and they all suck | 16:01 |
*** ociuhandu has quit IRC | 16:01 | |
efried | dansmith: I can live with that; same as the requirement for the keymgr to exist. | 16:02 |
dansmith | efried: because the user doesn't know any of this, other than that "every openstack cloud they use seems to be different and incompatible" | 16:02 |
fgonzales3246 | hello all, I'm configuring SSL connections to my MySQL database for the components, and when I enable this on the database and in each connection my controller starts to slow down and the requests fail due to timeout. | 16:02 |
dansmith | efried: ack, I'd much rather do that, or at least do that in addition to the other | 16:02 |
efried | dansmith: IMO it's better to block operations you know are broken than to simply state they're not supported and let them sh*t all over themselves if tried. | 16:02 |
fgonzales3246 | I tried to isolate one component, like cinder, but the same thing occurs... it seems the requests don't reach the compute node | 16:02 |
efried | but I think the counterargument has been that we want to be able to enable the operation without needing a new microversion. | 16:03 |
fgonzales3246 | but the services are all up and running, do you know what I could be missing? Something related to nova-conductor or something else? | 16:03 |
dansmith | efried: of course, I'm saying having "the plan" be a thousand places where seemingly fundamental operations can fail with the appropriate combination of obscure backend configs sucks from the user's perspective | 16:03 |
efried | anyway, in this case we're not talking API changes, and we're talking blocking at boot rather than (well, in addition to) shelve, so I'm good. | 16:03 |
efried | dansmith: totally agree. | 16:03 |
efried | block it if you know it's broken. | 16:03 |
dansmith | ...but try not to rely on that in design | 16:04 |
efried | now I just hope I can detect a swift | 16:04 |
efried | oh, | 16:04 |
efried | cause I wanted to add a check for "key manager configured" to the "do we enable vTPM in the first place" logic | 16:05 |
efried | but couldn't find a reliable way to inspect CONF for that | 16:05 |
dansmith | you need the catalog, no? | 16:05 |
efried | but yeah, | 16:05 |
efried | I can at least look in the catalog to make sure there's a keymgr endpoint. | 16:05 |
efried | um | 16:05 |
efried | except I'm doing this on init_host and don't have a context? | 16:05 |
dansmith | no, | 16:06 |
dansmith | you do this at boot time for the instance | 16:06 |
dansmith | non-tpm instances don't care about this | 16:06 |
dansmith | what you want to avoid is agreeing to create an instance for the user which in six weeks will be unsnapshottable | 16:06 |
efried | Sorry, I'm talking about something slightly different: the part where the libvirt driver decides whether to set the "can I vTPM?" flag. | 16:06 |
dansmith | I'm not talking about that | 16:06 |
efried | ...causing it to expose the traits | 16:06 |
efried | I know, I am. | 16:06 |
efried | I understand and agree with what you're talking about. | 16:07 |
dansmith | that too may be good, but expecting keystone is up and reachable when compute starts is not okay I think | 16:07 |
efried | orly? ... okay. | 16:07 |
dansmith | computes on the other side of a network partition shouldn't just fail to start I think | 16:07 |
efried | fair enough | 16:07 |
dansmith | they'll currently wait for conductor if it's not up yet, so... | 16:07 |
efried | yeah, I don't want to be the guy to introduce a network ping into init_host. | 16:08 |
dansmith | we have that :) | 16:08 |
dansmith | I added it | 16:08 |
efried | from compute driver or from compute manager? | 16:09 |
dansmith | ...but it's just a ping() rpc method compute makes to conductor to see if it's alive before it goes :) | 16:09 |
dansmith | manager | 16:09 |
efried | yeah, that's in manager. I'm talking about libvirt compute driver's init_host, which seems... dirtier. | 16:09 |
dansmith | yes, I'm just pointing out that I called it ping :) | 16:09 |
*** ivve has quit IRC | 16:09 | |
*** nanzha has quit IRC | 16:10 | |
efried | okay, so lacking a jroll for live discussion, I'll mod the spec as if this is the plan, and ping for reviews. | 16:10 |
efried | Thanks dansmith | 16:10 |
*** jmlowe has quit IRC | 16:10 | |
dansmith | sounds good | 16:11 |
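A rough sketch of that boot-time check using a keystoneauth session's service catalog; the exception class and function name are made up, and where exactly this would live in nova is left open here:

    from keystoneauth1 import exceptions as ks_exc


    class VTPMDependencyMissing(Exception):
        pass


    def assert_vtpm_supportable(session):
        # get_endpoint() consults the service catalog returned with the
        # auth token; EndpointNotFound means the service isn't deployed.
        for service_type in ('key-manager', 'object-store'):
            try:
                session.get_endpoint(service_type=service_type)
            except ks_exc.EndpointNotFound:
                raise VTPMDependencyMissing(
                    'vTPM requested but no %s endpoint in the catalog; '
                    'the instance could never be shelved' % service_type)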
mnaser | sean-k-mooney: so my new map is 3C per NUMA node (6 threads total), ~32GB per NUMA node, with quite interesting latency numbers | 16:13 |
mnaser | https://www.irccloud.com/pastebin/91D5sGuS/ | 16:13 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: zuul: Remove unnecessary 'tox_install_siblings' https://review.opendev.org/695235 | 16:13 |
*** udesale has quit IRC | 16:14 | |
efried | mriedem: the barbican-gating-on-py3 thing from yesterday seems to be busted https://review.opendev.org/#/c/695052/ -- would you mind having a quick look at some point? | 16:16 |
*** gyee has joined #openstack-nova | 16:16 | |
*** tbachman has joined #openstack-nova | 16:17 | |
*** dpawlik has joined #openstack-nova | 16:18 | |
stephenfin | efried: didn't AJaeger have a fix for that? | 16:22 |
* stephenfin looks for links | 16:22 | |
sean-k-mooney | mnaser: that kind of looks like what i would expect | 16:22 |
stephenfin | efried: https://review.opendev.org/#/c/689461/ | 16:22 |
efried | stephenfin: yes, gmann said something in -barbican | 16:22 |
sean-k-mooney | as the numa distance increases the latency does too | 16:22 |
stephenfin | ah, maybe not the same thing | 16:22 |
stephenfin | nvm me | 16:22 |
mnaser | sean-k-mooney: i also bumped the "4-link xGMI max speed" setting to 18Gbps .. but the machine is now crashing (it was set to 10.667 earlier) | 16:23 |
efried | stephenfin: yeah, he pointed to https://review.opendev.org/#/c/689458/ (same thing but in train) | 16:23 |
mnaser | so ill reset it back to that, and hopefully the latency stays at these numbers | 16:23 |
gmann | stephenfin: yeah, the stable/train backport will fix barbican. but still asking infra on the ML for the reason, for learning. | 16:23 |
stephenfin | sweet | 16:23 |
sean-k-mooney | but it lends more evidence to the idea we should look at cache affinity and numa distance when selecting vm numa node mappings | 16:24 |
*** ricolin has quit IRC | 16:24 | |
mnaser | sean-k-mooney: yep, giving an instance two NUMA nodes that are '32' away vs '10' .. | 16:24 |
sean-k-mooney | mnaser: can you dump the numa distance info | 16:24 |
mnaser | or rather 11/12 | 16:24 |
sean-k-mooney | it would be good to confirm the latency correlates with the numa distance reported | 16:25 |
mnaser | https://www.irccloud.com/pastebin/GwxGM24b/ | 16:25 |
mnaser | they're not that consistent but yeah | 16:25 |
sean-k-mooney | ya it does; if we look, nodes 8-15 have the same numa distance of 32 | 16:26 |
*** Sundar has joined #openstack-nova | 16:26 | |
mnaser | which are all socket #2 | 16:26 |
sean-k-mooney | yep | 16:26 |
sean-k-mooney | we see 2-7 are all reported at numa distance of 12 and have almost the same latency | 16:27 |
gmann | we need some reviews from mriedem and tonyb here to get these backports in (pinged dave-mccowan, barbican stable core, also) - we need to get these merged as soon as we can - https://review.opendev.org/#/q/I24a46d0d7476203feccb1250d4ce3ad94b2e0ecd | 16:27 |
mnaser | sean-k-mooney: struggling to see what the 11/12 are there for | 16:27 |
sean-k-mooney | ya its not fully clear but we could use this info to optimise slightly | 16:29 |
sean-k-mooney | i suspect that is related to how the dies are grouped in pairs | 16:29 |
sean-k-mooney | distance of 11 is to the other ccx in the pair | 16:29 |
sean-k-mooney | 12 is to the other ccx on the socket not in the pair | 16:29 |
*** dpawlik has quit IRC | 16:29 | |
sean-k-mooney | and 32 is cross socket in this case | 16:29 |
*** jawad_axd has joined #openstack-nova | 16:29 | |
mnaser | sean-k-mooney: ok that makes sense then, i will try to reset the infinity fabric speed down to see if the machine stops crashing :\ | 16:32 |
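For reference, the distances in that paste come from the ACPI SLIT, which Linux exposes per node in sysfs; a small sketch that reads them back into a matrix like the one above (10 = local, 11/12 = same socket, 32 = other socket on Rome):

    import glob


    def numa_distances():
        dist = {}
        for path in sorted(glob.glob('/sys/devices/system/node/node*/distance')):
            node = int(path.split('node')[-1].split('/')[0])
            with open(path) as f:
                dist[node] = [int(d) for d in f.read().split()]
        return dist


    dist = numa_distances()
    for node in sorted(dist):
        print(node, dist[node])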
*** ociuhandu has joined #openstack-nova | 16:33 | |
*** jawad_axd has quit IRC | 16:34 | |
Sundar | sean-k-mooney: Thanks a lot for reviewing the Cyborg spec despite your busy schedule. All the best. | 16:34 |
*** dpawlik has joined #openstack-nova | 16:35 | |
*** ociuhandu has quit IRC | 16:38 | |
*** ociuhandu has joined #openstack-nova | 16:38 | |
mriedem | gmann: the only one passing is stable/train | 16:40 |
*** jaosorior has joined #openstack-nova | 16:40 | |
*** dpawlik has quit IRC | 16:41 | |
efried | mriedem: it sounds like if that one merges it might fix master (I still don't understand how though) | 16:42 |
gmann | mriedem: yeah, some unrelated failure. i did recheck on stein. after the train cherry pick merges we can check if the problem is solved. otherwise we need to get all of them in | 16:42 |
gmann | it all depends on how zuul picks the job definition due to those branch variants. | 16:43 |
gmann | we can try one by one. the last option to avoid the barbican gate break is to make the grenade job py2 explicitly. | 16:44 |
mriedem | efried: because the job definitions are branch specific and grenade deploys the old side using stable/train and the new side using master | 16:47 |
mriedem | so sometimes to fix a thing on master for grenade you need to do something in master-1 | 16:47 |
mnaser | sean-k-mooney: ok making good progress, according to numastat it ended up scheduling to numa cell0 and cell1 which means this is the most "optimized" setup | 16:47 |
mnaser | topology is 8 sockets, i wonder if it would be better to model the topology as "2 sockets, 4 cores, 1 thread" or "2 sockets, 2 cores, 2 threads" too | 16:48 |
*** damien_r has quit IRC | 16:51 | |
*** damien_r has joined #openstack-nova | 16:53 | |
*** macz has joined #openstack-nova | 16:54 | |
efried | dansmith: are there any operations other than shelve for which the swift blob would be appropriate? | 16:56 |
dansmith | well, what are you going to do about snapshot? | 16:57 |
dansmith | snapshot gets a fresh tpm? | 16:57 |
dansmith | that doesn't work well for people that use snapshot as a backup mechanism | 16:57 |
dansmith | speaking of that, the backup api :) | 16:57 |
mnaser | burn it with fire | 16:58 |
* mnaser goes back to hiding | 16:58 | |
efried | dansmith: gr, config option? save_tpm_with_snapshot_note_does_not_apply_to_shelve_where_we_always_do_that_anyway = $bool | 16:58 |
dansmith | idk | 16:59 |
efried | open question I guess, sigh | 16:59 |
dansmith | I mean, not a config option, no | 16:59 |
*** martinkennelly has quit IRC | 16:59 | |
dansmith | but not sure what the answer is | 16:59 |
sean-k-mooney | mnaser: that is because we use itertools.permutations, so we will initially iterate over the host numa nodes in order trying to fit the virtual numa nodes to the host | 16:59 |
efried | I think the answer has to be no | 16:59 |
efried | because we have no way to associate in that scenario | 17:00 |
*** martinkennelly has joined #openstack-nova | 17:00 | |
dansmith | if I snapshot my instance and plan to roll back if my OS upgrade fails, I'm going to be super surprised that I can't read my encrypted data volume (ever again) | 17:00 |
efried | because we're relying on the instance sysmeta to associate for shelve. | 17:00 |
sean-k-mooney | mnaser: so in that specific case it will work | 17:00 |
mnaser | sean-k-mooney: so maybe eventually machines might get some inefficent assignments | 17:00 |
sean-k-mooney | mnaser: but in general it wont | 17:00 |
sean-k-mooney | mnaser: yes | 17:00 |
mnaser | so its possible i hit node4+node5 which is a bad time | 17:00 |
dansmith | efried: I think it'd be good to get some jroll input on this | 17:00 |
efried | agree | 17:00 |
sean-k-mooney | if nodes 0, 2 and 3 could fit it, it would select 0 and 2 | 17:01 |
sean-k-mooney | and never check 3 | 17:01 |
dansmith | efried: but I think people will be pretty surprised if it's gone | 17:01 |
mnaser | sean-k-mooney: so far im seeing some numbers increasing by 25% -- and ok i see | 17:01 |
efried | dansmith: but then we go back to needing the association in the instance meta, because there's nothing else | 17:01 |
sean-k-mooney | mnaser: that is why i said we likely could optimise this using the numa distance but its more complex than what we do today | 17:02 |
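A toy illustration of the behaviour sean-k-mooney describes: itertools.permutations yields host-node orderings lexicographically, so the first tuple that fits wins, with no regard for NUMA distance (nova's real fitting logic lives in nova.virt.hardware and is more involved than this):

    import itertools

    host_free_mem = {0: 32, 1: 4, 2: 32, 3: 32}   # GiB free per host NUMA node
    guest_nodes = [16, 16]                        # two 16 GiB virtual NUMA nodes


    def first_fit(host_free_mem, guest_nodes):
        for combo in itertools.permutations(host_free_mem, len(guest_nodes)):
            if all(host_free_mem[h] >= g for h, g in zip(combo, guest_nodes)):
                return combo
        return None


    print(first_fit(host_free_mem, guest_nodes))  # -> (0, 2); node 3 never checked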
*** Sundar has quit IRC | 17:02 | |
dansmith | efried: image meta you mean? I'm not saying we should, I'm just poking holes in it which.. if people lose data they get mad | 17:02 |
sean-k-mooney | mnaser: and it has implications for modeling in placement | 17:02 |
dansmith | efried: I'm on a call right now though | 17:02 |
efried | sorry, yeah, image meta. | 17:02 |
mnaser | sean-k-mooney: yes that sounds massively complex to get implemented -- but would be so awesome | 17:02 |
sean-k-mooney | mnaser: what feature that relates to numa isn't :) | 17:03 |
mnaser | hah, good one | 17:03 |
mnaser | sean-k-mooney: i mean in my case, the fact memory doesnt float across 8/16 numa nodes means at least 2 is better; the only "bad" potential outcome is if a VM gets pinned to two NUMA nodes that are pretty far apart | 17:03 |
sean-k-mooney | right as in on different sockets. | 17:03 |
mnaser | yep | 17:04 |
sean-k-mooney | but even then its still better than floating as it will be deterministic | 17:04 |
sean-k-mooney | well, in that once the instance is booted its performance should not change | 17:04 |
*** fgonzales3246 has quit IRC | 17:08 | |
*** dpawlik has joined #openstack-nova | 17:12 | |
*** xek_ has joined #openstack-nova | 17:15 | |
openstackgerrit | Merged openstack/nova stable/train: Imported Translations from Zanata https://review.opendev.org/694911 | 17:17 |
*** xek has quit IRC | 17:17 | |
*** rpittau is now known as rpittau|afk | 17:18 | |
*** dpawlik has quit IRC | 17:21 | |
*** takamatsu has joined #openstack-nova | 17:24 | |
*** damien_r has quit IRC | 17:26 | |
*** jawad_axd has joined #openstack-nova | 17:26 | |
*** jawad_axd has quit IRC | 17:31 | |
openstackgerrit | Boris Bobrov proposed openstack/nova master: Create a controller for qga when SEV is used https://review.opendev.org/693072 | 17:32 |
KeithMnemonic | any chance to please get some reviews on the "marker" patches mriedem was working on ? https://review.opendev.org/#/c/690721/4 Thanks! | 17:34 |
*** dpawlik has joined #openstack-nova | 17:35 | |
dustinc | efried, gibi, dansmith: RE Provider Config: Just got caught up on this morning's convo. Am I right in seeing that you guys want to A) drop NAME as identification method, and B) Allow both $UUID and $COMPUTE_NODE to identify same RP but $UUID takes precedence? If so I am 100% on board. | 17:36 |
*** jawad_axd has joined #openstack-nova | 17:47 | |
*** jaosorior has quit IRC | 17:50 | |
*** dpawlik has quit IRC | 17:51 | |
*** jawad_axd has quit IRC | 17:52 | |
*** jaosorior has joined #openstack-nova | 17:57 | |
*** igordc has joined #openstack-nova | 18:00 | |
*** derekh has quit IRC | 18:01 | |
dansmith | dustinc: I dunno about a conflict between a fixed uuid and $compute_node, but I would recommend handling that the same way I suggested for the name/uuid conflict, which is to log error and ignore both | 18:06 |
*** tbachman has quit IRC | 18:10 | |
*** amodi has joined #openstack-nova | 18:10 | |
*** jmlowe has joined #openstack-nova | 18:16 | |
*** tbachman has joined #openstack-nova | 18:17 | |
*** martinkennelly has quit IRC | 18:18 | |
*** tesseract has quit IRC | 18:18 | |
*** jaosorior has quit IRC | 18:22 | |
*** ralonsoh has quit IRC | 18:29 | |
*** nweinber_ has joined #openstack-nova | 18:30 | |
openstackgerrit | Boris Bobrov proposed openstack/nova master: Also enable iommu for virtio controllers and video in libvirt https://review.opendev.org/684825 | 18:33 |
openstackgerrit | Boris Bobrov proposed openstack/nova master: Create a controller for qga when SEV is used https://review.opendev.org/693072 | 18:33 |
*** nweinber has quit IRC | 18:34 | |
sean-k-mooney | by the way i just looked at the implemented spec folder for train. looks like we have not moved them https://github.com/openstack/nova-specs/tree/master/specs/train/implemented | 18:49 |
*** tosky has quit IRC | 18:55 | |
*** ociuhandu has quit IRC | 18:57 | |
*** ociuhandu has joined #openstack-nova | 18:59 | |
*** dviroel has joined #openstack-nova | 18:59 | |
*** ociuhandu has quit IRC | 19:03 | |
mriedem | sean-k-mooney: you can run this and post the results https://github.com/openstack/nova-specs/blob/master/tox.ini#L39 | 19:09 |
sean-k-mooney | sure just leaving to grab dinner but ill give it a try when i get back | 19:14 |
sean-k-mooney | ok so that checks with launchpad to determine if they were finished and then updates them | 19:15 |
sean-k-mooney | cool | 19:16 |
*** tbachman has quit IRC | 19:23 | |
*** tbachman has joined #openstack-nova | 19:25 | |
*** munimeha1 has quit IRC | 19:29 | |
*** awalende has joined #openstack-nova | 19:31 | |
*** lpetrut has quit IRC | 19:32 | |
*** lpetrut has joined #openstack-nova | 19:32 | |
*** ociuhandu has joined #openstack-nova | 19:35 | |
*** awalende_ has joined #openstack-nova | 19:36 | |
*** awalende has quit IRC | 19:39 | |
*** awalende_ has quit IRC | 19:41 | |
*** awalende has joined #openstack-nova | 19:41 | |
*** abaindur has joined #openstack-nova | 19:45 | |
*** abaindur has quit IRC | 19:45 | |
*** abaindur has joined #openstack-nova | 19:46 | |
*** ociuhandu has quit IRC | 19:46 | |
*** ociuhandu has joined #openstack-nova | 19:47 | |
*** abaindur has quit IRC | 19:48 | |
*** abaindur has joined #openstack-nova | 19:49 | |
*** awalende_ has joined #openstack-nova | 19:51 | |
*** ociuhandu has quit IRC | 19:53 | |
*** awalende has quit IRC | 19:54 | |
openstackgerrit | Mark Goddard proposed openstack/nova master: Clear rebalanced compute nodes from resource tracker https://review.opendev.org/695187 | 19:54 |
openstackgerrit | Mark Goddard proposed openstack/nova master: Invalidate provider tree when compute node disappears https://review.opendev.org/695188 | 19:54 |
openstackgerrit | Mark Goddard proposed openstack/nova master: Prevent deletion of a compute node belonging to another host https://review.opendev.org/694802 | 19:54 |
openstackgerrit | Mark Goddard proposed openstack/nova master: Fix inactive session error in compute node creation https://review.opendev.org/695189 | 19:54 |
*** ociuhandu has joined #openstack-nova | 19:57 | |
*** ociuhandu has quit IRC | 20:05 | |
*** awalende has joined #openstack-nova | 20:05 | |
*** awalende_ has quit IRC | 20:06 | |
*** awalende_ has joined #openstack-nova | 20:06 | |
*** eharney has quit IRC | 20:08 | |
*** awalende has quit IRC | 20:10 | |
*** nweinber__ has joined #openstack-nova | 20:11 | |
*** nweinber_ has quit IRC | 20:14 | |
*** awalende has joined #openstack-nova | 20:17 | |
*** awalende has quit IRC | 20:19 | |
*** awalende_ has quit IRC | 20:20 | |
*** tbachman has quit IRC | 20:26 | |
*** ociuhandu has joined #openstack-nova | 20:36 | |
efried | dustinc: The way I suggested yesterday still works IMO, and it requires less churn in both design and already-written code. | 20:38 |
efried | With dansmith's addendum | 20:40 |
efried | Which is as follows: | 20:41 |
efried | - Index by identifier (name *or* uuid) and fail hard on conflicts there | 20:41 |
efried | - Treat $COMPUTE_NODE as the default and a specific name/uuid as override | 20:41 |
efried | - If a provider appears such that you do find a conflicting name+uuid, log an error and ignore that provider completely. (This should only be possible for *nested* providers.) | 20:41 |
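A sketch of those three rules, assuming provider configs arrive as dicts with an identification section holding a name or uuid; the schema and function names here are illustrative, not the spec's:

    import logging

    LOG = logging.getLogger(__name__)


    def build_index(provider_configs):
        index = {}
        for cfg in provider_configs:
            ident = (cfg['identification'].get('uuid')
                     or cfg['identification'].get('name'))
            if ident in index:
                # Two configs claiming the same identifier: fail hard.
                raise ValueError('duplicate provider config for %s' % ident)
            index[ident] = cfg
        return index


    def lookup(index, name, uuid):
        # A specific name/uuid entry overrides the $COMPUTE_NODE default;
        # if separate name and uuid entries both match one provider, log
        # an error and ignore that provider completely.
        by_name, by_uuid = index.get(name), index.get(uuid)
        if by_name is not None and by_uuid is not None:
            LOG.error('conflicting name and uuid configs for %s; '
                      'ignoring this provider', name)
            return None
        return by_uuid or by_name or index.get('$COMPUTE_NODE')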
*** jmlowe has quit IRC | 20:43 | |
*** ociuhandu has quit IRC | 20:51 | |
*** tbachman has joined #openstack-nova | 20:51 | |
*** ociuhandu has joined #openstack-nova | 20:53 | |
efried | dansmith: shelve without offload only works for non-volume-backed, right? | 21:00 |
efried | and followup question, should we do the swift thing for shelve without offload, or only with offload? I guess the question is a) does unshelve always happen on same host in that case; b) do we care about "shelving" the minimal amount of storage required by the vdev? | 21:01 |
artom | sean-k-mooney, as promised, if you still have energy today, https://review.opendev.org/#/c/691062/ is ready again | 21:02 |
efried | probably yes to b), just for the sake of form, so shelving the vdev would be the right thing to do. | 21:02 |
efried | guess it really just makes sense to make it part of snapshot and have a crisp line | 21:04 |
*** ociuhandu has quit IRC | 21:06 | |
mriedem | efried: shelve works for volume-backed and non-volume-backed | 21:13 |
*** eharney has joined #openstack-nova | 21:14 | |
efried | mriedem: In the compute API I see that volume-backed calls the shelve-offload code path. | 21:14 |
mriedem | if you shelve but don't offload (offloading is automatic with the default config) and then unshelve within the offload window, you do end up on the same host (conductor just starts the server) | 21:14 |
*** awalende has joined #openstack-nova | 21:14 | |
efried | https://opendev.org/openstack/nova/src/branch/master/nova/compute/api.py#L3867-L3873 ? | 21:15 |
mriedem | yeah so in the case of volume-backed shelve there is no snapshot image | 21:15 |
efried | right, but the distinction for 'offload' (at least the one that I care about) is that it calls destroy() | 21:16 |
efried | whereas non-offload doesn't. | 21:16 |
efried | uh, unless I got confused. | 21:16 |
efried | yeah | 21:16 |
mriedem | yeah shelve without offload powers off the guest and creates a snapshot | 21:16 |
efried | ack | 21:17 |
mriedem | i'm drawing a blank on why we shelve offload immediately for volume-backed servers | 21:18 |
mriedem | just ask one of these guys from 2013 https://review.opendev.org/#/c/34032/ | 21:19 |
efried | was gonna say, seems like it's been that way from the get-go. dansmith co-authored, so... | 21:19 |
*** awalende has quit IRC | 21:19 | |
efried | he'll surely know. | 21:19 |
mriedem | i guess this comment https://review.opendev.org/#/c/34032/21/nova/compute/manager.py@3099 | 21:20 |
dansmith | mriedem: because there's no reason not to offload immediately | 21:20 |
efried | probably because waaay back then, the distinction with offload was only whether snapshot was necessary, which it isn't for volume (right?) | 21:20 |
dansmith | mriedem: there's no point in maintaining affinity | 21:20 |
dansmith | efried: not just no snapshot required, but there's no benefit to unshelve to the same host before offload | 21:21 |
dansmith | once offloaded, it might as well go anywhere, but before offload, it's much quicker if you stay put | 21:21 |
dansmith | but with volume-backed, that distinction does not exist | 21:21 |
efried | got it. | 21:21 |
dansmith | efried: also, did you say I co-authored something? | 21:22 |
mriedem | i didn't realize the offload periodic came later https://review.opendev.org/#/c/35361/ | 21:22 |
dansmith | oh that shelve patch | 21:22 |
dansmith | I dunno what I co-authored about it | 21:22 |
dansmith | not much | 21:22 |
mriedem | not that one | 21:22 |
mriedem | this | 21:22 |
mriedem | https://review.opendev.org/#/c/34032/ | 21:22 |
dansmith | right | 21:23 |
efried | co-authored *and* +2ed, tsk | 21:23 |
efried | those were rowdy times | 21:23 |
eandersson | Anyone got any experience with the openstacksdk interacting with nova? Trying to figure out how to fix a memory leak in Senlin. | 21:23 |
dansmith | efried: never uploaded a patch set, | 21:23 |
efried | eandersson: you mean nova talking through sdk, or using sdk to talk to nova? | 21:23 |
dansmith | efried: so that was probably honorary because I helped figure something out | 21:23 |
eandersson | using the sdk to talk to nova | 21:23 |
efried | eandersson: mmph. Not so hot there. Given you already met crickets in -sdks, mriedem and I might between us be able to figure things out... | 21:24 |
eandersson | Senlin seems to be creating a new client every time it makes a call to nova, which isn't ideal. | 21:25 |
efried | and you think that's because of the way it's talking to sdk? | 21:25 |
eandersson | I wanted to make it a singleton, but not sure how to do that when the user/project might differ. | 21:25 |
eandersson | https://review.opendev.org/#/c/695139/4/senlin/profiles/base.py | 21:26 |
eandersson | If you look at line 389 there | 21:26 |
eandersson | So I know for sure the leak is caused by the openstacksdk | 21:26 |
mriedem | so create a singleton map per user/project hash? | 21:26 |
eandersson | Yea - was going to go that route, but seems crazy to me hehe | 21:27 |
mriedem | obviously it's just masking a leak | 21:27 |
eandersson | Isn't there a way I can just create one openstacksdk object and re-use it | 21:27 |
mriedem | but idk why the sdk would be leaking | 21:27 |
mriedem | i'm not the guy to ask about that... | 21:27 |
eandersson | It could be some reference that keeps it alive | 21:27 |
eandersson | I poked mordred so maybe he'll get back to me tomorrow | 21:28 |
efried | I am *sure* I remember mordred telling me sdk already does local singleton-ing | 21:28 |
efried | but that doesn't mean it isn't buggy. | 21:28 |
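For what it's worth, mriedem's "singleton map per user/project hash" could look like the sketch below; as noted in the discussion, it masks the leak rather than fixing it:

    import threading

    import openstack

    _connections = {}
    _lock = threading.Lock()


    def get_connection(user_id, project_id, **auth_kwargs):
        # Reuse one openstacksdk Connection per (user, project) instead of
        # constructing a fresh one for every call to nova.
        key = (user_id, project_id)
        with _lock:
            conn = _connections.get(key)
            if conn is None:
                conn = openstack.connect(**auth_kwargs)
                _connections[key] = conn
        return conn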
mriedem | dansmith: this is random, but i've noticed that during a resize_claim we make at least 3 lazy-load roundtrip calls to the db to load fields that in most cases probably aren't even populated http://paste.openstack.org/show/786457/ | 21:29 |
dansmith | sweet | 21:30 |
mriedem | was wondering what you thought about adding something to Instance.refresh() to specify additional fields to load up | 21:30 |
*** xek__ has joined #openstack-nova | 21:30 | |
mriedem | refresh today just pulls the instance with its existing additional joined fields, | 21:30 |
dansmith | lazy load pci_Devices twice? | 21:30 |
dansmith | resources is the new vpmems stuff right? | 21:30 |
mriedem | pci_requests, pci_devices and resources | 21:30 |
dansmith | ah right | 21:30 |
dansmith | but sure, refresh with an extra_attrs optional param makes sense | 21:31 |
mriedem | so was thinking something like Instance.refresh(..., extra_attrs=None) yeah | 21:31 |
mriedem | it's a version bump since refresh is remotable | 21:31 |
dansmith | so you can fault them all in in a single go if you know you're going to hit them | 21:31 |
mriedem | but that's probably not a big deal | 21:31 |
mriedem | right | 21:31 |
dansmith | mriedem: or if you don't need a refresh, you could just do a "load_if_not_present()" | 21:31 |
mriedem | and i want to use refresh here since the instance in the resize_claim is what's being used in the compute manager as well | 21:31 |
*** xek_ has quit IRC | 21:32 | |
*** dtantsur is now known as dtantsur|afk | 21:32 | |
dansmith | then sure | 21:33 |
mriedem | https://bugs.launchpad.net/nova/+bug/1853370 if you want to hack on it | 21:35 |
openstack | Launchpad bug 1853370 in OpenStack Compute (nova) "resize_claim lazy-loads at least 3 joined fields in separate DB calls" [Low,Confirmed] | 21:35 |
mriedem | i just wanted to dump thoughts before moving on | 21:35 |
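What that might look like, as an abridged sketch in the shape of nova's existing Instance.refresh(); this is not a merged signature, and as noted above it implies an object version bump because refresh() is remotable:

    @base.remotable
    def refresh(self, use_slave=False, extra_attrs=None):
        # today: re-join whichever optional fields are already loaded
        extra = [f for f in INSTANCE_OPTIONAL_ATTRS
                 if self.obj_attr_is_set(f)]
        # new: also fault in fields the caller knows it is about to touch,
        # e.g. refresh(extra_attrs=['pci_requests', 'pci_devices',
        #                           'resources'])
        for f in extra_attrs or ():
            if f not in extra:
                extra.append(f)
        current = self.__class__.get_by_uuid(
            self._context, uuid=self.uuid, expected_attrs=extra,
            use_slave=use_slave)
        # ...then copy current's fields back onto self as refresh() already
        # does, so the lazy-loadable fields arrive in one DB round trip.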
sean-k-mooney | artom: ya ill take a look ill be around for about half an hour so ill review that before i finish up today | 21:37 |
*** tosky has joined #openstack-nova | 21:41 | |
efried | sigh, I need a refresher on rebuild again. | 21:42 |
*** pcaruana has quit IRC | 21:42 | |
efried | this is where you give an instance a new image but keep its network etc. | 21:42 |
efried | does vTPM fall into "etc"? | 21:43 |
efried | I would kinda think not. | 21:43 |
efried | dansmith: ^ | 21:43 |
dansmith | efried: this? | 21:43 |
dansmith | shelve? | 21:43 |
dansmith | oh on rebuild | 21:43 |
dansmith | on rebuild the instance cannot move, you just rebuild its root from a new image | 21:44 |
efried | Yeah, so the question is whether we should keep the vTPM | 21:44 |
dansmith | except when it's evacuate | 21:44 |
dansmith | we should, I would think | 21:44 |
dansmith | rebuild is akin to putting the Windows 98 disc in the drive and rebooting | 21:44 |
sean-k-mooney | artom: so ya im happy with those. one thing i noticed is we should update the job templates to drop python 2 and 3.5 for the tempest plugin and just use the ussuri template instead | 21:44 |
sean-k-mooney | that can be in a follow up patch | 21:45 |
efried | I kinda want to say the vTPM should ride with the image | 21:45 |
efried | that would make for consistency with the backup/restore and shelve/unshelve type cases. | 21:45 |
sean-k-mooney | we do not nuke additional ephemeral disks on rebuild right? | 21:46 |
sean-k-mooney | just the root disk | 21:46 |
dansmith | vtpm should survive a rebuild | 21:46 |
dansmith | for sure | 21:46 |
sean-k-mooney | you could look at the vtpm the same way | 21:46 |
efried | i.e. any time you create an instance from an image, we check the image for the metadata that points to the vtpm swift store. | 21:46 |
efried | dansmith: so what if your instance has a vTPM, and then you rebuild with an image that has ^ metadata? | 21:47 |
*** nweinber__ has quit IRC | 21:47 | |
mriedem | efried: https://docs.openstack.org/nova/latest/contributor/evacuate-vs-rebuild.html ?!!??! :) | 21:47 |
sean-k-mooney | efried: i dont think that is right | 21:47 |
dansmith | efried: instance create is like creating a new server box, rebuild is like reinstalling the OS | 21:47 |
sean-k-mooney | efried: i think the vtpm lifetime should be tied to the instance not the image | 21:47 |
dansmith | TPM is hardware, therefore it stays during a rebuild, IMHO | 21:47 |
dansmith | sean-k-mooney: ++ | 21:47 |
efried | So what if your instance has a vTPM, and then you rebuild with an image that has pointer-to-vtpm-in-swift metadata? | 21:47 |
efried | "they better match, or we punt"? | 21:48 |
*** dviroel has quit IRC | 21:48 | |
dansmith | efried: you mean if you rebuild from a snapshot? | 21:48 |
dansmith | remember the pointer to swift stays with the instance not the snapshot | 21:48 |
dansmith | oh you mean for the backup case.. that's why it doesn't work for the backup case :) | 21:48 |
efried | no, you convinced me earlier that snapshot and restore should work. | 21:48 |
dansmith | no, I didn't, I said "you're effed on one of these two things depending on what you decide" :) | 21:48 |
sean-k-mooney | yes and the pointer to the tpm snapshot would be stored in the nova system metadata, not the snapshot metadata | 21:49 |
efried | but that doesn't work for backup/restore | 21:49 |
efried | because there's no instance meta | 21:49 |
efried | there is only image meta. | 21:49 |
sean-k-mooney | oh right we were talking about shelve and unshelve before | 21:50 |
efried | if we go on the philosophy that TPM is like hardware, it's not unreasonable that backup/restore would lose it. | 21:50 |
sean-k-mooney | well for vpmem and ephemeral disks we dont snapshot them currently | 21:50 |
efried | if I do a backup of my laptop and then restore that backup on a different shell, my hardware changed. | 21:51 |
sean-k-mooney | and if you reinstall the os on the same laptop the tpm data is preserved | 21:51 |
dansmith | efried: but that's why VMs are better than hardware | 21:51 |
efried | heh | 21:51 |
efried | Okay, does this sound reasonable: | 21:52 |
efried | On rebuild, if your instance had a vTPM before, and the image specifies a vTPM also, they must be the same vTPM, or we fail. | 21:53 |
sean-k-mooney | not really | 21:53 |
efried | what should we do in that case? | 21:53 |
sean-k-mooney | im not sure the image should be able to reference a vtpm | 21:54 |
efried | then we can't do backup/restore. | 21:54 |
sean-k-mooney | of the tpm | 21:54 |
dansmith | no | 21:54 |
sean-k-mooney | we can still do backup and restore of the vm root disk like a normal snapshot | 21:55 |
dansmith | efried: on rebuild you don't have a tpm in swift because you have it on disk, it hasn't gone anywhere | 21:55 |
dansmith | efried: so if the snapshot specifies one, you wouldn't use it anyway | 21:55 |
efried | so if the image designates a vTPM, ignore it | 21:55 |
dansmith | the thing I'm more worried about if you do this, | 21:55 |
dansmith | is you snapshot an instance and then start 100 copies of it | 21:55 |
dansmith | you just sprayed your keys everywhere | 21:56 |
efried | sean-k-mooney: ftr, this is where dansmith convinced me the vTPM needs to ride with the backup http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-11-20.log.html#t2019-11-20T17:00:12 | 21:56 |
dansmith | like a dog marking its territory on a hot summer day | 21:56 |
efried | nice visual | 21:56 |
dansmith | efried: again | 21:56 |
dansmith | efried: I'm not saying that it should necessarily, I'm saying it's going to eff up someone one way or the other | 21:56 |
dansmith | people use snapshot to roll back from a bad OS upgrade, and if you don't keep their keys, they're going to get screwed | 21:57 |
sean-k-mooney | should it be configurable? | 21:57 |
dansmith | but if you do, and they spawn that image in a bunch of places, ... | 21:57 |
efried | dansmith: well... we could also delete the swift obj (and mod the image meta) the first time we restore it. | 21:57 |
efried | that works poorly for crashy hosts tho | 21:57 |
dansmith | efried: that's fine if you store it in the image record, despite a failed host | 21:58 |
dansmith | efried: but I think that's probably going to be fragile and maybe confusing why the first instance got it and none of the others did | 21:58 |
dansmith | but perhaps that's the best option | 21:58 |
dansmith | just saying, it's going to be icky | 21:58 |
efried | I'm saying: if my workflow is to back up stuff in case my host goes down, and I do that backup, and then my host goes down, and I restore it, and then my host goes down again, then when I restore it the second time, my vTPM will be gone. | 21:59 |
dansmith | oh, yes, I see what you mean | 21:59 |
*** ociuhandu has joined #openstack-nova | 21:59 | |
sean-k-mooney | is this something we want the user to express a policy on e.g. hw_vtpm_snapshot=true|false | 21:59 |
dansmith | efried: this is also bad for the shared storage case in general actually, | 21:59 |
efried | sean-k-mooney: we could configure it to death, really. | 22:00 |
dansmith | efried: one thing about ceph backed instances is you can recover from a failed host with an evac | 22:00 |
dansmith | efried: but in this case you will fail to boot after evac because we don't have the keys anywhere | 22:00 |
sean-k-mooney | dansmith: you would lose the tpm data in that case | 22:00 |
dansmith | so that's another case this breaks pretty badly | 22:00 |
dansmith | volume-backed or ceph/nfs backed ephemeral | 22:00 |
dansmith | sean-k-mooney: right, that's what I just said | 22:00 |
sean-k-mooney | nfs might be ok if you also put the tpm datastore on nfs but ya. the impact of that will depend on what you used the tpm for i guess | 22:01 |
dansmith | sean-k-mooney: oh right, but not ceph | 22:02 |
efried | because ceph can only handle the disk itself? | 22:02 |
efried | similar to volume-backed | 22:02 |
dansmith | we don't actually mount the storage on the host with ceph | 22:02 |
dansmith | qemu talks to ceph itself, so we don't like have a dir mounted or anything | 22:02 |
sean-k-mooney | the tpm data store is not a volume so we wont create it as such in ceph | 22:02 |
efried | wow, neat. | 22:03 |
efried | dansmith: you know, in that case it's juuust possible qemu will store the vTPM file on ceph too | 22:03 |
sean-k-mooney | we kind of wish qemu could do the same for the other backends too | 22:03 |
dansmith | efried: I don't think so | 22:03 |
efried | I doubt it though. | 22:03 |
efried | yeah. | 22:03 |
dansmith | efried: we'd have to give it a volume id for that or something | 22:03 |
dansmith | the flow chart of "how your instance behaves if you have a vtpm" is starting to get pretty scary | 22:04 |
efried | today it makes up the directory based on the VM ID | 22:04 |
efried | dansmith: well, it's always the corner cases that suck. | 22:04 |
efried | 80/20 | 22:04 |
dansmith | efried: well, snapshot is not a corner case IMHO :) | 22:04 |
dansmith | and ceph definitely isn't | 22:04 |
efried | no, but second-restore-of-snapshot-because-host-went-down-twice is | 22:04 |
dansmith | I wouldn't say so | 22:05 |
sean-k-mooney | so unless you take steps to put the vtpm data store on shared storage i think we can assume it would be lost on evacuate in general | 22:05 |
efried | sean-k-mooney: definitely that | 22:05 |
dansmith | sean-k-mooney: which means unbootable instance if you put your FDE keys there | 22:05 |
dansmith | so in the flow chart, it means evacuate has a giant asterisk next to it :) | 22:05 |
sean-k-mooney | yes maybe, although rescue could save you? | 22:05 |
dansmith | "we will reconstruct the smoking hull of an instance somewhere else where you know your data is in there, but it's unreadable" :) | 22:06 |
sean-k-mooney | e.g. allow you to reinject the key if you still have it | 22:06 |
dansmith | sean-k-mooney: the whole point of the tpm is to not have to do that right? | 22:06 |
efried | not to be *able* to do that | 22:06 |
dansmith | secure enclave doesn't have the same meaning if you also have it on a post-it under your monitor | 22:06 |
sean-k-mooney | it has other services it can provide to the os but key storage is one of them yes | 22:06 |
efried | I guess I'll just write up the big asterisk in the spec & docs. | 22:07 |
*** ociuhandu has quit IRC | 22:07 | |
dansmith | I'd like to see a list of all these special cases yeah | 22:07 |
efried | You would kinda have to make a new backup of the vtpm every time you write to it. | 22:07 |
dansmith | because if the list is too large, it starts to become not so useful | 22:08 |
sean-k-mooney | efried: you could make the same argument for the root disk; i think that is out of scope for nova | 22:09 |
efried | sean-k-mooney: right, but the answer to "evacuate doesn't help you with ephemeral root disk" is "use volume/ceph". | 22:09 |
sean-k-mooney | and here its use barbican | 22:10 |
efried | with a vTPM the answer to "evacuate doesn't help you with ephemeral root disk" is "you better have a very recent snapshot of your vTPM" | 22:10 |
efried | no | 22:10 |
efried | barbican only stores the key used to unlock the file we've been discussing. | 22:10 |
efried | said file will be stored in swift | 22:10 |
efried | but only during snapshot | 22:10 |
sean-k-mooney | sure but as a user you can also store keys yourself in barbican | 22:10 |
efried | how does that help? | 22:10 |
efried | you mean if you want to not use a vTPM at all? | 22:10 |
sean-k-mooney | use the tpm as a local secure cache of the key | 22:11 |
sean-k-mooney | if you evacuate, retrieve the backup from barbican | 22:11 |
efried | if barbican is secure enough for you, you would just use barbican. | 22:11 |
*** awalende has joined #openstack-nova | 22:11 | |
efried | though arguably you're counting on that ultimately here anyway. | 22:11 |
sean-k-mooney | you dont want to transfer it over the network every time you boot | 22:11 |
efried | we're doing that anyway. | 22:11 |
sean-k-mooney | well ya | 22:12 |
sean-k-mooney | i was just going to say you have to do that to unlock the vtpm | 22:12 |
efried | Also I believe "VM talks to barbican" is a thing we're avoiding | 22:12 |
efried | limiting to "host talks to barbican" | 22:12 |
sean-k-mooney | barbican is a top level openstack service so workloads can use it if they want to | 22:13 |
efried | if they want to, yes. | 22:13 |
efried | but it's the openstack user's credentials as opposed to $random_vm_user | 22:13 |
sean-k-mooney | oh yes that is what i meant by vm | 22:13 |
efried | and also I imagine there are reasons for the VM to be blocked from the keymgr service network-wise. | 22:13 |
sean-k-mooney | sorry | 22:13 |
sean-k-mooney | i meant application in the vm | 22:13 |
sean-k-mooney | anyway i think a reasonable alternative is: if you need the keys to survive beyond the lifetime of the vm, or to be shared in a "secure" way, store them in barbican | 22:15 |
sean-k-mooney | if you can tolerate losing the key in an evacuate case then you can store it solely in the tpm | 22:15 |
*** awalende has quit IRC | 22:15 | |
*** tbachman has quit IRC | 22:16 | |
sean-k-mooney | dansmith: your primary concern is that we lose the key used to encrypt the os drive, and lose data, if we lose the vtpm, right | 22:16 |
sean-k-mooney | would ^ not solve that issue | 22:16 |
efried | resize does a destroy and then a spawn? | 22:18 |
sean-k-mooney | it will redefine the domain but you dont lose data | 22:19 |
sean-k-mooney | and it may change host | 22:19 |
sean-k-mooney | e.g. if you resize from a flavor with 10G root disk to one with 20 it will grow the disk file | 22:20 |
sean-k-mooney | and if its libvirt, the partition table and filesystem if it can | 22:20 |
sean-k-mooney | so we would have to copy the vtpm file if we move host | 22:20 |
efried | yeah, I'm exploring corner cases with the vTPM. If your resize adds/removes a vTPM where it was previously absent/present, that's fine. If both old & new flavor have the same version/model, we should carry the data. If they have a different version/model...? | 22:21 |
sean-k-mooney | if its to the same host we should not need to do anything | 22:21 |
sean-k-mooney | if they are different models i would expect you to lose the contents of the tpm | 22:21 |
sean-k-mooney | unless we block that intentionally | 22:21 |
efried | yes to the first, the second is the question. | 22:22 |
efried | there's no way to persist the data, period. The versions/models are incompatible. | 22:22 |
sean-k-mooney | ya that is probably true, or at least involved to do, so i would not try it in v1 of this | 22:23 |
sean-k-mooney | there might be a way to convert but im not aware of one off the top of my head | 22:23 |
efried | sean-k-mooney: not without introspecting, which ain't my business. | 22:24 |
sean-k-mooney | swtpm might have a function to do it | 22:24 |
sean-k-mooney | but ya unless libvirt supports it i think its out of scope | 22:24 |
efried | so I'm gonna say don't block it, cause who am I to say you didn't mean to do it | 22:24 |
sean-k-mooney | that would be my feeling too | 22:25 |
sean-k-mooney | we should document it i guess | 22:25 |
efried | doing that now | 22:26 |
sean-k-mooney | but i dont think we should guard in code against all pebcak errors | 22:26 |
efried | agreed | 22:26 |
*** slaweq has quit IRC | 22:26 | |
efried | hmph, the thing about saving the swift ID in instance sysmeta has a problem code flow-wise. | 22:27 |
efried | because the driver is the thing that knows the vdev is in this particular file path | 22:27 |
efried | but the only thing we've asked the driver to do is snapshot. | 22:27 |
efried | and (I think) the driver can't tell if it's being asked to snapshot as part of a shelve or backup or... | 22:28 |
sean-k-mooney | hmm im not 100% sure about that last part | 22:28 |
efried | so I guess it could edit the sysmeta itself... but if it's not a shelve and the compute mgr is about to blow away the instance, the compute mgr needs a way to get that information so it can stuff it into the image meta. | 22:29 |
efried | I guess it can pull it from the instance | 22:29 |
*** dave-mccowan has quit IRC | 22:29 | |
efried | so the contract with the virt driver's snapshot method has to spell out where & how this info is stored in the sysmeta. | 22:29 |
* efried answers own question. | 22:29 | |
sean-k-mooney | we create a snapshot for shelve but i dont think the compute manager actually calls snapshot as part of that flow, i think its done by the driver | 22:30 |
efried | eh? | 22:30 |
sean-k-mooney | that said i could be way off base there so im checking | 22:30 |
efried | compute manager's shelve calls driver's snapshot | 22:30 |
*** JamesBen_ has joined #openstack-nova | 22:31 | |
sean-k-mooney | oh ok | 22:31 |
efried | https://opendev.org/openstack/nova/src/branch/master/nova/compute/manager.py#L5911 | 22:31 |
*** rcernin has joined #openstack-nova | 22:31 | |
*** JamesBen_ has quit IRC | 22:32 | |
sean-k-mooney | ya you're right, i thought we had a shelve function on the driver interface that we called, but this is handled in the compute manager instead | 22:32 |
*** JamesBen_ has joined #openstack-nova | 22:32 | |
sean-k-mooney | so snapshot is passed the instance so it can get the flavor and image metadata | 22:33 |
sean-k-mooney | so the snapshot function can check if the instance has a vtpm and directly update the system metadata as you said | 22:33 |
*** tbachman has joined #openstack-nova | 22:34 | |
*** JamesBenson has quit IRC | 22:34 | |
sean-k-mooney | oh right but it would not know if its a shelve or not | 22:34 |
*** xek_ has joined #openstack-nova | 22:34 | |
*** slaweq has joined #openstack-nova | 22:35 | |
sean-k-mooney | oh it can just check the instance.task_state right? | 22:36 |
efried | could, but compute mgr has to be able to peel it out anyway, so it's simpler if snapshot just always behaves the same and compute mgr pulls the info out of the sysmeta if it needs it. | 22:36 |
*** xek__ has quit IRC | 22:36 | |
*** JamesBen_ has quit IRC | 22:36 | |
sean-k-mooney | ok | 22:37 |
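A sketch of that contract with made-up key and helper names: the driver's snapshot() always records where it stashed the vTPM blob, and the compute manager peels the pointer out of sysmeta when a given flow needs to copy it into image metadata:

    VTPM_SYSMETA_KEY = 'vtpm_swift_object_id'


    # driver side, at the end of snapshot():
    def _record_vtpm_location(instance, swift_object_id):
        instance.system_metadata[VTPM_SYSMETA_KEY] = swift_object_id
        instance.save()


    # manager side, after driver.snapshot() returns:
    def _vtpm_location(instance):
        # None means there was no vTPM to upload; shelve leaves the
        # pointer in sysmeta, other flows may copy it into image
        # properties before the instance record goes away.
        return instance.system_metadata.get(VTPM_SYSMETA_KEY)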
sean-k-mooney | anyway night all o/ | 22:37 |
*** slaweq has quit IRC | 22:40 | |
*** abaindur has quit IRC | 22:41 | |
*** abaindur has joined #openstack-nova | 22:43 | |
*** tbachman has quit IRC | 22:44 | |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: Integrating with unified limits https://review.opendev.org/615180 | 22:46 |
*** slaweq has joined #openstack-nova | 22:50 | |
*** slaweq has quit IRC | 22:55 | |
*** tkajinam has joined #openstack-nova | 23:05 | |
*** slaweq has joined #openstack-nova | 23:06 | |
*** xek_ has quit IRC | 23:08 | |
openstackgerrit | John Garbutt proposed openstack/nova master: Better policy unit tests https://review.opendev.org/657697 | 23:09 |
*** slaweq has quit IRC | 23:10 | |
*** slaweq has joined #openstack-nova | 23:11 | |
*** mlavalle has quit IRC | 23:15 | |
*** slaweq has quit IRC | 23:15 | |
*** brault has joined #openstack-nova | 23:16 | |
efried | argh, destroy_disks=True is no longer sufficient grounds to delete the barbican key :( | 23:17 |
efried | In fact, I'm not sure we can "ever" do that. | 23:18 |
*** slaweq has joined #openstack-nova | 23:21 | |
*** slaweq has quit IRC | 23:26 | |
*** slaweq has joined #openstack-nova | 23:30 | |
*** slaweq has quit IRC | 23:35 | |
*** slaweq has joined #openstack-nova | 23:37 | |
*** slaweq has quit IRC | 23:45 | |
*** kaisers1 has quit IRC | 23:48 | |
*** slaweq has joined #openstack-nova | 23:48 | |
*** kaisers has joined #openstack-nova | 23:48 | |
openstackgerrit | Eric Fried proposed openstack/nova-specs master: Spec: Ussuri: Encrypted Emulated Virtual TPM https://review.opendev.org/686804 | 23:51 |
efried | phew. dansmith sean-k-mooney TxGirlGeek ^ | 23:51 |
efried | and jroll | 23:51 |
*** slaweq has quit IRC | 23:53 | |
*** tbachman has joined #openstack-nova | 23:54 | |
*** zhanglong has joined #openstack-nova | 23:54 | |
*** zhanglong has quit IRC | 23:59 |