Wednesday, 2019-11-20

*** slaweq has quit IRC00:01
*** ociuhandu has quit IRC00:02
*** slaweq has joined #openstack-nova00:11
*** xek has quit IRC00:12
*** slaweq has quit IRC00:18
*** brinzhang has joined #openstack-nova00:32
<openstackgerrit> sean mooney proposed openstack/nova master: add [libvirt]/max_queues config option
*** slaweq has joined #openstack-nova00:42
*** yaawang has quit IRC00:44
*** brinzhang_ has joined #openstack-nova00:45
*** slaweq has quit IRC00:46
*** brinzhang has quit IRC00:49
*** TxGirlGeek has quit IRC00:52
*** yaawang has joined #openstack-nova00:57
*** slaweq has joined #openstack-nova01:01
*** nanzha has joined #openstack-nova01:05
*** slaweq has quit IRC01:05
*** Liang__ has joined #openstack-nova01:08
*** mlavalle has quit IRC01:21
*** ociuhandu has joined #openstack-nova01:22
*** slaweq has joined #openstack-nova01:25
*** slaweq has quit IRC01:30
*** ociuhandu has quit IRC01:34
*** lifeless has quit IRC01:34
<mriedem> melwitt: you're going to have to ping me earlier tomorrow about that tempest patch, it's too late for me to dig into that right now [01:35]
<mriedem> but i'll star it [01:35]
<melwitt> mriedem: ok, np. it's not urgent but something I thought you might have thoughts on. thanks [01:36]
<mriedem> your downstream ci people know they can just write tempest plugins for the tests they want to use right? [01:41]
<mriedem> it doesn't need to be in the main repo [01:41]
<mriedem> so i don't think it's a great pattern to set with like, "we'd like to test x which is 99% already covered elsewhere" [01:41]
<mriedem> anyway, i've got to drop but i left that comment on there [01:43]
<melwitt> I didn't know how tempest plugins work [01:43]
<mriedem> cinder has one, check it out [01:43]
<mriedem> like devstack plugins [01:43]
<mriedem> anyway, o/ [01:43]
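For readers following along: out-of-tree tempest plugins like the ones mriedem describes are discovered through a setuptools entry point in the plugin package. A minimal sketch (the package and class names below are placeholders, not anything from this discussion):

```ini
# setup.cfg of a hypothetical out-of-tree tempest plugin package
[entry_points]
tempest.test_plugins =
    my_service_tests = my_tempest_plugin.plugin:MyTempestPlugin
```

The referenced class subclasses tempest's plugin base and tells tempest where the tests and config options live, so tempest discovers and runs the tests without anything having to land in the main tempest repo.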
*** mriedem has quit IRC01:44
*** mdbooth has quit IRC01:49
*** lifeless has joined #openstack-nova01:49
*** ozzzo has quit IRC01:50
*** Liang__ is now known as LiangFang01:54
*** mdbooth has joined #openstack-nova01:56
*** awalende has joined #openstack-nova01:57
*** awalende has quit IRC02:02
*** ileixe has quit IRC02:05
*** ileixe has joined #openstack-nova02:06
*** lifeless has quit IRC02:12
*** macz has joined #openstack-nova02:12
*** slaweq has joined #openstack-nova02:13
*** igordc has quit IRC02:18
*** slaweq has quit IRC02:18
<mnaser> sean-k-mooney: something in your domain -- i was wondering if libvirt by default makes sure that vcpus for domains live inside the same numa node [02:22]
<mnaser> i'm playing with new epyc rome hardware and seeing poor memory access times with sysbench, im wondering if the reasoning behind it is that cpu time is flopping back and forth between numa nodes [02:22]
*** lifeless has joined #openstack-nova02:24
*** macz has quit IRC02:24
*** slaweq has joined #openstack-nova02:24
*** nanzha has quit IRC02:25
*** nanzha has joined #openstack-nova02:27
*** slaweq has quit IRC02:30
*** ociuhandu has joined #openstack-nova02:35
*** slaweq has joined #openstack-nova02:35
*** abaindur has quit IRC02:36
*** alex_xu has joined #openstack-nova02:38
<alex_xu> melwitt: hello, are you still here [02:38]
*** ociuhandu has quit IRC02:39
*** mkrai has joined #openstack-nova02:40
*** slaweq has quit IRC02:48
*** gyee has quit IRC02:53
*** HagunKim has joined #openstack-nova02:54
<HagunKim> Hi, how can I see the Nova version, like 19.0.3 or 20.0.1, on my nova-compute host? [02:55]
*** slaweq has joined #openstack-nova03:00
*** ileixe has quit IRC03:00
*** slaweq has quit IRC03:04
*** ileixe has joined #openstack-nova03:04
*** slaweq has joined #openstack-nova03:05
*** slaweq has quit IRC03:16
*** macz has joined #openstack-nova03:18
*** nanzha has quit IRC03:22
*** nanzha has joined #openstack-nova03:22
*** macz has quit IRC03:22
*** slaweq has joined #openstack-nova03:26
*** macz has joined #openstack-nova03:28
*** slaweq has quit IRC03:31
*** macz has quit IRC03:33
*** slaweq has joined #openstack-nova03:35
*** slaweq has quit IRC03:40
*** mkrai has quit IRC03:42
*** mkrai_ has joined #openstack-nova03:42
*** ricolin has joined #openstack-nova03:43
*** nanzha has quit IRC03:53
*** nanzha has joined #openstack-nova03:53
*** slaweq has joined #openstack-nova03:55
<openstackgerrit> Merged openstack/nova master: Remove functional test specific nova code
*** slaweq has quit IRC04:02
*** liuyulong has quit IRC04:05
*** slaweq has joined #openstack-nova04:10
*** slaweq has quit IRC04:15
*** bhagyashris has joined #openstack-nova04:31
*** slaweq has joined #openstack-nova04:39
*** slaweq has quit IRC04:44
<openstackgerrit> Brin Zhang proposed openstack/nova-specs master: Add composable flavor properties
*** slaweq has joined #openstack-nova05:01
*** sapd1_ has joined #openstack-nova05:02
*** bhagyashris has quit IRC05:04
*** sapd1 has quit IRC05:04
*** slaweq has quit IRC05:08
*** brinzhang has joined #openstack-nova05:09
*** bhagyashris has joined #openstack-nova05:09
*** brinzhang has quit IRC05:10
*** brinzhang has joined #openstack-nova05:11
*** slaweq has joined #openstack-nova05:11
*** brinzhang_ has quit IRC05:12
*** threestrands has joined #openstack-nova05:18
*** slaweq has quit IRC05:20
*** pcaruana has joined #openstack-nova05:25
*** ratailor has joined #openstack-nova05:27
*** slaweq has joined #openstack-nova05:29
*** ociuhandu has joined #openstack-nova05:31
*** slaweq has quit IRC05:33
*** ociuhandu has quit IRC05:35
*** slaweq has joined #openstack-nova05:42
*** jmlowe has quit IRC05:46
*** slaweq has quit IRC05:47
*** links has joined #openstack-nova05:48
*** brinzhang_ has joined #openstack-nova05:52
*** jraju__ has joined #openstack-nova05:54
*** links has quit IRC05:54
*** jraju__ has quit IRC05:54
*** ileixe has quit IRC05:55
*** brinzhang has quit IRC05:55
*** awalende has joined #openstack-nova05:58
*** ileixe has joined #openstack-nova05:59
*** brinzhang has joined #openstack-nova06:00
*** awalende has quit IRC06:02
*** brinzhang_ has quit IRC06:03
*** slaweq has joined #openstack-nova06:11
<openstackgerrit> OpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata
*** slaweq has quit IRC06:15
*** zhanglong has joined #openstack-nova06:20
*** Nick_A has quit IRC06:26
*** slaweq has joined #openstack-nova06:27
*** slaweq has quit IRC06:32
*** larainema has joined #openstack-nova06:33
*** slaweq has joined #openstack-nova06:39
*** slaweq has quit IRC06:44
*** slaweq has joined #openstack-nova07:11
*** dpawlik has joined #openstack-nova07:14
*** brinzhang has quit IRC07:15
*** slaweq has quit IRC07:15
*** brinzhang has joined #openstack-nova07:16
*** udesale has joined #openstack-nova07:25
*** tosky has joined #openstack-nova07:35
*** priteau has quit IRC07:38
*** ileixe has quit IRC07:44
*** mkrai_ has quit IRC07:45
*** mkrai_ has joined #openstack-nova07:45
*** awalende has joined #openstack-nova07:47
*** slaweq has joined #openstack-nova07:56
*** dpawlik has quit IRC07:56
*** ccamacho has joined #openstack-nova07:57
<gibi> HagunKim: $ nova-compute --version [07:57]
*** tesseract has joined #openstack-nova07:58
*** threestrands has quit IRC07:58
<HagunKim> gibi: Thanks! [07:58]
*** macz has joined #openstack-nova07:59
*** abaindur has joined #openstack-nova08:02
*** abaindur has joined #openstack-nova08:03
*** macz has quit IRC08:04
*** ociuhandu has joined #openstack-nova08:09
*** tkajinam has quit IRC08:17
*** ociuhandu has quit IRC08:19
*** dpawlik has joined #openstack-nova08:21
<bauzas> good morning Nova [08:28]
<gibi> bauzas: good morning [08:31]
<bauzas> gibi: just doing a few things now, but if you're OK, I'd like to discuss the audit command in 30 mins [08:31]
<bauzas> fine with you? [08:31]
<gibi> bauzas: sure, ping me [08:31]
<bauzas> tl;dr: when calling --delete, you didn't use --verbose as well, so I can't see whether we tried to delete the allocation [08:32]
<bauzas> gibi: ^ [08:32]
<gibi> bauzas: I can quickly try --verbose + --delete [08:33]
<bauzas> gibi: would be nice, thanks [08:33]
<gibi> the output of --verbose --delete seems to be the same as the output of only --verbose [08:34]
<bauzas> gibi: weird [08:34]
<gibi> yesterday I had to leave but I can try to attach a debugger now [08:35]
<bauzas> looks like the conditional doesn't work [08:35]
<bauzas> I mean in
*** ivve has joined #openstack-nova08:36
<gibi> bauzas: debugger helped, it is the args definition that is buggy
<gibi> I guess you are missing some test coverage for --delete :) [08:38]
<bauzas> but I'm testing it [08:38]
<gibi> bauzas: you are calling audit() directly from the test, so skipping the arg parser [08:39]
<bauzas> OH! [08:39]
<bauzas> yeah this [08:39]
<bauzas> how could I test it then? [08:39]
<gibi> bauzas: good question. I just quickly checked and the heal_allocation tests do not cover args parsing either [08:41]
*** nanzha has quit IRC [08:41]
<gibi> I would not worry about it then in the functional env [08:44]
<gibi> it seems we don't have the facility there [08:44]
<gibi> there are some nova-manage tests in there where we can cover the args parsing too [08:45]
<openstackgerrit> Sylvain Bauza proposed openstack/nova stable/train: Don't delete compute node, when deleting service other than nova-compute
<gibi> I guess this can be replaced with your audit call
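The bug gibi diagnoses above is worth illustrating: a functional test that calls the command's method directly never exercises the argparse wiring, so a broken flag definition goes unnoticed even though the command logic is covered. A self-contained sketch of the pattern (the names are illustrative, not nova's actual code):

```python
import argparse

def audit(verbose=False, delete=False):
    """Stand-in for a nova-manage-style command method; returns the
    actions it would perform so the behavior is easy to assert on."""
    actions = ["report"] if verbose else []
    if delete:
        actions.append("delete")
    return actions

def build_parser():
    """Arg wiring for the sketch command. A buggy definition here (say,
    the wrong action= on --delete) is invisible to tests that call
    audit() directly; only parse_args() exercises it."""
    parser = argparse.ArgumentParser(prog="nova-manage-sketch")
    sub = parser.add_subparsers(dest="command")
    audit_p = sub.add_parser("audit")
    audit_p.add_argument("--verbose", action="store_true")
    audit_p.add_argument("--delete", action="store_true")
    return parser

if __name__ == "__main__":
    # A test that goes through the parser catches arg-definition bugs:
    args = build_parser().parse_args(["audit", "--verbose", "--delete"])
    print(audit(verbose=args.verbose, delete=args.delete))
```

Calling `audit(delete=True)` in a test passes regardless of how `--delete` is declared; parsing the CLI string first is what covers the args definition.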
*** nanzha has joined #openstack-nova08:48
*** elod_off is now known as elod08:50
<openstackgerrit> Balazs Gibizer proposed openstack/nova stable/pike: Only nil az during shelve offload
*** ralonsoh has joined #openstack-nova08:51
*** rpittau|afk is now known as rpittau08:53
*** nanzha has quit IRC08:54
*** ociuhandu has joined #openstack-nova08:55
*** damien_r has joined #openstack-nova08:58
*** nanzha has joined #openstack-nova08:58
*** ociuhandu has quit IRC08:59
*** ociuhandu has joined #openstack-nova09:00
*** ociuhandu has quit IRC09:03
*** jawad_axd has joined #openstack-nova09:03
*** ociuhandu has joined #openstack-nova09:04
*** ociuhandu has quit IRC09:11
*** ociuhandu has joined #openstack-nova09:11
*** ociuhandu has quit IRC09:13
*** tosky has quit IRC09:13
*** ociuhandu has joined #openstack-nova09:14
*** tosky has joined #openstack-nova09:14
*** ociuhandu has quit IRC09:15
*** ociuhandu has joined #openstack-nova09:16
*** abaindur has quit IRC09:17
*** ociuhandu has quit IRC09:17
<stephenfin> gibi, bauzas: If either of you have bandwidth available this week, I'd appreciate some reviews on the nova-network removal series
* stephenfin will be bugging mriedem too since there are a few API removals in there [09:18]
*** ociuhandu has joined #openstack-nova [09:18]
<gibi> stephenfin: added to my list, but no hard promises [09:19]
*** ociuhandu has quit IRC [09:19]
*** ociuhandu has joined #openstack-nova [09:20]
<bauzas> stephenfin: sure, I want to dig into reviews this week [09:21]
*** ociuhandu has quit IRC09:26
*** ociuhandu has joined #openstack-nova09:27
*** ociuhandu has quit IRC09:29
*** ociuhandu has joined #openstack-nova09:29
*** ociuhandu has quit IRC09:32
*** ociuhandu has joined #openstack-nova09:32
*** ociuhandu has quit IRC09:34
*** ociuhandu has joined #openstack-nova09:35
*** ociuhandu has quit IRC09:37
*** ociuhandu has joined #openstack-nova09:37
*** ociuhandu has quit IRC09:39
*** ociuhandu has joined #openstack-nova09:40
*** ociuhandu has quit IRC09:43
*** awalende has quit IRC09:43
*** ociuhandu has joined #openstack-nova09:43
*** dpawlik has quit IRC09:45
*** ociuhandu has quit IRC09:45
*** ociuhandu has joined #openstack-nova09:46
*** rcernin has quit IRC09:46
<openstackgerrit> Stephen Finucane proposed openstack/nova master: docs: Rewrite quotas documentation
*** derekh has joined #openstack-nova09:47
*** awalende has joined #openstack-nova09:48
*** ociuhandu has quit IRC09:48
*** ociuhandu has joined #openstack-nova09:49
*** mkrai_ has quit IRC09:49
*** martinkennelly has joined #openstack-nova09:50
*** ociuhandu has quit IRC09:50
*** mkrai has joined #openstack-nova09:50
*** ociuhandu has joined #openstack-nova09:51
*** awalende has quit IRC09:52
*** ociuhandu has quit IRC09:52
*** awalende has joined #openstack-nova09:52
*** priteau has joined #openstack-nova09:52
*** ociuhandu has joined #openstack-nova09:53
*** ociuhandu has quit IRC09:56
*** ociuhandu has joined #openstack-nova09:57
*** jaosorior has joined #openstack-nova09:57
*** ociuhandu has quit IRC09:58
*** ociuhandu has joined #openstack-nova09:58
<openstackgerrit> Kashyap Chamarthy proposed openstack/nova master: libvirt: Bump MIN_{LIBVIRT,QEMU}_VERSION for "Ussuri"
*** nanzha has quit IRC10:06
*** priteau has quit IRC10:06
*** nanzha has joined #openstack-nova10:08
*** mkrai has quit IRC10:20
*** dpawlik has joined #openstack-nova10:20
*** mkrai has joined #openstack-nova10:23
*** dpawlik has quit IRC10:24
*** dtantsur|afk is now known as dtantsur10:27
<openstackgerrit> Liang Fang proposed openstack/nova-specs master: Support volume local cache
*** LiangFang has quit IRC10:32
*** ricolin has quit IRC10:32
*** HagunKim has quit IRC10:33
*** xek has joined #openstack-nova10:35
*** damien_r has quit IRC10:36
*** damien_r has joined #openstack-nova10:38
*** brinzhang_ has joined #openstack-nova10:38
<openstackgerrit> Merged openstack/nova master: Don't delete compute node, when deleting service other than nova-compute
*** brinzhang has quit IRC10:42
*** mkrai has quit IRC10:43
*** zhanglong has quit IRC10:46
*** dtantsur is now known as dtantsur|brb10:48
*** dpawlik has joined #openstack-nova10:51
*** brinzhang has joined #openstack-nova11:01
*** brinzhang_ has quit IRC11:05
*** PrinzElvis has quit IRC11:08
*** bhagyashris has quit IRC11:10
*** macz has joined #openstack-nova11:12
*** dpawlik has quit IRC11:12
*** macz has quit IRC11:16
*** ociuhandu has quit IRC11:21
*** lpetrut has joined #openstack-nova11:26
*** mkrai has joined #openstack-nova11:34
*** tbachman has quit IRC11:38
<openstackgerrit> Sylvain Bauza proposed openstack/nova master: Add a placement audit command
<openstackgerrit> Sylvain Bauza proposed openstack/nova master: Avoid PlacementFixture silently swallowing kwargs
*** brinzhang_ has joined #openstack-nova11:45
*** ratailor has quit IRC11:46
*** dpawlik has joined #openstack-nova11:49
*** brinzhang has quit IRC11:49
*** ociuhandu has joined #openstack-nova11:51
<gibi> stephenfin: I think your patch in devstack-plugin-ceph is causing the nova-live-migration job to fail on stable/pike. Opened a bug here
<openstack> Launchpad bug 1853280 in OpenStack Compute (nova) pike "nova-live-migration job constantly fails on stable/pike" [Undecided,New] [11:52]
<gibi> lyarwood: ^^ for yesterday's stable/pike issue [11:52]
*** dpawlik has quit IRC [11:53]
<artom> stephenfin, heya, could I get you to take another look at
<artom> sean-k-mooney, same question for :) [12:11]
*** PrinzElvis has joined #openstack-nova12:15
*** brinzhang_ has quit IRC12:16
*** mkrai has quit IRC12:17
<sean-k-mooney> artom: sure, did you fix up the other patches below? if so i'll start from the bottom and work up [12:17]
*** dave-mccowan has quit IRC12:17
<artom> sean-k-mooney, yeah, the series is ready to go - Joe even merged some of the bottom ones [12:21]
* artom is happy to see sean-k-mooney not angry at him for making him write unit tests. [12:21]
<sean-k-mooney> i have not written them yet so how can i be angry :) [12:22]
*** larainema has quit IRC [12:23]
<artom> Ah, the anger comes later, I see [12:23]
*** pcaruana has quit IRC [12:24]
<sean-k-mooney> only after fighting with mocking for about an hour. although what you want me to test should be straightforward so you will likely get a pass [12:24]
<artom> Yep, I know how "unit testing" some of those "functions" can be [12:25]
*** awalende has quit IRC [12:26]
<artom> But you made such a nice clean new one... :) [12:26]
*** mlavalle has joined #openstack-nova [12:29]
<openstackgerrit> Kashyap Chamarthy proposed openstack/nova master: Pick NEXT_MIN libvirt/QEMU versions for "V" release
<openstackgerrit> Kashyap Chamarthy proposed openstack/nova master: libvirt: Bump MIN_{LIBVIRT,QEMU}_VERSION for "Ussuri"
*** mkrai has joined #openstack-nova12:36
*** pcaruana has joined #openstack-nova12:50
*** awalende has joined #openstack-nova12:52
*** mgariepy has joined #openstack-nova12:54
*** derekh has quit IRC12:55
*** awalende has quit IRC12:56
<kashyap> stephenfin: Heya, one reason why I also separate out the NEXT_MIN is that it lets us settle on something while we chip away at the bump + resulting clean-up [12:57]
*** ociuhandu has quit IRC12:58
*** mkrai has quit IRC12:59
*** derekh has joined #openstack-nova13:02
<stephenfin> kashyap: The clean up is all optional though, right? [13:03]
<stephenfin> I mean, it's just dead code [13:03]
<kashyap> stephenfin: I mean the immediate clean-up to make the bump work; not the "full clean-up of removing dead constants" [13:03]
<stephenfin> ah, I figured it would just be test removal [13:03]
<stephenfin> gibi: I assume the plugin isn't versioned? [13:04]
<kashyap> :-) Hehe, you've also done part of this clean-up before, maybe you forgot briefly [13:04]
<gibi> stephenfin: I think it is not branched [13:04]
<stephenfin> kashyap: yup, but they could all be done after the fact [13:04]
<gibi> stephenfin: I haven't checked if it is versioned somehow [13:04]
*** tbachman has joined #openstack-nova [13:04]
<kashyap> stephenfin: Oh, certainly; that's how we do it anyway - removing the dead constants after the bump. [13:05]
<stephenfin> kashyap: so the only thing we'd have to clean up in that patch is anything that assumes the version is == the old minimum [13:05]
<openstackgerrit> Mark Goddard proposed openstack/nova master: Add functional regression test for bug 1853009
<openstack> bug 1853009 in OpenStack Compute (nova) "Ironic node rebalance race can lead to missing compute nodes in DB" [Undecided,In progress] - Assigned to Mark Goddard (mgoddard) [13:05]
<openstackgerrit> Mark Goddard proposed openstack/nova master: Prevent deletion of a compute node belonging to another host
<openstackgerrit> Mark Goddard proposed openstack/nova master: Clear rebalanced compute nodes from resource tracker
<openstackgerrit> Mark Goddard proposed openstack/nova master: Invalidate provider tree when compute node disappears
<openstackgerrit> Mark Goddard proposed openstack/nova master: Fix inactive session error in compute node creation
*** priteau has joined #openstack-nova [13:05]
<kashyap> stephenfin: That's what I did (or I thought I did) here:
<kashyap> stephenfin: If you skimmed the commit, don't mistake the listed constants as the ones I've removed -- I just noted that they'll be removed in _future_ patches [13:06]
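For context on what a MIN_VERSION bump touches: the libvirt driver compares the host's libvirt/QEMU versions against pinned constants and warns or refuses to start accordingly. A rough sketch of the comparison pattern, assuming libvirt's usual major*1000000 + minor*1000 + micro integer encoding; the constants here are illustrative examples, not the values proposed in the patches above:

```python
# Illustrative version-gate sketch (constants are examples only).
MIN_LIBVIRT_VERSION = (5, 0, 0)
NEXT_MIN_LIBVIRT_VERSION = (6, 0, 0)

def version_to_int(version):
    """Encode a (major, minor, micro) tuple the way libvirt reports it."""
    major, minor, micro = version
    return major * 1_000_000 + minor * 1_000 + micro

def check_minimum(host_version):
    """Fail when the host libvirt is older than the pinned minimum.

    Returns True when the host already satisfies the *next* minimum.
    Any code path gated on exactly the old minimum becomes dead code
    after a bump -- the "immediate clean-up" kashyap mentions.
    """
    if version_to_int(host_version) < version_to_int(MIN_LIBVIRT_VERSION):
        raise RuntimeError("host libvirt %r is older than minimum %r"
                           % (host_version, MIN_LIBVIRT_VERSION))
    return version_to_int(host_version) >= version_to_int(NEXT_MIN_LIBVIRT_VERSION)
```

Keeping NEXT_MIN separate, as kashyap describes, lets the deprecation warning land first and the hard bump (plus the dead-constant removal) follow in later patches.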
*** ricolin has joined #openstack-nova13:07
<mnaser> sean-k-mooney: i'm trying to optimize a system to minimize memory latency. i have a total of 8 numa nodes, 4 inside every socket (numa<=>numa distance to the same node is 10, cross-numa in the same socket is 12, cross-socket + cross-numa is 32) [13:10]
<mnaser> so it's really expensive to go cross-numa cross-socket (and even somewhat expensive to go cross-numa in the same socket). i have 12 threads inside a NUMA node too [13:11]
<mnaser> i'm thinking if i give the VM 2 NUMA nodes (instead of none), will that end up increasing performance, because hopefully the pair of NUMA nodes end up being close by (rather than cross-socket) [13:12]
<stephenfin> gibi: okay, I can fix that now [13:12]
<mnaser> and maybe at least the VM will be aware that it's crossing numa paths [13:12]
<gibi> stephenfin: no tags, no branches in devstack-plugin-ceph so I don't know how to pin [13:12]
<gibi> stephenfin: thanks for fixing [13:12]
<stephenfin> I'm not going to pin. I'm going to simply check if Python 3 is enabled [13:12]
<gibi> stephenfin: good idea [13:12]
<sean-k-mooney> mnaser: you're using first gen amd epyc cpus, yes? [13:12]
<mnaser> sean-k-mooney: second gen :) [13:13]
<sean-k-mooney> second gen has 1 io die per socket and only 1 numa node per socket [13:13]
<sean-k-mooney> well, by second gen i meant zen2-based epyc cpus [13:13]
<mnaser> really? i actually have BIOS settings that allow me to change an "NPS" setting which lets me select 1/2/4 [13:14]
<sean-k-mooney> i'm not sure if they had a refresh based on zen 1 [13:14]
<sean-k-mooney> that should be there for zen1 [13:14]
<mnaser> see "NUMA Nodes per Socket (NPS)" -- i set that to 4 so i have 8 numa nodes in total [13:14]
<sean-k-mooney> the recommendation i have given internally is to disable multiple numa nodes per socket for zen [13:14]
<mnaser> really? i've seen that it has resulted in less than ideal NUMA node latency (but a more consistent one inside a socket) [13:15]
<sean-k-mooney> hum, ok, so you are using gen2 [13:16]
<mnaser> `numactl -N 0 -m 0 sysbench memory --threads=8 --memory-total-size=32G run` with NPS=4 was *WAY* faster than with NPS=1 [13:16]
<sean-k-mooney> i might need to reassess [13:16]
<sean-k-mooney> yes it would be [13:16]
<sean-k-mooney> so basically we were under the impression that gen 2 would only have 1 numa node, so we decided not to optimise for the up to 8 numa nodes per socket that can be exposed in gen 1 [13:17]
<mnaser> gen 2 can even go up to 16 numa nodes, you can make it expose a numa node per l3 cache [13:18]
<sean-k-mooney> mnaser: so to your original question, giving the guest 2 numa nodes will improve the guest performance in general [13:18]
*** bhagyashris has joined #openstack-nova [13:18]
<mnaser> sean-k-mooney: in this case, much older hardware vs gen 2 epyc.. look at the difference in memory performance [13:18]
<sean-k-mooney> so the trade-off with that is it limits the vms significantly [13:19]
<sean-k-mooney> or rather how you create your flavors [13:19]
<sean-k-mooney> if you enable all 16 [13:19]
<sean-k-mooney> then your vms with a numa topology can only have 3-4 cpus per numa node due to how we do numa in nova [13:20]
<stephenfin> gibi: Thank God I included :D [13:20]
<mnaser> i see, i'm ok with 8 if they are living next to each other (and i guess giving the vm the proper awareness means it will know how to address things) [13:20]
<sean-k-mooney> e.g. if you expose 1 numa node per ccx it gives you the best performance, but you are forced to either have multiple numa nodes for larger vms or use no numa features [13:21]
<mnaser> i'd rather 8 because it gives me bigger memory slices and more cores (so i can have an entire VM live in a single NUMA node) [13:21]
<sean-k-mooney> yes, but that raises an interesting point [13:21]
<gibi> stephenfin: deep down you knew that you need to check for python3 enabled :) [13:21]
<sean-k-mooney> we have talked about breaking the 1:1 mapping between numa nodes in the guest and the host [13:22]
<sean-k-mooney> if we did that we could decouple this [13:22]
<gibi> stephenfin: thanks for the fix [13:22]
*** dave-mccowan has joined #openstack-nova13:22
<sean-k-mooney> there are tradeoffs, but if it was a policy on the guest that would mean you could optimise the hardware config without limiting the slicing [13:23]
<stephenfin> artom: Yeah, I started looking at it this morning before lunch. ngl, it's tough going /o\ [13:26]
<mnaser> sean-k-mooney: right, which is odd because when i run `numastat -cv qemu-kvm` -- it seems like the VM lives in a single node [13:26]
<mnaser> but looking at the performance benchmarks, the numbers are pretty bad [13:26]
<mnaser> i mean, it's not hosting the *entire* VM in a single NUMA node, but you can tell the memory allocated to the VM does live in that node [13:27]
<sean-k-mooney> mnaser: also damn, the intel chip got schooled [13:27]
<mnaser> sean-k-mooney: it did in everything except memory bound stuff, i think qpi is just insanely fast [13:27]
<sean-k-mooney> qpi was retired, it's now UPI [13:28]
*** rouk has quit IRC [13:28]
<sean-k-mooney> partly for licensing reasons but also upi is faster than qpi [13:28]
<artom> stephenfin, I know :( [13:29]
<sean-k-mooney> i thought the infinity fabric in zen processors was meant to be faster, however i don't think we have ever really seen benchmarks of the two fabrics [13:29]
<artom> Your efforts are appreciated [13:29]
<artom> It's a big patch, I tried documenting what I thought was worth it [13:29]
<mnaser> sean-k-mooney: the numbers i see make it seem pretty slow. [13:29]
<artom> sean-k-mooney, so actually I think you have enough good points on the whitebox series to warrant a respin, if you don't mind looking at it again after [13:30]
<sean-k-mooney> mnaser: could you provide us with the output from virsh capabilities by the way [13:30]
<sean-k-mooney> i would be interested in seeing what data we are feeding to nova for it to make decisions [13:30]
<bauzas> gibi: I updated the placement audit command + added a FUP for testing the PlacementFixture [13:31]
<sean-k-mooney> artom: sure, ping me when it's done [13:31]
<sean-k-mooney> artom: there was nothing i would block on but they were all minor improvements [13:31]
<gibi> bauzas: It is on my list for a proper code review because so far I only tried the tool but did not properly read the code [13:31]
<bauzas> holy shit, pep8 [13:31]
<bauzas> gibi: sure, no worries [13:31]
<mnaser> sean-k-mooney: ^ [13:32]
<bauzas> gibi: I provided a functest for verifying the deletion, it should work [13:32]
<bauzas> gibi: given your notes, I think I can try to have better logs [13:33]
<gibi> bauzas: the log improvement is not something I will -1, but if you have time then better logs are always welcome [13:34]
<bauzas> gibi: I think it's important [13:34]
<bauzas> so I would understand a -1 for this [13:34]
*** mriedem has joined #openstack-nova13:34
*** macz has joined #openstack-nova13:35
<sean-k-mooney> i need to jump on a call but i'm seeing some interesting things in that output, notably the way the cache is reported [13:35]
*** damien_r has quit IRC [13:36]
<openstackgerrit> Matt Riedemann proposed openstack/nova stable/train: Don't delete compute node, when deleting service other than nova-compute
<sean-k-mooney> mnaser: i don't know if you remember, but i have spoken about the need to model cpus per cache node and cache nodes per numa node in the past at the denver ptg [13:37]
*** dpawlik has joined #openstack-nova [13:39]
<sean-k-mooney> mnaser: no one else seems to be on the call so i'll expand on that [13:39]
*** macz has quit IRC13:39
*** maciejjozefczyk has quit IRC13:39
<sean-k-mooney> mnaser: cache banks 1 and 2 are both part of numa node 1 in that config [13:40]
<sean-k-mooney> it looks like the 24 core sku has 3 active cpus per ccx and hence per cache region [13:40]
<sean-k-mooney> so your 8 core vm needs to be spread across 3 cache regions and 2 numa nodes [13:40]
<mnaser> sean-k-mooney: it does, when i enable a numa node per l3 cache, i end up with 6 threads in a numa node [13:40]
<sean-k-mooney> oh ya, i guess you could fit it on 2 cache regions and 1 numa node if you are using hyperthreads [13:41]
<sean-k-mooney> mnaser: so rather than selecting cores from the same numa node, if we wanted to optimise in nova we would want to select cores in the same cache region first [13:42]
<sean-k-mooney> we can actually model the cache topology in libvirt too and expose that to the guest [13:42]
<sean-k-mooney> mnaser: it would be interesting to see how a 6 core vm fared against the 8 core vm [13:43]
*** eharney has quit IRC [13:43]
<mnaser> yeah, i read a bit about that. i'm wondering if, performance wise, exposing a numa node per l3 cache and then doing 2 numa nodes in a VM would result in good overall performance [13:43]
*** ociuhandu has joined #openstack-nova [13:44]
<mnaser> you'd have 4 cores each sitting in a numa node and hopefully the guest is smart enough to understand the distances involved [13:44]
<sean-k-mooney> yes, although again nova does not today fully understand the hierarchy [13:44]
<sean-k-mooney> e.g. nova could select 2 numa nodes from different sockets [13:45]
<sean-k-mooney> ideally it would select 2 from the same socket to minimise the numa distance [13:45]
<sean-k-mooney> that is actually just an optimization of the current behavior [13:45]
<sean-k-mooney> rather than a change [13:45]
<mnaser> sean-k-mooney: right, but at least the guest will be aware there is possible latency [13:45]
<mnaser> yep, indeed, it's not optimal, but it's an improvement [13:45]
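The optimisation sean-k-mooney sketches (prefer two host NUMA nodes from the same socket) boils down to picking the pair of nodes with the smallest reported distance. A toy sketch using the distances mnaser quotes (10 to self, 12 cross-NUMA same socket, 32 cross-socket); the matrix below is illustrative, not his actual `numactl --hardware` output:

```python
from itertools import combinations

# Hypothetical 8-node distance matrix: nodes 0-3 on socket 0,
# nodes 4-7 on socket 1, mirroring the distances quoted above.
DISTANCES = [
    [10, 12, 12, 12, 32, 32, 32, 32],
    [12, 10, 12, 12, 32, 32, 32, 32],
    [12, 12, 10, 12, 32, 32, 32, 32],
    [12, 12, 12, 10, 32, 32, 32, 32],
    [32, 32, 32, 32, 10, 12, 12, 12],
    [32, 32, 32, 32, 12, 10, 12, 12],
    [32, 32, 32, 32, 12, 12, 10, 12],
    [32, 32, 32, 32, 12, 12, 12, 10],
]

def closest_pair(distances):
    """Return the pair of distinct NUMA nodes with minimal distance,
    i.e. a same-socket pair when one exists."""
    n = len(distances)
    return min(combinations(range(n), 2),
               key=lambda pair: distances[pair[0]][pair[1]])
```

With this matrix the chosen pair has distance 12 (same socket) instead of 32, which is exactly the placement preference being discussed for a 2-NUMA-node guest.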
*** bhagyashris has quit IRC13:46
*** bhagyashris has joined #openstack-nova13:46
<sean-k-mooney> so the reason i suggested disabling the multiple numa nodes on zen 1 was our pms insisting vnfs don't understand numa [13:46]
<sean-k-mooney> after 5+ years of everyone explaining what numa is to vnf vendors, i don't really buy that [13:47]
<mnaser> i'm wondering which would be better: NPS=4 + no NUMA per L3 + 1 numa node .. vs .. NPS=4 + NUMA per L3 + 2 numa nodes -- i'm thinking the latter will likely be much faster (or at least the VM will be 'smarter' at understanding the topology.. somewhat) [13:47]
<mnaser> it might be unpredictable because the NUMA node might have a distance of 32, or 12, or 10 .. but at least it might know there is distance [13:48]
*** ociuhandu has quit IRC [13:48]
<sean-k-mooney> the latter i think would optimize the memory latency [13:48]
<mnaser> i'm gathering some benchmark info [13:50]
<mnaser> and then i'll try the two numa node + l3 per numa [13:50]
<sean-k-mooney> exposing NUMA per L3 should give the best performance, it just requires your vms to have multiple numa nodes if they have more than 3-4 cores or >32-64G of ram [13:50]
<mnaser> yep, most of the VMs will be 8 core / 8 gb [13:50]
*** mdbooth has quit IRC [13:51]
<mnaser> also FWIW those aren't dedicated cores [13:51]
*** damien_r has joined #openstack-nova [13:51]
<sean-k-mooney> sure, you're just setting hw:numa_nodes=2 but not enabling pinning [13:51]
<aarents> hi dansmith, can you confirm that the last update is fine for you ?
<mnaser> yep, so that way i'm thinking nova should do 4 cores per NUMA, 4gb in each numa node [13:51]
<sean-k-mooney> if you do that you should also set hw:mem_page_size to either large or small [13:52]
<sean-k-mooney> mnaser: yes it would [13:52]
<sean-k-mooney> the reason for doing hw:mem_page_size by the way is you want to enable nova's numa-aware memory tracking [13:53]
*** mdbooth has joined #openstack-nova [13:53]
<sean-k-mooney> if you don't want to use hugepages, set it to "small" [13:53]
<sean-k-mooney> that will use 4k pages [13:53]
<mnaser> sean-k-mooney: i think hugepages is something i'd want, but i think transparent hugepages are enabled right now [13:53]
<sean-k-mooney> right, but you should still set hw:mem_page_size=small if you have a numa topology and are not using explicit hugepages [13:54]
<sean-k-mooney> if you don't, you can get OOM events [13:54]
<mnaser> i think i'd probably want large anyways, host has 512G memory and plenty of reserved memory [13:54]
<sean-k-mooney> ya, if you don't allow memory oversubscription then there is no reason not to use hugepages [13:55]
<mnaser> ya nope, no oversubscription [13:55]
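Put together, the flavor layout discussed above (2 guest NUMA nodes, explicit page size so nova does its NUMA-aware memory tracking) would look roughly like this; the flavor name is a placeholder, and `large` reflects the no-oversubscription setup, with `small` (4k pages) the alternative sean-k-mooney describes:

```shell
# Hypothetical flavor for the 8 vCPU / 8 GB guests discussed above.
openstack flavor create numa.8c8g --vcpus 8 --ram 8192 --disk 40
openstack flavor set numa.8c8g \
    --property hw:numa_nodes=2 \
    --property hw:mem_page_size=large   # or "small" for 4k pages
```

With `hw:numa_nodes=2`, nova splits the guest into 4 vCPUs + 4 GB per virtual NUMA node, each mapped to a host node.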
*** liuyulong has joined #openstack-nova13:57
<mnaser> epyc rome numbers
<mnaser> intel (but much older hardware)
<mnaser> sean-k-mooney: ^ fyi, interesting to see latency numbers [13:59]
<coreycb> kashyap: do you know if anything has landed for removing cpu flags, similar to cpu_model_extra_flags? [14:01]
<kashyap> coreycb: Hi... afraid not; but I've filed a Blueprint for it: [14:01]
<coreycb> kashyap: thanks [14:02]
<coreycb> kashyap: fyi this bug is why I'm asking:
<openstack> Launchpad bug 1853200 in libvirt (Ubuntu) "cpu features hle and rtm disabled for security are present in /usr/share/libvirt/cpu_map.xml" [High,Confirmed] - Assigned to Ubuntu Security Team (ubuntu-security) [14:03]
<kashyap> coreycb: I know :-( [14:03]
<kashyap> And I guessed as much [14:03]
<kashyap> coreycb: One (very valid) 'workaround' is to have QEMU add new "named CPU models" that remove the said flags. [14:04]
<coreycb> kashyap: ok I'll mention that, thanks [14:05]
<kashyap> coreycb: For that, upstream QEMU folks must add them... please file a QEMU "RFE" bug on launchpad for it [14:05]
<sean-k-mooney> mnaser: did you enable cluster-on-die for the intel system, out of interest [14:06]
<coreycb> kashyap: thanks, I'll pass the details on to cpaelzer, he's our qemu maintainer [14:06]
<sean-k-mooney> mnaser: but yes, the amd numbers seem to be much higher [14:06]
<sean-k-mooney> mnaser: that is quite surprising to be honest [14:07]
*** priteau has quit IRC14:11
*** priteau has joined #openstack-nova14:12
*** priteau has quit IRC14:13
*** jaosorior has quit IRC14:18
*** bhagyashris has quit IRC14:30
*** damien_r has quit IRC14:31
*** munimeha1 has joined #openstack-nova14:33
*** mmethot has quit IRC14:34
*** mmethot has joined #openstack-nova14:35
*** nweinber has joined #openstack-nova14:37
*** maciejjozefczyk has joined #openstack-nova14:38
*** PrinzElvis has quit IRC14:38
<efried> bauzas, gibi, stephenfin: would one of you please have a look at the nova/cyborg spec update and +A if appropriate? [14:39]
<bauzas> efried: yeah, I can do [14:39]
<efried> thanks bauzas [14:39]
<stephenfin> I didn't review the original so I'll defer to others [14:39]
*** eharney has joined #openstack-nova [14:39]
<bauzas> I promised to do spec reviews, in particular following the ones we discussed at the PTG [14:39]
<efried> After all, melissaml and Viere are +1. [14:39]
*** jawad_axd has quit IRC [14:45]
<efried> dustinc: Did you pick up what I laid down yesterday about compute node conflicts for provider config?
<aarents> Hi guys, I need some review on this  gibi already put +2, if you can have a look ? [14:45]
*** jawad_axd has joined #openstack-nova [14:46]
<bauzas> aarents: /me clicks [14:46]
<efried> dansmith: TL;DR: since, ironic notwithstanding, it would be insane for one compute node's $name to be the same as another's $uuid, we're just going to punt if there's a conflict across the whole $name+$uuid space. With that in play, it makes sense to treat $COMPUTE_NODE as a "default", which can be overridden by a specific $name/$uuid. And all that means we can detect conflicts immediately (on startup) and fail (to start up) if o [14:47]
*** maciejjozefczyk has quit IRC [14:47]
<dansmith> buffer overflow [14:48]
*** jawad_ax_ has joined #openstack-nova [14:49]
*** bhagyashris has joined #openstack-nova [14:49]
*** jawad_axd has quit IRC [14:50]
<sean-k-mooney> efried: i'll try and review the cyborg spec shortly [14:51]
*** jawad_axd has joined #openstack-nova [14:51]
<efried> sean-k-mooney: thanks. It's just an update to the existing spec, nothing earth-shaking. [14:51]
<dansmith> efried: your message was too long and was cut off after "fail to (start up)" [14:52]
<sean-k-mooney> ya looks short [14:52]
<efried> dansmith: oh, weird. " (to start up) if one is detected." [14:52]
<efried> that was all. [14:52]
<efried> my buffer says I had another 25c [14:52]
<efried> clearly I need to work on my TL;DRing. [14:52]
<dansmith> efried: okay, so I think I was (and likely still am) missing some context yesterday and I was hurrying, [14:53]
*** maciejjozefczyk has joined #openstack-nova [14:53]
<dansmith> but the problem is that ironic nodes specified there may not exist as providers on startup, is that it? [14:53]
*** jawad_ax_ has quit IRC [14:53]
<efried> that's pretty much the only scenario that makes us have to deal with this, yeah. [14:53]
<dansmith> so we wanted to make that hard-fail if you provide something that doesn't exist there, yeah? [14:54]
<efried> generically the "problem" is that theoretically you can specify a config by $name and another by $uuid, but those are for the same node, and we wouldn't know that until that node "appeared". [14:54]
efriedIn practice this shouldn't be possible because for ironic, $name == $uuid.14:55
*** jawad_axd has quit IRC14:55
dansmithyou mean if you had two entries in the list, one by name and one by uuid and they were in fact the same provider?14:56
*** damien_r has joined #openstack-nova14:56
*** damien_r has quit IRC14:56
dansmiththat seems like a really tiny detail to be concerned about.. did this come up in some testing or something?14:56
efriedmore code inspection14:56
dansmithremind me why we have the  by-name option anyway?14:57
dansmithuuid or the "self" option should really be all we need I would think14:57
efriedIt was because you don't necessarily know the UUID yet in a green field.14:57
efriedbut you want your tripleo to be able to lay down the config a priori.14:57
dansmithin which case, non-ironic? in that case you use $COMPUTE_NODE yeah?14:57
efriedYeah, I would think so. gibi, can we convince you we don't need identification by name?14:58
efried(I think it was gibi who talked us into it)14:58
dansmitheither way,14:58
dansmithI'm not sure we really need to care that much if someone puts a thing in there twice14:58
efriedright, you kinda f'ed up if that happens, but the code has to do *something* in that case.14:59
dansmith*if* we can detect and log an error later that's helpful, but..14:59
dansmithyour concern is just not being able to detect that at startup?14:59
gibiefried: I'm short on context. Is it about provider config?14:59
*** amodi has quit IRC14:59
efriedif we can detect it at startup, we can halt the service and force you to fix it. But we don't want to kill the service if it creeps in after we've already started running.15:00
efriedgibi: yes15:00
dansmithefried: if it's hard/impossible to do at start, is clearly wrong config, and we can log an error later I think that's reasonable15:00
dansmithefried: no, don't halt.. log an error periodically15:00
efrieddansmith: ack15:00
*** dtantsur|brb is now known as dtantsur15:00
gibiefried: so in case of a compute RP the name is known before the compute was ever started but the uuid only known after the compute creates the RP15:00
dansmithwe should never kill nova-compute after it's hit steady-state15:00
dansmithgibi: we have $COMPUTE_NODE for that15:00
efriedgibi: Right, in that case, we can use the special ``$COMPUTE_NODE`` ... yeah ^15:01
gibidansmith: I see.15:01
dansmiththey don't need to know either the name or the uuid for the compute node15:01
gibiso instead of allowing RP identification by name in general, we add a specific case for compute node15:01
dansmithwe have that already15:01
dansmithuuid can be a uuid or the special string "$COMPUTE_NODE"15:02
gibiso far I'm OK with this. I guess if other RPs need to be identified by name then we will add other symbols for that or we re-think the identify-by-name feature15:02
dansmithefried: note that I'm okay with either removing name, or just detecting and logging this case later15:02
bauzasgibi: dansmith: I haven't paid attention yet to this convo as I'm dragged into some internal all-hands but when I was looking on how to use the new COMPUTE_NODE trait, I think that we miss the world of nested RPs15:02
bauzasunless we have some way to have some traits to be cascaded into child RPs15:03
bauzas(and I really want to avoid saying 'inheritance')15:03
*** damien_r has joined #openstack-nova15:03
bauzasbut for example, if you lookup some RP, you need to call the RP and check its trait to know whether the original RP is still related to a compute node15:03
efriedbauzas: The only rp we're supporting this "wildcard" for right now is the root. If we find we need a way to generically identify its children later, we'll have more designing to do.15:04
bauzasyou need to call the *root* RP15:04
bauzas(sorry for confusing)15:04
dansmithis bauzas talking about this same thing?15:04
efriedalmost :P15:04
gibihm, the neutron device RPs (providing bandwidth) also have a stable name generated from the device name. But as I don't have a use case for provider config related to the bandwidth feature I rest my case.15:04
bauzasefried: sure, I'm just saying this is only convenient when you're in a flat world15:05
*** johnthetubaguy has joined #openstack-nova15:05
efriedI agree. Predictably-named nested RPs make a case for identifying by name.15:05
efriedthough I think we've talked about the brittleness of relying on names before.15:06
*** maciejjozefczyk has quit IRC15:06
*** damien_r has quit IRC15:06
gibiefried: we have predictably named nested RPs | 1110cf59-cabf-526c-bacc-08baabbac692 | aio:Open vSwitch agent:br-test15:06
mnasersean-k-mooney: sorry, was in a meeting too then, but no that wasn't enabled, but they are much older and also running ddr3 memory for comparison15:06
*** tbachman has quit IRC15:06
mnasermuch lower latency but lower bandwidth15:07
gibiefried: but as I said I have no direct use case so I don't want to push for the naming support in provider config15:07
sean-k-mooneyso that was cross socket latency15:07
bauzasefried: FWIW, we stecked a few things at the PTG between gibi, stephenfin and me that were tied to naming conventions :)15:07
*** davee_ has joined #openstack-nova15:07
bauzaswe sketched*15:07
efriedI know, what I'm saying is that we shouldn't have architected anything to rely on naming conventions, and we shouldn't perpetuate such things.15:07
efriedbut that ship has probably sailed.15:07
mnasersean-k-mooney: yep!15:07
sean-k-mooneymnaser: i was trying to figure out if the intel numbers were between sockets or two numa nodes in the same socket15:07
*** johnthetubaguy has quit IRC15:07
bauzasefried: indeed15:07
*** damien_r has joined #openstack-nova15:07
sean-k-mooneyya those latency numbers for amd are bad15:07
mnasernope, between two sockets, which is interesting given just how low the cross socket latency is15:07
efriedIn any case, if gibi doesn't have any real, current bw use cases, and bauzas doesn't have any real, current vgpu use cases, I'm fine making the decision to remove identify-by-name to limit scope/complexity.15:08
gibiefried: there are things already architectured that way. like the hostname of the compute node is an id neutron uses. and nova uses the name of the device RP from neutron15:08
mnasersean-k-mooney: im looking at the bios for the speed between the two sockets, i learned something about the amd architecture that doesnt seem ideal15:08
efriedwe've removed a lot of bells and whistles for that reason, just to make this go.15:08
gibiefried: agree with the limited scope15:08
*** dpawlik has quit IRC15:08
gibiefried: go for it15:08
bauzasefried: it's not really a vgpu thingy15:08
sean-k-mooneyso that lends more credence to enabling numa per ccx and using multi numa guests15:08
gibiefried: we can always add that later15:08
bauzasefried: it's more what gibi said15:09
efriedotoh, where I started this morning actually handles things including name pretty well IMO15:09
bauzasefried: some other projects create their own RPs and we need to know them15:09
sean-k-mooneymnaser: the fact the infinity fabric speed used to be tied to the memory speed15:09
efriedand dustinc has mostly already coded it up15:09
*** johnthetubaguy has joined #openstack-nova15:09
efriedso it's really just a question of test surface15:09
mnasersean-k-mooney: "By default, the BIOS for EPYC 7002 Series processors will run at the maximum allowable clock frequency by the platform. This configuration results in the maximum memory bandwidth for the processor, but in some cases, it may not be the lowest latency. The Infinity Fabric will have a maximum speed of 1467 MHz (lower in some platforms), resulting in a single clock penalty to transfer data from the15:09
mnasermemory channels onto the Infinity Fabric to progress through the SoC. To achieve the lowest latency, you can set the memory frequency to be equal to the Infinity Fabric speed.  Lowering the memory clock speed also results in power savings in the memory controller, thus allowing the rest of the SoC to consume more power potentially resulting in a performance boost elsewhere, depending on the workload."15:09
bauzasefried: for the moment, we infer their existence based on naming conventions but I'm all up for other ideas :-)15:09
gibibauzas: a counter example is shared disk. there we said use the uuid of the placement aggregate instead of the name of the sharing RP in the nova conf.15:10
sean-k-mooneymnaser: ya so in gen 1 they were tied to the same clock15:10
gibibauzas: so we are getting better at this15:10
bauzasgibi: aaaaaand we discussed 30 mins of any potential issues with it :)15:10
dansmithefried: reminder: I'm fine with logging the conflict late when the admin configured it wrong15:10
*** JamesBenson has joined #openstack-nova15:11
sean-k-mooneyso your memory bandwidth was limited by the speed at which the infinity fabric could run15:11
bauzasthe chicken-and-egg problem15:11
dansmithmaybe nobody else is :)15:11
gibibauzas: that is what PTG for. isn't it? ;)15:11
sean-k-mooneymnaser: they split them to allow you to tune them separately15:11
bauzasgibi: that's surely some justification for my expense report, for sure :p15:11
sean-k-mooneymnaser: so you can optimise for latency or throughput15:11
*** tbachman has joined #openstack-nova15:12
efrieddansmith: That wfm too. The only question is, in addition to logging the error, we have to *pick one* to use.15:12
bauzas"look ! this isn't a very expensive dinner, this was just a brainstorming session in a crowdy area !"15:12
efriedor, I suppose, refuse to use either.15:12
dansmithefried: refuse to use either15:12
bauzas"and we ordered snacks meanwhile"15:12
efrieddansmith: ack. I agree. We shall make it so.15:12
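The behavior agreed just above (detect a $name/$uuid collision for the same node, log it, use neither explicit entry, and fall back to the $COMPUTE_NODE default) might be sketched like this. All names are illustrative, not dustinc's actual implementation:

```python
def pick_provider_config(configs, node_name, node_uuid):
    """Resolve which provider config entry applies to a discovered node.

    configs maps an identifier (a $name, a $uuid, or the special string
    "$COMPUTE_NODE") to a config dict.  If the node's name and uuid match
    two distinct entries, that is a conflict: log an error and use
    neither, falling back to the $COMPUTE_NODE default if present.
    """
    by_name = configs.get(node_name)
    by_uuid = configs.get(node_uuid)
    default = configs.get("$COMPUTE_NODE")
    if by_name is not None and by_uuid is not None and by_name is not by_uuid:
        # conflicting explicit entries: refuse to use either
        print("ERROR: conflicting provider configs for %s / %s"
              % (node_name, node_uuid))
        return default
    return by_uuid or by_name or default
```

Note that for ironic, where $name == $uuid, both lookups hit the same entry and no conflict is possible, which matches the reasoning earlier in the discussion.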
efrieddansmith: Since you're being so nice already, and I've obv got you swapped out of whatever else you would be doing, can we talk about shelve-offloading an instance with a vTPM?15:12
*** ociuhandu has joined #openstack-nova15:13
dansmithefried: yeah15:13
efriedDo you have a better idea than "create a glance (or maybe swift?) object for the vdev file"?15:13
gibibauzas: there wasn't too much snack provided during the PTG so _we had to_ order some :)15:13
*** shilpasd has joined #openstack-nova15:14
efrieddansmith:  seems like this is a natural for swift, but that adds yet another required component.15:14
bauzas"good luck, expenses team, for understanding my bills  !"15:14
dansmithefried: that would be one place I guess, but yeah, can't really depend on swift being there15:14
mnasersean-k-mooney: interestingly enough looking at the bios the 4-link xGMI max speed is set to 10.667Gbps15:15
dansmithefried: I dunno how safe that would really be, not that it's less secure than the alternatives, but..15:15
mnaser"Setting this to a lower speed can save uncore power that can be used to increase core frequency or reduce overall power. It will also decrease cross socket bandwidth and increase cross socket latency."15:15
kashyapcoreycb: One more on that thing you asked earlier: QEMU upstream has this notion of "versioned CPU models", and we'll be getting new named models with the affected flags turned off.15:16
mnaseri wonder why SMC would ship a box with it set to the minimum speed for a server where i dont really care about cpu performance, ill bump that to 18Gbps..16:16
efrieddansmith: Well, we're relying on the encrypted-ness of the vdev file already. A main use case is "someone walks away with the disk and we're still okay".15:16
kashyapcoreycb: I'll update the Ubuntu bug you pointed.15:16
coreycbkashyap: that sounds nice. great, thanks15:16
efrieddansmith: though since you bring it up, I wonder if it would be possible to (ab)use barbican...15:17
dansmithefried: sure, but access to that enables an offline attack, and not everyone may have the same security policy for their swift cluster as they do for their hypervisor nodes15:17
dansmithefried: swift may be a best effort uber cheap storage bucket for them, whereas hypervisor and glance storage is high security at a premium15:17
dansmithefried: do you have a pointer to what glance calls the feature of linking multiple image payloads together?15:18
efriedoh, is that already a thing?15:18
*** ociuhandu has quit IRC15:19
dansmithI thought it was, and I thought you made mention of it15:19
efriedI was just going to, like, make up a metadata key that would store a unique identifier and stuff the same value on both images to link them.15:19
dansmithoh okay, I think there was actually a use case for this feature for something else at some point, but I might be wrong15:19
sean-k-mooneydansmith: do you mean the ability to have multiple image locations or something else15:19
efriedsean-k-mooney: talking about being able to "link" two images together in some logical fashion.15:20
dansmithsean-k-mooney: that's not what we're talking about15:20
*** udesale has quit IRC15:20
sean-k-mooneyoh ok like we propose to do for cyborg15:20
sean-k-mooneye.g. where the os image has a metadata key that specify the bitstream image for the fpga15:20
*** udesale has joined #openstack-nova15:20
efriedyeah, that's a close enough use case15:21
efriedin this case the two images are completely 1:115:21
efriedbut otherwise it's a pretty similar model.15:21
sean-k-mooneyim not sure there is an existing feature for that15:21
sean-k-mooneyif there was we would have suggested using it for cyborg15:22
dansmithit's similar, but I would expect that glance would gain a new container format for that,15:22
dansmithwhereas this is more "please store this binary blob for me"15:22
sean-k-mooneywell that is kind of what the bitstream is but sure15:22
efriedsean-k-mooney: we're talking about the emulated TPM device, which libvirt stores outside of the instance dir. When we shelve-offload, we need to carry that vdev along somehow. We're brainstorming ways to make that happen.15:22
sean-k-mooneyah ok15:23
dansmithsean-k-mooney: it's not, unless there's some agreed-upon format for containing those bitstreams, even though the bitstreams themselves aren't standard15:23
*** READ10 has joined #openstack-nova15:23
sean-k-mooneydansmith: well for the bitstream i think the plan was initially to use raw15:24
sean-k-mooneybut there isn't a standard format15:24
dansmithsean-k-mooney: you mean "bare"?15:24
dansmithor you mean no additional wrapper around the bitstream?15:24
sean-k-mooneyi mean image type raw container type bare15:24
dansmithyeah, so that doesn't seem like a good idea to me15:24
dansmithbecause nova thinks it can boot those kinds of things15:24
efriedsean-k-mooney: the spec had originally been proposing to copy the vdev into the instance dir so it would be carried along for live migration, and then unpack it on the other side. But the instance dir is blown away for offload, so that model doesn't carry.15:24
dansmithalso, the other thing to keep in mind here,15:25
sean-k-mooneywell the bitstream format is vendor dependent15:25
sean-k-mooneyit would be nice to have a separate format for it15:25
dansmithis that for things like glance multi-store, you have to know that those things are linked together when you move/copy between stores15:25
dansmithif you use that snapshot to spawn to a ceph-using compute node,15:25
dansmithwhat do you do with that tpm image?15:25
dansmithcopy to everyone? ignore?15:25
sean-k-mooneyefried: right we would need to snapshot it and store it somewhere15:26
dansmith"copy to everyone" meaning.. any instance that CoWs from that should get a copy of the tpm15:26
efriedgood question, I hope if you're using a snapshot to create multiple VMs from, you would want them to start with a clean TPM, but mebbe not.15:26
dansmithefried: but how do you know?15:26
dansmithefried: if I snapshot my instance and delete it, then come back a year later for black friday,15:27
sean-k-mooneydansmith: by the way this topic in general sounds more like the goal of glare to solve. e.g. a generic artifact store15:27
dansmithyou don't know if I'm going to create one or a hundred of those15:27
sean-k-mooneybut i dont think glare is a thing anymore15:27
efrieddansmith: your point is that we can't differentiate between shelve-offload-unshelve and snapshot-restore?15:27
dansmithsean-k-mooney: maybe, not sure glare was really for keeping this kind of thing, vs things like template configs.. but maybe15:27
dansmithefried: no we definitely can, I'm saying snapshot-restore has a similar problem if you store the thing in glance or otherwise link it via the snapshot itself, vs. the image record in nova15:28
sean-k-mooneyso i guess there are different usecases but i would not assume that we want to share tpm snapshots between multiple instances15:28
sean-k-mooneythat said i can see a use case for that where i store secure boot keys or something in the tpm snapshot15:29
dansmithsean-k-mooney: agree15:29
efriedsean-k-mooney: I think dansmith is saying that, even if we put that stake in the ground, we can't necessarily know when we go to use a snapshot15:29
efriedbut dansmith I would think we would only ever restore the vTPM if the operation is unshelve. If it's just "boot with this image" we don't.15:29
*** TxGirlGeek has joined #openstack-nova15:29
dansmithso it would be better if we have a solution where the linkage is between the instance and the tpm, not the snapshot image and the tpm15:29
*** bhagyashris has quit IRC15:30
efriedyeah, that works. I was actually thinking to use the instance UUID as the link15:30
openstackgerritMatt Riedemann proposed openstack/nova master: PoC for using COMPUTE_SAME_HOST_COLD_MIGRATE
efriedin the metadata of both images, store vtpm_original_instance_uuid=$instance_uuid15:30
dansmiththat's a bad idea, IMHO15:30
dansmiththat opens up an attack vector15:31
efriedon unshelve, I need to make sure that the instance I'm unshelving is $instance_uuid, and I also need to suck in the vtpm image with that $instance_Uuid15:31
dansmithif I can create an image, then I could create a tpm image for your instance and when you go to unshelve, you use my tpm instead of nothing beause you didn't have one before, so I can inject keys and get you to use them potentially15:31
sean-k-mooneydansmith: because the uuid can be updated15:31
sean-k-mooneywe likely need to store this nova side to be safe15:32
dansmithif you're going to store the image in a glance image, storing the id of that in sysmeta would be much stronger a link I think15:32
sean-k-mooneyya that is what i was thinking15:32
dansmithand maybe the sha256 of it too15:32
efrieddansmith: Seems like that attack vector is open anyway, just narrower... oh, yeah, a signature ought to close it.15:32
dansmithefried: it means you have to have the ability to modify the internals of nova and create a malicious image.. it's much narrower15:33
dansmithcreating images is done via the api.. sysmeta is only at the db layer so requires a much larger exploit15:33
sean-k-mooneyya also if the tpm is encrypted15:34
efrieddansmith: also keep in mind the file is useless without... yeah.15:34
dansmithnote that I don't *know* that abusing glance in this way is legit15:34
sean-k-mooneyyou not only need to replace the image, it has to be encrypted with the right key15:34
dansmithefried: it's not useless, it's still a dictionary attack vector, but I understand15:34
dansmithsean-k-mooney: right, but the key is get-able from barbican via the api15:35
efrieddansmith: so the instance record sticks around when offloaded?15:35
dansmithefried: that's the whole point of shelve15:35
efriedkeep the instance record, blow away the allocations etc.15:35
efriedand I assume we can unshelve onto any host, doesn't have to be the original.15:36
efried(within the bounds of scheduler restrictions of course)15:36
sean-k-mooneywe go to the scheduler to select the hosts15:36
dansmithlikely not the original15:36
dansmithshelve was intended as "cold storage" so some significant amount of time would have passed before you unshelve, so very unlikely you land on the same host15:36
dansmithIt will not always be used that way, but.. can't make any assumptions15:37
efriedokay, cool, I'm going to write this up and let the stakeholders know we've doubled the complexity again.15:37
efriedAnd maybe go ask -glance for recommendations on how to store the thingy.15:37
dansmithefried: we might ask jroll and the Verizon Media (tm) people what they think about storing the tpm in glance15:38
efriedThanks for the talk dansmith sean-k-mooney15:38
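The linkage dansmith suggests (a glance image id plus a digest kept in the instance's system_metadata, verified at unshelve so a swapped-in malicious image is rejected) could look roughly like this; all key names and helpers are hypothetical:

```python
import hashlib

def record_vtpm_blob(sysmeta, image_id, blob):
    """At shelve time: remember which glance image holds the vTPM blob,
    plus its digest, in the instance's system_metadata (db side, not
    reachable through the image API)."""
    sysmeta["vtpm_image_id"] = image_id
    sysmeta["vtpm_sha256"] = hashlib.sha256(blob).hexdigest()

def verify_vtpm_blob(sysmeta, image_id, blob):
    """At unshelve time: only accept the blob if both the image id and
    the digest match what nova recorded earlier."""
    return (sysmeta.get("vtpm_image_id") == image_id
            and sysmeta.get("vtpm_sha256") == hashlib.sha256(blob).hexdigest())
```

This mirrors the reasoning above: sysmeta lives behind the db layer, so forging the link requires a much larger exploit than just creating an image via the glance API.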
sean-k-mooneythe other usecase for shelve is public cloud to avoid billing but ya unless you have a server affinity group or something it's very unlikely to land on the same host15:39
dansmithefried: already been asking some questions in -glance15:39
sean-k-mooneyno worries i need to review your spec and code too15:39
sean-k-mooneyill try to do that this week15:39
bauzasmriedem: want me to care on backporting to stable branches ?15:46
bauzasor would you rather have me reviewing ?15:46
mriedemcan you do both?15:46
bauzasboth backporting and reviewing ? well, is that a thing ?15:46
bauzaswe have elod and lyarwood we can bug15:47
*** shilpasd has quit IRC15:47
mriedemi have things in stable/train you could review15:47
bauzasmriedem: sure thing15:48
mriedemi guess i don't know if you're just talking about that specific change or reviewing code in general15:48
*** fgonzales3246 has joined #openstack-nova15:48
mriedembecause both would be good15:48
bauzasmriedem: no, sorry, was talking of proposing backports of this change15:48
bauzasmriedem: cool, I'm committed to reviewing for the last 2 days15:49
bauzasanyway, let's just backport this stuff...15:49
*** READ10 has quit IRC15:55
*** ociuhandu has joined #openstack-nova15:56
*** tbachman has quit IRC15:56
dansmithjroll: you around?15:57
efriedI suspect in this case it would be possible to use the keymgr backend to store this blob. But that's really not a good general model.15:58
efriedthis is really a job for swift. And then the responsibility of the admin to make sure their swift is appropriately hardened.15:59
efrieddo we have a precedent for nova talking to swift??16:00
efriedI... don't see one.16:00
dansmithso, if you have no swift, we what.. lose your tpm on snapshot/shelve or refuse to do it?16:00
efriedyeah, I would think so.16:00
*** jmlowe has joined #openstack-nova16:01
dansmithhow about instead of that,16:01
efriedI mean, there seems to have been plenty of talk on recent features about "not supported" (you can try it but it won't work right) versus actually blocked16:01
dansmithwe refuse to ever boot an instance with a tpm if there's no swift endpoint in the catalog?16:01
dansmithefried: yeah and they all suck16:01
*** ociuhandu has quit IRC16:01
efrieddansmith: I can live with that; same as the requirement for the keymgr to exist.16:02
dansmithefried: because the user doesn't know any of this, other than that "every openstack cloud they use seems to be different and incompatible"16:02
fgonzales3246hello all, I'm configuring SSL connections to my MySQL database for the components, and when I activate this in the database and in each connection my controller starts to slow down and the requests fail due to timeout.16:02
dansmithefried: ack, I'd much rather do that, or at least do that in addition to the other16:02
efrieddansmith: IMO it's better to block operations you know are broken than to simply state they're not supported and let them sh*t all over themselves if tried.16:02
fgonzales3246I tried to isolate one component, like cinder, but the same thing occurs... seems that the requests don't go to the compute node16:02
efriedbut I think the counterargument has been that we want to be able to enable the operation without needing a new microversion.16:03
fgonzales3246but the services are all up and running, do you know what could I be missing? Something related to nova-conductor or something else?16:03
dansmithefried: of course, I'm saying having "the plan" be a thousand places where seemingly fundamental operations can fail with the appropriate combination of obscure backend configs sucks from the user's perspective16:03
efriedanyway, in this case we're not talking API changes, and we're talking blocking at boot rather than (well, in addition to) shelve, so I'm good.16:03
efrieddansmith: totally agree.16:03
efriedblock it if you know it's broken.16:03
dansmith...but try not to rely on that in design16:04
efriednow I just hope I can detect a swift16:04
efriedcause I wanted to add a check for "key manager configured" to the "do we enable vTPM in the first place" logic16:05
efriedbut couldn't find a reliable way to inspect CONF for that16:05
dansmithyou need the catalog, no?16:05
efriedbut yeah,16:05
efriedI can at least look in the catalog to make sure there's a keymgr endpoint.16:05
efriedexcept I'm doing this on init_host and don't have a context?16:05
dansmithyou do this at boot time for the instance16:06
dansmithnon-tpm instances don't care about this16:06
dansmithwhat you want to avoid is agreeing to create an instance for the user which in six weeks will be unsnapshottable16:06
efriedSorry, I'm talking about something slightly different: the part where the libvirt driver decides whether to set the "can I vTPM?" flag.16:06
dansmithI'm not talking about that16:06
efried...causing it to expose the traits16:06
efriedI know, I am.16:06
efriedI understand and agree with what you're talking about.16:07
dansmiththat too may be good, but expecting keystone is up and reachable when compute starts is not okay I think16:07
efriedorly? ... okay.16:07
dansmithcomputes on the other side of a network partition shouldn't just fail to start I think16:07
efriedfair enough16:07
dansmiththey'll currently wait for conductor if it's not up yet, so...16:07
efriedyeah, I don't want to be the guy to introduce a network ping into init_host.16:08
dansmithwe have that :)16:08
dansmithI added it16:08
efriedfrom compute driver or from compute manager?16:09
dansmith...but it's just a ping() rpc method compute makes to conductor to see if it's alive before it goes :)16:09
efriedyeah, that's in manager. I'm talking about libvirt compute driver's init_host, which seems... dirtier.16:09
dansmithyes, I'm just pointing out that I called it ping :)16:09
*** ivve has quit IRC16:09
*** nanzha has quit IRC16:10
efriedokay, so lacking a jroll for live discussion, I'll mod the spec as if this is the plan, and ping for reviews.16:10
efriedThanks dansmith16:10
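The catalog check efried describes (deciding whether a key-manager or object-store service exists before allowing vTPM operations) could be sketched over a keystone-style catalog structure like this. A sketch only: real nova would go through keystoneauth with a request context, and the service type strings are assumptions:

```python
def has_service(catalog, service_type):
    """Return True if a keystone-style service catalog lists at least one
    endpoint for the given service type (e.g. 'key-manager', 'object-store')."""
    for svc in catalog:
        if svc.get("type") == service_type and svc.get("endpoints"):
            return True
    return False

# A toy catalog in the shape keystone returns with a token.
catalog = [
    {"type": "key-manager", "endpoints": [{"url": "http://barbican:9311"}]},
    {"type": "compute", "endpoints": [{"url": "http://nova:8774"}]},
]
```

Doing this per-instance at boot time (as dansmith argues) avoids requiring keystone to be reachable during the driver's init_host.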
*** jmlowe has quit IRC16:10
dansmithsounds good16:11
mnasersean-k-mooney: so my new map is 3C per NUMA node (6 threads total), ~32GB per NUMA node, with quite interesting latency numbers16:13
openstackgerritStephen Finucane proposed openstack/nova master: zuul: Remove unnecessary 'tox_install_siblings'
*** udesale has quit IRC16:14
efriedmriedem: the barbican-gating-on-py3 thing from yesterday seems to be busted -- would you mind having a quick look at some point?16:16
*** gyee has joined #openstack-nova16:16
*** tbachman has joined #openstack-nova16:17
*** dpawlik has joined #openstack-nova16:18
stephenfinefried: didn't AJaeger have a fix for that?16:22
* stephenfin looks for links16:22
sean-k-mooneymnaser: that kind of looks like what i would expect16:22
efriedstephenfin: yes, gmann said something in -barbican16:22
sean-k-mooneyas the numa distance increases the latency does too16:22
stephenfinah, maybe not the same thing16:22
stephenfinnvm me16:22
mnasersean-k-mooney: i also bumped the "4-link xGMI max speed" setting to 18Gbps .. but the machine is now crashing (it was set to 10.667 earlier)16:23
efriedstephenfin: yeah, he pointed to (same thing but in train)16:23
mnaserso ill reset it back to that, and hopefully the latency stays at these numbers16:23
gmannstephenfin: yeah, stable/train backport will fix barbican. but still asking the reason to infra on ML for learning.16:23
sean-k-mooneybut it lends more evidence to the idea we should look at cache affinity and numa distance when selecting vm numa node mappings16:24
*** ricolin has quit IRC16:24
mnasersean-k-mooney: yep, giving an instance two NUMA nodes that are '32' away vs '10' ..16:24
sean-k-mooneymnaser: can you dump the numa distance info16:24
mnaseror rather 11/1216:24
sean-k-mooneyit would be good to confirm the latency correlates with the numa distance reported16:25
mnaserthey're not that consistent but yeah16:25
sean-k-mooneyya it does: if we look, nodes 8-15 have the same numa distance of 3216:26
*** Sundar has joined #openstack-nova16:26
mnaserwhich are all socket #216:26
sean-k-mooneywe see 2-7 are all reported at numa distance of 12 and have almost the same latency16:27
gmannwe need some review from mriedem and tonyb here to get these backport in (pinged dave-mccowan barbican stable core also) - we need to get these merged as soon as we can -
mnasersean-k-mooney: struggling to see what the 11/12 are there for16:27
sean-k-mooneyya its not fully clear but we could use this info to optimise slightly16:29
sean-k-mooneyi suspect that is related to how the dies are grouped in pairs16:29
sean-k-mooneydistance of 11 is to the other ccx in the pair16:29
sean-k-mooney12 is the other ccx on the socket not in the pair16:29
*** dpawlik has quit IRC16:29
sean-k-mooneyand 32 is cross socket in this case16:29
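The optimisation hinted at above (and again later with the itertools.permutations remark: prefer host node sets with the smallest mutual SLIT distance rather than taking nodes in order) could be sketched as follows. Hypothetical helper, not nova's scheduler code:

```python
from itertools import combinations

def best_node_sets(distances, guest_nodes):
    """Rank candidate host NUMA node sets for a multi-NUMA guest by the
    sum of pairwise SLIT distances (lower is better).

    distances is a square matrix; distances[i][j] is the reported NUMA
    distance from node i to node j (10 means local in ACPI SLIT terms).
    """
    hosts = range(len(distances))

    def cost(nodes):
        return sum(distances[a][b] for a, b in combinations(nodes, 2))

    return sorted(combinations(hosts, guest_nodes), key=cost)

# Toy SLIT: nodes 0 and 1 share a socket (distance 11), node 2 is remote (32).
slit = [
    [10, 11, 32],
    [11, 10, 32],
    [32, 32, 10],
]
print(best_node_sets(slit, 2)[0])  # (0, 1): the same-socket pair wins
```

On the EPYC box discussed here, this would prefer the 11/12-distance CCX pairs over any cross-socket (distance 32) combination.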
*** jawad_axd has joined #openstack-nova16:29
mnasersean-k-mooney: ok that makes sense then, i will try to reset the infinity fabric speed down to see if the machine stops crashing :\16:32
*** ociuhandu has joined #openstack-nova16:33
*** jawad_axd has quit IRC16:34
Sundarsean-k-mooney: Thanks a lot for reviewing the Cyborg spec despite your busy schedule. All the best.16:34
*** dpawlik has joined #openstack-nova16:35
*** ociuhandu has quit IRC16:38
*** ociuhandu has joined #openstack-nova16:38
mriedemgmann: the only one passing is stable/train16:40
*** jaosorior has joined #openstack-nova16:40
*** dpawlik has quit IRC16:41
efriedmriedem: it sounds like if that one merges it might fix master (I still don't understand how though)16:42
gmannmriedem: yeah some unrelated failures. i did recheck on stein. after the train cherry pick is merged we can check if the problem is solved. otherwise we need to get them all in16:42
gmannit all depends on how zuul picks the job definition due to those branch variants.16:43
gmannwe can try one by one. last option to avoid barbican gate is to make grenade job as py2 explicitly.16:44
gmann*gate break16:44
mriedemefried: because the job definitions are branch specific and grenade deploys the old side using stable/train and the new side using master16:47
mriedemso sometimes to fix a thing on master for grenade you need to do something in master-116:47
mnasersean-k-mooney: ok making good progress, according to numastat it ended up scheduling to numa cell0 and cell1 which means this is the most "optimized" setup16:47
mnasertopology is 8 sockets, i wonder if it would be better to model the topology as "2 sockets, 4 cores, 1 threads" or "2 sockets, 2 cores, 2 threads" too16:48
*** damien_r has quit IRC16:51
*** damien_r has joined #openstack-nova16:53
*** macz has joined #openstack-nova16:54
efrieddansmith: are there any operations other than shelve for which the swift blob would be appropriate?16:56
dansmithwell, what are you going to do about snapshot?16:57
dansmithsnapshot gets a fresh tpm?16:57
dansmiththat doesn't work well for people that use snapshot as a backup mechanism16:57
dansmithspeaking of that, the backup api :)16:57
mnaserburn it with fire16:58
* mnaser goes back to hiding16:58
efrieddansmith: gr, config option? save_tpm_with_snapshot_note_does_not_apply_to_shelve_where_we_always_do_that_anyway = $bool16:58
efriedopen question I guess, sigh16:59
dansmithI mean, not a config option, no16:59
*** martinkennelly has quit IRC16:59
dansmithbut not sure what the answer is16:59
sean-k-mooneymnaser: that is because we use itertools.permutations, so we will initially iterate over the numa nodes in order trying to fit the virtual numa nodes to the host16:59
efriedI think the answer has to be no16:59
efriedbecause we have no way to associate in that scenario17:00
*** martinkennelly has joined #openstack-nova17:00
dansmithif I snapshot my instance and plan to roll back if my OS upgrade fails, I'm going to be super surprised that I can't read my encrypted data volume (ever again)17:00
efriedbecause we're relying on the instance sysmeta to associate for shelve.17:00
sean-k-mooneymnaser: so in that specific case it will work17:00
mnasersean-k-mooney: so maybe eventually machines might get some inefficient assignments17:00
sean-k-mooneymnaser: but in general it wont17:00
sean-k-mooneymnaser: yes17:00
mnaserso its possible i hit node4+node5 which is a bad time17:00
dansmithefried: I think it'd be good to get some jroll input on this17:00
sean-k-mooneyif nodes 0, 2, 3 could fit it, it would select 0 and 217:01
sean-k-mooneyand never check 317:01
dansmithefried: but I think people will be pretty surprised if it's gone17:01
mnasersean-k-mooney: so far im seeing some numbers increasing by 25% -- and ok i see17:01
efrieddansmith: but then we go back to needing the association in the instance meta, because there's nothing else17:01
sean-k-mooneymnaser: that is why i said we likely could optimise this using the numa distance but its more complex than what we do today17:02
*** Sundar has quit IRC17:02
dansmithefried: image meta you mean? I'm not saying we should, I'm just poking holes in it which.. if people lose data they get mad17:02
sean-k-mooneymnaser: and it has implications for modeling in placement17:02
dansmithefried: I'm on a call right now though17:02
efriedsorry, yeah, image meta.17:02
mnasersean-k-mooney: yes that sounds massively complex to get implemented -- but would be so awesome17:02
sean-k-mooneymnaser: what feature that relates to numa isn't :)17:03
mnaserhah, good one17:03
mnasersean-k-mooney: i mean in my case, the fact memory doesn't float across 8/16 numa nodes means at least 2 is better; the only "bad" potential outcome is if a VM gets pinned to two NUMA nodes that are pretty far apart17:03
sean-k-mooneyright as in on different sockets.17:03
sean-k-mooneybut even then its still better than floating as it will be deterministic17:04
sean-k-mooneywell in that case once the instance is booted its performance should not change17:04
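The in-order fitting behaviour sean-k-mooney describes can be shown with a toy first-fit loop (purely illustrative, not nova's actual placement code): itertools.permutations yields host-node orderings lexicographically, so the lowest-numbered nodes that fit always win, with no regard to the distance between them.

```python
import itertools

def first_fit(host_free_mem, guest_node_mem):
    """Return the first host-node ordering that fits each guest numa node.

    itertools.permutations yields candidates in lexicographic order, so
    a first-fit loop tries (0, 1) before ever considering higher-numbered
    nodes, regardless of the numa distance between the chosen pair.
    """
    host_nodes = range(len(host_free_mem))
    for candidate in itertools.permutations(host_nodes, len(guest_node_mem)):
        if all(host_free_mem[h] >= need
               for h, need in zip(candidate, guest_node_mem)):
            return candidate
    return None
```

So a 2-node guest lands on host nodes (0, 1) whenever they have room, and node 3 is never examined once an earlier candidate fits.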
*** fgonzales3246 has quit IRC17:08
*** dpawlik has joined #openstack-nova17:12
*** xek_ has joined #openstack-nova17:15
openstackgerritMerged openstack/nova stable/train: Imported Translations from Zanata
*** xek has quit IRC17:17
*** rpittau is now known as rpittau|afk17:18
*** dpawlik has quit IRC17:21
*** takamatsu has joined #openstack-nova17:24
*** damien_r has quit IRC17:26
*** jawad_axd has joined #openstack-nova17:26
*** jawad_axd has quit IRC17:31
openstackgerritBoris Bobrov proposed openstack/nova master: Create a controller for qga when SEV is used
KeithMnemonicany chance to please get some reviews on the "marker" patches mriedem was working on? Thanks!17:34
*** dpawlik has joined #openstack-nova17:35
dustincefried, gibi, dansmith: RE Provider Config: Just got caught up on this morning's convo. Am I right in seeing that you guys want to A) drop NAME as identification method, and B) Allow both $UUID and $COMPUTE_NODE to identify same RP but $UUID takes precedence? If so I am 100% on board.17:36
*** jawad_axd has joined #openstack-nova17:47
*** jaosorior has quit IRC17:50
*** dpawlik has quit IRC17:51
*** jawad_axd has quit IRC17:52
*** jaosorior has joined #openstack-nova17:57
*** igordc has joined #openstack-nova18:00
*** derekh has quit IRC18:01
dansmithdustinc: I dunno about a conflict between a fixed uuid and $compute_node, but I would recommend handling that the same way I suggested for the name/uuid conflict, which is to log error and ignore both18:06
*** tbachman has quit IRC18:10
*** amodi has joined #openstack-nova18:10
*** jmlowe has joined #openstack-nova18:16
*** tbachman has joined #openstack-nova18:17
*** martinkennelly has quit IRC18:18
*** tesseract has quit IRC18:18
*** jaosorior has quit IRC18:22
*** ralonsoh has quit IRC18:29
*** nweinber_ has joined #openstack-nova18:30
openstackgerritBoris Bobrov proposed openstack/nova master: Also enable iommu for virtio controllers and video in libvirt
openstackgerritBoris Bobrov proposed openstack/nova master: Create a controller for qga when SEV is used
*** nweinber has quit IRC18:34
sean-k-mooneyby the way i just looked at the implemented spec folder for train. looks like we have not moved them
*** tosky has quit IRC18:55
*** ociuhandu has quit IRC18:57
*** ociuhandu has joined #openstack-nova18:59
*** dviroel has joined #openstack-nova18:59
*** ociuhandu has quit IRC19:03
mriedemsean-k-mooney: you can run this and post the results
sean-k-mooneysure just leaving to grab dinner but ill give it a try when i get back19:14
sean-k-mooneyok so that checks with launchpad to determine if they were finished and then updates them19:15
*** tbachman has quit IRC19:23
*** tbachman has joined #openstack-nova19:25
*** munimeha1 has quit IRC19:29
*** awalende has joined #openstack-nova19:31
*** lpetrut has quit IRC19:32
*** lpetrut has joined #openstack-nova19:32
*** ociuhandu has joined #openstack-nova19:35
*** awalende_ has joined #openstack-nova19:36
*** awalende has quit IRC19:39
*** awalende_ has quit IRC19:41
*** awalende has joined #openstack-nova19:41
*** abaindur has joined #openstack-nova19:45
*** abaindur has quit IRC19:45
*** abaindur has joined #openstack-nova19:46
*** ociuhandu has quit IRC19:46
*** ociuhandu has joined #openstack-nova19:47
*** abaindur has quit IRC19:48
*** abaindur has joined #openstack-nova19:49
*** awalende_ has joined #openstack-nova19:51
*** ociuhandu has quit IRC19:53
*** awalende has quit IRC19:54
openstackgerritMark Goddard proposed openstack/nova master: Clear rebalanced compute nodes from resource tracker
openstackgerritMark Goddard proposed openstack/nova master: Invalidate provider tree when compute node disappears
openstackgerritMark Goddard proposed openstack/nova master: Prevent deletion of a compute node belonging to another host
openstackgerritMark Goddard proposed openstack/nova master: Fix inactive session error in compute node creation
*** ociuhandu has joined #openstack-nova19:57
*** ociuhandu has quit IRC20:05
*** awalende has joined #openstack-nova20:05
*** awalende_ has quit IRC20:06
*** awalende_ has joined #openstack-nova20:06
*** eharney has quit IRC20:08
*** awalende has quit IRC20:10
*** nweinber__ has joined #openstack-nova20:11
*** nweinber_ has quit IRC20:14
*** awalende has joined #openstack-nova20:17
*** awalende has quit IRC20:19
*** awalende_ has quit IRC20:20
*** tbachman has quit IRC20:26
*** ociuhandu has joined #openstack-nova20:36
efrieddustinc: The way I suggested yesterday still works IMO, and it requires less churn in both design and already-written code.20:38
efriedWith dansmith's addendum20:40
efriedWhich is as follows:20:41
efried- Index by identifier (name *or* uuid) and fail hard on conflicts there20:41
efried- Treat $COMPUTE_NODE as the default and a specific name/uuid as override20:41
efried- If a provider appears such that you do find a conflicting name+uuid, log an error and ignore that provider completely. (This should only be possible for *nested* providers.)20:41
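A rough sketch of those three rules (structure and key names are illustrative, not the actual provider-config code): entries are indexed by their single identifier, a duplicate identifier is a hard failure, and the special $COMPUTE_NODE entry acts as the default that an explicit name/uuid entry overrides.

```python
def index_provider_configs(entries):
    """Index provider config entries by identifier, failing on duplicates.

    Each entry is a dict carrying one of 'name' or 'uuid'; the special
    uuid '$COMPUTE_NODE' stands for the compute node itself and serves
    as the default template.
    """
    index = {}
    for entry in entries:
        ident = entry.get('uuid') or entry.get('name')
        if ident is None:
            raise ValueError('provider entry needs a name or uuid')
        if ident in index:
            # hard failure: two entries claim the same identifier
            raise ValueError('duplicate provider identifier: %s' % ident)
        index[ident] = entry
    return index
```

The third bullet (a discovered provider matching one entry's name and another's uuid) would be detected later, when real providers are reconciled against this index, and handled by logging and skipping that provider.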
*** jmlowe has quit IRC20:43
*** ociuhandu has quit IRC20:51
*** tbachman has joined #openstack-nova20:51
*** ociuhandu has joined #openstack-nova20:53
efrieddansmith: shelve without offload only works for non-volume-backed, right?21:00
efriedand followup question, should we do the swift thing for shelve without offload, or only with offload? I guess the question is a) does unshelve always happen on same host in that case; b) do we care about "shelving" the minimal amount of storage required by the vdev?21:01
artomsean-k-mooney, as promised, if you still have energy today, is ready again21:02
efriedprobably yes to b), just for the sake of form, so shelving the vdev would be the right thing to do.21:02
efriedguess it really just makes sense to make it part of snapshot and have a crisp line21:04
*** ociuhandu has quit IRC21:06
mriedemefried: shelve works for volume-backed and non-volume-backed21:13
*** eharney has joined #openstack-nova21:14
efriedmriedem: In the compute API I see that volume-backed calls the shelve-offload code path.21:14
mriedemif you shelve but don't offload (offloading is automatic by default config) and then unshelve within the offload window, you do end up on the same host (conductor just starts the server)21:14
*** awalende has joined #openstack-nova21:14
efried ?21:15
mriedemyeah so in the case of volume-backed shelve there is no snapshot image21:15
efriedright, but the distinction for 'offload' (at least the one that I care about) is that it calls destroy()21:16
efriedwhereas non-offload doesn't.21:16
efrieduh, unless I got confused.21:16
mriedemyeah shelve without offload powers off the guest and creates a snapshot21:16
mriedemi'm drawing a blank on why we shelve offload immediately for volume-backed servers21:18
mriedemjust ask one of these guys from 2013
efriedwas gonna say, seems like it's been that way from the get-go. dansmith co-authored, so...21:19
*** awalende has quit IRC21:19
efriedhe'll surely know.21:19
mriedemi guess this comment
dansmithmriedem: because there's no reason not to offload immediately21:20
efriedprobably because waaay back then, the distinction with offload was only whether snapshot was necessary, which it isn't for volume (right?)21:20
dansmithmriedem: there's no point in maintaining affinity21:20
dansmithefried: not just no snapshot required, but there's no benefit to unshelve to the same host before offload21:21
dansmithonce offloaded, it's might as well go anywhere, but before offload, it's much quicker if you stay put21:21
dansmithbut with volume-backed, that distinction does not exist21:21
efriedgot it.21:21
dansmithefried: also, did you say I co-authored something?21:22
mriedemi didn't realize the offload periodic came later
dansmithoh that shelve patch21:22
dansmithI dunno what I co-authored about it21:22
dansmithnot much21:22
mriedemnot that one21:22
efriedco-authored *and* +2ed, tsk21:23
efriedthose were rowdy times21:23
eanderssonAnyone got any experience with the openstacksdk interacting with nova. Trying to figure out how to fix a memory leak in Senlin.21:23
dansmithefried: never uploaded a patch set,21:23
efriedeandersson: you mean nova talking through sdk, or using sdk to talk to nova?21:23
dansmithefried: so that was probably honorary because I helped figure something out21:23
eanderssonusing the sdk to talk to nova21:23
efriedeandersson: mmph. Not so hot there. Given you already met crickets in -sdks, mriedem and I might between us be able to figure things out...21:24
eanderssonSenlin seems to be creating a new client every time it makes a call to nova. Which isn't ideal.21:25
efriedand you think that's because of the way it's talking to sdk?21:25
eanderssonI wanted to make it a singleton, but not sure how to do that when the user/project might differ.21:25
eanderssonIf you look at line 489 there21:26
eanderssonSo I know for sure the leak is caused by the openstacksdk21:26
mriedemso create a singleton map per user/project hash?21:26
eanderssonYea - was going to go that route, but seems crazy to me hehe21:27
mriedemobviously it's just masking a leak21:27
eanderssonIsn't there a way I can just create one openstacksdk object and re-use it21:27
mriedembut idk why the sdk would be leaking21:27
mriedemi'm not the guy to ask about that...21:27
eanderssonIt could be some reference that keeps it alive21:27
eanderssonI poked mordred so maybe he'll get back to me tomorrow21:28
efriedI am *sure* I remember mordred telling me sdk already does local singleton-ing21:28
efriedbut that doesn't mean it isn't buggy.21:28
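The "singleton map per user/project hash" mriedem suggests could look something like this (make_connection stands in for whatever actually constructs the SDK client; as he notes, this only bounds the number of clients created rather than fixing the underlying leak):

```python
# Module-level cache: one client per (user, project) pair, so repeated
# calls for the same credentials reuse an existing client instead of
# constructing a fresh one each time.
_clients = {}

def get_client(user_id, project_id, make_connection):
    """Return a cached client per (user, project), creating it on first use."""
    key = (user_id, project_id)
    if key not in _clients:
        _clients[key] = make_connection(user_id, project_id)
    return _clients[key]
```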
mriedemdansmith: this is random, but i've noticed that during a resize_claim we make at least 3 lazy-load roundtrip calls to the db to load fields that in most cases probably aren't even populated
mriedemwas wondering what you thought about adding something to Instance.refresh() to specify additional fields to load up21:30
*** xek__ has joined #openstack-nova21:30
mriedemrefresh today just pulls the instance with it's existing additional joined fields,21:30
dansmithlazy load pci_devices twice?21:30
dansmithresources is the new vpmems stuff right?21:30
mriedempci_requests, pci_devices and resources21:30
dansmithah right21:30
dansmithbut sure, refresh with an extra_attrs optional param makes sense21:31
mriedemso was thinking something like Instance.refresh(..., extra_attrs=None) yeah21:31
mriedemit's a version bump since refresh is remotable21:31
dansmithso you can fault them all in in a single go if you know you're going to hit them21:31
mriedembut that's probably not a big deal21:31
dansmithmriedem: or if you don't need a refresh, you could just do a "load_if_not_present()"21:31
mriedemand i want to use refresh here since the instance in the resize_claim is what's being used in the compute manager as well21:31
*** xek_ has quit IRC21:32
*** dtantsur is now known as dtantsur|afk21:32
dansmiththen sure21:33
mriedem if you want to hack on it21:35
openstackLaunchpad bug 1853370 in OpenStack Compute (nova) "resize_claim lazy-loads at least 3 joined fields in separate DB calls" [Low,Confirmed]21:35
mriedemi just wanted to dump thoughts before moving on21:35
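A toy model (all names invented; the real Instance is a remotable versioned object) of the savings being discussed: each lazy load costs a separate DB round trip, while a refresh(extra_attrs=[...]) would fault in every requested field in one go.

```python
class FakeInstance:
    """Illustrative stand-in for counting DB round trips."""

    def __init__(self, db):
        self._db = db
        self._cache = {}
        self.db_calls = 0

    def lazy_load(self, attr):
        # one round trip per attribute, as resize_claim does today
        if attr not in self._cache:
            self.db_calls += 1
            self._cache[attr] = self._db[attr]
        return self._cache[attr]

    def refresh(self, extra_attrs=None):
        # single round trip that joins in every requested attribute
        self.db_calls += 1
        for attr in (extra_attrs or []):
            self._cache[attr] = self._db[attr]
```

Loading pci_requests, pci_devices and resources lazily costs three trips; the batched refresh costs one.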
sean-k-mooneyartom: ya ill take a look ill be around for about half an hour so ill review that before i finish up today21:37
*** tosky has joined #openstack-nova21:41
efriedsigh, I need a fresher on rebuild again.21:42
*** pcaruana has quit IRC21:42
efriedthis is where you give an instance a new image but keep its network etc.21:42
efrieddoes vTPM fall into "etc"?21:43
efriedI would kinda think not.21:43
efrieddansmith: ^21:43
dansmithefried: this?21:43
dansmithoh on rebuild21:43
dansmithrebuild the instance cannot move, just rebuild its root from a new image21:44
efriedYeah, so the question is whether we should keep the vTPM21:44
dansmithexcept when it's evacuate21:44
dansmithwe should, I would think21:44
dansmithrebuild is akin to putting the Windows 98 disc in the drive and rebooting21:44
sean-k-mooneyartom: so ya im happy with those. one thing i noticed is we should update the job templates to drop python 2 and 35 for the tempest plugin and just use the ussuri template instead21:44
sean-k-mooneythat can be in a follow up patch21:45
efriedI kinda want to say the vTPM should ride with the image21:45
efriedthat would make for consistency with the backup/restore and shelve/unshelve type cases.21:45
sean-k-mooneywe do not nuke additional ephemeral disks on rebuild right?21:46
sean-k-mooneyjust the root disk21:46
dansmithvtpm should survive a rebuild21:46
dansmithfor sure21:46
sean-k-mooneyyou could look at the vtpm the same way21:46
efriedi.e. any time you create an instance from an image, we check the image for the metadata that points to the vtpm swift store.21:46
efrieddansmith: so what if your instance has a vTPM, and then you rebuild with an image that has ^ metadata?21:47
*** nweinber__ has quit IRC21:47
mriedemefried: ?!!??! :)21:47
sean-k-mooneyefried: i dont think that is right21:47
dansmithefried: instance create is like creating a new server box, rebuild is like reinstalling the OS21:47
sean-k-mooneyefried: i think the vtpm lifetime should be tied to the instance not the image21:47
dansmithTPM is hardware, therefore it stays during a rebuild, IMHO21:47
dansmithsean-k-mooney: ++21:47
efriedSo what if your instance has a vTPM, and then you rebuild with an image that has pointer-to-vtpm-in-swift metadata?21:47
efried"they better match, or we punt"?21:48
*** dviroel has quit IRC21:48
dansmithefried: you mean if you rebuild from a snapshot?21:48
dansmithremember the pointer to swift stays with the instance not the snapshot21:48
dansmithoh you mean for the backup case.. that's why it doesn't work for the backup case :)21:48
efriedno, you convinced me earlier that snapshot and restore should work.21:48
dansmithno, I didn't, I said "you're effed on one of these two things depending on what you decide :)21:48
sean-k-mooneyyes and the point would be stored in the nova system metadata, not the snapshot metadata21:49
efriedbut that doesn't work for backup/restore21:49
sean-k-mooney*pointer to the tpm snapshot21:49
efriedbecause there's no instance meta21:49
efriedthere is only image meta.21:49
sean-k-mooneyoh right we were talking about shelve and unshelve before21:50
efriedif we go on the philosophy that TPM is like hardware, it's not unreasonable that backup/restore would lose it.21:50
sean-k-mooneywell for vpmem and ephemeral disks we dont snapshot them currently21:50
efriedif I do a backup of my laptop and then restore that backup on a different shell, my hardware changed.21:51
sean-k-mooneyand if you reinstall the os on the same laptop the tpm data is preserved21:51
dansmithefried: but that's why VMs are better than hardware21:51
efriedOkay, does this sound reasonable:21:52
efriedOn rebuild, if your instance had a vTPM before, and the image specifies a vTPM also, they must be the same vTPM, or we fail.21:53
sean-k-mooneynot really21:53
efriedwhat should we do in that case?21:53
sean-k-mooneyim not sure the image should be able to reference a vtpm21:54
efriedthen we can't do backup/restore.21:54
sean-k-mooneyof the tpm21:54
sean-k-mooneywe can still do backup and restore of the vm root disk like a normal snapshot21:55
dansmithefried: on rebuild you don't have a tpm in swift because you have it on disk, it hasn't gone anywhere21:55
dansmithefried: so if the snapshot specifies one, you wouldn't use it anyway21:55
efriedso if the image designates a vTPM, ignore it21:55
dansmiththe thing I'm more worried about if you do this,21:55
dansmithis you snapshot an instance and then start 100 copies for it21:55
dansmith*of it21:55
dansmithyou just sprayed your keys everywhere21:56
efriedsean-k-mooney: ftr, this is where dansmith convinced me the vTPM needs to ride with the backup
dansmithlike a dog marking its territory on a hot summer day21:56
efriednice visual21:56
dansmithefried: again21:56
dansmithefried: I'm not saying that it should necessarily, I'm saying it's going to eff up someone one way or the other21:56
dansmithpeople use snapshot to roll back from a bad OS upgrade, and if you don't keep their keys, they're going to get screwed21:57
sean-k-mooneyshould it be configurable?21:57
dansmithbut if you do, and they spawn that image in a bunch of places, ...21:57
efrieddansmith: well... we could also delete the swift obj (and mod the image meta) the first time we restore it.21:57
efriedthat works poorly for crashy hosts tho21:57
dansmithefried: that's fine if you store it in the image record, despite a failed host21:58
dansmithefried: but I think that's probably going to be fragile and maybe confusing why the first instance got it and none of the others did21:58
dansmithbut perhaps that's the best option21:58
dansmithjust saying, it's going to be icky21:58
efriedI'm saying: if my workflow is to back up stuff in case my host goes down, and I do that backup, and then my host goes down, and I restore it, and then my host goes down again, then when I restore it the second time, my vTPM will be gone.21:59
dansmithoh, yes, I see what you mean21:59
*** ociuhandu has joined #openstack-nova21:59
sean-k-mooneyis this something we want the user to express a policy on e.g. hw_vtpm_snapshot=true|false21:59
dansmithefried: this is also bad for the shared storage case in general actually,21:59
efriedsean-k-mooney: we could configure it to death, really.22:00
dansmithefried: one thing about ceph backed instances is you can recover from a failed host with an evac22:00
dansmithefried: but in this case you will fail to boot after evac because we don't have the keys anywhere22:00
sean-k-mooneydansmith: you would loose the tpm data in that case22:00
dansmithso that's another case this breaks pretty badly22:00
dansmithvolume-backed or ceph/nfs backed ephemeral22:00
dansmithsean-k-mooney: right, that's what I just said22:00
sean-k-mooneynfs might be ok if you also put the tpm datastore on nfs but ya. the impact of that will depend on what you used the tpm for i guess22:01
dansmithsean-k-mooney: oh right, but not ceph22:02
efriedbecause ceph can only handle the disk itself?22:02
efriedsimilar to volume-backed22:02
dansmithwe don't actually mount the storage on the host with ceph22:02
dansmithqemu talks to ceph itself, so we don't like have a dir mounted or anything22:02
sean-k-mooneythe tpm data store is not a volume so we wont create it as such in ceph22:02
efriedwow, neat.22:03
efrieddansmith: you know, in that case it's juuust possible qemu will store the vTPM file on ceph too22:03
sean-k-mooneywe kind of wish qemu could do the same for the other backends too22:03
dansmithefried: I don't think so22:03
efriedI doubt it though.22:03
dansmithefried: we'd have to give it a volume id for that or something22:03
dansmiththe flow chart of "how your instance behaves if you have a vtpm" is starting to get pretty scary22:04
efriedtoday it makes up the directory based on the VM ID22:04
efrieddansmith: well, it's always the corner cases that suck.22:04
dansmithefried: well, snapshot is not a corner case IMHO :)22:04
dansmithand ceph definitely isn't22:04
efriedno, but second-restore-of-snapshot-because-host-went-down-twice is22:04
dansmithI wouldn't say so22:05
sean-k-mooneyso unless you take steps to put the vtpm data store on shared storage i think we can assume it would be lost on evacuate in general22:05
efriedsean-k-mooney: definitely that22:05
dansmithsean-k-mooney: which means unbootable instance if you put your FDE keys there22:05
dansmithso in the flow chart, it means evacuate has a giant asterisk next to it :)22:05
sean-k-mooneyyes maybe, although rescue could save you?22:05
dansmith"we will reconstruct the smoking hull of an instance somewhere else where you know your data is in there, but it's unreadable" :)22:06
sean-k-mooneye.g. allow you to reinject the key if you still have it22:06
dansmithsean-k-mooney: the whole point of the tpm is to not have to do that right?22:06
efriednot to be *able* to do that22:06
dansmithsecure enclave doesn't have the same meaning if you also have it on a post-it under your monitor22:06
sean-k-mooneyit has other services it can provide to the os but key storage is one of them yes22:06
efriedI guess I'll just write up the big asterisk in the spec & docs.22:07
*** ociuhandu has quit IRC22:07
dansmithI'd like to see a list of all these special cases yeah22:07
efriedYou would kinda have to make a new backup of the vtpm every time you write to it.22:07
dansmithbecause if the list is too large, it starts to become not so useful22:08
sean-k-mooneyefried: you could make the same argument for the root disk i think that is out of  scope of nova22:09
efriedsean-k-mooney: right, but the answer to "evacuate doesn't help you with ephemeral root disk" is "use volume/ceph".22:09
sean-k-mooneyand here its use barbican22:10
efriedwith a vTPM the answer to "evacuate doesn't help you with ephemeral root disk" is "you better have a very recent snapshot of your vTPM"22:10
efriedbarbican only stores the key used to unlock the file we've been discussing.22:10
efriedsaid file will be stored in swift22:10
efriedbut only during snapshot22:10
sean-k-mooneysure but as a user you can also store keys your self in barbican22:10
efriedhow does that help?22:10
efriedyou mean if you want to not use a vTPM at all?22:10
sean-k-mooneyuse the tpm as a local secure cache of the key22:11
sean-k-mooneyif you evacuate, retrieve the backup from barbican22:11
efriedif barbican is secure enough for you, you would just use barbican.22:11
*** awalende has joined #openstack-nova22:11
efriedthough arguably you're counting on that ultimately here anyway.22:11
sean-k-mooneyyou dont want to transfer it over the network every time you boot22:11
efriedwe're doing that anyway.22:11
sean-k-mooneywell ya22:12
sean-k-mooneyi was just going to say you have to do that to unlock the vtpm22:12
efriedAlso I believe "VM talks to barbican" is a thing we're avoiding22:12
efriedlimiting to "host talks to barbican"22:12
sean-k-mooneybarbican is a top level openstack service so workloads can use it if they want to22:13
efriedif they want to, yes.22:13
efriedbut it's the openstack user's credentials as opposed to $random_vm_user22:13
sean-k-mooneyoh yes that is what i meant by vm22:13
efriedand also I imagine there are reasons for the VM to be blocked from the keymgr service network-wise.22:13
sean-k-mooneyi meant application in the vm22:13
sean-k-mooneyanyway i think a reasonable alternative is: if you need the keys to survive beyond the lifetime of the vm or be shared in a "secure" way, store them in barbican22:15
sean-k-mooneyif you can tolerate losing the key in an evacuate case then you can store it solely in the tpm22:15
*** awalende has quit IRC22:15
*** tbachman has quit IRC22:16
sean-k-mooneydansmith: your primary concern is that we lose the key used to encrypt the os drive, and lose data, if we lose the vtpm right?22:16
sean-k-mooneywould ^ not solve that issue?22:16
efriedresize does a destroy and then a spawn?22:18
sean-k-mooneyit will redefine the domain but you dont lose data22:19
sean-k-mooneyand it may change host22:19
sean-k-mooneye.g. if you resize from a flavor with a 10G root disk to one with 20 it will grow the disk file22:20
sean-k-mooneyand if its libvirt, the partition table and filesystem if it can22:20
sean-k-mooneyso we would have to copy the vtpm file if we move host22:20
efriedyeah, I'm exploring corner cases with the vTPM. If your resize adds/removes a vTPM where it was previously absent/present, that's fine. If both old & new flavor have the same version/model, we should carry the data. If they have a different version/model...?22:21
sean-k-mooneyif its to the same host we shoudl not need to do anything22:21
sean-k-mooneyif they are different models i would expect you to lose the contents of the tpm22:21
sean-k-mooneyunless we block that intentionally22:21
efriedyes to the first, the second is the question.22:22
efriedthere's no way to persist the data, period. The versions/models are incompatible.22:22
sean-k-mooneyya that is probably true, or at least involved to do, so i would not try in v1 of this22:23
sean-k-mooneythere might be a way to convert but im not aware of one off the top of my head22:23
efriedsean-k-mooney: not without introspecting, which ain't my business.22:24
sean-k-mooneyswtpm might have a function to do it22:24
sean-k-mooneybut ya unless libvirt supports it i think its out of scope22:24
efriedso I'm gonna say don't block it, cause who am I to say you didn't mean to do it22:24
sean-k-mooneythat would be my feeling too22:25
sean-k-mooneywe should document it i guess22:25
efrieddoing that now22:26
sean-k-mooneybut i dont think we should guard in code against all pebkac errors22:26
*** slaweq has quit IRC22:26
efriedhmph, the thing about saving the swift ID in instance sysmeta has a problem code flow-wise.22:27
efriedbecause the driver is the thing that knows the vdev is in this particular file path22:27
efriedbut the only thing we've asked the driver to do is snapshot.22:27
efriedand (I think) the driver can't tell if it's being asked to snapshot as part of a shelve or backup or...22:28
sean-k-mooneyum im not 100% sure about that last part22:28
efriedso I guess it could edit the sysmeta itself... but if it's not a shelve and the compute mgr is about to blow away the instance, the compute mgr needs a way to get that information so it can stuff it into the image meta.22:29
efriedI guess it can pull it from the instance22:29
*** dave-mccowan has quit IRC22:29
efriedso the contract with the virt driver's snapshot method has to spell out where & how this info is stored in the sysmeta.22:29
* efried answers own question.22:29
sean-k-mooneywe create a snapshot for shelve but i dont think the compute manager actually calls snapshot as part of that flow, i think its done by the driver22:30
sean-k-mooneythat said i could be way off base there so im checking22:30
efriedcompute manager's shelve calls driver's snapshot22:30
*** JamesBen_ has joined #openstack-nova22:31
sean-k-mooneyoh ok22:31
*** rcernin has joined #openstack-nova22:31
*** JamesBen_ has quit IRC22:32
sean-k-mooneyya you're right, i thought we had a shelve function on the driver interface that we called, but this is handled in the compute manager instead22:32
*** JamesBen_ has joined #openstack-nova22:32
sean-k-mooneyso snapshot is passed the instance so it can get the flavor and image metadata22:33
sean-k-mooneyso the snapshot function can check if the instance has a vtpm and directly update the system metadata as you said22:33
*** tbachman has joined #openstack-nova22:34
*** JamesBenson has quit IRC22:34
sean-k-mooneyoh right but it would not know if its a shelve or not22:34
*** xek_ has joined #openstack-nova22:34
*** slaweq has joined #openstack-nova22:35
sean-k-mooneyoh it can just check the instance.task_state right?22:36
efriedcould, but compute mgr has to be able to peel it out anyway, so it's simpler if snapshot just always behaves the same and compute mgr pulls the info out of the sysmeta if it needs it.22:36
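The contract efried lands on could be sketched as follows (the sysmeta key and function shapes are hypothetical, not the real driver interface): the driver's snapshot() always records the vTPM swift object id in instance system metadata, and the compute manager copies it into the image metadata only for the flows that need it.

```python
# Hypothetical key name; the real contract would be spelled out in the
# virt driver's snapshot() docstring.
VTPM_SYSMETA_KEY = 'vtpm_swift_object_id'

def driver_snapshot(instance_sysmeta, swift_object_id):
    """Driver side: always stash the swift object id in instance sysmeta."""
    instance_sysmeta[VTPM_SYSMETA_KEY] = swift_object_id

def manager_finish_snapshot(instance_sysmeta, image_meta, copy_to_image):
    """Manager side: pull the id out of sysmeta only when the flow needs
    it in the image metadata (e.g. backup, but not shelve)."""
    if copy_to_image and VTPM_SYSMETA_KEY in instance_sysmeta:
        image_meta[VTPM_SYSMETA_KEY] = instance_sysmeta[VTPM_SYSMETA_KEY]
```

This keeps snapshot behaving identically everywhere, matching efried's point that it is simpler than having the driver inspect task_state.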
*** xek__ has quit IRC22:36
*** JamesBen_ has quit IRC22:36
sean-k-mooneyanyway night all o/22:37
*** slaweq has quit IRC22:40
*** abaindur has quit IRC22:41
*** abaindur has joined #openstack-nova22:43
*** tbachman has quit IRC22:44
openstackgerritJohn Garbutt proposed openstack/nova master: WIP: Integrating with unified limits
*** slaweq has joined #openstack-nova22:50
*** slaweq has quit IRC22:55
*** tkajinam has joined #openstack-nova23:05
*** slaweq has joined #openstack-nova23:06
*** xek_ has quit IRC23:08
openstackgerritJohn Garbutt proposed openstack/nova master: Better policy unit tests
*** slaweq has quit IRC23:10
*** slaweq has joined #openstack-nova23:11
*** mlavalle has quit IRC23:15
*** slaweq has quit IRC23:15
*** brault has joined #openstack-nova23:16
efriedargh, destroy_disks=True is no longer sufficient grounds to delete the barbican key :(23:17
efriedIn fact, I'm not sure we can "ever" do that.23:18
*** slaweq has joined #openstack-nova23:21
*** slaweq has quit IRC23:26
*** slaweq has joined #openstack-nova23:30
*** slaweq has quit IRC23:35
*** slaweq has joined #openstack-nova23:37
*** slaweq has quit IRC23:45
*** kaisers1 has quit IRC23:48
*** slaweq has joined #openstack-nova23:48
*** kaisers has joined #openstack-nova23:48
openstackgerritEric Fried proposed openstack/nova-specs master: Spec: Ussuri: Encrypted Emulated Virtual TPM
efriedphew. dansmith sean-k-mooney TxGirlGeek ^23:51
efriedand jroll23:51
*** slaweq has quit IRC23:53
*** tbachman has joined #openstack-nova23:54
*** zhanglong has joined #openstack-nova23:54
*** zhanglong has quit IRC23:59

Generated by 2.15.3 by Marius Gedminas - find it at!