*** adriant has quit IRC | 00:11 | |
*** lbragstad_ has joined #openstack-nova | 00:12 | |
*** lbragstad has quit IRC | 00:14 | |
*** igordc has quit IRC | 00:23 | |
*** markvoelker has quit IRC | 00:29 | |
*** markvoelker has joined #openstack-nova | 00:29 | |
*** hemna has quit IRC | 00:36 | |
*** adriant has joined #openstack-nova | 00:42 | |
*** henriqueof has joined #openstack-nova | 00:45 | |
*** henriqueof1 has quit IRC | 00:46 | |
*** ivve has quit IRC | 00:47 | |
*** ricolin has joined #openstack-nova | 00:48 | |
*** gyee has quit IRC | 00:56 | |
*** henriqueof1 has joined #openstack-nova | 00:56 | |
*** henriqueof has quit IRC | 00:57 | |
dolpher | alex_xu: good evening | 01:03 |
---|---|---|
*** markvoelker has quit IRC | 01:03 | |
*** markvoelker has joined #openstack-nova | 01:04 | |
*** henriqueof1 has quit IRC | 01:05 | |
*** henriqueof has joined #openstack-nova | 01:05 | |
*** hemna has joined #openstack-nova | 01:09 | |
*** pcaruana has quit IRC | 01:09 | |
*** slaweq has joined #openstack-nova | 01:11 | |
*** brinzhang_ has joined #openstack-nova | 01:13 | |
*** slaweq has quit IRC | 01:16 | |
*** brinzhang has quit IRC | 01:17 | |
openstackgerrit | weibin proposed openstack/nova master: Add support for using ceph RBD ereasure code https://review.opendev.org/681188 | 01:20 |
yaawang | good morning | 01:21 |
*** Tianhao_Hu has joined #openstack-nova | 01:22 | |
*** Tianhao_Hu has quit IRC | 01:22 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add user_id and project_id colume to Migration https://review.opendev.org/673990 | 01:32 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add user_id and project_id colume to Migration https://review.opendev.org/673990 | 01:35 |
*** dklyle has quit IRC | 01:42 | |
*** david-lyle has joined #openstack-nova | 01:42 | |
*** hemna has quit IRC | 01:43 | |
*** TxGirlGeek has quit IRC | 01:45 | |
*** david-lyle has quit IRC | 01:46 | |
*** dklyle has joined #openstack-nova | 01:47 | |
alex_xu | dolpher: yaawang good evening and good morning | 01:54 |
*** yedongcan has joined #openstack-nova | 02:09 | |
*** slaweq has joined #openstack-nova | 02:11 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Set user_id/project_id from context when creating a Migration https://review.opendev.org/679413 | 02:11 |
*** slaweq has quit IRC | 02:16 | |
*** dannins has joined #openstack-nova | 02:20 | |
*** hemna has joined #openstack-nova | 02:43 | |
*** tinwood has quit IRC | 02:50 | |
*** tinwood has joined #openstack-nova | 02:52 | |
dolpher | ERROR: No matching distribution found for configparser===4.0.1 | 03:01 |
dolpher | Looks like configparser4.0.1 is not available from pypi | 03:02 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: db: Add resources column in instance_extra table https://review.opendev.org/678447 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: object: Introduce Resource and ResourceList objs https://review.opendev.org/678448 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Add resources dict into _Provider https://review.opendev.org/678449 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Retrieve the allocations early https://review.opendev.org/678450 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Claim resources in resource tracker https://review.opendev.org/678452 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver discovering PMEM namespaces https://review.opendev.org/678453 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: report VPMEM resources by provider tree https://review.opendev.org/678454 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Support VM creation with vpmems and vpmems cleanup https://review.opendev.org/678455 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Parse vpmem related flavor extra spec https://review.opendev.org/678456 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver configuring PMEM namespaces https://review.opendev.org/679640 | 03:12 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Add functional tests for virtual persistent memory https://review.opendev.org/678470 | 03:12 |
*** hemna has quit IRC | 03:16 | |
*** Garyx_ has joined #openstack-nova | 03:33 | |
*** Garyx has quit IRC | 03:33 | |
*** mkrai has joined #openstack-nova | 03:41 | |
*** mkrai has quit IRC | 03:46 | |
*** hemna has joined #openstack-nova | 03:48 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver configuring PMEM namespaces https://review.opendev.org/679640 | 04:03 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Add functional tests for virtual persistent memory https://review.opendev.org/678470 | 04:03 |
*** KeithMnemonic has quit IRC | 04:04 | |
*** slaweq has joined #openstack-nova | 04:11 | |
*** slaweq has quit IRC | 04:16 | |
openstackgerrit | Merged openstack/nova stable/stein: doc: Fix a broken reference link https://review.opendev.org/681141 | 04:17 |
openstackgerrit | Merged openstack/nova stable/rocky: doc: Fix a broken reference link https://review.opendev.org/681142 | 04:17 |
openstackgerrit | Merged openstack/nova stable/queens: doc: Fix a broken reference link https://review.opendev.org/681143 | 04:17 |
*** spatel has joined #openstack-nova | 04:17 | |
*** hemna has quit IRC | 04:23 | |
*** sapd1_x has joined #openstack-nova | 04:23 | |
*** ricolin has quit IRC | 04:27 | |
*** udesale has joined #openstack-nova | 04:31 | |
*** pcaruana has joined #openstack-nova | 04:32 | |
*** Luzi has joined #openstack-nova | 04:46 | |
*** markvoelker has quit IRC | 04:48 | |
*** dave-mccowan has quit IRC | 04:57 | |
*** pcaruana has quit IRC | 04:58 | |
*** spatel has quit IRC | 04:58 | |
*** yedongcan has quit IRC | 05:03 | |
*** mkrai has joined #openstack-nova | 05:09 | |
*** pots has joined #openstack-nova | 05:10 | |
*** sapd1_x has quit IRC | 05:13 | |
*** ash2307 has quit IRC | 05:13 | |
*** ash2307 has joined #openstack-nova | 05:15 | |
*** hemna has joined #openstack-nova | 05:15 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Add functional tests for virtual persistent memory https://review.opendev.org/678470 | 05:19 |
*** jaosorior has joined #openstack-nova | 05:20 | |
*** ratailor has joined #openstack-nova | 05:30 | |
*** ccamacho has quit IRC | 05:33 | |
*** efried_zzz has quit IRC | 05:34 | |
*** hamzy_ has quit IRC | 05:35 | |
*** brinzhang has joined #openstack-nova | 05:48 | |
*** brinzhang_ has quit IRC | 05:51 | |
*** slaweq has joined #openstack-nova | 05:53 | |
*** slaweq has quit IRC | 05:59 | |
*** tkajinam has quit IRC | 06:00 | |
*** mkrai has quit IRC | 06:01 | |
*** aloga has quit IRC | 06:03 | |
*** markvoelker has joined #openstack-nova | 06:06 | |
*** tkajinam has joined #openstack-nova | 06:06 | |
*** pcaruana has joined #openstack-nova | 06:08 | |
*** markvoelker has quit IRC | 06:11 | |
*** slaweq has joined #openstack-nova | 06:13 | |
*** TxGirlGeek has joined #openstack-nova | 06:15 | |
*** TxGirlGeek has quit IRC | 06:17 | |
*** hemna has quit IRC | 06:18 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Retrieve the allocations early https://review.opendev.org/678450 | 06:28 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Claim resources in resource tracker https://review.opendev.org/678452 | 06:28 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver discovering PMEM namespaces https://review.opendev.org/678453 | 06:28 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: report VPMEM resources by provider tree https://review.opendev.org/678454 | 06:28 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Support VM creation with vpmems and vpmems cleanup https://review.opendev.org/678455 | 06:28 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Parse vpmem related flavor extra spec https://review.opendev.org/678456 | 06:28 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver configuring PMEM namespaces https://review.opendev.org/679640 | 06:28 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Add functional tests for virtual persistent memory https://review.opendev.org/678470 | 06:28 |
*** ccamacho has joined #openstack-nova | 06:35 | |
*** rpittau|afk is now known as rpittau | 06:36 | |
*** mkrai has joined #openstack-nova | 06:39 | |
*** pcaruana has quit IRC | 06:41 | |
*** ricolin has joined #openstack-nova | 06:41 | |
alex_xu | stephenfin: sean-k-mooney the fallback case I understand, https://etherpad.openstack.org/p/pcpu_fallback | 06:41 |
*** maciejjozefczyk has joined #openstack-nova | 06:43 | |
gibi | good morning nova | 06:45 |
alex_xu | gibi: good morning | 06:46 |
*** tkajinam_ has joined #openstack-nova | 06:53 | |
*** tkajinam has quit IRC | 06:56 | |
*** ricolin has quit IRC | 06:59 | |
*** damien_r has joined #openstack-nova | 07:00 | |
*** ralonsoh has joined #openstack-nova | 07:00 | |
*** brault has joined #openstack-nova | 07:04 | |
openstackgerrit | hutianhao27 proposed openstack/nova master: Fix bug directory left after cold migration https://review.opendev.org/681652 | 07:05 |
*** ivve has joined #openstack-nova | 07:07 | |
*** spsurya has joined #openstack-nova | 07:07 | |
*** pcaruana has joined #openstack-nova | 07:11 | |
bauzas | good morning nova | 07:13 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Retrieve the allocations early https://review.opendev.org/678450 | 07:15 |
*** pierreprinetti has quit IRC | 07:15 | |
*** trident has quit IRC | 07:15 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Claim resources in resource tracker https://review.opendev.org/678452 | 07:15 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver discovering PMEM namespaces https://review.opendev.org/678453 | 07:15 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: report VPMEM resources by provider tree https://review.opendev.org/678454 | 07:15 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Support VM creation with vpmems and vpmems cleanup https://review.opendev.org/678455 | 07:15 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Parse vpmem related flavor extra spec https://review.opendev.org/678456 | 07:15 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver configuring PMEM namespaces https://review.opendev.org/679640 | 07:15 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Add functional tests for virtual persistent memory https://review.opendev.org/678470 | 07:15 |
*** awalende has joined #openstack-nova | 07:17 | |
*** ricolin has joined #openstack-nova | 07:18 | |
*** ralonsoh has quit IRC | 07:23 | |
*** ralonsoh has joined #openstack-nova | 07:24 | |
*** ralonsoh has quit IRC | 07:24 | |
*** trident has joined #openstack-nova | 07:24 | |
*** ralonsoh has joined #openstack-nova | 07:24 | |
*** trident has quit IRC | 07:28 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Filter migrations by user_id/project_id https://review.opendev.org/674243 | 07:30 |
*** ralonsoh has quit IRC | 07:32 | |
*** ralonsoh has joined #openstack-nova | 07:32 | |
*** avolkov has joined #openstack-nova | 07:36 | |
*** ttsiouts has joined #openstack-nova | 07:37 | |
*** trident has joined #openstack-nova | 07:37 | |
*** dolpher has quit IRC | 07:41 | |
*** hamzy has joined #openstack-nova | 07:42 | |
gibi | do we have a solution for the gate failure: ERROR: Could not find a version that satisfies the requirement configparser===4.0.1 | 07:50 |
gibi | ? | 07:50 |
alex_xu | gibi: I saw this https://review.opendev.org/#/c/681630/ | 07:50 |
alex_xu | so it's supposed to work now? | 07:51 |
gibi | alex_xu: thanks, then I go and recheclk | 07:52 |
gibi | recheck | 07:52 |
*** takashin has left #openstack-nova | 08:02 | |
*** shilpasd has joined #openstack-nova | 08:07 | |
*** mkrai has quit IRC | 08:08 | |
*** tkajinam_ has quit IRC | 08:09 | |
*** mkrai has joined #openstack-nova | 08:10 | |
*** cdent has joined #openstack-nova | 08:11 | |
*** shilpasd has quit IRC | 08:11 | |
*** hemna has joined #openstack-nova | 08:14 | |
*** markvoelker has joined #openstack-nova | 08:16 | |
*** threestrands has quit IRC | 08:16 | |
lyarwood | Have we cut M3 yet? I've been holding off for a while on posting bugfixes to avoid clogging up the gate. | 08:17 |
lyarwood | wait, the deadline is today, ignore me. | 08:18 |
gibi | lyarwood: yepp, today is still crazy FF day | 08:18 |
lyarwood | ack thanks I'll hold off in that case | 08:19 |
*** markvoelker has quit IRC | 08:20 | |
*** mdbooth has joined #openstack-nova | 08:20 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Make SRIOV computes non symmetric in func test https://review.opendev.org/681667 | 08:24 |
*** derekh has joined #openstack-nova | 08:28 | |
stephenfin | alex_xu: Looking | 08:34 |
stephenfin | alex_xu: That looks correct. I started work on the fix for that yesterday but haven't finished. It only happens if you have two compute nodes though, right? | 08:35 |
alex_xu | stephenfin: it can happen with 100 nodes, if all 100 nodes are returned as placement allocation candidates but fail in later filtering | 08:36 |
stephenfin | both cases? | 08:37 |
alex_xu | stephenfin: yes, see the line 25 | 08:37 |
alex_xu | stephenfin: I think the root cause is that | 08:37 |
stephenfin | but it'll only happen if we attempt to use the same host | 08:38 |
alex_xu | I don't think I figured out the root cause yesterday; it isn't just about resize | 08:38 |
alex_xu | what about case1 | 08:38 |
alex_xu | stephenfin: I think it's any case where placement can return candidates but the nova scheduler filters refuse the request. | 08:39 |
alex_xu | then we won't fall back | 08:39 |
stephenfin | true | 08:41 |
stephenfin | hmm | 08:41 |
stephenfin | I'm still tempted to think it's not a huge issue. You'd need an awful lot to go against you | 08:42 |
alex_xu | stephenfin: but we need to think about whether we can hit such a worst case, like 100 nodes failing at the filters rather than in placement | 08:42 |
stephenfin | Namely that you're in the middle of an upgrade and you've got a lot of failures in the compute node | 08:42 |
alex_xu | I guess it will be bad at the very beginning of an upgrade, like when you only have 5 or 10 nodes in T | 08:42 |
stephenfin | I'd be tempted to just add a workaround option | 08:42 |
stephenfin | That, when enabled, will trigger "request VCPU by default" | 08:43 |
stephenfin | so if someone ran into this in the middle of an upgrade, they'd set the config option and we wouldn't request PCPU until they've a few more compute nodes upgraded | 08:43 |
stephenfin | It might be overkill thoughj | 08:43 |
stephenfin | *though | 08:43 |
alex_xu | ok, the operator can switch that when he is confident he has enough PCPU nodes | 08:43 |
stephenfin | yeah | 08:43 |
*** sapd1 has quit IRC | 08:44 | |
alex_xu | stephenfin: I guess someone may say we're adding more work for the operator | 08:44 |
alex_xu | stephenfin: how difficult would it be to fall back across the whole placement+filtering | 08:44 |
stephenfin | I'm looking now | 08:44 |
*** jaosorior has quit IRC | 08:45 | |
alex_xu | I'm thinking we don't want to fall back all the time; how do we check the condition when the filters fail | 08:46 |
*** hemna has quit IRC | 08:47 | |
*** IvensZambrano has joined #openstack-nova | 08:48 | |
stephenfin | alex_xu: Okay, I think it should be possible. Leave it with me :) | 08:48 |
alex_xu | stephenfin: so cool :) | 08:49 |
bauzas | alex_xu: stephenfin: I looked at your convo, can you please summarize for me why the filter would differ from placement? | 08:50 |
alex_xu | bauzas: here are two cases I found https://etherpad.openstack.org/p/pcpu_fallback | 08:50 |
alex_xu | maybe more | 08:50 |
stephenfin | bauzas: We try to get PCPU from placement, and if that returns nothing we try for VCPU | 08:51 |
stephenfin | It's to minimize the upgrade impact | 08:51 |
stephenfin | However, if the first request does return some stuff, but those things fail the filters, we don't do the fallback for VCPU | 08:51 |
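A minimal sketch of the gap stephenfin describes, not nova's actual code. `query_placement`, `schedule`, the `inventory` dict and the host names are all hypothetical stand-ins: the fallback decision is made before the filters run, so a PCPU candidate that is later rejected never triggers the VCPU query.

```python
def query_placement(resource_class, inventory):
    """Hypothetical stand-in for GET /allocation_candidates.

    Returns the hosts that report inventory of the given resource class.
    """
    return [host for host, classes in inventory.items()
            if resource_class in classes]


def schedule(inventory, passes_filters):
    # Ask for PCPU first; fall back to VCPU only if placement returns nothing.
    candidates = query_placement("PCPU", inventory)
    if not candidates:
        candidates = query_placement("VCPU", inventory)
    # The filters run once, after the fallback decision has already been made.
    return [host for host in candidates if passes_filters(host)]


# One upgraded host reporting PCPU and one not-yet-upgraded host reporting VCPU.
inventory = {"new-host": {"PCPU"}, "old-host": {"VCPU"}}

# Placement returns new-host, a filter rejects it, and old-host is never tried.
selected = schedule(inventory, passes_filters=lambda host: host != "new-host")
print(selected)
```

In this toy setup the result is empty even though `old-host` could have taken the instance, which is the "no fallback after filter failure" case from the etherpad.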
bauzas | well, I still don't get it | 08:51 |
bauzas | that's surely me, but if the operator said "okay, let's go with PCPU", that's because he opted-in, right? | 08:52 |
stephenfin | they've opted in by setting '[compute] cpu_dedicated_set' on some compute nodes, yes | 08:52 |
bauzas | but we said that it wouldn't provide PCPU inventories until all computes are opted in, right? | 08:53 |
* bauzas looks back at the spec | 08:53 | |
alex_xu | bauzas: no, the upgrade plan is changed | 08:53 |
bauzas | actually, we send new inventories after opting in with the conf opt, you're right | 08:53 |
bauzas | I remember +2ing this | 08:53 |
bauzas | alex_xu: stephenfin: honestly, I feel it's hard to manage both at the same time, and I voiced it a couple of times in the spec review | 08:54 |
alex_xu | bauzas: here is the thing that manages both at the same time https://review.opendev.org/#/c/671801/46/nova/scheduler/manager.py@174 | 08:55 |
*** priteau has joined #openstack-nova | 08:56 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Retrieve the allocations early https://review.opendev.org/678450 | 08:57 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Claim resources in resource tracker https://review.opendev.org/678452 | 08:57 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver discovering PMEM namespaces https://review.opendev.org/678453 | 08:57 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: report VPMEM resources by provider tree https://review.opendev.org/678454 | 08:57 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Support VM creation with vpmems and vpmems cleanup https://review.opendev.org/678455 | 08:57 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Parse vpmem related flavor extra spec https://review.opendev.org/678456 | 08:57 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver configuring PMEM namespaces https://review.opendev.org/679640 | 08:57 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Add functional tests for virtual persistent memory https://review.opendev.org/678470 | 08:57 |
stephenfin | bauzas: I'm not sure what else there is to do though. We have to do this, because no one has an alternative | 08:57 |
stephenfin | and we need to get to a future with PCPU. We can't just keep kicking the can down the road :( | 08:57 |
bauzas | I don't disagree with you | 08:57 |
bauzas | I'm just trying to make sure we don't shoot ourselves in the feet | 08:58 |
bauzas | that whole upgrade plan gave me brainaches when I was reviewing the spec | 08:58 |
stephenfin | oh, I've been working plenty hard to make sure we don't do that :) | 08:58 |
stephenfin | changing the world (as far as placement is concerned) is hard | 08:58 |
stephenfin | as is getting things through the gate right now, it seems :D | 08:59 |
stephenfin | so many false negatives... | 08:59 |
*** efried has joined #openstack-nova | 09:00 | |
efried | o/ nova | 09:00 |
*** ociuhandu has joined #openstack-nova | 09:00 | |
bauzas | ... | 09:00 |
efried | The gate appears to be well and truly f'ed. | 09:00 |
stephenfin | Yes. Yes it does. | 09:00 |
efried | There was a configparser thing, which may be fixed? | 09:01 |
efried | But I haven't seen any successful runs even since that happened | 09:01 |
stephenfin | I'm not sure. Takashi-san (ignorance time: should you use the first name or second?) has been furiously rechecking some Mox -> mock and it's failed every time with a different error | 09:02 |
bauzas | lemme look at logstash | 09:02 |
bauzas | any bug tracking this ? | 09:02 |
bauzas | any exception snippet I could use ? | 09:02 |
stephenfin | bauzas: Not really. It's a simple dependency issue https://zuul.opendev.org/t/openstack/build/f4c728cbb77e48b7a149ea952d2ca2ec/log/job-output.txt#862 | 09:03 |
bauzas | that's enough for looking at occurrences | 09:03 |
*** udesale has quit IRC | 09:04 | |
* bauzas is a bit rusty with logstash but this should be doable | 09:04 | |
stephenfin | it was hitting anything using Python 2.7. Here's the functional test from the same change (the other one was the py27 unit test) https://zuul.opendev.org/t/openstack/build/f862fb763ee54e96b0a844eaf6f00b57/log/job-output.txt#944 | 09:04 |
stephenfin | I guess they dropped support for Python 2 too \o/ | 09:04 |
*** ociuhandu has quit IRC | 09:04 | |
*** udesale has joined #openstack-nova | 09:07 | |
luyao | stephenfin: I see you may be busy with your own patch; I'd very much appreciate it if you could look at the vpmems series again when you get time. The NUMA issue is fixed and CI for vpmems works well; the tests passed on the last two patches. | 09:08 |
stephenfin | luyao: Yup, don't worry, I'll get to it before the end of my day | 09:09 |
stephenfin | Should be a straightforward +2 at this point, I'd imagine | 09:09 |
luyao | stephenfin: many thanks! :D | 09:09 |
bauzas | http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22ERROR%3A%20Could%20not%20find%20a%20version%20that%20satisfies%20the%20requirement%20configparser%5C%22 | 09:09 |
bauzas | looks like it's solved ^ | 09:09 |
efried | stephenfin: meanwhile, what's the status of cpu-resources? Ready for a re-look? | 09:09 |
stephenfin | efried: one patch, yes, but alex_xu has some concerns around the other patch | 09:10 |
efried | bauzas: are you savvy on the quota issue or do we need to wait for Dan? | 09:10 |
alex_xu | efried: yes, the only https://review.opendev.org/#/c/671801/, I +2 all other patches | 09:10 |
stephenfin | efried: In short, the way we're doing the retry is to only retry if placement doesn't provide any matches | 09:10 |
stephenfin | However, placement could return some matches and the NUMATopologyFilter (or other filters) could reject those | 09:11 |
efried | ugh | 09:11 |
stephenfin | So alex_xu is suggesting that we should put both the request to placement and the request to the filter scheduler inside the try-try again logic | 09:11 |
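A rough sketch of what that suggestion could look like, under the assumption that the placement query and the filtering can be retried together. `query_placement`, `schedule_with_retry` and the host names are hypothetical and not taken from the patch under review.

```python
def query_placement(resource_class, inventory):
    """Hypothetical stand-in for GET /allocation_candidates."""
    return [host for host, classes in inventory.items()
            if resource_class in classes]


def schedule_with_retry(inventory, passes_filters):
    # Run the placement query *and* the filters for PCPU first; only if that
    # whole attempt yields no host, repeat both steps for VCPU.
    for resource_class in ("PCPU", "VCPU"):
        candidates = query_placement(resource_class, inventory)
        selected = [host for host in candidates if passes_filters(host)]
        if selected:
            return selected
    return []


inventory = {"new-host": {"PCPU"}, "old-host": {"VCPU"}}

# new-host is rejected by a filter, so the second attempt picks up old-host.
selected = schedule_with_retry(
    inventory, passes_filters=lambda host: host != "new-host")
print(selected)
```

The trade-off is that a failed PCPU attempt now costs a second placement query and a second filter pass, which is the expense discussed later in the channel.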
bauzas | efried: well, I'm not a quota specialist, but looks like melwitt's concerns were addressed | 09:11 |
bauzas | efried: I can take a look on it | 09:11 |
alex_xu | efried: may help you understand the problem https://etherpad.openstack.org/p/pcpu_fallback | 09:12 |
efried | stephenfin: or what we could do is just do both GET /a_c queries, cat the vcpu ones after the pcpu ones, and run the filter once. | 09:12 |
bauzas | we are technically sharding the capacity, right? | 09:12 |
stephenfin | efried: I thought of that but doesn't the filter scheduler shuffle the allocation candidates? | 09:13 |
*** ralonsoh has quit IRC | 09:13 | |
efried | placement does the shuffle | 09:13 |
stephenfin | ohhhh | 09:13 |
efried | I don't know if the filter scheduler also does | 09:13 |
*** FlorianFa has quit IRC | 09:13 | |
efried | stephenfin: this scenario only applies if PCPUs were requested, right? | 09:14 |
efried | Meaning if you have the granny switch flipped *and* you're on a flavor that's requesting dedicated? | 09:14 |
efried | Doing a double GET /a_c would be expensive, so it would be nice if we were only doing it in a corner case. | 09:15 |
*** hemna has joined #openstack-nova | 09:15 | |
*** lpetrut has joined #openstack-nova | 09:16 | |
*** rcernin has quit IRC | 09:16 | |
gibi | efried: if there is more than one candidate for a single host then the scheduler will only look for the first candidate (per host) | 09:18 |
*** ociuhandu has joined #openstack-nova | 09:19 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Support reverting migration / resize with bandwidth https://review.opendev.org/676140 | 09:19 |
gibi | efried: that will complicate my work a bit ^^ | 09:20 |
*** shilpasd has joined #openstack-nova | 09:20 | |
efried | gibi: just a rebase, shouldn't affect anything | 09:21 |
*** ociuhandu has quit IRC | 09:21 | |
*** ociuhandu has joined #openstack-nova | 09:21 | |
efried | gibi: only one candidate per host -- yeah, that kills it (<== stephenfin) | 09:21 |
gibi | efried: I have ongoing work in that series locally, so I have to restach top of the rebase | 09:23 |
gibi | efried: I will manage | 09:23 |
efried | oh shit, I'm sorry gibi | 09:23 |
luyao | sean-k-mooney: Hi Sean, are you around. | 09:23 |
gibi | efried: no worries, I think I can do some interactive rebase magic | 09:23 |
efried | gibi: I didn't rebase anything except the first patch | 09:23 |
*** tbachman has quit IRC | 09:23 | |
efried | so you may just be able to proceed as you were | 09:23 |
*** lpetrut has quit IRC | 09:23 | |
gibi | efried: ack. I will rebase my work on top of that | 09:24 |
efried | I was just trying to do what little I could to get the gate moving | 09:24 |
gibi | efried: sure, I understand your motives | 09:24 |
efried | that patch had already failed py37 on the f'in innodb thing | 09:24 |
stephenfin | efried, gibi: That shouldn't matter, should it? | 09:24 |
*** lpetrut has joined #openstack-nova | 09:24 | |
stephenfin | I mean, if a host is reporting PCPU then by definition we shouldn't try using it for VCPU | 09:25 |
efried | stephenfin: It makes it a very imperfect solution, but I guess it still narrows the window | 09:25 |
efried | oh | 09:25 |
gibi | efried: no worries. I guess I overreacted | 09:25 |
*** tbachman has joined #openstack-nova | 09:25 | |
efried | that's a good point stephenfin... I think | 09:25 |
stephenfin | the crucial thing is does the filter scheduler shuffle hosts? | 09:25 |
stephenfin | *shuffle allocation candidates | 09:25 |
gibi | stephenfin: I see. So either there will be more than one candidate, and then we are OK with the first, or there will only be a candidate with VCPU, and then we fall back to that | 09:25 |
*** mkrai has quit IRC | 09:26 | |
gibi | stephenfin: there is some processing involved let me find a link | 09:26 |
efried | gibi: actually, I'm not sure if you had hoped to merge more in that series than is already +Wed, but it might be best to wait until after FF to do *anything* else. Esp since gate resources are going to be really scarce for the foreseeable future. | 09:26 |
gibi | efried: Matt and I have hopes for the series. He said he might be able to +2 the rest today. But I accept the bad news if the gate is sad | 09:27 |
*** shilpasd has quit IRC | 09:27 | |
stephenfin | gibi: I think so. Either we'll return a candidate for host from the PCPU query (host has 'cpu_dedicated_set' configured)... | 09:27 |
stephenfin | or we'll return a candidate for a host from the VCPU query (host has 'vcpu_pin_set' or 'cpu_shared_set' configured or nothing)... | 09:28 |
gibi | stephenfin: https://github.com/openstack/nova/blob/5fa49cd0b8b6015aa61b4312b2ce1ae780c42c64/nova/scheduler/manager.py#L176 | 09:28 |
stephenfin | or we'll return the same candidate for both (host has 'cpu_dedicated_set' *and* 'cpu_shared_set') but ignore the one from the VCPU request | 09:28 |
gibi | stephenfin: sound OK to me | 09:28 |
stephenfin | because the PCPU query was done first | 09:28 |
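The three cases above can be sketched as follows. This is an illustration only: `merge_candidates`, the `(host, request)` tuples and the resource strings are hypothetical, and the real scheduler works on placement's allocation request structures rather than plain strings. The sketch assumes, as gibi notes, that only the first candidate per host is considered.

```python
def merge_candidates(pcpu_candidates, vcpu_candidates):
    """Concatenate PCPU results before VCPU results and keep only the
    first candidate seen for each host."""
    first_per_host = {}
    for host, request in pcpu_candidates + vcpu_candidates:
        # setdefault keeps the existing entry, so a PCPU candidate wins
        # over a later VCPU candidate for the same host.
        first_per_host.setdefault(host, request)
    return first_per_host


# host-a: only 'cpu_dedicated_set' configured -> PCPU query only
# host-b: 'vcpu_pin_set', 'cpu_shared_set' or nothing -> VCPU query only
# host-c: both 'cpu_dedicated_set' and 'cpu_shared_set' -> both queries
pcpu = [("host-a", "PCPU:2"), ("host-c", "PCPU:2")]
vcpu = [("host-b", "VCPU:2"), ("host-c", "VCPU:2")]

merged = merge_candidates(pcpu, vcpu)
print(merged)
```

For `host-c` the VCPU candidate is dropped because the PCPU result was added first, matching the "ignore the one from the VCPU request" case.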
stephenfin | schweet | 09:28 |
gibi | stephenfin: but look at the link. We build dicts :/ | 09:28 |
* stephenfin clicks | 09:28 | |
bauzas | stephenfin: gibi: efried: alex_xu: seriously, I'm still continuing to consider we overthink the problem | 09:29 |
gibi | stephenfin: hm, that dict is not a problem; the allocation requests are still in a list | 09:29 |
efried | gibi: but only per host | 09:30 |
bauzas | I mean, once the operator opts in, he literally shards his cloud in two | 09:30 |
gibi | efried: but the fallback is needed per host, isn't it? | 09:30 |
bauzas | if we say the opt-in is per compute, that means the sharding can be long | 09:30 |
efried | gibi: actually, maybe so. | 09:30 |
bauzas | that's why the more I see our convos, the more I think it would be best to just consider a global option that would dictate all computes sending PCPU inventories | 09:31 |
gibi | stephenfin: we take the first ar per host https://github.com/openstack/nova/blob/5fa49cd0b8b6015aa61b4312b2ce1ae780c42c64/nova/scheduler/filter_scheduler.py#L238 | 09:31 |
efried | gibi, stephenfin: as long as it's okay if we try has-only-VCPU candidates before we try a has-PCPU candidate | 09:31 |
stephenfin | efried: It's not an issue. As noted, the NUMATopologyFilter or virt driver will fail that request | 09:31 |
efried | ...which should still be fine, because the ntf will remove... yeah | 09:31 |
efried | so again, stephenfin, under what circumstances are we worried about this? | 09:32 |
stephenfin | bauzas: We sort of have that - the compute nodes won't send PCPU inventory until 'cpu_dedicated_set' is set | 09:32 |
bauzas | stephenfin: I know, my opinion is just "doc it, dude" | 09:32 |
efried | If it's only in the "sharded" case as bauzas says, then I think it may be acceptable to take the double GET /a_c hit. | 09:33 |
stephenfin | efried: The current solution (where we only try the second 'GET /a_c' if the first doesn't return anything) or the modified one we're suggesting here (where we always make two requests and concat them) ? | 09:33 |
efried | but if it can happen in a fully-pre or fully-post cloud, we should think it through more carefully. | 09:33 |
cdent | in train two a_c hits is pretty cheap | 09:34 |
bauzas | adding more complexity into an abstract API that would add special-cases in order to fix a very temporary situation is IMHO like using a big hammer | 09:34 |
efried | I'm talking about doing double GET /a_c preemptively. | 09:34 |
efried | cdent: even for CERN? | 09:34 |
cdent | yeah | 09:34 |
cdent | <1 sec per | 09:34 |
stephenfin | oh, then I can't think there are huge issues outside of always making an extra request to placement | 09:34 |
efried | and it's not just the API calls; we'd be doubling the amount of memory we're using to store the candidates | 09:34 |
stephenfin | which cdent is saying isn't a huge issue | 09:35 |
efried | I suppose we could cut the ?limit in half... but I don't know if that's a good idea. | 09:35 |
stephenfin | and only happens if we're using pinning | 09:35 |
stephenfin | efried: alex_xu had suggested a config option to disable the double lookup | 09:35 |
efried | that's also a good idea | 09:35 |
stephenfin | or rather, disable the fallback | 09:35 |
efried | workaround | 09:35 |
stephenfin | workaround option, killed in the next release | 09:35 |
stephenfin | yeah | 09:36 |
cdent | (the perf issue for a double a_c is the nova-side processing of the results, not the placement-side generation) | 09:36 |
stephenfin | just for CERN and other people with huge clouds who might run into this issue | 09:36 |
efried | so: workaround disabled, you might get nvh when you're out of "real" PCPUs; workaround enabled, you're taking an extra perf/mem hit (small though it may be) any time we... what? | 09:36 |
efried | ...detect that there are any PCPUs in the cloud? | 09:37 |
efried | still trying to understand when we would do the double thing | 09:37 |
*** ricolin has quit IRC | 09:37 | |
stephenfin | anytime you try to create or move a pinned instance | 09:37 |
efried | mm | 09:37 |
stephenfin | for one cycle | 09:38 |
efried | mm | 09:38 |
bauzas | yeah, that's what I'm trying to say : dudes, if you really want to support sharded clouds, it will come with a performance penalty | 09:39 |
bauzas | in this case, make it super temporary and super clear that this is a workaround | 09:39 |
efried | bauzas: problem is, it's not just sharded that you get hit on | 09:40 |
efried | in this case if you fully upgrade, you still get hit | 09:40 |
efried | but | 09:40 |
efried | in that case you should disable the workaround | 09:40 |
efried | because you're done | 09:40 |
efried | and don't need it. | 09:40 |
efried | so | 09:40 |
bauzas | that will leave the operator a choice between a quick rolling upgrade (and no performance hit) or a slow rolling upgrade with some performance degradation he could estimate | 09:40 |
efried | I think this is acceptable. | 09:40 |
bauzas | efried: not really, you opt-in to a sharded world once you use a conf opt | 09:40 |
bauzas | efried: I'm talking of the opt usage rolling update (sorry indeed) | 09:41 |
bauzas | ie. upgrade your cloud to Train | 09:41 |
bauzas | take your time | 09:41 |
bauzas | once all computes are done, you have two choices | 09:41 |
bauzas | A/ you have some puppetry that can change options on the fly | 09:41 |
bauzas | in this case, you don't really need to use some workaround that gives you performance hit | 09:42 |
efried | stephenfin: propose double query config on or off by default? | 09:42 |
bauzas | do the option update at some time | 09:42 |
bauzas | B/ you don't like CMSes or you prefer updating it smoothly for business reasons | 09:42 |
bauzas | then, you sign-off for a performance hit | 09:43 |
bauzas | but you'll be sure everything will continue to work | 09:43 |
bauzas | this sounds reasonable to me | 09:43 |
stephenfin | efried: the previous approach was a boolean - request PCPU or request VCPU - and it defaulted to requesting VCPU to not break upgrades | 09:43 |
bauzas | that said, I trust cdent | 09:43 |
stephenfin | I think the same thing applies here | 09:43 |
* cdent does his best evil laugh | 09:44 | |
bauzas | if querying a_c isn't really a performance issue, it's just a matter of providing a temporary codepath | 09:44 |
efried | stephenfin: meaning we want to err on the side of getting hosts, even if it hurts, so default to double query? | 09:44 |
cdent | as I said, the performance concern is nova-side processing, which can be tested | 09:44 |
stephenfin | we need to default to double requests otherwise everyone will upgrade, not realize they need to set 'cpu_dedicated_set' on compute nodes, try to boot a pinned instance and fail miserably | 09:44 |
stephenfin | efried: yup | 09:44 |
efried | ack | 09:44 |
stephenfin | I'll work with TripleO at a minimum to make sure that workaround option is enabled on new deployments | 09:45 |
stephenfin | Can do the same from OSA and kolla too | 09:45 |
stephenfin | (though I'm not sure Kolla supports configuring hosts for pinned instances etc.) | 09:45 |
* efried suspects dan is going to shit a brick | 09:45 | |
stephenfin | we've got three hours or so before he's up - should be loads of time to shove everything through | 09:46 |
* stephenfin cackles | 09:46 | |
efried | yeah, except the part where the gate delay is already past 3h | 09:46 |
*** ralonsoh has joined #openstack-nova | 09:46 | |
stephenfin | :( | 09:46 |
efried | 3h13m | 09:46 |
efried | and counting | 09:46 |
stephenfin | I saw stuff in the queue for 9 hours last night | 09:46 |
stephenfin | though I may have been misreading things | 09:46 |
bauzas | stephenfin: wait a sec | 09:47 |
bauzas | the workaround necessarily has to be enabled *by default* if you don't wanna break users | 09:47 |
efried | bauzas: "enabled" meaning "double" yah? | 09:47 |
efried | I think that's what we're suggesting. | 09:47 |
efried | stephenfin said he's going to work with the deployment projects to make sure they disable it explicitly for *new* deployments | 09:48 |
stephenfin | bauzas: what efried said | 09:49 |
*** udesale has quit IRC | 09:49 | |
stephenfin | for anyone that knows TripleO, that's what we were planning to do with the previous config option too https://review.opendev.org/#/c/681207/ | 09:50 |
stephenfin | so it can be done | 09:50 |
*** udesale has joined #openstack-nova | 09:50 | |
*** hemna has quit IRC | 09:50 | |
efried | are we ready to unblock the bottom of cpu-resources and/or vpmem at this point? Get them queued up... | 09:50 |
efried | ...does this issue warrant continuing to hold cpu-resources until ready? | 09:51 |
stephenfin | I'm biased, but I _think_ we're good | 09:56 |
luyao | efried: for vpmems, I think I'm ready, but not sure about stephenfin and sean-k-mooney; still need sean-k-mooney to confirm the xml again | 09:57 |
stephenfin | The quota issue and the switch over to the try-try again logic were the blockers. The former's resolved and the latter just needs a tweak | 09:57 |
stephenfin | I've promised luyao I'll sign off on vPMEM today too, so that's pretty much good to go too | 09:57 |
efried | Okay. I'll unblock cpu-resources and start +Aing. Plenty of time to yank them out of the gate if something goes pear-shaped. | 09:58 |
efried | And I'll unblock vpmem, ready for your +A stephenfin | 09:58 |
stephenfin | Cool. Lemme finish reworking this patch and I'll hit that | 09:59 |
stephenfin | efried, alex_xu, bauzas: bikeshed time - what should the workaround option be called? | 10:01 |
stephenfin | it's disabling the second request for VCPUs if the instance is pinned | 10:02 |
stephenfin | disable_second_request_for_pinned_instances sux | 10:02 |
efried | [workarounds]try_really_hard_to_get_allocation_candidates_for_pinned_instances_but_suffer_a_performance_hit | 10:02 |
bauzas | sorry folks, was AFK | 10:02 |
efried | oh, right, reverse of that | 10:02 |
stephenfin | *do_not_try_... | 10:02 |
stephenfin | yup | 10:02 |
efried | yeah | 10:02 |
* bauzas suddenly remembered he was a dad who left his daughters at school | 10:03 | |
efried | hah, I did that the other day | 10:03 |
bauzas | you don't imagine how a 9yo can yell at you :p | 10:03 |
efried | my kid: "why are you so late, dad?" | 10:03 |
efried | me: "because I suck" | 10:03 |
bauzas | yeah, life of a remotee | 10:04 |
bauzas | anyway | 10:04 |
bauzas | for the workaround name, meh | 10:04 |
bauzas | just provide a clear doc message, that's it | 10:04 |
stephenfin | disable_fallback_pcpu_query | 10:04 |
stephenfin | I'm going with that | 10:04 |
efried | ++ | 10:04 |
bauzas | cool | 10:04 |
bauzas | it could be "brexit" | 10:05 |
* bauzas runs off | 10:05 | |
cdent | pcpu_backstop | 10:06 |
bauzas | this ^ | 10:06 |
efried | gibi, bauzas, sean-k-mooney: if one of you could ack https://review.opendev.org/#/c/671800/34 at some point soon... | 10:06 |
alex_xu | stephenfin: +1 disable_fallback_pcpu_query | 10:06 |
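The behaviour agreed above (always query for PCPU first, then fall back to a VCPU query unless the workaround option turns the fallback off) can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: `get_allocation_candidates`, `fake_query`, and the host names are hypothetical, and `query` merely stands in for a call to GET /allocation_candidates.

```python
def get_allocation_candidates(query, pinned, disable_fallback_pcpu_query=False):
    """Return candidates for an instance, PCPU-based results first.

    `query` is a hypothetical callable that takes a resource class name
    and returns a list of candidates.
    """
    if not pinned:
        # Unpinned instances only ever need the VCPU query.
        return list(query("VCPU"))
    candidates = list(query("PCPU"))
    if not disable_fallback_pcpu_query:
        # Fallback query, intended for hosts that do not report PCPU
        # inventory yet (for example during a rolling upgrade).
        candidates.extend(query("VCPU"))
    return candidates


def fake_query(resource_class):
    """Hypothetical placement stub: one upgraded host, one legacy host."""
    data = {"PCPU": ["new-host"], "VCPU": ["old-host"]}
    return data[resource_class]


with_fallback = get_allocation_candidates(fake_query, pinned=True)
without_fallback = get_allocation_candidates(
    fake_query, pinned=True, disable_fallback_pcpu_query=True)
unpinned = get_allocation_candidates(fake_query, pinned=False)
```

With the option left at its default the legacy host still shows up as a candidate; setting it drops the second query and its extra processing cost.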
bauzas | efried: I was previously +2 | 10:07 |
bauzas | efried: but lemme look | 10:07 |
efried | bauzas: I see a +1 at PS33 | 10:07 |
bauzas | my bad | 10:07 |
efried | but yes, please +2+W if you're satisfied | 10:07 |
efried | this is one I wasn't comfortable +2ing myself. | 10:08 |
* bauzas just does at the moment https://review.opendev.org/#/c/671800/33..34 | 10:08 | |
efried | hah, don't do that :P | 10:08 |
*** dtantsur|afk is now known as dtantsur | 10:08 | |
bauzas | oh heh, just discovered https://en.wiktionary.org/wiki/Bauza#English | 10:09 |
bauzas | \o/ | 10:09 |
*** mkrai has joined #openstack-nova | 10:11 | |
bauzas | efried: https://sbauza.wordpress.com/2014/11/14/how-to-compare-2-patchsets-in-gerrit/ | 10:12 |
bauzas | (FWIW) | 10:12 |
efried | You're 26887th in the US | 10:12 |
*** ttsiouts has quit IRC | 10:13 | |
* bauzas doesn't know he lives in the US | 10:13 | |
bauzas | did the president already pay for France? | 10:13 |
efried | Yeah, if you moved here there would be 90*4* of you | 10:13 |
stephenfin | I never learned how to use vimdiff | 10:13 |
*** ttsiouts has joined #openstack-nova | 10:13 | |
efried | afaict the diff from PS33 is negligible, just that one reinstated assertion in two tests | 10:13 |
bauzas | that's what I saw | 10:14 |
bauzas | +Wipped | 10:14 |
bauzas | stephenfin: you don't really need vimdiff | 10:14 |
efried | thanks bauzas | 10:14 |
efried | now, who's going to +A the quota one? | 10:14 |
alex_xu | by finger-guessing game? | 10:15 |
*** lpetrut has quit IRC | 10:17 | |
*** ttsiouts has quit IRC | 10:17 | |
*** hemna has joined #openstack-nova | 10:19 | |
*** lpetrut has joined #openstack-nova | 10:19 | |
*** tbachman has quit IRC | 10:21 | |
*** markvoelker has joined #openstack-nova | 10:26 | |
*** panda is now known as panda|ruck | 10:28 | |
*** mkrai has quit IRC | 10:28 | |
*** markvoelker has quit IRC | 10:32 | |
*** jaosorior has joined #openstack-nova | 10:32 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: objects: use all_things_equal from objects.base https://review.opendev.org/681397 | 10:36 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Func test for migrate re-schedule with bandwidth https://review.opendev.org/676972 | 10:45 |
efried | gibi: https://review.opendev.org/#/c/680810/ is failing, can I pull it out of the gate? | 10:46 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Make SRIOV computes non symmetric in func test https://review.opendev.org/681667 | 10:48 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Support migrating SRIOV port with bandwidth https://review.opendev.org/676980 | 10:50 |
gibi | efried: sure | 10:52 |
*** hemna has quit IRC | 10:52 | |
gibi | efried: what is the way to get it out of the gate? | 10:53 |
gibi | efried: rebase? That patch can be rebased freely as it is not part of the series. | 10:54 |
efried | yes. cool, thanks. | 10:54 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Follow up for I220fa02ee916728e241503084b14984bab4b0c3b https://review.opendev.org/680810 | 10:54 |
efried | rebase stops the current run and shoves it to the back of the queue | 10:54 |
gibi | rebased. | 10:54 |
gibi | is there any other way that does not involve changing the review itself? | 10:55 |
efried | thanks, re+W'd | 10:55 |
efried | gibi: no | 10:55 |
efried | not that I'm aware of | 10:55 |
gibi | we will run out of rebase targets if the master is not moving forward :) | 10:55 |
efried | yeah, I know :( | 10:55 |
efried | but for now this is the best we can do. | 10:55 |
gibi | ack | 10:55 |
*** sapd1_x has joined #openstack-nova | 10:56 | |
aspiers | kashyap, efried: seems hw_firmware_type is missing from glance/etc/metadefs | 10:56 |
aspiers | and so is hw_mem_encryption | 10:56 |
efried | what does that mean? | 10:57 |
aspiers | it means they are not shown as nice user-friendly options in Horizon, for a start | 10:57 |
aspiers | not sure if there are other implications | 10:57 |
efried | to refine the question: Is this going to affect a) gate, b) vpmem/cpu-resources series? | 10:57 |
aspiers | I don't think it affects nova at all | 10:58 |
aspiers | We probably would have noticed by now if it did | 10:58 |
efried | then I humbly request the issue be raised *after* this week | 10:58 |
aspiers | I think it's just for registering metadata as "officially blessed" rather than some arbitrary key/value pair | 10:58 |
gibi | bauzas, efried: lost a +W in a rebase https://review.opendev.org/#/c/676972 could you please add it back? | 10:58 |
aspiers | efried: haha sure, it can't be urgent :) | 10:58 |
aspiers | efried: but probably trivial to fix too, and anyway it's a matter of submitting reviews to glance not nova | 10:59 |
efried | gibi: done, but you should feel free to re+W in these circumstances | 10:59 |
efried | aspiers: cool | 10:59 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Allow migrating server with port resource request https://review.opendev.org/671497 | 10:59 |
gibi | efried: ack, will do | 11:00 |
sean-k-mooney | efried: i see bauzas ack'd https://review.opendev.org/#/c/671800/34 but ill take a look now too | 11:03 |
efried | sean-k-mooney: if you like. Though actually I would rather you re-looked at that one vpmem patch... | 11:04 |
efried | sean-k-mooney: this un https://review.opendev.org/#/c/678455/36 | 11:04 |
sean-k-mooney | efried: sure | 11:04 |
efried | thank you | 11:04 |
sean-k-mooney | looking at the cpu series, have they all been approved? | 11:04 |
efried | mostly | 11:04 |
efried | we're still working on a couple at the top | 11:04 |
efried | remember that thing about trying a second GET /a_c if the first one returned no results? | 11:05 |
sean-k-mooney | ok so we are waiting on the gate for most of them and fixing the last few patches | 11:05 |
sean-k-mooney | efried: ya | 11:05 |
sean-k-mooney | efried: is it causing issues | 11:05 |
efried | I'll save you reading the backscroll, but we're doing something different now: we're going to do both GET /a_c calls up front (with a workaround conf opt) | 11:05 |
efried | Stephen is working on that now | 11:06 |
sean-k-mooney | ok | 11:06 |
sean-k-mooney | what is the advantage of that | 11:06 |
sean-k-mooney | are we doing them in parallel or something | 11:06 |
efried | with the cascaded approach, if we did the first one, got results, then sent them down to the filter and the filter rejected all of them, we'd be hosed. | 11:06 |
efried | So one option was to make that retry "wider" | 11:07 |
efried | which would be tough to code | 11:07 |
artom | efried, has that configparser thing been fixed? I see you rechecked the NUMA LM patches | 11:07 |
sean-k-mooney | ok i see | 11:07 |
efried | artom: yes | 11:07 |
artom | Sweet | 11:07 |
efried | artom: carefully curating the gate atm | 11:07 |
artom | Yep, thanks for that | 11:08 |
efried | sean-k-mooney: no, we have to do them sequentially, because we always want to try the pcpu ones first for a given host. | 11:08 |
*** tbachman has joined #openstack-nova | 11:08 | |
efried | but we do one right after the other | 11:08 |
efried | and collect all the candidates | 11:08 |
efried | then send them all down to the filter. | 11:08 |
sean-k-mooney | right | 11:08 |
sean-k-mooney | i assume we only add a host from the second query if it's not in the first? | 11:09 |
sean-k-mooney | or do we just let the filter handle that | 11:09 |
efried | I don't think we're being that careful about it | 11:09 |
sean-k-mooney | actually it does not matter | 11:09 |
efried | right | 11:09 |
sean-k-mooney | the filters work on hosts, not allocation candidates | 11:10 |
sean-k-mooney | so it will pass the host if it fits and fail it if it does not | 11:11 |
sean-k-mooney | and the host will only support one of the two with pinning | 11:11 |
sean-k-mooney | a host can have VCPUs and PCPUs but if it returns a PCPU allocation then that is what we will need to use | 11:12 |
sean-k-mooney | efried: by the way this https://review.opendev.org/#/c/679656/ is the numa live migration job. it's not a priority today but i would like to merge that before RC1 to get it running nightly once artom's code is fully merged. | 11:14 |
sean-k-mooney | it adds the nova-nfv-multi-numa-multinode job | 11:15 |
efried | sean-k-mooney: ack, please ask about it next week | 11:15 |
sean-k-mooney | yep will do | 11:15 |
efried | sean-k-mooney: https://review.opendev.org/#/c/681358/ os-vif release ack pls? | 11:16 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Do not query allocations twice in finish_revert_resize https://review.opendev.org/678827 | 11:17 |
sean-k-mooney | mels patch ya ill ack that now | 11:17 |
*** ttsiouts has joined #openstack-nova | 11:19 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: fixup! Add support for translating CPU policy extra specs, image meta https://review.opendev.org/681719 | 11:20 |
stephenfin | efried, sean-k-mooney, alex_xu, bauzas: Before I squash that back and fix the tests, want to give that the once over ^ ? | 11:20 |
alex_xu | yup, checking now | 11:20 |
efried | ... | 11:21 |
*** ttsiouts has quit IRC | 11:23 | |
*** ttsiouts has joined #openstack-nova | 11:23 | |
alex_xu | ah, that is smart | 11:25 |
*** hemna has joined #openstack-nova | 11:25 | |
alex_xu | stephenfin: works for me | 11:26 |
sean-k-mooney | checking | 11:26 |
efried | stephenfin: looks good, couple of comments in there. | 11:27 |
luyao | sean-k-mooney: Hi, if you get time, could you help look at the vpmem xml again? thanks (we discussed the NUMA requirement and alignment before, and reached consensus on them) | 11:29 |
luyao | https://review.opendev.org/#/c/678455/ | 11:30 |
*** jaosorior has quit IRC | 11:31 | |
sean-k-mooney | efried: also commented. we should not use update as i noted inline | 11:31 |
stephenfin | efried: I shouldn't have pushed those mox patches through last night either. Sorry :( | 11:32 |
efried | no biggie stephenfin | 11:32 |
efried | they're all going to get kicked out here in a minute | 11:33 |
sean-k-mooney | luyao: yep i have it open efried pinged me earlier | 11:33 |
sean-k-mooney | i'm just going to grab coffee then i'll review the vpmem series in full starting with the xml patch | 11:33 |
stephenfin | cool. I switched to -2 on the bottom one to make it clear | 11:33 |
luyao | sean-k-mooney: hah, thanks | 11:33 |
stephenfin | On the other hand, once that lands mox is basically gone. It's only been four years \o/ | 11:35 |
efried | sean-k-mooney, stephenfin: It's not important enough to spend a bunch of time on, but am I crazy about .update()? | 11:35 |
stephenfin | in relation to...? | 11:35 |
efried | a psum is a psum | 11:35 |
efried | the way you're doing it I guess you're not replacing one that's already there, whereas .update would (redundantly) replace | 11:36 |
stephenfin | I need a link | 11:36 |
efried | I guess python probably isn't smart enough to skip | 11:36 |
sean-k-mooney | oh maybe i misunderstood | 11:36 |
efried | so yeah, I guess the way you're doing it is going to be more efficient | 11:36 |
efried | but the result will be the same | 11:36 |
sean-k-mooney | i thought stephen was filtering the allocation candidates but he is not | 11:37 |
efried | just question of whether you want fewer LOC I guess | 11:37 |
efried | stephenfin: https://review.opendev.org/#/c/681719/1/nova/scheduler/manager.py@197 | 11:37 |
sean-k-mooney | update is fine for the summaries | 11:37 |
sean-k-mooney | i would prefer if we filtered this however | 11:37 |
sean-k-mooney | Draft | 11:37 |
sean-k-mooney | alloc_reqs.extend(alloc_reqs_fallback) | 11:38 |
efried | sean-k-mooney: filter it how? | 11:38 |
stephenfin | efried: tomato tomato. As you say, it doesn't matter | 11:38 |
stephenfin | I'm happy to go with '.update()' | 11:39 |
sean-k-mooney | is that the set of allocation candidates? | 11:39 |
efried | we explicitly can't skip the ones we already have PCPUs for | 11:39 |
efried | that's the whole point | 11:39 |
efried | ... I think | 11:39 |
sean-k-mooney | it would be better not to pass 2 allocation candidates for any given host | 11:39 |
efried | because the filter might kick out the PCPU ones | 11:39 |
sean-k-mooney | efried: we need to make sure that if the filter passes we use the PCPU allocation candidate, not the VCPU one | 11:40 |
sean-k-mooney | if we have two we need extra code to handle that | 11:40 |
efried | That's already covered I thought | 11:40 |
efried | because the candidates are in order per host | 11:40 |
sean-k-mooney | where | 11:40 |
*** ratailor has quit IRC | 11:40 | |
efried | .extend is O(1) (I think) so we're just taking a mem hit by not filtering. Whereas if we loop through and try to filter, we're O(N). | 11:41 |
sean-k-mooney | if they are ordered and we take the first one then ok | 11:41 |
efried | y | 11:41 |
efried | we noodled this out with gibi earlier I think | 11:41 |
stephenfin | To do that, we'd need to inspect each entry in alloc_reqs_fallback, pull out the resource provider ID, and check if it exists in alloc_reqs | 11:41 |
stephenfin | and I think they're ordered so it won't matter | 11:41 |
*** pots has quit IRC | 11:41 | |
sean-k-mooney | efried: extend is fine if we have ordering and can rely on it | 11:41 |
sean-k-mooney | i just was not aware that was a thing | 11:42 |
stephenfin | I could do this (I think) | 11:42 |
*** pots has joined #openstack-nova | 11:42 | |
efried | stephenfin: I'd rather leave it I think | 11:42 |
sean-k-mooney | ya i think its fine | 11:42 |
efried | unless some good reason to do otherwise | 11:42 |
stephenfin | alloc_reqs.extend([a for a in alloc_reqs_fallback if a['allocations'][path to rp UUID] not in provider_summaries]) | 11:42 |
efried | still O(N) | 11:43 |
sean-k-mooney | i'm just trying to make sure we don't pick the wrong allocation candidate and end up using VCPUs when we have pinned cores on a host that has both | 11:43 |
stephenfin | Nah, it's simpler as-is and avoids us doing extra work on what could be a very big list | 11:43 |
stephenfin | Yup | 11:43 |
stephenfin | sean-k-mooney: Remember, the NUMATopologyFilter will prevent that | 11:43 |
efried | ++ | 11:43 |
sean-k-mooney | stephenfin: no, the numa topology filter will only pass or fail a host | 11:43 |
sean-k-mooney | stephenfin: it has no say over the allocation candidates | 11:44 |
sean-k-mooney | so it will pass or fail both | 11:44 |
sean-k-mooney | filters today work on hosts, not allocation candidates | 11:44 |
stephenfin | ah, right | 11:44 |
sean-k-mooney | that is why the ordering is important | 11:44 |
stephenfin | so we'd be using the VCPU-based allocation candidate blob | 11:44 |
sean-k-mooney | and that we take the first one | 11:44 |
stephenfin | but the host has both 'NUMATopology.cpuset' and 'NUMATopology.pcpuset' set to something | 11:45 |
stephenfin | and the NUMATopologyFilter would use the latter and pass the request | 11:45 |
sean-k-mooney | yes | 11:45 |
stephenfin | Okay, I'll triple check that ordering is preserved | 11:46 |
sean-k-mooney | it's an issue only if we have free inventory of both VCPUs and PCPUs | 11:46 |
sean-k-mooney | and have allocation candidates for both | 11:46 |
stephenfin | agreed | 11:46 |
sean-k-mooney | in which case we need to make sure we use the PCPU inventory | 11:46 |
sean-k-mooney | *allocation_candidate | 11:46 |
stephenfin | a corner case, but there's no reason to give someone that loaded gun | 11:47 |
*** ttsiouts has quit IRC | 11:47 | |
sean-k-mooney | i prefer not giving people firearms at all | 11:47 |
donnyd | sean-k-mooney: something is wrong with the mirror at FN | 11:48 |
sean-k-mooney | swords at noon is much more civilised than pistols at dawn | 11:48 |
* stephenfin was going to make a joke about hurleys, but your choice of firearms instead of weapons precluded that :( | 11:48 | |
donnyd | I am working on getting it fixed | 11:48 |
stephenfin | https://www.youtube.com/watch?v=vtnS3_bmtm8 | 11:48 |
sean-k-mooney | hurleys are a perfectly suitable alternative | 11:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Do not query allocations twice in finish_revert_resize https://review.opendev.org/678827 | 11:49 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Allow resizing server with port resource request https://review.opendev.org/679019 | 11:49 |
*** mkrai has joined #openstack-nova | 11:50 | |
sean-k-mooney | stephenfin: i havent seen that in a while | 11:51 |
*** elod has quit IRC | 11:51 | |
*** elod has joined #openstack-nova | 11:52 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Extract pf$N literals as constants from func test https://review.opendev.org/680991 | 11:54 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Improve dest service level func tests https://review.opendev.org/680998 | 11:57 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Follow up for Ib50b6b02208f5bd2972de8a6f8f685c19745514c https://review.opendev.org/681490 | 11:57 |
alex_xu | stephenfin: for the case sean-k-mooney pointed out, where a host has both VCPU and PCPU, if we keep only the PCPU allocation candidates and let them override the VCPU allocation candidates in alloc_reqs, then it will be ok. | 11:57 |
alex_xu | if both PCPU and VCPU are on the same host, that VCPU must be for shared CPUs | 11:57 |
sean-k-mooney | right | 11:58 |
sean-k-mooney | but i did not see code to do that in stephens code | 11:58 |
sean-k-mooney | that said i only quickly looked at it | 11:59 |
alex_xu | yea, nice spot | 11:59 |
stephenfin | alex_xu: Yeah, we don't have that. Instead we're relying on ordering preventing that from happening | 11:59 |
stephenfin | I think I should add the filtering, to be safe | 11:59 |
*** hemna has quit IRC | 12:00 | |
sean-k-mooney | if we were building a dict of host to allocation candidate we could use setdefault | 12:00 |
stephenfin | Scenario: we only have one host that provides both PCPU and VCPU | 12:00 |
stephenfin | actually, I was going to say the NUMATopology filter fails the request with PCPUs but passes the VCPU request | 12:01 |
stephenfin | would that happen? | 12:01 |
stephenfin | ...though? | 12:01 |
*** markvoelker has joined #openstack-nova | 12:01 | |
sean-k-mooney | no | 12:01 |
stephenfin | I guess something could have changed in the time between the two tests | 12:01 |
sean-k-mooney | the filter does not operate on requests | 12:01 |
sean-k-mooney | they operate on hosts | 12:01 |
stephenfin | yeah, correct | 12:01 |
stephenfin | so if our order is preserved, we'll have hit that host already due to the allocation request for PCPUs | 12:02 |
stephenfin | if it failed once, it will fail if we hit it a second time due to the allocation request with VCPUs | 12:02 |
stephenfin | right? | 12:02 |
*** tbachman has quit IRC | 12:02 | |
alex_xu | no, the scheduler filters the hosts by allocation candidates first, then iterates over the filtered hosts | 12:03 |
efried | "something could have changed" disregard this, already a window between GET /a_c and PUT /allocs | 12:03 |
*** ttsiouts has joined #openstack-nova | 12:03 | |
*** mkrai has quit IRC | 12:03 | |
stephenfin | efried: I mean we take the allocation request with PCPUs and try passing that through the NUMATopologyFilter or something and it fails | 12:04 |
sean-k-mooney | i thought we got all the hosts from all allocation candidates and looped over them once | 12:04 |
stephenfin | we then try a load of other hosts, which all fail | 12:04 |
sean-k-mooney | applying all filters | 12:04 |
sean-k-mooney | then passed that to the weighers | 12:04 |
stephenfin | then we come back to that same original host only using the allocation candidate with VCPUs | 12:04 |
*** mkrai has joined #openstack-nova | 12:04 | |
sean-k-mooney | and finally select a host then find the corresponding allocation candidate | 12:04 |
alex_xu | stephenfin: maybe we ignore the allocation_request if the rp_uuid is already in alloc_reqs_by_rp_uuid https://review.opendev.org/#/c/681719/1/nova/scheduler/manager.py@212 | 12:04 |
stephenfin | only in the time in between, something has changed on the host so the NUMATopologyFilter or whatever passes the request | 12:05 |
stephenfin | it's going to be such a tiny window but I wonder if it's worth worrying about? | 12:05 |
sean-k-mooney | stephenfin: if something changes the claim will fail in placement | 12:05 |
alex_xu | ah, no, I'm wrong, we can't ignore directly, since we have same rp for child rp | 12:05 |
sean-k-mooney | then we will move on to the next host | 12:05 |
efried | I wouldn't have thought it was any different than what we're doing now for multiple candidates on one host. | 12:06 |
stephenfin | no, it's not really actually | 12:06 |
efried | sean-k-mooney: I think he's talking about if the second one spuriously succeeds | 12:06 |
efried | but yeah, I don't think this is worth worrying about. | 12:07 |
sean-k-mooney | my point is i don't think the numa topology filter will ever run twice for the same host | 12:07 |
sean-k-mooney | it will only run once | 12:07 |
efried | and i think that ^ is what we want | 12:07 |
alex_xu | ^ agree | 12:07 |
alex_xu | agree to sean-k-mooney... :) | 12:07 |
sean-k-mooney | because filters today have no knowledge of allocation candidates | 12:07 |
efried | because if there are available PCPUs for a host, we want to check that candidate only. But if there aren't any, there will only be (at most) VCPU candidates in the list. | 12:08 |
sean-k-mooney | if there are free pcpus and vcpus we will pass the host and have two candidates | 12:08 |
sean-k-mooney | and hopefully rely on order to pick the PCPU one | 12:09 |
efried | Yes exactly. Whole point of this is to make sure the first thing we look at is *either* a PCPU candidate *or* there are no PCPU candidates. | 12:09 |
sean-k-mooney | ya | 12:09 |
sean-k-mooney | stephenfin: rather then guessing | 12:09 |
sean-k-mooney | can you add a test | 12:09 |
stephenfin | sure. What should the test do? | 12:10 |
sean-k-mooney | create a host that has inventories of both | 12:10 |
sean-k-mooney | and assert the pcpu inventory was claimed | 12:10 |
sean-k-mooney | as a functional test | 12:10 |
stephenfin | I think I have that already | 12:10 |
* stephenfin checks | 12:10 | |
sean-k-mooney | if it is we are all good | 12:10 |
efried | If ordering was random, that would be nondeterministic | 12:10 |
stephenfin | sean-k-mooney: Yeah, look at https://review.opendev.org/#/c/671801/46/nova/tests/functional/libvirt/test_numa_servers.py@364 | 12:11 |
alex_xu | If a host has both vcpu and pcpu, then we will get two allocation_requests in alloc_reqs_by_rp_uuid https://review.opendev.org/#/c/681719/1/nova/scheduler/manager.py@212 | 12:11 |
sean-k-mooney | well we could do it in a loop | 12:11 |
sean-k-mooney | but fair point | 12:11 |
stephenfin | It's a resize test but it does exactly this | 12:11 |
alex_xu | then we only claim the PCPU one | 12:11 |
stephenfin | Lemme run that in a loop | 12:11 |
efried | alex_xu: exactly | 12:11 |
alex_xu | so there will be a check to find out whether there are two allocation requests, one for vcpu and another for pcpu, and then only claim the pcpu one | 12:12 |
sean-k-mooney | ok i think ye have that in hand so i'll go back to the vpmem reviews | 12:14 |
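The ordering argument above (PCPU candidates first, filters decide per host, then the first candidate for the chosen host is claimed) might look something like this in miniature. It is only a sketch under assumptions: the flat `{"host": ..., "resources": ...}` shape and the helper names are hypothetical and far simpler than real allocation requests, which are keyed by resource provider UUID.

```python
from collections import defaultdict


def merge_candidates(pcpu_reqs, vcpu_reqs):
    """Concatenate both result lists, PCPU-based candidates first."""
    merged = list(pcpu_reqs)
    merged.extend(vcpu_reqs)
    return merged


def group_by_host(alloc_reqs):
    """Group candidates by host, preserving arrival order within a host."""
    by_host = defaultdict(list)
    for req in alloc_reqs:
        by_host[req["host"]].append(req)
    return by_host


def pick_candidate(by_host, host):
    """Return the first candidate for a host that already passed the filters."""
    return by_host[host][0]


# host-a reports both PCPU and VCPU inventory; host-b only reports VCPU.
pcpu_reqs = [{"host": "host-a", "resources": {"PCPU": 2}}]
vcpu_reqs = [
    {"host": "host-a", "resources": {"VCPU": 2}},
    {"host": "host-b", "resources": {"VCPU": 2}},
]

by_host = group_by_host(merge_candidates(pcpu_reqs, vcpu_reqs))
chosen_a = pick_candidate(by_host, "host-a")
chosen_b = pick_candidate(by_host, "host-b")
```

As long as the merged list keeps PCPU results ahead of VCPU results, a host with both inventories is claimed against PCPU, while a host with only VCPU inventory still gets its fallback candidate. If ordering were not preserved, the choice for host-a would not be deterministic, which is the concern raised in the discussion.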
*** mkrai has quit IRC | 12:15 | |
*** tbachman has joined #openstack-nova | 12:16 | |
openstackgerrit | Ivaylo Mitev proposed openstack/nova master: VMware: Update flavor-related metadata on resize https://review.opendev.org/681004 | 12:17 |
stephenfin | efried: What was the decision on '.update()' vs. the for-loop again? | 12:18 |
sean-k-mooney | stephenfin: update is fine | 12:21 |
sean-k-mooney | the summaries will be the same | 12:21 |
sean-k-mooney | efried: the vpmem xml generation patch looks sane to me | 12:22 |
sean-k-mooney | i'm starting from the bottom of the series now and working my way up | 12:22 |
openstackgerrit | Merged openstack/nova master: Fix race in _test_live_migration_force_complete https://review.opendev.org/681540 | 12:25 |
sean-k-mooney | donnyd: thanks for the heads up on the mirror issue | 12:25 |
*** lbragstad_ has quit IRC | 12:25 | |
openstackgerrit | Ivaylo Mitev proposed openstack/nova master: VMware: Update flavor-related metadata on resize https://review.opendev.org/681004 | 12:26 |
*** owalsh is now known as owalsh_brb | 12:27 | |
donnyd | sean-k-mooney: I think we have it narrowed down. should be back up in the next couple hours | 12:28 |
sean-k-mooney | it looks like its the same issue i was having from your conversation on infra | 12:28 |
sean-k-mooney | you are hitting an MTU issue because of the tunnel | 12:28 |
sean-k-mooney | well similar | 12:28 |
sean-k-mooney | i needed to clamp it on HEs end | 12:29 |
sean-k-mooney | in your case you need to clamp the mtu in the neutron subnet | 12:29 |
sean-k-mooney | but same general symptoms of retransmits of large packets | 12:29 |
sean-k-mooney | by the way i also sit in the infra channel so you can always ping me there if there are issues | 12:30 |
sean-k-mooney | if you want to notify nova in general then here is fine too | 12:30 |
*** elod has quit IRC | 12:32 | |
*** elod has joined #openstack-nova | 12:33 | |
*** eharney has joined #openstack-nova | 12:36 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Skip querying resource request in revert_resize if no qos port https://review.opendev.org/681513 | 12:37 |
gibi | bauzas, mriedem: I've finished respinning the bandwidth series | 12:38 |
openstackgerrit | Thomas Bechtold proposed openstack/nova master: Add nova-status to man-pages list https://review.opendev.org/681733 | 12:41 |
*** derekh has quit IRC | 12:42 | |
*** mriedem has joined #openstack-nova | 12:43 | |
*** owalsh_brb is now known as owalsh | 12:43 | |
*** larainema has quit IRC | 12:45 | |
*** ociuhandu has quit IRC | 12:50 | |
*** mriedem has quit IRC | 12:54 | |
*** BjoernT has joined #openstack-nova | 12:56 | |
*** derekh has joined #openstack-nova | 12:56 | |
*** ociuhandu has joined #openstack-nova | 12:58 | |
*** derekh has quit IRC | 12:59 | |
*** mriedem has joined #openstack-nova | 13:00 | |
*** derekh has joined #openstack-nova | 13:00 | |
*** _erlon_ has quit IRC | 13:01 | |
*** _erlon_ has joined #openstack-nova | 13:02 | |
*** jmlowe has quit IRC | 13:03 | |
*** BjoernT has quit IRC | 13:04 | |
bauzas | gibi: ack, will look | 13:05 |
openstackgerrit | Thomas Bechtold proposed openstack/nova master: Add nova-status to man-pages list https://review.opendev.org/681733 | 13:06 |
sean-k-mooney | luyao: efried stephenfin https://review.opendev.org/#/c/678448/21/nova/objects/base.py i think that is a problem that should be fixed. it could be a follow up but all_things_equal is not a valid comparison implementation of == | 13:10 |
efried | sean-k-mooney: say wha? | 13:13 |
sean-k-mooney | it's not symmetric so a == b will not always be the same as b == a | 13:13 |
sean-k-mooney | and it does not check type | 13:13 |
sean-k-mooney | so two ovos with the same fields will compare equal even if they are completely unrelated types | 13:14 |
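A minimal sketch of the concern raised above, for illustration only. It is not the nova code under review: `Record`, `Small`, `Large`, `Other` and both helper functions are hypothetical stand-ins, assuming a comparison that walks only the first object's declared fields and never looks at types.

```python
class Record:
    """Stand-in for a versioned object: declared field names plus set values."""
    fields = ()

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)

    def __contains__(self, name):
        # "name in obj" means "this field has been set on obj".
        return hasattr(self, name)


class Small(Record):
    fields = ("id",)


class Large(Record):
    fields = ("id", "size")


class Other(Record):
    fields = ("id",)


def fields_equal_one_sided(a, b):
    """Compare using only a's field list; b's extra fields are never examined."""
    if b is None:
        return False
    for name in a.fields:
        set_a = name in a
        set_b = name in b
        if set_a != set_b:
            return False
        if set_a and getattr(a, name) != getattr(b, name):
            return False
    return True


def fields_equal_symmetric(a, b):
    """Same walk, but refuse to compare objects of different types."""
    if type(a) is not type(b):
        return False
    return fields_equal_one_sided(a, b)


small, large, other = Small(id=1), Large(id=1, size=4), Other(id=1)

print(fields_equal_one_sided(small, large))  # True: only "id" is checked
print(fields_equal_one_sided(large, small))  # False: "size" is unset on small
print(fields_equal_one_sided(small, other))  # True despite unrelated types
print(fields_equal_symmetric(small, large), fields_equal_symmetric(large, small))  # False False
```

With the type check in place both argument orders agree, because two objects of the same type share one field list.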
efried | sean-k-mooney: This code is copied in from numa.py | 13:14 |
efried | a is not None because this is invoked with `self` | 13:14 |
efried | and I don't care if they're different types. | 13:14 |
mriedem | lyarwood: have you seen this? https://bugs.launchpad.net/nova/+bug/1843643 | 13:15 |
openstack | Launchpad bug 1843643 in OpenStack Compute (nova) "VM on encrypted boot volume fails to start after compute host reboot" [Undecided,New] | 13:15 |
sean-k-mooney | efried: well it's a free function so you can only rely on it being invoked with self when it is used to implement _equal_ | 13:16 |
sean-k-mooney | * __eq__ | 13:16 |
efried | right, but it's not called "two_objects_identical" | 13:16 |
efried | it's called "all_things_equal" | 13:16 |
sean-k-mooney | yes and equal is a stronger guarantee than equivalent | 13:17 |
sean-k-mooney | anyway | 13:17 |
lyarwood | mriedem: no if it's new | 13:17 |
sean-k-mooney | the way its used it will work | 13:17 |
* lyarwood looks | 13:17 | |
efried | sean-k-mooney: the only reason this code is added to base is so it can be removed from numa and reused | 13:18 |
efried | the code isn't changed. | 13:18 |
sean-k-mooney | so i'll change to a +1 i guess but i still think that is an incorrect implementation of __eq__ | 13:18 |
*** alex_xu has quit IRC | 13:18 | |
sean-k-mooney | right i think the code in numa was wrong | 13:18 |
sean-k-mooney | againg it probably works for the case we use it in | 13:18 |
sean-k-mooney | *again | 13:19 |
sean-k-mooney | but it's not generally correct | 13:19 |
*** alex_xu has joined #openstack-nova | 13:19 | |
openstackgerrit | Martin Midolesov proposed openstack/nova master: Implementing graceful shutdown. https://review.opendev.org/666245 | 13:20 |
alex_xu | sean-k-mooney: I guess it's just a quick return: since self can't be None, if the other is None, then just return False | 13:20 |
lyarwood | mriedem: that smells like more of an issue with resume_guests_state_on_host_boot tbh | 13:20 |
*** lbragstad has joined #openstack-nova | 13:20 | |
* lyarwood adds notes in the bug | 13:20 | |
efried | sean-k-mooney: more tech debt we shouldn't be fixing here. | 13:20 |
efried | see https://review.opendev.org/#/c/681397/8 | 13:21 |
sean-k-mooney | efried: ya i said it could be fixed in a follow up | 13:21 |
sean-k-mooney | as a free function it's incorrect | 13:21 |
sean-k-mooney | but if only invoked in __eq__ where the first argument is self | 13:21 |
sean-k-mooney | then some of the assumptions make sense | 13:22 |
*** jmlowe has joined #openstack-nova | 13:24 | |
*** dolpher has joined #openstack-nova | 13:24 | |
*** jmlowe has quit IRC | 13:25 | |
efried | sean-k-mooney: are you un -1 ing? | 13:26 |
sean-k-mooney | i'll +1 but https://www.python.org/dev/peps/pep-0207/#proposed-resolutions section 4 allows the interpreter to swap the arguments to == which is why i raised this | 13:26 |
bauzas | folks, I'm being dragged for some internal bug duty, ping me for urgent reviews | 13:27 |
efried | ack, thanks bauzas | 13:27 |
bauzas | live my life | 13:27 |
sean-k-mooney | efried: if == is not reflexive/symmetric it can break on non-cpython implementations | 13:27 |
sean-k-mooney | on cpython this will work fine as it won't swap them | 13:27 |
*** Luzi has quit IRC | 13:28 | |
*** nweinber__ has joined #openstack-nova | 13:28 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/stein: Log notifications if assertion in _test_live_migration_force_complete fails https://review.opendev.org/681743 | 13:28 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/stein: Fix race in _test_live_migration_force_complete https://review.opendev.org/681744 | 13:28 |
efried | sean-k-mooney: open a bug? | 13:28 |
sean-k-mooney | sure | 13:29 |
*** cdent has quit IRC | 13:30 | |
*** pcaruana has quit IRC | 13:30 | |
mriedem | gibi: sorry about all of that _reschedule_resize_or_reraise noise | 13:32 |
mriedem | i had the logic obviously wrong | 13:32 |
*** cdent has joined #openstack-nova | 13:34 | |
efried | I'm going to get a nap while the gate cranks. Back in... say, 3.5h. | 13:35 |
dansmith | gate looks constipated | 13:35 |
dansmith | heh | 13:35 |
efried | dansmith: there was a dep problem overnight, resolved now, but everything is backed up, yes. | 13:36 |
*** efried is now known as efried_afk | 13:36 | |
*** jmlowe has joined #openstack-nova | 13:36 | |
bauzas | dansmith: we checked whether we should provide the gate a pill, but looks like the issue is solved | 13:36 |
gibi | mriedem: no worries. I needed my fresh mind the morning to figure out what is going on. | 13:37 |
gibi | mriedem, bauzas: I will be on and off in the coming 2-3 hours. But I will be back for the night to finish up things if needed | 13:37 |
bauzas | http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22ERROR%3A%20Could%20not%20find%20a%20version%20that%20satisfies%20the%20requirement%20configparser%5C%22 | 13:38 |
dansmith | bauzas: go natural: https://www.youtube.com/watch?v=Ku42Iszh9KM | 13:38 |
bauzas | gibi: like I said, I'm also dragged due to some internal bug scrub duty | 13:38 |
bauzas | :( | 13:38 |
mriedem | gibi: ok, i did spot a race in the new functional test in the sriov patch, so i'm going to -1 for that, but i can probably just fix and rebase the series myself | 13:38 |
mriedem | bauzas: you can't tell your management / team lead that today is f'ing upstream feature freeze? | 13:39 |
mriedem | and get a 1 day break? | 13:39 |
bauzas | dansmith: I wish you knew some french folks named "Les Nuls" so I could share you some video | 13:39 |
*** ricolin has joined #openstack-nova | 13:39 | |
dansmith | heh | 13:39 |
*** markvoelker has quit IRC | 13:40 | |
mriedem | if bauzas can't review the rest of the bw provider migrate series i'll see if i can get melwitt to help out | 13:40 |
bauzas | mriedem: this duty should be around 1 hour or 2 | 13:40 |
bauzas | but then I'll have meeting | 13:40 |
bauzas | yay | 13:40 |
mriedem | is there a manager standing next to you with a gun to your head saying you have to do bug duty over FF? | 13:40 |
mriedem | can you tell them you're a nova core? | 13:40 |
mriedem | one of the remaining few? | 13:41 |
bauzas | my dog is barking at me | 13:41 |
mriedem | whatever, this is why i have no hope for "we'll do it in U" | 13:41 |
*** ociuhandu has quit IRC | 13:41 | |
bauzas | mriedem: that's why you'll never see me saying "Ussuri" | 13:42 |
bauzas | mriedem: "Unicorn" sounds a way better release name | 13:42 |
*** markvoelker has joined #openstack-nova | 13:42 | |
bauzas | (more appropriate at least) | 13:42 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Include both VCPU and PCPU in core quota count https://review.opendev.org/681374 | 13:43 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add support for translating CPU policy extra specs, image meta https://review.opendev.org/671801 | 13:43 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: fakelibvirt: Make 'Connection.getHostname' unique https://review.opendev.org/681060 | 13:43 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Mock 'libvirt_utils.file_open' properly https://review.opendev.org/681061 | 13:43 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add reshaper for PCPU https://review.opendev.org/674895 | 13:43 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: tests: Additional functional tests for pinned instances https://review.opendev.org/681750 | 13:44 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Remove single-use classmethod https://review.opendev.org/681751 | 13:44 |
alex_xu | stephenfin: need you kick the first one https://review.opendev.org/#/c/678447/14 | 13:44 |
stephenfin | dansmith: In case you didn't see the scrollback, we modified the solution in https://review.opendev.org/671801 slightly | 13:44 |
stephenfin | alex_xu: just done | 13:44 |
alex_xu | stephenfin: thanks | 13:44 |
stephenfin | though I realize I missed some of your comments | 13:45 |
* stephenfin quickly fixes up | 13:45 | |
artom | bauzas, want me to do the bugs thing? | 13:45 |
artom | bauzas, so you can concentrate on w/e? | 13:45 |
stephenfin | alex_xu: However, https://review.opendev.org/#/c/671801/ should be ready for your viewing pleasure too | 13:45 |
bauzas | artom: how is your series ? | 13:45 |
alex_xu | stephenfin: got it | 13:45 |
artom | bauzas, merged (well, merging, because gate) | 13:45 |
stephenfin | bauzas: Want to look at https://review.opendev.org/#/c/681750/, pretty please? | 13:46 |
artom | (Otherwise I wouldn't have offered) | 13:46 |
stephenfin | (I split it out to make https://review.opendev.org/#/c/671801/ smaller) | 13:46 |
*** BjoernT has joined #openstack-nova | 13:46 | |
bauzas | artom: cool then okay, we could switch | 13:46 |
bauzas | stephenfin: I can | 13:46 |
artom | bauzas, ack, I'll handle the bugs, do your nova core job | 13:46 |
bauzas | stephenfin: did you reorganise your series ? | 13:47 |
bauzas | oh wait no | 13:47 |
stephenfin | nope, just broke some things out | 13:47 |
artom | The "out" is ket | 13:47 |
artom | *key | 13:47 |
dansmith | stephenfin: I didn't but I'll try to circle back in a bit | 13:48 |
alex_xu | stephenfin: I didn't find the part sean-k-mooney mentioned, where a host has both pcpu and vcpu | 13:49 |
stephenfin | alex_xu: We test it implicitly here https://review.opendev.org/#/c/671801/47/nova/tests/functional/libvirt/test_numa_servers.py@364 | 13:50 |
stephenfin | Both of those hosts have both PCPU and VCPU inventory | 13:50 |
stephenfin | and we still get PCPU as expected https://review.opendev.org/#/c/671801/47/nova/tests/functional/libvirt/test_numa_servers.py@460 | 13:51 |
stephenfin | and https://review.opendev.org/#/c/671801/47/nova/tests/functional/libvirt/test_numa_servers.py@489 | 13:51 |
*** KeithMnemonic has joined #openstack-nova | 13:53 | |
mriedem | gibi: i've gone through the updates, i'll rebase and fix the small things and then i'll be +2 up the stack | 13:53 |
alex_xu | emm...let me check | 13:53 |
bauzas | mriedem: cool, I'll look at the series once i'm done with a couple of changes from stephenfin | 13:54 |
*** dave-mccowan has joined #openstack-nova | 13:54 | |
mriedem | efried_afk: so we're pushing PCPU in before the scheduler and quotas stuff is approved? https://review.opendev.org/#/c/671793/ | 13:55 |
*** brinzhang_ has joined #openstack-nova | 13:55 | |
*** tkajinam has joined #openstack-nova | 13:55 | |
*** hemna has joined #openstack-nova | 13:55 | |
mriedem | i think for train the quotas one is pretty straight-forward, counting from placement will work the same as counting from nova, | 13:56 |
mriedem | but the scheduler patch looks like it's still in flux | 13:56 |
stephenfin | the quota stuff has got a tentative ack from melwitt, and the scheduler stuff has the same from efried and alex_xu | 13:56 |
stephenfin | see here | 13:56 |
stephenfin | https://review.opendev.org/#/c/681719/ | 13:56 |
* stephenfin abandons same now that's outlived its purpose | 13:56 | |
dansmith | even still, if we were to land that first patch and get stuck on the rest, we've got a config option that claims to enable something that doesn't, which is uncool | 13:57 |
stephenfin | it'll do exactly what it says - you just won't be able to use the things it provides (PCPUs) | 13:58 |
gibi | mriedem: thanks a lot! | 13:58 |
*** hamzy has quit IRC | 13:58 | |
bauzas | stephenfin: can you please tell me why we need this https://review.opendev.org/#/c/681750/1/nova/tests/unit/virt/libvirt/fake_imagebackend.py ? | 13:58 |
mriedem | i hope there is going to be some "so you want to use PCPUs" admin doc at some point b/c there are a ton of moving parts | 13:58 |
bauzas | mriedem: I feel there are some docs that absolutely need to be written | 13:59 |
stephenfin | bauzas: There's a check deep in the libvirt code for that. If it's not done, it complains with "function doesn't have attribute SUPPORTS_CLONE" | 13:59 |
bauzas | mriedem: in particular given the upgrade plan we took a while to agree on this morning | 13:59 |
stephenfin | bauzas: I'm not the only one who found that. This is from the vPMEM series https://review.opendev.org/#/c/678470/35/nova/tests/unit/virt/libvirt/fake_imagebackend.py | 13:59 |
mriedem | stephenfin: there should probably be a comment on that in the fake imagebackend code then | 13:59 |
bauzas | stephenfin: what mriedem said | 14:00 |
*** ociuhandu has joined #openstack-nova | 14:00 | |
bauzas | because I have no idea why you need this | 14:00 |
mriedem | pretty gross that the virt agnostic image backend fixture has to have libvirt specific stuff in it | 14:00 |
*** belmoreira has joined #openstack-nova | 14:00 | |
mriedem | bauzas: stephenfin loves to doc so i'm sure he doesn't have a problem documenting all the things | 14:00 |
bauzas | mriedem: the upgrade strategy is a bit tricky, I'm just saying we need to carefully doc it | 14:01 |
bauzas | based on the fact we'll provide some options | 14:01 |
stephenfin | mriedem: It's a libvirt-specific image backend fixture - look at the filename | 14:01 |
stephenfin | it's for mocking out nova/virt/libvirt/imagebackend.py | 14:01 |
* stephenfin does love to doc | 14:02 | |
mriedem | stephenfin: you're right, sorry - i was thinking of https://review.opendev.org/#/c/678470/35/nova/tests/unit/image/fake.py | 14:02 |
stephenfin | all g | 14:03 |
stephenfin | I wonder what the comment should be though | 14:03 |
stephenfin | I mean, it's an attribute of the Image base class in nova/virt/libvirt/imagebackend.py | 14:04 |
stephenfin | and this function is a stub for that class (or some subclass) | 14:04 |
stephenfin | so it's not really doing anything special but rather just ensuring the stub looks like the thing it's stubbing | 14:04 |
mriedem | # Set the SUPPORTS_CLONE member variable to satisfy the Image base class. | 14:05 |
stephenfin | Sounds good. Done | 14:08 |
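The point about the stub needing to look like the thing it stubs can be sketched like this. Only `Image` and `SUPPORTS_CLONE` come from the discussion; `backend_supports_clone` and `fake_image_backend` are hypothetical names, and the real check inside the libvirt driver may look quite different.

```python
class Image:
    """Minimal stand-in for a base class that exposes a class-level flag."""
    SUPPORTS_CLONE = False


def backend_supports_clone(backend) -> bool:
    # Hypothetical caller: reads the flag straight off whatever it is handed.
    return backend.SUPPORTS_CLONE


def fake_image_backend(*args, **kwargs):
    """Function-based stub standing in for an Image subclass."""
    return object()


# A plain function has no such attribute, so the caller blows up.
try:
    backend_supports_clone(fake_image_backend)
    failed = False
except AttributeError:
    failed = True

# Set the SUPPORTS_CLONE member variable to satisfy the Image base class.
fake_image_backend.SUPPORTS_CLONE = Image.SUPPORTS_CLONE

print(failed)
print(backend_supports_clone(fake_image_backend))
```

Copying the value from the base class, rather than hard-coding it, keeps the stub in step if the default ever changes.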
openstackgerrit | Stephen Finucane proposed openstack/nova master: tests: Additional functional tests for pinned instances https://review.opendev.org/681750 | 14:08 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Include both VCPU and PCPU in core quota count https://review.opendev.org/681374 | 14:08 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add support for translating CPU policy extra specs, image meta https://review.opendev.org/671801 | 14:08 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: fakelibvirt: Make 'Connection.getHostname' unique https://review.opendev.org/681060 | 14:08 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Mock 'libvirt_utils.file_open' properly https://review.opendev.org/681061 | 14:08 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add reshaper for PCPU https://review.opendev.org/674895 | 14:08 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Remove single-use classmethod https://review.opendev.org/681751 | 14:08 |
*** pcaruana has joined #openstack-nova | 14:08 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Remove single-use classmethod https://review.opendev.org/681751 | 14:09 |
dansmith | stephenfin: what job logs can I look at to see this running? | 14:13 |
stephenfin | dansmith: Look at the functional logs for https://review.opendev.org/#/c/671801 | 14:14 |
dansmith | no I mean a devstack/tempest job | 14:14 |
stephenfin | None. The Intel NFV CI has been missing for quite some time. It's been alex_xu, sean-k-mooney, Bhagyashri Shewale (forgot his IRC nick) and I testing it locally | 14:16 |
dansmith | stephenfin: is there some reason this has to be tested in special ci? | 14:16 |
sean-k-mooney | yes it needs nested virt | 14:16 |
stephenfin | Pinned CPUs. That needs nested virt | 14:16 |
stephenfin | which we can't rely on | 14:16 |
sean-k-mooney | but i can test it in FN | 14:17 |
dansmith | we can't get a one-off? | 14:17 |
dansmith | sean-k-mooney: that would be good | 14:17 |
artom | stephenfin, wait, I thought this used the <vcpus> style pinning, no? | 14:17 |
artom | Which *does* work in the vanilla gate | 14:17 |
dansmith | I'm shocked that artom and sean-k-mooney were able to get numa live migration tested for us, so surely this is possible | 14:17 |
sean-k-mooney | that was also testing pinning with the resize flavors | 14:18 |
sean-k-mooney | so yes | 14:18 |
artom | dansmith, let's be honest, *sean-k-mooney* was able to get NUMA live migration tested | 14:18 |
dansmith | artom: I just meant that effort, but sure | 14:18 |
stephenfin | sean-k-mooney: can we just point the same job at the top-most change of this so? | 14:18 |
sean-k-mooney | we can also technically test non numa stuff in vexxhost and limestone | 14:18 |
sean-k-mooney | stephenfin: yes but i'll create a second patch | 14:19 |
dansmith | I'm disappointed that nobody has asked this question yet :/ | 14:19 |
sean-k-mooney | i want to actually merge that job | 14:19 |
*** lbragstad has quit IRC | 14:19 | |
stephenfin | between manual tests and the functional tests, I figure our coverage is pretty much spot on as-is | 14:20 |
dansmith | stephenfin: what testing are you doing locally? Just hand booting some things or running tempest? | 14:20 |
artom | dansmith, presumably the crucial thing would be testing the upgrade path - so something grenade-y? | 14:20 |
stephenfin | dansmith: https://etherpad.openstack.org/p/nova-cpu-resources | 14:20 |
stephenfin | alex_xu has his own set of tests that he's been working towards too | 14:21 |
sean-k-mooney | what do people want me to actually test for this in the ci job? | 14:21 |
dansmith | artom: I'm definitely concerned about that yeah, but more concerned about at least seeing a tempest run first | 14:21 |
sean-k-mooney | for the live migration job i ran the live migration tests. should i just run the server and network basic ops? | 14:21 |
dansmith | stephenfin: that's cool, but it's far from comprehensive | 14:21 |
alex_xu | more on upgrade case, one node old, one node new, then boot, resize between nodes | 14:22 |
sean-k-mooney | also do you want a single node test or multi | 14:22 |
dansmith | sean-k-mooney: if you're going to run the grenade job, multi would be good | 14:22 |
*** mriedem61 has joined #openstack-nova | 14:23 | |
sean-k-mooney | i'm not sure how easy grenade will be. i could try the non legacy grenade job however | 14:23 |
*** mriedem has quit IRC | 14:23 | |
*** mriedem61 is now known as mriedem | 14:23 | |
sean-k-mooney | ok i can give grenade a try but ill do a non grenade run first | 14:23 |
artom | Hrm, so actually grenade is not what I meant by upgrade | 14:23 |
artom | I mean it's good too, but... | 14:23 |
artom | I'd be most concerned about the in-place upgrade scenario | 14:24 |
dansmith | artom: how is that not grenade? | 14:24 |
artom | dansmith, wait, does grenade do that? | 14:24 |
sean-k-mooney | that's all grenade does | 14:24 |
artom | I thought it just deployed a mix of old and new computes | 14:24 |
dansmith | artom: that is all it does | 14:24 |
artom | It upgrades in place? | 14:24 |
sean-k-mooney | yes | 14:24 |
dansmith | artom: it can leave one back | 14:24 |
artom | Oh, hah, ignore me then :P | 14:25 |
dansmith | artom: but in general it's single node in-place | 14:25 |
sean-k-mooney | if it's multi node it does the controller first, then you do the live migrate back and forth test | 14:25 |
mriedem | the grenade multinode jobs leave one nova-compute back | 14:25 |
artom | Oh, and the multinode version of it does the live migration back and forth business | 14:25 |
dansmith | sean-k-mooney: I think a full basic tempest run would be good just to make sure we're good on things like resizes, etc | 14:25 |
mriedem | artom: yeah | 14:25 |
mriedem | my wifi dropped (power outage) - talking about PCPU CI testing? | 14:26 |
sean-k-mooney | ok so tempest full single node with pinned flavor | 14:26 |
dansmith | I'm concerned that maybe nobody has even run tempest against this yet to the point that there may be some gotchas we don't know about such that it can't run :/ | 14:26 |
artom | So actually wouldn't we be able to just resurrect https://review.opendev.org/#/c/680806/ and make it depend on stephenfin's series? | 14:26 |
dansmith | mriedem: yes | 14:26 |
stephenfin | artom: I could run whitebox locally? | 14:27 |
mriedem | stephenfin: alex_xu: have you done functional testing with the PCPU series where things aren't stubbed out? | 14:27 |
*** henriqueof1 has joined #openstack-nova | 14:27 | |
*** hamzy has joined #openstack-nova | 14:27 | |
artom | stephenfin, whitebox doesn't have any code to test your stuff | 14:27 |
sean-k-mooney | dansmith: well tempest has run with out pinning so if i do a run with only pinning that covers it right | 14:27 |
dansmith | mriedem: they have an etherpad of hand-run test cases | 14:27 |
alex_xu | mriedem: I've done manual tests with the upgrade case | 14:27 |
dansmith | sean-k-mooney: yeah, isn't that what we're talking about here? | 14:27 |
artom | stephenfin, but... in your series, for instances using the PCPU resource class, what's the XML that pins them? | 14:27 |
stephenfin | artom: It tests that we can boot pinned instances, resize them, etc. | 14:27 |
artom | <vcpu cpuset=> or <cputune> ? | 14:28 |
sean-k-mooney | i'll kick off 3 runs: one with vcpu_pin_set, one with only cpu_dedicated_set, and one with cpu_dedicated_set and cpu_shared_set | 14:28 |
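The three runs described above might correspond to host configurations along these lines. This is a sketch under assumptions: the CPU ranges are invented for illustration, the placement of `vcpu_pin_set` under `[DEFAULT]` and of the other two options under `[compute]` is my understanding rather than something stated in the chat, and `RUNS` and `render` are hypothetical names.

```python
import configparser
import io

# Hypothetical test matrix; the CPU ranges are made up for illustration.
RUNS = {
    "legacy_pin_set": {"DEFAULT": {"vcpu_pin_set": "2-7"}},
    "dedicated_only": {"compute": {"cpu_dedicated_set": "2-7"}},
    "dedicated_and_shared": {
        "compute": {"cpu_dedicated_set": "4-7", "cpu_shared_set": "2-3"},
    },
}


def render(section_options):
    """Render one run's options as an ini-style config fragment."""
    parser = configparser.ConfigParser()
    for section, options in section_options.items():
        # Assigning a dict creates the section (or fills DEFAULT) in one step.
        parser[section] = options
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


for name, options in RUNS.items():
    print("# run: " + name)
    print(render(options))
```

Keeping the matrix as data makes it easy to add a fourth combination later without touching the rendering code.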
stephenfin | artom: The same thing that was used for pinning with 'hw:cpu_policy' | 14:28 |
dansmith | sean-k-mooney: sounds good | 14:28 |
stephenfin | none of the libvirt XML stuff changes. This is all higher level accounting stuff | 14:28 |
dansmith | stephenfin: he knows, he's asking from a where-can-we-test perspective I think | 14:28 |
sean-k-mooney | FN is currently offline since donnyd is doing a router upgrade so i'll use vexxhost instead for now | 14:29 |
stephenfin | Yeah, I get that. The answer is the same place we can test 'hw:cpu_policy=dedicated' currently | 14:29 |
artom | stephenfin, well I was asking because the allocation-style <vcpu> pinning is what's done if the instance doesn't have a NUMA topology and vcpu_pin_set is set (we talked about this, remember?) | 14:29 |
donnyd | I am ready to bring FN back online | 14:29 |
dansmith | stephenfin: unless one knows where that is... | 14:29 |
artom | stephenfin, and *that* can be tested in the gate | 14:29 |
*** hemna has quit IRC | 14:29 | |
artom | stephenfin, but from what you're saying that's not what your series uses | 14:29 |
sean-k-mooney | donnyd: well either works for this test i just need nested virt | 14:29 |
stephenfin | artom: yeah, no, the instance will have a NUMA topology because of the pinning | 14:29 |
sean-k-mooney | but i do need to go write a patch to add a label that will use all the nested virt providers rather than just one specifically | 14:30 |
*** henriqueof has quit IRC | 14:30 | |
donnyd | sean-k-mooney: I think maybe vexxhost can also run this job. I don't want to speak on behalf of mnaser | 14:31 |
dansmith | donnyd: I think he's said several times that it can :) | 14:31 |
mnaser | yes should be ok | 14:31 |
* mnaser goes back to hiding | 14:31 | |
*** mriedem has quit IRC | 14:31 | |
*** mriedem has joined #openstack-nova | 14:32 | |
donnyd | mnaser: do you have a flavor already built? | 14:32 |
sean-k-mooney | i'll do all the nodepool label stuff next week to make this more resilient | 14:32 |
donnyd | kk sean-k-mooney | 14:33 |
sean-k-mooney | donnyd: this does not need the multi numa node stuff | 14:33 |
sean-k-mooney | just a single numa node is fine | 14:33 |
donnyd | oh just nested virt | 14:33 |
sean-k-mooney | yep | 14:33 |
sean-k-mooney | we just want to test pinning | 14:33 |
sean-k-mooney | since that is what we are changing | 14:33 |
sean-k-mooney | once artom's stuff is actually merged i can retarget the multi numa job to also test stephen's code | 14:34 |
sean-k-mooney | and get test coverage for both | 14:34 |
dansmith | migration with pinning "works" today but just blind copies everything to the other side, right? | 14:34 |
sean-k-mooney | ya | 14:34 |
dansmith | so these tests should also support migration but with smarts yeah? | 14:34 |
sean-k-mooney | i can run the migration tests if you like | 14:35 |
dansmith | yes, definitely | 14:35 |
sean-k-mooney | but it will do the wrong thing until artoms code lands | 14:35 |
mnaser | no i didn't get a chance to setup the flavor because it didn't seem pressing at the time :) | 14:35 |
dansmith | we need to at least make sure we don't regress the stupid behavior | 14:35 |
mnaser | ill wait until "we need it" | 14:35 |
sean-k-mooney | but if i use concurrency 1 it won't break anything | 14:35 |
dansmith | sean-k-mooney: or depends-on artoms? | 14:35 |
sean-k-mooney | im not sure if there will be a merge conflict | 14:36 |
sean-k-mooney | ill test that locally | 14:36 |
dansmith | uh | 14:36 |
openstackgerrit | Martin Midolesov proposed openstack/nova master: Implementing graceful shutdown. https://review.opendev.org/666245 | 14:36 |
dansmith | well, if there is we should rebase the pcpu stuff now, right? | 14:36 |
sean-k-mooney | am i guess so | 14:36 |
dansmith | maybe check locally first | 14:36 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Support migrating SRIOV port with bandwidth https://review.opendev.org/676980 | 14:36 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Allow migrating server with port resource request https://review.opendev.org/671497 | 14:36 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Allow resizing server with port resource request https://review.opendev.org/679019 | 14:36 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Extract pf$N literals as constants from func test https://review.opendev.org/680991 | 14:36 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Improve dest service level func tests https://review.opendev.org/680998 | 14:36 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Follow up for Ib50b6b02208f5bd2972de8a6f8f685c19745514c https://review.opendev.org/681490 | 14:36 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Skip querying resource request in revert_resize if no qos port https://review.opendev.org/681513 | 14:36 |
sean-k-mooney | dansmith: artom's functional tests still need to be rebased but if i merge the cpu series with artom's series without that it works locally | 14:40 |
sean-k-mooney | so i'll do that quickly | 14:40 |
dansmith | sean-k-mooney: ack | 14:40 |
sean-k-mooney | we are leaving the functional patch alone so we don't need to rebase all the rest | 14:40 |
dansmith | sean-k-mooney: that means you can depends-on it in your gate jobs yeah? or are you going to rebase all of pcpu and push it up? | 14:40 |
sean-k-mooney | so that will be rebased when the stuff is merged | 14:41 |
mriedem | gibi: bauzas: melwitt: i'm +2 up the bw provider migrate/resize series now https://review.opendev.org/#/q/topic:bp/support-move-ops-with-qos-ports+status:open | 14:41 |
mriedem | just a few changes need a +W but they aren't too hard | 14:41 |
*** dave-mccowan has quit IRC | 14:41 | |
sean-k-mooney | ya i just need to depend on the second from last patch in artom's series | 14:41 |
sean-k-mooney | e.g. the one before the functional test | 14:41 |
sean-k-mooney | but i can have the job that is testing artoms stuff depend on stephens change too | 14:42 |
sean-k-mooney | and it should all merge fine | 14:42 |
sean-k-mooney | did i mention that zuul is awesome recently | 14:42 |
artom | Famous last words ;) | 14:42 |
sean-k-mooney | i did the merge locally so it should be fine | 14:43 |
sean-k-mooney | it will be a 4 way merge however (master+numa+cpu+ci) patches | 14:44 |
dansmith | nothing has even entered the gate yet | 14:45 |
dansmith | I mean, nothing at all | 14:45 |
mriedem | check queue is ~200 deep so nothing approved today is going to merge today | 14:45 |
sean-k-mooney | yep | 14:45 |
mriedem | for nova anyway | 14:45 |
* mriedem lights a cigarette in the queue for the mainframe job | 14:45 | |
dansmith | mriedem: don't burn your cards | 14:46 |
mriedem | hey, i'm a professional | 14:46 |
mriedem | oh wait, | 14:46 |
mriedem | can i switch cigarette for flavored vape? | 14:47 |
mriedem | i want to be young and hip | 14:47 |
dansmith | mriedem: vaping kills, haven't you heard? | 14:48 |
dansmith | apparently kills you faster than cigarettes, which is pretty impressive | 14:48 |
donnyd | mriedem: the juul virginia tobacco is pretty good | 14:48 |
*** hamzy has quit IRC | 14:48 | |
mriedem | i did hear a pretty funny quote from trump on his ban on flavored vaping, | 14:48 |
*** lbragstad has joined #openstack-nova | 14:49 | |
*** hamzy has joined #openstack-nova | 14:49 | |
mriedem | something like, "the kids, they're coming home, and they're saying to their mom, 'hey, i want to vape!', and it's bad" | 14:49 |
mriedem | like, what kid tells their parents they want to smoke? | 14:49 |
mriedem | they just do it and then tell their parents they were at their friend's house, the friend whose parents smoke | 14:49 |
mriedem | duh | 14:49 |
stephenfin | I see you've done this before | 14:50 |
mriedem | i was at jared's house if my wife asks | 14:50 |
dansmith | I like how mriedem has to fly under his wife's radar just like he had to with his parents | 14:53 |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] numa + pcpus in placment live migration tests https://review.opendev.org/681771 | 14:54 |
sean-k-mooney | donnyd: is FN currently active? | 14:54 |
donnyd | no | 14:55 |
donnyd | https://review.opendev.org/#/c/681731/ | 14:55 |
sean-k-mooney | ok then ^ will fail | 14:55 |
sean-k-mooney | that is the tweak to the multi numa job | 14:55 |
sean-k-mooney | i'll start on the vexxhost single numa version now | 14:55 |
*** hemna has joined #openstack-nova | 14:57 | |
*** pcaruana has quit IRC | 14:57 | |
*** jmlowe has quit IRC | 15:01 | |
bauzas | mriedem: ack, reviewing those now | 15:02 |
dansmith | sean-k-mooney: thanks | 15:02 |
donnyd | sean-k-mooney: https://review.opendev.org/#/c/681773/ | 15:02 |
*** weshay is now known as weshay_passport | 15:05 | |
*** jmlowe has joined #openstack-nova | 15:05 | |
*** tkajinam has quit IRC | 15:06 | |
stephenfin | alex_xu: If you're still about, could you look at https://review.opendev.org/#/c/681750/ ? | 15:07 |
luyao | stephenfin: he's away, but will be back some time (he said, but not sure) | 15:09 |
aspiers | kashyap: is the reference to vexpress-a15 on https://docs.openstack.org/glance/latest/admin/useful-image-properties.html really correct? | 15:10 |
aspiers | feels like that doc is at best incomplete | 15:11 |
*** rouk has joined #openstack-nova | 15:11 | |
*** hamzy has quit IRC | 15:14 | |
*** hamzy has joined #openstack-nova | 15:14 | |
*** spsurya has quit IRC | 15:16 | |
bauzas | gibi: +Wd with comment https://review.opendev.org/#/c/676980/22/nova/compute/manager.py@4439 | 15:20 |
bauzas | gibi: I'm not opposed to using a deprecated method as it's an easy path | 15:20 |
bauzas | gibi: but I feel this old API is broken and we should bump the version very soon | 15:21 |
gibi | bauzas: the comment about the usage explains that it is there until RPC version 6.0 | 15:21 |
bauzas | I know | 15:22 |
gibi | bauzas: but until that, I _have to_ use the deprecated conversion method | 15:22 |
bauzas | I'm just saying this RPC bump is quite important | 15:22 |
gibi | bauzas: I agree we should make a bump | 15:22 |
bauzas | gibi: yeah | 15:22 |
bauzas | gibi: you could technically use another helper method | 15:23 |
bauzas | but this one is convenient, I don't disagree | 15:23 |
*** gyee has joined #openstack-nova | 15:23 | |
*** brinzhang_ has quit IRC | 15:25 | |
*** lpetrut has quit IRC | 15:25 | |
gibi | bauzas: ack, I made a TODO for myself about RPC bump for Ussuri but I'm sure I will need help doing that properly | 15:25 |
*** damien_r has quit IRC | 15:28 | |
dansmith | gibi: we have a doc about how to do it actually | 15:29 |
dansmith | it's been a while | 15:29 |
dansmith | gibi: https://wiki.openstack.org/wiki/RpcMajorVersionUpdates | 15:30 |
dansmith | not sure we've done it since we've had auto version pinning so we'd need to think through that a bit | 15:31 |
bauzas | gibi: yeah follow dansmith's point, I made it for the scheduler and I can give you a ton of code snippets | 15:31 |
bauzas | dansmith: oh good point | 15:31 |
*** hemna has quit IRC | 15:31 | |
*** JamesBenson has joined #openstack-nova | 15:31 | |
bauzas | it's been a while since I've seen a major bump | 15:31 |
bauzas | (but it's been a while since I followed changes like I was before) | 15:31 |
gibi | dansmith: thanks. I've annotated my TODO with the wiki link | 15:32 |
sean-k-mooney | what's the syntax for a resource request again? "resouces:PCPU=2" | 15:33 |
sean-k-mooney | that is correct yes? | 15:34 |
sean-k-mooney | in the flavor | 15:34 |
dansmith | sean-k-mooney: with proper spelling, I think so | 15:35 |
sean-k-mooney | i'll copy it from the functional test to get it right | 15:35 |
bauzas | sean-k-mooney: yeah, if you don't want more | 15:35 |
gibi | mriedem: thanks for fixing the race and fixing up the series and the reviews! | 15:36 |
bauzas | sean-k-mooney: example for VGPU : https://docs.openstack.org/nova/latest/admin/virtual-gpu.html#configure-a-flavor-controller | 15:36 |
sean-k-mooney | ya i found an example | 15:36 |
*** boxiang has quit IRC | 15:37 | |
mriedem | gibi: np | 15:37 |
bauzas | FWIW, pidgin makes it a joke | 15:37 |
bauzas | or a smiley rather | 15:37 |
*** boxiang has joined #openstack-nova | 15:37 | |
mriedem | dansmith: i think you did the last compute rpc api bump to 5.0 in queens | 15:38 |
mriedem | which i'm pretty sure was after the auto stuff | 15:39 |
dansmith | yeah I was looking.. I think so too | 15:39 |
mriedem | sean-k-mooney: i said this the other day, but i'd think you'd want a flavor with PCPU=1 (i guess if you test resize you need an alternate flavor, but that could also be PCPU=1 and a different ram/disk) and you'll want to tell tempest to run the tests in serial or at least --concurrency=2 | 15:40 |
mriedem | i think, because otherwise i'd expect novalidhost errors while concurrent tests are consuming all the PCPU inventory on the single node | 15:40 |
bauzas | gibi: just a nit with +Wd https://review.opendev.org/#/c/679019/14/releasenotes/notes/support-cold-migrating-neutron-ports-with-resource-request-6d23be654a253625.yaml | 15:41 |
mriedem | i'd probably also just restrict tempest to tempest.api.compute tests to start | 15:41 |
*** sapd1_x has quit IRC | 15:41 | |
*** macz has joined #openstack-nova | 15:42 | |
donnyd | sean-k-mooney: You should be good to go now | 15:42 |
*** rpittau is now known as rpittau|afk | 15:44 | |
gibi | bauzas: ack, thanks | 15:45 |
*** maciejjozefczyk has quit IRC | 15:45 | |
bauzas | gibi: mriedem: efried_afk: I'm glad we can say support-move-ops-with-qos-ports as approved | 15:47 |
bauzas | let's now wait for the gate to claim it as implemented | 15:47 |
mriedem | cool, thanks for hitting those | 15:47 |
bauzas | mriedem: you did the most | 15:48 |
bauzas | efried_afk: sean-k-mooney: what's the status with vpmem ? | 15:48 |
bauzas | can i help ? | 15:49 |
bauzas | stephenfin: FWIW, reviewing back https://review.opendev.org/#/c/671801/48 | 15:49 |
stephenfin | bauzas: I'm going through vPMEM while my local hosts are deploying for manual upgrade testing of PCPU | 15:50 |
stephenfin | So it _should_ be fine | 15:50 |
stephenfin | We're going to need to rebase this whole thing into one giant chain once it's done though | 15:50 |
stephenfin | because the vPMEM, PCPU and NUMA-aware LM series all conflict with each other | 15:51 |
bauzas | \o/ | 15:51 |
stephenfin | so much fun :) | 15:51 |
stephenfin | I can do that once everything is approved, I guess | 15:51 |
stephenfin | hopefully the conflicts are trivial (I suspect they will be) | 15:52 |
*** hamzy_ has joined #openstack-nova | 15:52 | |
*** ivve has quit IRC | 15:52 | |
*** hamzy has quit IRC | 15:52 | |
gibi | dddf | 15:52 |
gibi | sorry | 15:54 |
gibi | bauzas: thanks! | 15:54 |
gibi | bauzas: to be precise the cold migrate and resize part of support-move-ops-with-qos-ports is now approved. I have to continue working on evacuate, live migrate, and unshelve in Ussuri | 15:55 |
bauzas | oh correct of course | 15:55 |
bauzas | I was short in explanation :) | 15:55 |
gibi | :) | 15:55 |
bauzas | but we stick with the plan you wrote | 15:55 |
gibi | bauzas: yeah, there was no way to fit everything | 15:55 |
bauzas | FWIW, I wish I could have had time for this with VGPU support... | 15:55 |
gibi | bauzas: let's try that again in U | 15:56 |
stephenfin | melwitt: ta for https://review.opendev.org/681374 (y) | 15:56 |
bauzas | every cycle, I make myself a promise | 15:56 |
bauzas | I stopped doing it | 15:56 |
luyao | stephenfin: reminder for vpmems series. :) | 15:57 |
stephenfin | luyao: https://review.opendev.org/#/c/678452/32 :) | 15:57 |
bauzas | stephenfin: I'm balancing in my head the need for defaulting disable_fallback_pcpu_query to True | 15:58 |
melwitt | stephenfin: np, good call on keeping the quota tests with it and moving the other unrelated func test to the next patch | 15:58 |
stephenfin | bauzas: It's the same discussion we had with the previous manual configuration option | 15:58 |
bauzas | yeah I know | 15:58 |
stephenfin | melwitt: Yeah, thank mriedem for that (he complained (fairly) that the next one was way too big) | 15:59 |
luyao | stephenfin: thanks, I hadn't noticed that. :) | 15:59 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Follow up for Ib50b6b02208f5bd2972de8a6f8f685c19745514c https://review.opendev.org/681490 | 15:59 |
*** ttsiouts has quit IRC | 16:01 | |
gibi | bauzas, mriedem: could you re-apply your love there? ^^ | 16:01 |
bauzas | gibi: a safe love but yeah | 16:02 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Skip querying resource request in revert_resize if no qos port https://review.opendev.org/681513 | 16:02 |
*** ttsiouts has joined #openstack-nova | 16:02 | |
gibi | bauzas: :) | 16:02 |
*** dave-mccowan has joined #openstack-nova | 16:02 | |
*** brault has quit IRC | 16:03 | |
*** ricolin has quit IRC | 16:03 | |
*** TxGirlGeek has joined #openstack-nova | 16:03 | |
mriedem | consider my love applied | 16:03 |
*** elod has quit IRC | 16:04 | |
*** ricolin has joined #openstack-nova | 16:04 | |
gibi | mriedem: thanks | 16:04 |
artom | Eww | 16:04 |
mriedem | hey, | 16:05 |
*** ricolin has quit IRC | 16:05 | |
mriedem | nothing eww about a grown man loving up on some patches | 16:05 |
*** elod has joined #openstack-nova | 16:05 | |
*** ttsiouts has quit IRC | 16:06 | |
bauzas | stephenfin: https://review.opendev.org/#/c/671801/48/releasenotes/notes/cpu-resources-d4e6a0c12681fa87.yaml@35 worth considering a nova-status upgrade check that would verify before you upgrade to Ussuri that you no longer require the workaround | 16:06 |
melwitt | mnaser: fyi os-vif 1.15.2 has been released https://review.opendev.org/681358 | 16:06 |
bauzas | anyway, dropping now, but I'll be back later (doing a bit of trail running) | 16:07 |
stephenfin | bauzas: Yup, good call. That's what we did for the consoleauth workaround | 16:07 |
* bauzas back in a few hours | 16:08 | |
stephenfin | dansmith: Can we squash migrations in the future? | 16:08 |
dansmith | stephenfin: what migrations? | 16:08 |
stephenfin | i.e. "INFO migrate.versioning.api [-] 254 -> 255..." | 16:08 |
stephenfin | our DB migrations | 16:08 |
dansmith | we've compacted them before, but not in a long time, as you know | 16:09 |
dansmith | so.. yes? | 16:09 |
*** dtantsur is now known as dtantsur|afk | 16:09 | |
stephenfin | I did not know that | 16:09 |
stephenfin | or rather, I'd forgotten it | 16:09 |
* stephenfin notes to investigate that for ones from Queens or so and back in Ussuri (Ussuri?) | 16:10 | |
*** artom has quit IRC | 16:10 | |
mriedem | stephenfin: my recollection with that is sdague heard from a bunch of operators at one point that they actually didn't like the compacting because it f'ed up things like FFUs | 16:12 |
mriedem | this was long before FFU was even a term | 16:12 |
mriedem | so "skip level upgrades" at the time | 16:12 |
dansmith | yeah it's a giant pain | 16:12 |
mriedem | so we stopped doing it, | 16:12 |
mriedem | and, | 16:12 |
stephenfin | understandable. That's why I was thinking Queens | 16:12 |
melwitt | mriedem: I'm +1 on the top patch of brinzhang's set and I see you are +1 on the middle patch. I'm wondering if I should fix up the bottom patch since I doubt brin is around the rest of today | 16:13 |
*** ociuhandu has quit IRC | 16:13 | |
stephenfin | rather than Train or something more recent | 16:13 |
mriedem | the amount of new schema migrations we do compared to the old days is nowhere close | 16:13 |
dansmith | also true | 16:13 |
dansmith | we chewed through a ton early on, but it's rare today | 16:13 |
stephenfin | I bring it up because I'm deploying on a machine locally and they're taking forever | 16:13 |
mriedem | dan prince had a schema diff tool thing since he was the one that used to do it but i'm not sure if that works anymore | 16:13 |
stephenfin | but that machine is using a HDD so that could be a limiting factor | 16:14 |
mriedem | melwitt: i've got an appointment this afternoon and haven't been back on the api change itself yet, so i'm not sure i'm going to be able to get that done today, | 16:14 |
mriedem | melwitt: you could fix up the bottom change though yeah it's just simple testing | 16:15 |
*** jmlowe has quit IRC | 16:15 | |
mriedem | i also have a patch i've been meaning to update and get into placement this release that i need to spend some time on | 16:15 |
*** ccamacho has quit IRC | 16:16 | |
melwitt | mriedem: ack. I'll do it just in case but won't expect you'll be able to be back to it today | 16:16 |
* stephenfin also notes to create empty migrations once we cut stable/train | 16:16 | |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] cpu pinning testing https://review.opendev.org/681807 | 16:17 |
sean-k-mooney | ok i think ^ is correct | 16:19 |
sean-k-mooney | stephenfin: can you check https://review.opendev.org/#/c/681807/1/playbooks/nfv/pinning.yaml | 16:19 |
stephenfin | sure | 16:19 |
sean-k-mooney | those flavors are correct right? | 16:19 |
sean-k-mooney | i need to start making the other 2 versions: vcpu_pin_set only, where i will have to drop the explicit resource request, and cpu_dedicated_set only, which should just work | 16:20 |
sean-k-mooney | e.g. no flavor changes needed | 16:21 |
stephenfin | sean-k-mooney: the first one has vcpus=1 but '--property hw:cpu_threads=2' | 16:21 |
stephenfin | that'll conflict, surely | 16:21 |
sean-k-mooney | actually ya it might | 16:21 |
sean-k-mooney | it previously had 2 vcpus | 16:21 |
sean-k-mooney | ill fix that | 16:21 |
sean-k-mooney | anything else | 16:21 |
stephenfin | Personally I'd just drop the CPU topology stuff since it's not relevant to this | 16:21 |
stephenfin | looking | 16:21 |
sean-k-mooney | i just did | 16:22 |
stephenfin | '--property hw:cpu_policy=dedicated --property resources:PCPU=2' | 16:22 |
*** brault has joined #openstack-nova | 16:22 | |
stephenfin | that'll fail | 16:22 |
sean-k-mooney | that should not | 16:22 |
stephenfin | you've to do one or the other. We enforce that at the API | 16:22 |
sean-k-mooney | oh ok | 16:22 |
sean-k-mooney | but wait no that does not make sense | 16:22 |
*** brault has quit IRC | 16:22 | |
sean-k-mooney | if you just did --property resources:PCPU=2 | 16:23 |
sean-k-mooney | then you would not get cpu pinning | 16:23 |
sean-k-mooney | so why would we ever allow that | 16:23 |
stephenfin | you will - PCPU == a pinned CPU | 16:23 |
sean-k-mooney | we should not allow that | 16:23 |
stephenfin | 'hw:cpu_policy=dedicated' is syntactic sugar for 'resources:PCPU=$(flavor.vcpus)' | 16:23 |
sean-k-mooney | didn't we say we would only do pinning if you had dedicated | 16:23 |
stephenfin | A PCPU is a resource for dedicated CPUs | 16:24 |
sean-k-mooney | yes but we don't want to support people using --property resources:PCPU=2 long term | 16:24 |
dansmith | sean-k-mooney: why? | 16:24 |
dansmith | that's exactly what I want | 16:24 |
sean-k-mooney | because you have to change your flavors if we change how we model it in placement | 16:24 |
dansmith | I'm fine with that | 16:24 |
sean-k-mooney | also it will break if you enable multiple numa nodes in the image | 16:24 |
stephenfin | break how? | 16:25 |
stephenfin | or when? | 16:25 |
sean-k-mooney | when numa is modelled in placement. it will add the request to the unnumbered group | 16:25 |
sean-k-mooney | but if you had hw_numa_nodes=2 in the image and resources:PCPU=2 in the flavor it would fail | 16:26 |
sean-k-mooney | you would have to use the numbered resource request syntax | 16:26 |
sean-k-mooney | in the flavor instead | 16:26 |
stephenfin | in a future where NUMA is in placement, yes, you would need to use a different syntax | 16:27 |
sean-k-mooney | so we don't want to encourage resources:PCPU as it leaks implementation details of placement via the nova api | 16:27 |
stephenfin | but that's the same for requesting resources:VCPU if you use NUMA without placement | 16:27 |
stephenfin | and we already support that | 16:27 |
stephenfin | (requesting resources:VCPU) | 16:27 |
sean-k-mooney | yes that is also bad | 16:27 |
stephenfin | mainly for ironic but anyone can use it | 16:27 |
sean-k-mooney | i get why this raw syntax exists | 16:27 |
sean-k-mooney | but its fragile | 16:27 |
sean-k-mooney | and leaking placement detail via nova api | 16:28 |
stephenfin | that's a fair opinion | 16:28 |
stephenfin | but going back to the original point | 16:28 |
stephenfin | "--property hw:cpu_policy=dedicated --property resources:PCPU=2" is a no-no | 16:28 |
stephenfin | as is "--property resources:PCPU=2 --property hw:cpu_thread_policy=prefer" | 16:29 |
sean-k-mooney | i expect that to work if the PCPUs match the flavor.vcpus | 16:29 |
melwitt | fwiw when we have unified limits users are going to know all about placement resources, that's how they set limits | 16:29 |
stephenfin | you want either | 16:29 |
dansmith | which is a good thing, IMHO | 16:29 |
sean-k-mooney | stephenfin: that should definitely work | 16:29 |
melwitt | yeah, just saying | 16:29 |
sean-k-mooney | stephenfin: prefer is the most relaxed policy | 16:29 |
stephenfin | it's not the prefer bit that's the issue | 16:30 |
*** DinaBelova has quit IRC | 16:30 | |
sean-k-mooney | if you say nothing its a stricter requirement | 16:30 |
stephenfin | it's the fact that you're mixing the old way of doing this and the new way | 16:30 |
sean-k-mooney | this is not what i understood from the spec | 16:30 |
stephenfin | I want to be very clear and say you either do things with resources:PCPU and traits:HW_CPU_HYPERTHREADING | 16:30 |
stephenfin | or with hw:cpu_policy and hw:cpu_thread_policy | 16:31 |
mriedem | we should probably start an etherpad for post-FF release todos huh....like documenting PCPUs | 16:31 |
sean-k-mooney | i'll change the test to work but i think this is a problem | 16:31 |
dansmith | so, | 16:32 |
dansmith | we're not sure how to use it | 16:32 |
dansmith | and there are some concerns about how to use it | 16:32 |
dansmith | so we should merge it, figure it out later and then document? :) | 16:32 |
stephenfin | that's not very fair | 16:32 |
mriedem | i've started https://etherpad.openstack.org/p/nova-train-release-todo | 16:32 |
*** udesale has quit IRC | 16:32 | |
stephenfin | I don't know how to use BW-aware scheduling | 16:32 |
dansmith | stephenfin: I think you know how to use it, which is why I said we :) | 16:33 |
mriedem | it's documented | 16:33 |
stephenfin | I'm not blocking that because I don't personally understand it | 16:33 |
mriedem | the bw stuff was documented in stein | 16:33 |
stephenfin | So's this. There's a not insignificant spec for the thing | 16:33 |
stephenfin | and I don't think me not documenting things is a concern | 16:34 |
*** dave-mccowan has quit IRC | 16:34 | |
sean-k-mooney | when i get the jobs running ill review the api checks | 16:34 |
dansmith | stephenfin: sean-k-mooney is one of the people you included in the "are testing it manually" crew, so I think being a little concerned is not unreasonable | 16:34 |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] cpu pinning testing https://review.opendev.org/681807 | 16:37 |
sean-k-mooney | stephenfin: is that more to your liking? | 16:38 |
stephenfin | perfect | 16:38 |
sean-k-mooney | i think the second flavor should result in no cpu pinning and an api error personally, but that should work as you suggest | 16:38 |
*** DinaBelova has joined #openstack-nova | 16:39 | |
*** gbarros has joined #openstack-nova | 16:43 | |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] test with dedicated cpus only https://review.opendev.org/681827 | 16:43 |
stephenfin | sean-k-mooney: I'll try not to rathole on it, but what do you think requesting 'resources:PCPU=N' should imply, out of curiosity? | 16:45 |
stephenfin | It sounds like your objections to that would apply equally to 'resources:VCPU=N' | 16:46 |
*** igordc has joined #openstack-nova | 16:46 | |
sean-k-mooney | it should be an error if hw:cpu_policy is not set to dedicated, and if it is set to dedicated it should be compared to flavor.vcpus | 16:46 |
sean-k-mooney | i don't think operators should use either | 16:47 |
stephenfin | but I was told we wanted to get away from those request specs to the more generic 'resources' syntax | 16:47 |
sean-k-mooney | by who | 16:47 |
stephenfin | jaypipes, efried, dansmith (above) | 16:47 |
sean-k-mooney | because i remember talking about this with alex_xu and efried_afk | 16:47 |
* dansmith owns it | 16:47 | |
sean-k-mooney | i had thought that i convinced both efried_afk and alex_xu that we should prefer the abstract form | 16:48 |
sean-k-mooney | e.g. hw:cpu_policy | 16:48 |
stephenfin | prefer, yes, but not limit to | 16:49 |
sean-k-mooney | i have not spoken to dansmith or jaypipes about it | 16:49 |
stephenfin | hw:cpu_policy is syntactic sugar | 16:49 |
sean-k-mooney | i don't think it should be just syntactic sugar | 16:49 |
mriedem | efried_afk: in case you didn't see i've created https://etherpad.openstack.org/p/nova-train-release-todo and added it to the meeting agenda | 16:49 |
sean-k-mooney | i was pretty sure we agreed in the review to explicitly not support resources:PCPU | 16:49 |
sean-k-mooney | this came up in the pmem series too | 16:50 |
sean-k-mooney | we are not inferring pmem usage from pmem resources:... extra specs | 16:51 |
openstackgerrit | melanie witt proposed openstack/nova master: Add user_id and project_id column to Migration https://review.opendev.org/673990 | 16:53 |
openstackgerrit | melanie witt proposed openstack/nova master: Set user_id/project_id from context when creating a Migration https://review.opendev.org/679413 | 16:53 |
openstackgerrit | melanie witt proposed openstack/nova master: Filter migrations by user_id/project_id https://review.opendev.org/674243 | 16:53 |
*** efried_afk is now known as efried | 16:56 | |
*** weshay_passport is now known as weshay | 16:57 | |
sean-k-mooney | anyway i have more or less said my piece, but the final point i want to make is that if you use resources:* and that resource moves in placement, you will either have to resize to a flavor without it before upgrading, or we will have to do an online data migration of all instance embedded flavors; otherwise the vm will not be able to cold/live migrate after an upgrade from one version of nova to another | 16:58 |
sean-k-mooney | well or a resize to the new flavor with the updated extra specs | 16:58 |
sean-k-mooney | reflecting the topology change | 16:58 |
stephenfin | aye, that'll be the same for vGPU too, unfortunately | 16:59 |
stephenfin | I'd rather like to provide a flavor migration tool in the future | 16:59 |
stephenfin | Migrate all flavors, including embedded ones, as a way of deprecating old extra specs | 16:59 |
stephenfin | but I haven't really thought it through | 16:59 |
sean-k-mooney | we could but it's still not a good thing to encourage people to use | 16:59 |
mriedem | melwitt: i'm +2 on those bottom 2 changes now | 16:59 |
stephenfin | We'll do it in U, eh :) | 16:59 |
mriedem | if the api change doesn't make train at least the data will start flowing | 17:00 |
sean-k-mooney | i hope not | 17:00 |
sean-k-mooney | i don't think we should be deprecating those extra specs | 17:00 |
mriedem | stephenfin: heh https://review.opendev.org/#/c/637217/ | 17:00 |
*** derekh has quit IRC | 17:00 | |
mriedem | "This code can be removed in Queens" | 17:01 |
melwitt | mriedem: ack thanks | 17:01 |
sean-k-mooney | anyway that's not something we are doing today | 17:02 |
stephenfin | \o/ | 17:02 |
* stephenfin points to the removal of cells v1, consoleauth and ec2 crud as proof he's serious about burning down tech debt | 17:02 | |
stephenfin | I have the nova-network series ready to go once U opens up | 17:03 |
sean-k-mooney | stephenfin: what you currently have is in line with the spec https://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html#example-flavor-configurations so i guess that is what we are stuck with. | 17:05 |
stephenfin | yeah, I haven't pulled the rug out from anyone here | 17:06 |
stephenfin | jaypipes wanted the resources-style syntax | 17:06 |
stephenfin | you wanted the hw:cpu_policy-style | 17:06 |
sean-k-mooney | yes we have both. i would have hoped we could use both together if they don't conflict | 17:07 |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] legacy vcpu_pin_set pinning with shared emulator threads https://review.opendev.org/681840 | 17:08 |
*** hamzy_ has quit IRC | 17:09 | |
*** hamzy_ has joined #openstack-nova | 17:11 | |
*** ociuhandu has joined #openstack-nova | 17:11 | |
*** artom has joined #openstack-nova | 17:13 | |
sean-k-mooney | https://review.opendev.org/#/c/681773/ is now merged so FN should be back in rotation so ill also kick off the multi numa migration run | 17:13 |
sean-k-mooney | with stephens changes | 17:13 |
sean-k-mooney | e.g. https://review.opendev.org/#/c/681771/1 | 17:13 |
*** ociuhandu has quit IRC | 17:16 | |
sean-k-mooney | i seem to be getting node failures from the vexxhost labels, so if those all fail i'll swap it over to FN since that seems to be working properly again. | 17:20 |
*** ralonsoh has quit IRC | 17:20 | |
*** awalende has quit IRC | 17:22 | |
*** awalende has joined #openstack-nova | 17:23 | |
*** nweinber__ has quit IRC | 17:23 | |
*** awalende has quit IRC | 17:27 | |
*** hemna has joined #openstack-nova | 17:27 | |
*** brault has joined #openstack-nova | 17:30 | |
*** ivve has joined #openstack-nova | 17:31 | |
*** xek has joined #openstack-nova | 17:34 | |
*** brault has quit IRC | 17:34 | |
*** itlinux has joined #openstack-nova | 17:35 | |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] cpu pinning testing https://review.opendev.org/681807 | 17:39 |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] test with dedicated cpus only https://review.opendev.org/681827 | 17:39 |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] legacy vcpu_pin_set pinning with shared emulator threads https://review.opendev.org/681840 | 17:39 |
sean-k-mooney | ok they are all queued to run on FN now | 17:41 |
*** priteau has quit IRC | 17:42 | |
*** nweinber__ has joined #openstack-nova | 17:43 | |
*** markvoelker has quit IRC | 17:45 | |
efried | stephenfin: I think I'm caught up since nap. Going to look at https://review.opendev.org/#/c/681750/ (additional func tests) now. Anything else I need to hit? | 17:46 |
*** markvoelker has joined #openstack-nova | 17:46 | |
stephenfin | efried: I don't think so | 17:46 |
efried | stephenfin: you still running up the vpmem series? | 17:46 |
sean-k-mooney | we should get the results from FN in about an hour by the way | 17:47 |
efried | noyce | 17:47 |
stephenfin | I'm working through manual tests at the moment but once that's done, I'm probably going to string together NUMA-aware live migration, vPMEM, and PCPU into one giant chain | 17:47 |
sean-k-mooney | the results for combining the numa migration series with PCPU will report back first, then the 3 topologies for the PCPU series only | 17:47 |
stephenfin | on account of the few (trivial, but not auto-resolvable) merge conflicts between the three | 17:47 |
sean-k-mooney | i'm going to go eat something. be back in a while | 17:48 |
*** bbowen__ has quit IRC | 17:55 | |
*** boxiang has quit IRC | 17:55 | |
*** boxiang has joined #openstack-nova | 17:56 | |
*** pcaruana has joined #openstack-nova | 17:58 | |
sean-k-mooney | stephenfin: can you wait till http://zuul.openstack.org/stream/c1f6ccc551c649bab6f60911fd3485ff?logfile=console.log completes before you push that chain | 17:59 |
*** brault has joined #openstack-nova | 17:59 | |
stephenfin | sean-k-mooney: sure, I won't be doing it for a while | 18:00 |
sean-k-mooney | ok it just started running tempest tests | 18:00 |
*** hemna has quit IRC | 18:00 | |
sean-k-mooney | that is numa migration + pcpu in placement | 18:01 |
*** oomichi_ has joined #openstack-nova | 18:01 | |
sean-k-mooney | it would be nice to wait for the 3 other jobs too, but if that one passes that's a good start | 18:01 |
sean-k-mooney | oh the first live migration test passed | 18:01 |
efried | \o/ | 18:04 |
*** brault has quit IRC | 18:04 | |
sean-k-mooney | one of the tests failed so far but the rest are all passing. any idea what http://paste.openstack.org/show/775434/ is | 18:06 |
*** lbragstad has quit IRC | 18:07 | |
sean-k-mooney | it looks like a delete failed but we will need the full logs to know why | 18:07 |
sean-k-mooney | it might be unrelated | 18:07 |
aspiers | efried: I fixed the glance metadata and it makes SEV quite nice to set up from Horizon https://review.opendev.org/#/c/681866/ | 18:12 |
sean-k-mooney | did you also update the useful image properties doc? apparently that is a thing you are meant to do | 18:13 |
aspiers | sean-k-mooney: click the link ;-) | 18:13 |
*** dolpher has quit IRC | 18:13 | |
sean-k-mooney | yep just did and you did + added the release note | 18:13 |
aspiers | sean-k-mooney: yes I extended your release note | 18:13 |
sean-k-mooney | so that should be all in order | 18:13 |
sean-k-mooney | actually i extended someone else's | 18:14 |
aspiers | and I used git blame to find a good commit to copy and it was yours ;) | 18:14 |
sean-k-mooney | they use 1 release note per release apparently for all the metadefs | 18:14 |
aspiers | weird | 18:14 |
aspiers | but better to stay consistent | 18:14 |
aspiers | efried, sean-k-mooney: look how pretty it is! https://photos.app.goo.gl/MTuTzPnz165bVVaC6 | 18:15 |
sean-k-mooney | if it works for them. why not. the only issue would be backporting could be weird but you would not backport them anyway | 18:15 |
sean-k-mooney | ha you with your suse theming | 18:15 |
aspiers | :-D | 18:16 |
aspiers | believe it or not that is devstack | 18:16 |
sean-k-mooney | it says suse openstack cloud | 18:16 |
aspiers | yup | 18:16 |
aspiers | it's still devstack ;) | 18:16 |
aspiers | with our theme | 18:16 |
sean-k-mooney | oh you have enabled the plugin | 18:16 |
*** hamzy_ has quit IRC | 18:16 | |
sean-k-mooney | ya | 18:16 |
sean-k-mooney | it does look nice | 18:16 |
sean-k-mooney | aspiers: also it's nice to have the docs programmatically available | 18:17 |
*** hamzy_ has joined #openstack-nova | 18:17 | |
aspiers | yes | 18:17 |
sean-k-mooney | you could in principle hit the metadef api for the cli or heat too | 18:17 |
aspiers | I didn't really know it worked this way until today | 18:17 |
sean-k-mooney | heat does actually use it | 18:17 |
aspiers | Yeah I guessed it would | 18:17 |
sean-k-mooney | so the heat ui for flavors will pull that info too | 18:18 |
aspiers | exactly | 18:18 |
sean-k-mooney | i have a todo to go through all the image properties we support in nova and figure out where the gaps are | 18:19 |
sean-k-mooney | i would like to get them all up to date | 18:19 |
stephenfin | so I have never done boot from volume before | 18:19 |
stephenfin | and our docs don't work | 18:19 |
stephenfin | hurrah! | 18:19 |
stephenfin | :D | 18:19 |
sean-k-mooney | do it from horizon | 18:19 |
stephenfin | looking at https://docs.openstack.org/nova/latest/user/launch-instance-from-volume.html | 18:19 |
sean-k-mooney | it's the default if you have cinder installed | 18:19 |
stephenfin | I got it working with novaclient | 18:19 |
stephenfin | thanks to a bug which we closed as not having enough info /o\ https://bugzilla.redhat.com/show_bug.cgi?id=1505465 | 18:20 |
openstack | bugzilla.redhat.com bug 1505465 in python-openstackclient "openstack server create can't parse --block-device option source=blank,dest=volume" [Medium,Closed: insufficient_data] - Assigned to jpichon | 18:20 |
stephenfin | i need me a whiteboard | 18:20 |
sean-k-mooney | well did you try https://docs.openstack.org/nova/latest/user/launch-instance-from-volume.html#create-volume-from-image-and-boot-instance | 18:20 |
stephenfin | I did. See step 3? '--block-device' doesn't exist | 18:21 |
stephenfin | that's the syntax from the novaclient command | 18:21 |
stephenfin | Another lossy conversion from novaclient to openstackclient, I suspect | 18:21 |
stephenfin | I hope I didn't review that, because I did a terrible job if so | 18:21 |
stephenfin | Doesn't matter though. I have live migration working between the two baremetal nodes _finally_ (only took three hours). Time to start manual testing | 18:21 |
sean-k-mooney | mriedem would know, but i think we wanted to add some syntactic sugar to make this better | 18:22 |
sean-k-mooney | stephenfin: ^ | 18:22 |
stephenfin | though it sounds like you'll have most of it done before I even get started. Hurrah for automated testing | 18:22 |
stephenfin | Well, docs that are accurate would be a great start | 18:22 |
sean-k-mooney | as i said i use horizon for this because, well, the docs always sucked | 18:23 |
*** IvensZambrano has quit IRC | 18:23 | |
sean-k-mooney | for "boot from volume from image" anyway | 18:23 |
dtroyer | stephenfin: osc4 was released yesterday and has the block device fixes we did over the summer | 18:23 |
dtroyer | by 'we' I mean mriedem and others | 18:24 |
sean-k-mooney | i'm guessing we updated the wrong docs | 18:24 |
stephenfin | dtroyer: o rly? If you can point me to the tl;dr: for that, I can incorporate it in the docs rework I'll do post feature freeze | 18:24 |
sean-k-mooney | dtroyer: ye/they added a cleaner syntax that hides most of the detail right | 18:24 |
mriedem | https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/server.html#server-create | 18:25 |
mriedem | [--boot-from-volume <volume-size>] is new | 18:25 |
dtroyer | yes, was an entire forum session in Denver | 18:25 |
sean-k-mooney | dtroyer: yep i was there which is why i remember this was a thing | 18:25 |
mriedem | and --block-device-mapping allows type=image | 18:25 |
stephenfin | I personally tried various combinations of --image, --volume, --block-device and --block-device-mapping before giving up and falling back to novaclient /o\ | 18:25 |
mriedem | stephenfin: https://review.opendev.org/#/q/topic:story/2006302+(status:open+OR+status:merged) | 18:26 |
sean-k-mooney | mriedem: can you do --boot-from-volume 100G --image <my image> now? | 18:26 |
mriedem | eff yeah you can | 18:26 |
sean-k-mooney | i was hoping it would end up being something like ^ | 18:26 |
sean-k-mooney | awesome | 18:27 |
efried | aspiers: what's causing it to be spelled "flavours"? | 18:27 |
efried | that should be fixed | 18:27 |
aspiers | efried: my locale | 18:27 |
stephenfin | I guess what I wanted (not now) was the equivalent of 'nova boot --block-device source=image,id=<image ID>,dest=volume,size=10,shutdown=preserve,bootindex=0 ...' | 18:27 |
aspiers | and no, it should be fixed whenever it's spelt "flavors" :-p | 18:28 |
sean-k-mooney | efried: that is the correct spelling | 18:28 |
efried | 'flavor' should be a token | 18:28 |
mriedem | stephenfin: there is a thing like that | 18:28 |
mriedem | not the exact same syntax | 18:28 |
*** hemna has joined #openstack-nova | 18:28 | |
aspiers | efried: en_GB != en_US | 18:28 |
stephenfin | it sounds like with OSC 4 that's 'openstack server create --boot-from-volume 10G --image <image ID> ...' | 18:28 |
mriedem | stephenfin: yeah | 18:28 |
efried | aspiers, sean-k-mooney: I'm not disputing that the word "flavor" is spelled "flavour". This is not a word. It's the name of a thing. You can't issue a command with "openstack flavour ..." | 18:28 |
mriedem | maybe the osc functional test makes usage more clear https://review.opendev.org/#/c/674111/5/openstackclient/tests/functional/compute/v2/test_server.py | 18:28 |
aspiers | efried: which instance of "flavour" are you referring to? | 18:29 |
dtroyer | efried: /me makes a note for a special build… | 18:29 |
stephenfin | It's on the list of doc fixes I want to get to post feature freeze | 18:29 |
sean-k-mooney | efried: oh if it's giving a cli example ya the US spelling needs to be used | 18:29 |
efried | I'm talking about this: https://photos.google.com/share/AF1QipOckLfsOWRShI0bwvHL-PGoVzhxwWM376UgImt2h1flG6AXslZmqiMvpF5V3EmxzA?key=MUNMbkVQVDMwbnlmQkswOVZLZldzRThyYV9BaGt3 | 18:29 |
efried | the title "Update Flavour Metadata" | 18:29 |
efried | You're not updating the metadata for a flavour. | 18:30 |
efried | You're updating the metadata for a flavor | 18:30 |
efried | no matter where you're sitting. | 18:30 |
aspiers | that's just Horizon's translation | 18:30 |
aspiers | imagine if it was Chinese | 18:30 |
efried | swhy I'm asking how that is being "translated". It shouldn't be. | 18:30 |
sean-k-mooney | ya from a ui point of view it's fine, although the flavor quota is not updated | 18:30 |
stephenfin | sounds like the great configuration drive vs. config drive war of early 2019 | 18:30 |
aspiers | it would look ridiculous if there was some Mandarin surrounding the word "Flavor" | 18:31 |
donnyd | dtroyer: can you just alias that in osc | 18:31 |
aspiers | efried: in fact, it would be ridiculous in any language. You can't have titles which are only partially translated | 18:31 |
efried | whoah | 18:31 |
efried | of course you can. | 18:31 |
* donnyd fans flames | 18:31 | |
donnyd | LOL | 18:31 |
aspiers | OK, agree to disagree I guess :) | 18:32 |
efried | should you have 新星 instead of "nova" in a chinese title??? | 18:32 |
aspiers | Quite possibly, but that's different | 18:32 |
* stephenfin suddenly feels hungry | 18:32 | |
artom | загрузка с volume | 18:32 |
aspiers | "nova" is a name which has no pre-existing semantics relevant to OpenStack | 18:32 |
aspiers | "flavor" is a word which is being used in a way which aligns with its normal every-day meaning | 18:33 |
aspiers | so they are totally different | 18:33 |
aspiers | If OpenStack used the word "purple" instead of "flavor" then I might agree with you. But it doesn't. | 18:33 |
sean-k-mooney | efried: i used to work in a localisation research center in my university before i started on openstack | 18:33 |
aspiers | efried: But you are free to report a bug in Horizon ;-) | 18:34 |
sean-k-mooney | they would have said yes you should translate it except where giving examples of commandline/tools or api requests | 18:34 |
aspiers | Yeah I'd agree with that | 18:34 |
efried | okay, nova is clearly a terrible example. "cinder" is also totally aligned with its normal every-day meaning, so we should translate that one. And "glance". | 18:34 |
aspiers | Huh :) | 18:34 |
aspiers | How are cinder and glance aligned? | 18:35 |
efried | you know, cinder for block storage, glance for images. | 18:35 |
aspiers | Yeah I know but that's not alignment. That's a couple of obscure puns. | 18:35 |
efried | "flavor" is just as far away from what the thing is in openstack as "cinder" and "glance". | 18:35 |
aspiers | Everyone knows what "flavor" means, without needing a pun to be explained. | 18:35 |
efried | wow | 18:36 |
aspiers | It's not | 18:36 |
sean-k-mooney | efried: your saving grace is the fact that you are not meant to translate proper nouns like names, trademarks, products | 18:36 |
artom | Flavors are usually in my mouth | 18:36 |
artom | Not in my cloud :) | 18:36 |
sean-k-mooney | so if you want to win the argument say that flavor is the proper name of a resource type | 18:36 |
aspiers | https://www.dictionary.com/browse/flavor | 18:36 |
aspiers | "3. the characteristic quality of a thing:" | 18:36 |
efried | Now, if we wanted to capitalize Nova, Glance, and Cinder I would *totally* agree they're different. | 18:36 |
aspiers | That's exactly what nova flavors are | 18:37 |
sean-k-mooney | you know the code often also spells flavor instance-type | 18:37 |
artom | "broken" is not a flavor ;) | 18:37 |
aspiers | "4. a particular quality noticeable in a thing:" | 18:37 |
sean-k-mooney | anyway https://20aa241df59e72844ae9-597ff148d0ea9164d11e7cb764cf9b04.ssl.cf5.rackcdn.com/681771/1/experimental/nova-nfv-multi-numa-multinode/c1f6ccc/testr_results.html.gz | 18:38 |
aspiers | Both of those align with nova's usage | 18:38 |
sean-k-mooney | it looks like almost all the tests passed | 18:38 |
aspiers | Flavors are qualities of VMs | 18:38 |
efried | or "instances" | 18:38 |
aspiers | Anyway, dinner time | 18:38 |
sean-k-mooney | for the combined numa migration + pcpus tests | 18:38 |
*** gbarros has quit IRC | 18:38 | |
artom | sean-k-mooney, "almost" makes me nervous | 18:38 |
sean-k-mooney | the ones that failed were not live migration | 18:38 |
sean-k-mooney | those all passed | 18:39 |
sean-k-mooney | your code is fine | 18:39 |
efried | I suspect it would confuse the hell out of Chinese operators if "instance" was 例子 everywhere. | 18:39 |
artom | sean-k-mooney, well ok, but is stephenfin's code fine? | 18:39 |
sean-k-mooney | it finished 30 seconds ago so i'm checking | 18:39 |
*** mriedem is now known as mriedem_afk | 18:39 | |
mriedem_afk | i'll be back before the meeting | 18:39 |
aspiers | efried: I have no idea, but anyway none of this is related to any of my patches, so I'm washing my hands ;-) | 18:40 |
sean-k-mooney | one of the 3 failures was failing to delete a vm | 18:40 |
efried | Yeah, even at the start of this I was just being pedantic, not arguing for actual change | 18:40 |
aspiers | I would have thought that the i18n team would have received enough complaints by now if it wasn't right | 18:40 |
aspiers | Unless Horizon changed something recently, I dunno | 18:40 |
*** jmlowe has joined #openstack-nova | 18:40 | |
mriedem_afk | the one guy at a university trying to stand up nova + xen just called out in the ML | 18:41 |
aspiers | I never noticed it with the British spelling before, but maybe it was always there | 18:41 |
efried | timeouts | 18:41 |
sean-k-mooney | ok 1 was failure to delete an instance and 2 failures were trying to migrate to the same host | 18:41 |
sean-k-mooney | artom: ^ | 18:41 |
artom | sean-k-mooney, which review is that from? | 18:41 |
sean-k-mooney | https://zuul.opendev.org/t/openstack/build/c1f6ccc551c649bab6f60911fd3485ff/log/controller/logs/screen-n-cpu.txt.gz?severity=4 | 18:42 |
sean-k-mooney | ah let me get the link | 18:42 |
sean-k-mooney | https://review.opendev.org/#/c/681771/1 | 18:42 |
dansmith | there's some failed db lookups | 18:43 |
dansmith | like upcalls or something? I don't even know | 18:43 |
dansmith | Sep 12 18:10:31.308872 ubuntu-bionic-expanded-fortnebula-regionone-0011217193 nova-compute[14200]: ERROR nova.compute.manager [instance: 44f764db-e0ef-492a-8ac4-c9b110037b7d] oslo_messaging.rpc.client.RemoteError: Remote error: CantStartEngineError No sql_connection parameter is established | 18:43 |
dansmith | Sep 12 18:10:31.376947 ubuntu-bionic-expanded-fortnebula-regionone-0011217193 nova-compute[14200]: INFO nova.compute.manager [None req-a4e37fd4-9b29-472f-82f8-4cc5d6ba78cd tempest-TestNetworkAdvancedServerOps-353388395 tempest-TestNetworkAdvancedServerOps-353388395] [instance: 44f764db-e0ef-492a-8ac4-c9b110037b7d] Setting instance back to active after: Instance rollback performed due to: Unable to migrate instance | 18:44 |
dansmith | (44f764db-e0ef-492a-8ac4-c9b110037b7d) to current host (ubuntu-bionic-expanded-fortnebula-regionone-0011217193). | 18:44 |
sean-k-mooney | ya i think i have seen those before | 18:44 |
sean-k-mooney | but i dont know where | 18:44 |
artom | Wasn't that happening when you were testing just my patches? | 18:44 |
sean-k-mooney | perhaps | 18:45 |
sean-k-mooney | we should have the ci logs to check | 18:45 |
artom | Yeah, trying to find them | 18:45 |
sean-k-mooney | i think i have seen that on master before too | 18:45 |
dansmith | it's trying to do a reschedule, | 18:45 |
sean-k-mooney | logstash should tell us i guess | 18:45 |
dansmith | but I'm guessing that means we failed to do something we should have been able to do, | 18:46 |
dansmith | like claiming pcpus again or something | 18:46 |
sean-k-mooney | so we have this error https://zuul.opendev.org/t/openstack/build/c1f6ccc551c649bab6f60911fd3485ff/log/controller/logs/screen-n-cpu.txt.gz#4825 | 18:47 |
sean-k-mooney | that is what i expect to see when we live migrate without updating things properly | 18:47 |
dansmith | the logs I saw were cold migrate I think | 18:47 |
dansmith | yep, resize | 18:48 |
sean-k-mooney | inventory data: {'VCPU': {'total': 7, 'reserved': 0, 'min_unit': 1, 'max_unit': 7, 'step_size': 1, 'allocation_ratio': 16.0}, | 18:54 |
sean-k-mooney | so that isn't working correctly right | 18:54 |
sean-k-mooney | this should be the combination of stephen's change + artom's | 18:54 |
sean-k-mooney | so i would expect that to contain PCPUS | 18:55 |
sean-k-mooney | i have other jobs that are just stephens code running too | 18:55 |
dansmith | wouldn't we fail to boot at all if there wasn't any inventory? | 18:55 |
dansmith | meaning, fail to get to the compute at all | 18:55 |
sean-k-mooney | i think it's reporting inventory of PCPU and either not translating or doing the fallback | 18:56 |
artom | I didn't touch any of the scheduler stuff in my code - I assumed it handled placement allocations correctly for a live migration | 18:57 |
artom | Maybe that part needs to be filled in for PCPUs? | 18:57 |
dansmith | inventory data: {'VCPU': {'total': 7, 'reserved': 0, 'min_unit': 1, 'max_unit': 7, 'step_size': 1, 'allocation_ratio': 16.0}, 'MEMORY_MB': {'total': 16039, 'reserved': 512, 'min_unit': 1, 'max_unit': 16039, 'step_size': 1, 'allocation_ratio': 1.5}, 'DISK_GB': {'total': 74, 'reserved': 0, 'min_unit': 1, 'max_unit': 74, 'step_size': 1, 'allocation_ratio': 1.0} | 18:58 |
dansmith | am I missing the PCPU in there? | 18:58 |
dansmith | Translating request for VCPU=2 to VCPU=0,PCPU=2 | 18:59 |
dansmith | I thought that was supposed to happen in the scheduler request, but that's in the compute log | 18:59 |
sean-k-mooney | no | 18:59 |
sean-k-mooney | so did i | 19:00 |
sean-k-mooney | could it happen in the compute on resize maybe | 19:00 |
dansmith | no, it can't call the scheduler | 19:00 |
sean-k-mooney | well its in the scheduler utils | 19:00 |
sean-k-mooney | but ya | 19:01 |
dansmith | unless that's just being logged from a util method we're calling to generate an allocation update or something | 19:01 |
sean-k-mooney | do we have placement logs | 19:01 |
*** gbarros has joined #openstack-nova | 19:02 | |
sean-k-mooney | that is a request for PCPUs right https://zuul.opendev.org/t/openstack/build/c1f6ccc551c649bab6f60911fd3485ff/log/controller/logs/screen-placement-api.txt.gz#342 | 19:02 |
*** hemna has quit IRC | 19:03 | |
sean-k-mooney | its using the fallback | 19:03 |
dansmith | that's a req from the scheduler I assume, | 19:03 |
dansmith | I'm not sure why we're seeing the translation message in the compute log tho | 19:03 |
sean-k-mooney | yes and just below it we see it request again with VCPUS | 19:04 |
sean-k-mooney | stephenfin: did i miss adding something in the nova conf | 19:05 |
stephenfin | sean-k-mooney: what did you add? | 19:05 |
sean-k-mooney | just vcpu_pin_set in this case + cpu_shared_set | 19:05 |
sean-k-mooney | but it should still get PCPU no? | 19:06 |
stephenfin | nope | 19:06 |
sean-k-mooney | oh ok | 19:06 |
dansmith | stephenfin: do you know why we're logging that translation thing in the compute log? is it from generating allocations to compare in the periodic or something? | 19:06 |
stephenfin | the host won't do reshape or report PCPU until '[compute] cpu_dedicated_set' is configured. That's our switchover | 19:06 |
sean-k-mooney | ah ok | 19:06 |
sean-k-mooney | ill update that job then | 19:07 |
stephenfin | dansmith: which one? | 19:07 |
sean-k-mooney | the others use cpu_shared_set and cpu_dedicated_set | 19:07 |
stephenfin | oh Translating request for VCPU=2 to VCPU=0,PCPU=2 | 19:07 |
dansmith | yeah | 19:07 |
stephenfin | yeah, sec | 19:07 |
dansmith | I saw that, forgot it was in the compute log and was like "cool, cool, we're ... wait a sec." :) | 19:08 |
stephenfin | so this is the code that does the translation https://review.opendev.org/#/c/671801/48/nova/scheduler/utils.py | 19:09 |
stephenfin | and the question is where do we build ResourceRequest object on the compute node | 19:09 |
stephenfin | because I didn't add that :) | 19:09 |
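A minimal sketch of what the logged message "Translating request for VCPU=2 to VCPU=0,PCPU=2" describes, checked against the inventory pasted above. It assumes the translation applies when the flavor uses a dedicated CPU policy; `translate_pinned_request` and `host_reports` are hypothetical names for illustration, not nova functions.

```python
# Inventory as pasted from the compute log above (trimmed to the fields used here).
INVENTORY = {
    "VCPU": {"total": 7, "reserved": 0, "allocation_ratio": 16.0},
    "MEMORY_MB": {"total": 16039, "reserved": 512, "allocation_ratio": 1.5},
    "DISK_GB": {"total": 74, "reserved": 0, "allocation_ratio": 1.0},
}


def translate_pinned_request(resources, cpu_policy):
    """Return a copy of the request with VCPU moved to PCPU for pinned guests.

    Assumption: only a 'dedicated' policy triggers the translation.
    """
    translated = dict(resources)
    if cpu_policy == "dedicated" and translated.get("VCPU", 0) > 0:
        translated["PCPU"] = translated.get("PCPU", 0) + translated["VCPU"]
        translated["VCPU"] = 0
    return translated


def host_reports(inventory, resource_class):
    """True if the host inventory contains a non-empty entry for the class."""
    return inventory.get(resource_class, {}).get("total", 0) > 0


request = translate_pinned_request({"VCPU": 2}, "dedicated")
print(request)                           # {'VCPU': 0, 'PCPU': 2}
print(host_reports(INVENTORY, "PCPU"))   # False
print(host_reports(INVENTORY, "VCPU"))   # True
```

With this inventory a PCPU request cannot be satisfied by the host, which is consistent with the fallback to a VCPU request seen in the placement log.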
dansmith | if we're legit logging that we should probably change that wording a little | 19:09 |
stephenfin | got it | 19:10 |
stephenfin | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/utils.py#L595-L602 | 19:10 |
sean-k-mooney | "cpu_dedicated_set: 0-6" is the correct spelling yes | 19:10 |
stephenfin | yup, and it has to be in the compute group | 19:11 |
sean-k-mooney | ya | 19:11 |
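A small sketch of the switchover rule described above: a host only reports PCPU once `cpu_dedicated_set` is set in the `[compute]` group. It uses the standard-library `configparser` rather than nova's own config machinery, and `parse_cpu_spec` and `reports_pcpu` are hypothetical helpers. The parser handles only comma-separated values and ranges, and placing `vcpu_pin_set` in `[DEFAULT]` is an assumption.

```python
import configparser

# Config resembling the first job run (legacy option only).
LEGACY_CONF = """
[DEFAULT]
vcpu_pin_set = 0-6

[compute]
cpu_shared_set = 0-1
"""

# Config after the fix discussed above.
NEW_CONF = """
[compute]
cpu_dedicated_set = 0-6
"""


def parse_cpu_spec(spec):
    """Expand a spec such as '0-2,5' into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


def reports_pcpu(conf_text):
    """True only when [compute] cpu_dedicated_set is present."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.has_option("compute", "cpu_dedicated_set")


print(reports_pcpu(LEGACY_CONF))        # False
print(reports_pcpu(NEW_CONF))           # True
print(sorted(parse_cpu_spec("0-6")))    # [0, 1, 2, 3, 4, 5, 6]
```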
stephenfin | dansmith: That's from the vCPU model selection code and it was added as a quick way to extract the traits | 19:12 |
openstackgerrit | sean mooney proposed openstack/nova master: [DNM] numa + pcpus in placment live migration tests https://review.opendev.org/681771 | 19:12 |
stephenfin | So agreed, it's misleading, and I didn't see it before because the vCPU model selection change only merged last night | 19:13 |
dansmith | s'cool, just sayin' | 19:13 |
sean-k-mooney | ok that's re-running | 19:13 |
sean-k-mooney | the 3 jobs that are just running the PCPU code should report back in the next 20 mins or so the last one with numa will take about an hour and a half | 19:14 |
stephenfin | cool :) | 19:14 |
stephenfin | The solution, fwiw (and IMO), is what efried has asked for elsewhere: a util function to pull traits and resources from a flavor in a standard way, so we don't have a load of 'resourceNN:<type>' regexes about the place | 19:15 |
stephenfin | regexes? regexi? | 19:15 |
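One possible shape for the helper mentioned above, as a sketch only: a single function that splits flavor extra specs into resources and traits, keyed by request group, so callers do not each carry their own `resourcesNN:<type>` regex. `split_extra_specs` is a hypothetical name and the example flavor is made up.

```python
def _group_suffix(prefix, kind):
    """Return '' or the digits after kind (e.g. 'resources1' -> '1'), else None."""
    if not prefix.startswith(kind):
        return None
    suffix = prefix[len(kind):]
    return suffix if suffix == "" or suffix.isdigit() else None


def split_extra_specs(extra_specs):
    """Split extra specs into ({group: {class: amount}}, {group: {trait: value}})."""
    resources, traits = {}, {}
    for key, value in extra_specs.items():
        prefix, sep, name = key.partition(":")
        if not sep:
            continue
        group = _group_suffix(prefix, "resources")
        if group is not None:
            resources.setdefault(group, {})[name] = int(value)
            continue
        group = _group_suffix(prefix, "trait")
        if group is not None:
            traits.setdefault(group, {})[name] = value
    return resources, traits


# Made-up flavor extra specs for illustration.
specs = {
    "resources:PCPU": "2",
    "resources1:VGPU": "1",
    "trait:HW_CPU_X86_AVX": "required",
    "hw:cpu_policy": "dedicated",
}
resources, traits = split_extra_specs(specs)
print(resources)  # {'': {'PCPU': 2}, '1': {'VGPU': 1}}
print(traits)     # {'': {'HW_CPU_X86_AVX': 'required'}}
```

Keys that are neither resources nor traits, such as `hw:cpu_policy`, are ignored by this sketch.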
dansmith | sean-k-mooney: so what's the deal on the failed db queries? | 19:15 |
dansmith | sean-k-mooney: worried that maybe we're lazy-loading some api-only field or something | 19:15 |
sean-k-mooney | if they don't show up in the ones that are stephen's only then it's from artom's series | 19:16 |
sean-k-mooney | if its in both its from master | 19:16 |
*** TxGirlGeek has quit IRC | 19:17 | |
sean-k-mooney | this https://review.opendev.org/#/c/681771/1 is the combined job and https://review.opendev.org/#/c/681807/2 is the 3 pcpu only jobs | 19:17 |
*** TxGirlGeek has joined #openstack-nova | 19:17 | |
sean-k-mooney | i think i did see that with artoms series but not in all versions | 19:17 |
sean-k-mooney | could it be the late upcall for anti-affinity | 19:18 |
sean-k-mooney | if we are lazy loading something i'm not sure what it would be off the top of my head | 19:19 |
stephenfin | sean-k-mooney, dansmith: WIP manual testing notes here too, btw https://etherpad.openstack.org/p/nova-cpu-resources | 19:20 |
stephenfin | I'm at controller+compute updated to use PCPU code with other compute using plain master (without PCPU code). Onto setting 'cpu_dedicated_set' on the former now | 19:21 |
stephenfin | I went with master because DevStack wouldn't deploy from stable/stein and I figured this would be an easier way to simulate "upgrades" from nova's perspective | 19:21 |
sean-k-mooney | i meant to get food earlier and still haven't so i'm going to run for 30 mins to an hour | 19:22 |
*** itlinux is now known as itlinux-away | 19:22 | |
stephenfin | I just want to test the reshaper and create/move operations with nodes with and without the PCPU stuff, after all | 19:22 |
dansmith | sean-k-mooney: I dunno how it could be unrelated if it works on master (the lazy load) | 19:22 |
*** awalende has joined #openstack-nova | 19:23 | |
*** itlinux-away is now known as itlinux | 19:24 | |
*** itlinux is now known as itlinux-away | 19:24 | |
*** itlinux-away is now known as itlinux | 19:24 | |
*** itlinux is now known as itlinux-away | 19:24 | |
sean-k-mooney | its likely an issue in artoms code then if stephen has not seen it in his testing? | 19:26 |
*** artom has quit IRC | 19:26 | |
*** itlinux-away is now known as itlinux | 19:26 | |
*** itlinux is now known as itlinux-away | 19:26 | |
dansmith | no, I would think it's more likely that stephen added something that looks at an object field, but hasn't been testing with computes and conductors that don't have access to the api database | 19:27 |
*** itlinux-away is now known as itlinux | 19:27 | |
*** TxGirlGeek has quit IRC | 19:27 | |
*** itlinux is now known as itlinux-away | 19:27 | |
dansmith | artom's code eventually passed this right? | 19:27 |
*** bbowen has joined #openstack-nova | 19:27 | |
*** awalende has quit IRC | 19:28 | |
mnaser | melwitt: thanks (re os-vif) | 19:29 |
*** factor has quit IRC | 19:30 | |
*** factor has joined #openstack-nova | 19:30 | |
*** artom has joined #openstack-nova | 19:31 | |
dansmith | ...still not a damn thing in the gate | 19:31 |
*** itlinux-away is now known as itlinux | 19:31 | |
*** itlinux is now known as itlinux-away | 19:31 | |
*** itlinux-away is now known as itlinux | 19:32 | |
*** itlinux is now known as itlinux-away | 19:32 | |
*** lbragstad has joined #openstack-nova | 19:32 | |
melwitt | the Nova Penalty™ | 19:33 |
*** itlinux-away is now known as itlinux | 19:34 | |
*** itlinux is now known as itlinux-away | 19:34 | |
artom | dansmith, stuff that's been +W'ed pre-FF can be rechecked until it merges, right? | 19:40 |
artom | It's not like "whelp, FF, killall the things" | 19:41 |
* artom has to run | 19:42 | |
*** artom has quit IRC | 19:42 | |
dansmith | ...mkay | 19:42 |
*** itlinux-away is now known as itlinux | 19:43 | |
*** itlinux is now known as itlinux-away | 19:43 | |
stephenfin | dansmith: okay, so, here's what I've tested | 19:43 |
stephenfin | that I can boot pinned and unpinned instances on master | 19:44 |
stephenfin | that if I "upgrade" a compute node to use the PCPU stuff, the existing instances keep working | 19:44 |
stephenfin | that I can boot new instances on that upgraded compute node | 19:44 |
stephenfin | that if I set 'cpu_dedicated_set', all my pinned instances magically transition to having PCPU inventory | 19:45 |
stephenfin | (that one was tricky because I (boldly) had pinned and unpinned instances on the same host, so I have to pick a value for 'cpu_dedicated_set' that matches the pinset of the pinned instances and still leaves some aside for 'cpu_shared_set') | 19:46 |
stephenfin | that I can continue to boot pinned instances after having set that, and they will go to both the host reporting PCPUs and the host without PCPUs | 19:47 |
stephenfin | and throughout all of that, that placement's view of WTF is happening remains consistent | 19:48 |
stephenfin | dansmith: How's that sounding so far? Anything in particular you want me to double back on or try? | 19:49 |
stephenfin | Happy to tar up the logs from both compute nodes for your perusal too. They're a bit messy because I've made a few (user-driven) mistakes along the way but they should do to quickly grep for ERRORs | 19:50 |
stephenfin | The main ones of those I've seen are errors from RabbitMQ about trying again in 1 sec, which I seem to always get in DevStack | 19:50 |
*** itlinux-away is now known as itlinux | 19:51 | |
stephenfin | and libvirt b****** when I tried to set 'cpu_dedicated_set' to a range that excluded pinned cores from existing instances | 19:51 |
*** itlinux is now known as itlinux-away | 19:51 | |
*** TxGirlGeek has joined #openstack-nova | 19:52 | |
stephenfin | That's an issue already with vcpu_pin_set but I thought I'd fixed that with this bugger https://review.opendev.org/#/c/680107/ I've clearly missed something though and will keep investigating | 19:52 |
stephenfin | Also, there's a bug in the vCPU model selection code | 19:53 |
*** icarusfactor has joined #openstack-nova | 19:53 | |
*** TxGirlGeek has quit IRC | 19:54 | |
stephenfin | Namely, if I have a required trait and it's not a CPU flag, this will contain at least some None entries https://github.com/openstack/nova/blob/master/nova/virt/libvirt/utils.py#L600 | 19:54 |
*** itlinux-away is now known as itlinux | 19:55 | |
stephenfin | and we have a later check to see if the result from the function is false'y | 19:55 |
*** factor has quit IRC | 19:55 | |
*** itlinux is now known as itlinux-away | 19:55 | |
stephenfin | set([None]) is not false'y | 19:55 |
stephenfin | patch incoming for that | 19:55 |
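The pitfall described above can be reproduced in a few lines. This is a sketch, not the nova code: `TRAIT_TO_FLAG` and both functions are hypothetical, and the mapping values are illustrative.

```python
# Hypothetical trait-to-CPU-flag table for illustration only.
TRAIT_TO_FLAG = {
    "HW_CPU_X86_AVX": "avx",
    "HW_CPU_X86_SSE42": "sse4.2",
}


def flags_from_traits_buggy(required_traits):
    # dict.get returns None for traits that are not CPU flags,
    # so the result can contain None entries.
    return {TRAIT_TO_FLAG.get(trait) for trait in required_traits}


def flags_from_traits_fixed(required_traits):
    # Skip traits with no CPU flag so "nothing found" is an empty set.
    return {TRAIT_TO_FLAG[t] for t in required_traits if t in TRAIT_TO_FLAG}


non_cpu_trait = ["COMPUTE_NET_ATTACH_INTERFACE"]
buggy = flags_from_traits_buggy(non_cpu_trait)
fixed = flags_from_traits_fixed(non_cpu_trait)
print(buggy, bool(buggy))  # {None} True
print(fixed, bool(fixed))  # set() False
```

A caller that checks `if not flags:` takes the "no CPU flags requested" path only with the second version.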
*** itlinux-away is now known as itlinux | 19:55 | |
*** itlinux is now known as itlinux-away | 19:56 | |
*** TxGirlGeek has joined #openstack-nova | 19:56 | |
*** itlinux-away is now known as itlinux | 19:59 | |
*** itlinux is now known as itlinux-away | 19:59 | |
sean-k-mooney | stephenfin: here is the results for https://review.opendev.org/#/c/681827/2 https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_5bb/681827/2/experimental/nova-nfv-multinode/5bb9889/testr_results.html.gz | 20:00 |
sean-k-mooney | that is your code without artoms | 20:00 |
sean-k-mooney | the only thing that failed were 2 live migration tests | 20:00 |
stephenfin | that's exactly what we expected, right? | 20:01 |
sean-k-mooney | there are a couple of extra failures in https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_909/681840/2/experimental/nova-nfv-multinode/9090152/testr_results.html.gz | 20:01 |
sean-k-mooney | which is using vcpu_pin_set again | 20:01 |
dansmith | resize fails are not expected | 20:01 |
dansmith | Details: {'code': 400, 'created': '2019-09-12T18:52:42Z', 'message': 'CPU set to unpin [1] must be a subset of pinned CPU set []'} | 20:01 |
*** kaliya has joined #openstack-nova | 20:02 | |
*** itlinux-away is now known as itlinux | 20:02 | |
sean-k-mooney | you get set[] if the update resource provider task fails | 20:02 |
*** itlinux is now known as itlinux-away | 20:02 | |
sean-k-mooney | which happens if you live migrate and cause a pinning conflict | 20:02 |
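A sketch of the kind of subset check that produces the error quoted above, assuming the host's record of pinned CPUs has become empty while an instance still claims CPU 1. `unpin_cpus` is a hypothetical helper; only the message text is taken from the log.

```python
def unpin_cpus(pinned, to_unpin):
    """Return the pinned set minus to_unpin, refusing CPUs that are not pinned."""
    if not to_unpin <= pinned:
        raise ValueError(
            "CPU set to unpin %s must be a subset of pinned CPU set %s"
            % (sorted(to_unpin), sorted(pinned))
        )
    return pinned - to_unpin


print(unpin_cpus({0, 1, 2}, {1}))  # {0, 2}

try:
    # Host-side tracking was lost, so nothing is recorded as pinned.
    unpin_cpus(set(), {1})
except ValueError as exc:
    print(exc)  # CPU set to unpin [1] must be a subset of pinned CPU set []
```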
*** itlinux-away is now known as itlinux | 20:02 | |
*** itlinux is now known as itlinux-away | 20:03 | |
sean-k-mooney | but i just got back from getting food so now i'm going to go eat it | 20:03 |
sean-k-mooney | brb | 20:03 |
dansmith | ah, so you think this had already caused a pinning conflict and then a resize failed or something? | 20:03 |
dansmith | I thought you were running these with concurrency=1 | 20:03 |
sean-k-mooney | ya | 20:03 |
sean-k-mooney | yes but it takes a while for it to get fixed | 20:03 |
*** itlinux-away is now known as itlinux | 20:04 | |
sean-k-mooney | e.g. the periodic task time | 20:04 |
dansmith | even after the delete? | 20:04 |
*** itlinux is now known as itlinux-away | 20:04 | |
sean-k-mooney | it won't be fixed until the update_resources periodic runs i think | 20:04 |
dansmith | that seems odd | 20:05 |
*** kaliya has quit IRC | 20:06 | |
*** itlinux-away is now known as itlinux | 20:06 | |
*** itlinux is now known as itlinux-away | 20:07 | |
*** dolpher has joined #openstack-nova | 20:09 | |
*** markvoelker has quit IRC | 20:11 | |
*** markvoelker has joined #openstack-nova | 20:11 | |
dansmith | I see that both in the periodic as well as during a boot | 20:11 |
*** itlinux-away is now known as itlinux | 20:13 | |
*** itlinux is now known as itlinux-away | 20:13 | |
*** tbachman has quit IRC | 20:15 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Note about Destination.forbidden_aggregates https://review.opendev.org/680945 | 20:18 |
dansmith | stephenfin: sorry I missed the ping above | 20:18 |
*** itlinux-away is now known as itlinux | 20:18 | |
*** itlinux is now known as itlinux-away | 20:18 | |
dansmith | stephenfin: I want to see tempest passing against it.. I'm glad you're doing lots of hand testing of it too, and I believe that whatever you're doing is working if you say it is | 20:19 |
*** tbachman has joined #openstack-nova | 20:19 | |
dansmith | not that it matters, several other people are apparently fine +Wing this without seeing our test suite run on it, so this is just for funsies at this point | 20:19 |
stephenfin | I guess having a bajillion functional tests does help things though | 20:20 |
*** itlinux-away is now known as itlinux | 20:20 | |
*** itlinux is now known as itlinux-away | 20:20 | |
*** oomichi_ has quit IRC | 20:20 | |
stephenfin | Yes, I know they're not the same thing, but this is all touching the management'y parts of nova | 20:20 |
stephenfin | which the functional tests are excellent for | 20:20 |
dansmith | this is all touching plenty of the stuff in nova that breaks all the time | 20:20 |
*** xek has quit IRC | 20:21 | |
*** itlinux-away is now known as itlinux | 20:21 | |
*** itlinux is now known as itlinux-away | 20:21 | |
*** itlinux-away is now known as itlinux | 20:21 | |
*** itlinux is now known as itlinux-away | 20:21 | |
*** itlinux-away is now known as itlinux | 20:23 | |
*** itlinux is now known as itlinux-away | 20:23 | |
*** itlinux-away is now known as itlinux | 20:23 | |
*** itlinux is now known as itlinux-away | 20:23 | |
*** itlinux-away is now known as itlinux | 20:27 | |
*** itlinux is now known as itlinux-away | 20:27 | |
sean-k-mooney | ok im back | 20:27 |
dansmith | sean-k-mooney: so what's the deal with the test run that was showing the object/db errors? | 20:29 |
sean-k-mooney | i dont know but thats what im going to look into now | 20:29 |
sean-k-mooney | did we see that in the ones for just stephens code | 20:29 |
sean-k-mooney | or is it only in the combined one | 20:29 |
*** lpetrut has joined #openstack-nova | 20:30 | |
dansmith | only combined, afaik | 20:30 |
sean-k-mooney | ok ill redeploy just artoms code and see if its there | 20:30 |
dansmith | sean-k-mooney: it would have caused the tests to fail when we originally ran on artom's code if it was, right? | 20:30 |
sean-k-mooney | i wonder if it could be related to the instance.refresh() | 20:31 |
dansmith | no, I think it was on reqspec | 20:31 |
sean-k-mooney | am im not sure it would | 20:31 |
*** itlinux-away is now known as itlinux | 20:31 | |
sean-k-mooney | when i saw that previously the migrations worked | 20:31 |
dansmith | sean-k-mooney: I asked earlier why not and what is different | 20:31 |
*** itlinux is now known as itlinux-away | 20:31 | |
stephenfin | Do we know if that instance was moved before attempting to rebuild? I'm assuming not | 20:32 |
dansmith | these were failing in the pcpu tests as a result no? | 20:32 |
efried | stephenfin: thanks for those reviews. There's going to be a merge conflict with numa lm on test_objects, so I'm going to rebase now to resolve preemptively, cool? | 20:32 |
*** itlinux-away is now known as itlinux | 20:32 | |
*** itlinux is now known as itlinux-away | 20:32 | |
*** pcaruana has quit IRC | 20:32 | |
stephenfin | efried: yeah, go for it. I've been meaning to do that myself but got sidetracked | 20:33 |
*** itlinux-away is now known as itlinux | 20:33 | |
stephenfin | Lemme know what ones I need to rehit if the merge conflicts are non-trivial | 20:33 |
*** itlinux is now known as itlinux-away | 20:33 | |
*** itlinux-away is now known as itlinux | 20:33 | |
*** itlinux is now known as itlinux-away | 20:33 | |
stephenfin | efried: Aaaactually, do you think you could rebase the cpu-resources series onto that too? | 20:33 |
efried | stephenfin: Yeah, I can do that, but in two stages if that's okay with you. | 20:34 |
stephenfin | I pointed out earlier that I think that's going to have the same issue | 20:34 |
stephenfin | cool by me | 20:34 |
dansmith | sean-k-mooney: File "/opt/stack/nova/nova/objects/instance_group.py" | 20:34 |
dansmith | sean-k-mooney: so maybe it is the late affinity check, but.. I don't understand why we'd hit that here but not in regular/other tests.. can you explain? | 20:34 |
sean-k-mooney | dansmith: we turn it off in the upstream gate but i might not be turning it off in my jobs | 20:35 |
dansmith | ah okay | 20:35 |
sean-k-mooney | it should be in the nova-cpu.conf right | 20:35 |
sean-k-mooney | the workarounds section | 20:35 |
*** itlinux-away is now known as itlinux | 20:36 | |
*** itlinux is now known as itlinux-away | 20:36 | |
dansmith | oh you mean we turn off the check, not disable whatever test this is? | 20:36 |
dansmith | I dunno where it is specifically but nova-cpu.conf yeah | 20:36 |
sean-k-mooney | yes we disable the upcall | 20:36 |
sean-k-mooney | also https://8ab1fb0a384d9cbfa221-969de3c017bb40c2acaf4bf21edd2ff6.ssl.cf1.rackcdn.com/681771/2/experimental/nova-nfv-multi-numa-multinode/1faf214/testr_results.html.gz | 20:36 |
sean-k-mooney | https://review.opendev.org/#/c/681771/ | 20:36 |
*** markvoelker has quit IRC | 20:37 | |
dansmith | do we have any grenade results yet? | 20:37 |
sean-k-mooney | no i dont have a grenade job set up. but i can try to set one up quickly | 20:37 |
dansmith | I don't think quickly matters anymore, | 20:38 |
dansmith | but it'd be good to get a run on it while we have context, IMHO | 20:38 |
*** trident has quit IRC | 20:39 | |
sean-k-mooney | so there is a patch to convert grenade to non legacy | 20:39 |
*** markvoelker has joined #openstack-nova | 20:39 | |
sean-k-mooney | if i depend on that i should be able to do the same config changes | 20:39 |
*** mriedem_afk is now known as mriedem | 20:39 | |
dansmith | oh, is that why it's hard because it's still a legacy job? | 20:39 |
sean-k-mooney | dansmith: this https://review.opendev.org/#/c/548936/ | 20:40 |
sean-k-mooney | dansmith: yes | 20:40 |
dansmith | gotcha | 20:40 |
dansmith | well, if that works currently, then yeah, try that I'd say | 20:40 |
sean-k-mooney | devstack-gate is a pain to get kvm working | 20:40 |
sean-k-mooney | yes it looks like it's passing so i'll try and modify it | 20:40 |
dansmith | cool | 20:40 |
sean-k-mooney | well depend on it | 20:40 |
mriedem | so...things are where they were 2 hours ago? | 20:41 |
mriedem | ooo down to 179 changes in the check queue | 20:41 |
dansmith | still nothing in the gate tho | 20:41 |
mriedem | yup, i don't expect nova stuff to probably be fully merged until maybe this weekend | 20:42 |
mriedem | if you account for rechecks | 20:42 |
sean-k-mooney | dansmith: by the way the latest test run for the combined numa + pcpus job does not appear to have the db error | 20:42 |
dansmith | sean-k-mooney: that db error wouldn't be flaky, so that makes me kinda wonder | 20:42 |
sean-k-mooney | well the difference is i am using cpu_dedicated_set instead of vcpu_pin_set | 20:43 |
dansmith | sean-k-mooney: unless one node is configured to have api db access and the other is not, and you don't get the fail if you're lucky and land on the one | 20:43 |
sean-k-mooney | stephenfin: could we be trying to lazy load the pinned cpus or something when we use vcpu_pin_set instead of cpu_dedicated_set? | 20:45 |
*** itlinux-away is now known as itlinux | 20:45 | |
*** TxGirlGeek has quit IRC | 20:45 | |
*** itlinux is now known as itlinux-away | 20:45 | |
mriedem | devstack shouldn't configure nova-cpu.conf nor either of the cell conductor configs for api db access | 20:45 |
mriedem | by default anyway | 20:45 |
dansmith | mriedem: yeah I dunno why this would be legit failing here and not on master tho | 20:46 |
*** lpetrut has quit IRC | 20:46 | |
sean-k-mooney | yeah the db is not configured on either node | 20:46 |
mriedem | talking about this job? https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_909/681840/2/experimental/nova-nfv-multinode/9090152/testr_results.html.gz | 20:46 |
stephenfin | sean-k-mooney: I'm looking but I don't see why it would. | 20:47 |
dansmith | no | 20:47 |
dansmith | mriedem: https://20aa241df59e72844ae9-597ff148d0ea9164d11e7cb764cf9b04.ssl.cf5.rackcdn.com/681771/1/experimental/nova-nfv-multi-numa-multinode/c1f6ccc/compute/logs/screen-n-cpu.txt.gz | 20:47 |
efried | stephenfin: that rebase isn't going to work, because numalm is based waaay back on master, so that way conflicts with (at least) cpu models. I could rebase numalm, but it would lose its place in the queue (~10.5h) so I don't want to do that. Just going to ride it out, mkay? | 20:47 |
dansmith | mriedem: but I think this is just the late affinity check that must not be disabled on sean-k-mooney's | 20:47 |
stephenfin | (y) | 20:47 |
sean-k-mooney | this one https://zuul.opendev.org/t/openstack/build/1faf2148524e41638672ff627cb9a011/log/compute/logs/etc/nova/nova-cpu_conf.txt.gz#92 | 20:48 |
mriedem | disable_group_policy_check_upcall | 20:48 |
dansmith | sean-k-mooney: that's set on your job? | 20:48 |
mriedem | it's not in https://20aa241df59e72844ae9-597ff148d0ea9164d11e7cb764cf9b04.ssl.cf5.rackcdn.com/681771/1/experimental/nova-nfv-multi-numa-multinode/c1f6ccc/compute/logs/etc/nova/nova_conf.txt.gz | 20:48 |
mriedem | in the subnode | 20:48 |
sean-k-mooney | yes | 20:48 |
dansmith | okay not being on the subnode would explain why we hit it sometimes and not others | 20:48 |
mriedem | devstack should configure that by default if you're using superconductor mode (which is the default) | 20:48 |
*** takashin has joined #openstack-nova | 20:48 | |
*** hamzy_ has quit IRC | 20:49 | |
sean-k-mooney | it's set on the controller and compute | 20:49 |
mriedem | in that link above, | 20:49 |
mriedem | it's set on the compute on the controller host https://20aa241df59e72844ae9-597ff148d0ea9164d11e7cb764cf9b04.ssl.cf5.rackcdn.com/681771/1/experimental/nova-nfv-multi-numa-multinode/c1f6ccc/controller/logs/etc/nova/nova-cpu_conf.txt.gz | 20:49 |
mriedem | the primary | 20:49 |
mriedem | but not the subnode | 20:49 |
mriedem | so that's why it's flaky | 20:49 |
* dansmith nods | 20:49 | |
mriedem | i'm not sure why it wouldn't be set, unless the zuul yaml post-config is overwriting the workarounds section | 20:51 |
mriedem | but the controller devstack run fixes it but the subnode one doesn't | 20:51 |
*** trident has joined #openstack-nova | 20:51 | |
sean-k-mooney | i dont think that is what is happening | 20:51 |
sean-k-mooney | that is the compute/subnode https://zuul.opendev.org/t/openstack/build/1faf2148524e41638672ff627cb9a011/log/compute/logs/etc/nova/nova-cpu_conf.txt.gz#92 | 20:52 |
sean-k-mooney | that is the controller/primary https://zuul.opendev.org/t/openstack/build/1faf2148524e41638672ff627cb9a011/log/controller/logs/etc/nova/nova-cpu_conf.txt.gz#117 | 20:52 |
sean-k-mooney | on the run that passed it's set on both | 20:52 |
mriedem | yeah i was looking at the one that failed | 20:52 |
sean-k-mooney | oh | 20:52 |
mriedem | note that i'm coming into this 2 hours late :) | 20:52 |
sean-k-mooney | so the one that failed didnt have it set on both | 20:53 |
dansmith | right | 20:53 |
sean-k-mooney | ok | 20:53 |
sean-k-mooney | i have no idea why that is then | 20:53 |
sean-k-mooney | i can add it explicitly to the job i guess | 20:53 |
*** itlinux-away is now known as itlinux | 20:54 | |
*** itlinux is now known as itlinux-away | 20:54 | |
sean-k-mooney | i also don't have that set locally so that is probably why i sometimes saw this and other times did not | 20:54 |
mriedem | sean-k-mooney: different patch sets for those jobs, | 20:54 |
mriedem | not sure if that matters, | 20:54 |
mriedem | i was looking at PS1 | 20:54 |
mriedem | https://review.opendev.org/#/c/681771/1..2/.zuul.yaml | 20:54 |
*** itlinux-away is now known as itlinux | 20:54 | |
mriedem | don't know why that would change things | 20:55 |
*** itlinux is now known as itlinux-away | 20:55 | |
*** itlinux-away is now known as itlinux | 20:55 | |
sean-k-mooney | nor do i | 20:55 |
*** itlinux is now known as itlinux-away | 20:55 | |
sean-k-mooney | that was needed to activate stephens code | 20:55 |
mriedem | unless there is a devstack or zuul bug or something | 20:55 |
sean-k-mooney | since cpu_dedicated_set is what does the reshape | 20:55 |
mriedem | maybe https://github.com/openstack/devstack/commit/2468ceaa724aa5c8c44fb87ae223eb6687ff85f2 regressed something in devstack | 20:56 |
sean-k-mooney | anyway the important thing is the last run passed with both series merged together so at least that's good | 20:56 |
sean-k-mooney | maybe but both of those runs were done today | 20:56 |
mriedem | yeah, i guess we just keep an eye out for random failures like that | 20:57 |
*** mdbooth has quit IRC | 20:57 | |
sean-k-mooney | ya maybe check logstash to see if its happening | 20:57 |
sean-k-mooney | right so i was going to try and create a non legacy grenade job with all the other non merged stuff. | 20:58 |
*** hemna has joined #openstack-nova | 20:58 | |
*** TxGirlGeek has joined #openstack-nova | 20:58 | |
dansmith | mriedem: we only do that if we have an instance group right? | 20:59 |
sean-k-mooney | what are the chances 4 unmerged unrelated series will all work together perfectly | 20:59 |
dansmith | does the network advanced server ops thing use groups? | 20:59 |
sean-k-mooney | i dont think so | 20:59 |
sean-k-mooney | you mean server groups right | 20:59 |
dansmith | so I'm wondering why we'd be hitting this for that | 20:59 |
dansmith | yes, server grops | 20:59 |
dansmith | groups even | 20:59 |
sean-k-mooney | do we do check for resize/migrate | 21:00 |
dansmith | only if they're in a group, IIRC | 21:00 |
sean-k-mooney | it's for the anti-affinity stuff right | 21:00 |
openstackgerrit | Eric Fried proposed openstack/nova master: libvirt: Enable driver discovering PMEM namespaces https://review.opendev.org/678453 | 21:00 |
openstackgerrit | Eric Fried proposed openstack/nova master: libvirt: report VPMEM resources by provider tree https://review.opendev.org/678454 | 21:00 |
openstackgerrit | Eric Fried proposed openstack/nova master: libvirt: Support VM creation with vpmems and vpmems cleanup https://review.opendev.org/678455 | 21:00 |
openstackgerrit | Eric Fried proposed openstack/nova master: Parse vpmem related flavor extra spec https://review.opendev.org/678456 | 21:00 |
openstackgerrit | Eric Fried proposed openstack/nova master: libvirt: Enable driver configuring PMEM namespaces https://review.opendev.org/679640 | 21:00 |
openstackgerrit | Eric Fried proposed openstack/nova master: Add functional tests for virtual persistent memory https://review.opendev.org/678470 | 21:00 |
openstackgerrit | Eric Fried proposed openstack/nova master: objects: use all_things_equal from objects.base https://review.opendev.org/681397 | 21:00 |
*** slaweq has quit IRC | 21:00 | |
dansmith | I was wondering earlier if we were failing to boot because of pcpu stuff and running that group recalc in a reschedule or something | 21:01 |
*** itlinux-away is now known as itlinux | 21:01 | |
efried | stephenfin: https://review.opendev.org/#/c/678453/33..34/nova/tests/unit/objects/test_objects.py | 21:01 |
mriedem | efried: nova meeting? | 21:01 |
efried | yes | 21:01 |
*** itlinux is now known as itlinux-away | 21:01 | |
mriedem | dansmith: yeah https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1457 | 21:01 |
ozzzo | I'm trying to get live migration working without shared storage. I got libvirtd to listen and set auth_tcp = "none" to fix the auth errors; now the error is: | 21:02 |
ozzzo | 2019-09-12 13:05:10.154 60779 ERROR nova.virt.libvirt.driver [-] [instance: 01a9d185-f003-4f59-b87b-9256f5ea2eaa] Live Migration failure: unsupported configuration: Unable to find security driver for model apparmor: libvirtError: unsupported configuration: Unable to find security driver for model apparmor | 21:02 |
mriedem | though if it's a reschedule there are a couple of known up-call fails from that | 21:02 |
mriedem | dansmith: for resize and build reschedule | 21:02 |
ozzzo | Does anyone know what that means? I don't have apparmor running | 21:02 |
mriedem | to pull the AZ for the host which goes to aggregates | 21:02 |
mriedem | dansmith: https://bugs.launchpad.net/nova/+bug/1781286 | 21:02 |
openstack | Launchpad bug 1781286 in OpenStack Compute (nova) "CantStartEngineError in cell conductor during reschedule - get_host_availability_zone up-call" [Medium,Triaged] | 21:02 |
sean-k-mooney | ozzzo: that is probably what it's complaining about | 21:03 |
dansmith | this one looked like setup_groups | 21:03 |
mriedem | Error trying to reschedule: oslo_messaging.rpc.client.RemoteError: Remote error: CantStartEngineError No sql_connection parameter is established | 21:03 |
dansmith | or whatever it's called | 21:03 |
*** itlinux-away is now known as itlinux | 21:03 | |
ozzzo | I need apparmor for live migration? | 21:03 |
sean-k-mooney | no | 21:03 |
*** itlinux is now known as itlinux-away | 21:03 | |
*** itlinux-away is now known as itlinux | 21:03 | |
*** nweinber__ has quit IRC | 21:03 | |
*** itlinux is now known as itlinux-away | 21:03 | |
*** itlinux-away is now known as itlinux | 21:03 | |
mriedem | dansmith: nope it's that aggs bug | 21:03 |
mriedem | https://20aa241df59e72844ae9-597ff148d0ea9164d11e7cb764cf9b04.ssl.cf5.rackcdn.com/681771/1/experimental/nova-nfv-multi-numa-multinode/c1f6ccc/controller/logs/screen-n-cond-cell1.txt.gz | 21:03 |
sean-k-mooney | ozzzo: but if you dont have apparmor running you need to disable apparmor as the security driver in the libvirt config | 21:03 |
mriedem | grep for CantStartEngineError in there | 21:03 |
*** nweinber__ has joined #openstack-nova | 21:03 | |
mriedem | reschedules and then tries to get az from the cell conductor | 21:04 |
sean-k-mooney | mriedem: ya we were seeing CantStartEngineError | 21:04 |
mriedem | yup, known bug, see above | 21:04 |
mriedem | i don't think tempest does a ton of server group testing but there are a few tests for server affinity and anti-affinity group testing | 21:05 |
dansmith | mriedem: that fails from resources unavailable on the compute | 21:05 |
*** mdbooth has joined #openstack-nova | 21:05 | |
ozzzo | I don't find "appa" in my libvirt config. Is it a default that I need to override? | 21:05 |
dansmith | "Requested instance NUMA topology cannot fit the given host NUMA topology" | 21:05 |
mriedem | dansmith: oh, maybe that's a race? | 21:05 |
mriedem | are the tests running in serial? | 21:05 |
dansmith | sean-k-mooney: ^ | 21:05 |
ozzzo | oic it looks like it is the default in qemu.conf | 21:05 |
ozzzo | # security_driver = [ "selinux", "apparmor" ] | 21:06 |
sean-k-mooney | dansmith: or the issue was the exception in update_resouce | 21:06 |
ozzzo | what would that look like, to disable both? | 21:06 |
ozzzo | security_driver = [] ? | 21:06 |
sean-k-mooney | ozzzo: i think by default it will use either | 21:06 |
sean-k-mooney | you have to explicitly disable it | 21:06 |
sean-k-mooney | mriedem: and yes i have it running serially. | 21:07 |
mriedem | ozzzo: please see the channel topic, this is a mostly dev focused channel and today is feature freeze for the train release so as you can see there is quite a bit of dev activity trying to get this release closed out | 21:07 |
mriedem | and we're having a meeting at the same time over in #openstack-meeting | 21:08 |
mriedem | ozzzo: #openstack-operators or just #openstack might help, or the ask.o.o forum | 21:08 |
mriedem | otherwise try hitting up the openstack-discuss mailing list | 21:08 |
sean-k-mooney | ozzzo: i think you need to do [] yes but i dont know. | 21:09 |
ozzzo | ok i'll try it, ty | 21:09 |
mriedem | anyway, sean-k-mooney dansmith reschedule in gate jobs with devstack doesn't really work b/c of that up-call | 21:09 |
mriedem | we just don't normally see or care about it b/c if we hit it the job fails and we're likely toast anyway, but we don't normally see reschedules in tempest runs anyway, | 21:10 |
mriedem | unless like libvirtd dies on a compute and if that happens we fail anyway | 21:10 |
dansmith | that's why I'm wondering if we're failing and doing a reschedule because of the changes under test | 21:10 |
mriedem | could be latent races in the numa claims code but yeah, idk | 21:12 |
mriedem | if we're running tests in serial that should be less of a risk, unless some test is creating multiple servers in one run | 21:12 |
mriedem | and we're trying to pin all to the same cpus? | 21:12 |
*** nweinber__ has quit IRC | 21:12 | |
dansmith | oh well, we'll just wait for mnaser to shake out all the bugs in here when he rolls it out | 21:12 |
* dansmith facepalms | 21:13 | |
mriedem | the job is setup to pin to the same cpus for the flavor we're using? | 21:13 |
mriedem | or how does that work? | 21:13 |
mriedem | iow, if a test creates more than one server on the same host, are we guaranteed to fail a claim for one and reschedule? | 21:14 |
sean-k-mooney | no | 21:14 |
sean-k-mooney | we have 8 cpus in the gate vms | 21:14 |
sean-k-mooney | the instances are each claiming 1 | 21:14 |
sean-k-mooney | and i have 7 of the 8 enabled | 21:15 |
sean-k-mooney | but some tests do create multiple vms | 21:15 |
mriedem | yeah ok, but there aren't any tempest tests that create more than 3 servers i'm pretty sure | 21:17 |
mriedem | in a single test case i mean, | 21:17 |
mriedem | and i'm pretty sure the tempest tearDown waits for the servers to be gone, i.e. 404 from the API | 21:17 |
sean-k-mooney | the case where we saw that error the teardown failed to delete before hitting the timeout | 21:18 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Correctly handle non-CPU flag traits https://review.opendev.org/681932 | 21:18 |
*** artom has joined #openstack-nova | 21:18 | |
stephenfin | mriedem, dansmith, efried: ^ | 21:18 |
mriedem | stephenfin: is that a master only regression? it should probably have a bug | 21:19 |
stephenfin | test is wrong - I rushed it | 21:19 |
stephenfin | I can do that | 21:19 |
mriedem | so only unit tests, no functional coverage | 21:19 |
sean-k-mooney | did that get introduced with the sev stuff? | 21:19 |
mriedem | this might be where i asked for a nova/tests/functional/regressions test case | 21:20 |
mriedem | code: do something wrong | 21:20 |
mriedem | unit test: assert the code did something wrong | 21:20 |
*** JamesBenson has quit IRC | 21:22 | |
*** tbachman has quit IRC | 21:23 | |
stephenfin | can someone give me an example of a non-CPU flag trait we support at the moment? | 21:23 |
*** zhubx has joined #openstack-nova | 21:25 | |
mriedem | COMPUTE_SUPPORTS_MULTIATTACH | 21:25 |
sean-k-mooney | i was going to mention the ones in my series that we punted... | 21:25 |
sean-k-mooney | ya that | 21:25 |
mriedem | so i bet we could recreate in devstack by just adding that to a tempest flavor yeah? | 21:26 |
sean-k-mooney | ya | 21:26 |
sean-k-mooney | you can do it the way im doing all the nfv/numa jobs | 21:27 |
*** JamesBenson has joined #openstack-nova | 21:27 | |
mriedem | heh even just COMPUTE_ATTACH_INTERFACE | 21:27 |
stephenfin | mriedem: https://bugs.launchpad.net/nova/+bug/1843836 | 21:27 |
openstack | Launchpad bug 1843836 in OpenStack Compute (nova) "Failure to schedule if flavor contains non-CPU flag traits" [Undecided,New] | 21:27 |
*** boxiang has quit IRC | 21:27 | |
sean-k-mooney | well all the compute capability traits fall into that category right | 21:27 |
mriedem | stephenfin: yeah anything in here really https://github.com/openstack/nova/blob/master/nova/virt/driver.py#L102 | 21:28 |
efried | artom et al, numalm bottom failed l-c on subunit parser https://storage.bhs1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_769/634827/60/check/openstack-tox-lower-constraints/7693dab/testr_results.html.gz | 21:28 |
artom | efried, ugh, recheck? | 21:28 |
artom | That's not a legit thing, right? | 21:29 |
efried | #2 failed functional on 6!=7 -- we fixed that mriedem right? Rebase? | 21:29 |
efried | artom: question whether to recheck or rebase | 21:29 |
efried | artom: you're way down the master branch | 21:29 |
mriedem | efried: gate should pick it up from master | 21:29 |
efried | you would think so mriedem, but it's possible this was queued from before that merged too. | 21:29 |
efried | https://ebfd4ec54dbc92b01f1e-dbc96e30a25abd79ad83d28f747c4e2b.ssl.cf2.rackcdn.com/635229/62/check/nova-tox-functional/a59c098/testr_results.html.gz | 21:29 |
artom | efried, yeah, that was intentional to make reviewing easier | 21:30 |
mriedem | the subunit to parser failure is when a test dumps too much shit to the logging output stream | 21:30 |
sean-k-mooney | efried: it will do a merge of the patch you see in gerrit and master | 21:30 |
mriedem | efried: yeah maybe | 21:30 |
mriedem | that claims one was in the gate b/c my fix was merged | 21:30 |
mriedem | *before | 21:30 |
efried | #3 hasn't failed yet, #4 is still queued | 21:30 |
*** JamesBenson has quit IRC | 21:30 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Correctly handle non-CPU flag traits https://review.opendev.org/681932 | 21:31 |
efried | so the question is, do we recheck 1&2, hoping 3&4 pass, and save their place in the queue | 21:31 |
efried | or do we rebase, and lose 3&4's place in the queue | 21:31 |
sean-k-mooney | if we are going to rebase the LM series now is probably better than in a few hours | 21:31 |
*** henriqueof1 has quit IRC | 21:31 | |
* artom still isn't sure what the actionable thing to do here is | 21:31 | |
efried | artom: your call | 21:31 |
stephenfin | mriedem: I'll whip up a functional test for that tomorrow. It's too late to face into that now | 21:31 |
mriedem | ha, got a node after 11 hours, then failed | 21:31 |
stephenfin | but unit test is fixed and I've linked the bug | 21:31 |
mriedem | stephenfin: sure, i was just going to push a simple devstack patch to show the failure | 21:31 |
artom | efried, does it really matter? I'm assuming if I rebase now, I'll get administrative +Ws | 21:32 |
efried | artom: first option we have to wait until gate posts actual failure, which will be a while. IMO second option would be best | 21:32 |
efried | artom: yes | 21:32 |
artom | It's not like we need this by tomorrow | 21:32 |
*** hemna has quit IRC | 21:32 | |
*** markvoelker has quit IRC | 21:32 | |
stephenfin | giant chain of patches | 21:32 |
stephenfin | wheee | 21:32 |
mriedem | artom: sorry but my +Ws on that series were only good through yesterday | 21:32 |
artom | It can wait the weekend or w/e, when the load on the gate isn't as high | 21:32 |
artom | mriedem, you sonnova | 21:32 |
mriedem | i thought you saw the disclaimer? | 21:32 |
sean-k-mooney | artom: we are better off just rebasing it now i think | 21:33 |
efried | artom: by rebasing now, you kick four patches out of the queue, leaving room for other things. | 21:33 |
*** lbragstad has quit IRC | 21:33 | |
* artom rebases | 21:33 | |
sean-k-mooney | if we can line up the LM series then put PCPUs on top of the vPMEM i think that is the best path forward | 21:33 |
* efried waits with +Wand | 21:34 | |
stephenfin | sean-k-mooney: As things would have it, I've got a PCPUs on vPMEM rebase just waiting to go | 21:34 |
*** JamesBenson has joined #openstack-nova | 21:34 | |
* efried stops doing that ^ | 21:34 | |
sean-k-mooney | stephenfin: then after artom pushes can you rebase on top? | 21:35 |
efried | don't do that ^ | 21:35 |
stephenfin | yarp | 21:35 |
efried | cause it'll kick vpmem out | 21:35 |
stephenfin | or not | 21:35 |
stephenfin | actually, yeah, no | 21:35 |
mriedem | are these all conflicting? | 21:35 |
efried | minorly | 21:35 |
stephenfin | PCPU doesn't conflict with LM | 21:35 |
efried | and I un-conflicted vpmem with lm | 21:35 |
mriedem | i really only care about numa lm | 21:35 |
efried | yeah we know | 21:36 |
artom | mriedem, <3 | 21:36 |
stephenfin | VPMEM conflicted with LM but efried fixed it | 21:36 |
mriedem | ...because it's been a thing people have wanted forever | 21:36 |
stephenfin | PCPU and VPMEM is a horror show | 21:36 |
sean-k-mooney | mriedem: since we introduced numa in like juno | 21:36 |
artom | mriedem, and here I thought it was because of my pretty eyes | 21:36 |
stephenfin | in terms of number of merge conflicts, not complexity | 21:36 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: New objects for NUMA live migration https://review.opendev.org/634827 | 21:36 |
artom | Incoming | 21:36 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: LM: Use Claims to update numa-related XML on the source https://review.opendev.org/635229 | 21:36 |
mriedem | artom: you do have nice brown (?) eyes | 21:36 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: NUMA live migration support https://review.opendev.org/634606 | 21:36 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Deprecate CONF.workarounds.enable_numa_live_migration https://review.opendev.org/640021 | 21:36 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Functional tests for NUMA live migration https://review.opendev.org/672595 | 21:36 |
mriedem | i got lost in them | 21:36 |
efried | hazel is my vote | 21:37 |
mriedem | +Wing | 21:37 |
stephenfin | I could have worked on k8s... | 21:37 |
*** hamzy_ has joined #openstack-nova | 21:37 | |
sean-k-mooney | stephenfin: i hear you like yaml | 21:37 |
sean-k-mooney | stephenfin: because that is all k8s is as a user | 21:37 |
*** tbachman_ has joined #openstack-nova | 21:38 | |
mriedem | soon we'll all only code in yaml | 21:38 |
*** JamesBenson has quit IRC | 21:39 | |
efried | mriedem: "I'll push a devstack change to put a required trait that I know will be on all libvirt computes and that can show the failure for us quicker than writing some functional libvirt test." I assume this is a joke. | 21:39 |
mriedem | efried: nope | 21:39 |
mriedem | it's a 2 line change in devstack, | 21:39 |
mriedem | yes i know it will take awhile to get a node | 21:39 |
efried | yeah, and an 11h... | 21:39 |
mriedem | but as melwitt pointed out earlier, nova gets punished by zuul | 21:39 |
mriedem | devstack probably doesn't | 21:39 |
efried | true story | 21:39 |
melwitt | yeah, a devstack change might make it in a few hours. or less | 21:40 |
sean-k-mooney | or you can cheat. notice that all my jobs completed in like 2 hours | 21:40 |
efried | I thought you meant push a change to nova/devstack | 21:40 |
mriedem | i heard fungi takes a devstack change manually and runs it on his own hardware :) | 21:40 |
*** henriqueof has joined #openstack-nova | 21:40 | |
sean-k-mooney | the FN special labels i'm using have their own pool and if you put them in the experimental queue they run almost right away | 21:40 |
melwitt | let's cheat | 21:40 |
* melwitt jokes | 21:41 | |
fungi | mriedem: i take devstack changes and manually rub them on my feet. it's better than a pedicure | 21:41 |
mriedem | i was not expecting that response | 21:42 |
melwitt | that is some fungi CI indeed | 21:42 |
mriedem | now i just need to look up the magic incantation to configure a flavor with a required trait | 21:42 |
fungi | i no longer have a basement full of server racks | 21:42 |
sean-k-mooney | melwitt: i mean if we want to reproduce quickly we could. also i'm meant to be writing a grenade job.... | 21:42 |
fungi | nor, you know, a basement | 21:42 |
sean-k-mooney | efried: stephenfin so what patches are where currently | 21:43 |
sean-k-mooney | vpmem is pending | 21:43 |
efried | mriedem: extra spec trait:YOUR_TRAIT_HERE='required' ? | 21:43 |
sean-k-mooney | and stephenfin you have a version of pcpu on that | 21:43 |
mriedem | efried: WRONG | 21:43 |
mriedem | trait:<trait>=required | 21:44 |
mriedem | quotes would kill you | 21:44 |
melwitt | sean-k-mooney: I only vaguely understand what y'all were talking about earlier. was just saying funny [to me] unhelpful things | 21:44 |
efried | mriedem: I was being syntax-y | 21:44 |
stephenfin | sean-k-mooney: I do, yeah. Just running tests locally before I push it up | 21:44 |
efried | if you're doing it on an osc cli, the quotes will go away | 21:44 |
efried | s/n osc// | 21:44 |
mriedem | https://review.opendev.org/681938 | 21:45 |
sean-k-mooney | melwitt: :) i mean if one queue is slow and the other is fast sometimes cheating in the gate is for the greater good | 21:45 |
efried | sean-k-mooney: keeping in mind that shoving stuff in the queue from another project will push all the nova stuff down a slot too. | 21:46 |
sean-k-mooney | ya i know that is why os-vif is generally way quicker to test stuff in than nova | 21:46 |
fungi | yep. the change scheduling round-robin allocates nodes to changes by project, so the more changes there are for a given project requesting resources, the longer the later ones will wait for a turn | 21:46 |
mriedem | https://bugs.launchpad.net/nova/+bug/1843836 tagged for train-rc-potential so we can start using that tag | 21:46 |
openstack | Launchpad bug 1843836 in OpenStack Compute (nova) "Failure to schedule if flavor contains non-CPU flag traits" [Undecided,In progress] - Assigned to Stephen Finucane (stephenfinucane) | 21:47 |
sean-k-mooney | fungi: isn't that there to stop all gate capacity being eaten by tripleo | 21:47 |
mriedem | sean-k-mooney: and nova and neutron | 21:47 |
fungi | that was a big part of it, but yes also nova and neutron ;) | 21:47 |
mriedem | https://bugs.launchpad.net/nova/+bugs?field.tag=train-rc-potential | 21:48 |
sean-k-mooney | yes but i think tripleo still uses like the equivalent of nova and neutron combined | 21:48 |
sean-k-mooney | anyway it makes it fairer for everyone else | 21:48 |
fungi | under the old first-come-first-served scheduling, if a project has someone push a 30-change series in one shot then those all got priority and one-off changes for other projects got to wait until that entire series got the requested resources | 21:49 |
* mriedem does shifty eyes | 21:49 | |
*** markvoelker has joined #openstack-nova | 21:50 | |
sean-k-mooney | we never have 30+ long series in nova | 21:50 |
efried | never | 21:50 |
* efried reboots | 21:50 | |
* fungi chuckles | 21:50 | |
*** efried has quit IRC | 21:50 | |
sean-k-mooney | only ever have 3 of them together | 21:50 |
sean-k-mooney | its never just 1 | 21:51 |
sean-k-mooney | also our downstream ci is super dumb and really does not like it when that is done to it | 21:51 |
sean-k-mooney | our downstream ci does not first apply the previous patch before the current one | 21:52 |
*** efried has joined #openstack-nova | 21:52 | |
fungi | i'll be the first to admit that the round-robin job scheduler algorithm is still painful, but in general being backlogged on builds is going to be painful for someone regardless | 21:52 |
sean-k-mooney | fungi: yes but at least zuul makes series actually work | 21:53 |
sean-k-mooney | jenkins on the other hand... | 21:53 |
fungi | if people have ideas for less painful scheduling algorithms/rules we can consider those too | 21:53 |
sean-k-mooney | the only one i have come up with that i thought was better we can't do. | 21:54 |
fungi | still trying to figure out if some of the packet scheduling algorithms used for adaptive rate limiting in the network space could be applied to job scheduling | 21:54 |
sean-k-mooney | which is split check into fast-check and check | 21:54 |
sean-k-mooney | and stick the tox jobs and docs in fast check | 21:55 |
fungi | we already have an analog of qos in place by prioritizing different pipelines | 21:55 |
fungi | and perform windowing and exponential backoff on failure in dependent pipelines | 21:55 |
fungi | so it's not out of the realm of possibility that networking ideas have still more we can steal from | 21:56 |
sean-k-mooney | fungi: just don't follow tcp's window algorithm | 21:56 |
sean-k-mooney | but ya | 21:56 |
fungi | well, yeah, it's not the exact same algorithm, just the basic idea | 21:56 |
fungi | but basically we only allocate resources to a subset of changes in dependent pipelines like the gate, and then increase or decrease that window based on how many changes pass or fail | 21:57 |
sean-k-mooney | there are some interesting algorithms in the cache/task execution domain too | 21:57 |
fungi | so that even though the gate pipeline has priority over the check pipeline, a perpetually failing gate load won't monopolize all available resources and leave check starved entirely | 21:58 |
sean-k-mooney | well to get to gate you have to go through check | 21:58 |
sean-k-mooney | so that should not happen anyway | 21:58 |
sean-k-mooney | it would be self regulating | 21:58 |
*** mlavalle has joined #openstack-nova | 21:59 | |
sean-k-mooney | this would probably be a better conversation to have on the infra or zuul channel | 21:59 |
fungi | except dependent pipelines can eat orders of magnitude more resources due to gate resets from failures near the front of the queue | 21:59 |
efried | fungi: would be neat to have a fast-fail toggle... somewhere, somehow. | 21:59 |
sean-k-mooney | efried: it's a conflict between fail fast and report as much info as possible | 22:00 |
efried | yeah, I know | 22:00 |
sean-k-mooney | fungi: how hard would it be to have each job report independently | 22:00 |
efried | but if I know my patch *should* pass, I turn that toggle on, so if e.g. py27 blows up spuriously, the whole thing gets kicked out right away freeing up the rest. | 22:00 |
fungi | it might be that fast-fail would be more appropriate in a dependent pipeline if its changes are filtered through an independent pipeline first like happens with the openstack tenant | 22:01 |
sean-k-mooney | efried: ya if a voting job fails it would be cool if it did not start any other job in the job set but continued running the ones that are running | 22:01 |
openstackgerrit | Eric Fried proposed openstack/nova master: DB API changes to get non-matching aggregates from metadata https://review.opendev.org/671074 | 22:01 |
openstackgerrit | Eric Fried proposed openstack/nova master: Add a new request filter to isolate aggregates https://review.opendev.org/671075 | 22:01 |
openstackgerrit | Eric Fried proposed openstack/nova master: Docs for isolated aggregates request filter https://review.opendev.org/667952 | 22:01 |
fungi | given the expectation is that if the change has made it through check then it normally shouldn't fail in the gate either, so as soon as one failure is encountered it can just be reported and ejected | 22:02 |
efried | like that one --^ | 22:02 |
sean-k-mooney | fungi: isnt there a way to make jobs depend on other jobs in the same pipeline | 22:02 |
fungi | there is, but then you delay starting them until the others finish which increases time to report on success | 22:02 |
sean-k-mooney | yes but if you only delay it by the 5-10 minutes of the tox jobs | 22:03 |
sean-k-mooney | that might be worth it | 22:03 |
sean-k-mooney | fungi: its too bad we cant simulate this | 22:04 |
fungi | sure, i expect that depends on the project. ultimately though i think there's a lot more throughput to be gained by finding and fixing the bugs that lead to nondeterministic test results than in optimizing for failure | 22:04 |
*** weshay is now known as weshay|ruck | 22:04 | |
sean-k-mooney | well i hope we are trying to do both | 22:04 |
fungi | sure, but the resources consumed by build failures far outweigh the meager gains from shuffling them around in the available resource pool | 22:06 |
stephenfin | efried: Alrighty, functional, unit and pep8 tests passing so I'm going to push up this rebase of PCPU onto vPMEM. Can you spin through it and check out the conflicts when I do? | 22:07 |
* mriedem looks back at irc | 22:08 | |
mriedem | fungi: i apologize for rousing you into this channel and discussion | 22:08 |
mriedem | s/rousing/conjuring/ | 22:08 |
*** tbachman_ has quit IRC | 22:09 | |
fungi | mriedem: no apologies needed. i'm just sorry if i derailed otherwise stimulating discussion in here ;) | 22:09 |
fungi | i do still recommend devstack changes on the feet though... so soothing | 22:09 |
efried | stephenfin: will do. | 22:09 |
efried | stephenfin: actually, hold on | 22:10 |
sean-k-mooney | fungi: the scheduling problem for gate jobs has a lot of similar challenges to the nova scheduler problem for vms but a lot of differences too | 22:10 |
stephenfin | holding | 22:10 |
efried | stephenfin: cpu-resources #1 just passed check and entered gate | 22:10 |
mriedem | fungi: before you showed up and professionaled it up we were talking about artom's beautiful brown eyes | 22:10 |
efried | stephenfin: but #2 is failing check | 22:10 |
stephenfin | efried: where are you seeing this? | 22:11 |
mriedem | sean-k-mooney: i already said we should use zuul as a nova scheduler driver in -infra months ago | 22:11 |
mriedem | dibs on that idea | 22:11 |
efried | stephenfin: so if possible, just rebase #2+ onto vpmem | 22:11 |
sean-k-mooney | hehe its all yours | 22:11 |
efried | I... think that should be possible. | 22:11 |
sean-k-mooney | mriedem: add a ai or ml to it and you have your self a start up | 22:11 |
stephenfin | efried: I don't think I can straddle another branch like that | 22:12 |
stephenfin | I'd need to pull in vpmem between #1 and #2 | 22:12 |
stephenfin | which means pulling that out of the queue | 22:12 |
efried | yeah, boo | 22:12 |
sean-k-mooney | even when its a pain im still glad we have the ci that zuul/infra provides. can you imagine merging all this stuff by hand and testing it like the kernel does | 22:14 |
* alex_xu probably can continue sleep | 22:14 | |
*** hemna has joined #openstack-nova | 22:15 | |
efried | alex_xu: Yes, please do, I don't think there's anything you need to do here. Thanks for all the work. | 22:15 |
alex_xu | efried: stephenfin sean-k-mooney thank you all :) | 22:15 |
sean-k-mooney | alex_xu: night o/ | 22:16 |
efried | stephenfin: I guess the choices are 1) wait for #1 to merge -- gate queue looks to be <2h -- then rebase the remainder on vpmem; or 2) rebase all right now and lose the +V on #1 | 22:16 |
stephenfin | I'm just going to rebase | 22:16 |
stephenfin | it's one patch | 22:16 |
stephenfin | if I don't, we'll need to rebase vpmem | 22:16 |
stephenfin | to pick up the newly merged patch from master | 22:17 |
efried | not sure about that. But.. okay. | 22:17 |
sean-k-mooney | because of a conflict? | 22:17 |
efried | no, because 2+ relies on 1 | 22:17 |
stephenfin | ^ that | 22:17 |
sean-k-mooney | oh ok ya | 22:17 |
sean-k-mooney | thats our green check policy | 22:18 |
stephenfin | maybe the gate will rebase for me. idk | 22:18 |
sean-k-mooney | the gate will merge in some case but never rebase | 22:18 |
stephenfin | doesn't seem worth worrying about though. it's easy to hit recheck all weekend long | 22:18 |
*** eharney has quit IRC | 22:19 | |
*** tbachman has joined #openstack-nova | 22:19 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Start reporting PCPU inventory to placement https://review.opendev.org/671793 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: '_get_(v|p)cpu_total' to '_get_(v|p)cpu_available' https://review.opendev.org/672693 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Add 'InstanceNUMATopology.cpu_pinning' property https://review.opendev.org/680106 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Validate CPU config options against running instances https://review.opendev.org/680107 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Use sane indent https://review.opendev.org/680229 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Add 'NUMACell.pcpuset' field https://review.opendev.org/680108 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Differentiate between shared and dedicated CPUs https://review.opendev.org/671800 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Start reporting 'HW_CPU_HYPERTHREADING' trait https://review.opendev.org/675571 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: tests: Additional functional tests for pinned instances https://review.opendev.org/681750 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Include both VCPU and PCPU in core quota count https://review.opendev.org/681374 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add support for translating CPU policy extra specs, image meta https://review.opendev.org/671801 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: fakelibvirt: Make 'Connection.getHostname' unique https://review.opendev.org/681060 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Mock 'libvirt_utils.file_open' properly https://review.opendev.org/681061 | 22:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add reshaper for PCPU https://review.opendev.org/674895 | 22:19 |
stephenfin | oh thank God | 22:19 |
*** rcernin has joined #openstack-nova | 22:19 | |
stephenfin | I was terrified I'd accidentally rebase vpmem :-D | 22:19 |
sean-k-mooney | ok so now in theory i can create a job that depends on pcpu+vpmem and numa LM right in my testing changes | 22:20 |
sean-k-mooney | since artom rebased LM there should be no conflict anymore right | 22:21 |
stephenfin | correct | 22:21 |
sean-k-mooney | im going to check that locally. | 22:21 |
sean-k-mooney | also i really like git at times like this | 22:22 |
efried | stephenfin: that's weird, how did https://review.opendev.org/#/c/671793/ end up with a -2 from me on it? Usually that only happens if you wound up identical to an earlier PS that was -2 | 22:23 |
sean-k-mooney | ya i can merge https://review.opendev.org/674895 with https://review.opendev.org/#/c/640021 | 22:24 |
stephenfin | Looks like it was one of your procedural -2s | 22:24 |
stephenfin | I suspect git thinks the diff is the same | 22:24 |
stephenfin | as an earlier PS | 22:24 |
sean-k-mooney | ya if gerrit detects there is no change via a rebase it keeps +/- 1/2 | 22:25 |
sean-k-mooney | that stops people just rebasing to clear negative reviews | 22:25 |
stephenfin | I wish Gerrit had some way to view just the diff between two versions side-by-side without all the additional context introduced in between | 22:25 |
*** avolkov has quit IRC | 22:25 | |
stephenfin | As it is, I've to open different versions of the review in different tabs and toggle between them | 22:26 |
sean-k-mooney | thats a 4 way diff and i think the new ui can do it | 22:26 |
stephenfin | o rly? | 22:26 |
*** BjoernT has quit IRC | 22:26 | |
sean-k-mooney | apparently although i have not seen it in practice | 22:26 |
sean-k-mooney | im sure we could still break it | 22:26 |
melwitt | git review -m <review num>,PSold-PSnew will do it, fwiw | 22:27 |
sean-k-mooney | you create it by taking the original patch against base and the updated patch against new base then diff the two diffs | 22:27 |
melwitt | https://docs.openstack.org/infra/git-review/usage.html | 22:28 |
sean-k-mooney | i dont know if that ignores changes introduced by rebasing | 22:29 |
stephenfin | it does not :( | 22:29 |
sean-k-mooney | but the process i describe should do that | 22:30 |
melwitt | it did for me last time I tried it a few days ago | 22:30 |
stephenfin | at least if I do 'git review -m 671793,24-25' I see all the vpmem stuff introduced | 22:30 |
stephenfin | maybe I've an old version of git-review | 22:30 |
* melwitt tries with your example | 22:31 | |
sean-k-mooney | melwitt: it will do it if you dont rebase between 4 and 10 | 22:31 |
melwitt | I mean I did a diff after a rebase between | 22:31 |
stephenfin | sean-k-mooney: yeah, 'git format-patch HEAD~N > (old|new); diff -r old/ new/' would do the trick | 22:31 |
stephenfin | melwitt: I imagine the difference could be that that rebase didn't introduce changes to the files you were touching | 22:32 |
stephenfin | but that's a complete guess | 22:32 |
*** markvoelker has quit IRC | 22:32 | |
melwitt | it did, otherwise it wouldn't be useful right? I went to do it in git-review after seeing the diff in gerrit a mess with rebase stuff | 22:32 |
efried | stephenfin: looks like manual rebases on four of those? You're +A all the way. | 22:33 |
sean-k-mooney | well i just get an error when i run stephens example | 22:33 |
stephenfin | efried: That's what I was seeing too | 22:34 |
stephenfin | Sweet. Anything else I need to do or is it home time? | 22:34 |
efried | I do what stephenfin does - open two tabs and toggle. | 22:34 |
melwitt | but anyway, I'm getting the same result as you with your example :\ I don't get why this acts inconsistent depending on the change | 22:35 |
efried | and yes, rumor has it the newer gerrit has a 4-way. aspiers is always wishing misty-eyed for it. | 22:35 |
sean-k-mooney | it is very pretty | 22:35 |
efried | maybe fungi is getting that set up for us once he gets his shoes back on | 22:35 |
fungi | no shoes needed with new gerrit | 22:36 |
* sean-k-mooney is impressed by how flavor less this beer tastes... | 22:37 | |
fungi | it's like the plushest 1970s era orange shag carpet | 22:37 |
* fungi has no clue what a 4-way diff is. wondering if he needs extra-dimensional space to render it, and what sort of monitor that entails | 22:38 | |
* sean-k-mooney apparently it has ingredients but otherwise heineken light really does taste like water | 22:38 | |
sean-k-mooney | fungi: its a diff that only shows you what changed between two patch sets ignoring anything that came from a rebase | 22:39 |
melwitt | there's a heineken light? eesh. as if normal heineken wasn't bad enough | 22:39 |
fungi | yeah, i'm probably a beer snob but i wouldn't touch heineken anything with a 3m pole | 22:39 |
*** lbragstad has joined #openstack-nova | 22:40 | |
sean-k-mooney | fungi: normally i only drink stouts or porters if im drinking beer | 22:40 |
fungi | (currently enjoying o'connor brewing's "el guapo" agave ipa) | 22:40 |
sean-k-mooney | but this was all i had in the fridge | 22:40 |
melwitt | I'm obsessed with hazy ipas lately | 22:40 |
melwitt | in colder weather I go for stouts and porters | 22:41 |
fungi | those can indeed be tasty | 22:41 |
melwitt | with the exception of north coast's old no 38, I will drink that anytime | 22:42 |
fungi | i do prefer drier varieties like pale ales, but especially when doing yardwork in the summer | 22:42 |
*** slaweq has joined #openstack-nova | 22:42 | |
sean-k-mooney | hot summer days are cider weather | 22:43 |
stephenfin | sean-k-mooney: Spoken like a true Tipp wan | 22:45 |
*** tbachman has quit IRC | 22:45 | |
sean-k-mooney | of course i was born in the town where bulmers/magners cider is from | 22:46 |
*** markvoelker has joined #openstack-nova | 22:46 | |
sean-k-mooney | although i prefer dry cider in general | 22:46 |
*** slaweq has quit IRC | 22:48 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Cleanup reno live-migration-with-PCI-device https://review.opendev.org/681942 | 22:49 |
mriedem | hmm, i was going to avoid mentioning things in the cycle highlights about deprecating the xenapi driver, consoleauth and cells v1, but wondering if i should mention that the placement-in-nova code was removed | 22:50 |
*** markvoelker has quit IRC | 22:51 | |
mriedem | i've avoided mentioning things like that as "highlights" | 22:51 |
sean-k-mooney | if you mentioned it as removed you should mention cells v1 as removed | 22:52 |
sean-k-mooney | deprecation of xen is more a lowlight | 22:52 |
mriedem | cells v1 was optional and not many used it, | 22:52 |
mriedem | placement-in-nova was not optional | 22:52 |
*** ivve has quit IRC | 22:52 | |
mriedem | nor was consoleauth really | 22:52 |
mriedem | anyway, from a marketing perspective those aren't highlights | 22:52 |
efried | mriedem: if you'd rather put a positive spin, focus on "extracted placement is required" | 22:52 |
sean-k-mooney | was the removal of placement done with a release note | 22:53 |
mriedem | that's not really marketing | 22:53 |
stephenfin | does the prelude form part of the marketing? | 22:53 |
mriedem | sean-k-mooney: yes, i'm scanning the release notes for highlights | 22:53 |
mriedem | stephenfin: no, | 22:53 |
mriedem | that's why i think ^ is all good prelude stuff b/c it's big, | 22:53 |
mriedem | but it's not marketing shit that sales people would understand | 22:53 |
melwitt | yeah, I was gonna say, I agree none of those are highlights but they are prelude material | 22:53 |
stephenfin | I agreed | 22:53 |
stephenfin | *agree | 22:53 |
mriedem | if i'm selling train, why do i care if nova uses placement in nova or external | 22:53 |
sean-k-mooney | mriedem: i expect the placement release note will mark its graduation to a separate service as a highlight | 22:53 |
efried | sales schmales, this is open source beyotch | 22:53 |
stephenfin | efried: Not according to cdent :-D | 22:54 |
efried | mriedem: because it's way faster? | 22:54 |
cdent | aw man, don't invoke me for that | 22:54 |
stephenfin | sorry, cdent | 22:54 |
melwitt | sean-k-mooney: well, I think we already used "you can run with extracted placement" as a highlight last cycle | 22:54 |
cdent | for cycle highlights we just said that | 22:55 |
melwitt | re: graduation to separate service | 22:55 |
*** macz has quit IRC | 22:55 | |
cdent | whoops | 22:55 |
cdent | we said "extracted placement must be used" | 22:55 |
efried | okay, I gotta go teach people how to beat each other up. No rest for the wicked and all that. Thanks all for the long hours and general ass-kicking today. | 22:55 |
cdent | and then talked about nfv | 22:55 |
cdent | because it's, like, the rage | 22:55 |
*** hemna has quit IRC | 22:56 | |
sean-k-mooney | cdent: ye had some pretty big feature this release | 22:56 |
cdent | sean-k-mooney: I'm only being half serious, see: https://review.opendev.org/681197 | 22:56 |
sean-k-mooney | cdent: nfv does correlate with rage however | 22:57 |
*** tkajinam has joined #openstack-nova | 23:03 | |
mriedem | efried: et al nova cycle highlights https://review.opendev.org/#/c/681943/1 | 23:05 |
mriedem | if people think that should say something about sev please propose words | 23:05 |
mriedem | sean-k-mooney: sriov live migration was not driver specific right? | 23:06 |
sean-k-mooney | it only works for libvirt | 23:06 |
mriedem | f | 23:06 |
mriedem | of course | 23:06 |
sean-k-mooney | you said that in the highlight | 23:07 |
mriedem | i didn't say it was specific to libvirt, | 23:07 |
sean-k-mooney | oh that was numa | 23:07 |
mriedem | like the numa one | 23:07 |
mriedem | ok i'll leave it wip until tomorrow and update it | 23:08 |
mriedem | i need to family now | 23:08 |
sean-k-mooney | o/ | 23:08 |
*** mriedem is now known as mriedem_afk | 23:08 | |
sean-k-mooney | does anyone know what grenade_devstack_localrc does https://review.opendev.org/#/c/548936/102/.zuul.yaml | 23:09 |
sean-k-mooney | is there a way in grenade to set different local.conf options in different releases? | 23:10 |
sean-k-mooney | ah there is | 23:11 |
sean-k-mooney | maybe not... | 23:13 |
stephenfin | efried: Would you mind if we pulled this out of the queue so I can do some surgery on it tomorrow? https://review.opendev.org/#/c/678470/ | 23:13 |
stephenfin | It's test only so we can merge after FF, if artom's comments on https://review.opendev.org/#/c/640021/ are to be believed | 23:13 |
sean-k-mooney | dansmith: i think i have hit a blocker on the grenade job | 23:23 |
sean-k-mooney | https://review.opendev.org/#/c/548936/102/.zuul.yaml@69 | 23:23 |
sean-k-mooney | i have it mostly done but i dont know how to change the config on upgrade | 23:23 |
sean-k-mooney | i think i might need a grenade plugin to do that or an extension to the grenade job | 23:24 |
*** awalende has joined #openstack-nova | 23:24 | |
*** awalende has quit IRC | 23:28 | |
*** cdent has quit IRC | 23:31 | |
*** icarusfactor has quit IRC | 23:32 | |
*** icarusfactor has joined #openstack-nova | 23:33 | |
*** igordc has quit IRC | 23:34 | |
*** factor has joined #openstack-nova | 23:37 | |
*** icarusfactor has quit IRC | 23:37 | |
*** TxGirlGeek has quit IRC | 23:38 | |
*** JamesBenson has joined #openstack-nova | 23:40 | |
*** JamesBenson has quit IRC | 23:44 | |
artom | stephenfin, that was my understanding, yeah | 23:51 |
artom | I mean, even dansmith was telling me he'd look at it next week | 23:51 |
artom | To me that's an implicit "test-only patches can wait until after FF" | 23:51 |
sean-k-mooney | speaking of test only patches my uber test of doom https://review.opendev.org/#/c/681771/2 is pending in the experimental queue. that will test PMEM PCPUS and numa migration | 23:53 |
sean-k-mooney | its currently getting ready to stack | 23:53 |
sean-k-mooney | http://zuul.openstack.org/stream/6a67b28d6e9e4b739e7575df7943a69d?logfile=console.log | 23:53 |
sean-k-mooney | but im going to go to sleep now | 23:53 |
sean-k-mooney | if nothing else merges but that 31 patch chain then there are no conflicts between those 3 feature branches | 23:54 |
sean-k-mooney | so lets see how that looks in the morning | 23:54 |
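Editor's sketch of the gate window fungi described at 21:55-21:58: only the first N changes in a dependent pipeline get resources, and N grows when changes pass and backs off exponentially when they fail. The class name, the starting size, the floor, the +1 growth step, and the halving factor are all assumptions made for illustration. They are not taken from the log and should not be read as Zuul's actual implementation.

```python
class GateWindow:
    """Hypothetical model of a dependent-pipeline window (not Zuul code)."""

    def __init__(self, size=20, floor=3):
        # size: how many queued changes may consume resources right now.
        # floor: the window never shrinks below this value.
        # Both defaults are assumed values for the sketch.
        self.size = size
        self.floor = floor

    def on_success(self):
        # Linear growth: one more slot per change that passes.
        self.size += 1

    def on_failure(self):
        # Exponential backoff: halve the window, but respect the floor.
        self.size = max(self.floor, self.size // 2)

    def active(self, queue):
        # Only the changes at the front of the queue get test resources.
        return queue[:self.size]
```

With a model like this, a run of gate failures shrinks the window quickly, which is the behavior fungi credits with keeping a perpetually failing gate from starving check.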
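Editor's sketch of the "diff the two diffs" idea sean-k-mooney and stephenfin discussed at 22:27-22:31: reduce each patch to its added and removed lines, then diff those two lists, so that hunk offsets and context that shifted because of a rebase drop out. The function names are hypothetical, and this is not what Gerrit or git-review does internally.

```python
import difflib


def changed_lines(patch_text):
    """Keep only the added/removed lines of a unified diff.

    File headers (+++/---), hunk headers (@@) and context lines are dropped.
    Limitation of this sketch: a removed line whose own content starts with
    "--" looks like a file header and is skipped too.
    """
    kept = []
    for line in patch_text.splitlines():
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            kept.append(line)
    return kept


def diff_of_diffs(old_patch, new_patch):
    """Diff the changed lines of two versions of the same patch."""
    return list(
        difflib.unified_diff(
            changed_lines(old_patch),
            changed_lines(new_patch),
            fromfile="old",
            tofile="new",
            lineterm="",
            n=0,  # no context, only what differs between the two patches
        )
    )
```

If the two patch sets differ only in where their hunks landed after the rebase, the result is an empty list. Git's own `git range-diff` command offers a comparable comparison of two versions of a commit range.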