16:00:05 #startmeeting nova
16:00:06 Meeting started Thu Feb 11 16:00:05 2021 UTC and is due to finish in 60 minutes. The chair is gibi. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:09 The meeting name has been set to 'nova'
16:00:34 o/
16:00:40 o/
16:00:52 ~o~
16:01:13 \o
16:01:55 o/
16:02:28 let's get started
16:02:41 #topic Bugs (stuck/critical)
16:02:44 no critical bugs
16:02:49 #link 13 new untriaged bugs (+1 since the last meeting): #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
16:03:03 o/ (in another meeting though)
16:03:21 is there any specific bug we need to talk about?
16:04:11 #topic Gate status
16:04:18 gate feels OK to me
16:04:31 I know gmann fixed on of the outstanding gate failure recently
16:04:36 *one
16:04:59 gibi: do we have e-r query for that? if not i can add
16:05:06 to see if there is any more tests need fixes
16:05:08 gmann: I have to check
16:05:15 ok i can check.
16:05:18 thanks
16:05:32 btw, there is an ongoing POST_FAILURE issue in Zuul that is being investigated by infra
16:05:49 * bauzas waves late, forgot the meeting
16:05:49 hm, it was fixed apparently
16:06:26 any specific gate failure you want to mention?
16:07:16 also if you see tempest regex error
16:07:18 #link https://review.opendev.org/c/openstack/tempest/+/774766
16:07:25 ^^ that is fixed so you can recheck
16:07:34 --exlude typo
16:08:00 thanks gmann
16:08:33 #topic Runway status
16:08:39 #link https://etherpad.opendev.org/p/nova-runways-wallaby
16:08:45 #link https://blueprints.launchpad.net/nova/+spec/nova-support-webvnc-with-password-anthentication : has negative feedback to work with
16:08:51 #link https://blueprints.launchpad.net/nova/+spec/compact-db-migrations-wallaby : the nova db patches approved the nova_api db patches needs review
16:09:00 #link https://blueprints.launchpad.net/nova/+spec/modernize-os-hypervisors-api : the api code landed, the python-novaclient patch and the policy patch needs some work
16:09:25 #link https://blueprints.launchpad.net/nova/+spec/support-interface-attach-with-qos-ports the last necessary patch got feedback and now updated, ready to review
16:09:39 * kashyap waves
16:09:41 any specific feature we need to talk about
16:09:43 ?
16:09:56 #link https://blueprints.launchpad.net/nova/+spec/libvirt-default-machine-type - I've just added this to the queue btw
16:11:32 lyarwood: ack
16:11:47 is there a prime candidate who has time to review that?
16:12:19 I can
16:12:43 stephenfin: thanks, just a reminder that I'm AFK on Friday's but I'll address any feedback first thing Monday
16:12:43 cool, I will try to get there as a second reviewer somewhere mid next week, but early next week I will be busy with internal conference preparation
16:12:53 * kashyap can also look at it from some of the libvirt-related PoV ...
16:13:24 thanks both
16:13:56 lyarwood: Thank _you_ for taking on the work ... I was supposed to do some of it, and couldn't
16:14:38 any other feature that needs attention?
16:15:03 just the DB stuff
16:15:10 I guess here or in open discussion but...
there's a bunch of pre-requisite os-traits patches up
16:15:22 the API DB ones are ready for review now
16:15:31 And I'm told we can't just Depends-on: for those, so we need to land them and release
16:15:39 I already pinged Dan about them
16:15:49 stephenfin: ack, it is on my radar
16:15:55 artom: Yeah, I reviewed most/all of those last night
16:15:59 just need another +"
16:16:00 *2
16:16:03 stephenfin, yep, so needs the +A
16:16:06 artom: which feature depends on the os-traits release?
16:16:18 gibi, yes ;)
16:16:36 So there's secure boot, VDPA, ephemeral encryption...
16:16:40 I see
16:16:42 My socket policy thing
16:16:52 thanks
16:17:05 I will try to hit it this week
16:17:33 anything else?
16:18:23 #topic Release Planning
16:18:27 Feature Freeze is at 11th of March, in 4 weeks from now
16:18:41 let's hurry up landing features :)
16:18:58 anything else about the coming release?
16:19:52 #topic Stable Branches
16:19:58 Rocky (and might be older branches too) is blocked by issue (tempest-slow job): https://bugs.launchpad.net/neutron/+bug/1914037
16:20:00 Launchpad bug 1914037 in tempest "scenario tests tempest.scenario.test_network_v6.TestGettingAddress fails" [Medium,Triaged] - Assigned to Hemanth Nakkina (hemanth-n)
16:20:04 newer branches seem OK
16:20:38 EOM
16:20:43 thank elod
16:20:59 any other news from stable?
16:21:04 Nope
16:21:11 np, I'll try to review the fix
16:21:13 yeah we are trying to fix that in https://review.opendev.org/c/openstack/tempest/+/774764
16:21:24 and testing nova patch #link https://review.opendev.org/c/openstack/nova/+/775003
16:21:36 it did not tested due to how zuul pick the job definition
16:21:44 which is fixed now and should work.
16:22:20 gmann: thanks
16:22:21 Now the nova-stable-maint is part of placement-stable-maint group so all our stable love can spread to placement too
16:22:22 gmann: thanks, looks promising \o/
16:23:01 moving on
16:23:01 gibi: ah that reminds me
16:23:07 lyarwood: yes
16:23:21 gibi: sorry, quick note on placement, stable/victoria was blocked but I didn't have time to look into why
16:23:49 gibi: I was trying to land the .gitreview changes to actually open up the branch
16:24:04 https://review.opendev.org/c/openstack/placement/+/754671 for example
16:24:14 lyarwood: I'll have a look
16:24:18 thanks
16:24:20 lyarwood: pyflakes rror
16:24:21 error
16:24:37 yeah I assumed it would be something lc related
16:24:38 pyflakes version conflict
16:24:45 I've just not had time
16:25:05 if you and elod could look that would be great, I'll help with reviews once I'm back on Monday
16:25:08 hacking needs to be bumped I guess
16:25:11 sure
16:25:14 thanks
16:25:24 lyarwood: thanks
16:25:37 #topic Sub/related team Highlights
16:25:40 I'll try to take care of the conflicts there :)
16:25:49 elod: cool, thaks
16:25:52 thanks even
16:26:05 Libvirt (bauzas)
16:26:17 .
16:26:23 (that's it ;) )
16:26:26 OK
16:26:30 #topic Open discussion
16:26:38 we have couple of topics on the agenda
16:26:45 (kashyap; 05-FEB-2021) Late blueprint-approval request: https://blueprints.launchpad.net/nova/+spec/allow-disabling-cpu-flags
16:26:51 I realize this is late in the cycle, but this really helps alleviate potential live migration problems on some Intel hardware during upgrades and updates. This is technically a simple feature; but can also be considered a "bug fix" that unblocks live migration in some scenarios.
16:26:51 Yep...
16:26:56 Main Benefit: The ability to selectively disable CPU features for a guest means: newly launched VMs on a compute node can now disable offending guest CPU flags that block live migration. This facilitates live migration to a host with TSX=off.
16:27:00 Example: Today, a VM running on a compute node with Intel TSX=on (which is the default on all Linux kernels until v.5.11) cannot be migrated to a node with Intel TSX=off. But, the ability to selectively disable CPU flags alleviates this (and similar problems): you can now keep TSX enabled on a host, and yet block it for the guest via `cpu_model_extra_flags`. This unblocks live-migrating the
16:27:06 said guest to a host with TSX=off.
16:27:09 EOM
16:27:11 Notes: On relevant Intel processors, TSX is suggested to be disabled as it can be a potential security problem. TSX is disabled by default upstream Linux v.5.11 (Oct-2019)
16:27:32 gibi: So ... as a cherry on top; the last two hours I've done some tests
16:27:49 So I said before that if nobody objects and the implementation patch is ready then I'm willing to approve the bp late and right at the moment +2 the impl patch as well
16:28:00 gibi: o/ I have a topic for later when all others are done. Sorry I didn't put it in the agenda
16:28:11 ganso: ack, I will ping you
16:28:26 kashyap: does the test promising?
16:28:32 Yes!
16:28:38 awesome
16:28:38 Let me get the evidence quickly :)
16:28:52 gibi: Here (for later): https://kashyapc.fedorapeople.org/CPU_flags_Nova_tests.txt
16:29:11 I've done three tests w/ three different Nova [libvirt]cpu_* configurations
16:29:28 And all three yield expected results. I'm just checking some more; and I'll post the evidence in the review for the record
16:29:54 kashyap: thanks
16:29:59 it is convincing
16:30:01 gibi: So, if you have a quick look there; the enabled CPU flag shows up in the guest; and the disabled ones don't.
16:30:19 I also want to test on a different Intel host (the problematic ones), and then summarize the results.
16:30:54 btw, this is the implementation patch #link https://review.opendev.org/c/openstack/nova/+/774240
16:31:20 Yeah
16:31:39 gibi: A small observation, though, on the XML bits:
16:33:11 As expected, the disabled flags don't show up for the guest. But I'm wondering if it should also show up as "disable" in the guest XML, e.g.
16:33:14
16:33:17
16:34:07 gibi: As of now, the functionality is as expected: if you tell Nova to disable a flag; it will not give it to the guest. But if you tell it to explicitly enable, or give neither '+' nor '-', it enables it.
16:34:11 All expected behaviour.
16:34:41 kashyap: let's take this in the review
16:34:46 so others in this room, is there any objection to late approve the above bp and then quickly review the small implementation patch?
16:34:48 Anyway, I don't want to ramble on about the feature here.
16:34:49 Yep, sorry
16:35:44 * kashyap thinks he made others zone out :D
16:36:44 * gibi fetches his PTL whip
16:36:51 Hehe
16:37:22 * bauzas turns around and doesn't look
16:37:23 stephenfin: lyarwood: or anyone else --^ Any objections, rotten tomatoes, snide remarks?
16:37:46 nope
16:37:58 none from me at the moment, but there's time ;)
16:38:05 * lyarwood will review on Monday
16:38:11 No problem
16:38:26 OK, I consider this as sold. I will late approve the bp and we will do a proper review on the impl patch
16:38:32 moving on
16:38:42 (stephenfin) https://review.opendev.org/c/openstack/nova/+/772271 is stuck
16:38:45 gibi: Yeap, thank you!
16:38:47 elod is concerned about the backportability of this, as it has user-facing impacts. As noted by lyarwood though, the previous behavior was wrong
16:39:04 EOM
16:39:22 elod: how strongly you object :)
16:39:25 ?
16:40:04 well, like my last comment there :D
16:40:14 I tend to agree with elod, but haven't fully digested the change
16:40:52 broken forever behavior right?
if it's not a regression, then it's less clear that it _needs_ to go back, and since it's a fairly substantial change in behavior, I'd generally rather not
16:41:20 yeah broken forever
16:42:14 yeah, broken forever, but use of designate means you'll likely hit it sooner rather than later
16:42:43 I'll review it more in a bit and vote, but probably -1
16:43:28 with possible two -1s I consider it as not approved for inclusion
16:43:51 well -1's from stable cores but yeah I agree
16:44:03 lets wait for dansmith to review it in full and we can go from there
16:44:10 lyarwood: OK
16:44:11 yeah, seems reasonable
16:44:18 moving on
16:44:20 (stephenfin) Outreachy projects?
16:44:26 I'm already helping mentor some NDSU students over in OSC/SDK land w/ diablo_rojo and gtema and could probably work with someone else. Do we have any nice, self-contained items that we'd like to do but just haven't had time for though?
16:44:33 * gibi thinks of things like PCI in placement (would need hardware though)
16:44:39 no me stephenfin :D
16:44:45 http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020288.html
16:44:48 EOM
16:45:31 so per $summary
16:45:51 ideally it should be something useful enough that it will be reviewed, but not so important that it'll be an issue if it isn't done
16:46:27 shared storage in placement?
16:46:40 that's huge actually, ignore me
16:46:57 PCI also not well understood at least not for me without digging up notes
16:47:09 fair point
16:47:49 mypy could be something that is small, but we don't have consensus on the usefulness of it
16:48:03 yeah, don't put them in the middle of THAT :)
16:48:08 :)
16:48:39 somebody should fix the gerrit -> launchpad integration, that would be very useful for me
16:48:46 ++ :)
16:49:03 that would not be release critical so can be done slowly
16:49:12 I just don't know if somebody already started it
16:49:18 and it not nova specific
16:49:58 And the job results display
16:50:10 stephenfin: do you need ideas or do you need a volunteering mentor?
16:50:17 we can open an ethercan of worms
16:50:22 The current greasemonkey script is buggy
16:50:22 I can check with infra
16:50:33 gibi: Ideas. I'm okay with mentoring
16:50:38 gibi: stephenfin: Yeah, for Outreachy ... FWIW, anything hardware-specific would be too much for a novice student
16:51:02 Something that can be done in VMs / et al, with bite-sized-tasks would be nice
16:51:04 We don't need to figure them out now. Mostly raising it to the front of peoples' minds
16:51:06 * kashyap stops giving unsolicited advice
16:51:29 stephenfin: Is OpenStack accepted to Outreachy this cycle?
16:51:30 If anyone does have additional ideas, lemme know and I'll chat with Kendall about them
16:51:43 stephenfin: thanks
16:51:45 kashyap: yup, it seems so (see the ML link)
16:51:51 Nice
16:52:06 moving on
16:52:09 ganso: your turn
16:52:14 gibi: thanks!
16:52:28 hi everyone!
I'd like to revisit this https://bugs.launchpad.net/nova/+bug/1821755
16:52:30 Launchpad bug 1821755 in OpenStack Compute (nova) "live migration break the anti-affinity policy of server group simultaneously" [Medium,In progress] - Assigned to Boxiang Zhu (bxzhu-5355)
16:52:55 2 approaches have been suggested, 1 long term ideal solution using placement
16:53:07 and 1 short term approach, which seems to be https://review.opendev.org/c/openstack/nova/+/651969/
16:53:21 that short term approach seems like it just mitigates the problem and it is still racy
16:53:48 so I'm looking at the long term approach. Considering the complexity of integrating all the moving parts, I'd assume it will require a spec
16:54:13 because it sounds like it will involve deprecating the affinity and anti-affinity filters, in favor of having this functionality in placement
16:54:55 I'd like to know with everyone agrees that this is the correct direction, if this is something worth working into (as it will require review effort from you folks)
16:55:04 s/know with/know if
16:55:24 ganso: modelling affinity in placement definitely needs a spec
16:55:46 ganso: I don't recall if we had any stab at it previously
16:56:19 instances are in placement as consumers so locality can be checked
16:57:10 I was thinking about having a placement property or something to map the affinity and anti-affinity to the instances, like if it was a resource
16:57:48 anyway, those are details that can be sorted out in the spec and in future meetings
16:58:14 ganso: the problem is that allocation candidate query does not have a way to say I don't want to be next to a consumer
16:58:35 so this probably needs placement api work
16:59:19 anyhow I suggest you to look at existing placement pre filters in nova
16:59:23 as a starting point
16:59:30 gibi: thanks!
16:59:42 is there anything else for today?
16:59:44 I will start playing around with this and working on the spec
16:59:54 ganso: cool, ping me if you have questions
17:00:02 gibi: will do =)
17:00:06 we are out of time
17:00:09 thanks for the meeting
17:00:11 we covered a lot
17:00:16 #endmeeting
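
Note on the CPU flag semantics from the 16:34:07 message: a flag marked with '-' is withheld from the guest, while a flag marked with '+' or given with no prefix is enabled. The Python sketch below illustrates only that rule. It is not taken from the Nova patch under review: the helper name `split_cpu_flags`, the "last entry wins" handling of repeated flags, and the sample flag names (`pcid`, `ssbd`, and the TSX-related `hle` and `rtm`) are assumptions made for illustration.

```python
def split_cpu_flags(extra_flags):
    """Return (enabled, disabled) sets of CPU flag names.

    Hypothetical helper for illustration only; this is not Nova's code.
    A leading '-' marks a flag as disabled; a leading '+' or no prefix
    marks it as enabled. If a flag is listed more than once, the last
    entry wins (an assumption of this sketch).
    """
    state = {}
    for raw in extra_flags:
        flag = raw.strip().lower()
        if not flag:
            continue  # ignore empty entries
        if flag[0] in "+-":
            name, enable = flag[1:], flag[0] == "+"
        else:
            name, enable = flag, True  # no prefix means "enable"
        if name:
            state[name] = enable
    enabled = {name for name, on in state.items() if on}
    disabled = {name for name, on in state.items() if not on}
    return enabled, disabled


# Sample input: keep two flags, withhold the two TSX-related ones.
enabled, disabled = split_cpu_flags(["+pcid", "ssbd", "-hle", "-rtm"])
print(sorted(enabled), sorted(disabled))
```

Whether a disabled flag should also be written out explicitly in the guest XML was left open in the meeting and deferred to the review, so the sketch stops at classifying the flags.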