14:00:11 #startmeeting nova
14:00:12 Meeting started Thu Oct 17 14:00:11 2019 UTC and is due to finish in 60 minutes. The chair is mriedem. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:15 The meeting name has been set to 'nova'
14:00:21 o/
14:00:32 o/
14:00:38 o.
14:00:39 efried is shuttling kids atm
14:00:42 * cdent is only barely here
14:00:44 o/
14:00:49 o/
14:00:55 o/
14:01:12 #link agenda https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting
14:01:19 #topic Last meeting
14:01:23 #link Minutes from last meeting: http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-03-14.00.html
14:01:27 actions:
14:01:32 #link db migration placeholders (merged) https://review.opendev.org/#/c/686411/ (done)
14:01:39 #link SEV bug warning patch https://review.opendev.org/#/c/686414/ (done)
14:01:53 #link update contributor guide https://review.opendev.org/#/c/685630/ - done but more could be done
14:01:59 #topic Release News
14:02:03 Train is released, stable/train is open for business.
14:02:10 #link Nova Ussuri schedule https://wiki.openstack.org/wiki/Nova/Ussuri_Release_Schedule
14:02:20 any questions about release stuff?
14:02:26 hmm,
14:02:37 so the compute alias patch finally landed, I guess I should propose that to train as well
14:02:51 i think it's optional,
14:03:02 when looking at stein and rocky, we did one of those in its target release and one after
14:03:05 but didn't backport
14:03:07 i can't remember which
14:03:27 it doesn't hurt to backport either so whatever
14:03:29 it is not critical, but it's confusing for it to not be consistent, just not a huge deal to hold it up
14:03:30 yeah
14:04:07 moving on
14:04:12 #topic Summit/PTG Planning
14:04:19 Ussuri scope containment,
14:04:23 #link Spec template update for "core liaison" is merged https://review.opendev.org/#/c/685857/
14:04:44 ^ had 2 cores vote on it so that makes 3 total that are aware of the new spec process,
14:05:02 if people have issues with what was merged they can propose amendments
14:05:20 * mriedem doesn't know what the new process is yet either really
14:05:26 questions?
14:05:33 lol
14:05:45 so we still waiting for new process of spec selection right ?
14:06:05 core liaison is just for adding liaison things not the new process
14:06:12 not really "waiting" I don't think..
specs are being approved and implemented
14:06:13 you mean the capping
14:06:20 yeah
14:06:30 or we dropped the idea of capping
14:06:35 i want to say eric intends on capping things as we get closer to a freeze date but not sure
14:06:42 i don't think he dropped the idea of capping
14:06:57 but i also haven't been paying much attention to it either so you'd have to ask eric
14:07:09 ok
14:07:17 #topic Stable branch status
14:07:27 as i said stable/train is open and i've been +2ing changes
14:07:31 #link stable/train: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/train
14:07:43 we just recently-ish did a stein release,
14:07:43 #link stable/stein: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/stein
14:07:51 we did a rocky release last week,
14:07:51 #link stable/rocky: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/rocky
14:08:03 and we need to flush queens and do a release soon,
14:08:03 #link stable/queens: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/queens
14:08:11 queens goes into extended maintenance in a couple of weeks
14:08:35 oct 25 to be exact
14:08:51 #help review stable/queens backports so we can flush the queue and do a final release before extended maintenance
14:09:10 note that some of the things proposed to queens are yet to be merged on the newer branches - basically everything gibi is pushing :)
14:09:24 questions about stable?
14:09:41 #topic Bugs (stuck/critical)
14:09:52 there is 1 critical bug,
14:09:54 #link https://bugs.launchpad.net/nova/+bug/1848499
14:09:54 Launchpad bug 1848499 in OpenStack Compute (nova) "tests fail with networkx 2.4: "AttributeError: 'DiGraph' object has no attribute 'node'"" [Critical,In progress] - Assigned to Balazs Gibizer (balazs-gibizer)
14:10:02 gibi has a patch: https://review.opendev.org/#/c/689152/
14:10:08 but we're totally blocked in the gate until that lands
14:10:12 so hold your rechecks
14:10:26 #link 85 new untriaged bugs (+11 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
14:10:29 so,
14:10:36 that was not the failure I saw and rechecked for
14:10:47 yeah it is
14:10:48 there were like five powervm unit fails
14:10:50 it's the powervm driver tests,
14:10:53 which use taskflow,
14:10:57 which uses networkx
14:10:59 oh, jeez, okay
14:11:06 which just released and is uncapped
14:11:13 turbugen
14:11:29 as seen from the numbers above we're falling behind in bug triage,
14:11:32 #link 9 untagged untriaged bugs (+6 since the last meeting): https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
14:11:53 #link bug triage how-to: https://wiki.openstack.org/wiki/Nova/BugTriage#Tags
14:11:57 #help need help with bug triage
14:12:13 we do have a few random oslo.messaging/conductor related bugs sitting in triage that i don't know what to do with
14:12:22 since they are like, "rpc something, service crashed"
14:12:29 *shrug*
14:12:30 o/ sorry folks
14:12:34 * efried_afk catches up
14:12:46 efried_afk: we're at gate status, you want me to just keep going?
14:13:05 efried: ^
14:13:14 #topic Gate status
14:13:19 #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
14:13:31 so we obviously have a few gate bugs, the one above being the main issue,
14:13:43 http://status.openstack.org/elastic-recheck/index.html#1844929 is our other major gate bug,
14:13:47 for which i don't have an answer
14:14:17 i have a couple of patches up to grenade and devstack gate to try and get more detailed mysql logs to see why mysql connections are getting aborted,
14:14:25 but those patches aren't working yet and i haven't been prioritizing it
14:14:29 so if someone can help that'd be cool
14:14:31 mriedem: yes please and thanks.
14:14:53 #link 3rd party CI status (seems to be back in action) http://ciwatch.mmedvede.net/project?project=nova
14:15:11 not much to report on 3rd party ci status except i'm sure some jobs are still not working,
14:15:28 and i reported a thing to the hyper-v guys last week and it was already fixed by the time i reported it so thanks to them for being on top of their game
14:15:45 #topic Reminders
14:15:53 there is nothing in the agenda for reminders
14:16:01 #topic Sub/related team Highlights
14:16:07 Placement (cdent/tetsuro) ?
14:16:20 i was just going to note https://review.opendev.org/#/c/688969/ if there was nothing else
14:16:31 tetsuro: posted an email about ptg topics for people who will be there
14:16:32 as i said on there i'm a .5 meh
14:16:54 melwitt made some updates to the consumer types stuff (thank you!)
14:16:57 that's all I know
14:17:07 ack,
14:17:17 #link placement ptg http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010182.html
14:17:37 #link placement ptg etherpad https://etherpad.openstack.org/p/placement-shanghai-ptg
14:17:38 yeah, I just addressed my own test comments. stack has passed CI
14:18:05 re COMPUTE_NODE patch, if we can get that merged and released, I can roll up two versions in the placement catch-up.
14:18:23 efried: I will look at it soon ^^
14:18:26 thx
14:18:40 dansmith: just so you're aware, and i don't remember if there is any reason not to do this,
14:18:58 but that patch is to add a generic COMPUTE_NODE trait to more easily filter compute node resource providers, like in sylvain's audit CLI
14:19:12 i feel like we've talked about this a few times over the years but just never did it for whatever reason
14:19:40 moving on to,
14:19:40 API (gmann)
14:19:47 I have not started updates for ussuri yet (may be after PTG I will collect the API works and status)
14:19:54 and reminder for review on policy spec - https://review.opendev.org/#/c/686058/
14:20:09 that's all from me today.
14:20:12 ok
14:20:14 #topic Stuck Reviews
14:20:19 nothing in the agenda
14:20:26 #topic Review status page
14:20:30 #link http://status.openstack.org/reviews/#nova
14:20:34 Count: 456 (-1); Top score: 2133 (+42)
14:20:39 #help Pick a patch near the top, shepherd it to closure
14:20:48 just a reminder on ^,
14:21:02 if you abandon a change, update the related bug (mark it invalid if it's invalid, etc)
14:21:16 #topic Open discussion
14:21:30 since we skipped last week's meeting we have a lot more than usual items here
14:21:35 #help Volunteer core(s) for "Meet the Project Leaders" in Shanghai
14:21:39 i think that is done
14:21:44 efried: ^?
14:21:58 ...
14:22:13 i thought stephenfin and gibi were signing up for all outreach work
14:22:15 Yup, bauzas and stephenfin have volunteered
14:22:18 ok
14:22:20 #link https://etherpad.openstack.org/p/meet-the-project-leaders
14:22:24 I can be there if needed
14:22:25 stephenfin
14:22:33 thanks for that
14:22:37 i'm assuming alex_xu will also be at the summit...?
14:22:39 both
14:22:55 seems like it would be good to have the one chinese nova core involved in that stuff while in china :)
14:23:06 yes, yes it would
14:23:34 #help Volunteer for interop guidelines
14:23:39 #link http://paste.openstack.org/raw/781768/
14:23:48 not sure if anyone signed up for this,
14:23:51 FWIW I think this is the last time we added anything and it was the first time we added a guideline that depended on a microversion: https://review.opendev.org/509955/
14:24:22 that was adding the 2.2 keypair type microversion to interop
14:24:30 that microversion was added in kilo....
14:24:39 so you can see how far behind the guidelines are compared to the compute API
14:25:12 i'm not signing up, but if someone is interested you could just look for something newer than kilo that would be non-admin and hypervisor-agnostic as a potential guideline
14:25:39 moving on,
14:25:44 #link in-place numa rebuild https://blueprints.launchpad.net/nova/+spec/inplace-rebuild-of-numa-instances
14:25:53 sean-k-mooney: ^
14:25:58 i'm not sure why that's a blueprint rather than a bug?
14:26:07 ya so its actually 2 bugs
14:26:20 for interop microversion, interop question was for corresponding tempest tests and we said we have existing tests and if no tests then it can be added
14:26:22 i create a blueprint just to group them together
14:26:52 https://review.opendev.org/#/c/687957/ is 99% of the work addressing https://bugs.launchpad.net/nova/+bug/1763766
14:26:52 Launchpad bug 1763766 in OpenStack Compute (nova) "nova needs to disallow resource consumption changes on image rebuild" [Medium,In progress] - Assigned to sean mooney (sean-k-mooney)
14:27:10 the final step is skipping the numa topology filter on rebuild
14:27:18 if it's bugs, do it as bugs, save some paperwork.
14:27:30 you also can't backport blueprints generally
14:27:34 so i'd nack the blueprint
14:27:35 i just wanted to make sure people were ok with that approach
14:27:37 if this is more than one bug I think having a bp helps to track the work
14:27:44 this is a pretty big change though
14:28:00 the bugs are more "this was never implemented" which we've previously said are not good bugs
14:28:00 dansmith: not really well feature wise maybe
14:28:05 code wise its small
14:28:09 sean-k-mooney: I know
14:28:17 seems like a BP to me
14:28:52 ya which is the main reason i filed the blueprint
14:29:04 so honestly i dont mind either way
14:29:25 but im tracking it as an RFE downstream
14:29:59 regarding changing rebuild to explicitly fail,
14:30:07 we did that as a bug fix for volume-backed servers when the image changes,
14:30:26 because we used to silently allow that and the image in the root volume never changed and confused everyone
14:30:28 is this like that?
14:30:38 yep exactly
14:30:44 we do a noop claim
14:30:49 so the numa topology does not change
14:30:57 until you do a move operation
14:30:59 but the image might have a different topology
14:31:03 yep
14:31:44 i would only question if people have been relying on that behavior such that it's now a "feature",
14:31:51 i.e. rebuild with new image with new topology and then migrate
14:32:04 dansmith: is that your concern? ^
14:32:37 i debated adding a workaround config option to allow that for backport reasons.
but going forward i would prefer not to support that
14:32:58 kind of like the numa migration
14:33:03 mriedem: no, I don't really have any concerns, I just think that it makes sense to call this a blueprint/feature addition
14:33:25 "foo is not implemented" is not a bug to me
14:33:29 which we've said several times
14:34:06 but we've already spent too much time on it here, so I'd say call it a pink elephant if you want, let's just move on and do it
14:34:55 ok moving on, i left some thoughts in the patch
14:35:03 (melwitt) requesting approval for specless blueprint: https://blueprints.launchpad.net/nova/+spec/policy-rule-for-host-status-unknown
14:35:27 yeah, this is the same blueprint from last cycle that I had asked about. adding a new policy rule
14:35:29 assuming melwitt's patch matches the conditions set in the blueprint whiteboard the last time we talked about this,
14:35:35 i don't see why we wouldn't approve
14:35:47 yes, the patch is implemented as described by mriedem in the bp whiteboard
14:36:06 then i'm +1 to approve
14:36:40 gibi also said he was ok with it in the bp
14:36:45 and i'm assuming efried is as well
14:36:52 so barring any objections i'll approve after the meeting
14:36:57 go for it
14:37:01 thank you
14:37:06 (stephenfin) approval for specless blueprint: https://blueprints.launchpad.net/nova/+spec/remove-xvpvncproxy
14:37:19 * stephenfin perks up
14:37:20 ok with me
14:37:32 only complication is that there's an API here
14:38:04 but we've removed (HTTP Gone) nova-network APIs in the past without specs so it should be okay, imo
14:38:15 this would be done similar to how you removed os-cells right? is there anything different
14:38:24 just a 410, it's the same
14:38:30 nah, same thing
14:38:35 ok, then yeah seems fine to me
14:38:46 BobBall said in the mailing list before/around the denver ptg that no one is probably using this right?
14:38:59 correct.
I've a patch up and have linked to those discussions from it
14:38:59 of course we don't know who uses xenserver, they seem to randomly show up
14:39:19 we also included it in reno, and they have another option (noVNC)
14:39:21 the only point i wasn't clear on from bob's email was a replacement for this
14:39:25 noVNC
14:39:27 ah
14:39:28 ok
14:39:35 yeah what I recall from the ML thread, it is safe to remove and an old legacy thing
14:39:57 alright so barring objections (again) i'll approve after the meeting
14:40:11 (mriedem): Need to discuss options for https://bugs.launchpad.net/nova/+bug/1847367
14:40:11 Launchpad bug 1847367 in OpenStack Compute (nova) "Images with hw:vif_multiqueue_enabled can be limited to 8 queues even if more are supported" [Undecided,Confirmed]
14:40:41 so apparently we have hard-coded some tap queue limits in the libvirt driver vif module based on kernel version,
14:40:42 mriedem: thats on my dodo list for this week
14:40:53 keep your doodoo to yourself sean
14:41:06 :)
14:41:06 hah, I need to call it my dodo list. Work that's extinct.
14:41:09 but centos has a patched kernel that allows more than what we have hard-coded
14:41:24 i don't really want distro-specific checks in our code,
14:41:27 am its a long known bug
14:41:49 basically the limit should never have applied to vhost-user ports
14:41:51 so another option is just having a workaround flag to allow deployers to set the value
14:42:13 yet another is that we just document this
14:42:27 document what
14:42:28 ?
14:42:32 is it an option that we expect to eventually remove? that's the usual condition for [workarounds]
14:42:37 that some distros are broken
14:42:38 you want the fix, go bug your vendor to backport the fix à la CentOS
14:42:48 assuming I've understood it correctly
14:42:52 well the issue is that rhel/centos backported changes from the 4.x kernel
14:43:04 ah, wait, yeah, other way round
14:43:06 ignore me
14:43:08 I think it's the opposite?
we hard-coded something based on kernel version
14:43:17 yeah
14:43:18 yes
14:43:38 melwitt: idk what the timeline would be for removal,
14:43:40 so we could add a workaround flag for it as mriedem suggested
14:43:44 we don't really require min versions of kernels
14:43:54 unless there is some procfile or sysfile magic to do here this is the best solution imo
14:43:58 if not a workaround flag, it'd be a libvirt group flag
14:44:16 i would default to None to let nova decide but allow you to override if you're using a patched kernel
14:44:24 are people ok with that?
14:44:30 this is also applied to vhost-user ports today which it should never have been applied to so that is what i planned to work on tomorrow
14:44:37 wfm
14:45:04 mriedem: that works for me
14:45:06 sean-k-mooney: ok so you can mention vhostuser ports in the bug report i guess, sounds like those are separate changes
14:45:10 yeah. I can't decide whether it should be [workarounds] or not, but config option seems like the best way
14:45:26 #agree add libvirt group option to override max tap limit for bug 1847367
14:45:26 bug 1847367 in OpenStack Compute (nova) "Images with hw:vif_multiqueue_enabled can be limited to 8 queues even if more are supported" [Undecided,Confirmed] https://launchpad.net/bugs/1847367
14:45:26 mriedem: we apply the kernel limit to vhost-user ports too
14:45:35 so ya its a separate but related issue
14:45:42 mriedem: do you want me to take this
14:45:46 sean-k-mooney: you're talking to me about this like i know anything about vhostuser limits and max tap queues
14:45:54 since i was going to fix it for vhostuser anyway
14:45:55 moving on
14:46:01 sean-k-mooney: separate bugs
14:46:04 just report a different bug
14:46:13 ok
14:46:15 (mriedem): Do we have a particular stance on features to the libvirt driver for non-integration tested configurations, e.g.
lxc [ https://review.opendev.org/#/c/667976/ ] and xen [ https://review.opendev.org/#/c/687827/ ], meaning if they are trivial enough do we just say the driver's quality warning on startup is sufficient to let them land since these are changes from casual contributors scratching an itch?
14:46:49 maybe ^ should be in the ML
14:47:04 yeah, that's a weird one.
14:47:07 but these examples seem trivial enough to me to fall under part-time hobby contributor scratching an itch
14:47:16 probably, yeah
14:47:16 yeah, maybe good for ML
14:47:16 so saying "3rd party ci or die" won't fly
14:47:19 but with that said
14:47:28 and I agree, if trivial then ok to land
14:47:32 if it doesn't impact _other_ drivers, go nuts, I say
14:48:01 #action mriedem to ask on the ML about accepting trivial feature patches for driver configs that don't get integration testing
14:48:13 There is already a notice saying these drivers aren't integration tested
14:48:28 i know, hence the "quality warning" thing in my line above
14:48:32 also we can test lc and xen upstream I think (xen might require some hax though)
14:48:40 *lxc
14:48:51 clarkb: "can" and "who is going to do and maintain that" are different things
14:49:06 clarkb: we could test xen more simply if we had a xen image/label
14:49:13 moving on,
14:49:16 final item,
14:49:17 (melwitt): requesting approval for specless blueprint: https://blueprints.launchpad.net/nova/+spec/nova-manage-db-purge-task-log
14:49:23 melwitt: i left a comment in there,
14:49:24 I get that but telling someone they can write a job is way easier than run an entire ci system
14:49:26 smells like a bug
14:49:38 I MLed about this awhile back
14:50:01 orly? I'm happy to treat it as a bug
14:50:23 the bug is that these records pile up and you have to go directly to the db to clean them up
14:50:39 it's somewhere in between a super low priority bug and a feature i guess
14:51:05 anyway, i'd probably just create a wishlist bug for it?
idk what others think
14:51:17 our customers are using the task_log stuff by way of telemetry always enabling the audit API, so we have a lot of people affected by the record build up. and there are enough of them not otherwise truncating the table, it's all been one-offs
14:51:22 we don't have great process rules around adding ops tooling things for nova clis
14:51:58 ok. I'm good either way so if ppl prefer a wishlist bug, I can do that
14:51:58 i'm ok with a bug but others can overrule
14:52:30 maybe lobby for opinions in -nova after the meeting, let efried decide
14:52:41 anything else?
14:52:45 ok
14:52:57 ok thanks for hanging in there everyone
14:52:59 #endmeeting
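
Note on the 14:45:26 agreement for bug 1847367: a minimal sketch of how an operator override could sit on top of a kernel-version-based cap, not the patch that was eventually written. The function name `max_tap_queues`, the `override` parameter, and the per-kernel values other than 8 are assumptions for illustration only; the log states just that the limit is hard-coded by kernel version, that images can be limited to 8 queues, and that the option should default to None so nova decides.

```python
from typing import Optional


def max_tap_queues(kernel_release: str,
                   override: Optional[int] = None) -> Optional[int]:
    """Return the tap queue cap to apply, or None for "no cap".

    `override` stands in for the proposed libvirt group option
    (hypothetical name); None means "let the kernel-based default decide".
    """
    if override is not None:
        if override < 1:
            raise ValueError("override must be a positive integer")
        # Operator knows better, e.g. a distro kernel with backported changes.
        return override

    # Illustrative defaults keyed on the kernel major version (assumed values).
    major = int(kernel_release.split(".")[0])
    if major <= 2:
        return 1
    if major == 3:
        return 8
    if major == 4:
        return 256
    return None
```

With these assumed defaults, `max_tap_queues("3.10.0-1062.el7.x86_64")` falls back to the kernel-based cap, while `max_tap_queues("3.10.0-1062.el7.x86_64", override=16)` lets a deployer on a patched 3.x kernel raise it without any distro-specific check in the code.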