14:00:12 #startmeeting nova
14:00:13 Meeting started Thu Aug 23 14:00:12 2018 UTC and is due to finish in 60 minutes. The chair is melwitt. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:14 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 The meeting name has been set to 'nova'
14:00:19 o/
14:00:23 greetings everyone
14:00:28 o/
14:00:48 \o
14:00:48 ō/
14:00:51 * dansmith gurgles
14:00:51 o/
14:00:52 hi
14:01:02 o/
14:01:04 whoa johnthetubaguy sighting!
14:01:04 o/
14:01:05 let's make a start
14:01:13 wow, hi johnthetubaguy o/
14:01:17 ő/
14:01:21 * johnthetubaguy nods in shame
14:01:27 #topic Release News
14:01:36 #link Rocky release schedule: https://wiki.openstack.org/wiki/Nova/Rocky_Release_Schedule
14:01:43 today is the deadline for RCs
14:01:49 and we're having an RC3 today
14:01:55 #link https://etherpad.openstack.org/p/nova-rocky-release-candidate-todo
14:02:08 tracking RC3 changes in there ^
14:02:34 one of them, I think we have to punt because the correct fix is not yet identified. are we doing a release note or something for that one mriedem?
14:03:02 we should,
14:03:03 but,
14:03:11 i can't get a straight answer out of anyone that knows anything about the problem
14:03:18 ah, okay
14:03:20 sahid is our best bet probably but i'm not sure if he's gone now
14:03:24 so you guys can chase him downstream
14:03:32 why is revert not an option?
14:03:40 the rx/tx queue config stuff?
14:03:49 yeah
14:03:57 i mean, it's always an option
14:04:09 it's not overly hard here yeah?
14:04:11 presumably it works for at least one type of sriov port,
14:04:21 just saying, if there's no obvious workaround and we don't know what to do to fix it..
14:04:22 but i don't know if anyone claims to have tested it
14:04:23 okay, I just pinged sahid in the nova channel
14:04:32 about release note advice. not sure if he's around though
14:04:49 moshe said vnic type direct doesn't work,
14:04:58 which leads me to believe sahid tested with macvtap
14:05:02 but again, idk
14:05:10 mriedem, melwitt, dansmith: Don't know how much I can do in a few hours, but I can take a look
14:05:15 If sahid isn't about
14:05:26 mriedem: moshe's email makes it sound like turning this on in tripleo was the trigger?
14:05:27 stephenfin: that would be great - it's also in the ML
14:05:31 ack
14:05:35 dansmith: depends on the vnic type
14:05:36 so does just leaving it disabled (despite their default) work okay?
14:05:40 yep thanks stephenfin
14:06:01 if vnic_type == 'direct' we don't set some stuff in the domain xml and it explodes
14:06:06 or we set the wrong thing
14:06:14 b/c of some other TODOs related to that rx/tx queue code
14:06:16 regardless of the config?
14:06:20 there are TODOs on top of TODOs in there
14:06:28 tripleo defaults to 512 in the rx queue
14:06:37 so the workaround in tripleo is, don't default
14:06:44 right, that's what I'm asking..
14:06:45 I understood it as we _do_ set it when we shouldn't
14:06:54 if there is a "don't configure that" workaround, then revert is too nuclear
14:06:54 set the XML attribute, that is
14:07:00 if there's not a workaround, then we should probably revert
14:07:10 sahid just said in the nova channel, he'll reply with all the info he has
14:07:10 stephenfin: there is some code which assumes a default vhost interface driver,
14:07:17 which doesn't work for rx queue with direct vnics, apparently
14:07:19 per moshe's paste
14:07:39 I'll propose a fast-fail patch whenever vnic_type=='direct'
14:07:46 ha
14:07:47 (too soon?)
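A minimal sketch of the kind of "fast-fail" guard stephenfin mentions, with made-up function and parameter names; only the idea - rejecting rx/tx queue size settings for SR-IOV 'direct' ports, which bypass the vhost backend - comes from the discussion above, not from the actual nova patch.

```python
def validate_queue_sizes(vnic_type, rx_queue_size, tx_queue_size):
    """Fail early if queue sizes are configured for an unsupported vNIC type.

    The rx/tx queue size options map to libvirt's <driver> queue size
    attributes, which only apply to vhost-backed (virtio) interfaces.
    SR-IOV 'direct' ports are plugged as passthrough VF devices, so the
    setting cannot be honored for them.
    """
    if vnic_type == 'direct' and (rx_queue_size or tx_queue_size):
        raise ValueError(
            "rx/tx queue sizes are not supported for vnic_type='direct' "
            "ports; unset the option for these ports")


# e.g. a deployment that defaults rx_queue_size=512 (as tripleo did) would
# fail fast for a direct port instead of producing broken domain XML:
# validate_queue_sizes('direct', 512, None)  -> ValueError
```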
14:07:51 right on time
14:08:19 so,
14:08:34 if we aren't sure by eod, we can at least doc a known issue reno with the bug
14:08:40 saying "some types of vnics don't work with rx queues"
14:08:46 "test it first, obviously, dummy"
14:09:05 ok. who will propose the reno?
14:09:08 i can crank out that docs change as a placeholder for now
14:09:10 i can
14:09:19 okay cool, thanks
14:09:29 * mriedem mriedem to write a docs/reno patch for https://review.openstack.org/#/c/595592/
14:09:34 oops
14:09:41 #action mriedem to write a docs/reno patch for https://review.openstack.org/#/c/595592/
14:09:49 i guess you have to as chair
14:09:59 yeah, I wasn't sure how that worked
14:10:01 #action mriedem to write a docs/reno patch for https://review.openstack.org/#/c/595592/
14:10:06 #action mriedem to write a docs/reno patch for https://review.openstack.org/#/c/595592/
14:10:17 ok, anything else for release news before moving on?
14:10:18 is there somewhere that those show up before the end of the meeting?
14:10:27 the bot doesn't ack them
14:10:43 I dunno, I've wondered that too
14:10:48 they do
14:10:51 in the meeting summary
14:10:56 right
14:10:58 before the meeting is ended?
14:10:59 html
14:11:02 no
14:11:13 yeah that's what efried was asking
14:11:14 we'll see if the minutes include one, two, or three instances of that message :)
14:11:14 http://eavesdrop.openstack.org/meetings/nova/2018/nova.2018-08-16-21.00.html
14:11:52 yeah, we'll see what happens after the meeting
14:11:59 #topic Bugs (stuck/critical)
14:12:09 no critical bugs in the link
14:12:14 #link 45 new untriaged bugs (up 5 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
14:12:20 #link 7 untagged untriaged bugs (up 5 since the last meeting): https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
14:12:41 #link bug triage how-to: https://wiki.openstack.org/wiki/Nova/BugTriage#Tags
14:12:45 #help need help with bug triage
14:12:51 Gate status
14:12:56 #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
14:13:22 anecdotally, I've been seeing a lot of gate timeouts and seemingly random failures
14:13:31 3rd party CI
14:13:36 #link 3rd party CI status http://ci-watch.tintri.com/project?project=nova&time=7+days
14:13:52 anyone have anything else for bugs or gate status or third party CI?
14:14:07 #topic Reminders
14:14:12 #link Stein Subteam Patches n Bugs: https://etherpad.openstack.org/p/stein-nova-subteam-tracking
14:14:20 #link Stein PTG planning: https://etherpad.openstack.org/p/nova-ptg-stein
14:14:25 #link Rocky retrospective for the PTG: https://etherpad.openstack.org/p/nova-rocky-retrospective
14:14:42 re gate status https://bugs.launchpad.net/nova/+bug/1788403 has been gumming up the works
14:14:42 Launchpad bug 1788403 in OpenStack Compute (nova) "test_server_connectivity_cold_migration_revert randomly fails ssh check" [Medium,Confirmed]
14:14:46 i'll e-r that after the meeting
14:14:59 72 hits in 7 days
14:15:24 oh yeah, I've seen that one
14:15:46 thanks
14:15:48 #topic Stable branch status
14:15:55 #link stable/queens: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/queens,n,z
14:16:00 #link stable/pike: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/pike,n,z
14:16:04 #link stable/ocata: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/ocata,n,z
14:16:17 if we're good with https://review.openstack.org/#/c/594178/ for rc3,
14:16:20 we need stable cores to +2
14:16:26 which would be dansmith and someone else
14:16:30 johnthetubaguy: ?
14:16:37 yeah, just looking at it now
14:17:42 I've been saying I'm going to propose stable branch releases for queens/pike/ocata but we were waiting on a specific series of backports to merge. I need to go through and see if there's anything else to hold for
14:17:59 anything else for stable branch status?
14:18:03 i forgot about https://review.openstack.org/#/c/590801/
14:18:11 and can't remember if we were doing that for an rc?
14:18:36 "LGTM. This was a regression in Rocky which we backported to queens and pike so we're going to get it in either way, plus this is extremely low risk."
14:18:41 I don't think we identified it as such but it seems appropriate. it's a regression
14:19:05 oh nope, it's tagged
14:19:22 oh i bet i know why it didn't show up in lp
14:19:27 because it's marked as fixed on master
14:19:28 didn't show up in the link because it's no longer open?
14:19:30 yeah
14:19:43 approved
14:20:03 thanks for catching that
14:20:26 okay, moving on
14:20:33 #topic Subteam Highlights
14:20:46 cells v2, we had a meeting. dansmith want to summarize?
14:21:02 we talked through some of the big changes we have on the plate
14:21:12 current stuff like batching and handling down cells,
14:21:20 as well as upcoming fun stuff like cross-cell migration
14:21:39 and we highlighted a few of the patch sets that need more review and have been getting neglected
14:21:46 and finally, we established that I'm a terrible person
14:21:48 I think that's it
14:21:56 great
14:22:04 scheduler, efried?
14:22:07 Okay, here we go
14:22:07 /me glares squinty-eyed at Sigyn
14:22:07 tumbleweed rolls across dusty street
14:22:07 fingers twitch above holsters
14:22:14 #link nova-scheduler meeting minutes http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-08-20-14.00.html
14:22:15 ha
14:22:18 "great"
14:22:27 Still no recent pupdates. cdent says he'll resume these around PTG time. Yay!
14:22:45 Discussed status of:
14:22:45 #link reshaper series: https://review.openstack.org/#/q/topic:bp/reshape-provider-tree+status:open
14:22:45 which has since seen some additional reviews/revisions.
14:22:58 Bottom has procedural hold, but two +2s.
14:23:13 Remainder of series has reviews, needs more, should be landable very soon.
14:23:27 #link Gigantor SQL split and debug logging: https://review.openstack.org/#/c/590041/
14:23:27 Has one +2, gibi is on the hook to +A
14:23:42 #link consumer generation handling (gibi): https://review.openstack.org/#/q/topic:consumer_gen+(status:open+OR+status:merged)
14:23:42 #link ML thread on consumer gen conflict handling: http://lists.openstack.org/pipermail/openstack-dev/2018-August/133373.html
14:23:52 #link nested and shared providers for initial & migration (and other?) allocations: https://review.openstack.org/#/q/topic:use-nested-allocation-candidates+(status:open+OR+status:merged)
14:23:58 note on the reshaper series...
14:24:15 libvirt and xenapi drivers, the two that need reshaper, won't have any code started until next week at the earliest
14:24:51 I feel pretty strongly that we should not hold the series for those.
14:25:38 we don't have to debate it here, i was just noting it for others not in the emails
14:25:49 i'm on the not block forever side of the fence too
14:25:54 Even if it is imperfect - and that's a big "if" - we have a tendency to find and fix bugs as and when they are discovered by consuming code/operators/whatever.
14:25:56 i just need to go through the nova client side changes
14:25:58 yeah, I think it's fine to move forward with the series, fwiw. we can fix stuff after
14:26:14 cool beans
14:26:47 okay, next is notifications, gibi?
14:26:49 Since we're talking about reshaper
14:26:53 oh, I wasn't done :)
14:26:55 oops
14:26:56 sorry
14:27:00 * gibi holds
14:27:31 Matt's review of the API patch revealed a hole in policy handling, which led cdent to come up with this: https://review.openstack.org/#/c/595559/
14:27:31 ...and I was wondering if that would be appropriate to port to nova too.
14:27:51 * stephenfin worries about efried consuming operators
14:27:59 that would be harder in nova,
14:28:07 okay. just a thought.
14:28:09 placement defaults to admin-only for all routes except /
14:28:11 yeah I left a comment to that effect
14:28:15 ight
14:28:15 nova is all over the obard
14:28:17 *board
14:28:27 moving on with sched subteam report...
14:28:30 #link Spec: Placement modeling of PCI devices ("generic device management") https://review.openstack.org/#/c/591037/
14:28:30 harder because of finding urls, harder because of finding policies
14:28:44 And now for the big one
14:28:49 Placement extraction
14:28:49 #link epic dev ML thread http://lists.openstack.org/pipermail/openstack-dev/2018-August/133445.html
14:28:49 #link epic discussion in #openstack-tc http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2018-08-20.log.html#t2018-08-20T15:27:57
14:28:59 TL;DR: 1) Consensus that we should put placement in its own repo - more on that in a minute.
14:29:00 2) Consensus that initial placement-core team should be a superset of nova-core
14:29:00 3) Still no consensus on whether placement should be under compute or independent governance
14:29:19 So back to 1), cdent and edleafe have started work on this in a
14:29:19 #link temporary placement github repo https://github.com/EdLeafe/placement
14:29:19 which is being experimented on and respun as needed to get it presentable before seeding the official openstack repo.
14:29:26 cdent, edleafe: comments?
14:29:40 it "works" on https://github.com/EdLeafe/placement/pull/2
14:29:49 \o/
14:29:50 but is messy, but cleaning it up should be fine
14:29:52 I will be re-extracting with the items cdent identified later today
14:30:32 Okay, that's it for sched subteam. Questions, comments, concerns, heckles?
14:30:49 at this point,
14:30:57 with what 2 weeks to ptg?
14:31:05 i assume we wait for decisions on #3?
14:31:25 i mean i know where i am, and it's in the ML
14:31:34 and gibi too
14:31:57 What do you mean, wait for decisions? You mean try to make that decision _at_ the PTG?
14:32:15 that's what i'm asking
14:32:22 i don't know if the ML thread is dying out or what
14:32:28 i haven't read the latest
14:33:22 I actually haven't caught up since last night, so my statement about #3 may have been premature.
14:33:33 to summarize my position on #3, i mostly don't care, but think the extraction is going to be a pain in the ass and would rather not deal with the governance question at the same time
14:33:46 once we've extracted,
14:33:48 setup a new core team,
14:33:51 and run that for 6 months,
14:33:59 then flipping the governance in time for T PTL elections is easy peasy
14:34:50 if people are vehemently opposed to ^ as a compromise, they should speak up
14:34:57 yeah, my position is that I would prefer we start with things extracted to a new repo, with a new placement-core team, under compute
14:34:59 dansmith: jaypipes: melwitt: edleafe: cdent ^
14:35:54 mriedem: I am muting myself.
14:35:54 I'm not opposed to that, but as I understand some of the conversations haven't fully resolved, for people not currently here.
14:36:25 that's why i asked about the ptg
14:36:31 where people that should care will be in person
14:36:34 and it can be hashed out
14:36:35 My concern - that separation of governance gets put off indefinitely - is assuaged by reading that as "we intend and plan to split governance in T unless something serious happens to make us think that's the really wrong thing to do".
14:36:45 I'm not opposed, even if it seems unnecessary
14:37:27 i have always assumed long-term separate governance personally
14:37:56 if it's T or later, as i said,
14:38:00 fwiw, I don't want to put it off indefinitely. I want to start with things under compute, make some progress on the major tightly coupled work we need (vGPUs, NUMA, affinity, shared storage) and when that tails off, flip the governance
14:38:08 depends on getting through the extraction and new core team first for a cycle at least
14:38:42 i also don't want to hold hostages
14:38:47 I don't think we need to rush on making a decision. There are people involved in the discussion who don't process email as fast as most of the people talking right now. From what various people have told me in the past week, they want to think about things for a few more days.
14:38:54 if nova isn't going to make progress client-side on ^ then we can't hold forever
14:38:55 IMO
14:39:22 Yes, I have a problem with the premises of "tightly coupled" and "tails off", which are recipes for "hostage" and putting off indefinitely.
14:39:23 So I'd prefer to only proceed on the extraction and make, as yet, no assertions on the governance side of things
14:39:40 cdent: i think we're saying the same thing
14:39:43 so agree from me
14:39:46 I agree, if things are dragging and the client-side isn't working to make progress, then that is another issue and we won't hold forever over that
14:41:46 I have similar concerns to efried about the perception of coupling and tailing, but I don't think we need to address that right here right now
14:41:46 but we are not there yet, people have been working on the client-side. but we need to prioritize that higher
14:41:46 So I'm fine with setting an actual plan to run Stein under compute and separate governance in Train. Accepting that plans can change, but that's the way we execute if things go fairly close to expected.
14:41:46 rather than explicitly leaving it open-ended.
14:42:22 efried: yes, that's a good idea (I think), but I think we should wait before declaring that. Both ttx and dhellmann seem to have more they want to do
14:42:36 ack
14:42:52 * efried feels progress was made toward a consensus \o/
14:42:57 yes
14:44:15 okay, anything else for scheduler subteam before we move on to notifications?
14:44:28 Not from me. Thanks.
14:44:45 okay, gibi?
14:44:50 I was on PTO so no meeting and no status mail. I will resume those next week
14:45:07 nothing serious is ongoing in notification side at the moment
14:46:04 that is all
14:46:12 cool, thanks
14:46:17 gmann, API subteam?
14:46:36 he's on PTO
14:46:50 working through the server view builder extension merge series
14:47:01 ah, okay. thanks
14:47:18 anything else for subteams before we move on?
14:47:28 #topic Stuck Reviews
14:47:42 no items in the agenda. anyone in the room have anything for stuck reviews?
14:47:47 yes
14:48:01 Bug https://review.openstack.org/#/c/579897
14:48:10 oh sorry, I missed it in the agenda
14:48:10 Hide hypervisor id on windows guests
14:48:15 :)
14:48:19 #link Bug: Hide hypervisor id on windows guests https://review.openstack.org/#/c/579897
14:49:00 there's review from jaypipes there. is this stuck or are you just waiting for a response from him?
14:49:05 The bug is that Windows guests with PCI passthrough of Nvidia GPUs can't use the GPU due to a restriction of the nvidia driver
14:49:18 oh, I see now
14:49:19 I guess both.
14:49:20 we already have a similar thing elsewhere
14:49:29 and i thought there was another related wishlist type bug for this
14:49:45 from kosamara also,
14:49:46 but i can't find it
14:49:49 Which we patched some months ago
14:49:56 i might be thinking of the bp in rocky
14:50:05 https://blueprints.launchpad.net/nova/+spec/hide-hypervisor-id-flavor-extra-spec
14:50:07 yes ^
14:50:09 that was for all guests, yes that
14:50:27 so this is a bug in the same vein
14:50:30 does any other hypervisor allow for this?
14:50:39 this is in particular for windows guests. The problem is the extra HyperV tags, which reveal that there is a hypervisor
14:50:44 since this is about hiding hyper-v's signature in kvm, I would expect that they don't
14:51:26 dansmith: We have a flag to hide KVM's signature for the same reason
14:51:42 yeah I know, and I understand the use case
14:53:18 jaypipes: I want to add on this bug that it is triggered by a need of CERN users
14:53:41 pulling the CERN card eh
14:53:47 well, in that case!
14:54:58 so given we have prior art here,
14:54:59 kosamara: I've said my piece on the review in question. I won't hold it up. I'm just not going to +2 it.
14:55:01 I dunno, I think I'm against this
14:55:17 why would the previous hide be ok but this isn't?
14:55:19 this is a libvirt-specific hack to disable flags to sidestep licensing at the expense of performance
14:55:25 It tries to solve a real end user problem, namely running engineering and rendering software, and it doesn't come from a vendor.
14:56:45 As mriedem says, the use case is essentially the same as the previous hiding.
14:57:04 kosamara: like I said, I feel for you, and I don't hold this against you personally (or CERN). I just don't support hacks like this to get around what is essentially a vendor-specific licensing dilemma for NVIDIA users.
14:57:56 is there any correct way to fix this elsewhere?
14:58:12 this patch for sure is way too targeted
14:58:15 it's just hacked into place
14:58:27 gah, we only have a few minutes left here
14:58:42 melwitt: I don't see any other way
14:58:52 I'm not sure what knob we have for disabling the base flags right now, but I'm not sure why we need another one for windows specifically
14:59:18 okay, we'll have to move to the nova channel, have to wrap up here
14:59:19 I have another meeting
14:59:29 I'll continue in nova
14:59:32 #topic Open discussion
14:59:39 someone left a note
14:59:44 #link Edge Computing Group PTG schedule with a Nova session planned: https://etherpad.openstack.org/p/EdgeComputingGroupPTG4
15:00:00 edge computing PTG etherpad for those interested ^
15:00:06 jaypipes said he was on the edge
15:00:13 heh
15:00:14 so he can be our rep
15:00:19 okay, we have to wrap, thanks everyone
15:00:21 #endmeeting
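For context on the "prior art" raised in the Stuck Reviews discussion: the Rocky hide-hypervisor-id-flavor-extra-spec blueprint hides the KVM signature from the guest through libvirt's kvm "hidden" feature, while the Windows/NVIDIA problem comes from the HyperV enlightenment tags nova adds to the domain XML for Windows guests. The sketch below builds an illustrative <features> fragment with the standard library only; element names follow documented libvirt syntax, but the exact set of elements nova emits varies by release and is not copied from nova's driver code.

```python
import xml.etree.ElementTree as ET

features = ET.Element('features')

# Existing prior art: hide the KVM hypervisor signature from the guest
# (the behavior toggled by the hide_hypervisor_id flavor extra spec).
kvm = ET.SubElement(features, 'kvm')
ET.SubElement(kvm, 'hidden', {'state': 'on'})

# HyperV enlightenments typically added for Windows guests; these are the
# "extra HyperV tags" that reveal a hypervisor to the NVIDIA guest driver.
hyperv = ET.SubElement(features, 'hyperv')
ET.SubElement(hyperv, 'relaxed', {'state': 'on'})
ET.SubElement(hyperv, 'vapic', {'state': 'on'})
ET.SubElement(hyperv, 'spinlocks', {'state': 'on', 'retries': '8191'})

print(ET.tostring(features, encoding='unicode'))
```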