14:00:14 <melwitt> #startmeeting nova
14:00:14 <openstack> Meeting started Thu Sep 6 14:00:14 2018 UTC and is due to finish in 60 minutes. The chair is melwitt. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:18 <openstack> The meeting name has been set to 'nova'
14:00:18 <cdent> o/
14:00:27 <melwitt> hello everybody
14:00:28 <dansmith> o.
14:00:32 <mriedem> o/
14:00:40 <takashin> o/
14:00:46 <tetsuro> o/
14:00:49 <edleafe> \o
14:00:56 <melwitt> let's get started
14:01:06 <melwitt> #topic Release News
14:01:15 <melwitt> #link Stein release schedule: https://wiki.openstack.org/wiki/Nova/Stein_Release_Schedule
14:01:40 <efried> ō/
14:01:41 <melwitt> final rocky release was last thursday. we're still working on bugs and backporting them to stable/rocky
14:02:00 <melwitt> so now, we kick off the stein cycle with the PTG next week
14:02:39 <melwitt> that's all I have for release news. anyone have anything else?
14:03:00 <melwitt> #topic Bugs (stuck/critical)
14:03:17 <melwitt> we have one bug in the critical link
14:03:26 <melwitt> https://bugs.launchpad.net/nova/+bug/1790701
14:03:26 <openstack> Launchpad bug 1790701 in OpenStack Compute (nova) "online_data_migrations fail in rocky+" [Critical,In progress] - Assigned to Matt Riedemann (mriedem)
14:03:49 * bauzas waves late
14:03:55 <mriedem> need https://review.openstack.org/#/c/599744/ approved
14:04:04 <dansmith> I have that open now
14:04:07 <mriedem> with that and another fix already merged,
14:04:11 <mriedem> we have nova-status passing in devstack https://review.openstack.org/#/c/599847/
14:04:14 <mriedem> for fresh install
14:04:18 <bauzas> I'm on the patch
14:04:20 <mriedem> something i should have added long ago
14:04:32 <mriedem> i'll start backports after the meeting
14:04:45 <melwitt> ok, coolness
14:05:06 <melwitt> thsnk
14:05:09 <melwitt> *thanks
14:05:14 <melwitt> #link 51 new untriaged bugs (up 1 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
14:05:22 <melwitt> #link 11 untagged untriaged bugs (up 1 since the last meeting): https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
14:05:39 <melwitt> not too big of an increase from last week in bugs, thanks to all who have been helping with triage
14:05:47 <melwitt> #link bug triage how-to: https://wiki.openstack.org/wiki/Nova/BugTriage#Tags
14:05:52 <melwitt> #help need help with bug triage
14:06:09 <melwitt> Gate status
14:06:10 <melwitt> #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
14:06:11 <melwitt> gate has seemed OK
14:06:22 <melwitt> 3rd party CI
14:06:27 <melwitt> #link 3rd party CI status http://ci-watch.tintri.com/project?project=nova&time=7+days
14:06:46 <melwitt> anything else for bugs or gate status or 3rd party CI?
14:06:52 <mriedem> 3rd party ci needs https://review.openstack.org/#/c/599672/
14:07:05 <mriedem> that was the 0.0 allocation ratio thing killing the non-libvirt ci jobs
14:07:13 <melwitt> right, ok. will review
14:07:22 <melwitt> thanks
14:07:43 <melwitt> #topic Reminders
14:07:51 <melwitt> #link Stein Subteam Patches n Bugs: https://etherpad.openstack.org/p/stein-nova-subteam-tracking
14:07:56 <melwitt> #link Stein PTG planning: https://etherpad.openstack.org/p/nova-ptg-stein
14:08:12 <melwitt> I've updated the etherpad with a schedule ^
14:08:39 <melwitt> the cyborg team is going to talk about placement integration stuff on monday from 2pm - 3pm at the cyborg room
14:08:53 <melwitt> they'd like for interested folks from the nova team to join
14:09:08 <cdent> there's a blazar one on tuesday at 10am (I think)
14:09:12 <efried> yes
14:09:25 <melwitt> ok, will add a note about that on the schedule
14:09:28 <mriedem> and mfing edge at 4pm on tuesday
14:09:53 <melwitt> edge is having an all day thing on tuesday, I
14:09:58 <mriedem> right,
14:10:00 <melwitt> will add a note about 4pm being nova time
14:10:03 <mriedem> but their nova-specific stuff starts around 4
14:10:06 <mriedem> already done
14:10:09 <melwitt> thanks
14:10:49 <melwitt> we have the rocky retro first thing on wednesday
14:10:54 <melwitt> #link Rocky retrospective for the PTG: https://etherpad.openstack.org/p/nova-rocky-retrospective
14:11:09 <melwitt> there's almost nothing on the etherpad, so I expect it to be short
14:11:28 <melwitt> but we'll at least talk about runways and any changes we'd like to make to the spec freeze date this time
14:11:39 <melwitt> and kick off runways for stein accordingly
14:11:46 <efried> oo, I just thought of this, when we do the retrospective *next* time, we get to call it the stein whine
14:11:53 * efried crawls back into hole
14:12:00 <melwitt> that's something to look forward to
14:12:17 <melwitt> ok, that's all I have for reminders. anyone else have anything to add for reminders?
14:12:36 <melwitt> #topic Stable branch status
14:12:58 <melwitt> #link stable/rocky: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/rocky,n,z
14:13:10 * melwitt needs to review
14:13:18 <melwitt> #link stable/queens: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/queens,n,z
14:13:23 <melwitt> #link stable/pike: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/pike,n,z
14:13:28 <melwitt> #link stable/ocata: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/ocata,n,z
14:13:53 <melwitt> #help please help with stable reviews
14:14:04 <melwitt> there are a lot of reviews
14:14:22 <melwitt> anything else for stable branch status?
14:14:36 <melwitt> #topic Subteam Highlights
14:14:51 <melwitt> we didn't have a cells v2 meeting yesterday. anything you'd like to mention here dansmith?
14:14:59 <dansmith> not really,
14:15:15 <dansmith> several of us have been out here and there, like surya this week
14:15:38 <dansmith> I think we got the flag for reverting to the old skip behavior all nailed down (not sure if it merged yet or not)
14:15:52 <dansmith> and we've been iterating on the proper down-cell stuff, which has been a little slow with the people outages
14:15:53 <mriedem> i haven't looked at that yet
14:15:56 <dansmith> but otherwise going pretty well
14:16:01 <dansmith> mriedem: yeah, would be good to get your ack on that
14:16:10 <mriedem> given i asked for it...
14:16:11 <mriedem> yeah
14:16:22 <melwitt> cool, thank you
14:16:42 <melwitt> scheduler, efried?
14:16:46 <efried> No sched meeting this week due to labor day (though in retrospect it would have been polite of me to send an email to that effect).
14:16:46 <efried> But I would like to have a brief update on placement extraction.
14:16:52 <efried> As of yesterday we've merged the forty-some patches to get the extracted repository to the point of gating/voting unit/func/pep, which is a great milestone.
14:17:25 <cdent> it was in honors of efried's 42 birthday
14:17:36 <efried> And with a couple of pending patches as deps, I think cdent has gotten devstack working, as proven by placecat etc. cdent, care to unmuddle that?
14:18:03 <cdent> I got tempest working against https://review.openstack.org/#/c/600162/
14:18:07 <cdent> but not grenade of course
14:18:29 <cdent> and placecat is my docker driven test suite for placement, the container now uses openstack/placement instead of openstack/nova as its source
14:19:59 <melwitt> cool, glad things are going well
14:20:35 <melwitt> I think gibi isn't around, no notes left for notifications team
14:20:58 <melwitt> and I think gmann isn't around, no notes left for API team
14:21:09 <melwitt> anything else for subteams before we move on?
14:21:44 <melwitt> #topic Stuck Reviews
14:21:56 <melwitt> no items in the agenda. does anyone in the room have anything for stuck reviews?
14:22:36 <melwitt> #topic Open discussion
14:22:57 <mriedem> if it's not on the agenda,
14:23:03 <mriedem> cern is going to have a specless bp request for https://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes
14:23:12 <mriedem> to support extending in-use rbd volumes
14:23:23 <mriedem> the os-brick code isn't merged yet
14:23:26 <melwitt> ok, yeah not in the agenda
14:23:40 <mriedem> and i've said on the nova change that i want to see the ceph job passing with the volume extend tempest test on that nova change first
14:23:40 <melwitt> ok
14:24:07 <melwitt> sounds like a good plan to me
14:24:25 <efried> Something I'd like to put in folks' noggins:
14:24:25 <efried> Do we ultimately see *all* device passthrough eventually going through cyborg, or just accelerators?
14:24:51 <efried> Looking at the long-term plan for torching the existing pci passthrough code
14:25:35 <bauzas> efried: no
14:25:40 <bauzas> efried: please
14:25:52 <melwitt> mriedem: looks like a parity thing for that blueprint, so I'm +1 on approving
14:26:07 <bauzas> efried: cyborg is a management API for accelerators, but please don't purge the capabilities that nova has to manage a set of devices out of it
14:26:16 <melwitt> anyone else have opinions about the approval of specless blueprint https://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes ?
14:26:46 <mriedem> melwitt: it might be premature until the actual brick change is approved and ceph testing is green
14:26:50 <mriedem> i was just bringing it up as an fyi
14:26:54 <bauzas> I was about to say the same
14:26:56 <efried> sorry for the cross-talk, lemme know when you're done
14:27:01 <bauzas> it requires a new osbrick version
14:27:10 <bauzas> os-brick even
14:27:30 <melwitt> mriedem: ok. so we'll wait to approve until after that. sorry, I thought you were asking to approve now
14:27:35 <bauzas> but if that's straightforward in nova, I'm not opposed to the specless-y
14:28:21 <melwitt> k. cool. I think we're done with that then
14:28:59 <melwitt> efried: go ahead, sorry about that
14:29:09 <efried> thanks.
14:29:37 <dansmith> my opinion is that cyborg isn't far enough along to have enough confidence in it to replace things like basic device attach,
14:29:38 <efried> So I know cyborg is going to get involved in doing the discovery and reporting (to placement) of accelerator inventory.
14:29:48 <dansmith> especially with SRIOV type things that need some network attention
14:30:19 <dansmith> I would kindof expect that the PCI attach functionality in nova is how we end up attaching accelerators under the covers anyway, perhaps without the same level of whitelisting nonsense
14:30:21 <bauzas> yeah I think it's premature
14:30:28 <efried> well, I agree with that for sure. We're not going to be able to replace the whole pci subsystem all at once.
14:30:33 <dansmith> but until cyborg becomes a much more mature thing, I'm not really in favor of replacing anything with it,
14:30:39 <efried> but we can take one of two paths wrt cyborg
14:30:40 <dansmith> and only trying to enable what new things it might bring
14:30:59 <bauzas> I'm still a bit concerned
14:31:11 <efried> we can either make the effort to embrace it and thus help it mature, pulling in pieces as they become available/usable
14:31:17 <bauzas> if we say this way, then we should have said to leave vGPUs out of the nova radar
14:31:32 <efried> or we can go our own way and then do a second, bigger, more painful integration later when we consider cyborg "mature".
14:31:34 <bauzas> the most crucial thing is not what we have, but how we support it
14:31:53 <dansmith> efried: you mean "if/when"
14:31:56 <mriedem> we've said no to fpga directly in nova for years,
14:32:01 <dansmith> because the if part is the important bit to me
14:32:02 <mriedem> cyborg is the path to fpga in nova
14:32:08 <mriedem> so let's see that happen first
14:32:12 <dansmith> mriedem: exactly
14:32:21 <mriedem> before spending a bunch of time retrofitting what we already have
14:32:28 <bauzas> oh yeah
14:32:36 <efried> chicken/egg, self-fulfilling prophecy, and all that.
14:32:50 <efried> I.e. if we take path A, cyborg is more likely to be a long-term success.
14:32:51 <cdent> If I'm understanding efried correctly, the concern here is about architecture over the long term
14:33:05 <bauzas> and we could potentially improve the PCI functionality without really pulling it out of nova
14:33:08 <cdent> if there's a chance that cyborg will become more generic it needs to start out that way sooner
14:33:14 <efried> yes, that ^
14:33:28 <bauzas> I'm not opposed to have the same feature be done in two different ways
14:33:54 <efried> Look, it actually makes my life easier if we say we're going to ignore cyborg for a couple of cycles and start rolling our own placement-based device passthrough, per kosamara's spec as written.
14:34:05 <bauzas> after all, it's now 4 cycles that we are wondering how cyborg will interact with nova
14:34:52 <bauzas> yeah, and I think it's not a big deal for placement, right?
14:34:53 <efried> well, not really, only since Dublin has it been more than a haze
14:35:14 <bauzas> I heard of cyborg since barcelona
14:35:39 <bauzas> it's just that we had a chat with them since Dublin, yeah
14:35:48 <efried> but I'm trying to consider what's best long term, and whether we have a duty^Wresponsibility^Wopportunity to help raise the project and help it mature.
14:36:14 <bauzas> but we can also try to avoid overguessing what the future could be, and leave people engage with us
14:36:30 <bauzas> for example, blazar is way older than placement
14:36:38 <bauzas> but at the end, they will use it
14:36:59 <bauzas> I don't see a problem having both efforts
14:37:19 <melwitt> I guess I'm not sure how cyborg being generic enough is related to which thing we integrate first
14:37:29 <efried> so I believe it was dansmith who asked the question on kosamara's spec, lemme find that...
14:37:47 <melwitt> the fpga thing will be the first step and if that works well, we could consider moving other passthrough to it right?
14:37:49 <cdent> a question standing here is "do we have a chance to collaborate rather than duplicate effort"
14:38:07 <dansmith> melwitt: cyborg is not generic enough today, as defined/planned I think
14:38:14 <efried> dansmith: https://review.openstack.org/#/c/591037/ PS5: "I would have expected a lot of the stuff described here to be in scope for cyborg. Not that we should exclude all that from nova necessarily, but I think that it's probably worth calling out how this intersects (or not) cyborg's intended scope."
14:38:15 <dansmith> melwitt: they're asking if we should encourage them to *be* generic enough
14:38:20 <melwitt> dansmith: oh, ok
14:38:46 <melwitt> was just looking at their wiki again, "various types of accelerators such as GPU, FPGA, ASIC, NP, SoCs, NVMe/NOF SSDs, ODP, DPDK/SPDK and so on" so I thought that sounded generic
14:38:47 <efried> well, when I asked Sundar this question, his reaction was yes.
14:39:09 <mriedem> this is premature - given how slow things move, they should opt to be generic if possible,
14:39:15 <mriedem> but not at the expense of actually getting shit done
14:39:22 <efried> right
14:39:30 <mriedem> i don't think we have any duty to raise that project
14:39:34 <mriedem> we can collaborate, sure
14:39:40 <mriedem> but it's not my top priority by any means
14:39:44 <efried> My point being that that affects how we proceed in nova with device passthrough and making existing pci code diaf
14:39:44 <dansmith> neither mine,
14:39:47 <mriedem> and expect it's not the priority for others
14:39:54 <efried> it is mine, actually.
14:39:57 <dansmith> but I think that efried is asking because he wants to know whether to push on the nova-centric generic device approach,
14:40:00 <dansmith> or go push in cyborg
14:40:02 <efried> correct
14:40:07 <efried> thanks dansmith, nail on head
14:40:37 <mriedem> i'm likely not going to be involved in that either way, at least not in stein, so doesn't matter to me personally
14:40:54 <mriedem> obviously decomp is best if possible,
14:40:59 <mriedem> but that might take a couple of years
14:41:09 <dansmith> decomp like "let that corpse rot" ?
14:41:16 <efried> vay
14:42:39 <efried> Okay, so dansmith if the response in that review is, "this may or may not be in scope for cyborg long-term, but we're going to do it this way until that project matures more"...
14:42:40 <mriedem> punt to ptg?
14:42:41 <efried> that wfy?
14:43:06 <dansmith> mriedem: yeah, I've typed out several responses and deleted them all because I can't articulate my feelings on the matter
14:43:11 <dansmith> so maybe ptg
14:43:12 <efried> Yeah, definitely going to discuss some at ptg, but wanted to get a couple of gears turning in y'all's heads.
14:43:26 <dansmith> I guess the bottom line is:
14:43:42 <dansmith> I don't have a lot of faith in cyborg becoming a useful generic device service as it is today
14:43:54 <dansmith> so if I cared about generic devices a lot, I probably wouldn't put my eggs in that basket
14:44:23 <dansmith> but, since I don't care so much, putting them over there keeps them out of the way in nova
14:44:24 <dansmith> so..? :)
14:44:37 <efried> okay
14:44:38 <efried> so
14:45:00 <efried> I'm going to be pushing hard for at least a small piece (full GPUs) of generic placement-based device passthrough in Stein.
14:45:25 <efried> And obviously will be asking people like those present here to review things in that space.
14:45:41 <efried> so wanted to get pre-buy-in for which approach to take short-term (stein)
14:45:44 <efried> which I think I have now
14:45:44 <dansmith> I think fleshing out GPUs in nova, which we already have makes sense
14:45:45 <efried> so
14:45:46 <efried> thanks.
14:45:56 <efried> well, distinguishing VGPU from GPU in this case dansmith
14:46:09 <efried> Those are going to be very different things.
14:46:09 <dansmith> oh,
14:46:26 <dansmith> you want a GPU-specific PCI passthrough replacement?
14:46:30 <efried> The full-GPU passthrough thing is going to actually subsume some of the functionality you can currently do with [pci]*
14:46:35 <efried> yes exactly
14:46:43 <efried> GPU first
14:46:55 * dansmith looks for his spoon
14:46:56 <efried> or possibly any "full card"
14:47:53 <efried> I see being able to use either mechanism (legacy [pci]passthrough_whitelist/alias or The New Thing) for multiple releases
14:48:03 <efried> until we have full parity and can start ripping out the legacy thing
14:48:09 <efried> if we try to do it all at once, fail
14:48:19 <mriedem> we should make a list of the ginormous tasks we think we're going to take on in stein - at the ptg of course
14:48:28 <mriedem> b/c i remember a lot of wailing about not having shared storage support yet
14:48:48 <mriedem> cross-cell cold migrate is going to be my albatross
14:48:56 <dansmith> or a plan for numa
14:49:12 <mriedem> or just being able to upgrade to stein with placement working :)
14:49:20 <dansmith> yeah
14:49:20 <melwitt> yeah, I want to get shared storage squared away. being that it looks like it's close too
14:49:27 <mriedem> or instance ownership transfers
14:49:28 <mriedem> etc
14:49:33 <mriedem> lots of big proposals on the plate right now
14:49:45 <mriedem> we're gonna need to weigh this stuff
14:50:09 <cdent> should be plenty of scales in colorado
14:50:20 <mriedem> b/c of fat coloradoans?
14:50:27 <cdent> weeeeeeeed
14:50:28 <dansmith> cows
14:50:30 <mriedem> oh right
14:50:31 <mriedem> heh
14:50:36 <mriedem> *rimshot*
14:50:55 <melwitt> ok, are we done? :)
14:50:57 <dansmith> cdent: post-legalization, it's not that big a deal to make sure the dime bag is no larger than it should be :)
14:51:49 <efried> I think I'm done
14:51:53 * cdent avoids going off into too much weed jargon
14:52:13 <melwitt> ok, let's call it. thanks everyone
14:52:17 <melwitt> #endmeeting