14:00:17 <cdent> #startmeeting nova-scheduler
14:00:17 <openstack> Meeting started Mon Apr 11 14:00:17 2016 UTC and is due to finish in 60 minutes. The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:18 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:21 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:32 <mlavalle> o/
14:00:32 <cdent> who is here to have a fun and exciting nova scheduler team meeting?
14:00:34 <mriedem> o/
14:00:35 <mlavalle> me
14:00:41 <cdent> \o/
14:01:04 <cdent> bauzas, jaypipes, dansmith about?
14:01:15 <cdent> There's nothing specific on the agenda this week.
14:01:18 <bauzas> \o
14:01:19 <Yingxin> o/
14:01:39 <tonytan4ever> o/
14:01:43 <jaypipes> o..../
14:01:53 <cdent> mriedem: I assume we are still in pre-summit new-spec freeze?
14:02:02 <mriedem> yes
14:02:16 <jaypipes> mlavalle, ajo: did you see my response to ajo this morning on scheduler and RT stuff for NIC_BW_KB?
14:02:38 <mlavalle> jaypipes: not yet. I just connected. Will take a look soon
14:02:56 * edleafe is here for a little while
14:03:00 <_gryf> o/
14:03:29 <cdent> What scheduler-related specs are currently in play? compute node migration. what else?
14:03:32 <sarafraj> o/
14:03:54 <jaypipes> cdent: that's it, AFAIK. pre-summit that's all that is accepted.
14:04:19 <cdent> that's my sense of things too
14:04:36 <cdent> mriedem still has a few questions on dansmith's migration patch, but that's almost there
14:04:43 <bauzas> cdent: I'm also working on check-destinations
14:04:51 <cdent> link?
14:04:55 <bauzas> which has more side effects than I originally thoguht
14:04:57 <edleafe> are there any that can be reviewed so that they get approved asap post-summit?
14:04:59 <bauzas> thought, even
14:05:42 <cdent> #link: compute node migration https://review.openstack.org/#/c/279313/
14:06:14 <bauzas> https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/check-destination-on-migrations-newton is reviewable too
14:06:16 <cdent> edleafe: there is a newton version of generic resource pools: https://review.openstack.org/#/c/300176/
14:06:41 <cdent> and I think jaypipes restacked all the resource-* stuff for newton, yeah?
14:06:49 <jaypipes> yes
14:07:15 <edleafe> cdent: ok, great. I'd hate to see that stuff linger throughout newton
14:07:17 <cdent> jaypipes: is there a nice happy little single link for that?
14:08:14 * mlavalle will review jaypipes' generic resource pool spec
14:08:25 <cdent> mlavalle++
14:08:50 <jaypipes> cdent: https://blueprints.launchpad.net/nova/+spec/compute-node-inventory-newton
14:08:57 <jaypipes> cdent: has the dependency graph and links in there.
14:09:02 <cdent> Any other specs or reviews that anyone would like to draw attention to?
14:09:06 <cdent> thanks jaypipes
14:09:09 <jaypipes> np
14:09:12 <jaypipes> #link https://blueprints.launchpad.net/nova/+spec/compute-node-inventory-newton
14:10:03 <cdent> #topic: bugs
14:10:18 <cdent> #link https://bugs.launchpad.net/nova/+bugs?field.tag=scheduler&orderby=-id&start=0
14:10:42 <cdent> there don't appear to be many new bugs, which is good, but not much shrinkage in the number of bugs, which is less good
14:11:07 <cdent> anything to highlight?
14:11:12 <bauzas> a bit of triage
14:11:20 <bauzas> but nothing really worth commenting on
14:11:45 <bauzas> basically, someone expresses frustration about 300+ computes having hard time reading DB
14:11:57 <ajo> (sorry about the late pong jayg) I saw your response but still had no time to properly process it
14:11:57 <bauzas> well,
14:12:23 <bauzas> s/.*/about the scheduler having hard time to read eq. of 300+ computes/
14:12:28 <mriedem> there are a lot of bugs with 'should' in the title that are over a year old
14:12:43 <cdent> mriedem: I'm looking forward to monday when we get to stomp on those
14:12:46 <mriedem> we 'should' start pushing for blueprints on those
14:12:56 <mriedem> cdent: that doesn't need to wait for monday
14:13:11 <cdent> mriedem: it's all about the time slices, hon
14:13:13 <bauzas> mriedem: I'm about to fence a lot of those during the bug scrub day
14:13:42 <jaypipes> ajo: np!
14:13:43 <cdent> (by which I mean, I think setting aside a designated time is a great idea)
14:14:36 <cdent> anything else on bugs?
14:15:27 <cdent> okay: moving on
14:15:32 <cdent> #topic open
14:15:43 <cdent> and go
14:15:58 <Yingxin> The benchmark tool I mentioned in the paragraph[-1] of
14:15:59 <Yingxin> http://lists.openstack.org/pipermail/openstack-dev/2016-March/088344.html is available now.
14:16:11 <Yingxin> jaypipes: bauzas: ^
14:16:28 <Yingxin> It's https://github.com/cyx1231st/nova-scheduler-bench
14:16:35 <bauzas> coolness
14:16:48 <Yingxin> The experimental results for the filter scheduler are here: http://paste.openstack.org/show/493438/
14:16:59 <cdent> #link Yingxin's benchmark tool https://github.com/cyx1231st/nova-scheduler-bench
14:16:59 <Yingxin> I'll try to explain them in the summit session "Dive into nova scheduler performance - Where is the bottleneck?"
14:17:12 <Yingxin> cdent: thanks :)
14:17:29 <cdent> any quick summary or highlight to share?
14:17:33 <bauzas> what surprised me a bit is jaypipes finding SQL queries to be 30% more performant than the usual python modules
14:18:15 <jaypipes> bauzas: 38% at 8 threads.
14:18:16 <bauzas> which means the generator we use is underperformant for iterating over all the filters (and hosts)
14:18:33 <jaypipes> bauzas: no, it means that C is faster than Python.
14:18:54 <bauzas> jaypipes: there were other benchmarks in the past demonstrating not a clear win for C over Python
14:19:21 <edleafe> jaypipes: it also means that Python has less to filter
14:19:22 <bauzas> and the filters are imported once
14:19:31 <Yingxin> I found that the db is a big problem during scheduling
14:19:49 <bauzas> Yingxin: that ^, I think we all agree
14:19:51 <jaypipes> bauzas: it's simple. The more compute nodes you have in the deployment, the slower the existing filter scheduler is, because we transfer every compute node in the system on each call to select_destinations(). Filtering that list of compute nodes to only return one that matches the conditions means you don't loop over all the compute nodes.
14:20:12 <jaypipes> Yingxin: I don't agree at all.
14:20:35 <jaypipes> Yingxin: it's only a problem because we are transferring giant collections of compute nodes each time we schedule.
14:20:35 <bauzas> jaypipes: that's the current bottleneck
14:20:40 <jaypipes> bauzas: no it isn't.
14:20:43 <cdent> Yingxin: is your tool using real vms and message bus?
14:20:48 <Yingxin> many requests are stuck before getting sent to the scheduler service
14:21:01 <dansmith> bauzas: you mean no difference in the performance between python and C mysql drivers, right?
I think it's pretty clear that generic C code will be much faster than python, for things like processing the results and doing the actual filtering, right?
14:21:07 <Yingxin> cdent: using real compute node services and message bus
14:21:11 <edleafe> jaypipes: that was the huge gain that I saw using a distributed db, with filters as db queries
14:21:14 <jaypipes> Yingxin: the DB itself is hardly breaking a sweat in all my benchmarks (real and the placement-bench stuff).
14:21:23 <jaypipes> it's the way we *use* the DB that is suboptimal.
14:21:29 <edleafe> jaypipes: not having to constantly pull the state of the nodes was a huge win
14:21:34 <bauzas> jaypipes: well, the whole purpose of CachingScheduler is to reduce the number of DB calls we made
14:21:43 <bauzas> in order to pass the bar
14:22:22 <jaypipes> bauzas: and CachingScheduler substitutes a cache invalidation and race condition problem for a reduced set of DB calls instead of correcting the root of the problem, which is poor *use* of the DB.
14:22:59 <jaypipes> we use the DB to store stuff but don't use it to filter things, which is what its purpose is...
14:23:01 <bauzas> dansmith: sure, I'm not being clear; what I'm trying to explain is that, in the workflow, it was identified in the past that the filtering part of the scheduler was not a performance problem compared to the DB calls we made, by multiple orders of magnitude
14:23:02 <Yingxin> the caching scheduler has a great performance improvement in my experiments using a real openstack deployment
14:23:21 <cdent> bauzas: because those db calls are bad, that's the point jaypipes is trying to make
14:23:29 <jaypipes> correct.
14:23:29 <cdent> if we store data well, and then query it well, we have huge gains
14:23:33 <dansmith> bauzas: right but the reason those are expensive is because of how much we have to pull back into python land right?
14:23:33 <bauzas> cdent: sure, and I agree with the approach
14:23:45 <jaypipes> dansmith: correct.
14:23:46 <dansmith> bauzas: the calls we will be making will be massively more efficient where the old ones were not
14:23:48 <bauzas> dansmith: yeah, that's the #1 improvement axis
14:23:58 <bauzas> dansmith: hence my support on the series
14:24:09 <jaypipes> dansmith: and, more importantly, the greater the number of compute nodes, the less our current approach scales.
14:24:09 <dansmith> okay, sorry if I'm stating the obvious :)
14:24:16 <dansmith> yeah
14:24:19 <bauzas> but like I said, I never estimated the filtering part of that as requiring such modification
14:24:53 <jaypipes> bauzas: as I mentioned this morning in my response to ajo, I don't have an issue with creating a separate scheduler driver that does things in the DB instead of Python.
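To make the contrast in the exchange above concrete: today's filter scheduler pulls every compute node into Python and runs each filter over the whole list, while the approach jaypipes describes pushes the resource constraints into a single SQL query so only matching hosts ever reach the scheduler. The sketch below is illustrative only, not Nova code: the host-state and filter shapes, the request_spec keys, and the compute_nodes column names are assumptions loosely modeled on Nova's schema.

    # Rough illustration only; not Nova's implementation.
    from sqlalchemy import create_engine, text

    # (1) Python-side filtering: every compute node is materialized as a
    #     host-state object, then each filter walks the whole list.
    def select_hosts_python(all_host_states, request_spec, filters):
        candidates = all_host_states  # one entry per compute node in the deployment
        for flt in filters:
            candidates = [h for h in candidates
                          if flt.host_passes(h, request_spec)]
        return candidates

    # (2) DB-side filtering: the constraints become a WHERE clause, so the
    #     scheduler only ever sees hosts that already fit the request.
    def select_hosts_sql(engine, request_spec, limit=10):
        # engine = create_engine("mysql+pymysql://...")  # supplied by the caller
        query = text("""
            SELECT id, hypervisor_hostname
              FROM compute_nodes
             WHERE memory_mb - memory_mb_used >= :ram
               AND vcpus - vcpus_used >= :vcpus
               AND local_gb - local_gb_used >= :disk
             LIMIT :limit
        """)
        with engine.connect() as conn:
            return conn.execute(query, {"ram": request_spec["memory_mb"],
                                        "vcpus": request_spec["vcpus"],
                                        "disk": request_spec["root_gb"],
                                        "limit": limit}).fetchall()

The 38%-at-8-threads figure quoted above comes from jaypipes' own benchmarks, not from this sketch; the point is only that the second form scales with the number of matching hosts rather than with the total number of compute nodes.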
14:25:11 <cdent> semi-related: I had a brain mush last night about nested resource-pools that should allow us to implement celled schedulers and the "super scheduling" that people talk about
14:25:27 <bauzas> jaypipes: sorry if I'm unclear, I'm just saying I was surprised to see the figures, not that I'm against those :)
14:25:30 <cdent> need to write it down before it disappears again
14:26:29 <cdent> so is this an accurate summary:
14:26:30 <bauzas> jaypipes: like I said in reviews, I trust you on that, I was just expressing my mindset that I wasn't seeing a clear performance benefit of that
14:26:36 <cdent> the way we use the db now is costly
14:26:40 <bauzas> until your figures, which make me very torn
14:26:43 <cdent> the way we plan to use the db is better
14:26:46 <cdent> EOF
14:27:20 <jaypipes> bauzas: I will modify the resource-providers-scheduler-db-filters blueprint to have it create a new scheduler driver instead of modifying the existing ones.
14:27:34 <bauzas> okay
14:27:38 <Yingxin> jaypipes: I'm eager to test the resource-provider scheduler using my benchmarking tool once it's available
14:28:40 <Yingxin> That would be more fair than the placement-bench
14:28:48 * edleafe has to run off...
14:29:06 <mriedem> so what are the goals for this week?
14:29:31 <mriedem> 1. https://review.openstack.org/#/q/topic:bp/compute-node-inventory-newton+status:open merging
14:29:40 <dansmith> mriedem: 1. get you to stop -1ing my patch, 2. merge my patch, 3. don't care about the rest
14:29:41 <dansmith> :P
14:29:47 <jaypipes> dansmith: k, 279313 reviewed. nice work.
14:30:01 <mriedem> what else is needed in the compute-node-inventory-newton bp? pci devices and something else needs migrating right?
14:30:28 <jaypipes> Yingxin: did you see my review of your proposed scheduler functional testing?
14:30:39 <Yingxin> jaypipes: yes
14:31:06 <mriedem> still need to migrate pci devices and numa topologies
14:31:14 <mriedem> dansmith: are you working on that next or is someone else doing that?
14:31:23 <jaypipes> mriedem: the PCI devices stuff isn't changing for compute-node-inventory right now. I need to resubmit the pci-generate-stats blueprint for Newton after feedback from ndipanov
14:31:45 <_gryf> speaking of pci…
14:31:52 <_gryf> I've posted the mail on the ML regarding FPGA (as requested at the previous meeting), which has gotten quite a response [http://lists.openstack.org/pipermail/openstack-dev/2016-April/091411.html]
14:31:56 <mriedem> can we finish this one thought before moving on please?
14:32:00 <_gryf> k
14:32:02 <mriedem> http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html
14:32:04 <mriedem> what is left?
14:32:10 <dansmith> jaypipes: we were looking for your feedback on L373 on that patch
14:32:11 <mriedem> pci devices is deferred to another bp
14:33:28 <mriedem> cdent: btw, on http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html it could use an update since "Grab the resource class identifier for CPU from the resource_classes table" isn't what we do, we use the enums
14:33:35 <Yingxin> jaypipes: I think refactoring the servicegroup functional tests would be a better start.
14:33:44 <jaypipes> dansmith: you are correct there. I had nothing to add, sorry :(
14:34:02 <dansmith> jaypipes: okay, just making sure, thanks
14:34:24 <mriedem> jaypipes: should the spec for compute-node-inventory-newton be updated to say that pci device migration will happen elsewhere?
14:34:31 <mriedem> or is it TBD at this point?
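The "new scheduler driver" jaypipes commits to above would sit beside FilterScheduler and CachingScheduler rather than replacing them. A hypothetical skeleton of what such a driver might look like follows; the nova.scheduler.driver.Scheduler base class and the select_destinations() signature are recalled from the Newton-era driver interface and may not match exactly, and _hosts_from_db() is an invented placeholder for the kind of SQL-side filtering sketched earlier.

    # Hypothetical skeleton; not the blueprint's actual code.
    from nova import exception
    from nova.scheduler import driver

    class DBFilterScheduler(driver.Scheduler):
        """Answer select_destinations() with a database query instead of
        looping over per-host state objects in Python."""

        def select_destinations(self, context, spec_obj):
            # _hosts_from_db() stands in for the SQL-side filtering: it
            # should return only hosts that satisfy spec_obj's CPU, RAM
            # and disk requirements.
            hosts = self._hosts_from_db(context, spec_obj)
            if not hosts:
                raise exception.NoValidHost(reason='no host satisfies the request')
            return [{'host': h.host, 'nodename': h.nodename, 'limits': {}}
                    for h in hosts[:spec_obj.num_instances]]

Deployments would then opt in through the existing scheduler driver configuration option, which fits the decision above not to modify the existing drivers.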
14:35:05 <jaypipes> mriedem: TBD at this point.
14:35:13 <mriedem> ok
14:35:17 <mriedem> how about numa topologies?
14:35:36 <jaypipes> mriedem: PCI devices and NUMA topology placement will remain handled by Python-side filtering for the foreseeable future.
14:35:57 <jaypipes> mriedem: and the access to those resources (via the ComputeNode object) also remains unchanged.
14:36:14 <bauzas> yeah
14:36:26 <bauzas> we could possibly improve how we work with NUMA resources
14:36:30 <mriedem> so once https://review.openstack.org/#/c/279313/ is merged is compute-node-inventory-newton complete?
14:36:42 <bauzas> because there is a big helper module in nova.hardware that I'd like to remove
14:36:58 <bauzas> basically doing lots of isinstance() checks
14:38:53 <jaypipes> mriedem: yes, though I see a dependent patch for 279313 in Gertty and Gerrit.
14:39:19 <bauzas> FWIW, https://review.openstack.org/#/c/279313/ is planned to be reviewed today
14:40:09 <mriedem> bauzas: i'm +2 on it once the test i asked for is added
14:40:19 <mriedem> i think dan is just waiting for this meeting to be done
14:40:50 <mriedem> ok, so if ram/cpu/disk migration completes that spec, it seems like http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html should be amended
14:40:58 <bauzas> mriedem: okay, good to know, I already reviewed that patch without voting on it yet, but I'm almost happy with it
14:41:31 <bauzas> just wanted to make sure everything is okay before pushing the red button
14:41:33 <mriedem> but i don't want that to make waves if we don't really know yet
14:41:40 <mriedem> bauzas: i put my -1 on it to be safe
14:41:47 <bauzas> I just saw
14:41:51 <mriedem> and because dansmith told me to stop -1ing his changes :)
14:42:16 <bauzas> couldn't we ask for an i² button?
14:42:17 <dansmith> mriedem: jaypipes I'm pushing up a rev with those tweaks now
14:42:24 <dansmith> bauzas: niiiice
14:43:06 <mriedem> alright, well let's just move on, i'll just follow up on spec amendments once the code is merged and we talk about completing the bp
14:43:42 <cdent> anybody have any other open topics _not_ related to resource providers? _gryf ?
14:44:05 <_gryf> cdent, :)
14:44:20 <jaypipes> dansmith: coolio.
14:44:53 <_gryf> cdent, just wanted to point out the thread.
14:45:04 <bauzas> cdent: like I said, check-destination-on-migrations
14:45:30 <bauzas> _gryf: have you seen my proposal for a performance VMs discussion at the summit?
14:45:38 <_gryf> bauzas, yes, saw that
14:45:57 <bauzas> not totally sold on the idea, just want to stop people working in silos
14:46:03 <_gryf> does it mean I can remove the entry from the unconference section?
14:46:36 <bauzas> _gryf: let's discuss that maybe outside of this meeting
14:46:37 <mriedem> _gryf: we haven't decided what the design summit sessions are going to be yet
14:46:38 <mriedem> so leave it
14:46:44 <bauzas> ++
14:46:56 <_gryf> mriedem, bauzas, ok
14:47:25 <mlavalle> mriedem: does that mean that the time and date for the Neutron / Nova joint session in Austin haven't firmed up?
14:47:45 <mriedem> mlavalle: i think that one on wed is pretty firm
14:47:52 <jaypipes> dansmith: wallaby'd
14:48:12 <mlavalle> mriedem: so we can feel confident that routed networks will be discussed?
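For context on the ram/cpu/disk migration discussed above: the idea in the compute-node-inventory-newton spec is to re-express the totals a compute node already reports as one inventory record per resource class (the enums mriedem refers to) instead of reading them from ad-hoc columns at scheduling time. A loose sketch of that mapping, with the records simplified to plain dicts and only the attribute names taken from the familiar compute_nodes fields:

    # Loose sketch of the ram/cpu/disk -> per-resource-class mapping; the
    # actual online migration in the series presumably writes rows to the
    # new inventories table rather than building dicts like this.
    VCPU, MEMORY_MB, DISK_GB = 'VCPU', 'MEMORY_MB', 'DISK_GB'  # resource-class enum values

    def inventories_for(compute_node):
        cn = compute_node
        return [
            {'resource_class': VCPU,      'total': cn.vcpus,     'used': cn.vcpus_used},
            {'resource_class': MEMORY_MB, 'total': cn.memory_mb, 'used': cn.memory_mb_used},
            {'resource_class': DISK_GB,   'total': cn.local_gb,  'used': cn.local_gb_used},
        ]

PCI devices and NUMA topologies stay outside this mapping for now, per jaypipes' comments above.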
14:48:18 <mriedem> jaypipes: bauzas: don't forget https://review.openstack.org/#/c/303531/
14:48:32 * dansmith tips his hat to jaypipes
14:48:33 <mriedem> mlavalle: yes, it's the 3rd session on wed
14:48:41 <mriedem> right after the 2 scheduler sessions
14:48:53 <mlavalle> mriedem: yaay!
14:49:35 <mriedem> mlavalle: please talk to armax and see if there are going to be a bunch of other things that neutron wants to cover in that nova/neutron session, because it might get too full
14:49:58 <mriedem> this is our summit session etherpad https://etherpad.openstack.org/p/newton-nova-summit-ideas
14:49:59 <mlavalle> mriedem: will take that action item and report back to you
14:51:08 <cdent> anything else from anyone?
14:51:23 <dansmith> so,
14:51:31 <dansmith> sorry I've been distracted in another channel
14:51:42 <jaypipes> mriedem: Wallaby'd that one too.
14:51:42 <dansmith> but we're good on the inventory migration patch, it looks like
14:52:00 <mriedem> yes, but i'm unclear on the rest of the spec
14:52:03 <dansmith> are we going to open up the next one (allocations) or wait until summit? I haven't tracked that spec
14:52:05 <mriedem> but we can talk about that later
14:53:09 <mriedem> i don't have an answer re: the allocations spec right now, would have to look into it
14:53:14 <dansmith> okay
14:53:22 <mriedem> we have lots of other stuff that could be worked on before the summit, like the cells v2 build request and cell0 stuff
14:53:30 <dansmith> so I also need to remove the aggregate online migration thing
14:53:47 <dansmith> yep
14:54:38 * mriedem has to run to another meeting
14:54:58 <cdent> I think we can use that as our signal to call it, unless someone has something for the last 5 minutes?
14:55:11 <dansmith> just one more thing...
14:55:14 * dansmith jokes
14:55:15 <cdent> hehe
14:55:24 <cdent> #stopmeeting
14:55:30 <dansmith> #endmeeting
14:55:39 <cdent> #endmeeting