14:00:17 #startmeeting nova-scheduler
14:00:17 Meeting started Mon Apr 11 14:00:17 2016 UTC and is due to finish in 60 minutes. The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:18 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:21 The meeting name has been set to 'nova_scheduler'
14:00:32 o/
14:00:32 who is here to have a fun and exciting nova scheduler team meeting?
14:00:34 o/
14:00:35 me
14:00:41 \o/
14:01:04 bauzas, jaypipes, dansmith about?
14:01:15 There's nothing specific on the agenda this week.
14:01:18 \o
14:01:19 o/
14:01:39 o/
14:01:43 o..../
14:01:53 mriedem: I assume we are still in pre-summit new-spec freeze?
14:02:02 yes
14:02:16 mlavalle, ajo: did you see my response to ajo this morning on scheduler and RT stuff for NIC_BW_KB?
14:02:38 jaypipes: not yet. I just connected. Will take a look soon
14:02:56 * edleafe is here for a little while
14:03:00 <_gryf> o/
14:03:29 What scheduler-related specs are currently in play? Compute node migration. What else?
14:03:32 o/
14:03:54 cdent: that's it, AFAIK. Pre-summit, that's all that is accepted.
14:04:19 that's my sense of things too
14:04:36 mriedem still has a few questions on dansmith's migration patch, but that's almost there
14:04:43 cdent: I'm also working on check-destinations
14:04:51 link?
14:04:55 which has more side effects than I originally thought
14:04:57 are there any that can be reviewed so that they get approved asap post-summit?
14:05:42 #link: compute node migration https://review.openstack.org/#/c/279313/
14:06:14 https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/check-destination-on-migrations-newton is reviewable too
14:06:16 edleafe: there is a newton version of generic resource pools: https://review.openstack.org/#/c/300176/
14:06:41 and I think jaypipes restacked all the resource-* stuff for newton, yeah?
14:06:49 yes
14:07:15 cdent: ok, great. I'd hate to see that stuff linger throughout newton
14:07:17 jaypipes: is there a nice happy little single link for that?
14:08:14 * mlavalle will review jaypipes' generic resource pool spec
14:08:25 mlavalle++
14:08:50 cdent: https://blueprints.launchpad.net/nova/+spec/compute-node-inventory-newton
14:08:57 cdent: has the dependency graph and links in there.
14:09:02 Any other specs or reviews that anyone would like to draw attention to?
14:09:06 thanks jaypipes
14:09:09 np
14:09:12 #link https://blueprints.launchpad.net/nova/+spec/compute-node-inventory-newton
14:10:03 #topic: bugs
14:10:18 #link https://bugs.launchpad.net/nova/+bugs?field.tag=scheduler&orderby=-id&start=0
14:10:42 there don't appear to be many new bugs, which is good, but not much shrinkage in the number of bugs, which is less good
14:11:07 anything to highlight?
14:11:12 a bit of triage
14:11:20 but nothing really worth commenting on
14:11:45 basically, someone expresses frustration about 300+ computes having a hard time reading the DB
14:11:57 (sorry about the late pong jayg) I saw your response but still had no time to properly process it
14:11:57 well,
14:12:23 s/.*/about the scheduler having a hard time reading the equivalent of 300+ computes/
14:12:28 there are a lot of bugs with 'should' in the title that are over a year old
14:12:43 mriedem: I'm looking forward to Monday when we get to stomp on those
14:12:46 we 'should' start pushing for blueprints on those
14:12:56 cdent: that doesn't need to wait for Monday
14:13:11 mriedem: it's all about the time slices, hon
14:13:13 mriedem: I'm about to fence a lot of those during the bug scrub day
14:13:42 ajo: np!
14:13:43 (by which I mean, I think setting aside a designated time is a great idea)
14:14:36 anything else on bugs?
14:15:27 okay: moving on
14:15:32 #topic open
14:15:43 and go
14:15:58 The benchmark tool I mentioned in the paragraph[-1] of
14:15:59 http://lists.openstack.org/pipermail/openstack-dev/2016-March/088344.html is available now.
14:16:11 jaypipes: bauzas: ^
14:16:28 It's https://github.com/cyx1231st/nova-scheduler-bench
14:16:35 coolness
14:16:48 The experimental results for the filter scheduler are here: http://paste.openstack.org/show/493438/
14:16:59 #link Yingxin's benchmark tool https://github.com/cyx1231st/nova-scheduler-bench
14:16:59 I'll try to explain them in the summit session "Dive into nova scheduler performance - Where is the bottleneck?"
14:17:12 cdent: thanks :)
14:17:29 any quick summary or highlight to share?
14:17:33 what surprises me a bit is what jaypipes found, with SQL queries being 30% more performant than the usual Python modules
14:18:15 bauzas: 38% at 8 threads.
14:18:16 which means the generator we use is underperformant for iterating over all the filters (and hosts)
14:18:33 bauzas: no, it means that C is faster than Python.
14:18:54 jaypipes: there were other benchmarks in the past that didn't demonstrate a clear win for C over Python
14:19:21 jaypipes: it also means that Python has less to filter
14:19:22 and the filters are imported once
14:19:31 I found that the db is a big problem during scheduling
14:19:49 Yingxin: that ^, I think we all agree
14:19:51 bauzas: it's simple. The more compute nodes you have in the deployment, the slower the existing filter scheduler is, because we transfer every compute node in the system on each call to select_destinations(). Filtering that list of compute nodes to only return one that matches the conditions means you don't loop over all the compute nodes.
14:20:12 Yingxin: I don't agree at all.
14:20:35 Yingxin: it's only a problem because we are transferring giant collections of compute nodes each time we schedule.
14:20:35 jaypipes: that's the current bottleneck
14:20:40 bauzas: no it isn't.
14:20:43 Yingxin: is your tool using real VMs and the message bus?
14:20:48 many requests are stuck before getting sent to the scheduler service
14:21:01 bauzas: you mean no difference in the performance between Python and C MySQL drivers, right? I think it's pretty clear that generic C code will be much faster than Python, for things like processing the results and doing the actual filtering, right?
14:21:07 cdent: using real compute node services and the message bus
14:21:11 jaypipes: that was the huge gain that I saw using a distributed db, with filters as db queries
14:21:14 Yingxin: the DB itself is hardly breaking a sweat in all my benchmarks (real and the placement-bench stuff).
14:21:23 it's the way we *use* the DB that is suboptimal.
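[Editor's aside: a minimal sketch of the contrast jaypipes describes above (14:19:51), assuming a simplified, hypothetical compute_nodes model rather than Nova's actual schema or scheduler code. The point is only that a WHERE clause keeps the filtering in the database instead of shipping every compute node row to Python on each select_destinations() call.]

    # Hypothetical, simplified model -- not Nova's real compute_nodes table.
    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import Session, declarative_base

    Base = declarative_base()

    class ComputeNode(Base):
        __tablename__ = 'compute_nodes'
        id = Column(Integer, primary_key=True)
        host = Column(String(255))
        free_ram_mb = Column(Integer)
        free_disk_gb = Column(Integer)

    def hosts_python_side(session, req_ram_mb, req_disk_gb):
        # Existing filter-scheduler pattern (simplified): every row is
        # transferred on each request, then Python loops over all of them,
        # so the per-request cost grows with the size of the deployment.
        nodes = session.query(ComputeNode).all()
        return [n for n in nodes
                if n.free_ram_mb >= req_ram_mb and n.free_disk_gb >= req_disk_gb]

    def hosts_db_side(session, req_ram_mb, req_disk_gb, limit=1):
        # DB-side filtering: the WHERE clause runs in the database's
        # C-implemented query engine and only matching rows come back.
        return (session.query(ComputeNode)
                .filter(ComputeNode.free_ram_mb >= req_ram_mb,
                        ComputeNode.free_disk_gb >= req_disk_gb)
                .limit(limit)
                .all())

    if __name__ == '__main__':
        engine = create_engine('sqlite://')
        Base.metadata.create_all(engine)
        with Session(engine) as session:
            session.add(ComputeNode(host='node1', free_ram_mb=8192, free_disk_gb=100))
            session.commit()
            print(hosts_db_side(session, req_ram_mb=4096, req_disk_gb=10))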
14:21:29 jaypipes: not having to constantly pull the state of the nodes was a huge win
14:21:34 jaypipes: well, the whole purpose of CachingScheduler is to reduce the number of DB calls we make
14:21:43 in order to pass the bar
14:22:22 bauzas: and CachingScheduler substitutes a cache invalidation and race condition problem for a reduced set of DB calls instead of correcting the root of the problem, which is poor *use* of the DB.
14:22:59 we use the DB to store stuff but don't use it to filter things, which is what its purpose is...
14:23:01 dansmith: sure, sorry if I'm not clear; what I'm trying to explain is that it was identified in the past that the filtering part of the scheduler was not a performance problem compared to the DB calls we made, by multiple orders of magnitude
14:23:02 the caching scheduler has a great performance improvement in my experiments using a real OpenStack deployment
14:23:21 bauzas: because those db calls are bad, that's the point jaypipes is trying to make
14:23:29 correct.
14:23:29 if we store data well, and then query it well, we have huge gains
14:23:33 bauzas: right, but the reason those are expensive is because of how much we have to pull back into Python land, right?
14:23:33 cdent: sure, and I agree with the approach
14:23:45 dansmith: correct.
14:23:46 bauzas: the calls we will be making will be massively more efficient where the old ones were not
14:23:48 dansmith: yeah, that's the #1 axis of improvement
14:23:58 dansmith: hence my support on the series
14:24:09 dansmith: and, more importantly, the greater the number of compute nodes, the less our current approach scales.
14:24:09 okay, sorry if I'm stating the obvious :)
14:24:16 yeah
14:24:19 but like I said, I never estimated that the filtering part of that required such modification
14:24:53 bauzas: as I mentioned this morning in my response to ajo, I don't have an issue creating a separate scheduler driver that does things in the DB instead of Python.
14:25:11 semi-related: I had a brain mush last night about nested resource-pools that should allow us to implement celled schedulers and the "super scheduling" that people talk about
14:25:27 jaypipes: sorry if I'm unclear, I'm just saying I was surprised to see the figures, not that I'm against those :)
14:25:30 need to write it down before it disappears again
14:26:29 so is this an accurate summary:
14:26:30 jaypipes: like I said in reviews, I trust you on that, I was just expressing my mindset that I wasn't seeing a clear performance benefit of it
14:26:36 the way we use the db now is costly
14:26:40 until your figures, which leave me very torn
14:26:43 the way we plan to use the db is better
14:26:46 EOF
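[Editor's aside: a toy illustration of the trade-off debated above (14:21-14:23). Caching host state cuts the per-request DB calls, but between refreshes concurrent requests all see the same snapshot, which is the cache invalidation and race condition problem dansmith points out. Names here are hypothetical, not Nova's actual CachingScheduler code.]

    import time

    class CachedHostState:
        """Toy cache: trades DB round-trips for possible staleness."""

        def __init__(self, fetch_all_nodes, ttl_seconds=60):
            self._fetch = fetch_all_nodes  # a callable that queries the DB
            self._ttl = ttl_seconds
            self._snapshot = None
            self._fetched_at = 0.0

        def get_all(self):
            # Fewer DB calls: reuse the cached snapshot until the TTL expires.
            now = time.time()
            if self._snapshot is None or now - self._fetched_at > self._ttl:
                self._snapshot = self._fetch()
                self._fetched_at = now
            # Race: until the next refresh, every scheduling request sees this
            # same stale snapshot, so two requests can pick the same "free"
            # host and one of them fails at the compute node and is retried.
            return self._snapshot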
14:27:20 bauzas: I will modify the resource-providers-scheduler-db-filters blueprint to have it create a new scheduler driver instead of modifying the existing ones.
14:27:34 okay
14:27:38 jaypipes: I'm eager to test the resource-provider scheduler using my benchmarking tool once it's available
14:28:40 That would be fairer than placement-bench
14:28:48 * edleafe has to run off...
14:29:06 so what are the goals for this week?
14:29:31 1. https://review.openstack.org/#/q/topic:bp/compute-node-inventory-newton+status:open merging
14:29:40 mriedem: 1. get you to stop -1ing my patch, 2. merge my patch, 3. don't care about the rest
14:29:41 :P
14:29:47 dansmith: k, 279313 reviewed. nice work.
14:30:01 what else is needed in the compute-node-inventory-newton bp? PCI devices and something else need migrating, right?
14:30:28 Yingxin: did you see my review of your proposed scheduler functional testing?
14:30:39 jaypipes: yes
14:31:06 still need to migrate PCI devices and NUMA topologies
14:31:14 dansmith: are you working on that next or is someone else doing that?
14:31:23 mriedem: the PCI devices stuff isn't changing for compute-node-inventory right now. I need to resubmit the pci-generate-stats blueprint for Newton after feedback from ndipanov
14:31:45 <_gryf> speaking of PCI…
14:31:52 <_gryf> I've posted the mail on the ML regarding FPGA (as requested at the previous meeting), which has gotten quite a response [http://lists.openstack.org/pipermail/openstack-dev/2016-April/091411.html]
14:31:56 can we finish this one thought before moving on please?
14:32:00 <_gryf> k
14:32:02 http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html
14:32:04 what is left?
14:32:10 jaypipes: we were looking for your feedback on L373 of that patch
14:32:11 PCI devices are deferred to another bp
14:33:28 cdent: btw, http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html could use an update, since "Grab the resource class identifier for CPU from the resource_classes table" isn't what we do; we use the enums
14:33:35 jaypipes: I think refactoring the servicegroup functional tests would be a better start.
14:33:44 dansmith: you are correct there. I had nothing to add, sorry :(
14:34:02 jaypipes: okay, just making sure, thanks
14:34:24 jaypipes: should the spec for compute-node-inventory-newton be updated to say that PCI device migration will happen elsewhere?
14:34:31 or is it TBD at this point?
14:35:05 mriedem: TBD at this point.
14:35:13 ok
14:35:17 how about NUMA topologies?
14:35:36 mriedem: PCI devices and NUMA topology placement will remain handled by Python-side filtering for the foreseeable future.
14:35:57 mriedem: and access to those resources (via the ComputeNode object) also remains unchanged.
14:36:14 yeah
14:36:26 we could possibly improve how we work with NUMA resources
14:36:30 so once https://review.openstack.org/#/c/279313/ is merged, is compute-node-inventory-newton complete?
14:36:42 because there is a big helper module in nova.hardware that I'd like to remove
14:36:58 basically doing lots of isinstance()/else branching
14:38:53 mriedem: yes, though I see a dependent patch for 279313 in Gertty and Gerrit.
14:39:19 FWIW, https://review.openstack.org/#/c/279313/ is planned to be reviewed today
14:40:09 bauzas: I'm +2 on it once the test I asked for is added
14:40:19 I think dan is just waiting for this meeting to be done
14:40:50 ok, so if ram/cpu/disk migration completes that spec, it seems like http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html should be amended
14:40:58 mriedem: okay, good to know; I already reviewed that patch without voting on it yet, but I'm almost happy with it
14:41:31 just wanted to make sure everything is okay before pushing the red button
14:41:33 but I don't want that to make waves if we don't really know yet
14:41:40 bauzas: I put my -1 on it to be safe
14:41:47 I just saw
14:41:51 and because dansmith told me to stop -1ing his changes :)
14:42:16 couldn't we ask for an i² button?
14:42:17 mriedem: jaypipes: I'm pushing up a rev with those tweaks now
14:42:24 bauzas: niiiice
14:43:06 alright, well, let's just move on; I'll follow up on spec amendments once the code is merged and we talk about completing the bp
14:43:42 anybody have any other open topics _not_ related to resource providers? _gryf ?
14:44:05 <_gryf> cdent, :)
14:44:20 dansmith: coolio.
14:44:53 <_gryf> cdent, just wanted to point out the thread.
14:45:04 cdent: like I said, check-destination-on-migrations
14:45:30 _gryf: have you seen my proposal for a performance VMs discussion at the summit?
14:45:38 <_gryf> bauzas, yes, saw that
14:45:57 not totally sold on the idea, just want to stop people working in silos
14:46:03 <_gryf> does it mean I can remove the entry from the unconference section?
14:46:36 _gryf: let's discuss that maybe outside of this meeting
14:46:37 _gryf: we haven't decided what the design summit sessions are going to be yet
14:46:38 so leave it
14:46:44 ++
14:46:56 <_gryf> mriedem, bauzas, ok
14:47:25 mriedem: does that mean that the time and date for the Neutron / Nova joint session in Austin haven't firmed up?
14:47:45 mlavalle: I think that one on Wed is pretty firm
14:47:52 dansmith: wallaby'd
14:48:12 mriedem: so we can feel confident that routed networks will be discussed?
14:48:18 jaypipes: bauzas: don't forget https://review.openstack.org/#/c/303531/
14:48:32 * dansmith tips his hat to jaypipes
14:48:33 mlavalle: yes, it's the 3rd session on Wed
14:48:41 right after the 2 scheduler sessions
14:48:53 mriedem: yaay!
14:49:35 mlavalle: please talk to armax and see if there are going to be a bunch of other things that neutron wants to cover in that nova/neutron session, because it might get too full
14:49:58 this is our summit session etherpad: https://etherpad.openstack.org/p/newton-nova-summit-ideas
14:49:59 mriedem: will take that action item and report back to you
14:51:08 anything else from anyone?
14:51:23 so,
14:51:31 sorry, I've been distracted in another channel
14:51:42 mriedem: Wallaby'd that one too.
14:51:42 but we're good on the inventory migration patch, it looks like
14:52:00 yes, but I'm unclear on the rest of the spec
14:52:03 are we going to open up the next one (allocations) or wait until summit? I haven't tracked that spec
14:52:05 but we can talk about that later
14:53:09 I don't have an answer re: the allocations spec right now; I would have to look into it
14:53:14 okay
14:53:22 we have lots of other stuff that could be worked on before the summit, like the cells v2 build request and cell0 stuff
14:53:30 so I also need to remove the aggregate online migration thing
14:53:47 yep
14:54:38 * mriedem has to run to another meeting
14:54:58 I think we can use that as our signal to call it, unless someone has something for the last 5 minutes?
14:55:11 just one more thing...
14:55:14 * dansmith jokes
14:55:15 hehe
14:55:24 #stopmeeting
14:55:30 #endmeeting
14:55:39 #endmeeting