14:00:17 <cdent> #startmeeting nova-scheduler
14:00:17 <openstack> Meeting started Mon Apr 11 14:00:17 2016 UTC and is due to finish in 60 minutes.  The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:18 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:21 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:32 <mlavalle> o/
14:00:32 <cdent> who is here to have a fun and exciting nova scheduler team meeting?
14:00:34 <mriedem> o/
14:00:35 <mlavalle> me
14:00:41 <cdent> \o/
14:01:04 <cdent> bauzas, jaypipes, dansmith about?
14:01:15 <cdent> There's nothing specific on the agenda this week.
14:01:18 <bauzas> \o
14:01:19 <Yingxin> o/
14:01:39 <tonytan4ever> o/
14:01:43 <jaypipes> o..../
14:01:53 <cdent> mriedem: I assume we are still in pre-summit new-spec freeze?
14:02:02 <mriedem> yes
14:02:16 <jaypipes> mlavalle, ajo: did you see my response to ajo this morning on scheduler and RT stuff for NIC_BW_KB?
14:02:38 <mlavalle> jaypipes: not yet. I just connected. Will take a look soon
14:02:56 * edleafe is here for a little while
14:03:00 <_gryf> o/
14:03:29 <cdent> What scheduler related specs are currently in play? compute node migration. what else?
14:03:32 <sarafraj> o/
14:03:54 <jaypipes> cdent: that's it, AFAIK. pre-summit that's all that is accepted.
14:04:19 <cdent> that's my sense of things too
14:04:36 <cdent> mriedem still has a few questions on dansmith's migration patch, but that's almost there
14:04:43 <bauzas> cdent: I'm also working on check-destinations
14:04:51 <cdent> link?
14:04:55 <bauzas> which has more side effects than I originally thoguht
14:04:57 <edleafe> are there any that can be reviewed so that they get approved asap post-summit?
14:04:59 <bauzas> thought, even
14:05:42 <cdent> #link: compute node migration https://review.openstack.org/#/c/279313/
14:06:14 <bauzas> https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/check-destination-on-migrations-newton is reviewable too
14:06:16 <cdent> edleafe: there is a newton version of generic resource pools: https://review.openstack.org/#/c/300176/
14:06:41 <cdent> and I think jaypipes restacked all the resource-* stuff for newton, yeah?
14:06:49 <jaypipes> yes
14:07:15 <edleafe> cdent: ok, great. I'd hate to see that stuff linger throughout newton
14:07:17 <cdent> jaypipes: is there a nice happy little single link for that?
14:08:14 * mlavalle will review jaypipes generic resource pool spec
14:08:25 <cdent> mlavalle++
14:08:50 <jaypipes> cdent: https://blueprints.launchpad.net/nova/+spec/compute-node-inventory-newton
14:08:57 <jaypipes> cdent: has the dependency graph and links in there.
14:09:02 <cdent> Any other specs or review that anyone would like to draw attention to?
14:09:06 <cdent> thanks jaypipes
14:09:09 <jaypipes> np
14:09:12 <jaypipes> #link https://blueprints.launchpad.net/nova/+spec/compute-node-inventory-newton
14:10:03 <cdent> #topic: bugs
14:10:18 <cdent> #link https://bugs.launchpad.net/nova/+bugs?field.tag=scheduler&orderby=-id&start=0
14:10:42 <cdent> there don't appear to be many new bugs, which is good, but there's not much shrinkage in the number of bugs, which is less good
14:11:07 <cdent> anything to highlight?
14:11:12 <bauzas> a bit of triage
14:11:20 <bauzas> but nothing really worth commenting on
14:11:45 <bauzas> basically, someone expressed frustration about 300+ computes having a hard time reading the DB
14:11:57 <ajo> (sorry about the late pong jayg) I saw your response but still had no time to properly process it
14:11:57 <bauzas> well,
14:12:23 <bauzas> s/.*/about the scheduler having a hard time reading the equivalent of 300+ computes/
14:12:28 <mriedem> there are a lot of bugs with 'should' in the title that are over a year old
14:12:43 <cdent> mriedem: I'm looking forward to monday when we get to stomp on those
14:12:46 <mriedem> we 'should' start pushing for blueprints on those
14:12:56 <mriedem> cdent: that doesn't need to wait for monday
14:13:11 <cdent> mriedem: it's all about the time slices, hon
14:13:13 <bauzas> mriedem: I'm about to fence a lot of those during the bug scrub day
14:13:42 <jaypipes> ajo: np!
14:13:43 <cdent> (by which I mean, I think setting aside a designated time is a great idea)
14:14:36 <cdent> anything else on bugs?
14:15:27 <cdent> okay: moving on
14:15:32 <cdent> #topic open
14:15:43 <cdent> and go
14:15:58 <Yingxin> The benchmark tool I mentioned in the paragraph[-1] of
14:15:59 <Yingxin> http://lists.openstack.org/pipermail/openstack-dev/2016-March/088344.html is available now.
14:16:11 <Yingxin> jaypipes: bauzas: ^
14:16:28 <Yingxin> It's https://github.com/cyx1231st/nova-scheduler-bench
14:16:35 <bauzas> coolness
14:16:48 <Yingxin> The experimental results for the filter scheduler are here: http://paste.openstack.org/show/493438/
14:16:59 <cdent> #link Yingxin's benchmark tool https://github.com/cyx1231st/nova-scheduler-bench
14:16:59 <Yingxin> I'll try to explain them in the summit session 'Dive into nova scheduler performance - Where is the bottleneck?'
14:17:12 <Yingxin> cdent: thanks :)
14:17:29 <cdent> any quick summary or highlight to share?
14:17:33 <bauzas> what surprises me a bit is what jaypipes found, with SQL queries being ~30% more performant than the usual Python modules
14:18:15 <jaypipes> bauzas: 38% at 8 threads.
14:18:16 <bauzas> which means the generator we use is underperformant for iterating over all the filters (and hosts)
14:18:33 <jaypipes> bauzas: no, it means that C is faster than Python.
14:18:54 <bauzas> jaypipes: there were other benchmarks in the past that didn't demonstrate a clear win for C over Python
14:19:21 <edleafe> jaypipes: it also means that Python has less to filter
14:19:22 <bauzas> and the filters are imported once
14:19:31 <Yingxin> I found that db is a big problem during scheduling
14:19:49 <bauzas> Yingxin: that ^,  I think we all agree
14:19:51 <jaypipes> bauzas: it's simple. The more compute nodes you have in the deployment, the slower the existing filter scheduler is, because we transfer every compute node in the system on each call to select_destinations(). Filtering that list of compute nodes to return only the ones that match the conditions means you don't loop over all the compute nodes.
14:20:12 <jaypipes> Yingxin: I don't agree at all.
14:20:35 <jaypipes> Yingxin: it's only a problem because we are transferring giant collections of compute nodes each time we schedule.
14:20:35 <bauzas> jaypipes: that's the current bottleneck
14:20:40 <jaypipes> bauzas: no it isn't.
14:20:43 <cdent> Yingxin: is your tool using real vms and message bus?
14:20:48 <Yingxin> many requests are stuck before getting sent to the scheduler service
14:21:01 <dansmith> bauzas: you mean no difference in the performance between python and C mysql drivers, right? I think it's pretty clear that generic C code will be much faster than python, for things like processing the results and doing the actual filtering, right?
14:21:07 <Yingxin> cdent: using real compute node services and message bus
14:21:11 <edleafe> jaypipes: that was the huge gain that I saw using a distributed db, with filters as db queries
14:21:14 <jaypipes> Yingxin: the DB itself is hardly breaking a sweat in all my benchmarks (real and the placement-bench stuff).
14:21:23 <jaypipes> it's the way we *use* the DB that is suboptimal.
14:21:29 <edleafe> jaypipes: not having to constantly pull the state of the nodes was a huge win
14:21:34 <bauzas> jaypipes: well, the whole purpose of CachingScheduler is to reduce the number of DB calls we make
14:21:43 <bauzas> in order to pass the bar
14:22:22 <jaypipes> bauzas: and CachingScheduler substitutes a cache invalidation and race condition problem for a reduced set of DB calls instead of correcting the root of the problem, which is poor *use* of the DB.
14:22:59 <jaypipes> we use the DB to store stuff but don't use it to filter things, which is what its purpose is...
14:23:01 <bauzas> dansmith: sure, I'm not being clear; what I'm trying to explain is that it was identified in the past that the filtering part of the scheduler was not a performance problem compared to the DB calls we make, by multiple orders of magnitude
14:23:02 <Yingxin> the caching scheduler shows a great performance improvement in my experiments using a real OpenStack deployment
14:23:21 <cdent> bauzas: because those db calls are bad, that's the point jaypipes is trying to make
14:23:29 <jaypipes> correct.
14:23:29 <cdent> if we store data well, and then query it well, we have huge gains
14:23:33 <dansmith> bauzas: right but the reason those are expensive is because of how much we have to pull back into python land right?
14:23:33 <bauzas> cdent: sure, and I agree with the approach
14:23:45 <jaypipes> dansmith: correct.
14:23:46 <dansmith> bauzas: the calls we will be making will be massively more efficient where the old ones were not
14:23:48 <bauzas> dansmith: yeah, that's the #1 axis of improvement
14:23:58 <bauzas> dansmith: hence my support on the series
14:24:09 <jaypipes> dansmith: and, more importantly, the greater the number of compute nodes, the less our current approach scales.
14:24:09 <dansmith> okay, sorry if I'm stating the obvious :)
14:24:16 <dansmith> yeah
14:24:19 <bauzas> but like I said, I never estimated the filtering part of that as requiring such modification
14:24:53 <jaypipes> bauzas: as I mentioned this morning in my response to ajo, I don't have an issue creating a separate scheduler driver that does things in the DB instead of Python.
14:25:11 <cdent> semi-related: I had a brain mush last night about nested resource-pools that should allow us to implement celled schedulers and the "super scheduling" that people talk about
14:25:27 <bauzas> jaypipes: sorry if I'm unclear, I'm just saying I was surprised to see the figures, not that I'm against those :)
14:25:30 <cdent> need to write it down before it disappears again
14:26:29 <cdent> so is this an accurate summary:
14:26:30 <bauzas> jaypipes: like I said in reviews, I trust you on that, I was just expressing that I wasn't seeing a clear performance benefit from it
14:26:36 <cdent> the way we use the db now is costly
14:26:40 <bauzas> until I saw your figures, which leave me very torn
14:26:43 <cdent> the way we plan to use the db is better
14:26:46 <cdent> EOF
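
(A minimal illustrative sketch of the contrast summarized above; this is not nova code, and the model, column names, and resource math are assumptions. The first function mirrors the current filter-scheduler pattern of pulling every compute node into Python; the second lets the database return only matching hosts.)

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class ComputeNode(Base):
    # Illustrative stand-in for the compute_nodes table; columns are assumed.
    __tablename__ = 'compute_nodes'
    id = Column(Integer, primary_key=True)
    host = Column(String(255))
    free_ram_mb = Column(Integer)
    vcpus = Column(Integer)
    vcpus_used = Column(Integer)

def pick_hosts_python_side(session, req_ram_mb, req_vcpus):
    # Current style: transfer every compute node row into Python and loop
    # over all of them, so cost grows with the size of the deployment.
    nodes = session.query(ComputeNode).all()
    return [n for n in nodes
            if n.free_ram_mb >= req_ram_mb
            and (n.vcpus - n.vcpus_used) >= req_vcpus]

def pick_hosts_db_side(session, req_ram_mb, req_vcpus, limit=1):
    # DB-side filtering: the database returns only matching rows, so Python
    # never iterates over the non-candidates.
    return (session.query(ComputeNode)
            .filter(ComputeNode.free_ram_mb >= req_ram_mb,
                    ComputeNode.vcpus - ComputeNode.vcpus_used >= req_vcpus)
            .limit(limit)
            .all())
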
14:27:20 <jaypipes> bauzas: I will modify the resource-providers-scheduler-db-filters blueprint to have it create a new scheduler driver instead of modifying the existing ones.
14:27:34 <bauzas> okay
14:27:38 <Yingxin> jaypipes: I'm eager to test the resource-provider scheduler using my benchmarking tool once it's available
14:28:40 <Yingxin> That would be more fair than the placement-bench
14:28:48 * edleafe has to run off...
14:29:06 <mriedem> so what are the goals for this week?
14:29:31 <mriedem> 1. https://review.openstack.org/#/q/topic:bp/compute-node-inventory-newton+status:open merging
14:29:40 <dansmith> mriedem: 1. get you to stop -1ing my patch, 2. merge my patch, 3. don't care about the rest
14:29:41 <dansmith> :P
14:29:47 <jaypipes> dansmith: k, 279313 reviewed. nice work.
14:30:01 <mriedem> what else is needed in the compute-node-inventory-newton bp? pci devices and something else need migrating, right?
14:30:28 <jaypipes> Yingxin: did you see my review of your proposed scheduler functional testing?
14:30:39 <Yingxin> jaypipes: yes
14:31:06 <mriedem> still need to migrate pci devices and numa topologies
14:31:14 <mriedem> dansmith: are you working on that next or is someone else doing that?
14:31:23 <jaypipes> mriedem: the PCI devices stuff isn't changing for compute-node-inventory right now. I need to resubmit the pci-generate-stats blueprint for Newton after feedback from ndipanov
14:31:45 <_gryf> speaking of pci…
14:31:52 <_gryf> I've posted the mail on the ML regarding FPGA (as requested at the previous meeting), which has gotten quite a response [http://lists.openstack.org/pipermail/openstack-dev/2016-April/091411.html]
14:31:56 <mriedem> can we finish this one thought before moving on please?
14:32:00 <_gryf> k
14:32:02 <mriedem> http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html
14:32:04 <mriedem> what is left?
14:32:10 <dansmith> jaypipes: we were looking for your feedback on L373 on that patch
14:32:11 <mriedem> pci devices is deferred to another bp
14:33:28 <mriedem> cdent: btw, on http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html it could use an update since "Grab the resource class identifier for CPU from the resource_classes table" isn't what we do, we use the enums
14:33:35 <Yingxin> jaypipes: I think refactoring the servicegroup functional tests would be a better start.
14:33:44 <jaypipes> dansmith: you are correct there. I had nothing to add, sorry :(
14:34:02 <dansmith> jaypipes: okay, just making sure, thanks
14:34:24 <mriedem> jaypipes: should the spec for compute-node-inventory-newton be updated to say that pci device migration will happen elsewhere?
14:34:31 <mriedem> or it's TBD at this point?
14:35:05 <jaypipes> mriedem: TBD at this point.
14:35:13 <mriedem> ok
14:35:17 <mriedem> how about numa topologies?
14:35:36 <jaypipes> mriedem: PCI devices and NUMA topology placement will remain handled by Python-side filtering for the foreseeable future.
14:35:57 <jaypipes> mriedem: and the access to those resources (via ComputeNode object) also remain unchanged.
14:36:14 <bauzas> yeah
14:36:26 <bauzas> we could possibly improve how we work with NUMA resources
14:36:30 <mriedem> so once https://review.openstack.org/#/c/279313/ is merged is compute-node-inventory-newton complete?
14:36:42 <bauzas> because there is a big helper module in nova.hardware that I'd like to remove
14:36:58 <bauzas> basically doing lots of isinstance()/else branches
14:38:53 <jaypipes> mriedem: yes, though I see a dependent patch for 279313 in Gertty and Gerrit.
14:39:19 <bauzas> FWIW, https://review.openstack.org/#/c/279313/ is planned to be reviewed today
14:40:09 <mriedem> bauzas: i'm +2 on it once the test i asked for is added
14:40:19 <mriedem> i think dan is just waiting for this meeting to be done
14:40:50 <mriedem> ok, so if ram/cpu/disk migration completes that spec, it seems like http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/compute-node-inventory-newton.html should be amended
14:40:58 <bauzas> mriedem: okay, good to know, I already reviewed that patch without voting on it yet, but I'm almost happy with it
14:41:31 <bauzas> just wanted to make sure everything is okay before pushing the red button
14:41:33 <mriedem> but i don't want that to make waves if we don't really know yet
14:41:40 <mriedem> bauzas: i put my -1 on it to be safe
14:41:47 <bauzas> I just saw
14:41:51 <mriedem> and because dansmith told me to stop -1ing his changes :)
14:42:16 <bauzas> couldn't we ask for an i² button?
14:42:17 <dansmith> mriedem: jaypipes I'm pushing up a rev with those tweaks now
14:42:24 <dansmith> bauzas: niiiice
14:43:06 <mriedem> alright, well let's just move on, i'll just follow up on spec amendments once the code is merged and we talk about completing the bp
14:43:42 <cdent> anybody have any other open topics _not_ related to resource providers? _gryf ?
14:44:05 <_gryf> cdent, :)
14:44:20 <jaypipes> dansmith: coolio.
14:44:53 <_gryf> cdent, just wanted to point out the thread.
14:45:04 <bauzas> cdent: like I said, check-destination-on-migrations
14:45:30 <bauzas> _gryf: have you seen my proposal for a performance VMs discussion at the summit?
14:45:38 <_gryf> bauzas, yes, saw that
14:45:57 <bauzas> not totally sold on the idea, just want to stop people working in silos
14:46:03 <_gryf> does it mean I can remove the entry from the unconference section?
14:46:36 <bauzas> _gryf: let's discuss that maybe out of that meeting
14:46:37 <mriedem> _gryf: we haven't decided what the design summit sessions are going to be yet
14:46:38 <mriedem> so leave it
14:46:44 <bauzas> ++
14:46:56 <_gryf> mriedem, bauzas, ok
14:47:25 <mlavalle> mriedem: does that mean that the time and date for the Neutron / Nova joint session in Austin haven't firmed up?
14:47:45 <mriedem> mlavalle: i think that one on wed is pretty firm
14:47:52 <jaypipes> dansmith: wallaby'd
14:48:12 <mlavalle> mriedem: so we can feel confident that routed networks will be discussed?
14:48:18 <mriedem> jaypipes: bauzas: don't forget https://review.openstack.org/#/c/303531/
14:48:32 * dansmith tips his hat to jaypipes
14:48:33 <mriedem> mlavalle: yes, it's the 3rd session on wed
14:48:41 <mriedem> right after the 2 scheduler sessions
14:48:53 <mlavalle> mriedem: yaay!
14:49:35 <mriedem> mlavalle: please talk to armax and see if there are going to be a bunch of other things that neutron wants to cover in that nova/neutron session, because it might get too full
14:49:58 <mriedem> this is our summit session etherpad https://etherpad.openstack.org/p/newton-nova-summit-ideas
14:49:59 <mlavalle> mriedem: will take that action item and report back to you
14:51:08 <cdent> anything else from anyone?
14:51:23 <dansmith> so,
14:51:31 <dansmith> sorry I've been distracted in another channel
14:51:42 <jaypipes> mriedem: Wallaby'd that one too.
14:51:42 <dansmith> but we're good on the inventory migration patch, it looks like
14:52:00 <mriedem> yes, but i'm unclear on the rest of the spec
14:52:03 <dansmith> are we going to open up the next one (allocations) or wait until summit? I haven't tracked that spec
14:52:05 <mriedem> but we can talk about that later
14:53:09 <mriedem> i don't have an answer re: the allocations spec right now, would have to look into it
14:53:14 <dansmith> okay
14:53:22 <mriedem> we have lots of other stuff that could be worked on before the summit, like the cells v2 build request and cell0 stuff
14:53:30 <dansmith> so I also need to remove the aggregate online migration thing
14:53:47 <dansmith> yep
14:54:38 * mriedem has to run to another meeting
14:54:58 <cdent> I think we can use that as our signal to call it, unless someone has something for the last 5 minutes?
14:55:11 <dansmith> just one more thing...
14:55:14 * dansmith jokes
14:55:15 <cdent> hehe
14:55:24 <cdent> #stopmeeting
14:55:30 <dansmith> #endmeeting
14:55:39 <cdent> #endmeeting