14:01:57 <cdent> #startmeeting nova_scheduler
14:01:58 <openstack> Meeting started Mon May 22 14:01:57 2017 UTC and is due to finish in 60 minutes.  The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:59 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:01 <openstack> The meeting name has been set to 'nova_scheduler'
14:02:21 <cdent> jaypipes, bauzas anyone else, shall we do this, or is there no need today?
14:02:34 <jaypipes> sorry, was emailing ;)
14:02:38 <bauzas> well
14:02:45 <bauzas> good question
14:02:51 <bauzas> edleafe is not there
14:03:00 <bauzas> so I wonder if we really need it
14:03:02 <diga> o/
14:03:06 <jaypipes> cdent: spent this morning getting all my scheduler-related patches in order (other than the nested resource providers stuff).
14:03:23 <cdent> #action everyone review all of jaypipes' updated stuff
14:03:26 <bauzas> just one thing we discussed last week was around letting people know what we discussed during Summit
14:03:33 <bauzas> and we postponed it
14:03:47 <jaypipes> cdent: so reviews on those series (https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/shared-resources-pike) and (https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/resource-provider-traits) would be appreciated.
14:03:48 <bauzas> so, not sure we would really need that given mriedem's emails
14:04:01 * alex_xu reads some recap email, that is helpful
14:04:09 <cdent> #link shared rps https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/shared-resources-pike
14:04:15 <jaypipes> alex_xu: indeed. I'm reading a bunch of those emails today...
14:04:21 <cdent> #link last bits of traits https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/resource-provider-traits
14:04:27 <alex_xu> jaypipes: :)
14:04:47 <jaypipes> alex_xu: now that the GFWoC is not blocking me from emailing people ;)
14:05:00 <alex_xu> hah
14:05:00 <cdent> bauzas: I agree that mriedem's email provided good coverage
14:06:11 <cdent> bauzas: you reported in the nova channel that your stuff is ready for review, correct?
14:06:28 <bauzas> yup, until the last one about the conductor change
14:06:39 <cdent> #link claims in * https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/placement-claims
14:06:39 <bauzas> I just discovered a bug with ChanceScheduler
14:06:45 <cdent> as ya do
14:06:59 <bauzas> so I provided a bugfix given it's needed for the functional tests
14:07:14 <bauzas> cdent: series starts with https://review.openstack.org/#/c/460177/9
14:07:29 <cdent> #link fixing chance scheduler: https://review.openstack.org/#/c/466725
14:07:30 <bauzas> as we have an alternative series that could be confusing for reviewers
14:07:53 <cdent> #link main series of claims starts at https://review.openstack.org/#/c/460177/
14:08:34 <cdent> so other than doing a bunch of reviewing, anyone have anything else?
14:09:03 <bauzas> I have a question for continuing the series
14:09:13 <cdent> I'll say that there are a fair few changes in last week's rp update email that are not in the mainline series (bug fixes etc.) and could do with some review to get them moving along
14:09:35 <bauzas> now that we agreed on passing the alternative hosts to the compute node for rescheduling needs, should we pass the ReqSpec object to the compute node?
14:09:40 <bauzas> I tend to think so
14:09:56 <bauzas> because it would clean up the compute RPC interface
14:10:05 <bauzas> but it would require a bit more work
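A minimal sketch of the trade-off bauzas raises here, assuming hypothetical signatures: the argument shapes below are illustrative only and are not the actual nova compute RPC API.

```python
# Illustrative only; these signatures are not the real nova compute RPC API.

# Current-style shape: several loosely typed arguments travel over compute RPC,
# and each new piece of scheduling context means another RPC parameter.
def build_and_run_instance_legacy(ctxt, instance, image, request_spec_dict,
                                  filter_properties, alternate_hosts=None):
    ...


# Proposed shape: one versioned RequestSpec object carries the same data, so
# new fields (e.g. alternates) become object changes, not compute RPC changes.
def build_and_run_instance(ctxt, instance, image, request_spec):
    ...
```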
14:10:51 * alex_xu and lei-zh plan to restore the spec https://review.openstack.org/351063, and to work on a PoC
14:11:39 <cdent> bauzas: i thought we decided we'd avoid changing the compute rpc interface?
14:11:53 <bauzas> cdent: for the spec, yes
14:12:17 <cdent> #link request standardized capabilities spec: https://review.openstack.org/#/c/460177/
14:12:29 <cdent> #undo
14:12:30 <openstack> Removing item from minutes: #link https://review.openstack.org/#/c/460177/
14:12:36 <bauzas> cdent: now that we discussed it at the Forum and operators agreed on using alternative hosts, it means we would need to persist the alternatives somewhere instead of passing them through RPC
14:12:50 <cdent> #link request standardized capabilities spec to be restarted: https://review.openstack.org/#/c/351063/
14:13:09 <bauzas> an alternative approach would be to store the alternatives in the Spec object (and persist those) so that a reschedule would look those up by getting the former Spec
14:13:23 <bauzas> that wouldn't require any RPC change
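A minimal sketch of the persist-on-the-Spec option bauzas describes, assuming a hypothetical alternate_hosts field on RequestSpec; get_by_instance_uuid() and save() are the existing-style object calls, but the field itself is not committed API.

```python
# Illustrative sketch of "store the alternates on the Spec and persist them";
# the alternate_hosts field is hypothetical, not an existing RequestSpec field.
from nova import objects


def record_alternates(ctxt, spec, alternates):
    # Conductor writes the scheduler's ordered fallback list alongside the
    # rest of the RequestSpec so any conductor can read it later from the DB.
    spec.alternate_hosts = [h.host for h in alternates]  # hypothetical field
    spec.save()


def pick_next_alternate(ctxt, instance_uuid, failed_host):
    # On reschedule, a (possibly different) conductor reloads the Spec from
    # the DB instead of relying on anything held in conductor memory.
    spec = objects.RequestSpec.get_by_instance_uuid(ctxt, instance_uuid)
    remaining = [h for h in spec.alternate_hosts if h != failed_host]
    return remaining[0] if remaining else None
```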
14:15:00 <cdent> which services need to know about the alternatives?
14:17:05 <cdent> i thought it was just the conductor?
14:18:12 <bauzas> yup, agreed
14:18:45 <bauzas> okay, so, let's see what others think about having the conductor persist the alternatives so reschedules within the conductor will look those up
14:19:34 <cdent> when you say persisted, are you meaning "written to disk so some other process might be able to pick them up later"?
14:19:47 <cdent> if so, let's not do that, if possible
14:20:26 <bauzas> cdent: the problem is that we have distributed conductors
14:20:36 <cdent> i know
14:20:37 <cdent> but we also have a shared data store
14:20:51 <bauzas> cdent: so keeping them only in memory would mean that if we reschedule on another conductor, it doesn't know the former Spec
14:21:14 <cdent> I think it may be better to avoid data duplication at the cost of a slight loss of performance.
14:21:23 <bauzas> cdent: which is ?
14:21:30 <bauzas> cdent: sorry for my ignorance
14:21:42 <bauzas> when I said "persisted", I meant written in DB
14:21:50 <bauzas> but I'm open to ideas
14:22:12 <cdent> correct, but we can already "calculate" that information from the data in placement
14:22:39 <cdent> persisting placement's decisions in the conductor's view of the db is an early optimization (to me)
14:22:51 <bauzas> cdent: unfortunately, it's not a placement information
14:23:08 <cdent> you're talking about the clumping into cell's, right?
14:23:10 <bauzas> cdent: because placement returns all acceptable nodes, and then we pick only a few of them and we sort them
14:23:40 <bauzas> nope, about the fact we pass alternative hosts to the conductor based on filtering/weighting feedback
14:23:48 <bauzas> for placement, all hosts are equal
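A rough sketch of the distinction bauzas is making: placement returns every host that can fit the request, while the ordering and the short list of alternates come from the scheduler's filter/weigh step. The function and helper names below are illustrative, not Nova's actual scheduler code.

```python
# Illustrative only: placement treats every returned host as equally
# acceptable; ordering and the alternates list come from filtering/weighing.

def select_with_alternates(candidates, filters, weighers, num_alternates=3):
    # Filters drop hosts that fail deployment-specific policy.
    hosts = [h for h in candidates if all(f(h) for f in filters)]
    # Weighers impose an ordering that placement knows nothing about.
    hosts.sort(key=lambda h: sum(w(h) for w in weighers), reverse=True)
    # The first host is the target; the next few become reschedule alternates.
    return hosts[0], hosts[1:1 + num_alternates]


# e.g. selected, alternates = select_with_alternates(
#          placement_candidates, [has_enough_disk], [free_ram_weigher])
```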
14:24:38 <cdent> right, but what I'm getting at is that the filtering/weighing (and the additional cell clumping) should be treated as if it is fast
14:24:54 <cdent> if it is not, we should fix that, not come up with additional architectural complexity to compensate for it
14:25:09 <bauzas> cdent: filtering/weighting is fast, yes
14:25:14 <cdent> or, if we do need architectural complexity, we should just leapfrog to having a real global shared cache
14:25:24 <bauzas> cdent: and later that will hopefully be done by the conductor
14:26:23 <cdent> If we reschedule to another conductor, I really don't see that much of a problem with re-creating the result set that it needs
14:27:18 <cdent> or we should at least write that first, and then tune it up later
14:27:18 <bauzas> cdent: meaning calling the scheduler again to get the new list of alternatives?
14:27:24 <cdent> yes
14:27:27 <bauzas> maybe
14:27:41 <bauzas> I see your point
14:27:58 <bauzas> as the claim is done by the conductor, I'm fine with both possibilities
14:28:25 <bauzas> but keep in mind that cell-local conductors can't upcall scheduler
14:28:28 <jaypipes> agree with cdent on this.
14:28:44 <bauzas> which is all the crux of the problem
14:29:30 <bauzas> jaypipes: agreeing on calling the scheduler again? that can't work for the reason I just said :(
14:30:07 <bauzas> mid-term, I see filters and weighers as part of the conductor, so it's fine
14:30:21 <cdent> under what circumstances does a reschedule go to a different conductor?
14:30:35 <bauzas> cdent: in a cells v2 world
14:30:54 <bauzas> cdent: schedule_and_build_instances() is a super-conductor method
14:31:03 <bauzas> cdent: while build_instances() is just purely local
14:31:16 <cdent> yes, I know that much
14:31:32 <cdent> when would a build_instances happen in the same cell, but on a different conductor?
14:32:01 <bauzas> if you have 2 conductors, you can end up on a separate worker, right?
14:32:07 <bauzas> unless I misunderstand your question
14:32:24 <jaypipes> bauzas: I was agreeing on the premature optimization comment by cdent.
14:33:34 <bauzas> anyway, seems we're having a design discussion now and I didn't want to diverge that much
14:33:35 <cdent> sub-conductor A has received a build_instances call that fails for some reason. Now at this point can it be a different conductor that tries to recover from that? If so, who/what is making that decision?
14:34:04 <cdent> bauzas: Thanks for indulging me here, I'm trying to understand the process a bit more clearly because some of these details sometimes get left out.
14:34:18 <cdent> (and anyway, we've got nothing else on the agenda)
14:34:25 <bauzas> cdent: schedule_and_build_instances() calls the compute node through cells v2 MQ switching
14:34:55 <bauzas> cdent: if the compute node fails, it triggers a call to a conductor.build_instances() method which does not MQ-switch back to the global MQ
14:36:21 <cdent> ah, this is the detail that was unclear to me: I was assuming the compute node returned some form of results to the same conductor, but I guess that wouldn't make sense because we want the request and the response to be async
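To capture the flow just described, a rough outline under the names used in the conversation (schedule_and_build_instances at the super-conductor, build_instances cell-local); the bodies and the helpers are placeholders, not the real conductor code.

```python
# Illustrative outline of the flow described above, not the real conductor code.
# The helpers below are placeholders standing in for the scheduler RPC call,
# the cells v2 MQ switch, and the alternate-host lookup being debated here.

def call_scheduler(ctxt, build_request):
    raise NotImplementedError  # placeholder: scheduler select_destinations()

def cast_to_compute(ctxt, host, build_request):
    raise NotImplementedError  # placeholder: target the host's cell MQ and cast

def pick_next_alternate(ctxt, build_request, failed_host):
    raise NotImplementedError  # placeholder: persisted alternates or a re-query


def schedule_and_build_instances(ctxt, build_request):
    # Super-conductor: ask the scheduler for a host, then MQ-switch into the
    # target cell and cast the build to the chosen compute node.
    host = call_scheduler(ctxt, build_request)
    cast_to_compute(ctxt, host, build_request)

def build_instances(ctxt, build_request, failed_host):
    # Cell-local conductor: invoked by the compute node when the build fails.
    # It cannot up-call the scheduler across the cell boundary, which is why
    # the alternates (or a cheap way to recompute them) must be reachable here.
    next_host = pick_next_alternate(ctxt, build_request, failed_host)
    cast_to_compute(ctxt, next_host, build_request)
```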
14:36:46 <bauzas> cdent: I can point you to some code if you like, but let's do that offline
14:37:14 <cdent> I think that's given me enough to do my own digging
14:37:31 <cdent> I just wanted to make sure I had a clearer view of the situation than I did before
14:37:37 <bauzas> okay np
14:38:20 <cdent> anybody got anything else or should we end the meeting?
14:38:26 <bauzas> also, let's state it here, as said in the nova channel: I'll be offline from Tuesday afternoon my time through Friday included
14:42:14 <cdent> everybody good?
14:42:23 <cdent> jaypipes, alex_xu ?
14:42:32 <jaypipes> cdent: yup
14:42:40 <cdent> cool
14:42:54 <cdent> #endmeeting