14:01:57 #startmeeting nova_scheduler
14:01:58 Meeting started Mon May 22 14:01:57 2017 UTC and is due to finish in 60 minutes. The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:59 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:01 The meeting name has been set to 'nova_scheduler'
14:02:21 jaypipes, bauzas anyone else, shall we do this, or is there no need today?
14:02:34 sorry, was emailing ;)
14:02:38 well
14:02:45 good question
14:02:51 edleafe is not here
14:03:00 so I wonder if we really need it
14:03:02 o/
14:03:06 cdent: spent this morning getting all my scheduler-related patches in order (other than the nested resource providers stuff).
14:03:23 #action everyone review all of jaypipes' updated stuff
14:03:26 just one thing we discussed last week was around letting people know what we discussed during Summit
14:03:33 and we postponed it
14:03:47 cdent: so reviews on those series (https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/shared-resources-pike) and https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/resource-provider-traits) would be appreciated.
14:03:48 so, not sure we would really need that given mriedem's emails
14:04:01 * alex_xu reads some recap email, that is helpful
14:04:09 #link shared rps https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/shared-resources-pike
14:04:15 alex_xu: indeed. I'm reading a bunch of those emails today...
14:04:21 #link last bits of traits https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/resource-provider-traits
14:04:27 jaypipes: :)
14:04:47 alex_xu: now that the GFWoC is not blocking me from emailing people ;)
14:05:00 hah
14:05:00 bauzas: I agree that mriedem's email provided good coverage
14:06:11 bauzas: you reported in the nova channel that your stuff is ready for review, correct?
14:06:28 yup, until the last one about the conductor change
14:06:39 #link claims in * https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/placement-claims
14:06:39 I just discovered a bug with ChanceScheduler
14:06:45 as ya do
14:06:59 so I provided a bugfix given it's needed for functional tests
14:07:14 cdent: series starts with https://review.openstack.org/#/c/460177/9
14:07:29 #link fixing chance scheduler: https://review.openstack.org/#/c/466725
14:07:30 as we have an alternative series possibly confusing for reviews
14:07:53 #link main series of claims starts at https://review.openstack.org/#/c/460177/
14:08:34 so other than doing a bunch of reviewing, does anyone have anything else?
14:09:03 I have a question about continuing the series
14:09:13 I'll say that there are a fair few changes in last week's rp update email message that are not in the mainline of stuff (bug fixes etc.) that could do with some review to get them moved along
14:09:35 now that we agreed on passing the alternative hosts to the compute node for rescheduling needs, should we pass the ReqSpec object to the compute node?
14:09:40 I tend to think so
14:09:56 because it would clean up the compute RPC interface
14:10:05 but it would require a bit more work
14:10:51 * alex_xu and lei-zh plan to restore the spec https://review.openstack.org/351063, and plan some PoC
14:11:39 bauzas: i thought we decided we'd avoid changing the compute rpc interface?
14:11:53 cdent: for the spec, yes
14:12:17 #link request standardized capabilities spec: https://review.openstack.org/#/c/460177/
14:12:29 #undo
14:12:30 Removing item from minutes: #link https://review.openstack.org/#/c/460177/
14:12:36 cdent: now that we discussed at the Forum and operators agreed on using alternative hosts, it would mean that we would need to persist the alternatives somewhere instead of passing them through RPC
14:12:50 #link request standardized capabilities spec to be restarted: https://review.openstack.org/#/c/351063/
14:13:09 an alternative approach would be to store the alternatives in the Spec object (and persist those) so that a reschedule would look up those by getting the former Spec
14:13:23 that wouldn't require any RPC change
14:15:00 which services need to know about the alternatives?
14:17:05 i thought it was just the conductor?
14:18:12 yup, agreed
14:18:45 okay, so, let's see what others think about persisting the alternatives by the conductor so reschedules within the conductor will look up those
14:19:34 when you say persisted, do you mean "written to disk so some other process might be able to pick them up later"?
14:19:47 if so, let's not do that, if possible
14:20:26 cdent: the problem is that we have distributed conductors
14:20:36 i know
14:20:37 but we also have a shared data store
14:20:51 cdent: so persisting in memory would mean that if we reschedule to another conductor, it doesn't know the former Spec
14:21:14 I think it may be better to avoid data duplication at the cost of a slight loss of performance.
14:21:23 cdent: which is?
14:21:30 cdent: sorry for my ignorance
14:21:42 when I said "persisted", I meant written to the DB
14:21:50 but I'm open to ideas
14:22:12 correct, but we can already "calculate" that information from the data in placement
14:22:39 persisting placement's decisions in the conductor's view of the db is an early optimization (to me)
14:22:51 cdent: unfortunately, it's not placement information
14:23:08 you're talking about the clumping into cells, right?
14:23:10 cdent: because placement returns all acceptable nodes, and then we pick only a few of them and we sort them
14:23:40 nope, about the fact that we pass alternative hosts to the conductor based on filtering/weighting feedback
14:23:48 for placement, all hosts are equal
14:24:38 right, but what I'm getting at is that that filtering/weighing (and the additional cell clumping) should be treated as if it is fast
14:24:54 if it is not, we should fix that, not come up with additional architectural complexity to compensate for it
14:25:09 cdent: filtering/weighting is fast, yes
14:25:14 or, if we do need architectural complexity, we should just leapfrog to having a real global shared cache
14:25:24 cdent: and later it will be something done by the conductor, hopefully
14:26:23 If we reschedule to another conductor, I really don't see that much of a problem with re-creating the result set that it needs
14:27:18 or we should at least write that first, and then tune it up later
14:27:18 cdent: meaning calling the scheduler again to get the new list of alternatives?
14:27:24 yes
14:27:27 maybe
14:27:41 I see your point
14:27:58 as the claim is done by the conductor, I'm fine with both possibilities
14:28:25 but keep in mind that cell-local conductors can't upcall the scheduler
14:28:28 agree with cdent on this.
14:28:44 which is the crux of the problem
14:29:30 jaypipes: agreeing on calling the scheduler again?
that can't work for the reason I just said :(
14:30:07 mid-term, I see filters and weighers as part of the conductor, so it's fine
14:30:21 under what circumstances does a reschedule go to a different conductor?
14:30:35 cdent: in a cells v2 world
14:30:54 cdent: schedule_and_build_instances() is a super-conductor method
14:31:03 cdent: while build_instances() is just purely local
14:31:16 yes, I know that much
14:31:32 when would a build_instances happen in the same cell, but on a different conductor?
14:32:01 if you have 2 conductors, you can end up on a separate worker, right?
14:32:07 unless I misunderstand your question
14:32:24 bauzas: I was agreeing with the premature optimization comment by cdent.
14:33:34 anyway, it seems we're having a design discussion now and I didn't want to diverge that much
14:33:35 sub-conductor A has received a build_instances call that fails for some reason. At this point, can it be a different conductor that tries to recover from that? If so, who/what is making that decision?
14:34:04 bauzas: Thanks for indulging me here, I'm trying to understand the process a bit more clearly because some of these details sometimes get left out.
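[Editor's sketch] The "store the alternatives in the Spec object so any conductor worker can look them up on reschedule" idea discussed above can be modeled roughly as follows. This is a toy illustration, not actual Nova code: the names (RequestSpec, save_spec, load_spec, reschedule) and the in-memory dict standing in for the shared database are all assumptions made for the example.

```python
# Toy model of persisting alternate hosts with the request spec so a
# reschedule handled by any conductor can consume the next alternate
# without calling the scheduler again. NOT actual Nova code.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class RequestSpec:
    instance_uuid: str
    # Ordered alternates produced by filtering/weighing; persisted with
    # the spec so distributed conductors share the same view.
    alternates: List[str] = field(default_factory=list)

# Stand-in for the shared data store (Nova would use its database).
_SPEC_STORE = {}

def save_spec(spec: RequestSpec) -> None:
    _SPEC_STORE[spec.instance_uuid] = spec

def load_spec(instance_uuid: str) -> RequestSpec:
    return _SPEC_STORE[instance_uuid]

def reschedule(instance_uuid: str) -> Optional[str]:
    """Return the next alternate host for a failed build, or None."""
    spec = load_spec(instance_uuid)
    if not spec.alternates:
        return None  # alternates exhausted; build fails for real
    return spec.alternates.pop(0)
```

Because every lookup goes through the shared store, it does not matter which conductor worker handles the reschedule — which is the property bauzas is after, at the cost of the data duplication cdent flags.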
14:34:18 (and anyway, we've got nothing else on the agenda)
14:34:25 cdent: schedule_and_build_instances() calls the compute node through cells V2 MQ switching
14:34:55 cdent: if the compute node fails, it triggers a call to a conductor.build_instances() method which is not MQ switching to the global MQ
14:36:21 ah, this is the detail that was unclear to me: I was assuming the compute node returned some form of results to the same conductor, but I guess that wouldn't make sense because we want the request and the response to be async
14:36:46 cdent: I can point you to some code if you like, but let's do that offline
14:37:14 I think that's given me enough to do my own digging
14:37:31 I just wanted to make sure I had a clearer view of the situation than I did before
14:37:37 okay np
14:38:20 anybody got anything else, or should we end the meeting?
14:38:26 also, let's state it here, as said in the nova channel: I'll be offline from Tuesday afternoon my time to Friday included
14:42:14 everybody good?
14:42:23 jaypipes, alex_xu ?
14:42:32 cdent: yup
14:42:40 cool
14:42:54 #endmeeting