13:59:56 #startmeeting nova-scheduler
13:59:57 Meeting started Mon Feb 29 13:59:56 2016 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:58 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:01 The meeting name has been set to 'nova_scheduler'
14:00:06 \o
14:00:08 anyone here to talk about the scheduler?
14:00:10 o/
14:00:23 \o
14:00:47 n0ano: you're early today, you beated me at the clock
14:01:07 bauzas, yeah, I beat the top of the hour by about 10 sec :-)
14:01:33 * johnthetubaguy lurking
14:01:50 oh, beat, beat, beat, irregular verb
14:01:54 well, the usual suspects are here so let's get started
14:02:11 #topic Patches/Reviews – https://etherpad.openstack.org/p/mitaka-nova-priorities-tracking
14:02:29 beat, beat, beaten even
14:02:33 I see I go away for a week and you guys have a major discussion on this
14:02:47 cdent: around ?
14:02:51 o/
14:02:53 anything to continue this week or are we good?
14:03:11 so yeah, there's the resource-provider/pool stuff
14:03:14 n0ano: we're closing M-3 this week, so we are going to sprint merging some of cdent's teeth
14:03:27 starts here: https://review.openstack.org/#/c/281837/
14:03:41 there are some questions/issues that need to be resolved
14:04:04 cdent, are they minor enough they can be closed in 1 week?
14:04:08 cdent: I missed the unresolved questions ?
14:04:11 yes, I think so
14:04:30 bauzas: one of them is my last comment on that review
14:04:48 the others are the several questions in the commit message on: https://review.openstack.org/#/c/284963/
14:04:49 oh missed that one
14:05:09 me too (took the weekend off)
14:05:09 that comment (about the constraint) was discovered as a result of the work on the latter review
14:05:21 cdent: can I ask you some questions about the resource pool after the meeting?
14:06:08 Yingxin: of course, and I'll try to answer, but I'm not sure I will have all the answers.
My ignorance is why I'm hoping for feedback from bauzas, johnthetubaguy, dansmith and jaypipes
14:06:39 cdent: so, I was on PTO last Friday
14:06:42 * johnthetubaguy hopes to hit those ones soon
14:06:55 cdent: so I missed the problem, could you please rephrase it ?
14:07:08 I can see it's an UC problem, right?
14:07:57 bauzas: Yeah. It's possible, as the database is currently constructed, to create more than one inventory for the same resource-provider/resource-class pair.
14:08:09 According to the comments on the resource-pools spec this is not desired
14:08:16 cdent: for the moment, nothing is populating that table, right?
14:08:26 Or at least the comments can be interpreted that way
14:08:37 bauzas: only tests
14:08:54 (that I'm aware of, but some of the patches in that stack I don't know)
14:09:41 cdent: so, let's be clear, we're missing this https://review.openstack.org/#/c/283253/ at least ?
14:10:12 in talking with dansmith we decided that if we did need to add another migration it would be better to do it in the existing one that hasn't merged yet, not add another
14:10:16 cdent: I don't see a discussion around UCs in there
14:10:25 cdent: and I agree with him
14:10:33 bauzas: that's because I only discovered it late Friday
14:10:52 cdent: okay, gotcha
14:11:09 cdent: so, what's the rationale behind needing an UC ? sorry, missing context obviously
14:11:22 is that point being persisted in a gerrit comment ?
14:11:33 or is this an IRC convo ?
14:11:41 I need to understand *why* we need an UC
14:11:54 and if so, we could quickly iterate on that
14:11:55 look at the conversation between Roman and Jay at the end of this: https://review.openstack.org/#/c/253187/
14:11:57 what is a UC in this context?
14:12:04 unique constraint
14:12:19 ah, doh, gotcha
14:12:58 cdent: oooooh, all of the discussion was not in files, just general messages, missed that :/
14:13:00 so setting aside the technical terms for a moment: the question is: "Can a resource provider have more than one inventory for the same resource class?"
14:13:13 gotcha
14:13:33 and that's a good point :-)
14:14:04 I'm pretty sure the answer has to be no, otherwise the structure for accounting for things falls apart
14:14:19 as it is hard to know which inventory an allocation would be associated with
14:14:39 not sure why you would `need` multiple inventories anyway
14:14:45 yeah, that has to be a no, I feel
14:15:13 <_gryf> jay has explained that matter on IRC to me
14:15:17 n0ano: there was some discussion of a provider having two different batches of the same thing
14:15:29 Seems that while it might be physically possible, it isn't something that we should encourage
14:15:37 edleafe++
14:15:49 cdent: +1 for a no, that's one of the motivations we had for having resource-providers, just because we don't have a clear way of having consistent resources
14:15:56 cdent, I'd want a stronger justification than a discussion, more like a valid use case
14:16:03 <_gryf> so the answer is no in this case - if there is a need to have another resource of the same class, then another pool will be created
14:16:11 _gryf: yeah
14:16:13 n0ano: and not just a *possible* use case
14:16:16 I'm happy to barrel forward, but I'm a bit cautious as there's a _lot_ of work I've done for this stack which has been discovered to be off track well past the time I did it, so I'm a bit...shy and frustrated?
14:16:21 edleafe, +1
14:16:47 so we shouldn't care about "possible" usecases
14:16:56 we should care about what's in the spec
14:16:56 k, I'll go ahead and put the constraint on the existing migration
14:17:00 cdent, don't get too discouraged, that's just the nature of changing complex systems
14:17:27 n0ano: I'm familiar.
It's the feedback latency that has me frustrated, not the complexity of the systems.
14:17:34 cdent: ping me when you're done with https://review.openstack.org/#/c/281837/
14:17:39 so dropping a unique constraint is a non-impacting DB change I am guessing?
14:17:51 cdent, indeed
14:17:51 johnthetubaguy: we're adding
14:17:58 so this is adding
14:18:05 cdent, that was one of the motivations for specs: so we could agree on the big picture before writing code
14:18:05 but to a table that doesn't have anything in it
14:18:06 I am thinking if we find we need to drop it
14:18:08 that's easy I think?
14:18:10 johnthetubaguy: yeah, it's just a matter of amending https://review.openstack.org/#/c/281837/ with adding an UC on an empty table
14:18:14 johnthetubaguy: yes
14:18:20 cool, so let's just do it
14:18:22 the freeze is making us write code before the specs have been settled
14:18:30 johnthetubaguy: that https://review.openstack.org/#/c/281837/ is already for adjusting the model
14:19:02 johnthetubaguy: with https://review.openstack.org/#/c/283253/ reflecting that on nova-specs
14:19:20 (so for example, explaining the 'generation' field)
14:19:29 so the spec is probably too detailed for some of these
14:19:42 johnthetubaguy++
14:19:44 the large brush strokes agreements we made in person, so I am OK with how we are going here
14:20:02 normally that's a spec merge, but let's just do what's needed at this point
14:20:43 So I'm good to go on the migration, but there's plenty of other stuff in that stack that needs feedback, pronto, if we want to get it merged. Most of it is model or object changes so stuff we want in M
14:21:40 right, so which is the last change we need in this cycle, so we have all the migrations complete?
14:21:47 bauzas: 'generation' looks like a field to force atomic transactions and resolve races between schedulers' claims.
14:22:40 johnthetubaguy: I'm not entirely certain. I don't really understand the rules that impact what limits things.
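
Note: the unique constraint agreed on above (at most one inventory per resource-provider/resource-class pair) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `inventories` table layout, the column names, and the helper functions are hypothetical and are not taken from the Nova migration under review; SQLite is used only so the example is self-contained.

```python
import sqlite3


def make_inventory_table(conn):
    """Create a hypothetical inventories table (not Nova's real schema)."""
    conn.execute(
        """
        CREATE TABLE inventories (
            id INTEGER PRIMARY KEY,
            resource_provider_id INTEGER NOT NULL,
            resource_class_id INTEGER NOT NULL,
            total INTEGER NOT NULL,
            -- The unique constraint: one inventory per provider/class pair.
            UNIQUE (resource_provider_id, resource_class_id)
        )
        """
    )


def add_inventory(conn, provider_id, class_id, total):
    """Insert an inventory row.

    Returns True on success, False if the provider already has an
    inventory for this resource class.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO inventories "
                "(resource_provider_id, resource_class_id, total) "
                "VALUES (?, ?, ?)",
                (provider_id, class_id, total),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With the constraint in place, an allocation can always be tied to exactly one inventory, which is the accounting concern raised in the discussion. Adding it while the table is still empty avoids having to deduplicate existing rows.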
14:23:02 I'm pretty sure it's something very close to "all of it"
14:23:02 it's really about upgrade
14:23:07 Yingxin: it's rather because of a compare-and-update model in a transactional model
14:23:20 we need those data migrations and schema migrations in place I think
14:23:27 the data ones being the uuid generation on compute node, I think
14:23:30 because that stuff leaves out all the actually doing stuff with resource providers in favor of getting just the groundwork in place
14:23:44 cdent: johnthetubaguy: so, the thing is, we need computes to update things in DB to get that to not require an online migration for Newton
14:24:05 cdent: johnthetubaguy: not only the model to be present AFAIK
14:24:16 the uuid is the only key bit, really
14:24:21 not only the model to be /up-to-date/ rather
14:24:22 migration wise
14:24:27 bauzas, johnthetubaguy If you guys review that entire stack I think it will be more clear which is required
14:24:33 we can read from old and new location
14:24:40 (even if it is just a light drive-by review)
14:24:42 cdent: yeah, that's probably quicker at this point
14:24:49 johnthetubaguy++
14:25:41 so, have we beat this to death and just review the patch series is the next step?
14:26:04 I think so, and those with particular interest/investment can help determine which is the required stuff
14:26:20 and also help me figure out the ResourcePool object
14:26:21 cdent: well, I was seeing https://review.openstack.org/#/c/279313/ as a needed merge for preventing data online migrations in Newton, unless I'm wrong ?
14:27:03 we also need all the objects, don't we, otherwise actually using them is delayed by an additional cycle?
14:27:05 because if old computes don't store that information new-way-ish, then we need to do an online migration when calling that
14:27:31 yes, we do need 279313
14:28:13 cdent, tbc we need object remotable methods as well, yes
14:29:51 OK, moving on
14:29:57 #topic Bugs - https://bugs.launchpad.net/nova/+bugs?field.tag=scheduler
14:30:22 so bauzas cdent: let's catch up about https://review.openstack.org/#/c/279313 I think that's the most important thing we need
14:30:23 I believe we've actually reduced the bug count by about 4, we're down to `only` 35 bugs
14:31:09 I don't know who closed a few but that's good news
14:31:22 johnthetubaguy: that's the last patch of the branch, yes
14:32:33 are there any specific bugs anyone is concerned about?
14:33:26 if not, let's move on
14:33:33 #topic opens
14:34:03 I was modeling my design based on Jay Pipes' benchmarking tool. The work is nearly done.
14:34:04 actually, I do have a slightly personal matter to tell everyone about...
14:34:41 after a `long` career I have decided to retire as of April this year...
14:35:19 n0ano: we'll miss you, but that's awesome news for you!
14:35:20 Do you want to be congratulated or consoled? :)
14:35:28 I'll be doing other things, hopefully involved in open source, but it looks like I'll be dropping off of the OpenStack project
14:35:37 cdent, yes :-)
14:36:11 it's been a pleasure to work with you all but you'll have to find someone else to manage you guys :-)
14:36:20 n0ano: :) I've just started my open source career.
14:36:40 Yingxin, may it be long
14:36:59 I volunteer jaypipes - he doesn't have very much going on :)
14:37:11 edleafe, +1
14:38:08 n0ano: oh, pleasure as well
14:38:26 so, winding down, anything else for today?
14:38:41 n0ano: yes, I have something to bring up
14:38:49 mlavalle, go ahead
14:40:08 mlavalle, you there?
14:40:36 (probably typing an essay)
14:41:04 edleafe, I try and use vi and then paste in that situation but to each their own :-)
14:41:32 n0ano: I would like the team to take a look at this spec we are working on for Newton: https://review.openstack.org/#/c/263898/
14:41:33 n0ano: it complements an effort we will be conducting on the Neutron side to implement routed networks
14:41:33 n0ano: this has impact on the nove scheduler
14:41:33 nova scheduler^^^^
14:41:33 it's not urgent, but I would like to start giving it visibility in this meeting
14:41:33 n0ano: so when the team has time, I encourage everybody to give us feedback on it
14:42:24 mlavalle, NP, just realize that, given the schedule pressures, it might be a little while before we can give you good feedback
14:42:48 mlavalle: ack
14:42:49 mlavalle, but tnx for pointing this out and we'll add it to our queues
14:43:19 mlavalle: was just going to say the same as n0ano. I've starred it, but don't be shy about pinging us with reminders when Newton opens up
14:43:53 OK, anything else?
14:44:02 By the way, the modeling work is at https://github.com/cyx1231st/placement-bench/tree/shared-state-demonstration
14:44:05 mmm, it's a backlog spec, we can literally review it when we want
14:44:41 bauzas: sure, but time is the limit here
14:45:20 n0ano: I know, I just want to start early drawing attention to it
14:45:20 :-)
14:45:20 that's all I have. Thanks
14:45:33 mlavalle, NP
14:46:26 edleafe: sure, just wanted to explain it's just a matter of finding time, not a release process issue :)
14:47:25 OK, anything else?
14:47:31 nosir
14:48:07 then let's close for today and I want to thank everyone, it's been a real trip
14:48:18 #endmeeting