15:01:02 <n0ano> #startmeeting gantt
15:01:03 <openstack> Meeting started Tue Sep 2 15:01:02 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:07 <openstack> The meeting name has been set to 'gantt'
15:01:13 <n0ano> anyone here to talk about the scheduler?
15:01:18 <bauzas_> \o
15:01:21 <mspreitz> yes
15:02:04 <n0ano> bauzas, you're supposed to catching up on sleep since you don't get any for about the next 6 months :-)
15:02:22 <bauzas> n0ano: eh
15:02:36 <bauzas> n0ano: last night was pretty good for me, not for my wife ;)
15:03:07 <n0ano> in my family bad night for one pretty much meant a bod night for the other
15:03:40 <bauzas> but just coming back from 3 days PTO :)
15:03:45 <bauzas> n0ano: :)
15:03:49 <n0ano> anyway, let's get started...
15:04:01 <bauzas> jaypipes seems to be not around
15:04:03 <n0ano> occurs to me I put the items in the wrong order so let's start with
15:04:08 <n0ano> #topic next steps
15:04:13 <bauzas> eh eh
15:04:31 <n0ano> as everyone should be aware the isolate scheduler DB bp was rejected for Juno...
15:04:52 <n0ano> I've started a thread to address what I think are problems with the current approval process...
15:04:54 <bauzas> n0ano: yeah, thanks for raising up the discussion in the list
15:05:18 <n0ano> ignoring that, what do we do to keep momentum for gantt going...
15:05:19 <bauzas> n0ano: as I'm the proposer, I was expecting to not reply on that specific problem
15:06:02 <bauzas> n0ano: here, the problem was about the -2 going one week before FPF
15:06:24 <n0ano> my thought is that this is a bump but not a major blockage, as soon as Kilo opens up we will just re-propose the BP, hopefully get it approved as people have more time to think about it, and we still do the split early in the Kilo cycle
15:06:32 <bauzas> n0ano: so I guess that discussion is related to all the ones about how we can improve design summits, reviewing slots etc.
15:06:51 <bauzas> n0ano: well, that's not that easy
15:06:59 <bauzas> n0ano: the concerns were good
15:07:12 <bauzas> n0ano: I mean, we were having a dependency on ERT
15:07:25 <bauzas> n0ano: and the current interface is not so good
15:07:40 <bauzas> n0ano: so, I'm thinking about proposing a new iteration as we discussed last week
15:07:49 <n0ano> bauzas, I'm hoping ERT will be resolved this week and then we can fix our BP, with or without ERT as appropriate
15:07:58 <bauzas> n0ano: oh, you didn't catch them
15:08:23 <bauzas> n0ano: ERT revert patch is abandoned, but Scheduler related one has been -2 today
15:08:30 <n0ano> I was kind of fixated on the BP review process, what did I miss
15:08:43 <bauzas> n0ano: and today is FF, so that means the scheduler side won't be there until Kilo opens
15:09:01 <bauzas> n0ano: one sec, giving you the reviews
15:09:15 * bauzas is coming back from vacation but still on page, eh ;)
15:09:23 <n0ano> bauzas, that's a given so, as I said, we get prepared and re-propose our BP as soon as Kilo opens
15:10:10 <bauzas> https://review.openstack.org/#/c/61773/ ERT scheduler side has been -2 for Junoi
15:10:30 <bauzas> n0ano: that's what I'm saying, I think we need to look at other ways to do the bp
15:10:53 <n0ano> bauzas, question is, what are the odds that they will re-propose ERT?
15:11:05 <bauzas> n0ano: for both the reasons that ERT won't be there when Kilo opens, and also because the concerns were good
15:11:30 <bauzas> n0ano: I guess Paul will ask for a FFE or resubmit it for Kilo
15:11:40 <bauzas> n0ano: but I can't speak on behalf of it
15:12:08 <n0ano> so the question is should we wait for Paul to resolve ERT or should we just work around it
15:12:12 <bauzas> s/it/him (Paul is not an object, although he's working on some patches about them...)
15:12:29 <bauzas> n0ano: as we discussed last week, I will propose another way
15:12:45 <bauzas> n0ano: the idea is to pass Objects from RT to Sched
15:12:54 <n0ano> personally, I would prefer to not wait, I would prefer to work around it and, if need be, redo things when/if ERT is finalized.
15:13:03 <n0ano> bauzas, that works for me
15:13:21 <bauzas> n0ano: well, the design is really different if we don't go with ERT
15:13:35 <bauzas> n0ano: and IMHO, that's more "resilient" in terms of community approval :)
15:13:55 <n0ano> yeah but I don't feel we should wait on it, deal with ERT only when it's finalized
15:14:23 <bauzas> n0ano: agreed that's top prio, but we need to sort some things previously
15:15:28 <bauzas> n0ano: seriously, the current way is pretty crap...
15:15:52 <n0ano> so, if I understand, you're going to update the current BP and then we can look at implementing based upon that new design - right?
15:16:05 <bauzas> n0ano: yeah
15:16:18 <bauzas> n0ano: I was just expecting more support from others than just you and me
15:16:21 <bauzas> :)
15:16:28 <bauzas> n0ano: ie. I need to discuss with jay first
15:16:35 <n0ano> #action bauzas to update the isolate scheduler DB BP
15:16:42 <bauzas> thanks :)
15:17:02 <bauzas> will mark it as WIP, until Kilo spec template is there
15:17:20 <bauzas> because there are chances that the template will change
15:17:28 <n0ano> we're in agreement, getting some sort of concensus from Jay would be good but I'm willing to push through others if that's a problem.
15:17:36 <bauzas> n0ano: indeed
15:17:53 <bauzas> n0ano: hence I'm waiting to see what will be the Kilo process for blueprints
15:18:05 <n0ano> bauzas, I doubt the template will change dramitically, just a tweak or two
15:18:07 <bauzas> n0ano: because atm that's unclear
15:18:36 <n0ano> s/dramit/dramat
15:18:36 <bauzas> n0ano: I'm not really waiting for the template, I'm waiting for big decisions like runways and design summit reboot
15:18:43 * n0ano needs to work on spellin skills
15:18:57 * n0ano and typing skills
15:19:11 <bauzas> n0ano: the design summit format will change for Paris
15:19:41 <bauzas> n0ano: and the cores's reviewing process will probably change for Kilo
15:20:02 <n0ano> the format will change but we will still push for gantt no matter what the procedure we have to follow
15:20:50 <n0ano> as I've said in the past, no one has objected to gantt, the only issues have been how & when
15:20:53 <bauzas> n0ano: of course, but coming from a patch where 54 iterations were necessary for having merged it, I can just say that sometimes, you need to have visibility before doing anything
15:21:21 <n0ano> bauzas, completely agree (don't know whether to laugh or cry over that)
15:21:24 <bauzas> n0ano: the "how" is sometimes requiring more than 3 months for getting it done
15:21:51 <n0ano> remember, I started the gantt work over a year ago and we're still debating it
15:21:54 <bauzas> n0ano: the golden rule for Nova is patience
15:22:15 <bauzas> n0ano: indeed, I was sitting 2 rows behind you in HKG :)
15:22:39 <n0ano> HKG, I started this before Portland :-)
15:22:54 <bauzas> then that's not 1 year... :)
15:23:35 <bauzas> anyway, reviewing the NUMA patches was pretty worth it
15:23:45 <bauzas> I now totally understand ndipanov's concerns
15:24:00 <n0ano> Oh, you're right, I can't count, it was after Grizzly I started (I think, too long ago)
15:24:18 <mspreitz_> bauzas: can you explain briefly? I have not been following the NUMA stuff
15:24:22 <bauzas> atm, RT is pretty good for consuming CPUs and memory, but very bad at counting more complex objects
15:24:37 <n0ano> anyway, bottom line is you get to redo the BP and I get to needle the powers that be over review process
15:24:48 <bauzas> mspreitz_: well, the problem is really about details
15:24:56 <bauzas> mspreitz_: like what you get is not typed
15:25:08 <bauzas> mspreitz_: so you have to doublecheck what you get
15:25:26 <bauzas> mspreitz_: you also have to do some extra calls for getting some info
15:25:31 <mspreitz_> bauzas: are you explaining the NUMA concerns?
15:25:49 <bauzas> mspreitz_: yeah, all the fixme stuff these folks were doing
15:26:10 <mspreitz_> OK, thanks.
15:26:28 <bauzas> mspreitz_: I can just summarize that the problem is that you're providing non-typed and non-validated dictionaries
15:26:53 <bauzas> mspreitz_: and based on where you look at these bits, they are either serialized or not
15:26:55 <mspreitz_> I have to admit I have not been following the details, since it became clear a while ago that it would take a long time to get progress here
15:27:14 <bauzas> mspreitz_: seriously, I understand
15:27:41 <bauzas> mspreitz_: here the problem with isolate-sched-db is that we're not counting cpus or ram, but apples and bananas
15:27:59 <bauzas> mspreitz_: I mean, real complex objects
15:28:16 <bauzas> so we kinda need some formalization
15:28:54 <bauzas> and as I said last week, we *already* have the toolbox for it, that's called.... objects
15:29:33 <mspreitz_> thanks
15:29:38 <bauzas> anyway, I'll have to leave earlier today, we can maybe move on ?
15:29:49 <mspreitz_> yes
15:29:54 <n0ano> bauzas, fer sur
15:30:02 <n0ano> #topic opens
15:30:13 <n0ano> anyone have anything new for today?
15:30:17 <bauzas> I'm glad to say scheduler-lib is merged
15:30:26 <n0ano> +1
15:30:32 * bauzas silently says hurrah
15:30:47 * n0ano vocally says hurrah
15:31:03 <bauzas> that will ease the next steps discussed previously
15:31:44 <n0ano> anything else?
15:31:50 <bauzas> yup
15:32:11 <bauzas> https://review.openstack.org/#/c/117042/ that's something prerequisite for the work
15:32:28 <bauzas> the idea discussed previously is to provide ComputeNode objects to Scheduler
15:32:47 <bauzas> but unfortunately, ComputeNode is having a FK on Services for the bad or the good
15:33:07 <bauzas> so we need to do some prework about that, and this patch is doing that
15:33:42 <bauzas> because that makes no sense to give to Scheduler something having a dependency on anything else
15:33:46 <n0ano> how does this relate to the change to not send those updates unless something changes, there's no longer an update every 60 secons
15:34:16 <bauzas> n0ano: this is not related, here is the patch is about creating the DB entry at startup
15:34:35 <bauzas> but the updates are still done the same way, ie. checking if updated or not
15:35:34 <bauzas> n0ano: the problem was that we were looking the existence of an object every 60 secs
15:35:56 <bauzas> n0ano: your check is not before that, but after IIRC
15:36:09 <bauzas> n0ano: ie. before updating the DB
15:36:20 <n0ano> I though the DB entry was already created at startup
15:36:28 <bauzas> n0ano: nope
15:36:34 <bauzas> n0ano: not exactly
15:37:01 <bauzas> n0ano: the update_avail_resource method was called when nova-compute was booting
15:37:16 <bauzas> n0ano: because of a post-hook mechanism
15:37:38 <bauzas> n0ano: but the check was still done every 60 secs
15:38:01 <bauzas> n0ano: here we refactor where we look at services table, that's really helpful
15:38:37 <n0ano> I have to study the code, I don't understand, let me try and understand this first
15:39:14 <bauzas> n0ano: sure
15:39:19 <bauzas> that's it for me
15:39:36 <n0ano> OK, unless there's anything else
15:40:03 <n0ano> I'll thank everyone and we'll talk next week.
15:40:08 <n0ano> #endmeeting
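
Illustration for the 15:26-15:28 exchange, where the data handed to the scheduler is described as non-typed, non-validated dictionaries and "objects" are named as the existing toolbox. The Python sketch below only shows that general contrast. `NodeResources`, `from_untyped` and the two fields are hypothetical names chosen for the example; this is not Nova's ComputeNode object and not the design proposed in the blueprint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeResources:
    """Hypothetical typed view of what a compute node reports."""
    vcpus: int
    memory_mb: int

    def __post_init__(self):
        for name in ("vcpus", "memory_mb"):
            value = getattr(self, name)
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(
                    f"{name} must be an int, got {type(value).__name__}")
            if value < 0:
                raise ValueError(f"{name} must be non-negative")


def from_untyped(raw: dict) -> NodeResources:
    """Validate once at the boundary instead of in every consumer."""
    return NodeResources(vcpus=raw["vcpus"], memory_mb=raw["memory_mb"])


# A plain dict accepts anything; a wrong type surfaces only when it is used.
raw_ok = {"vcpus": 8, "memory_mb": 16384}
raw_bad = {"vcpus": "8", "memory_mb": 16384}  # a string slipped in

node = from_untyped(raw_ok)

try:
    from_untyped(raw_bad)
    rejected = False
except TypeError:
    rejected = True
```

With a typed object, the consumer no longer has to double-check each field, which is the "you have to doublecheck what you get" complaint raised at 15:25.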