15:01:02 <n0ano> #startmeeting gantt
15:01:03 <openstack> Meeting started Tue Sep  2 15:01:02 2014 UTC and is due to finish in 60 minutes.  The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:07 <openstack> The meeting name has been set to 'gantt'
15:01:13 <n0ano> anyone here to talk about the scheduler?
15:01:18 <bauzas_> \o
15:01:21 <mspreitz> yes
15:02:04 <n0ano> bauzas, you're supposed to be catching up on sleep since you won't get any for about the next 6 months :-)
15:02:22 <bauzas> n0ano: eh
15:02:36 <bauzas> n0ano: last night was pretty good for me, not for my wife ;)
15:03:07 <n0ano> in my family a bad night for one pretty much meant a bad night for the other
15:03:40 <bauzas> but just coming back from 3 days PTO :)
15:03:45 <bauzas> n0ano: :)
15:03:49 <n0ano> anyway, let's get started...
15:04:01 <bauzas> jaypipes seems to be not around
15:04:03 <n0ano> occurs to me I put the items in the wrong order so let's start with
15:04:08 <n0ano> #topic next steps
15:04:13 <bauzas> eh eh
15:04:31 <n0ano> as everyone should be aware the isolate scheduler DB bp was rejected for Juno...
15:04:52 <n0ano> I've started a thread to address what I think are problems with the current approval process...
15:04:54 <bauzas> n0ano: yeah, thanks for raising up the discussion in the list
15:05:18 <n0ano> ignoring that, what do we do to keep momentum for gantt going...
15:05:19 <bauzas> n0ano: as I'm the proposer, I was expecting to not reply on that specific problem
15:06:02 <bauzas> n0ano: here, the problem was about the -2 going one week before FPF
15:06:24 <n0ano> my thought is that this is a bump but not a major blockage, as soon as Kilo opens up we will just re-propose the BP, hopefully get it approved as people have more time to think about it, and we still do the split early in the Kilo cycle
15:06:32 <bauzas> n0ano: so I guess that discussion is related to all the ones about how we can improve design summits, reviewing slots etc.
15:06:51 <bauzas> n0ano: well, that's not that easy
15:06:59 <bauzas> n0ano: the concerns were good
15:07:12 <bauzas> n0ano: I mean, we were having a dependency on ERT
15:07:25 <bauzas> n0ano: and the current interface is not so good
15:07:40 <bauzas> n0ano: so, I'm thinking about proposing a new iteration as we discussed last week
15:07:49 <n0ano> bauzas, I'm hoping ERT will be resolved this week and then we can fix our BP, with or without ERT as appropriate
15:07:58 <bauzas> n0ano: oh, you didn't catch them
15:08:23 <bauzas> n0ano: ERT revert patch is abandoned, but Scheduler related one has been -2 today
15:08:30 <n0ano> I was kind of fixated on the BP review process, what did I miss
15:08:43 <bauzas> n0ano: and today is FF, so that means the scheduler side won't be there until Kilo opens
15:09:01 <bauzas> n0ano: one sec, giving you the reviews
15:09:15 * bauzas is coming back from vacation but still on page, eh ;)
15:09:23 <n0ano> bauzas, that's a given so, as I said, we get prepared and re-propose our BP as soon as Kilo opens
15:10:10 <bauzas> https://review.openstack.org/#/c/61773/ ERT scheduler side has been -2 for Juno
15:10:30 <bauzas> n0ano: that's what I'm saying, I think we need to look at other ways to do the bp
15:10:53 <n0ano> bauzas, question is, what are the odds that they will re-propose ERT?
15:11:05 <bauzas> n0ano: for both the reasons that ERT won't be there when Kilo opens, and also because the concerns were good
15:11:30 <bauzas> n0ano: I guess Paul will ask for a FFE or resubmit it for Kilo
15:11:40 <bauzas> n0ano: but I can't speak on behalf of it
15:12:08 <n0ano> so the question is should we wait for Paul to resolve ERT or should we just work around it
15:12:12 <bauzas> s/it/him (Paul is not an object, although he's working on some patches about them...)
15:12:29 <bauzas> n0ano: as we discussed last week, I will propose another way
15:12:45 <bauzas> n0ano: the idea is to pass Objects from RT to Sched
15:12:54 <n0ano> personally, I would prefer to not wait, I would prefer to work around it and, if need be, redo things when/if ERT is finalized.
15:13:03 <n0ano> bauzas, that works for me
15:13:21 <bauzas> n0ano: well, the design is really different if we don't go with ERT
15:13:35 <bauzas> n0ano: and IMHO, that's more "resilient" in terms of community approval :)
15:13:55 <n0ano> yeah but I don't feel we should wait on it, deal with ERT only when it's finalized
15:14:23 <bauzas> n0ano: agreed that's top prio, but we need to sort some things out first
15:15:28 <bauzas> n0ano: seriously, the current way is pretty crap...
15:15:52 <n0ano> so, if I understand, you're going to update the current BP and then we can look at implementing based upon that new design - right?
15:16:05 <bauzas> n0ano: yeah
15:16:18 <bauzas> n0ano: I was just expecting more support from others than just you and me
15:16:21 <bauzas> :)
15:16:28 <bauzas> n0ano: ie. I need to discuss with jay first
15:16:35 <n0ano> #action bauzas to update the isolate scheduler DB BP
15:16:42 <bauzas> thanks :)
15:17:02 <bauzas> will mark it as WIP, until Kilo spec template is there
15:17:20 <bauzas> because there are chances that the template will change
15:17:28 <n0ano> we're in agreement, getting some sort of consensus from Jay would be good but I'm willing to push through others if that's a problem.
15:17:36 <bauzas> n0ano: indeed
15:17:53 <bauzas> n0ano: hence I'm waiting to see what will be the Kilo process for blueprints
15:18:05 <n0ano> bauzas, I doubt the template will change dramitically, just a tweak or two
15:18:07 <bauzas> n0ano: because atm that's unclear
15:18:36 <n0ano> s/dramit/dramat
15:18:36 <bauzas> n0ano: I'm not really waiting for the template, I'm waiting for big decisions like runways and design summit reboot
15:18:43 * n0ano needs to work on spellin skills
15:18:57 * n0ano and typing skills
15:19:11 <bauzas> n0ano: the design summit format will change for Paris
15:19:41 <bauzas> n0ano: and the cores' reviewing process will probably change for Kilo
15:20:02 <n0ano> the format will change but we will still push for gantt no matter what the procedure we have to follow
15:20:50 <n0ano> as I've said in the past, no one has objected to gantt, the only issues have been how & when
15:20:53 <bauzas> n0ano: of course, but coming from a patch where 54 iterations were necessary to get it merged, I can just say that sometimes, you need to have visibility before doing anything
15:21:21 <n0ano> bauzas, completely agree (don't know whether to laugh or cry over that)
15:21:24 <bauzas> n0ano: the "how" sometimes requires more than 3 months to get done
15:21:51 <n0ano> remember, I started the gantt work over a year ago and we're still debating it
15:21:54 <bauzas> n0ano: the golden rule for Nova is patience
15:22:15 <bauzas> n0ano: indeed, I was sitting 2 rows behind you in HKG :)
15:22:39 <n0ano> HKG, I started this before Portland :-)
15:22:54 <bauzas> then that's not 1 year... :)
15:23:35 <bauzas> anyway, reviewing the NUMA patches was pretty worth it
15:23:45 <bauzas> I now totally understand ndipanov's concerns
15:24:00 <n0ano> Oh, you're right, I can't count, it was after Grizzly I started (I think, too long ago)
15:24:18 <mspreitz_> bauzas: can you explain briefly? I have not been following the NUMA stuff
15:24:22 <bauzas> atm, RT is pretty good for consuming CPUs and memory, but very bad at counting more complex objects
15:24:37 <n0ano> anyway, bottom line is you get to redo the BP and I get to needle the powers that be over review process
15:24:48 <bauzas> mspreitz_: well, the problem is really about details
15:24:56 <bauzas> mspreitz_: like what you get is not typed
15:25:08 <bauzas> mspreitz_: so you have to doublecheck what you get
15:25:26 <bauzas> mspreitz_: you also have to do some extra calls for getting some info
15:25:31 <mspreitz_> bauzas: are you explaining the NUMA concerns?
15:25:49 <bauzas> mspreitz_: yeah, all the fixme stuff these folks were doing
15:26:10 <mspreitz_> OK, thanks.
15:26:28 <bauzas> mspreitz_: I can just summarize that the problem is that you're providing non-typed and non-validated dictionaries
15:26:53 <bauzas> mspreitz_: and based on where you look at these bits, they are either serialized or not
15:26:55 <mspreitz_> I have to admit I have not been following the details, since it became clear a while ago that it would take a long time to get progress here
15:27:14 <bauzas> mspreitz_: seriously, I understand
15:27:41 <bauzas> mspreitz_: here the problem with isolate-sched-db is that we're not counting cpus or ram, but apples and bananas
15:27:59 <bauzas> mspreitz_: I mean, real complex objects
15:28:16 <bauzas> so we kinda need some formalization
15:28:54 <bauzas> and as I said last week, we *already* have the toolbox for it, that's called.... objects
15:29:33 <mspreitz_> thanks
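The contrast bauzas draws between non-typed, non-validated dictionaries and objects can be sketched as below. This is only an illustration using the Python standard library, not Nova's object framework; the class name `NumaCell`, its fields, and the dictionary keys are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NumaCell:
    """Hypothetical typed description of one NUMA cell (not a Nova class)."""
    cell_id: int
    cpus: List[int]
    memory_mb: int

    def __post_init__(self):
        # Validate once, at construction, so consumers need not re-check.
        if not isinstance(self.cell_id, int) or self.cell_id < 0:
            raise ValueError("cell_id must be a non-negative integer")
        if not all(isinstance(cpu, int) for cpu in self.cpus):
            raise ValueError("cpus must be a list of integers")
        if not isinstance(self.memory_mb, int) or self.memory_mb <= 0:
            raise ValueError("memory_mb must be a positive integer")

    @classmethod
    def from_dict(cls, raw):
        """Build a validated object from an untyped dictionary."""
        return cls(cell_id=raw["id"],
                   cpus=list(raw["cpus"]),
                   memory_mb=raw["memory_mb"])


# An untyped dict accepts anything; the object rejects bad input early.
good = NumaCell.from_dict({"id": 0, "cpus": [0, 1], "memory_mb": 2048})
```

With a raw dictionary, a value such as `"2048"` (a string) would only fail wherever it is finally used; here it fails at the boundary where the data is handed over.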
15:29:38 <bauzas> anyway, I'll have to leave earlier today, we can maybe move on ?
15:29:49 <mspreitz_> yes
15:29:54 <n0ano> bauzas, fer sur
15:30:02 <n0ano> #topic opens
15:30:13 <n0ano> anyone have anything new for today?
15:30:17 <bauzas> I'm glad to say scheduler-lib is merged
15:30:26 <n0ano> +1
15:30:32 * bauzas silently says hurrah
15:30:47 * n0ano vocally says hurrah
15:31:03 <bauzas> that will ease the next steps discussed previously
15:31:44 <n0ano> anything else?
15:31:50 <bauzas> yup
15:32:11 <bauzas> https://review.openstack.org/#/c/117042/ that's something prerequisite for the work
15:32:28 <bauzas> the idea discussed previously is to provide ComputeNode objects to Scheduler
15:32:47 <bauzas> but unfortunately, ComputeNode has a FK on Services, for better or worse
15:33:07 <bauzas> so we need to do some prework about that, and this patch is doing that
15:33:42 <bauzas> because it makes no sense to give the Scheduler something that has a dependency on anything else
15:33:46 <n0ano> how does this relate to the change to not send those updates unless something changes, there's no longer an update every 60 seconds
15:34:16 <bauzas> n0ano: this is not related, here the patch is about creating the DB entry at startup
15:34:35 <bauzas> but the updates are still done the same way, ie. checking if updated or not
15:35:34 <bauzas> n0ano: the problem was that we were checking for the existence of an object every 60 secs
15:35:56 <bauzas> n0ano: your check is not before that, but after IIRC
15:36:09 <bauzas> n0ano: ie. before updating the DB
15:36:20 <n0ano> I thought the DB entry was already created at startup
15:36:28 <bauzas> n0ano: nope
15:36:34 <bauzas> n0ano: not exactly
15:37:01 <bauzas> n0ano: the update_avail_resource method was called when nova-compute was booting
15:37:16 <bauzas> n0ano: because of a post-hook mechanism
15:37:38 <bauzas> n0ano: but the check was still done every 60 secs
15:38:01 <bauzas> n0ano: here we refactor where we look at services table, that's really helpful
15:38:37 <n0ano> I have to study the code, I don't understand, let me try and understand this first
15:39:14 <bauzas> n0ano: sure
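The distinction discussed above (look up or create the record once at startup, then write on the periodic tick only when something changed) might look roughly like this. It is a minimal sketch under stated assumptions, not the code in the review: `FakeDB`, `Tracker`, and their methods are hypothetical names, and the dictionary stands in for the real database.

```python
class FakeDB:
    """Hypothetical in-memory stand-in for the compute node table."""

    def __init__(self):
        self.rows = {}
        self.lookups = 0
        self.writes = 0

    def get(self, host):
        self.lookups += 1
        return self.rows.get(host)

    def save(self, host, resources):
        self.writes += 1
        self.rows[host] = dict(resources)


class Tracker:
    """Hypothetical tracker: existence check at startup, not every tick."""

    def __init__(self, db, host):
        self.db = db
        self.host = host
        self.last_sent = None

    def start(self, resources):
        # The existence check happens once, when the service boots.
        if self.db.get(self.host) is None:
            self.db.save(self.host, resources)
        self.last_sent = dict(resources)

    def periodic_update(self, resources):
        # Called every interval: no existence lookup, and a write
        # only when the reported resources actually changed.
        if resources != self.last_sent:
            self.db.save(self.host, resources)
            self.last_sent = dict(resources)


db = FakeDB()
tracker = Tracker(db, "node1")
tracker.start({"vcpus": 4})
tracker.periodic_update({"vcpus": 4})  # unchanged, nothing written
tracker.periodic_update({"vcpus": 3})  # changed, one write
```

In this sketch the lookup counter stays at one no matter how many periodic ticks run, which is the saving bauzas describes; the "only write when changed" check is the separate, later step n0ano asked about.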
15:39:19 <bauzas> that's it for me
15:39:36 <n0ano> OK, unless there's anything else
15:40:03 <n0ano> I'll thank everyone and we'll talk next week.
15:40:08 <n0ano> #endmeeting