15:00:57 #startmeeting gantt
15:00:58 Meeting started Tue Feb 18 15:00:57 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:59 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:01 The meeting name has been set to 'gantt'
15:01:10 anyone here to talk about the scheduler
15:01:15 o/
15:02:00 I can see the topic name has changed from scheduler to gantt ;)
15:02:22 since gantt is the new scheduler project that seemed appropriate
15:02:34 n0ano: can't agree more
15:02:54 n0ano: my concern was that gantt wasn't visible enough yet
15:03:08 yet now we're saying maybe we're moving gantt out of openstack git
15:03:22 I try to raise visibility whenever possible, talking about it at the mid-cycle meetup certainly helps
15:03:38 should we put the meetings back under the nova banner?
15:03:41 toan-tran, since when, I haven't heard that
15:03:48 n0ano: indeed, we're waiting for the summary of what was discussed during the mid-cycle meeting
15:04:07 but I agree that it should be a stackforge project
15:04:08 n0ano: in the mid-cycle mail from Russell
15:04:39 gimme a second...
15:04:53 bauzas: there is a summary here, a rough one: https://etherpad.openstack.org/p/nova-icehouse-mid-cycle-meetup-items
15:05:05 johnthetubaguy: thanks
15:05:13 hmm, the gantt section isn't so good, mind
15:05:14 oops
15:05:15 #topic mid-cycle meetup
15:05:20 n0ano: I'm currently wondering where I should put blueprints for gantt
15:05:33 well, let me give my summary
15:05:45 sure :)
15:06:05 here: #link http://lists.openstack.org/pipermail/openstack-dev/2014-February/027370.html
15:06:08 What I heard is everyone wants to move the scheduler to a separate tree but we're not sure the current nova scheduler is ready for that
15:06:39 "2) Gantt - We discussed the progress of the Gantt effort.
15:06:39 After discussing the problems encountered so far and the other scheduler work
15:06:40 going on, the consensus was that we really need to focus on decoupling
15:06:40 the scheduler from the rest of Nova while it's still in the Nova tree.
15:06:40 Don was still interested in working on the existing gantt tree to learn
15:06:40 what he can about the coupling of the scheduler to the rest of Nova.
15:06:42 Nobody had a problem with that, but it doesn't sound like we'll be ready
15:06:44 to regenerate the gantt tree to be the "real" gantt tree soon. We
15:06:46 probably need another cycle of development before it will be ready."
15:07:09 we should be working on creating clean interfaces in the current nova scheduler and, once we have clean interfaces, we can split this out
15:07:25 toan-tran, yes, that matches what I heard
15:07:34 n0ano: so the first effort would be to split the interfaces?
15:07:54 as I remember, there were two approaches for Gantt
15:07:54 n0ano: I'm particularly concerned about the nova imports in the gantt code
15:08:08 one from Robert: fork out, then cut ties
15:08:10 moving the gantt tree from openstack to stackforge is not a major concern, that's just a housekeeping issue.
15:08:19 the other from Boris: cut ties, then fork out
15:08:44 bauzas, yes, cleaning up the interfaces is the first concern, as you say, the nova imports can be a bit problematic
15:08:49 n0ano: well, I tried to update the oslo commons
15:08:59 toan-tran: yeah, seems that both approaches ignore the interfaces issue..
15:09:07 n0ano: by replacing nova with gantt and running update.py
15:09:26 glikson: I think that Boris' group is working on some part of the interface
15:09:30 basically, we mentioned the no-db compute work, let's do something similar for the scheduler, then split it out
15:09:36 namely: the data access
15:09:36 until we have 100% gantt class imports, we raise DuplicateOptError
15:10:01 bauzas, yes, that handles openstack/common (other projects do that so that's easy), it's the other imports of straight nova code that create problems.
15:10:22 n0ano: I'm trying to provide a patch to oslo.config for handling different projects
15:10:35 n0ano: but that requires a clean-up in Gantt anyway
15:11:01 bauzas, I'm a little curious why we don't have an oslo.common, why do you have to do the update.py all the time?
15:11:07 n0ano: IMHO, we can't focus on having clean interfaces with Nova without doing the clean-up stuff
15:11:34 n0ano: well, gantt should stand on its own with the oslo commons
15:11:36 johnthetubaguy: the question is whether we have a good approach to 'something similar' here..
15:11:53 n0ano: so, I made a clean-up in reqs.txt, removing what was not present
15:11:59 oops
15:12:08 not reqs.txt, openstack-common.conf, sorry
15:12:17 and then ran update.py
15:12:28 (and updated the imports, of course)
15:12:31 glikson: sure, we kinda agreed it seemed like there was
15:12:42 I can provide a draft review if you want
15:13:04 bauzas, yeah, I've tried that too, it gets the openstack/common imports local to gantt but you still have other nova imports to deal with
15:13:09 yey
15:13:27 n0ano: that's why I'm focusing on having a wrapper on top of oslo.config
15:13:36 johnthetubaguy: you mean, with an in-memory cache, updated via RPC?
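[Editor's note] The clean-up described above follows the old oslo-incubator sync workflow: each consuming project kept an openstack-common.conf listing which incubator modules it copied in, and update.py rewrote their imports under the project's own package. A minimal sketch of what a gantt-side file might have looked like; the module list here is illustrative, not the actual gantt configuration:

```
[DEFAULT]
# Incubator modules this project copies in; update.py syncs only these
# and rewrites their imports (e.g. to gantt.openstack.common.log)
module=log
module=timeutils
# Base package the copied code lives under
base=gantt
```
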
15:13:44 glikson: no
15:13:54 bauzas, when I did that I started hitting circular dependencies and wound up moving 90% of nova over to the gantt tree, not what we want
15:14:15 n0ano: well, it seems we could join our efforts
15:14:28 bauzas, I think that comes back to cleaning up the interfaces and not making the scheduler so intertwined with nova
15:14:35 johnthetubaguy: ok, then maybe I am not aware of the discussion/agreement you refer to
15:14:43 bauzas, for sure, it would be good to work together
15:15:11 n0ano: well, there are also some huge concerns about interfaces
15:15:18 glikson: I am just talking about what we spoke about at the mid-cycle meetup, to me this one is the key: https://blueprints.launchpad.net/nova/+spec/remove-cast-to-schedule-run-instance
15:15:32 but that's a matter of priority
15:15:36 OK, so do we know where we add the interface in, the above blueprint seems to make it much easier
15:15:51 I know it's probably not going into icehouse at this point though
15:16:10 it basically removes compute RPC calls from inside the scheduler manager
15:16:23 so conductor calls scheduler for a list of hosts
15:16:28 gives a request spec
15:16:32 gets back a list of hosts
15:16:46 the other interface is something like an update_compute_stats_api
15:16:55 where at the moment it writes to the DB, but does other things later
15:17:10 (a slice inside the current host manager, or something like that)
15:17:21 anyways, seems like two good places to start making a cut?
15:17:58 yep, there is also the need for HostState to query Nova
15:18:13 johnthetubaguy: getting the entire data model via an API for every placement decision would be far from optimal.. or do you have some other approach in mind?
15:18:36 the host state belongs to the scheduler
15:18:52 but the resource_tracker belongs to the compute, no?
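[Editor's note] The two "seams" proposed above can be sketched as a single scheduler-facing interface: compute pushes stats in through a gantt-owned library, and the conductor asks for hosts given a request spec, with no calls or casts back to compute. The names here (SchedulerAPI, update_compute_stats, select_hosts, the stats keys) are illustrative, not the actual Nova/Gantt signatures:

```python
class SchedulerAPI:
    """Hypothetical placement interface the conductor would call."""

    def __init__(self):
        # host name -> free-form stats dict (the "extensible metrics" idea)
        self._host_stats = {}

    def update_compute_stats(self, host, stats):
        # Seam 2: compute records its stats via a scheduler-owned library.
        # Today this would write to the scheduler's DB; later it might
        # become an RPC cast instead.
        self._host_stats[host] = dict(stats)

    def select_hosts(self, request_spec):
        # Seam 1: conductor gives a request spec, gets back a list of
        # candidate hosts; the scheduler never calls compute directly.
        ram_needed = request_spec.get("ram_mb", 0)
        return [h for h, s in self._host_stats.items()
                if s.get("free_ram_mb", 0) >= ram_needed]


# Example flow, conductor side:
sched = SchedulerAPI()
sched.update_compute_stats("node1", {"free_ram_mb": 4096})
sched.update_compute_stats("node2", {"free_ram_mb": 512})
hosts = sched.select_hosts({"ram_mb": 1024})
```

This is just a sketch of where the cut lines would fall; the real interface would carry much richer request specs and filtering.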
15:19:06 ish… let me explain
15:19:29 if the API to record the stats from compute belongs to the scheduler, then the current stats DB can live in the scheduler
15:19:48 it's a case of where you draw the line, I think
15:20:15 does that make any sense?
15:20:29 johnthetubaguy, isn't that related to the no-db scheduler work, remove the stats DB then it resides in the scheduler
15:20:30 ok gotcha
15:20:38 johnthetubaguy: was speaking about https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L440
15:20:46 n0ano: related, but shouldn't be dependent on it
15:21:47 bauzas: in my model you have all the scheduler DB owned by the scheduler, so it doesn't matter so much, it's not querying nova, it's talking to its own DB, which just looks like the current nova one, and might even be stored in the same place
15:21:57 carry on, I have to go deal with my trashcan blowing over in the wind
15:22:08 johnthetubaguy: well, I got your view
15:22:26 johnthetubaguy: there is also a question on the frequent task updates & nova-conductor
15:22:43 what question is that? sorry, not sure I get the issue here?
15:22:48 johnthetubaguy: how will that DB be updated?
15:23:02 nova-compute will update its stats to *someone*
15:23:03 it calls a library provided by the scheduler
15:23:16 which at the moment just updates the same old DB
15:23:22 in the end it might send an RPC
15:23:31 doesn't matter, it's owned by gantt now
15:23:31 johnthetubaguy: ok got it
15:23:39 last meeting someone talked about using nova-conductor as it is now
15:23:41 johnthetubaguy: "it" = nova-compute?
15:24:03 then conductor updates/synchronizes all schedulers (if there are many)
15:24:04 yes, at the moment, might need to switch the arrow in the end too
15:24:14 so conductor is interesting here...
15:24:21 with this: https://blueprints.launchpad.net/nova/+spec/remove-cast-to-schedule-run-instance
15:24:33 it means there are no calls from scheduler to compute
15:24:39 or casts
15:24:49 conductor calls scheduler to get a list of hosts
15:24:52 and it's done
15:25:19 johnthetubaguy: yes
15:25:36 all I think is, add these two seams
15:25:36 johnthetubaguy: for the nova API call
15:25:37 I like this.. This is heading towards the scheduler being a pure resource placement decision engine
15:25:43 then see how far we get
15:25:51 nova API --> conductor --> scheduler --> host states
15:25:58 yep
15:26:08 johnthetubaguy: what is still unclear to me is how the metrics are updated
15:26:10 nova compute --> update stats --> conductor --> synchronize schedulers
15:26:12 that is what we get after: https://blueprints.launchpad.net/nova/+spec/remove-cast-to-schedule-run-instance
15:26:23 johnthetubaguy: that's basically the mission of the compute resource_tracker
15:26:26 But has anyone given a thought to a unified data repository.. a DB sitting outside..
15:26:27 which calls the conductor
15:26:32 bauzas: metrics are updated using code provided by gantt
15:26:52 johnthetubaguy: well, I missed this big piece
15:27:05 that's the second piece
15:27:19 first piece is conductor so it's a simple RPC query
15:27:21 second bit
15:27:27 code lib to update metrics
15:27:32 johnthetubaguy: so gantt will define the metric / host_state model?
15:27:36 right now, it just does what we do today
15:27:53 toan-tran: if you agree with the extensible metric thing, it's just a dict
15:27:59 does it go into blueprints now ?
15:28:04 but yes, gantt would own the meaning of the keys
15:28:16 we have blueprints for all this already, except the above code lib
15:28:24 bauzas: depends if we agree with this
15:28:30 johnthetubaguy: and so this is why I missed it :)
15:28:31 also, no more blueprints until Juno now
15:28:38 code freeze is tomorrow
15:28:44 johnthetubaguy: yup, I know
15:29:03 I guessed, just wanted to share that deadline :)
15:29:06 :)
15:29:18 n0ano: do we have blueprints in place ?
15:29:21 johnthetubaguy: ok, I agree that these 2 interfaces make sense, as a first step
15:29:37 n0ano: I mean for gantt, of course
15:29:38 glikson: agreed, it's just a first step, then we see what "mess" is left
15:29:39 bauzas, which ones do you mean, we have lots
15:29:53 yes, there's a blueprint for splitting it out
15:29:58 in Nova ?
15:30:28 would it be worth putting some here: https://blueprints.launchpad.net/gantt ?
15:30:39 actually, there's an etherpad for that, https://etherpad.openstack.org/p/icehouse-external-scheduler
15:30:46 johnthetubaguy: actually splitting the DBs would be non-trivial, I guess..
15:30:55 n0ano: woah, this etherpad is really big :)
15:31:18 bauzas, creating separate BPs for specific tasks would be good
15:31:19 glikson: with the above model, it's already done, I feel, because the scheduler gets info pushed in, except for the stuff it owns
15:31:48 the original idea was that just splitting the code would be easy, turns out it's not that trivial so some work (cleaning interfaces) will need to be done first
15:31:59 johnthetubaguy: should gantt create its own model instead of using nova's?
15:32:28 toan-tran: well, I think that's just a matter of backporting the existing one
15:32:40 johnthetubaguy: well, not exactly, because the same DB is also used for queries, keeping track of in-flight operations, etc.. now you will have 2 DBs..
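[Editor's note] The "extensible metric thing" discussed above amounts to metrics travelling as a plain dict whose key meanings gantt would own, with scheduler filters interpreting them. A minimal sketch of that idea; HostState here mirrors the role of the class in nova/scheduler/host_manager.py, but MetricThresholdFilter and the key names are illustrative, not Nova code:

```python
class HostState:
    """Per-host view the scheduler filters against."""
    def __init__(self, host, metrics):
        self.host = host
        self.metrics = metrics  # free-form dict pushed up from compute


class MetricThresholdFilter:
    """Pass hosts whose named metric stays at or under a threshold."""
    def __init__(self, key, maximum):
        self.key = key          # gantt would own the meaning of this key
        self.maximum = maximum

    def host_passes(self, host_state):
        value = host_state.metrics.get(self.key)
        return value is not None and value <= self.maximum


# Example: filter on a hypothetical cpu.percent metric
states = [HostState("node1", {"cpu.percent": 30}),
          HostState("node2", {"cpu.percent": 95})]
f = MetricThresholdFilter("cpu.percent", 80)
passing = [s.host for s in states if f.host_passes(s)]
```

The appeal of the dict approach is that new metrics (including the non-compute ones raised later in the meeting) need no schema change, only agreement on key names.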
15:32:44 the metrics column is flexible enough for the first needs
15:32:46 toan-tran: I don't think it matters if we do the split I suggested
15:33:02 glikson: right, but the scheduler doesn't write to those other tables, after the move-to-conductor work
15:33:54 the only worrying thing I can see is that we have a big dependency on Nova in Gantt
15:34:02 even if the interfaces are clean
15:34:13 johnthetubaguy: so, maybe some of the updates that conductor does will need to go to the scheduler's DB too..
15:34:15 that does require deploying both gantt and nova on the same host
15:34:26 johnthetubaguy: This is probably for the future. Do you think it would be possible to push other metrics (non-compute related - storage, network, etc.) to the scheduler DB, to make interesting scheduling decisions?
15:34:33 glikson: again, that's through the code lib, but it's unlikely
15:34:35 because gantt is currently heavily using nova libs
15:35:02 Yathi: sure, that's the extensible resource scheduler stuff, and it goes through the code libs
15:35:15 Yathi: that should be possible using the extensible resources bp
15:35:20 bauzas, I was hoping that cleaning up the interfaces would involve removing the use of those nova libs
15:35:29 n0ano: that's hard work
15:35:30 bauzas: yes, but that should be minimal, given how much oslo has, that's step three, let's say
15:35:46 johnthetubaguy: I agree with this as step #3
15:35:51 cool..
15:35:56 bauzas: no, nova and gantt can be deployed on different hosts
15:36:09 as nova-* & nova-scheduler now
15:36:18 toan-tran: how do you deal with the nova lib imports in Gantt ? :)
15:36:19 just duplicating some code
15:36:54 johnthetubaguy: otherwise there might be ugly race conditions.. the scheduler may need to "remember" the decisions it made, before they propagate back via nova-compute..
15:37:03 The other discussion we had on the ML was about reservations (via Climate or otherwise)..
how will that fit in here?
15:37:20 bauzas: it's like copying nova into a new place, replacing some parts with gantt, then once we have a clean interface we can remove the nova part
15:37:20 Yathi: I think it's too early for this
15:37:24 glikson: we already have all those races, and there is code in the scheduler to avoid some of them
15:37:25 Yathi: I guess my last comment is related to reservations..
15:37:40 glikson: it's why we have to retry sometimes
15:37:44 toan-tran: I'm not saying I dunno how to do this, I'm just saying that's hard work :)
15:38:07 what exactly do you want to reserve ?
15:38:10 no argument here :)
15:38:20 bauzas: If we are defining APIs in gantt now, I guess it makes sense now
15:38:44 Yathi: please provide a use case, I can't see what needs to be reserved in Climate
15:38:44 well, apparently, they're developing some mechanism for reserving instances via Climate
15:39:06 I'm a core contributor for Climate, what exactly do you want to know ? :)
15:39:19 after the scheduler has determined the hosts to use, it should call the reservations to get the lease.. the conductor can then use the lease to create the instances
15:39:29 so… I have tried to put all this plan into the etherpad
15:39:30 https://etherpad.openstack.org/p/icehouse-external-scheduler
15:39:34 does that make sense
15:39:50 johnthetubaguy, certainly, tnx
15:39:56 johnthetubaguy: will review it
15:40:16 I just tried to summarise the discussion above, just now
15:40:41 Climate, if I understand correctly, tackles the pretty hard job of scheduling over both time and space
15:40:43 again, tnx, I was just about to ask you to summarize
15:41:06 bauzas: regarding reservations, when the conductor asks the scheduler for a list of hosts, can it get back the host list along with a reservation?
15:41:21 Climate is providing some way to provision resources by defining plugins
15:41:26 Integrating that with the sort of placement logic that some others of us have discussed looks like a pretty tall order to me
15:42:01 you can imagine a call from the scheduler to Climate for creating a lease
15:42:15 bauzas: does climate interact with nova via the nova API? or directly with nova-scheduler/nova-conductor?
15:42:26 How do the plugins divide the job amongst themselves?
15:42:28 toan-tran: nah, thru the API
15:42:49 toan-tran: we are a separate project, no way to use the AMQP protocol for this
15:43:25 ok, so I imagine the msg flow would be Climate -> nova API -> nova-conductor -> nova-scheduler
15:43:38 mspreitz: that depends on the resource plugin logic
15:43:38 maybe we are getting distracted
15:43:41 what would nova-scheduler return at this point?
15:43:44 johnthetubaguy: indeed
15:44:05 that would be step #4
15:44:11 or maybe more
15:44:32 bauzas, feel free to update the etherpad with a 4th step
15:44:59 bauzas: is there a short sharp writeup of this?
15:45:02 n0ano: well, I'm not exactly understanding the need for placing a call to Climate, but will put an entry
15:45:07 bauzas: we can probably continue the discussion on the ML
15:45:14 and others could amend it
15:45:27 bauzas, +1
15:46:02 I think we're bottoming out here, let's move on
15:46:06 bauzas, +1
15:46:11 #topic no-db scheduler
15:46:21 anyone here who can give an update?
15:46:35 we were very worried about merging this in icehouse
15:46:41 at the mid-term meetup
15:46:49 very late, massive change, thingy
15:47:08 johnthetubaguy, yeah, I was hoping it would be farther along but it looks like they hit issues
15:47:29 I think the split we just discussed should depend on this work
15:47:30 I agree, unfortunately, it's not looking good for Icehouse
15:48:10 johnthetubaguy: how far does the change impact the interface with the conductor?
15:48:26 It will make splitting easier but it's always easy to say let's just wait for this next thing, sometimes you just have to do it
15:48:40 johnthetubaguy: I was thinking that as all DB calls were passing thru the conductor, it was quite transparent for the scheduler?
15:48:57 erm, there are patches up
15:48:59 provided the interfaces wouldn't change, of course
15:49:08 johnthetubaguy: ok, will look
15:49:13 the DB calls would not go through the conductor from the scheduler, unless we wanted them to
15:50:29 johnthetubaguy: ok, glancing at the reviews
15:50:35 well, no one from the no-db team seems to be here so
15:50:39 #topic opens
15:50:49 anyone have anything new they want to raise?
15:51:01 I have a question on nova objects
15:51:21 there is an issue on unified objects
15:51:36 the Solver Scheduler blueprint has a few patches now, and we are hoping some or all of them will get in during the icehouse timeframe depending on the reviews
15:51:38 does it go into Icehouse? and what is the impact of this?
15:53:20 hmm, good question
15:53:28 here is the discussion: http://lists.openstack.org/pipermail/openstack-dev/2014-February/026160.html
15:53:30 it seems ok to go in, it's optional, and patches are up there
15:55:36 hello?
15:55:54 toan-tran, didn't johnthetubaguy answer your question
15:56:14 Can someone help a newbie here, what is the alternative to "objecty" ?
15:56:18 n0ano, sorry, not really understood :)
15:57:14 (I mean "objecty" as used in the email cited above)
15:57:15 the unified objects work is here: https://blueprints.launchpad.net/nova/+spec/icehouse-objects
15:58:28 well, approaching the top of the hour so I'll thank everyone and we'll talk next week.
15:58:33 #endmeeting