15:01:19 <n0ano> #startmeeting gantt
15:01:20 <openstack> Meeting started Tue Jan 21 15:01:19 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:24 <openstack> The meeting name has been set to 'gantt'
15:01:34 <n0ano> anyone here to talk about the scheduler?
15:01:41 <garyk> hi
15:01:43 <toan-tran> i'm here, hi all
15:01:57 <doron_> hi
15:01:58 <johnthetubaguy> hi
15:02:02 <coolsvap> hi
15:02:24 <alaski> hi
15:02:48 <mspreitz> o/
15:02:52 <n0ano> I wanted to talk about no db but boris doesn't seem to be on yet, let's go to the code forklift first
15:02:58 <n0ano> #topic code forklift
15:03:20 <n0ano> hopefully all have seen the email thread on the devel list...
15:03:47 <n0ano> we've decided to get the new gantt tree working first, and then re-sync (probably by recreating the tree) to the nova tree
15:04:00 <n0ano> (rather than trying to continuously keep in sync with the nova tree)
15:04:20 <n0ano> to that effort, there are 3 primary top level goals:
15:04:28 <n0ano> 1) integrate with devstack
15:04:34 <n0ano> 2) get the unit tests to work
15:04:44 <n0ano> 3) integration tests working
15:05:01 <garyk> sorry, not sure that i follow
15:05:06 <n0ano> I've actually done task 1, the patches for devstack are posted and awaiting review
15:05:15 <garyk> instead of running n-sched will we be running gantt on devstack?
15:06:09 <alaski> n0ano: do you have links for the devstack patches?
15:06:10 <garyk> the review is https://review.openstack.org/#/c/67666/
15:06:11 <n0ano> garyk, the idea is that you can ask devstack to include gantt rather than n-sch (the same way you can ask for neutron vs. n-net)
15:06:19 <garyk> n0ano: thanks
15:07:12 <n0ano> surprisingly enough, asking for gantt rather than n-sch works (you wind up calling into nova code but it installs the gantt tree and starts things out from the top of the gantt tree)
15:07:31 <garyk> thanks, i will take a look at it. is the jenkins -1 due to the grumpy old man or a real issue?
15:07:56 <n0ano> garyk, yep, grumpy old man (I think it is the dsm which seems to always fail)
15:08:54 <n0ano> so yeah, reviewing the devstack patches would be great, I'd love to get that in.
15:09:17 <n0ano> I've looked at task 2 & 3 and they are not totally trivial
15:09:19 <garyk> i'll look at the devstack
15:10:03 <n0ano> for 2, getting the service started has some futzy issues that need to be dealt with, I'm looking at it and not getting far...
15:10:46 <n0ano> for 3, I'm missing something obvious, the testing harness won't allow includes from nova so you wind up pulling half the nova tree over, not what we want
15:11:22 <alaski> is #3 tempest tests, or something else?
15:11:27 <johnthetubaguy> but surely, you don't need to change any of the tests?
15:12:01 <n0ano> johnthetubaguy, I would hope the tests would only need a change to an import line at most, the body of the test wouldn't change
15:12:18 <johnthetubaguy> ah, sorry, I mean tempest tests
15:12:23 <johnthetubaguy> they should just run as normal
15:12:36 <n0ano> alaski, I believe that run_tests.sh calls testr (with a virtual environment) and that doesn't seem to allow imports from nova
15:12:41 <johnthetubaguy> unit tests will take some care, mostly the whole DB hooks thing
15:13:17 <johnthetubaguy> n0ano tox is used in the gate, but almost the same difference
15:13:25 <n0ano> johnthetubaguy, true but we want to be just a drop in for now, utilizing code for things like DB access from nova should be fine
15:13:42 <n0ano> johnthetubaguy, yeah, I did tox manually and, as I remember, I hit the same issue
15:14:01 <alaski> I think of run_tests.sh/tox as unit tests, and tempest as integration tests. And integrating with devstack basically gives you integration for free
15:14:07 <toan-tran> n0ano so basically we can re-run tests with import gannt instead of nova?
15:14:25 <toan-tran> gantt
15:14:52 <johnthetubaguy> tempest talks directly to the nova cli, so shouldn't need any imports, the unit tests are how we test if we break the dependence on nova I guess?
15:14:55 <n0ano> toan-tran, sort of, we want to import scheduler code from gantt but nova specific code (like DB) should come from nova, we don't want to duplicate the nova tree
15:15:38 <johnthetubaguy> seems like we would need to stop accessing the nova db, and copy the bits that the scheduler needs for its own db?
15:15:44 <johnthetubaguy> but maybe I am complicating things?
15:15:45 <n0ano> johnthetubaguy, not sure what you mean, all I know is many python files in gantt/tests import nova objects
15:16:04 <alaski> yeah, gantt right now needs pretty deep nova knowledge
15:16:05 <johnthetubaguy> yeah, we only need to run the unit tests in the scheduler sub directory though
15:16:18 <n0ano> johnthetubaguy, it's more than that, it's not just the DB, it's things like objects and others
15:16:26 <johnthetubaguy> Ok, so it's import nova, make it run, then cut the links I guess?
15:16:34 <johnthetubaguy> that's fair enough
15:16:38 <toan-tran> johnthetubaguy you're suggesting fork out db, which is no-db approach
15:16:39 <n0ano> johnthetubaguy, +1
15:16:47 <garyk> yup, at the moment it is solely dependent on nova and will most probably need a complete overhaul when we start to add cross service support
15:17:02 <johnthetubaguy> toan-tran: not really, it's cut and paste db code, but importing nova will do the same for now
15:17:14 <n0ano> garyk, +1 (that's all part of our world domination plan :-)
15:17:17 <johnthetubaguy> yeah, just not sure if the gate can deal with all this
15:17:31 <johnthetubaguy> wait, I guess it can, ignore me
15:17:49 <garyk> is there any way to have a symbolic link to the nova master branch in git?
15:17:57 <alaski> n0ano: adding nova to test-requirements isn't enough?
15:18:38 <n0ano> alaski, I might be an idiot, I didn't think of that, let me get back to you, this might be a simple problem.
15:18:46 <johnthetubaguy> well, it's a bit trickier, you need the version zuul wants to give you and is installed on that machine
15:19:05 <johnthetubaguy> I guess requirements file could link to src in /opt/stack/nova or something like that
15:19:15 <johnthetubaguy> can't remember the details now
15:19:57 <n0ano> I'm play with it (I've totally corrupted my working gantt tree for now, it'll take me a bit to try this but it sounds promising)
15:20:14 <n0ano> s/I'm/I'll
15:21:15 <n0ano> does anyone have any cycles to look at getting gantt to actually work? e.g. deal with the process startup issues
15:21:42 <garyk> i may have next week. not sure at the moment though
15:22:02 <alaski> n0ano: I don't this week, but I'd like to help so I'll stay hopeful for next week
15:22:36 <coolsvap> n0ano: I can try this week
15:22:41 <garyk> sorry, but could not resist, do we have a gantt with what all of our resources are doing?
15:22:42 <n0ano> not to worry, it'll probably take me most of this week to get the unit tests working, there'll be plenty of work still available next week.
15:22:44 <toan-tran> I have a question, stupid maybe,
15:22:53 <johnthetubaguy> n0ano: another week it would be yes, sorry
15:23:04 <n0ano> toan-tran, go ahead
15:23:06 <toan-tran> how is that different from having nova-scheduler separated from other nova services, and gantt?
15:23:34 <toan-tran> in code separation, I mean
15:24:03 <toan-tran> when I install nova-scheduler separated from others, the nova code is duplicated
15:24:15 <n0ano> toan-tran, not sure what your question is, what we need is the ability to start a scheduler from the gantt tree, have it call gantt `scheduler` code (calling nova code for DB, objects and what not is OK)
15:24:27 <toan-tran> installing gantt should be the same, no?
15:25:11 <toan-tran> I mean, for now, as we have the same code in nova and in gantt
15:25:25 <n0ano> toan-tran, currently most of the code in the gantt tree includes modules/classes from nova, some of those imports need to be from gantt, some need to remain nova
15:25:29 <toan-tran> installing gantt on another server is like installing nova-scheduler on it
15:26:34 <alaski> toan-tran: gantt needs to be able to use nova like a library for now, not have it in the source tree. while nova-scheduler is installed with it all together
15:27:06 <johnthetubaguy> alaski: +1
15:27:17 <n0ano> alaski, what he said
15:27:19 <toan-tran> ok, so basically we need to tell gantt to use nova code, not use its local lib
15:27:30 <toan-tran> or import nova lib from somewhere else
15:27:49 <n0ano> toan-tran, for nova things, for scheduler code we want gantt to include the gantt code
15:29:19 <toan-tran> n0ano thanks
15:29:23 <n0ano> garyk, in re a gantt of our gantt work - not yet, we're at the stage where we don't even know what all the tasks are until we do the task...
15:29:46 <n0ano> after we get the top 3 goals accomplished I think we can make a more detailed plan
15:30:45 <toan-tran> which step integrates gantt with nova/keystone?
15:31:09 <toan-tran> I mean, for now nova lists nova-scheduler in its service list, with host IP and all
15:31:35 <toan-tran> but gantt is an outsider, so basically nova-conductor should contact keystone for its endpoint
15:31:43 <n0ano> will be running the same code, whether from the gantt tree or the nova tree, so that should work the same
15:32:20 <n0ano> note that n-sch already runs as a separate process, that doesn't change
15:32:46 <toan-tran> n0ano just tapping in the rabbitmq channel?
15:33:12 <n0ano> should do the same taps into the same channel
15:35:16 <n0ano> I think that's about all on this subject for now, since boris isn't here johnthetubaguy you wanted to talk about:
15:35:23 <n0ano> #topic caching scheduler
15:35:37 <johnthetubaguy> yeah, I have an idea/blueprint
15:35:52 <johnthetubaguy> https://blueprints.launchpad.net/nova/+spec/caching-scheduler
15:35:57 <johnthetubaguy> I have a review up
15:36:02 <johnthetubaguy> good to get your feedback
15:36:47 <johnthetubaguy> basic idea
15:36:53 <johnthetubaguy> do expensive things up front
15:37:56 <n0ano> one question, how does this fit with the current filters/weights, can you still specify a set of filters
15:38:01 <garyk> johnthetubaguy: in certain cases we can cache things, but when there are changes to the system it becomes very difficult. do you have some doc describing what you are doing?
15:38:33 <johnthetubaguy> not really, the idea is it responds to scheduler update in the cache
15:38:39 <johnthetubaguy> my basic idea, let's try it
15:39:04 <glikson> johnthetubaguy: interesting idea, will take a look
15:39:18 <toan-tran> what kind of information/decision that you cache ?
15:39:20 <johnthetubaguy> I hope we start having many drivers
15:39:24 <johnthetubaguy> that get documented
15:39:34 <n0ano> garyk, does raise a good question, what happens when your cache gets stale?
15:39:40 <johnthetubaguy> basically cache chosen hosts for specific partial request-specs
15:40:20 <alaski> yeah. I chatted with johnthetubaguy about this in person last week, and what I really like is starting to have differentiated scheduler drivers that may have different features/characteristics.
15:40:59 <toan-tran> alaski: meaning ?
15:41:07 <garyk> alaski: if the work loads are homogeneous then that could be a nice solution.
15:41:24 <johnthetubaguy> n0ano: it has a periodic task to keep the cache fresh
15:42:02 <alaski> toan-tran: something like, maybe I don't care about supporting affinity/anti-affinity so I can choose a driver that doesn't have it but is blazing fast. Or maybe I don't care about speed and can use a complex sat solver for all my placements, etc...
15:42:48 <glikson> regarding having different drivers -- do we also expect the user-facing APIs to be different? e.g., today's filters/hints?
15:43:29 <johnthetubaguy> hopefully not
15:43:32 <alaski> glikson: I would expect some change eventually. but I think that should happen anyways.
15:43:34 <johnthetubaguy> it might be some don't work
15:44:10 <alaski> scheduler hints are a poor api right now, because there's no feedback on whether or not they were used
15:44:17 <johnthetubaguy> that's very true
15:45:02 <glikson> one of the ideas we have been discussing internally was to separate the placement calculation logic from the logic/API specifying 'inputs' (constraints, etc). right now the two are rather tightly coupled..
15:45:38 <johnthetubaguy> my basic points are, let's not be afraid to break things, try things, but do it in different drivers
15:45:49 <n0ano> raising questions of latency and recovery from invalid cache entries, theoretically it could work but the devil is in the details.
15:46:23 <alaski> glikson: I agree. I think there's a layer of abstraction missing from scheduling
15:47:09 <johnthetubaguy> glikson: that makes sense
15:47:40 <glikson> hopefully having the code separately (whatever that means) would make it easier to evolve..
15:47:50 <toan-tran> glikson +1
15:48:12 <toan-tran> that's when i have to rethink of the gantt API
15:48:16 <toan-tran> =))
15:48:22 <johnthetubaguy> true, when we have a gantt API then it gets better
15:49:48 <johnthetubaguy> do people fancy giving this caching a go then?
15:50:05 <toan-tran> johnthetubaguy count me in
15:50:06 <johnthetubaguy> it's going to be experimental in icehouse
15:50:11 <johnthetubaguy> but it's worth a whirl
15:50:19 <n0ano> johnthetubaguy, sounds interesting to me, a proof of concept would be good
15:50:32 <garyk> johnthetubaguy: i think that it is an interesting direction.
15:50:45 <coolsvap> johnthetubaguy: yes sounds interesting
15:50:52 <garyk> it is worth exploring and the fact that you have posted something gives us a way to play around with it
15:51:08 <johnthetubaguy> I know the current stuff is quite broken right now
15:51:14 <johnthetubaguy> but it gives you the idea
15:51:20 <garyk> current scheduler or this :)
15:51:54 <toan-tran> it would be great if you have a google doc, not necessarily too detailed
15:52:21 <n0ano> toan-tran, I'd prefer a wiki page, that's pretty standard for what we do.
15:52:45 <johnthetubaguy> yeah, I can do something like that
15:52:50 <johnthetubaguy> and attach to the blueprint
15:53:03 <n0ano> johnthetubaguy, +1
15:53:14 <johnthetubaguy> the commit message and blueprint has all the data I have at the moment
15:53:25 <garyk> would it be possible that people please take a look at the anti affinity patch with the instance groups.
15:53:29 <johnthetubaguy> basically an idea, and I am testing it out
15:53:38 <garyk> hopefully this week there will be the API's for v2 and v3.
15:54:10 <n0ano> garyk, are those APIs the last bits for this work?
15:54:12 <johnthetubaguy> garyk: do we have those blueprints sponsored by anyone yet?
15:54:44 <garyk> johnthetubaguy: they were approved in havana. we missed the deadline by a week. so it is just carried through. not sure who is sponsoring though
15:54:59 <johnthetubaguy> garyk: I mean in terms of the nova blueprint process
15:55:20 <garyk> i do not think that anyone is sponsoring this at the moment.
15:55:38 <garyk> it was approved prior to the process discussed
15:55:45 <garyk> any takers?
15:55:45 <johnthetubaguy> OK, not certain I have time, given all the other stuff, else I would offer, just wondered if anyone else was keen?
15:56:12 <glikson> if we are open for general discussion, I have similar request for the multi-sched patches..
15:56:16 <glikson> https://review.openstack.org/#/q/topic:bp/multiple-scheduler-drivers,n,z
15:56:44 <n0ano> glikson, I pinged people for reviews at the last nova meeting, looks like that didn't have any impact :-(
15:57:22 <glikson> n0ano: thanks! we did have one new reviewer recently.
15:57:42 <glikson> but the comments were mostly syntactical
15:57:53 * n0ano strains to get hand to pat back :-)
15:58:03 <garyk> glikson: i have reviewed that patch a number of times and it is starting to look good
15:58:50 <n0ano> coming to the top of the hour, closing in 3...
15:58:50 <toan-tran> glikson: i'll give it a try
15:59:06 <n0ano> 2...
15:59:18 <mspreitz> garyk: which patch exactly?
15:59:19 <glikson> garyk: thanks. I hope that it is a matter of fixing small things, and we won't be surprised by major concerns few days/weeks before the feature-freeze..
15:59:21 <toan-tran> the last time i saw it was not completed, so not sure what i've seen was kept ...
15:59:41 <glikson> toan-tran: thanks!
16:00:09 <n0ano> 1
16:00:16 <n0ano> tnx everyone, talk to you next week.
16:00:19 <n0ano> #endmeeting
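
The caching scheduler idea from the "caching scheduler" topic (do the expensive host selection up front, serve requests from the cache, and refresh it with a periodic task) can be sketched roughly as follows. This is only an illustration under assumptions, not the code in johnthetubaguy's review: the class name, the `select_hosts` callback and the `batch_size` parameter are all hypothetical, and the stale-cache questions n0ano and garyk raised are not addressed here.

```python
from collections import defaultdict, deque


class CachingSchedulerSketch:
    """Hypothetical illustration of the idea; not the driver under review."""

    def __init__(self, select_hosts, batch_size=3):
        # select_hosts(spec_key, count) is an assumed stand-in for the
        # expensive filter/weigh pass; it returns a list of host names.
        self._select_hosts = select_hosts
        self._batch_size = batch_size
        # Maps a partial request-spec key to a queue of pre-chosen hosts.
        self._cache = defaultdict(deque)

    def refresh(self, spec_keys):
        """Periodic task: recompute the cached hosts for each spec key."""
        for key in spec_keys:
            self._cache[key] = deque(self._select_hosts(key, self._batch_size))

    def pick_host(self, spec_key):
        """Serve from the cache; run a fresh selection only on a miss."""
        hosts = self._cache[spec_key]
        if not hosts:
            hosts.extend(self._select_hosts(spec_key, self._batch_size))
        if not hosts:
            raise LookupError("no valid host for %r" % (spec_key,))
        return hosts.popleft()
```

A caller would run `refresh()` from a timer and call `pick_host()` per request, so the expensive selection only runs at refresh time or when a cached queue is exhausted.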