15:01:19 #startmeeting gantt
15:01:20 Meeting started Tue Jan 21 15:01:19 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:24 The meeting name has been set to 'gantt'
15:01:34 anyone here to talk about the scheduler?
15:01:41 hi
15:01:43 i'm here, hi all
15:01:57 hi
15:01:58 hi
15:02:02 hi
15:02:24 hi
15:02:48 o/
15:02:52 I wanted to talk about no db but boris doesn't seem to be on yet, let's go to the code forklift first
15:02:58 #topic code forklift
15:03:20 hopefully all have seen the email thread on the devel list...
15:03:47 we've decided to get the new gantt tree working first, and then re-sync (probably by recreating the tree) to the nova tree
15:04:00 (rather than trying to continuously keep in sync with the nova tree)
15:04:20 to that effort, there are 3 primary top level goals:
15:04:28 1) integrate with devstack
15:04:34 2) get the unit tests to work
15:04:44 3) integration tests working
15:05:01 sorry, not sure that i follow
15:05:06 I've actually done task 1, the patches for devstack are posted and awaiting review
15:05:15 instead of running n-sched will we be running gantt on devstack?
15:06:09 n0ano: do you have links for the devstack patches?
15:06:10 the review is https://review.openstack.org/#/c/67666/
15:06:11 garyk, the idea is that you can ask devstack to include gantt rather than n-sch (the same way you can ask for neutron vs. n-net)
15:06:19 n0ano: thanks
15:07:12 surprisingly enough, asking for gantt rather than n-sch works (you wind up calling into nova code but it installs the gantt tree and starts things out from the top of the gantt tree)
15:07:31 thanks, i will take a look at it. is the jenkins -1 due to the grumpy old man or a real issue?
15:07:56 garyk, yep, grumpy old man (I think it's the dsm which seems to always fail)
15:08:54 so yeah, reviewing the devstack patches would be great, I'd love to get that in.
15:09:17 I've looked at task 2 & 3 and they are not totally trivial
15:09:19 i'll look at the devstack
15:10:03 for 2, getting the service started has some futzy issues that need to be dealt with, I'm looking at it and not getting far...
15:10:46 for 3, I'm missing something obvious, the testing harness won't allow includes from nova so you wind up pulling half the nova tree over, not what we want
15:11:22 is #3 tempest tests, or something else?
15:11:27 but surely, you don't need to change any of the tests?
15:12:01 johnthetubaguy, I would hope the tests would only need a change to an import line at most, the body of the test wouldn't change
15:12:18 ah, sorry, I mean tempest tests
15:12:23 they should just run as normal
15:12:36 alaski, I believe that run_tests.sh calls testr (with a virtual environment) and that doesn't seem to allow imports from nova
15:12:41 unit tests will take some care, mostly the whole DB hooks thing
15:13:17 n0ano tox is used in the gate, but almost the same difference
15:13:25 johnthetubaguy, true but we want to be just a drop in for now, utilizing code for things like DB access from nova should be fine
15:13:42 johnthetubaguy, yeah, I did tox manually and, as I remember, I hit the same issue
15:14:01 I think of run_tests.sh/tox as unit tests, and tempest as integration tests. And integrating with devstack basically gives you integration for free
15:14:07 n0ano so basically we can re-run tests with import gannt instead of nova?
15:14:25 gantt
15:14:52 tempest talks directly to the nova cli, so shouldn't need any imports, the unit tests are how we test if we break the dependence on nova I guess?
15:14:55 toan-tran, sort of, we want to import scheduler code from gantt but nova specific code (like DB) should come from nova, we don't want to duplicate the nova tree
15:15:38 seems like we would need to stop accessing the nova db, and copy the bits that the scheduler needs for its own db?
15:15:44 but maybe I am complicating things?
15:15:45 johnthetubaguy, not sure what you mean, all I know is many python files in gantt/tests import nova objects
15:16:04 yeah, gantt right now needs pretty deep nova knowledge
15:16:05 yeah, we only need to run the unit tests in the scheduler sub directory though
15:16:18 johnthetubaguy, it's more than that, it's not just the DB, it's things like objects and others
15:16:26 Ok, so it's import nova, make it run, then cut the links I guess?
15:16:34 that's fair enough
15:16:38 johnthetubaguy you're suggesting fork out db, which is no-db approach
15:16:39 johnthetubaguy, +1
15:16:47 yup, at the moment it is solely dependent on nova and will most probably need a complete overhaul when we start to add cross service support
15:17:02 toan-tran: not really, it's cut and paste db code, but importing nova will do the same for now
15:17:14 garyk, +1 (that's all part of our world domination plan :-)
15:17:17 yeah, just not sure if the gate can deal with all this
15:17:31 wait, I guess it can, ignore me
15:17:49 is there any way to have a symbolic link to the nova master branch in git?
15:17:57 n0ano: adding nova to test-requirements isn't enough?
15:18:38 alaski, I might be an idiot, I didn't think of that, let me get back to you, this might be a simple problem.
15:18:46 well, it's a bit trickier, you need the version zuul wants to give you and is installed on that machine
15:19:05 I guess requirements file could link to src in /opt/stack/nova or something like that
15:19:15 can't remember the details now
15:19:57 I'm play with it (I've totally corrupted my working gantt tree for now, it'll take me a bit to try this but it sounds promising)
15:20:14 s/I'm/I'll
15:21:15 does anyone have any cycles to look at getting gantt to actually work? e.g. deal with the process startup issues
15:21:42 i may have next week. not sure at the moment though
15:22:02 n0ano: I don't this week, but I'd like to help so I'll stay hopeful for next week
15:22:36 n0ano: I can try this week
15:22:41 sorry, but could not resist, do we have a gantt with what all of our resources are doing?
15:22:42 not to worry, it'll probably take me most of this week to get the unit tests working, there'll be plenty of work still available next week.
15:22:44 I have a question, stupid maybe,
15:22:53 n0ano: another week it would be yes, sorry
15:23:04 toan-tran, go ahead
15:23:06 how is that different from having nova-scheduler separated from other nova services, and gantt?
15:23:34 in code separation, I mean
15:24:03 when I install nova-scheduler separated from others, the nova code is duplicated
15:24:15 toan-tran, not sure what your question is, what we need is the ability to start a scheduler from the gantt tree, have it call gantt `scheduler` code (calling nova code for DB, objects and what not is OK)
15:24:27 installing gantt should be the same, no?
15:25:11 I mean, for now, as we have the same code in nova and in gantt
15:25:25 toan-tran, currently most of the code in the gantt tree includes modules/classes from nova, some of those imports need to be from gantt, some need to remain nova
15:25:29 installing gantt on another server is like installing nova-scheduler on it
15:26:34 toan-tran: gantt needs to be able to use nova like a library for now, not have it in the source tree. while nova-scheduler is installed with it all together
15:27:06 alaski: +1
15:27:17 alaski, what he said
15:27:19 ok, so basically we need to tell gantt to use nova code, not use its local lib
15:27:30 or import nova lib from somewhere else
15:27:49 toan-tran, for nova things, for scheduler code we want gantt to include the gantt code
15:29:19 n0ano thanks
15:29:23 garyk, in re a gantt of our gantt work - not yet, we're at the stage where we don't even know what all the tasks are until we do the task...
15:29:46 after we get the top 3 goals accomplished I think we can make a more detailed plan
15:30:45 which step integrates gantt with nova/keystone?
15:31:09 I mean, for now nova lists nova-scheduler in its service list, with host IP and all
15:31:35 but gantt is an outsider, so basically nova-conductor should contact keystone for its endpoint
15:31:43 will be running the same code, whether from the gantt tree or the nova tree, so that should work the same
15:32:20 note that n-sch already runs as a separate process, that doesn't change
15:32:46 n0ano just tapping in the rabbitmq channel?
15:33:12 should do the same, taps into the same channel
15:35:16 I think that's about all on this subject for now, since boris isn't here johnthetubaguy you wanted to talk about:
15:35:23 #topic caching scheduler
15:35:37 yeah, I have an idea/blueprint
15:35:52 https://blueprints.launchpad.net/nova/+spec/caching-scheduler
15:35:57 I have a review up
15:36:02 good to get your feedback
15:36:47 basic idea
15:36:53 do expensive things up front
15:37:56 one question, how does this fit with the current filters/weights, can you still specify a set of filters
15:38:01 johnthetubaguy: in certain cases we can cache things, but when there are changes to the system it becomes very difficult. do you have some doc describing what you are doing?
15:38:33 not really, the idea is it responds to scheduler update in the cache
15:38:39 my basic idea, let's try it
15:39:04 johnthetubaguy: interesting idea, will take a look
15:39:18 what kind of information/decision do you cache?
15:39:20 I hope we start having many drivers
15:39:24 that get documented
15:39:34 garyk, does raise a good question, what happens when your cache gets stale?
15:39:40 basically cache chosen hosts for specific partial request-specs
15:40:20 yeah. I chatted with johnthetubaguy about this in person last week, and what I really like is starting to have differentiated scheduler drivers that may have different features/characteristics.
15:40:59 alaski: meaning ?
15:41:07 alaski: if the workloads are homogeneous then that could be a nice solution.
15:41:24 n0ano: it has a periodic task to keep the cache fresh
15:42:02 toan-tran: something like, maybe I don't care about supporting affinity/anti-affinity so I can choose a driver that doesn't have it but is blazing fast. Or maybe I don't care about speed and can use a complex sat solver for all my placements, etc...
15:42:48 regarding having different drivers -- do we also expect the user-facing APIs to be different? e.g., today's filters/hints?
15:43:29 hopefully not
15:43:32 glikson: I would expect some change eventually. but I think that should happen anyways.
15:43:34 it might be some don't work
15:44:10 scheduler hints are a poor api right now, because there's no feedback on whether or not they were used
15:44:17 that's very true
15:45:02 one of the ideas we have been discussing internally was to separate the placement calculation logic from the logic/API specifying 'inputs' (constraints, etc). right now the two are rather tightly coupled..
15:45:38 my basic points are, let's not be afraid to break things, try things, but do it in different drivers
15:45:49 raising questions of latency and recovery from invalid cache entries, theoretically it could work but the devil is in the details.
15:46:23 glikson: I agree. I think there's a layer of abstraction missing from scheduling
15:47:09 glikson: that makes sense
15:47:40 hopefully having the code separately (whatever that means) would make it easier to evolve..
15:47:50 glikson +1
15:48:12 that's when i have to rethink of the gantt API
15:48:16 =))
15:48:22 true, when we have a gantt API then it gets better
15:49:48 do people fancy giving this caching a go then?
15:50:05 johnthetubaguy count me in
15:50:06 it's going to be experimental in icehouse
15:50:11 but it's worth a whirl
15:50:19 johnthetubaguy, sounds interesting to me, a proof of concept would be good
15:50:32 johnthetubaguy: i think that it is an interesting direction.
15:50:45 johnthetubaguy: yes sounds interesting
15:50:52 it is worth exploring and the fact that you have posted something gives us a way to play around with it
15:51:08 I know the current stuff is quite broken right now
15:51:14 but it gives you the idea
15:51:20 current scheduler or this :)
15:51:54 it would be great if you have a google doc, not necessarily too detailed
15:52:21 toan-tran, I'd prefer a wiki page, that's pretty standard for what we do.
15:52:45 yeah, I can do something like that
15:52:50 and attach to the blueprint
15:53:03 johnthetubaguy, +1
15:53:14 the commit message and blueprint has all the data I have at the moment
15:53:25 would it be possible that people please take a look at the anti affinity patch with the instance groups.
15:53:29 basically an idea, and I am testing it out
15:53:38 hopefully this week there will be the API's for v2 and v3.
15:54:10 garyk, are those APIs the last bits for this work?
15:54:12 garyk: do we have those blueprints sponsored by anyone yet?
15:54:44 johnthetubaguy: they were approved in havana. we missed the deadline by a week. so it is just carried through. not sure who is sponsoring though
15:54:59 garyk: I mean in terms of the nova blueprint process
15:55:20 i do not think that anyone is sponsoring this at the moment.
15:55:38 it was approved prior to the process discussed
15:55:45 any takers?
15:55:45 OK, not certain I have time, given all the other stuff, else I would offer, just wondered if anyone else was keen?
15:56:12 if we are open for general discussion, I have similar request for the multi-sched patches..
15:56:16 https://review.openstack.org/#/q/topic:bp/multiple-scheduler-drivers,n,z
15:56:44 glikson, I pinged people for reviews at the last nova meeting, looks like that didn't have any impact :-(
15:57:22 n0ano: thanks! we did have one new reviewer recently.
15:57:42 but the comments were mostly syntactical
15:57:53 * n0ano strains to get hand to pat back :-)
15:58:03 glikson: i have reviewed that patch a number of times and it is starting to look good
15:58:50 coming to the top of the hour, closing in 3...
15:58:50 glikson: i'll give it a try
15:59:06 2...
15:59:18 garyk: which patch exactly?
15:59:19 garyk: thanks. I hope that it is a matter of fixing small things, and we won't be surprised by major concerns few days/weeks before the feature-freeze..
15:59:21 the last time i saw it was not completed, so not sure what i've seen was kept ...
15:59:41 toan-tran: thanks!
16:00:09 1
16:00:16 tnx everyone, talk to you next week.
16:00:19 #endmeeting
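
Note on the caching scheduler topic: the log describes the idea only in outline (do expensive things up front, cache chosen hosts for specific partial request-specs, refresh with a periodic task). A rough sketch of that shape might look like the following. This is not the code under review in the blueprint; the class name, the pick_hosts callback, the max_age parameter and the spec_key tuple are all hypothetical names made up for illustration.

```python
import time
from collections import deque


class CachingSchedulerSketch:
    """Hypothetical illustration only; not nova or gantt code."""

    def __init__(self, pick_hosts, max_age=60.0, clock=time.monotonic):
        # pick_hosts stands in for the expensive filter/weigh pass:
        # it maps a partial request spec key to an ordered list of hosts.
        self._pick_hosts = pick_hosts
        self._max_age = max_age
        self._clock = clock
        self._cache = {}  # spec_key -> (timestamp, deque of hosts)

    def refresh(self, spec_key):
        """Do the expensive work up front and remember the result."""
        hosts = deque(self._pick_hosts(spec_key))
        self._cache[spec_key] = (self._clock(), hosts)
        return hosts

    def periodic_refresh(self):
        """What a periodic task could call to keep known entries fresh."""
        for spec_key in list(self._cache):
            self.refresh(spec_key)

    def select_host(self, spec_key):
        """Hand out the next cached host, recomputing if stale or used up."""
        entry = self._cache.get(spec_key)
        stale = entry is None or self._clock() - entry[0] > self._max_age
        if stale or not entry[1]:
            hosts = self.refresh(spec_key)
        else:
            hosts = entry[1]
        if not hosts:
            raise LookupError("no valid host for %r" % (spec_key,))
        return hosts.popleft()
```

The stale-cache question raised in the meeting shows up here as the max_age check plus the fallback to a fresh computation when the cached hosts run out; how a real driver handles invalid entries is exactly the detail the discussion left open.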