15:00:19 #startmeeting gantt
15:00:19 Meeting started Tue Apr 29 15:00:19 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:20 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:22 The meeting name has been set to 'gantt'
15:00:31 anyone here to talk about the scheduler?
15:00:33 o/
15:00:47 hi
15:00:55 o/
15:01:10 bauzas, again, tnx for covering for me the last couple of weeks, it's been crazy :-)
15:01:16 o/
15:01:16 hi
15:01:18 n0ano: no pb
15:01:19 ;)
15:01:40 #topic review action items
15:01:52 I believe there were just a couple from last week.
15:01:55 yup
15:02:09 merge sessions 262 & 140 - I believe you did that, right bauzas ?
15:02:33 n0ano: yup
15:02:41 we'll call it done then
15:02:42 n0ano: now the etherpad is up-to-date
15:02:58 #link https://etherpad.openstack.org/p/Gantt-summit-sessions
15:03:04 excellent, I looked it over, it's a good overview of the scheduler sessions
15:03:13 I still have to provide the times
15:03:29 but I was waiting to see the agenda finalized, as it's subject to changes
15:03:49 indeed, but I don't expect it to change too dramatically
15:03:54 think so too
15:04:00 what time will gantt sessions be scheduled?
15:04:04 on which day?
15:04:10 llu-laptop: see the etherpad
15:04:13 on Friday
15:04:16 Friday
15:04:21 llu-laptop, mostly on Fri morning, up until about 2PM
15:04:23 there is one interesting session on Tuesday
15:04:28 about Climate
15:04:54 important point is to not schedule a flight home until late Fri
15:05:08 * johnthetubaguy raises quite late hand
15:05:12 IMHO, the Climate session is worth it for discussing with Gantt
15:05:20 * n0ano high fives johnthetubaguy
15:05:59 sounds like we can encourage all the scheduler centric people to try and make that session then
15:06:19 this session on Wed is also related to scheduling: http://junodesignsummit.sched.org/event/a0d38e1278182eb09f06e22457d94c0c
15:07:16 glikson, not obvious from the session title but maybe we should include that on our etherpad
15:07:26 glikson: Yes, not sure if we can discuss how to migrate VM between clusters with only one nova compute
15:08:17 sorry, got disconnected :(
15:08:19 I think the scheduler session etherpad should be rather organic so anyone can update it with stuff they think is interesting
15:08:31 missed some parts, maybe
15:08:58 glikson, pointed out the http://junodesignsummit.sched.org/event/a0d38e1278182eb09f06e22457d94c0c session might be scheduler interesting
15:09:00 jay-lau-513: hopefully that stuff becomes more obvious after we agree how the clustered stuff works
15:09:20 n0ano: oh, right
15:09:30 n0ano: node vs host thing in the scheduler is certainly related
15:09:31 n0ano: then feel free to amend this etherpad :)
15:09:49 johnthetubaguy yes, hope so, at least it is an important feature for VMware now
15:10:02 as I said, I encourage everyone to update the etherpad as they think fit
15:10:23 I was just speaking about Hi peers,
15:10:23 Today is my last day at Bull. I truly appreciated working with you, in particular when looking at all the achievements we had with the XLCloud project.
Seeing all people going into the same direction warmed me, and I'm truly convinced that HPC and Remote Rendering coupled with OpenStack will be someday the next key winning areas.
15:10:23 I really enjoyed working on Climate with the help of INRIA, Bull and latter Mirantis, and seeing people like Intel or IBM interested in the project led me thinking that mixing upstream contributions to open-source projects and business deals are complementary.
15:10:23 I'll follow my path in another big OpenStack specialist, but I'm sure we'll meet again at some events as the OpenStack ecosystem is quite small and friendly.
15:10:23 You can reach me out by e-mail at sylvain.bauza@gmail.com or by IRC (Freenode) PM bauzas, I'm still keen to discuss with all of you ! ;-)
15:10:23 See you,
15:10:23 -Sylvain
15:10:29 woops
15:10:33 pastebomb
15:10:36 sorry all !
15:10:44 I was speaking about http://junodesignsummit.sched.org/event/a848270c309b10517d4d186fecaf768f#.U1_A1ea9h-M
15:10:44 bauzas, :-)
15:11:18 nevermind this above message, wrong c/p
15:11:33 let's see, the other action was to publish nova-spec for no-db scheduling BP, anyone know if this was done?
15:11:40 YorikSar ?
15:11:46 Hi
15:11:54 I didn't finish the spec yet.
15:12:05 ok, sure
15:12:10 YorikSar, NP, I'll just bug you again next week :-)
15:12:11 It seems that the blueprint is too old...
15:12:19 n0ano: maybe we should put it again as action
15:12:26 I guess I'll have to reword it completely.
15:12:32 oh
15:12:49 #action YorikSar to publish nova-spec for no-db scheduling BP
15:12:50 (but first I have to understand which part of the scope is still actual)
15:13:16 YorikSar, sounds like this might take you a little while, do you need more time?
15:13:20 do you feel you would need to wait for the Summit feedback ?
15:13:34 YorikSar: who will held the session btw. ?
15:13:43 is it boris-42 ?
15:13:49 I would do the spec to lead to a good session
15:13:53 s/held/hold
15:14:06 johnthetubaguy, +1
15:14:12 I think I'll be able to produce at least a draft for it in spite of long holidays in Russia.
15:14:16 nothing like something on paper to focus attention
15:14:42 bauzas: Yes, boris-42 is going to hold it. I'm going to work with him on my draft.
15:14:45 I think making the no-db scheduler pluggable is the key
15:14:55 I mean, optional
15:15:00 YorikSar, OK, I don't want to pressure you too much but, if you're comfortable getting a draft out relatively soon, that's great
15:15:39 johnthetubaguy, so you think nodb pluggability is only an option?
15:15:47 johnthetubaguy: Well... If we won't cut out old state synchronisation, adding another one would only increase the load on the system.
15:16:16 YorikSar: right, the idea I have, is being able to choose between the new and the old option
15:16:30 johnthetubaguy: +1
15:16:37 YorikSar: many of us deploy off trunk, so we need to keep trunk working as we try new options, thats my main worry
15:16:49 johnthetubaguy: From the user perspective there is no difference between them.
15:17:04 YorikSar: I like the no-db stuff, I just need to make sure its optional in the first release, so we can sort out smooth upgrades between the old and new world
15:17:16 YorikSar: its massive from the deployer perspective though
15:17:32 johnthetubaguy: Once you restart your cloud new scheduler will sync state of all working compute nodes just like the old one would but w/o global SELECTs
15:17:49 YorikSar: does that require memcache backend at least ?
15:17:56 bauzas: Nope
15:18:00 YorikSar: is there any external dep ?
15:18:04 YorikSar: agreed, but its quite an architectural change to force people to adopt
15:18:12 YorikSar, the issue is we want to choose when the new scheduler goes into use
15:18:16 It can with SQL
15:18:24 PaulMurray: +1
15:18:42 johnthetubaguy, PaulMurray: I'm not sure I understand the reason for that...
15:18:43 YorikSar: then it's worth having this as optional for Juno
15:19:01 New scheduler is basically an implementation improvement.
15:19:02 YorikSar, because we run large public services that we are very worried about
15:19:13 YorikSar, are you from mirantis?
15:19:15 YorikSar: that's a disruptive change
15:19:24 PaulMurray: Yes, I'm from Mirantis.
15:19:31 YorikSar: we need to be careful about the level of impact
15:19:33 YorikSar: the other way of looking at this, if someone comes up with some other idea for doing this, that could also be another option, I want to encourage innovation here, but also not destabilise things
15:19:43 YorikSar, then I am sure you would not like me to add a change that made you run your service differently
15:19:54 YorikSar, I think it's more a perception issue, it's a major change, even if hidden, so that can make deployers uncomforatble
15:20:07 s/uncomforatble/uncomfortable
15:20:57 YorikSar, I want you to understand that I am all for this change, but I do need to control when it happens in my infrastructure - its a risk issue for the business
15:21:20 n0ano: Oh... It's not a major change, actually. It's just an optimization of state synchronization within the service. If you don't want to add new deps, you can just run migrations and you won't notice it.
15:21:34 anyway, the idea is to write the spec and discuss it at summit time
15:21:56 bauzas, +1 - that's what the summit is for
15:22:15 Yeah. I hope Boris won't scare people any further...
15:22:16 if there is no consensus before the summit, then let the summit decide
15:22:25 YorikSar: :)
15:22:29 YorikSar, :-)
15:22:37 let's move on
15:22:38 yep, I am -2 on this till its optional, its just way too risky as a "big bang" change, but your spec may convince me otherwise, just it seems very risky
15:22:41 YorikSar: well, at least we know we have different view
15:22:46 views
15:23:10 moving on
15:23:13 bauzas: Yeah...
15:23:14 +1
15:23:34 +1 looking forward to the summit and reviewing the spec, but yeah, lets move on...
15:23:37 #topic forklift status
15:24:00 bauzas, I believe your client library BP is approved, anything else to add?
15:24:37 yup
15:24:40 what about the follow up blueprint? any news on that?
15:24:50 the Juno-2 stuff
15:24:59 #link https://review.openstack.org/89893
15:25:04 the proposal is there
15:25:22 cools, its in my queue now
15:25:26 I also reopened the draft implementation of Juno-1
15:25:33 #link https://review.openstack.org/82778
15:26:12 I will have limited time this week to work on
15:26:22 mostly holidays
15:26:38 but I'll resume my work by next week
15:27:01 all sounds good, thanks for your work on that other spec, glad to get that all approved
15:27:09 bauzas, is there anything you want help on, I think we can find some people
15:27:13 johnthetubaguy: thanks
15:27:37 n0ano: well, the amount of work for Juno-1 is not that huge
15:27:44 +1 I would like to help out with this effort, where possible, I guess its probably just reviewing at this point
15:27:59 bauzas: we can always start the Juno-2 work during Juno-1 if we get there early
15:28:11 n0ano: I'm just in the middle of changing my affiliation (see my bad c/p above...)
15:28:25 johnthetubaguy: agreed
15:28:37 bauzas, just let us know, I don't believe in assigning 9 women to create a baby in 1 month :-)
15:28:40 johnthetubaguy: but the Juno-2 stuff is maybe more discussional yet :)
15:28:54 n0ano: ;)
15:28:57 ture
15:29:00 true
15:29:12 n0ano: you can maybe look at the very old implem
15:29:32 n0ano: and review it to see some big misses
15:29:43 bauzas, sure, I can do that
15:30:04 #action n0ano to review the old client library implementation
15:30:04 n0ano: but I think we need to review the sched code to make sure we don't miss a crucial thing for the isolate-sched-db bp
15:30:26 n0ano: that bp is important
15:30:39 n0ano: and I don't want to miss an important thing
15:30:57 I am worried that there are some hidden dependencies in the code that will surprise us when we do the implementation but we'll deal with that as it happens
15:31:14 n0ano: in particular, I'm really interested in discussing how we store aggs info if we consider not to rely on extensible RT at the moment
15:31:40 n0ano: I went through all module imports
15:31:58 n0ano: to see what filters and managers were importing
15:32:14 n0ano: but maybe I missed some other things
15:32:17 n0ano: its sure to happen, I don't think we should worry too much, its just an extra blueprint
15:32:21 the way I see it...
15:32:38 fix how scheduler gets given info, and gets it
15:32:45 some stuff comes from compute
15:32:48 some from api request
15:32:59 do that first, then see where we end up
15:33:11 johnthetubaguy: well, specing bps are good for catching unnoticed things :)
15:33:17 I just always worry that the devil is in the details, but yes, we'll just solve the problems that arise
15:33:19 s/are/is
15:33:27 n0ano: +1
15:33:40 agreed, we identified issues, fix them in the bp, then repeat
15:33:51 I think we have identified two clear classes of problem
15:33:57 and two easy fixes for that issue
15:34:05 (I hope so)
15:34:12 well we have those two
15:34:15 johnthetubaguy: if the BP is approved quickly
15:34:17 lets run with them
15:34:22 and then see what we missed
15:34:39 yjiang51: yup, this breaks down if that drags on too long,
15:34:50 sure thing :)
15:35:13 n0ano, o/
15:35:24 OK, sounds like the forklift is making progress, anything else on this subject?
15:35:25 I am sure there are loose ends, but they are going to be much easier to spot after the currently planned clean up
15:35:38 PaulMurray: you had a question?
15:35:39 n0ano: nope
15:35:51 PaulMurray, go ahead
15:36:33 johnthetubaguy, i have a call sorry
15:37:21 PaulMurray, if you have something I should warn you I think we'll be ending soon
15:37:28 moving on
15:37:38 #topic summit sessions
15:38:00 I think we've already discussed the pretty well, anyone want to add anything?
15:38:22 s/the pretty/this pretty
15:38:47 what was that etherpad again?
15:38:57 https://etherpad.openstack.org/p/Gantt-summit-sessions
15:39:01 thanks
15:39:22 bauzas, beat me :-(
15:39:29 ideally, we should provide the sched.org links
15:39:41 +1 for sched.org links, when they go up
15:39:49 it's an etherpad, feel free to update it
15:40:08 moving on
15:40:09 * bauzas doing it
15:40:14 #opens
15:40:21 #topic opens
15:40:24 do we want to pre-discuss anything from that summit session list?
15:40:42 just wondering if there is any design prep to make the sessions as productive as possible
15:41:38 johnthetubaguy seems most of the topics have submitted their own nova-specs
15:41:42 scheduler hints, I think it should become the next generation server groups
15:42:06 jay-lau-513: cools, do you have a spec up for scheduler hints now?
15:42:07 johnthetubaguy: do you have any BP for your two-phase-commit idea for the claims?
15:42:18 johnthetubaguy: well, at least we should provide soon some detailed etherpads for each session
15:42:19 the gantt API & no-db sessions are well researched, not sure about the others
15:42:29 yjiang51: nope its more an implementation option
15:42:30 johnthetubaguy i want to know more about your thinking about scheduler hint and server group
15:42:52 once that's done, we should amend https://wiki.openstack.org/wiki/Summit/Juno/Etherpads#Nova
15:43:16 bauzas, +1
15:43:16 johnthetubaguy https://review.openstack.org/#/c/88983/
15:43:19 bauzas: good point, do we have etherpads with a session plan yet?
15:43:22 johnthetubaguy: you mean no BP needed for it?
15:43:32 I would suggest that the session proposer should set up an etherpad
15:43:36 johnthetubaguy: none I heard of
15:43:51 yjiang51: more the other way around, its an implementation for some blueprint in the future, not really needed by itself
15:44:06 johnthetubaguy: ok, got it.
15:44:06 n0ano: +1
15:44:14 jay-lau-513: thanks for the link, we can cover that here if we are done on other things
15:44:24 n0ano: +1 thats the usual way
15:44:41 n0ano: press the red 'action' button then ;)
15:44:59 * n0ano was searching for the right button :-)
15:45:04 jay-lau-513: basic plan, server groups (affinity/anti-affinity) is basically a scheduler hint on steroids, I feel scheduler hints should look very similar
15:45:06 #action ;)
15:45:15 #action session proposers to create etherpad for their session
15:45:40 jay-lau-513: at least I would love them to look the same
15:46:10 IIRC, we have YorikSar, jay-lau-513, mspreitz, devananda and me
15:46:26 as session proposers
15:46:32 johnthetubaguy, then would you propose we get rid of scheduler hints in favor of groups
15:46:40 johnthetubaguy currently, the server group uses a hint to specify the server group uuid as scheduler hint when creating a vm instance
15:47:10 n0ano: maybe, but I kinda like jaypipe's suggestion of making server groups a scheduler hint, I am cool with either way around really
15:47:12 johnthetubaguy do you want to enhance server group to use affinity/anti-affinity as scheduler hints?
15:47:35 so, basically I feel they should be the same, and easy to use
15:47:44 and they should be persisted
15:47:46 I just worry that if a is a superset of b when keep b around
15:48:07 s/when/then why
15:48:12 johnthetubaguy: they are similar in a sense that in both cases a placement policy can be derived.. but the semantics is a bit different, overall.
15:48:17 n0ano: sure, we might need to deprecate one of the APIs, or at least make them equivalent in the backend
15:48:24 johnthetubaguy I think that we can discuss JayPipes proposal for server group in the scheduler hint session
15:48:45 glikson: yes, but there is an overall more general thing that covers both I feel
15:49:07 jay-lau-513: +1
15:49:46 johnthetubaguy with JayPipe's proposal, we do not need to create server groups, but use scheduler hints to achieve affinity/anti-affinity with scheduler hints, it is good to persist them
15:50:12 bauzas I will update etherpad later to reflect the change
15:50:15 jay-lau-513: yep, I think thats the best idea at this point, but need to think about the use cases a bit more
15:50:42 not all hints should be "sticky", like build on host X probably should be overridden by a live-migrate to Y
15:50:58 what should happen with resize is an interesting one, but needs some thought
15:51:15 yes, sticky should be ignored
15:51:20 I was once told by some nova core that the scheduler hint was on its way to be deprecated. is it changed?
15:52:14 llu-laptop: maybe, not heard of the deprecated plan myself, we need something better though
15:53:07 it was said that it might not be very good to have those kind of thing affecting scheduling specified in the api call from end user
15:53:51 llu-laptop: thats basically why its needed, but maybe they were talking about a specific scheduler hint
15:54:25 llu-laptop, not sure I understand the, the end user always specifies things it needs (# cpus, mem, disk, ...)
15:54:44 johnthetubaguy: agree, I don't see why hints would be removed
15:54:58 n0ano: nope, the admin specify those in flavor and let the user to choose
15:55:33 llu-laptop: yeah, for some stuff, maybe, doesn't work for server group like hints
15:55:52 llu-laptop johnthetubaguy n0ano imho scheduler hint is an important feature for resource selection requirement, putting those in flavors is not convenient for operators as they may need to create many flavors for different requirement
15:56:05 llu-laptop, implicitly the user is requesting that, the admin always controls what the nodes in a cloud support
15:57:10 n0ano, - sorry all - I'm back, it was a call from my daughter's doctor - I had to take it
15:57:29 PaulMurray, hope she's OK, is your question quick?
15:57:45 n0ano, its ok, we're past that I can deal with it later
15:58:07 OK then, it's almost at the top of the hour so we'll have to close
15:58:27 tnx everyone, we'll talk again next week (last talk before the summit)
15:58:30 thanks all, bye
15:58:36 #endmeeting