21:03:28 <russellb> #startmeeting nova
21:03:29 <openstack> Meeting started Thu Jan 17 21:03:28 2013 UTC. The chair is russellb. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:03:30 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:03:32 <russellb> #chair vishy
21:03:33 <openstack> The meeting name has been set to 'nova'
21:03:34 <openstack> Current chairs: russellb vishy
21:03:38 <russellb> Howdy, nova folks
21:03:44 <ndipanov> hola
21:03:50 <rainya> hihi :)
21:03:51 <russellb> #link http://wiki.openstack.org/Meetings/Nova
21:04:09 <dansmith> meh.
21:04:18 <devananda> hiya
21:04:19 <vishy> hi!
21:04:20 <jog0> o/
21:04:21 <comstud> haaaaaai
21:04:24 <russellb> dansmith: aw, hype it up!
21:04:35 <comstud> is it Thursday already?
21:04:36 <alaski> hey
21:04:43 <rainya> and it's snowing here!
21:04:46 <dansmith> russellb: can't.. I'm beat down by the corporate machine today :(
21:04:55 <russellb> so, we have a couple of agenda items ... grizzly-3 review can easily eat up the whole time I bet, so perhaps we should start with the other
21:04:58 <russellb> dansmith: :(
21:05:07 <annegentle> rainya: Get. Out.
21:05:09 <russellb> vishy: up to you
21:05:17 <rainya> annegentle, i'm in blacksburg :D
21:05:24 <annegentle> rainya: LOL
21:05:27 <vishy> #topic Review grizzly-3 objectives
21:05:43 <vishy> okie so our grizzly-3 list is kind of insane
21:05:58 <russellb> #link https://launchpad.net/nova/+milestone/grizzly-3
21:06:33 <vishy> if you are assigned to any of these and you know it is not going to make it, let me know and i will untarget
21:07:25 <russellb> I think I'll untarget and unassign myself from rpc-based-servicegroup-driver ... I did some conductor related changes to do what we need for no-db-compute for now
21:07:26 <vishy> dansmith: can i untarget this from grizzly? https://blueprints.launchpad.net/nova/+spec/rpc-version-control
21:07:29 <russellb> that was my main motivation to work on
21:07:30 <russellb> it
21:07:44 <dansmith> vishy: actually,
21:07:51 <vishy> russellb: ok pull it off the grizzly list then?
21:08:03 <russellb> yeah, sounds good. i'll get it now
21:08:11 <dansmith> I was going to ask about that.. I think it's important for upgrade.. we need to have that a release before we want to be able to support rolling upgrades, right?
21:08:52 <dansmith> actually, wait, maybe I'm confusing this with another thing
21:09:15 <vishy> does anyone have any other blueprints that they know will miss?
21:10:17 <russellb> dansmith: you're right, it's important for upgrades in some cases anyway
21:10:28 <dansmith> russellb: well, I was thinking this would be a version discovery thing,
21:10:34 <devananda> openvz isn't assigned to me but i still haven't seen any movement on it
21:10:48 <devananda> so i think it's safe to assume that will miss
21:10:55 <dansmith> but now that I think of it, H could be taught to speak G messages, with a config flag until everything is upgraded and limp along,
21:10:56 <russellb> mainly the case where you can not clearly upgrade the server side before client (like computes talking to each other)
21:11:05 <russellb> right
21:11:07 <dansmith> but it would be nicer if we could do something smarter
21:11:07 <dansmith> right
21:11:23 <ndipanov> I have this: https://blueprints.launchpad.net/nova/+spec/improve-boot-from-volume that I am hoping to make but am becoming less optimistic
21:11:25 <russellb> not sure it's realistic to think even that will happen for G though
21:11:52 <vishy> ok i'm untagging stuff
21:12:01 <ndipanov> If I suspect it'll slip I will inform you - leave for now please
21:12:02 <vishy> ndipanov: would love to see that in
21:12:17 <ndipanov> vishy, if bdm goes well it'll be easy
21:12:18 <devananda> for general-baremetal-provisioning-framework, enough of that is now implemented that I am inclined to mark that BP as "implemented" and start opening other BPs for more specific changes, like quantum integration, more scalable deploys, etc.
21:12:24 <devananda> lifeless: thoughts on ^ ?
21:12:24 <ndipanov> vishy, but I like to underpromise
21:12:32 <ndipanov> vishy, leave it for now...
21:12:39 <russellb> dansmith: how far off do you think we are with no-db-compute? i haven't done a good audit lately
21:12:44 <lifeless> devananda: worksforme
21:13:03 <dansmith> russellb: I dunno, I've been meaning to do one.. I think we're actually making pretty good progress
21:13:10 <dansmith> russellb: I can try to have an audit for next week though if you want
21:13:17 <russellb> dansmith: yeah that's my gut feeling, too
21:13:24 <dansmith> I think I keep putting it off, expecting it will make me depressed :)
21:13:48 <russellb> dansmith: that'd be great. you know i was thinking ... we could throw up a WIP patch that intentionally breaks the db API for nova-compute ... and look at test results to find stuff :)
21:13:59 <dansmith> russellb: yeah
21:14:23 <dansmith> russellb: also, if you could stop racing with my rpc api version changes, that'd help :D
21:14:32 <russellb> dansmith: :-p
21:14:44 <russellb> dansmith: really need to figure out a better way to handle that that isn't so conflict heavy ...
21:15:01 <vishy> #topic API-v3 and API-v2 coexistence options
21:15:03 <russellb> vishy: i'm at a conference for the rest of the weekend, so if not sooner, i can continue to help with g-3 blueprint cleanup next week
21:15:03 <vishy> hehe
21:15:07 <dansmith> yeah, I've thought about it
21:15:19 <vishy> who is giving the update here
21:15:23 <vishy> sdague, maurosr ?
21:15:26 <russellb> maurosr: ^^
21:15:26 <dansmith> maurosr:
21:15:31 <maurosr> hi
21:15:41 * russellb imagines maurosr's IRC client blowing up
21:16:00 <maurosr> the main question is: how are we going to allow this coexistence?
21:16:27 <maurosr> I have a huge patch with 13k lines (split into ~26 that I didn't push) with only copies
21:16:56 <russellb> 13k lines of copied code does not seem ideal :-/
21:17:08 <dansmith> russellb: let him finish :)
21:17:15 * russellb shuts up
21:17:21 <dansmith> russellb: he's prepping everyone to love his alternative plan :D
21:17:28 <russellb> lol
21:17:31 <maurosr> I don't like it.. so I thought: enable api-v2 and v3 then make v2 inherit v3 (the default api) stuff and only if you want to change something in api-v3 you will need to specify those changes
21:17:33 <rainya> hahahaha
21:18:01 <maurosr> so new classes in a v2 tree that inherit from the v3 ones
21:18:33 <maurosr> changes on v3 will break v2 so change a method on v3 and you will need to copy it back to v2 class
21:18:39 <maurosr> that will avoid the copy
21:18:39 <dansmith> maurosr: the only problem I see is that someone will go in and make a change to the v3 one, and affect the v2 one without realizing it, unless we have really good api_samples testing
21:18:57 <maurosr> yes
21:19:24 <dansmith> the opposite of v3 inheriting from v2 would be a little safer in that regard
21:19:29 <vishy> maurosr: alternative plan
21:19:30 <maurosr> that is what I thought: completing api samples bp and we will be sure that no format and response code changed
21:19:31 <dansmith> but becomes a bit messier over the long run
21:19:47 <dansmith> however, since v3 is short-lived and not expected to evolve much, it might be better
21:19:55 <vishy> symlic all of the files, if we want to change things we delete the symlink and use a real copy instead and modify it
21:20:02 <vishy> *symlink
21:20:11 <maurosr> dansmith: yeah, we can make it too... but eventually when we remove v2, it will be a bunch of code copied into v3
21:20:18 <dansmith> vishy: that won't work on, say, a win32 api host, right?
21:20:32 <vishy> dansmith: we don't have those yet
21:20:35 <vishy> :)
21:20:46 <dansmith> I figured *someone* was doing that
21:21:12 <russellb> don't want to knowingly prohibit it ...
21:21:19 <maurosr> so symlinks it is?
21:21:33 <dansmith> if we don't care, then the symlink approach is probably a more explicit divorce of the two as soon as they diverge,
21:21:52 <dansmith> but it seems like we'll eventually end up with all 13k lines changed in the long run
21:22:21 <maurosr> dansmith: not if the links are in v2 right?
21:23:05 <maurosr> btw gerrit will ignore the links in the review?
21:23:07 <dansmith> maurosr: you'll still end up copying v3 -> v2 every time you make a small change to v3
21:23:40 <maurosr> yeah, but at least the guy will see a weird change on v2 that he didn't make
21:24:20 <maurosr> so the copy will be smooth
21:24:42 <dansmith> vishy: so, you think symlinks over inheritance?
21:24:46 <maurosr> exactly what I was thinking when I suggested the inheritance stuff
21:25:03 <vishy> dansmith: the inheritance part means we can't ever remove the v2 version easily
21:25:18 <dansmith> vishy: not if v2 inherits from v3
21:25:29 <maurosr> yes ^
21:25:33 <russellb> v2 inheriting from v3 seems so unnatural
21:25:34 <vishy> dansmith: that sounds painful
21:25:41 <maurosr> it's weird but avoids problems
21:25:49 <dansmith> vishy: that was his proposal
21:25:54 <vishy> dansmith, maurosr: when we initially did v1 and v1.1 we used inheritance
21:25:58 <dansmith> I agree that it feels unnatural
21:26:06 <vishy> and it was a maintenance nightmare
21:26:18 <dansmith> okay, well, experience trumps speculation
21:26:23 <vishy> I don't want to have to repeat that if we can avoid it
21:26:28 <dansmith> maurosr: so wanna do symlinks and see how it works?
21:26:49 <maurosr> I liked the symlinks cause we can just ban any commit on v2 tree so if the developer doesn't notice the change on v2 the tests will
21:27:12 <maurosr> dansmith: i think it will work fine, will submit it by the end of day
21:27:19 <dansmith> maurosr: okay
21:28:14 <vishy> i have another question
21:28:27 <vishy> is it reasonable to think that this will be in grizzly?
21:28:33 <vishy> it seems too late
21:28:46 <rmk> I'd say it's too late
21:28:47 <dansmith> well,
21:29:03 <vishy> I'm thinking at this point we do v3 immediately after grizzly
21:29:04 <rmk> it's just going to leave an incomplete project in grizzly
21:29:05 <dansmith> I think the audit part took longer than expected, along with other usual delays
21:29:08 <rmk> vishy: i agree
21:29:11 <maurosr> sorry but I don't know the dates
21:29:18 <dansmith> we've been feeling the pressure, but have been trying to make it happen
21:29:31 <dansmith> our thought was potentially to get what we can in by grizzly,
21:29:35 <vishy> dansmith: I don't see the point of releasing a v3 api until we've had time to solidify it
21:29:43 <rmk> ^^
21:29:43 <dansmith> but perhaps not call it stable until later
21:29:44 <uvirtbot> rmk: Error: "^" is not a valid command.
21:29:52 <rmk> uvirtbot: thanks smart ass bot
21:29:53 <uvirtbot> rmk: Error: "thanks" is not a valid command.
21:29:55 <rmk> hah
21:30:14 <maurosr> when is the deadline?
21:30:19 <rmk> I'd make v3 api as a whole an H target
21:30:27 <rmk> That would be my vote if I had one
21:30:55 <rmk> maurosr: We're about 2 weeks out from g3, after which it's pretty much lockdown for bugfixes
21:31:15 <dansmith> is it really that close?
21:31:39 <vishy> rmk: I agree. There is no point in complicating things before the g3 release
21:31:42 <russellb> feb 21 IIRC
21:31:46 <vishy> * grizzly release
21:31:50 <maurosr> ok, I know that I can assume that v3 will be stable until there.. but maybe include it as beta?
21:31:57 <russellb> #link http://wiki.openstack.org/GrizzlyReleaseSchedule
21:31:58 <rmk> OK so I'm off by 2 weeks -- still not enough time to solidify a whole new version of the API
21:32:04 <dansmith> rmk: I could five :)
21:32:08 <maurosr> cause we have some extensions changes depending on that
21:32:12 <dansmith> rmk: it's not a new version
21:32:29 <vishy> maurosr: the issue is I don't want people to start expecting api compatibility
21:32:40 <dansmith> okay, so it sounds like we need to hold off on committing anything then, right?
21:32:59 <maurosr> right
21:33:02 <vishy> dansmith: I would say let's hold off until we open up H
21:33:06 <dansmith> okay
21:33:07 <russellb> sounds like a good idea to me ...
21:33:14 <dansmith> a couple of us will get fired over this, but that's okay
21:33:24 * maurosr runs
21:33:32 <russellb> O.O
21:33:41 <dansmith> hmm, this is logged.. JUST KIDDING
21:33:46 <comstud> lol
21:33:47 <rmk> haha
21:34:02 <dansmith> only maurosr will get fired
21:34:10 <dansmith> dammit.. JUST KIDDING
21:34:19 <maurosr> hehe
21:34:26 <dansmith> okay, so that's the end of that
21:34:35 <maurosr> one more thing
21:34:48 <maurosr> well nm.. it's not for g3
21:35:14 <vishy> ok
21:35:22 <vishy> #topic bugs
21:35:46 <vishy> #link http://webnumbr.com/untouched-nova-bugs
21:35:57 <vishy> I went through a few yesterday and found some interesting ones
21:36:12 <vishy> i suspect we also have a lot of open bugs
21:36:25 <comstud> I looked through a number the other day, but didn't find time to really dig into them
21:36:42 <comstud> lots in areas I'm not completely familiar with
21:37:05 <russellb> i triaged a few at least
21:37:08 <comstud> libvirt, networking :)
21:37:12 <vishy> yup 466
21:37:24 <vishy> 466 in new/confirmed/triaged
21:37:27 <vishy> that is a lot
21:37:42 <vishy> i suppose we have a whole month to close those guys after g3
21:37:45 <rmk> I'll take a look at some of the libvirt ones
21:37:54 <russellb> was up over 50 at the beginning of the week
21:37:56 <vishy> but if anyone feels like switching over to bug fixing i don't mind!
21:38:13 <rmk> I'm still a relative noob to anything outside that driver so may as well
21:38:19 <vishy> also: https://bugs.launchpad.net/nova/+bugs?field.tag=folsom-backport-potential
21:38:30 <vishy> there are some there that need to be fixed and backported
21:40:03 <vishy> #topic open discussion
21:40:07 <jog0> https://bugs.launchpad.net/nova/+bug/1098380
21:40:09 <uvirtbot> Launchpad bug 1098380 in nova "Quotas showing in use when no VMs are running" [Critical,Confirmed]
21:40:26 <jog0> not sure best way to fix that bug
21:40:27 <comstud> Please review the last cells patches I have up: 15234, 15235, 15236, 16221
21:40:30 <comstud> :)
21:41:04 <dansmith> I'm out next week, so don't expect me to be around for any no-db-compute fires I may start this week :D
21:41:20 <comstud> Yeah, also: I'll be tied up in all day meetings starting next week for 2 weeks
21:41:23 <comstud> and I'm OOTO tomorrow
21:41:35 <comstud> So I'll be around less
21:41:52 <dansmith> comstud: I was only worried about you hunting me down, so this'll work out great
21:41:56 <comstud> lol
21:42:05 <comstud> I'm sure I'll still hear about your bugs
21:42:10 <dansmith> heh
21:42:15 <comstud> I'll just hunt you down at different hours
21:42:20 <dansmith> I'll give you my phone number
21:42:27 <comstud> perfect
21:42:28 <russellb> nice
21:42:29 <dansmith> just let me look up the pizza joint around the block
21:42:36 <comstud> :)
21:42:49 <vishy> jog0: can you give an overview?
21:42:54 <jog0> sure
21:43:21 <vishy> comstud: can you poke Vek? He might have some ideas
21:43:22 <russellb> dansmith: i got yo' back!
21:43:25 <jog0> that bug has a small script that makes the quotas fail
21:43:37 <comstud> vishy: ideas about?
21:43:43 <comstud> I'm sure he has ideas
21:43:50 <dansmith> russellb: heh :)
21:43:51 <vishy> comstud: jog0 's bug with quotas
21:43:54 <comstud> Oh ok
21:44:03 <jog0> you can get stuck in a situation where quota usage is higher than quota limits
21:44:06 <russellb> why the world is broken
21:44:15 <russellb> i mean quotas
21:44:29 <comstud> I let him know.. if he sees my msg
21:44:53 <jog0> this happens when quotas go negative (if you have two actions running at once they mess up), they reset but ignore VM state
21:45:22 <jog0> so if you have 10 VMs in error state they get counted as used
21:45:24 <jog0> in quota land
21:45:53 <comstud> VMs in error... maybe were meant to be counted
21:46:03 <comstud> because we can have VMs go to ERROR for 'all of the things'
21:46:17 <comstud> meaning.. they can go to ERROR later if some random task fails
21:46:49 <jog0> comstud: right, but the problem is you can force the quota usage to go higher than the actual quota
21:47:01 <jog0> additionally you can make it go negative
21:47:21 <comstud> Yea, something doesn't sound right there :)
21:47:28 <jog0> spin a VM up, delete vm while spinning up
21:47:41 <jog0> wait till task state goes away from deleting, redelete
21:47:50 <jog0> and you decrement the quota by 2
21:47:54 <jog0> instead of by 1
21:48:06 <jog0> only 1 task state per vm
21:48:16 <jog0> so when you have two tasks they fight for the task_state
21:48:28 <comstud> yep
21:48:45 <vishy> jog0: so do you have any ideas about how we might fix it?
21:49:33 <jog0> so right now quotas have their own table tracking usage
21:49:44 <comstud> one idea (i don't like).. associate instances with the quota entries
21:49:53 <comstud> maybe I don't like
21:50:11 <comstud> there's not really quota entries per instance anyway
21:50:14 <jog0> we could just look at the instances table and do the logic on the fly when trying to allocate a resource
21:50:29 <jog0> so don't have to deallocate when resource fails
21:50:32 <comstud> that's how it was before
21:50:55 <jog0> comstud: why did it change?
21:50:56 <comstud> and it was racy... not sure for worse or better than now
21:50:57 <comstud> :)
21:51:05 <jog0> comstud: :/
21:51:09 <comstud> there was no atomic check-and-set
21:51:55 <jog0> can't the db lookup be made atomic?
21:52:43 <comstud> sure, if you move quota checking perhaps inside of db.instance_create()...and the other methods where quotas come into play
21:53:02 <comstud> it'll affect performance
21:53:47 <jog0> I would rather take a performance hit if it makes them work all the time
21:54:35 <jog0> comstud: can you give an example of the kinds of race conditions we used to have?
21:54:43 <vishy> right now we have to do a periodic cleanup of quota calculations
21:54:50 <vishy> which is fugly
21:55:01 <comstud> jog0: The code before was:
21:55:04 <comstud> on new build:
21:55:11 <jog0> vishy: the cleanup code is broken too though
21:55:18 <comstud> 1) Will this pass quotas?
21:55:24 <russellb> endmeeting?
21:55:30 <comstud> 2) Carry on...
21:55:34 <comstud> 3) create instance
21:55:46 <jog0> vishy: https://github.com/openstack/nova/blob/master/nova/quota.py#L1036
21:55:47 <comstud> you could sneak a number of builds into step 2
21:55:52 <comstud> surpassing the quotas
21:56:03 <jog0> comstud: ahh
21:56:07 <comstud> 1 and 3 were not atomic
21:56:55 <jog0> what about counting vms in building state in step 1?
21:57:08 <jog0> and now that the DB entry is always done in API
21:57:14 <comstud> so this new quota system was generated to be rather generic.. and its checking and setting operations were atomic
21:57:35 <comstud> But yeah, it fails when you delete an instance.. reset state, delete an instance again...
21:57:40 <comstud> and all sorts of cases you mention
21:58:02 <vishy> yeah we can carry on in dev
21:58:08 <comstud> ya
21:58:08 <vishy> #endmeeting nova
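
Note on the quota discussion: the old race comstud describes (1: will this pass quotas, 2: carry on, 3: create instance, with 1 and 3 not atomic) and the double delete jog0 describes (usage decremented by 2 instead of 1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not nova's quota code: the `QuotaTracker` class, its method names, and the in-memory counter are all hypothetical, and a `threading.Lock` stands in for whatever atomic check-and-set the database layer would provide. Tracking instance ids on release is one possible way to keep a repeated delete harmless; nobody in the meeting settled on it.

```python
import threading


class QuotaTracker:
    """Hypothetical in-memory stand-in for a per-tenant instance quota."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0
        self._lock = threading.Lock()
        self._live = set()  # instance ids currently counted against the quota

    def reserve_racy(self, instance_id, pause=None):
        # Step 1: "Will this pass quotas?"
        if self.in_use >= self.limit:
            return False
        # Step 2: "Carry on..." (other builds can sneak in here).
        if pause is not None:
            pause()
        # Step 3: create the instance and count it.
        self.in_use += 1
        self._live.add(instance_id)
        return True

    def reserve_atomic(self, instance_id):
        # Check and set under one lock, so no build can slip in between.
        with self._lock:
            if self.in_use >= self.limit:
                return False
            self.in_use += 1
            self._live.add(instance_id)
            return True

    def release(self, instance_id):
        # Idempotent release: deleting the same instance twice must not
        # decrement usage a second time.
        with self._lock:
            if instance_id not in self._live:
                return False
            self._live.remove(instance_id)
            self.in_use -= 1
            return True


def demo():
    racy = QuotaTracker(limit=1)
    # A second build runs to completion while the first sits in step 2.
    racy.reserve_racy("vm-a", pause=lambda: racy.reserve_racy("vm-b"))

    safe = QuotaTracker(limit=1)
    results = [safe.reserve_atomic("vm-a"), safe.reserve_atomic("vm-b")]
    safe.release("vm-a")
    second_delete = safe.release("vm-a")  # repeated delete of the same VM
    return racy.in_use, results, safe.in_use, second_delete
```

Calling `demo()` shows the racy path ending with usage above the limit, while the atomic path rejects the second build and the repeated delete leaves usage at zero rather than negative. The interleaving is simulated with a callback instead of real threads so the outcome does not depend on scheduling.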