21:03:28 #startmeeting nova
21:03:29 Meeting started Thu Jan 17 21:03:28 2013 UTC. The chair is russellb. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:03:30 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:03:32 #chair vishy
21:03:33 The meeting name has been set to 'nova'
21:03:34 Current chairs: russellb vishy
21:03:38 Howdy, nova folks
21:03:44 hola
21:03:50 hihi :)
21:03:51 #link http://wiki.openstack.org/Meetings/Nova
21:04:09 meh.
21:04:18 hiya
21:04:19 hi!
21:04:20 o/
21:04:21 haaaaaai
21:04:24 dansmith: aw, hype it up!
21:04:35 is it Thursday already?
21:04:36 hey
21:04:43 and it's snowing here!
21:04:46 russellb: can't.. I'm beat down by the corporate machine today :(
21:04:55 so, we have a couple of agenda items ... grizzly-3 review can easily eat up the whole time I bet, so perhaps we should start with the other
21:04:58 dansmith: :(
21:05:07 rainya: Get. Out.
21:05:09 vishy: up to you
21:05:17 annegentle, i'm in blacksburg :D
21:05:24 rainya: LOL
21:05:27 #topic Review grizzly-3 objectives
21:05:43 okie so our grizzly-3 list is kind of insane
21:05:58 #link https://launchpad.net/nova/+milestone/grizzly-3
21:06:33 if you are assigned to any of these and you know it is not going to make it, let me know and i will untarget
21:07:25 I think I'll untarget and unassign myself from rpc-based-servicegroup-driver ... I did some conductor related changes to do what we need for no-db-compute for now
21:07:26 dansmith: can i untarget this from grizzly? https://blueprints.launchpad.net/nova/+spec/rpc-version-control
21:07:29 that was my main motivation to work on it
21:07:44 vishy: actually,
21:07:51 russellb: ok pull it off the grizzly list then?
21:08:03 yeah, sounds good. i'll get it now
21:08:11 I was going to ask about that.. I think it's important for upgrade.. we need to have that a release before we want to be able to support rolling upgrades, right?
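The rolling-upgrade concern behind rpc-version-control is that an upgraded service must keep emitting messages an older service can still parse. A minimal sketch of the idea, with entirely hypothetical names (this is not Nova's actual rpc code), could look like this:

```python
# Illustrative sketch (not Nova's actual rpc code) of capping RPC
# message versions for a rolling upgrade: a client proxy normally
# sends the latest message version it knows, but an operator-set cap
# (a config flag during the upgrade window) makes upgraded nodes keep
# speaking the older version until the whole deployment is upgraded.
LATEST_VERSION = "2.4"   # what this upgraded node can speak
OLD_VERSION = "2.0"      # what not-yet-upgraded nodes understand


class ComputeAPIProxy:
    def __init__(self, version_cap=None):
        # version_cap would come from configuration during the upgrade.
        self.version_cap = version_cap

    def _pick_version(self):
        if self.version_cap is None:
            return LATEST_VERSION
        # Send the lower of (what we speak, what the cap allows),
        # comparing versions numerically rather than as strings.
        return min(LATEST_VERSION, self.version_cap,
                   key=lambda v: tuple(int(x) for x in v.split(".")))

    def cast(self, method, **kwargs):
        return {"method": method, "version": self._pick_version(),
                "args": kwargs}


print(ComputeAPIProxy().cast("run_instance", host="c1"))
print(ComputeAPIProxy(version_cap=OLD_VERSION).cast("run_instance", host="c1"))
```

With the cap unset, messages go out at the latest version; with the cap set to the old version, every message is pinned down until the flag is removed.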
21:08:52 actually, wait, maybe I'm confusing this with another thing
21:09:15 does anyone have any other blueprints that they know will miss?
21:10:17 dansmith: you're right, it's important for upgrades in some cases anyway
21:10:28 russellb: well, I was thinking this would be a version discovery thing,
21:10:34 openvz isn't assigned to me but i still haven't seen any movement on it
21:10:48 so i think it's safe to assume that will miss
21:10:55 but now that I think of it, H could be taught to speak G messages, with a config flag until everything is upgraded and limp along,
21:10:56 mainly the case where you cannot cleanly upgrade the server side before the client (like computes talking to each other)
21:11:05 right
21:11:07 but it would be nicer if we could do something smarter
21:11:07 right
21:11:23 I have this: https://blueprints.launchpad.net/nova/+spec/improve-boot-from-volume that I am hoping to make but am becoming less optimistic
21:11:25 not sure it's realistic to think even that will happen for G though
21:11:52 ok i'm untargeting stuff
21:12:01 If I suspect it'll slip I will inform you - leave for now please
21:12:02 ndipanov: would love to see that in
21:12:17 vishy, if bdm goes well it'll be easy
21:12:18 for general-baremetal-provisioning-framework, enough of that is now implemented that I am inclined to mark that BP as "implemented" and start opening other BPs for more specific changes, like quantum integration, more scalable deploys, etc.
21:12:24 lifeless: thoughts on ^ ?
21:12:24 vishy, but I like to underpromise
21:12:32 vishy, leave it for now...
21:12:39 dansmith: how far off do you think we are with no-db-compute? i haven't done a good audit lately
21:12:44 devananda: worksforme
21:13:03 russellb: I dunno, I've been meaning to do one.. I think we're actually making pretty good progress
21:13:10 russellb: I can try to have an audit for next week though if you want
21:13:17 dansmith: yeah that's my gut feeling, too
21:13:24 I think I keep putting it off, expecting it will make me depressed :)
21:13:48 dansmith: that'd be great. you know i was thinking ... we could throw up a WIP patch that intentionally breaks the db API for nova-compute ... and look at test results to find stuff :)
21:13:59 russellb: yeah
21:14:23 russellb: also, if you could stop racing with my rpc api version changes, that'd help :D
21:14:32 dansmith: :-p
21:14:44 dansmith: really need to figure out a better way to handle that that isn't so conflict heavy ...
21:15:01 #topic API-v3 and API-v2 coexistence options
21:15:03 vishy: i'm at a conference for the rest of the weekend, so if not sooner, i can continue to help with g-3 blueprint cleanup next week
21:15:03 hehe
21:15:07 yeah, I've thought about it
21:15:19 who is giving the update here
21:15:23 sdague, maurosr ?
21:15:26 maurosr: ^^
21:15:26 maurosr:
21:15:31 hi
21:15:41 * russellb imagines maurosr's IRC client blowing up
21:16:00 the main question is: how are we going to allow this coexistence?
21:16:27 I have a huge patch with 13k lines (split into ~26 that I didn't push) with only copies
21:16:56 13k lines of copied code does not seem ideal :-/
21:17:08 russellb: let him finish :)
21:17:15 * russellb shuts up
21:17:21 russellb: he's prepping everyone to love his alternative plan :D
21:17:28 lol
21:17:31 I don't like it.. so I thought: enable api-v2 and v3, then make v2 inherit the v3 (the default api) stuff, and only if you want to change something in api-v3 you will need to specify those changes
21:17:33 hahahaha
21:18:01 so new classes in a v2 tree that inherit from the v3 ones
21:18:33 changes on v3 will break v2, so change a method on v3 and you will need to copy it back to the v2 class
21:18:39 that will avoid the copy
21:18:39 maurosr: the only problem I see is that someone will go in and make a change to the v3 one, and affect the v2 one without realizing it, unless we have really good api_samples testing
21:18:57 yes
21:19:24 the opposite of v3 inheriting from v2 would be a little safer in that regard
21:19:29 maurosr: alternative plan
21:19:30 that is what I thought: completing the api samples bp and we will be sure that no format and response code changed
21:19:31 but becomes a bit messier over the long run
21:19:47 however, since v3 is short-lived and not expected to evolve much, it might be better
21:19:55 symlink all of the files, if we want to change things we delete the symlink and use a real copy instead and modify it
21:20:11 dansmith: yeah, we can make it too... but eventually when we remove v2, it will be a bunch of code copied into v3
21:20:18 vishy: that won't work on, say, a win32 api host, right?
21:20:32 dansmith: we don't have those yet
21:20:35 :)
21:20:46 I figured *someone* was doing that
21:21:12 don't want to knowingly prohibit it ...
21:21:19 so symlinks it is?
21:21:33 if we don't care, then the symlink approach is probably a more explicit divorce of the two as soon as they diverge,
21:21:52 but it seems like we'll eventually end up with all 13k lines changed in the long run
21:22:21 dansmith: not if the links are in v2 right?
21:23:05 btw gerrit will ignore the links in the review?
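The "v2 inherits v3" idea being debated above could be sketched roughly like this. The controller and method names here are purely illustrative (Nova's real API controllers are far more involved); the point is only the inheritance direction and the risk dansmith raises:

```python
# Hypothetical sketch of "v2 inherits v3": the v3 controller is the
# default implementation, and the v2 tree overrides only the spots
# where the old behavior must be frozen. Names are illustrative, not
# Nova's actual classes.

class ServersControllerV3:
    """Default (v3) behavior."""

    def show(self, server_id):
        # v3 response format
        return {"server": {"id": server_id, "status": "ACTIVE"}}

    def delete(self, server_id):
        return {"deleted": server_id}


class ServersControllerV2(ServersControllerV3):
    """v2 tree: inherits everything from v3, overriding only what
    differs.

    The risk discussed in the meeting: if someone edits
    ServersControllerV3.delete(), v2 silently changes too, unless
    api_samples-style tests pin the v2 responses down."""

    def show(self, server_id):
        # Frozen copy of the old v2 format.
        return {"server": {"uuid": server_id, "state": "active"}}


v2 = ServersControllerV2()
print(v2.show("abc"))    # overridden: old v2 format preserved
print(v2.delete("abc"))  # inherited straight from v3
```

When v2 is eventually removed, only the small `ServersControllerV2` subclass is deleted and v3 is untouched, which is why this direction is easier to unwind than v3-inherits-v2.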
21:23:07 maurosr: you'll still end up copying v3 -> v2 every time you make a small change to v3
21:23:40 yeah, but at least the guy will see a weird change on v2 that he didn't make
21:24:20 so the copy will be smooth
21:24:42 vishy: so, you think symlinks over inheritance?
21:24:46 exactly what I was thinking when I suggested the inheritance stuff
21:25:03 dansmith: the inheritance part means we can't ever remove the v2 version easily
21:25:18 vishy: not if v2 inherits from v3
21:25:29 yes ^
21:25:33 v2 inheriting from v3 seems so unnatural
21:25:34 dansmith: that sounds painful
21:25:41 it's weird but avoids problems
21:25:49 vishy: that was his proposal
21:25:54 dansmith, maurosr: when we initially did v1 and v1.1 we used inheritance
21:25:58 I agree that it feels unnatural
21:26:06 and it was a maintenance nightmare
21:26:18 okay, well, experience trumps speculation
21:26:23 I don't want to have to repeat that if we can avoid it
21:26:28 maurosr: so wanna do symlinks and see how it works?
21:26:49 I liked the symlinks cause we can just ban any commit on the v2 tree, so if the developer doesn't notice the change on v2 the tests will
21:27:12 dansmith: i think it will work fine, will submit it by the end of the day
21:27:19 maurosr: okay
21:28:14 i have another question
21:28:27 is it reasonable to think that this will be in grizzly?
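The symlink plan the meeting settles on, modeled with hypothetical paths (this is not Nova's real tree layout), works like this: v2 files start out as links into the v3 tree, and a file is forked into a real copy only at the moment it must diverge.

```python
# Illustrative model of the symlink approach (hypothetical paths, not
# Nova's actual tree): api/v2 files are symlinks into api/v3 until a
# file needs to diverge, at which point the link is replaced by a real
# copy that can be edited freely. Note os.symlink may need extra
# privileges on Windows -- the win32 concern raised in the meeting.
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
v3 = os.path.join(root, "api", "v3")
v2 = os.path.join(root, "api", "v2")
os.makedirs(v3)
os.makedirs(v2)

with open(os.path.join(v3, "servers.py"), "w") as f:
    f.write("RESPONSE = 'v3 format'\n")

# Initially v2/servers.py is just a relative link into v3 --
# zero duplicated lines of code.
link = os.path.join(v2, "servers.py")
os.symlink(os.path.join("..", "v3", "servers.py"), link)
assert os.path.islink(link)

# When v2 has to diverge, drop the link and keep a real, editable copy.
os.unlink(link)
shutil.copy(os.path.join(v3, "servers.py"), link)
with open(link, "a") as f:
    f.write("RESPONSE = 'v2 format'\n")

print(os.path.islink(link))  # False: now a regular file free to drift
```

The "explicit divorce" dansmith mentions is visible in the review: the change from symlink to regular file shows up the instant the two trees diverge.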
21:28:33 it seems too late
21:28:46 I'd say it's too late
21:28:47 well,
21:29:03 I'm thinking at this point we do v3 immediately after grizzly
21:29:04 it's just going to leave an incomplete project in grizzly
21:29:05 I think the audit part took longer than expected, along with other usual delays
21:29:08 vishy: i agree
21:29:11 sorry but I don't know the dates
21:29:18 we've been feeling the pressure, but have been trying to make it happen
21:29:31 our thought was potentially to get what we can in by grizzly,
21:29:35 dansmith: I don't see the point of releasing a v3 api until we've had time to solidify it
21:29:43 ^^
21:29:43 but perhaps not call it stable until later
21:29:44 rmk: Error: "^" is not a valid command.
21:29:52 uvirtbot: thanks smart ass bot
21:29:53 rmk: Error: "thanks" is not a valid command.
21:29:55 hah
21:30:14 when is the deadline?
21:30:19 I'd make the v3 api as a whole an H target
21:30:27 That would be my vote if I had one
21:30:55 maurosr: We're about 2 weeks out from g3, after which it's pretty much lockdown for bugfixes
21:31:15 is it really that close?
21:31:39 rmk: I agree. There is no point in complicating things before the g3 release
21:31:42 feb 21 IIRC
21:31:46 * grizzly release
21:31:50 ok, I know that I can't assume that v3 will be stable by then.. but maybe include it as beta?
21:31:57 #link http://wiki.openstack.org/GrizzlyReleaseSchedule
21:31:58 OK so I'm off by 2 weeks -- still not enough time to solidify a whole new version of the API
21:32:04 rmk: I could five :)
21:32:08 cause we have some extensions changes depending on that
21:32:12 rmk: it's not a new version
21:32:29 maurosr: the issue is I don't want people to start expecting api compatibility
21:32:40 okay, so it sounds like we need to hold off on committing anything then, right?
21:32:59 right
21:33:02 dansmith: I would say lets hold off until we open up H
21:33:06 okay
21:33:07 sounds like a good idea to me ...
21:33:14 a couple of us will get fired over this, but that's okay
21:33:24 * maurosr runs
21:33:32 O.O
21:33:41 hmm, this is logged.. JUST KIDDING
21:33:46 lol
21:33:47 haha
21:34:02 only maurosr will get fired
21:34:10 dammit.. JUST KIDDING
21:34:19 hehe
21:34:26 okay, so that's the end of that
21:34:35 one more thing
21:34:48 well nm.. it's not for g3
21:35:14 ok
21:35:22 #topic bugs
21:35:46 #link http://webnumbr.com/untouched-nova-bugs
21:35:57 I went through a few yesterday and found some interesting ones
21:36:12 i suspect we also have a lot of open bugs
21:36:25 I looked through a number the other day, but didn't find time to really dig into them
21:36:42 lots in areas I'm not completely familiar with
21:37:05 i triaged a few at least
21:37:08 libvirt, networking :)
21:37:12 yup 466
21:37:24 466 in new/confirmed/triaged
21:37:27 that is a lot
21:37:42 i suppose we have a whole month to close those guys after g3
21:37:45 I'll take a look at some of the libvirt ones
21:37:54 was up over 50 at the beginning of the week
21:37:56 but if anyone feels like switching over to bug fixing i don't mind!
21:38:13 I'm still a relative noob to anything outside that driver so may as well
21:38:19 also: https://bugs.launchpad.net/nova/+bugs?field.tag=folsom-backport-potential
21:38:30 there are some there that need to be fixed and backported
21:40:03 #topic open discussion
21:40:07 https://bugs.launchpad.net/nova/+bug/1098380
21:40:09 Launchpad bug 1098380 in nova "Quotas showing in use when no VMs are running" [Critical,Confirmed]
21:40:26 not sure best way to fix that bug
21:40:27 Please review the last cells patches I have up: 15234, 15235, 15236, 16221
21:40:30 :)
21:41:04 I'm out next week, so don't expect me to be around for any no-db-compute fires I may start this week :D
21:41:20 Yeah, also: I'll be tied up in all day meetings starting next week for 2 weeks
21:41:23 and I'm OOTO tomorrow
21:41:35 So I'll be around less
21:41:52 comstud: I was only worried about you hunting me down, so this'll work out great
21:41:56 lol
21:42:05 I'm sure I'll still hear about your bugs
21:42:10 heh
21:42:15 I'll just hunt you down at different hours
21:42:20 I'll give you my phone number
21:42:27 perfect
21:42:28 nice
21:42:29 just let me look up the pizza joint around the block
21:42:36 :)
21:42:49 jog0: can you give an overview?
21:42:54 sure
21:43:21 comstud: can you poke Vek? He might have some ideas
21:43:22 dansmith: i got yo' back!
21:43:25 that bug has a small script that makes the quotas fail
21:43:37 vishy: ideas about?
21:43:43 I'm sure he has ideas
21:43:50 russellb: heh :)
21:43:51 comstud: jog0's bug with quotas
21:43:54 Oh ok
21:44:03 you can get stuck in a situation where quota usage is higher than quota limits
21:44:06 why the world is broken
21:44:15 i mean quotas
21:44:29 I let him know.. if he sees my msg
21:44:53 this happens when quotas go negative (if you have two actions running at once they mess up); they reset but ignore VM state
21:45:22 so if you have 10 VMs in error state they get counted as used
21:45:24 in quota land
21:45:53 VMs in error... maybe were meant to be counted
21:46:03 because we can have VMs go to ERROR for 'all of the things'
21:46:17 meaning.. they can go to ERROR later if some random task fails
21:46:49 comstud: right, but the problem is you can force the quota usage to go higher than the actual quota
21:47:01 additionally you can make it go negative
21:47:21 Yea, something doesn't sound right there :)
21:47:28 spin a VM up, delete the vm while it's spinning up
21:47:41 wait till the task state goes away from deleting, then re-delete
21:47:50 and you decrement the quota by 2
21:47:54 instead of by 1
21:48:06 only 1 task state per vm
21:48:16 so when you have two tasks they fight for the task_state
21:48:28 yep
21:48:45 jog0: so do you have any ideas about how we might fix it?
21:49:33 so right now quotas have their own table tracking usage
21:49:44 one idea (i don't like).. associate instances with the quota entries
21:49:53 maybe I don't like
21:50:11 there aren't really quota entries per instance anyway
21:50:14 we could just look at the instances table and do the logic on the fly when trying to allocate a resource
21:50:29 so we don't have to deallocate when a resource fails
21:50:32 that's how it was before
21:50:55 comstud: why did it change?
21:50:56 and it was racey... not sure for worse or better than now
21:50:57 :)
21:51:05 comstud: :/
21:51:09 there was no atomic check-and-set
21:51:55 can't the db lookup be made atomic?
21:52:43 sure, if you move quota checking perhaps inside of db.instance_create()... and the other methods where quotas come into play
21:53:02 it'll affect performance
21:53:47 I would rather take a performance hit if it makes them work all the time
21:54:35 comstud: can you give an example of the kinds of race conditions we used to have?
21:54:43 right now we have to do a periodic cleanup of quota calculations
21:54:50 which is fugly
21:55:01 jog0: The code before was:
21:55:04 on new build:
21:55:11 vishy: the cleanup code is broken too though
21:55:18 1) Will this pass quotas?
21:55:24 endmeeting?
21:55:30 2) Carry on...
21:55:34 3) create instance
21:55:46 vishy: https://github.com/openstack/nova/blob/master/nova/quota.py#L1036
21:55:47 you could sneak a number of builds into step 2
21:55:52 surpassing the quotas
21:56:03 comstud: ahh
21:56:07 1 and 3 were not atomic
21:56:55 what about counting vms in building state in step 1?
21:57:08 and now that the DB entry is always done in API
21:57:14 so this new quota system was generated to be rather generic.. and its checking and setting operations were atomic
21:57:35 But yeah, it fails when you delete an instance.. reset state, delete the instance again...
21:57:40 and all sorts of cases you mention
21:58:02 yeah we can carry on in dev
21:58:08 ya
21:58:08 #endmeeting nova
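The check-then-create race comstud walks through above (steps 1 and 3 not being atomic) can be demonstrated with a toy in-memory model. This is not Nova's quota code; it just forces the interleaving he describes, where every builder passes the quota check before any of them is counted, and contrasts it with an atomic check-and-set:

```python
# Toy model (not Nova's code) of the old quota race: step 1 (quota
# check) and step 3 (instance create) are separate, so concurrent
# builds can all pass the check before any is counted, surpassing the
# quota. Holding one lock across check-and-increment -- the moral
# equivalent of doing the check inside db.instance_create() -- closes
# the gap.
import threading

QUOTA = 5
BUILDERS = 10


class RacyQuota:
    """Old scheme: the check and the create are separate steps."""

    def __init__(self):
        self.instances = 0
        self._barrier = threading.Barrier(BUILDERS)
        self._inc_lock = threading.Lock()

    def build(self):
        if self.instances < QUOTA:       # 1) Will this pass quotas?
            self._barrier.wait()         # 2) Carry on... (forces every
                                         #    builder past the check first)
            with self._inc_lock:         # 3) create instance
                self.instances += 1


class AtomicQuota:
    """Check-and-set under one lock: no builds can sneak into the gap."""

    def __init__(self):
        self.instances = 0
        self._lock = threading.Lock()

    def build(self):
        with self._lock:
            if self.instances < QUOTA:
                self.instances += 1


def run(quota_cls):
    q = quota_cls()
    threads = [threading.Thread(target=q.build) for _ in range(BUILDERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return q.instances


print("racy:", run(RacyQuota))      # 10 instances against a quota of 5
print("atomic:", run(AtomicQuota))  # stops at 5
```

The atomic version is the "take a performance hit if it makes them work all the time" trade vishy accepts: every build serializes on the check, but usage can never exceed the limit, so no periodic cleanup of the counters is needed.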