21:03:28 <russellb> #startmeeting nova
21:03:29 <openstack> Meeting started Thu Jan 17 21:03:28 2013 UTC.  The chair is russellb. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:03:30 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:03:32 <russellb> #chair vishy
21:03:33 <openstack> The meeting name has been set to 'nova'
21:03:34 <openstack> Current chairs: russellb vishy
21:03:38 <russellb> Howdy, nova folks
21:03:44 <ndipanov> hola
21:03:50 <rainya> hihi :)
21:03:51 <russellb> #link http://wiki.openstack.org/Meetings/Nova
21:04:09 <dansmith> meh.
21:04:18 <devananda> hiya
21:04:19 <vishy> hi!
21:04:20 <jog0> o/
21:04:21 <comstud> haaaaaai
21:04:24 <russellb> dansmith: aw, hype it up!
21:04:35 <comstud> is it Thursday already?
21:04:36 <alaski> hey
21:04:43 <rainya> and it's snowing here!
21:04:46 <dansmith> russellb: can't.. I'm beat down by the corporate machine today :(
21:04:55 <russellb> so, we have a couple of agenda items ... grizzly-3 review can easily eat up the whole time I bet, so perhaps we should start with the other
21:04:58 <russellb> dansmith: :(
21:05:07 <annegentle> rainya: Get. Out.
21:05:09 <russellb> vishy: up to you
21:05:17 <rainya> annegentle, i'm in blacksburg :D
21:05:24 <annegentle> rainya: LOL
21:05:27 <vishy> #topic Review grizzly-3 objectives
21:05:43 <vishy> okie so our grizzly-3 list is kind of insane
21:05:58 <russellb> #link https://launchpad.net/nova/+milestone/grizzly-3
21:06:33 <vishy> if you are assigned to any of these and you know it is not going to make it, let me know and i will untarget
21:07:25 <russellb> I think I'll untarget and unassign myself from rpc-based-servicegroup-driver ... I did some conductor related changes to do what we need for no-db-compute for now
21:07:26 <vishy> dansmith: can i untarget this from grizzly? https://blueprints.launchpad.net/nova/+spec/rpc-version-control
21:07:29 <russellb> that was my main motivation to work on
21:07:30 <russellb> it
21:07:44 <dansmith> vishy: actually,
21:07:51 <vishy> russellb: ok pull it off the grizzly list then?
21:08:03 <russellb> yeah, sounds good.  i'll get it now
21:08:11 <dansmith> I was going to ask about that.. I think it's important for upgrade.. we need to have that a release before we want to be able to support rolling upgrades, right?
21:08:52 <dansmith> actually, wait, maybe I'm confusing this with another thing
21:09:15 <vishy> does anyone have any other blueprints that they know will miss?
21:10:17 <russellb> dansmith: you're right, it's important for upgrades in some cases anyway
21:10:28 <dansmith> russellb: well, I was thinking this would be a version discovery thing,
21:10:34 <devananda> openvz isn't assigned to me but i still haven't seen any movement on it
21:10:48 <devananda> so i think it's safe to assume that will miss
21:10:55 <dansmith> but now that I think of it, H could be taught to speak G messages, with a config flag until everything is upgraded and limp along,
21:10:56 <russellb> mainly the case where you cannot clearly upgrade the server side before client (like computes talking to each other)
21:11:05 <russellb> right
21:11:07 <dansmith> but it would be nicer if we could do something smarter
21:11:07 <dansmith> right
21:11:23 <ndipanov> I have this: https://blueprints.launchpad.net/nova/+spec/improve-boot-from-volume that I am hoping to make but am becoming less optimistic
21:11:25 <russellb> not sure it's realistic to think even that will happen for G though
21:11:52 <vishy> ok i'm untagging stuff
21:12:01 <ndipanov> If I suspect it'll slip I will inform you - leave for now please
21:12:02 <vishy> ndipanov: would love to see that in
21:12:17 <ndipanov> vishy, if bdm goes well it'll be easy
21:12:18 <devananda> for general-baremetal-provisioning-framework, enough of that is now implemented that I am inclined to mark that BP as "implemented" and start opening other BPs for more specific changes, like quantum integration, more scalable deploys, etc.
21:12:24 <devananda> lifeless: thoughts on ^ ?
21:12:24 <ndipanov> vishy, but I like to underpromise
21:12:32 <ndipanov> vishy, leave it for now...
21:12:39 <russellb> dansmith: how far off do you think we are with no-db-compute?  i haven't done a good audit lately
21:12:44 <lifeless> devananda: worksforme
21:13:03 <dansmith> russellb: I dunno, I've been meaning to do one.. I think we're actually making pretty good progress
21:13:10 <dansmith> russellb: I can try to have an audit for next week though if you want
21:13:17 <russellb> dansmith: yeah that's my gut feeling, too
21:13:24 <dansmith> I think I keep putting it off, expecting it will make me depressed :)
21:13:48 <russellb> dansmith: that'd be great.  you know i was thinking ... we could throw up a WIP patch that intentionally breaks the db API for nova-compute ... and look at test results to find stuff :)
21:13:59 <dansmith> russellb: yeah
21:14:23 <dansmith> russellb: also, if you could stop racing with my rpc api version changes, that'd help :D
21:14:32 <russellb> dansmith: :-p
21:14:44 <russellb> dansmith: really need to figure out a better way to handle that that isn't so conflict heavy ...
21:15:01 <vishy> #topic API-v3 and API-v2 coexistence options
21:15:03 <russellb> vishy: i'm at a conference for the rest of the weekend, so if not sooner, i can continue to help with g-3 blueprint cleanup next week
21:15:03 <vishy> hehe
21:15:07 <dansmith> yeah, I've thought about it
21:15:19 <vishy> who is giving the update here
21:15:23 <vishy> sdague, maurosr ?
21:15:26 <russellb> maurosr: ^^
21:15:26 <dansmith> maurosr:
21:15:31 <maurosr> hi
21:15:41 * russellb imagines maurosr's IRC client blowing up
21:16:00 <maurosr> the main question is: how are we going to allow this coexistence?
21:16:27 <maurosr> I have a huge patch with 13k lines (split into ~26 that I haven't pushed) with only copies
21:16:56 <russellb> 13k lines of copied code does not seem ideal :-/
21:17:08 <dansmith> russellb: let him finish :)
21:17:15 * russellb shuts up
21:17:21 <dansmith> russellb: he's prepping everyone to love his alternative plan :D
21:17:28 <russellb> lol
21:17:31 <maurosr> I don't like it.. so I thought: enable api-v2 and v3 then make v2 inherit v3 (the default api) stuff and only if you want to change something in api-v3 you will need to specify those changes
21:17:33 <rainya> hahahaha
21:18:01 <maurosr> so new classes in a v2 tree that inherit from the v3 ones
21:18:33 <maurosr> changes on v3 will break v2 so change a method on v3 and you will need to copy it back to v2 class
21:18:39 <maurosr> that will avoid the copy
21:18:39 <dansmith> maurosr: the only problem I see is that someone will go in and make a change to the v3 one, and affect the v2 one without realizing it, unless we have really good api_samples testing
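The concern dansmith raises can be sketched in a few lines. This is a toy model, not nova code: ServersControllerV3, ServersControllerV2 and api_sample_check are hypothetical names invented for illustration, and the "sample" is a hard-coded dict standing in for a real api_samples test.

```python
class ServersControllerV3:
    """Hypothetical v3 controller (the default implementation)."""

    def show(self, server):
        return {"server": {"id": server["id"], "name": server["name"]}}


class ServersControllerV2(ServersControllerV3):
    """Hypothetical v2 controller: inherits everything from v3."""


def api_sample_check(controller):
    """Toy stand-in for an api_samples test that pins the v2 response format."""
    expected = {"server": {"id": "abc", "name": "vm1"}}
    return controller.show({"id": "abc", "name": "vm1"}) == expected


v2_before = api_sample_check(ServersControllerV2())


# Someone later changes the v3 method without touching the v2 tree.
def new_show(self, server):
    return {"server": {"id": server["id"], "display_name": server["name"]}}


ServersControllerV3.show = new_show
v2_after = api_sample_check(ServersControllerV2())

print(v2_before, v2_after)  # True False
```

In this sketch the only thing that notices the v2 regression is the pinned sample, which is why the approach depends on complete api_samples coverage.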
21:18:57 <maurosr> yes
21:19:24 <dansmith> the opposite of v3 inheriting from v2 would be a little safer in that regard
21:19:29 <vishy> maurosr: alternative plan
21:19:30 <maurosr> that is what I thought: completing api samples bp and we will be sure that no format and response code changed
21:19:31 <dansmith> but becomes a bit messier over the long run
21:19:47 <dansmith> however, since v3 is short-lived and not expected to evolve much, it might be better
21:19:47 <vishy> symlink all of the files, if we want to change things we delete the symlink and use a real copy instead and modify it
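A minimal sketch of the symlink idea, under stated assumptions: the directory layout, the file name servers.py and the helper diverge() are all hypothetical, and the links are assumed to live in the v3 tree pointing at v2 (the log leaves the direction open). It relies on os.symlink, which is the part that may not work on a Windows host.

```python
import os
import shutil
import tempfile

# Throwaway tree standing in for the two API directories (hypothetical layout).
root = tempfile.mkdtemp()
v2_dir = os.path.join(root, "v2")
v3_dir = os.path.join(root, "v3")
os.mkdir(v2_dir)
os.mkdir(v3_dir)

v2_file = os.path.join(v2_dir, "servers.py")
v3_file = os.path.join(v3_dir, "servers.py")
with open(v2_file, "w") as f:
    f.write("NOTE = 'shared'\n")

# While both versions are identical, the v3 file is only a relative symlink.
os.symlink(os.path.join("..", "v2", "servers.py"), v3_file)
shared_before = os.path.islink(v3_file)


def diverge(link_path):
    """Replace a symlink with a real copy of its target so it can be edited."""
    target = os.path.realpath(link_path)
    os.remove(link_path)
    shutil.copyfile(target, link_path)


# First v3-only change: break the link, then edit the real copy.
diverge(v3_file)
with open(v3_file, "w") as f:
    f.write("NOTE = 'v3 only'\n")

with open(v2_file) as f:
    v2_content = f.read()
with open(v3_file) as f:
    v3_content = f.read()
```

After diverge() the two files are independent, so a later edit to one tree cannot silently change the other; the cost is that every diverged file is a full copy.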
21:20:11 <maurosr> dansmith: yeah, we can make it too... but eventually when we remove v2, it will be a bunch of code copied into v3
21:20:18 <dansmith> vishy: that won't work on, say, a win32 api host, right?
21:20:32 <vishy> dansmith: we don't have those yet
21:20:35 <vishy> :)
21:20:46 <dansmith> I figured *someone* was doing that
21:21:12 <russellb> don't want to knowingly prohibit it ...
21:21:19 <maurosr> so symlinks it is?
21:21:33 <dansmith> if we don't care, then the symlink approach is probably a more explicit divorce of the two as soon as they diverge,
21:21:52 <dansmith> but it seems like we'll eventually end up with all 13k lines changed in the long run
21:22:21 <maurosr> dansmith: not if the links are in v2 right?
21:23:05 <maurosr> btw gerrit will ignore the links in the review?
21:23:07 <dansmith> maurosr: you'll still end up copying v3 -> v2 every time you make a small change to v3
21:23:40 <maurosr> yeah, but at least the guy will see a weird change on v2 that he didn't make
21:24:20 <maurosr> so the copy will be smooth
21:24:42 <dansmith> vishy: so, you think symlinks over inheritance?
21:24:46 <maurosr> exactly what I was thinking when I suggested the inheritance stuff
21:25:03 <vishy> dansmith: the inheritance part means we can't ever remove the v2 version easily
21:25:18 <dansmith> vishy: not if v2 inherits from v3
21:25:29 <maurosr> yes ^
21:25:33 <russellb> v2 inheriting from v3 seems so unnatural
21:25:34 <vishy> dansmith: that sounds painful
21:25:41 <maurosr> it's weird but avoids problems
21:25:49 <dansmith> vishy: that was his proposal
21:25:54 <vishy> dansmith, maurosr: when we initially did v1 and v1.1 we used inheritance
21:25:58 <dansmith> I agree that it feels unnatural
21:26:06 <vishy> and it was a maintenance nightmare
21:26:18 <dansmith> okay, well, experience trumps speculation
21:26:23 <vishy> I don't want to have to repeat that if we can avoid it
21:26:28 <dansmith> maurosr: so wanna do symlinks and see how it works?
21:26:49 <maurosr> I liked the symlinks cause we can just ban any commit on v2 tree so if the developer doesn't notice the change on v2 the tests will
21:27:12 <maurosr> dansmith: i think it will work fine, will submit it by the end of the day
21:27:19 <dansmith> maurosr: okay
21:28:14 <vishy> i have another question
21:28:27 <vishy> is it reasonable to think that this will be in grizzly?
21:28:33 <vishy> it seems too late
21:28:46 <rmk> I'd say it's too late
21:28:47 <dansmith> well,
21:29:03 <vishy> I'm thinking at this point we do v3 immediately after grizzly
21:29:04 <rmk> it's just going to leave an incomplete project in grizzly
21:29:05 <dansmith> I think the audit part took longer than expected, along with other usual delays
21:29:08 <rmk> vishy: i agree
21:29:11 <maurosr> sorry but I don't know the dates
21:29:18 <dansmith> we've been feeling the pressure, but have been trying to make it happen
21:29:31 <dansmith> our thought was potentially to get what we can in by grizzly,
21:29:35 <vishy> dansmith: I don't see the point of releasing a v3 api until we've had time to solidify it
21:29:43 <rmk> ^^
21:29:43 <dansmith> but perhaps not call it stable until later
21:29:44 <uvirtbot> rmk: Error: "^" is not a valid command.
21:29:52 <rmk> uvirtbot: thanks smart ass bot
21:29:53 <uvirtbot> rmk: Error: "thanks" is not a valid command.
21:29:55 <rmk> hah
21:30:14 <maurosr> when is the deadline?
21:30:19 <rmk> I'd make v3 api as a whole an H target
21:30:27 <rmk> That would be my vote if I had one
21:30:55 <rmk> maurosr: We're about 2 weeks out from g3, after which it's pretty much lockdown for bugfixes
21:31:15 <dansmith> is it really that close?
21:31:39 <vishy> rmk: I agree. There is no point in complicating things before the g3 release
21:31:42 <russellb> feb 21 IIRC
21:31:46 <vishy> * grizzly release
21:31:50 <maurosr> ok, I know that I can't assume that v3 will be stable by then.. but maybe include it as beta?
21:31:57 <russellb> #link http://wiki.openstack.org/GrizzlyReleaseSchedule
21:31:58 <rmk> OK so I'm off by 2 weeks -- still not enough time to solidify a whole new version of the API
21:32:04 <dansmith> rmk: I could five :)
21:32:08 <maurosr> cause we have some extension changes depending on that
21:32:12 <dansmith> rmk: it's not a new version
21:32:29 <vishy> maurosr: the issue is I don't want people to start expecting api compatibility
21:32:40 <dansmith> okay, so it sounds like we need to hold off on committing anything then, right?
21:32:59 <maurosr> right
21:33:02 <vishy> dansmith: I would say let's hold off until we open up H
21:33:06 <dansmith> okay
21:33:07 <russellb> sounds like a good idea to me ...
21:33:14 <dansmith> a couple of us will get fired over this, but that's okay
21:33:24 * maurosr runs
21:33:32 <russellb> O.O
21:33:41 <dansmith> hmm, this is logged.. JUST KIDDING
21:33:46 <comstud> lol
21:33:47 <rmk> haha
21:34:02 <dansmith> only maurosr will get fired
21:34:10 <dansmith> dammit.. JUST KIDDING
21:34:19 <maurosr> hehe
21:34:26 <dansmith> okay, so that's the end of that
21:34:35 <maurosr> one more thing
21:34:48 <maurosr> well nm.. it's not for g3
21:35:14 <vishy> ok
21:35:22 <vishy> #topic bugs
21:35:46 <vishy> #link http://webnumbr.com/untouched-nova-bugs
21:35:57 <vishy> I went through a few yesterday and found some interesting ones
21:36:12 <vishy> i suspect we also have a lot of open bugs
21:36:25 <comstud> I looked through a number the other day, but didn't find time to really dig into them
21:36:42 <comstud> lots in areas I'm not completely familiar with
21:37:05 <russellb> i triaged a few at least
21:37:08 <comstud> libvirt, networking :)
21:37:12 <vishy> yup 466
21:37:24 <vishy> 466 in new/confirmed/triaged
21:37:27 <vishy> that is a lot
21:37:42 <vishy> i suppose we have a whole month to close those guys after g3
21:37:45 <rmk> I'll take a look at some of the libvirt ones
21:37:54 <russellb> was up over 50 at the beginning of the week
21:37:56 <vishy> but if anyone feels like switching over to bug fixing i don't mind!
21:38:13 <rmk> I'm still a relative noob to anything outside that driver so may as well
21:38:19 <vishy> also: https://bugs.launchpad.net/nova/+bugs?field.tag=folsom-backport-potential
21:38:30 <vishy> there are some there that need to be fixed and backported
21:40:03 <vishy> #topic open discussion
21:40:07 <jog0> https://bugs.launchpad.net/nova/+bug/1098380
21:40:09 <uvirtbot> Launchpad bug 1098380 in nova "Quotas showing in use when no VMs are running" [Critical,Confirmed]
21:40:26 <jog0> not sure best way to fix that bug
21:40:27 <comstud> Please review the last cells patches I have up: 15234, 15235, 15236, 16221
21:40:30 <comstud> :)
21:41:04 <dansmith> I'm out next week, so don't expect me to be around for any no-db-compute fires I may start this week :D
21:41:20 <comstud> Yeah, also: I'll be tied up in all day meetings starting next week for 2 weeks
21:41:23 <comstud> and I'm OOTO tomorrow
21:41:35 <comstud> So I'll be around less
21:41:52 <dansmith> comstud: I was only worried about you hunting me down, so this'll work out great
21:41:56 <comstud> lol
21:42:05 <comstud> I'm sure I'll still hear about your bugs
21:42:10 <dansmith> heh
21:42:15 <comstud> I'll just hunt you down at different hours
21:42:20 <dansmith> I'll give you my phone number
21:42:27 <comstud> perfect
21:42:28 <russellb> nice
21:42:29 <dansmith> just let me look up the pizza joint around the block
21:42:36 <comstud> :)
21:42:49 <vishy> jog0: can you give an overview?
21:42:54 <jog0> sure
21:43:21 <vishy> comstud: can you poke Vek? He might have some ideas
21:43:22 <russellb> dansmith: i got yo' back!
21:43:25 <jog0> that bug has a small script that makes the quotas fail
21:43:37 <comstud> vishy: ideas about?
21:43:43 <comstud> I'm sure he has ideas
21:43:50 <dansmith> russellb: heh :)
21:43:51 <vishy> comstud: jog0 's bug with quotas
21:43:54 <comstud> Oh ok
21:44:03 <jog0> you can get stuck in a situation where quota usage is higher than quota limits
21:44:06 <russellb> why the world is broken
21:44:15 <russellb> i mean quotas
21:44:29 <comstud> I let him know.. if he sees my msg
21:44:53 <jog0> this happens when quotas go negative (if you have two actions running at once they mess up), they reset but ignore VM state
21:45:22 <jog0> so if you have 10 VMs in error state they get counted as used
21:45:24 <jog0> in quota land
21:45:53 <comstud> VMs in error... maybe were meant to be counted
21:46:03 <comstud> because we can have VMs go to ERROR for 'all of the things'
21:46:17 <comstud> meaning.. they can go to ERROR later if some random task fails
21:46:49 <jog0> comstud: right, but the problem is you can force the quota usage to go higher than the actual quota
21:47:01 <jog0> additionally you can make it go negative
21:47:21 <comstud> Yea, something doesn't sound right there :)
21:47:28 <jog0> spin a VM up, delete vm while spinning up
21:47:41 <jog0> wait till the task state goes away from deleting, then redelete
21:47:50 <jog0> and you decrement the quota by 2
21:47:54 <jog0> instead of by 1
21:48:06 <jog0> only 1 task state per vm
21:48:16 <jog0> so when you have two tasks they fight for the task_state
21:48:28 <comstud> yep
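The double-decrement jog0 describes can be modelled with a toy example. Nothing here is nova's actual quota code: Instance, usage, naive_delete and guarded_delete are hypothetical names, and the guard is only one possible mitigation, not something agreed in the meeting.

```python
class Instance:
    """Toy VM record with a single task_state slot, as described above."""

    def __init__(self):
        self.task_state = "spawning"
        self.deleted = False


usage = {"instances": 1}


def naive_delete(instance):
    """Releases quota on every call, whatever state the instance is in."""
    instance.task_state = "deleting"
    usage["instances"] -= 1


vm = Instance()
naive_delete(vm)        # delete while the VM is still spinning up
vm.task_state = None    # the build task finishes and overwrites task_state
naive_delete(vm)        # nothing records the first delete, so it runs again
naive_usage = usage["instances"]


def guarded_delete(instance):
    """Releases quota only the first time the instance is marked deleted."""
    if instance.deleted:
        return
    instance.deleted = True
    instance.task_state = "deleting"
    usage["instances"] -= 1


usage["instances"] = 1
vm2 = Instance()
guarded_delete(vm2)
vm2.task_state = None
guarded_delete(vm2)     # second delete is a no-op
guarded_usage = usage["instances"]

print(naive_usage, guarded_usage)  # -1 0
```

The guard works in the sketch because the "already deleted" fact is stored separately from task_state, so a competing task cannot erase it.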
21:48:45 <vishy> jog0: so do you have any ideas about how we might fix it?
21:49:33 <jog0> so right now quotas have their own table tracking usage
21:49:44 <comstud> one idea (i don't like).. associate instances with the quota entries
21:49:53 <comstud> maybe I don't like
21:50:11 <comstud> there's not really quota entries per instance anyway
21:50:14 <jog0> we could just look at the instances table and do the logic on the fly when trying to allocate a resource
21:50:29 <jog0> so don't have to deallocate when resource fails
21:50:32 <comstud> that's how it was before
21:50:55 <jog0> comstud: why did it change?
21:50:56 <comstud> and it was racy... not sure for worse or better than now
21:50:57 <comstud> :)
21:51:05 <jog0> comstud: :/
21:51:09 <comstud> there was no atomic check-and-set
21:51:55 <jog0> can't the db lookup be made atomic?
21:52:43 <comstud> sure, if you move quota checking perhaps inside of db.instance_create()...and the other methods where quotas come into play
21:53:02 <comstud> it'll affect performance
21:53:47 <jog0> I would rather take a performance hit if it makes them work all the time
21:54:35 <jog0> comstud: can you give an example of the kinds of race conditions we used to have?
21:54:43 <vishy> right now we have to do a periodic cleanup of quota calculations
21:54:50 <vishy> which is fugly
21:55:01 <comstud> jog0: The code before was:
21:55:04 <comstud> on new build:
21:55:11 <jog0> vishy: the cleanup code is broken too though
21:55:18 <comstud> 1) Will this pass quotas?
21:55:24 <russellb> endmeeting?
21:55:30 <comstud> 2) Carry on...
21:55:34 <comstud> 3) create instance
21:55:46 <jog0> vishy:  https://github.com/openstack/nova/blob/master/nova/quota.py#L1036
21:55:47 <comstud> you could sneak a number of builds into step 2
21:55:52 <comstud> surpassing the quotas
21:56:03 <jog0> comstud: ahh
21:56:07 <comstud> 1 and 3 were not atomic
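The three steps comstud lists can be sketched as a check-then-create race, next to an atomic check-and-set. This is an illustration only: NaiveStore, AtomicStore and QUOTA_LIMIT are hypothetical, and the in-process threading.Lock merely stands in for whatever the database would have to provide across multiple API hosts.

```python
import threading

QUOTA_LIMIT = 1  # hypothetical per-tenant instance limit


class NaiveStore:
    """Step 1 (check) and step 3 (create) are separate calls."""

    def __init__(self):
        self.instances = []

    def passes_quota(self):
        return len(self.instances) < QUOTA_LIMIT

    def create(self, name):
        self.instances.append(name)


naive = NaiveStore()
# Two builds interleave: both run step 1 before either reaches step 3.
ok_a = naive.passes_quota()
ok_b = naive.passes_quota()
if ok_a:
    naive.create("a")
if ok_b:
    naive.create("b")
naive_count = len(naive.instances)


class AtomicStore:
    """Check and create happen under one lock, so they cannot interleave."""

    def __init__(self):
        self.instances = []
        self._lock = threading.Lock()

    def create_if_allowed(self, name):
        with self._lock:
            if len(self.instances) >= QUOTA_LIMIT:
                return False
            self.instances.append(name)
            return True


atomic = AtomicStore()
threads = [
    threading.Thread(target=atomic.create_if_allowed, args=(name,))
    for name in ("a", "b", "c", "d")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
atomic_count = len(atomic.instances)

print(naive_count, atomic_count)  # 2 1
```

The naive store ends up over the limit because both checks saw an empty list; the atomic store admits exactly one build regardless of thread scheduling.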
21:56:55 <jog0> what about counting vms in building state in step 1?
21:57:08 <jog0> and now that the DB entry is always done in API
21:57:14 <comstud> so this new quota system was generated to be rather generic.. and its checking and setting operations were atomic
21:57:35 <comstud> But yeah, it fails when you delete an instance.. reset state, delete an instance again...
21:57:40 <comstud> and all sorts of cases you mention
21:58:02 <vishy> yeah we can carry on in dev
21:58:08 <comstud> ya
21:58:08 <vishy> #endmeeting nova