21:04:05 <russellb> #startmeeting nova
21:04:06 <openstack> Meeting started Thu Jan 24 21:04:05 2013 UTC.  The chair is russellb. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:04:07 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:04:09 <openstack> The meeting name has been set to 'nova'
21:04:10 <russellb> #chair vishy
21:04:11 <openstack> Current chairs: russellb vishy
21:04:12 <russellb> Hi!
21:04:29 <russellb> #link http://wiki.openstack.org/Meetings/Nova
21:04:38 <russellb> who's around
21:04:45 <hemna> howdy
21:04:46 <vishy> o/
21:04:53 <alaski> hi
21:05:00 <krtaylor> o/
21:05:21 <russellb> cool ... grizzly-3 first, shall we?
21:05:25 <russellb> #topic grizzly-3
21:05:29 <russellb> #link https://launchpad.net/nova/+milestone/grizzly-3
21:05:39 <toanster> hello
21:05:41 <cburgess> here
21:05:43 <rmk> here
21:05:52 <kmartin> hello
21:06:06 <jog0> o/
21:06:11 <russellb> 44 blueprints targeted for grizzly-3, most not done yet
21:06:17 <russellb> anyone have updates on status for these?
21:06:49 <russellb> if you're assigned one, make sure the status reflects reality
21:06:51 <rmk> I'm actually working on one that isn't submitted yet, which is redoing libvirt snapshots to eliminate instance downtime.
21:06:59 <hemna> I submitted a patch for my Fibre Channel BP
21:07:01 <russellb> and if you know one isn't going to make it, speak up so we can update accordingly
21:07:13 <hemna> doing a small rework currently from feedback.
21:07:20 <alaski> Instance actions is churning through patchsets, but should be done in time.
21:07:29 <russellb> hemna: cool, so good progress.  blueprint says "needs code review" which sounds right
21:07:46 <hemna> yup, just grinding through that phase :)
21:07:49 <russellb> alaski: getting feedback and all that?
21:08:23 <alaski> russellb: yes.  But soon I'll probably be pushing more aggressively for it
21:08:27 <russellb> jog0: around?  how about "clean up nova's db api?"
21:08:35 <jog0> russellb: yup
21:08:50 <jog0> russellb: got side tracked with some API benchmarking and performance
21:09:00 <russellb> that's good stuff too :-)
21:09:01 <jog0> but db api work is moving along nicely
21:09:06 <russellb> k
21:09:15 <russellb> still grizzly-3 material?
21:09:26 <jog0> russellb:  I hope so
21:09:32 <russellb> k, updated to "good progress"
21:09:37 <jog0> one big part is ready: https://review.openstack.org/#/c/18493/
21:09:43 <jog0> although that may be another bp
21:10:15 <russellb> devananda: a lot of patches have gone in for db-session-cleanup, how much more is there on that
21:11:22 <russellb> he may not be around ...
21:11:42 <russellb> well, we just need to keep these up to date as we get closer to grizzly-3 so we have a closer and closer picture of what's going to make it (or not)
21:11:55 <russellb> anything else on grizzly-3?
21:12:20 <russellb> #topic differences in virt drivers
21:12:30 <russellb> who's up?  :)
21:12:35 <vishy> russellb: i asked him yesterday and we had a bit to go
21:12:48 <alaski> rmk and I started this briefly in -nova
21:12:54 <russellb> vishy: ok thanks, sorry to duplicate nagging :)
21:12:59 <alaski> but I think we have slightly different goals
21:13:06 <rmk> I'm not sure about the specific scope of this topic but there's probably a discussion worthy topic at least for the libvirt driver
21:13:18 <russellb> vishy: could use like a "last checked on status" field, heh
21:13:27 <rmk> It's more architectural than anything immediate
21:14:05 <rmk> There's a whole lot of if/else in the libvirt driver specifically around LXC, I'm beginning to think that needs to be split out somewhat.
21:15:01 <alaski> rmk: I added this based on your comments around static enforcement of task state transitions, so that was the intended starting scope
21:15:14 <rmk> ok great
21:15:18 <rmk> Yeah that was the other part of this
21:15:45 <rmk> There are all sorts of restrictions in the API around which state/task transitions are allowed versus not
21:16:04 <rmk> The reality is that every hypervisor is different, so enforcing this statically is simply going to limit us
21:16:22 <rmk> For example, one hypervisor might be perfectly happy to allow rebooting a suspended VM and another may not
21:16:45 <rmk> My thought was there should be a method for dynamically setting these restrictions
21:16:49 <alaski> and I wanted to go one step further and get a sense of how to handle other differences between hypervisors that may affect the api
21:17:42 <rmk> Maybe we should explore a compute registration process, where different hypervisors check in with their capabilities (policy)
21:17:59 <rmk> And potentially limit what policy is enforced based on the destination of the command
21:18:31 <rmk> I'm just throwing out rough ideas here to provoke discussion around how best to handle this
21:18:49 <vishy> rmk: seems interesting but also a bit complex
21:19:14 <vishy> rmk: it seems like we can define slightly looser transitions
21:19:24 <rmk> vishy: That's my short term thought for sure
21:19:24 <vishy> and handle the outliers with try: excepts
21:19:42 <russellb> or start with strict base transitions, and let drivers register additional ones that are allowed
21:19:43 <rmk> It's actually what I proposed in my pending review about this
21:19:44 <russellb> something like that
21:20:00 <rmk> So just loosen what we're restricting today at the API and rely on the drivers to raise exceptions
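A minimal sketch of the direction russellb and rmk describe here, assuming a hypothetical transition table in the compute API; none of the names below come from Nova's actual code:

```python
# Hypothetical illustration of "strict base transitions, drivers register
# extras" -- structure and names are assumptions, not Nova's real API layer.

# Base map: task -> set of vm_states the API accepts it from.
BASE_ALLOWED = {
    'reboot': {'active'},
    'snapshot': {'active', 'stopped'},
}

class StateMachine(object):
    def __init__(self):
        # copy so driver registrations don't mutate the shared default
        self.allowed = dict((task, set(states))
                            for task, states in BASE_ALLOWED.items())

    def register(self, task, state):
        """Called by a driver to declare an extra transition it supports."""
        self.allowed.setdefault(task, set()).add(state)

    def check(self, task, vm_state):
        if vm_state not in self.allowed.get(task, set()):
            raise ValueError('%s not allowed from state %s' % (task, vm_state))

# e.g. a hypervisor that can reboot suspended VMs (rmk's example) would do:
sm = StateMachine()
sm.register('reboot', 'suspended')
sm.check('reboot', 'suspended')  # now passes instead of failing at the API
```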
21:20:02 <vishy> are there really going to be enough differences to have a whole registration process?
21:20:13 <russellb> i don't know
21:20:16 <rmk> vishy: I think it's worth exploring
21:20:37 <rmk> We need to assess what we're limiting and why to really make a decision on whether the effort is ultimately worthwhile
21:20:37 <vishy> rmk: i guess the issue is where there is async stuff
21:20:49 <vishy> rmk: it sucks to put things into error if we don't have to
21:21:23 <rmk> https://review.openstack.org/20009 is the review which sort of started this
21:21:49 <alaski> vishy: Well the instance actions stuff should remove the need to set an error state in these cases
21:22:00 <rmk> Also, on the same note, we don't expose the current restrictions anywhere.  There's no API call to figure it out, so Horizon ends up having to match our restrictions.
21:22:16 <rmk> Anyway that's a sidebar to this topic
21:22:35 <alaski> and what I'm really curious about is how much divergence will be acceptable.  Especially with no immediate feedback in the api
21:22:41 <soren> Would it be terribly hard for instances to carry capability attributes?
21:23:00 <soren> If they did, the API server would know if it could be rebooted.
21:23:07 <rmk> soren: Wouldn't you want that to be associated to the host and not the instance itself?
21:23:09 <soren> It has to look up the validity of the server's id anyway.
21:23:30 <soren> rmk: Not necessarily.
21:23:40 <rmk> So maybe it's not registration as much as a policy for each hypervisor which plugs into the API
21:23:50 <soren> rmk: Different vm types on the same host can have differing capabilities.
21:24:01 <soren> rmk: physical host, that is.
21:24:03 <vishy> rmk: we have to be able to map instances to hypervisors
21:24:08 <rmk> i.e. I'm destined for an instance on a libvirt compute node, check the libvirt api policy
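A rough sketch of soren's per-instance capability idea, under the assumption that capabilities are stored as a simple set on the instance record (illustrative names only):

```python
# Assumption-laden sketch: the API already looks up the instance by id, so a
# capability set carried on the instance record would let the API server
# reject unsupported operations up front, per vm type rather than per host.
class Instance(object):
    def __init__(self, uuid, capabilities):
        self.uuid = uuid
        self.capabilities = set(capabilities)  # e.g. {'reboot', 'suspend'}

def api_reboot(instance):
    # the API knows immediately whether this vm type supports reboot,
    # without a round trip to the compute host
    if 'reboot' not in instance.capabilities:
        raise NotImplementedError('reboot unsupported for %s' % instance.uuid)
    # ... cast to compute as usual ...
```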
21:24:44 <vishy> rmk: sounds like this should be a design summit discussion
21:24:54 <rmk> Sounds good, I thought it might be
21:25:07 <russellb> "lock the hypervisor guys in a room"
21:25:09 <rmk> I would advocate relaxing the restrictions starting sooner than that though
21:25:31 <rmk> We end up making direct DB changes constantly because of this
21:26:13 <rmk> I probably need to classify the types of changes we're going to the DB for, it's way too often
21:27:11 <alaski> and we end up with a lot of instances in error because the restrictions are very relaxed.
21:27:26 <alaski> but we can handle that while we figure out a good solution
21:27:38 <rmk> Isn't it possible for the driver to return in a manner which doesn't trigger an error state?
21:27:43 <rmk> Just that it ignored the operation?
21:27:53 <vishy> rmk: not really
21:27:53 <alaski> not in a way that's exposed to a user
21:28:16 <vishy> although with alaski's patches maybe
21:28:30 <vishy> alaski: to see everything that has happened to the instance
21:29:01 <alaski> for now can we come to a rough consensus on restricted vs relaxed?  For reviewing purposes.
21:29:25 <rmk> I'd advocate relaxing a bit and relaxing more as we have an appropriate framework
21:29:39 <alaski> vishy: that's what my work is intending, we should be able to see everything that has happened
21:29:44 <rmk> I've been pretty gung ho on making "reboot" the fixit hammer
21:30:03 <vishy> alaski, rmk no dramatic changes are really appropriate. I do like relaxing reboot as much as possible
21:30:49 <rmk> That's the one I think helps the most right this moment
21:30:54 <rmk> There are others, but not as big a deal
21:31:26 <rmk> Most of the others are just annoying and not "an admin needs to intervene"
21:32:06 <rmk> anyway that's all I had, would like to discuss more at the summit if we can
21:32:18 <russellb> sounds like a good session idea
21:32:43 <russellb> #topic vm-ensembles
21:32:51 <russellb> should be a quick topic ...
21:32:54 <russellb> #link https://blueprints.launchpad.net/nova/+spec/vm-ensembles
21:32:59 <russellb> i just wanted to draw some attention to this blueprint
21:33:03 <russellb> and there's also a ML thread about it
21:33:26 <russellb> it's proposing adding some additional complexity to scheduling
21:33:49 <russellb> from my first pass on it, i wasn't convinced that it was justified, so i'd like to get some other opinions from those heavily involved with nova
21:34:04 <russellb> doesn't have to be this second, but go give it a read, and post feedback to the ML
21:34:10 <russellb> (the author isn't here to defend himself anyway)
21:34:13 <rmk> I like what's being proposed, I'm not sure it needs a whole new paradigm of grouping
21:34:34 <vishy> i went back and forth with the authors a few times
21:34:51 <vishy> i think we need some minimal support in the scheduler to achieve this
21:35:02 <vishy> unless we want to expose information from the scheduler to external services
21:35:10 <rmk> There are other use cases for this sort of thing, like making sure you try to distribute a particular class of VM (running a given app) across racks before..
21:35:36 <russellb> there's some basic anti-affinity support there using a scheduler hint IIRC
21:35:50 <rmk> Basically I think you can do this with key/value pairs as hints to the scheduler
21:35:58 <russellb> different-host or whatever
21:36:38 <russellb> so i guess i'm just trying to better understand what's not possible now ... or it's a matter of making it more friendly, or what
21:37:07 <alaski> russellb: I think it has to do with scheduling multiple instance types at the same time, though I'm still not entirely sure that's it
21:37:51 <jog0> russellb: it would be nice to be able to say to spread out this group of VMs, instead of saying antiaffinity to this vm
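For reference, the pairwise hint russellb mentions can be passed through novaclient today; a sketch, assuming DifferentHostFilter is enabled in the scheduler (credentials, image, flavor, and uuid below are placeholders):

```python
# Sketch of the existing different_host scheduler hint; everything in caps is
# a placeholder, and the uuid is invented for illustration.
from novaclient.v1_1 import client

USER, PASSWORD, PROJECT = 'user', 'secret', 'demo'
AUTH_URL = 'http://keystone:5000/v2.0/'
IMAGE_ID, FLAVOR_ID = 'IMAGE_UUID', '1'

nova = client.Client(USER, PASSWORD, PROJECT, AUTH_URL)
nova.servers.create(
    'test-vm-2', image=IMAGE_ID, flavor=FLAVOR_ID,
    # place this instance on a different host from the listed instances
    scheduler_hints={'different_host': ['a0cf03a5-d921-4877-bb5c-86d26cf818e1']})
```

This is the per-instance form jog0 contrasts with group spreading: each new boot must name the instances it wants to avoid, rather than declaring membership in a group that should be spread.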
21:37:54 <russellb> well, hopefully we can distill it down to the core problems and what needs to be done to solve them on the ML
21:38:02 <russellb> and if it's not resolved sooner, another design summit candidate
21:38:13 <russellb> can probably be wrapped up sooner though
21:38:38 <russellb> #topic bugs
21:38:47 <russellb> #link http://webnumbr.com/untouched-nova-bugs
21:39:10 <russellb> 47 untriaged ... we've at least kept the untouched bugs list relatively flat this release cycle, so that's good :)
21:39:24 <russellb> one thing that occurred to me today, when we talk about bugs and what needs to be triaged, we never mention python-novaclient
21:39:43 <russellb> there's another 36 New bugs there ... https://bugs.launchpad.net/python-novaclient/+bugs?search=Search&field.status=New
21:40:29 <russellb> i kinda wish the client bugs were in the same list
21:40:40 <russellb> but i guess it really is separate
21:41:11 <russellb> oldest untriaged client bug is april 1st last year, so guess we need to work on that :)
21:41:27 <russellb> that's all i wanted to mention ... any specific bugs we should look at now?
21:41:31 <vishy> russellb: lol yeah
21:42:05 <russellb> vishy: yeah i kinda laughed when i came across it ... poor novaclient, i just totally forgot to ever look at it
21:42:46 <russellb> lots of good low hanging fruit in there if anyone is interested
21:44:01 <russellb> #topic Open Discussion
21:44:55 <rmk> So yeah, any thoughts on whether we continue down this current path with libvirt, where multiple hypervisors are supported all via conditionals?
21:45:58 * russellb isn't familiar enough with that code ... :-/
21:46:07 <rmk> Part of this is it would be nice to be able to focus on the hypervisor of interest, rather than considering those which I don't have deployed anywhere
21:46:12 <rmk> I'm sure that's a common situation too
21:46:23 <russellb> guess this would be a good ML thread ..
21:46:37 <rmk> It would be hard for me to justify time spent on Xen or LXC when I have no use case for it
21:46:47 <rmk> sure, I can post on the ML
21:46:49 <russellb> might need to outline a proposal or two, and get people to weigh in on candidate directions
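One shape rmk's proposed split could take, purely as a sketch (Nova's real LibvirtDriver is not structured this way; class names and config contents here are invented):

```python
# Illustrative only: replacing scattered "if libvirt_type == 'lxc'" branches
# with a subclass that owns the LXC-specific behaviour.
class LibvirtDriver(object):
    def get_guest_config(self, instance):
        return {'type': 'kvm', 'devices': ['disk', 'nic']}

class LibvirtLXCDriver(LibvirtDriver):
    def get_guest_config(self, instance):
        config = super(LibvirtLXCDriver, self).get_guest_config(instance)
        # container-specific tweaks live here instead of inline conditionals
        config['type'] = 'lxc'
        config['devices'] = ['filesystem', 'nic']
        return config
```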
21:47:32 <devananda> russellb: re db-session-cleanup, there's still ~20 public methods taking a session parameter, which I'd like to clean up, but haven't had time to tackle recently
21:47:41 <rmk> Did we end up agreeing to relax API restrictions around reboot?
21:47:43 <russellb> devananda: k thanks
21:47:50 <russellb> rmk: yes sounds like it
21:47:51 <rmk> Or still going to hold on that too?
21:47:57 <vishy> oh i have a topic
21:48:03 <rmk> OK then... https://review.openstack.org/20009 :)
21:48:03 <alaski> rmk: I think we did
21:48:08 <russellb> rmk: and capping the changes at that for now, until discussed in more depth
21:48:20 <rmk> sounds good
21:48:21 <vishy> does anyone care about this: https://blueprints.launchpad.net/nova/+spec/multi-boot-instance-naming
21:48:58 <vishy> my thought is to do something like
21:48:59 <rmk> vishy: It would be nice to have, doesn't have to be super extensive
21:49:16 <russellb> yeah, seems nice ... needs a volunteer?
21:49:16 <vishy> check: osapi_compute_unique_server_name_scope and if it is set
21:49:17 <rmk> Maybe a basic set of template values which we interpolate
21:49:33 <vishy> then just append '-%s' % uuid on to the name
21:49:47 <vishy> seems like the simple solution
21:49:53 <rmk> Why not just have a set of macros and run them through a simple processor?
21:49:56 <cburgess> vishy: Wouldn't a simple sequence number be easier?
21:50:08 <vishy> cburgess: no doesn't really work
21:50:10 <rmk> Let them use any value we already store
21:50:15 <vishy> if the scope is global
21:50:19 <rmk> name-%uuid%
21:50:31 <vishy> and i do launch -n10 test
21:50:39 <vishy> and someone else does launch -n10 test
21:50:43 <vishy> i get a failure
21:50:49 <vishy> which is really annoying
21:51:01 <vishy> rmk: we could config option the param
21:51:14 <vishy> rmk: but I was thinking the simpler the better
21:51:19 <rmk> sure that works too
21:51:44 <cburgess> vishy: You get a failure or a non-unique name (which isn't guarded against today)?
21:52:17 <russellb> just non-unique name pretty sure
21:52:28 <vishy> cburgess: i'm saying that in global scope the sequence number is pretty bad
21:52:44 <vishy> cburgess: probably ok in project scope although you could still run into issues with it
21:53:05 <russellb> if this is for hostnames ... UUID makes for some ugly hostnames
21:53:06 <cburgess> vishy: I don't think I understand what you mean by global scope? A desire to keep names unique for DNS purposes?
21:53:08 <russellb> but at least it'd be unique
21:53:31 <vishy> anyway, anyone feel like tackling it?
21:53:49 <vishy> cburgess: config option
21:53:50 <vishy> osapi_compute_unique_server_name_scope
21:54:15 <vishy> if you set it to 'global' you get an error if the name conflicts across all tenants
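A minimal sketch of what vishy describes, with the config lookup stubbed out (the helper name is invented; the real change would live wherever multi-boot display names are generated):

```python
# Sketch of vishy's proposal: when a unique-name scope is configured and
# multiple instances are booted with one name, append each instance's uuid.
def apply_unique_suffix(display_name, instance_uuid, name_scope):
    # name_scope mirrors osapi_compute_unique_server_name_scope:
    # '' (unset), 'project', or 'global'
    if name_scope:
        return '%s-%s' % (display_name, instance_uuid)
    return display_name

# launch -n10 test -> test-<uuid1>, test-<uuid2>, ... so two tenants booting
# "test" under global scope no longer collide
```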
21:54:21 <russellb> #help need a volunteer for https://blueprints.launchpad.net/nova/+spec/multi-boot-instance-naming
21:54:22 <cburgess> Oh I am unfamiliar with that so I shall pipe down.
21:54:56 <cburgess> Is this grizzly-3 milestone?
21:55:10 <russellb> yeah could be
21:55:13 <russellb> if someone takes it
21:55:38 <cburgess> I could take it but I know I won't have time to do it before grizzly-3. If no one else takes it for grizzly-3 I will take it for H.
21:55:39 <vishy> one more thing
21:55:47 <vishy> can everyone please help with reviews: http://reviewday.ohthree.com/
21:56:10 <russellb> sweet my "HACK - DO NOT MERGE." is ranked at the top
21:56:40 <russellb> (sorry, it's helping find remaining db accesses for no-db-compute)
21:57:04 <lifeless> how is score calculated?
21:57:10 <lifeless> and yes, will do reviews
21:57:49 <russellb> combination of various things, if tests are passing, how old it is, if it's associated with a bug or blueprint and if so, what its priority is
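Reviewday's actual formula isn't shown here; a guessed illustration of how the factors russellb lists might combine:

```python
# Purely illustrative scoring in the spirit of russellb's description;
# this is NOT reviewday's actual algorithm.
def review_score(tests_passing, age_days, priority):
    weights = {'Critical': 40, 'High': 30, 'Medium': 20, 'Low': 10}
    score = weights.get(priority, 0)   # bug/blueprint priority dominates
    if tests_passing:
        score += 10                    # passing tests rank above failing ones
    score += min(age_days, 30)         # older reviews bubble up, capped
    return score
```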
21:58:16 <vishy> endmeeting?
21:58:23 <russellb> wfm
21:58:29 <russellb> #endmeeting