21:04:05 #startmeeting nova
21:04:06 Meeting started Thu Jan 24 21:04:05 2013 UTC. The chair is russellb. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:04:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:04:09 The meeting name has been set to 'nova'
21:04:10 #chair vishy
21:04:11 Current chairs: russellb vishy
21:04:12 Hi!
21:04:29 #link http://wiki.openstack.org/Meetings/Nova
21:04:38 who's around?
21:04:45 howdy
21:04:46 o/
21:04:53 hi
21:05:00 o/
21:05:21 cool ... grizzly-3 first, shall we?
21:05:25 #topic grizzly-3
21:05:29 #link https://launchpad.net/nova/+milestone/grizzly-3
21:05:39 hello
21:05:41 here
21:05:43 here
21:05:52 hello
21:06:06 o/
21:06:11 44 blueprints targeted for grizzly-3, most not done yet
21:06:17 anyone have updates on status for these?
21:06:49 if you're assigned one, make sure the status reflects reality
21:06:51 I'm actually working on one that isn't submitted yet, which is redoing libvirt snapshots to eliminate instance downtime.
21:06:59 I submitted a patch for my Fibre Channel BP
21:07:01 and if you know one isn't going to make it, speak up so we can update accordingly
21:07:13 doing a small rework currently from feedback.
21:07:20 Instance actions is churning through patchsets, but should be done in time.
21:07:29 hemna: cool, so good progress. blueprint says "needs code review" which sounds right
21:07:46 yup, just grinding through that phase :)
21:07:49 alaski: getting feedback and all that?
21:08:23 russellb: yes. But soon I'll probably be pushing more aggressively for it
21:08:27 jog0: around? how about "clean up nova's db api"?
21:08:35 russellb: yup
21:08:50 russellb: got sidetracked with some API benchmarking and performance
21:09:00 that's good stuff too :-)
21:09:01 but db api work is moving along nicely
21:09:06 k
21:09:15 still grizzly-3 material?
21:09:26 russellb: I hope so
21:09:32 k, updated to "good progress"
21:09:37 one big part is ready: https://review.openstack.org/#/c/18493/
21:09:43 although that may be another bp
21:10:15 devananda: a lot of patches have gone in for db-session-cleanup, how much more is there on that?
21:11:22 he may not be around ...
21:11:42 well, we just need to keep these up to date as we get closer to grizzly-3, so we have a closer and closer picture of what's going to make it (or not)
21:11:55 anything else on grizzly-3?
21:12:20 #topic differences in virt drivers
21:12:30 who's up? :)
21:12:35 russellb: i asked him yesterday and we had a bit to go
21:12:48 rmk and I started this briefly in -nova
21:12:54 vishy: ok thanks, sorry to duplicate nagging :)
21:12:59 but I think we have slightly different goals
21:13:06 I'm not sure about the specific scope of this topic, but there's probably a discussion-worthy topic, at least for the libvirt driver
21:13:18 vishy: could use like a "last checked on status" field, heh
21:13:27 It's more architectural than anything immediate
21:14:05 There's a whole lot of if/else in the libvirt driver, specifically around LXC; I'm beginning to think that needs to be split out somewhat.
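[Editor's note: a minimal sketch of the kind of split rmk describes above — all class and method names here are hypothetical, not actual Nova code. The idea is that LXC-specific behavior moves into its own subclass rather than living behind if/else branches in one shared driver method.]

```python
class LibvirtBaseDriver:
    """Behavior shared by all libvirt-backed virt types."""

    def get_guest_devices(self):
        return ["disk", "nic"]


class LibvirtKvmDriver(LibvirtBaseDriver):
    def get_guest_devices(self):
        # Full-machine guests also get a graphics device.
        return super().get_guest_devices() + ["vnc"]


class LibvirtLxcDriver(LibvirtBaseDriver):
    def get_guest_devices(self):
        # Container-specific device set that would otherwise sit behind
        # "if virt_type == 'lxc'" conditionals in the shared method.
        return ["filesystem", "nic"]
```

With this shape, someone working only on KVM never has to reason about the container branches, which is part of the motivation raised later in open discussion.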
21:15:01 rmk: I added this based on your comments around static enforcement of task state transitions, so that was the intended starting scope
21:15:14 ok great
21:15:18 Yeah, that was the other part of this
21:15:45 There are all sorts of restrictions in the API around which state/task transitions are allowed versus not
21:16:04 The reality is that every hypervisor is different, so enforcing this statically is simply going to limit us
21:16:22 For example, one hypervisor might be perfectly happy to allow rebooting a suspended VM and another may not
21:16:45 My thought was there should be a method for dynamically setting these restrictions
21:16:49 and I wanted to go one step further and get a sense of how to handle other differences between hypervisors that may affect the api
21:17:42 Maybe we should explore a compute registration process, where different hypervisors check in with their capabilities (policy)
21:17:59 And potentially limit what policy is enforced based on the destination of the command
21:18:31 I'm just throwing out rough ideas here to provoke discussion around how best to handle this
21:18:49 rmk: seems interesting but also a bit complex
21:19:14 rmk: it seems like we can define slightly looser transitions
21:19:24 vishy: That's my short-term thought for sure
21:19:24 and handle the outliers with try/excepts
21:19:42 or start with strict base transitions, and let drivers register additional ones that are allowed
21:19:43 It's actually what I proposed in my pending review about this
21:19:44 something like that
21:20:00 So just loosen what we're restricting today at the API and rely on the drivers to raise exceptions
21:20:02 are there really going to be enough differences to have a whole registration process?
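[Editor's note: the "strict base transitions, plus driver-registered additions" idea floated above could look roughly like the sketch below. This is illustrative only — the names and the transition table are hypothetical, not Nova's actual API code.]

```python
class InvalidStateTransition(Exception):
    """Raised when an action is not allowed from the current vm_state."""


# Strict defaults enforced for every driver: (vm_state, action) pairs.
BASE_ALLOWED = {
    ("active", "reboot"),
    ("active", "suspend"),
    ("suspended", "resume"),
}


class TransitionPolicy:
    def __init__(self, extra_allowed=()):
        # Each driver can register additional transitions it supports.
        self._allowed = BASE_ALLOWED | set(extra_allowed)

    def check(self, vm_state, action):
        if (vm_state, action) not in self._allowed:
            raise InvalidStateTransition(
                "cannot %s an instance in vm_state %r" % (action, vm_state))


# A hypervisor that is happy to reboot a suspended VM opts in; others don't.
relaxed = TransitionPolicy(extra_allowed=[("suspended", "reboot")])
strict = TransitionPolicy()
```

The API layer would consult the policy of the driver backing the instance, so a reboot of a suspended VM passes for one driver and raises for another, instead of being rejected the same way for everyone.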
21:20:13 i don't know
21:20:16 vishy: I think it's worth exploring
21:20:37 We need to assess what we're limiting and why to really make a decision on whether the effort is ultimately worthwhile
21:20:39 rmk: i guess the issue is where there is async stuff
21:20:49 rmk: it sucks to put things into error if we don't have to
21:21:23 https://review.openstack.org/20009 is the review which sort of started this
21:21:49 vishy: Well, the instance actions stuff should remove the need to set an error state in these cases
21:22:00 Also, on the same note, we don't expose the current restrictions anywhere. There's no API call to figure it out, so Horizon ends up having to match our restrictions.
21:22:16 Anyway, that's a sidebar to this topic
21:22:35 and what I'm really curious about is how much divergence will be acceptable, especially with no immediate feedback in the api
21:22:41 Would it be terribly hard for instances to carry capability attributes?
21:23:00 If they did, the API server would know if it could be rebooted.
21:23:07 soren: Wouldn't you want that to be associated with the host and not the instance itself?
21:23:09 It has to look up the validity of the server's id anyway.
21:23:30 rmk: Not necessarily.
21:23:40 So maybe it's not registration as much as a policy for each hypervisor which plugs into the API
21:23:50 rmk: Different vm types on the same host can have differing capabilities.
21:24:01 rmk: physical host, that is.
21:24:03 rmk: we have to be able to map instances to hypervisors
21:24:08 i.e. "I'm destined for an instance on a libvirt compute node, check the libvirt api policy"
21:24:44 rmk: sounds like this should be a design summit discussion
21:24:54 Sounds good, I thought it might be
21:25:07 "lock the hypervisor guys in a room"
21:25:09 I would advocate relaxing the restrictions starting sooner than that though
21:25:31 We end up making direct DB changes constantly because of this
21:26:13 I probably need to classify the types of changes we're going to the DB for; it's way too often
21:27:11 and we end up with a lot of instances in error because the restrictions are very relaxed.
21:27:26 but we can handle that while we figure out a good solution
21:27:38 Isn't it possible for the driver to return in a manner which doesn't trigger an error state?
21:27:43 Just that it ignored the operation?
21:27:53 rmk: not really
21:27:53 not in a way that's exposed to a user
21:28:16 although with alaski's patches, maybe
21:28:30 alaski: to see everything that has happened to the instance
21:29:01 for now, can we come to a rough consensus on restricted vs relaxed? For reviewing purposes.
21:29:25 I'd advocate relaxing a bit, and relaxing more as we have an appropriate framework
21:29:39 vishy: that's what my work is intending; we should be able to see everything that has happened
21:29:44 I've been pretty gung-ho on making "reboot" the fixit hammer
21:30:03 alaski, rmk: no dramatic changes are really appropriate. I do like relaxing reboot as much as possible
21:30:49 That's the one I think helps the most right this moment
21:30:54 There are others, but not as big a deal
21:31:26 Most of the others are just annoying and not "an admin needs to intervene"
21:32:06 anyway, that's all I had; would like to discuss more at the summit if we can
21:32:18 sounds like a good session idea
21:32:43 #topic vm-ensembles
21:32:51 should be a quick topic ...
21:32:54 #link https://blueprints.launchpad.net/nova/+spec/vm-ensembles
21:32:59 i just wanted to draw some attention to this blueprint
21:33:03 and there's also a ML thread about it
21:33:26 it's proposing adding some additional complexity to scheduling
21:33:49 from my first pass on it, i wasn't convinced that it was justified, so i'd like to get some other opinions from those heavily involved with nova
21:34:04 doesn't have to be this second, but go give it a read, and post feedback to the ML
21:34:10 (the author isn't here to defend himself anyway)
21:34:13 I like what's being proposed; I'm not sure it needs a whole new paradigm of grouping
21:34:34 i went back and forth with the authors a few times
21:34:51 i think we need some minimal support in the scheduler to achieve this
21:35:02 unless we want to expose information from the scheduler to external services
21:35:10 There are other use cases for this sort of thing, like making sure you try to distribute a particular class of VM (running a given app) across racks before ...
21:35:36 there's some basic anti-affinity support there using a scheduler hint IIRC
21:35:50 Basically I think you can do this with key/value pairs as hints to the scheduler
21:35:58 different-host or whatever
21:36:38 so i guess i'm just trying to better understand what's not possible now ... or whether it's a matter of making it more friendly, or what
21:37:07 russellb: I think it has to do with scheduling multiple instance types at the same time, though I'm still not entirely sure that's it
21:37:51 russellb: it would be nice to be able to say to spread out this group of VMs, instead of saying anti-affinity to this vm
21:37:54 well, hopefully we can distill it down to the core problems and what needs to be done to solve them on the ML
21:38:02 and if it's not resolved sooner, another design summit candidate
21:38:13 can probably be wrapped up sooner though
21:38:38 #topic bugs
21:38:47 #link http://webnumbr.com/untouched-nova-bugs
21:39:10 47 untriaged ... we've at least kept the untouched bugs list relatively flat this release cycle, so that's good :)
21:39:24 one thing that occurred to me today: when we talk about bugs and what needs to be triaged, we never mention python-novaclient
21:39:43 there's another 36 New bugs there ... https://bugs.launchpad.net/python-novaclient/+bugs?search=Search&field.status=New
21:40:29 i kinda wish the client bugs were in the same list
21:40:40 but i guess it really is separate
21:41:11 oldest untriaged client bug is April 1st last year, so guess we need to work on that :)
21:41:27 that's all i wanted to mention ... any specific bugs we should look at now?
21:41:31 russellb: lol yeah
21:42:05 vishy: yeah, i kinda laughed when i came across it ... poor novaclient, i just totally forgot to ever look at it
21:42:46 lots of good low-hanging fruit in there if anyone is interested
21:44:01 #topic Open Discussion
21:44:55 So yeah, any thoughts on whether we continue down this current path with libvirt, where multiple hypervisors are supported all via conditionals?
21:45:58 * russellb isn't familiar enough with that code ... :-/
21:46:07 Part of this is it would be nice to be able to focus on the hypervisor of interest, rather than considering those which I don't have deployed anywhere
21:46:12 I'm sure that's a common situation too
21:46:23 guess this would be a good ML thread ...
21:46:37 It would be hard for me to justify time spent on Xen or LXC when I have no use case for it
21:46:47 sure, I can post on the ML
21:46:49 might need to outline a proposal or two, and get people to weigh in on candidate directions
21:47:32 russellb: re db-session-cleanup, there's still ~20 public methods taking a session parameter, which I'd like to clean up, but haven't had time to tackle recently
21:47:41 Did we end up agreeing to relax API restrictions around reboot?
21:47:43 devananda: k thanks
21:47:50 rmk: yes, sounds like it
21:47:51 Or still going to hold on that too?
21:47:57 oh, i have a topic
21:48:03 OK then... https://review.openstack.org/20009 :)
21:48:03 rmk: I think we did
21:48:08 rmk: and capping the changes at that for now, until discussed in more depth
21:48:20 sounds good
21:48:21 does anyone care about this: https://blueprints.launchpad.net/nova/+spec/multi-boot-instance-naming
21:48:58 my thought is to do something like
21:48:59 vishy: It would be nice to have, doesn't have to be super extensive
21:49:16 yeah, seems nice ... needs a volunteer?
21:49:16 check osapi_compute_unique_server_name_scope, and if it is set
21:49:17 Maybe a basic set of template values which we interpolate
21:49:33 then just append '-%s' % uuid on to the name
21:49:47 seems like the simple solution
21:49:53 Why not just have a set of macros and run them through a simple processor?
21:49:56 vishy: Wouldn't a simple sequence number be easier?
21:50:08 cburgess: no, doesn't really work
21:50:10 Let them use any value we already store
21:50:15 if the scope is global
21:50:19 name-%uuid%
21:50:31 and i do launch -n10 test
21:50:39 and someone else does launch -n10 test
21:50:43 i get a failure
21:50:49 which is really annoying
21:51:01 rmk: we could config option the param
21:51:14 rmk: but I was thinking the simpler the better
21:51:19 sure, that works too
21:51:44 vishy: You get a failure or a non-unique name (which isn't guarded against today)?
21:52:17 just non-unique name, pretty sure
21:52:28 cburgess: i'm saying that in global scope the sequence number is pretty bad
21:52:44 cburgess: probably ok in project scope, although you could still run into issues with it
21:53:05 if this is for hostnames ... UUID makes for some ugly hostnames
21:53:06 vishy: I don't think I understand what you mean by global scope? A desire to keep names unique for DNS purposes?
21:53:08 but at least it'd be unique
21:53:31 anyway, anyone feel like tackling it?
21:53:49 cburgess: config option
21:53:50 osapi_compute_unique_server_name_scope
21:54:15 if you set it to 'global' you get an error if the name conflicts across all tenants
21:54:21 #help need a volunteer for https://blueprints.launchpad.net/nova/+spec/multi-boot-instance-naming
21:54:22 Oh, I am unfamiliar with that, so I shall pipe down.
21:54:56 Is this grizzly-3 milestone?
21:55:10 yeah, could be
21:55:13 if someone takes it
21:55:38 I could take it, but I know I won't have time to do it before grizzly-3. If no one else takes it for grizzly-3, I will take it for H.
21:55:39 one more thing
21:55:47 can everyone please help with reviews: http://reviewday.ohthree.com/
21:56:10 sweet, my "HACK - DO NOT MERGE." is ranked at the top
21:56:40 (sorry, it's helping find remaining db accesses for no-db-compute)
21:57:04 how is score calculated?
21:57:10 and yes, will do reviews
21:57:49 combination of various things: if tests are passing, how old it is, if it's associated with a bug or blueprint and, if so, what its priority is
21:58:16 endmeeting?
21:58:23 wfm
21:58:29 #endmeeting
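[Editor's note: vishy's multi-boot naming suggestion from the end of the meeting could be sketched as below. The helper name and signature are hypothetical (osapi_compute_unique_server_name_scope is a real Nova option, but this function is illustrative, not the actual implementation): when a uniqueness scope is configured and a requested name would collide, append '-%s' % uuid instead of failing the boot.]

```python
import uuid


def make_unique_name(requested_name, existing_names, scope=None):
    """Return requested_name, or a UUID-suffixed variant on collision.

    scope mimics osapi_compute_unique_server_name_scope: when unset,
    duplicate names are simply allowed, as in Nova's default behavior.
    """
    if scope is None or requested_name not in existing_names:
        return requested_name
    # vishy's suggestion: just append '-%s' % uuid to disambiguate.
    return "%s-%s" % (requested_name, uuid.uuid4())
```

So `launch -n10 test` under a 'global' scope would yield `test`, `test-<uuid>`, `test-<uuid>`, ... rather than an error, at the cost of the ugly hostnames cburgess points out.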