14:01:19 #startmeeting nova
14:01:20 Meeting started Thu Mar 20 14:01:19 2014 UTC and is due to finish in 60 minutes. The chair is russellb. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:24 The meeting name has been set to 'nova'
14:01:36 hello everyone! who's here to talk about nova?
14:01:38 o/
14:01:39 o/
14:01:45 o/
14:01:54 #link https://wiki.openstack.org/wiki/Meetings/Nova
14:02:06 hi
14:02:15 #topic icehouse-rc1
14:02:22 #link https://launchpad.net/nova/+milestone/icehouse-rc1
14:02:28 we're aiming to release RC1 next week
14:02:38 9 bugs on the list left to close out
14:02:48 let's see if we can merge as much as we can this week
14:03:07 come next week i want to start allowing only show stoppers and regressions on the list
14:03:26 so some of these would get bumped to "nice to have" if they don't make it this week
14:03:45 anyone aware of issues not on this list that should be?
14:03:47 is the state of the gate a concern?
14:04:02 i see the gate time, what's the root cause?
14:04:12 i see this patch was promoted to the head of the gate: https://review.openstack.org/79816
14:04:21 I think a couple of those might be in the gate already
14:04:49 excellent
14:04:58 this seems to be the bug that is reported on failures - https://bugs.launchpad.net/neutron/+bug/1283522
14:05:41 138 fails in 24hrs / 956 fails in 14 days
14:05:42 on that bug
14:05:52 ouch
14:06:21 looks like a related patch merged 39 minutes ago
14:06:23 yeah, neutron deadlock and libvirt timeouts are the two issues I think
14:06:33 OK
14:06:45 so looks like patches going in now related to both
14:06:51 so hopefully we'll see some improvement today
14:07:30 if something is approved by monday and just fighting through the gate, we can still try to get it in
14:07:32 not a huge deal
14:07:51 any comments / questions / concerns on icehouse-rc1?
14:08:07 one bug was brought up by cyeoh for discussion
14:08:09 though he can't be here
14:08:18 #link https://bugs.launchpad.net/horizon/+bug/1286297
14:08:25 "we made a backwards incompatible change in https://review.openstack.org/#/c/40811/ which caused this"
14:08:32 "some details here: http://lists.openstack.org/pipermail/openstack-dev/2014-March/030508.html"
14:08:54 question is, even though it's been merged for almost all of icehouse, do we revert it or not?
14:08:59 api was not very usable before that change though, not by a non-admin anyways
14:09:14 but breaking our users seems possibly worse
14:09:33 my opinion is that it's been out in the wild for a long time, it's hard to justify reverting it at this point, IMHO
14:09:44 I wondered about making the flavor auth silently succeed if it's a no-op, but maybe that's making things worse
14:09:54 ttx's point on list was, if we leave it, we decide to also break everyone going release to release
14:10:17 which is honestly a lot more common
14:10:18 than CD
14:10:26 that's my impression at least
14:10:30 it's a semantic change though, right?
not a hard break, and it fails in a reasonable way
14:10:35 I guess I didn't realize there was a thread yet
14:10:37 Sounds more like we fixed a bug than made an incompatible change to me - I'd vote it should stay
14:10:47 OK, make your votes on the thread
14:10:54 just want to make sure it's a conscious decision
14:11:00 my gut says leave it, too
14:11:01 FWIW
14:11:23 #topic bugs
14:11:26 other bugs stuff
14:11:30 not sure who added this one to the agenda
14:11:36 "https://review.openstack.org/#/c/77524/ bug fix related to cinder and keystone v3, cinderclient is already fixed but nova needs to use keystone v3 API - question is what needs to happen before we are "supporting" keystone v3? What needs to happen in Tempest? What areas of nova do we know need to change?"
14:12:50 I think we do need to be clear on "bug fix" vs "incompatible change" though - any bug fix in the API is a visible change, and this makes the code match the API documentation
14:13:20 agreed with that
14:13:23 +1
14:13:25 #topic blueprints
14:13:30 OK, juno blueprints!
14:13:38 there's been some progress on an updated process for Juno
14:13:46 #link https://wiki.openstack.org/wiki/Blueprints#Nova
14:13:50 johnthetubaguy just updated that wiki page
14:13:52 Are we still doing Keystone V3 support as a topic?
14:14:12 PhilD: let's come back to it if you'd like, didn't seem like anyone was around to cover that, but can come back to it
14:14:22 so, we have a nova-specs git repo
14:14:28 we have a template in the repo for specs
14:14:40 and we want to require *all* juno blueprints to go through this repo for review
14:14:46 even ones previously reviewed
14:14:49 I have just unapproved all non-completed blueprints, turns out it is possible
14:14:55 johnthetubaguy: great
14:15:14 note that we're going to be learning on the go with this, it's a bit of experimentation
14:15:28 but i think it's going to result in much better reviews than using the horrible "whiteboard"
14:15:40 and we'll have a nice archive of specs in git
14:15:58 I think the repo is open for business at this point though
14:16:19 so, every BP that was approved and had code but deferred will have to go through this new process, right?
14:16:25 correct
14:16:39 if it was previously approved, hopefully the review will be quick
14:16:44 great
14:16:47 but it does mean we're requiring a much more thorough design document
14:16:55 whereas before we may have been more lenient
14:17:34 one big bonus is the use of gerrit should lead to more consistency, in the long run
14:18:07 and should address a major complaint i get about the quality of blueprint detail
14:18:24 a bit less painful to enforce here
14:18:35 so do you place the non-approved blueprint in /juno, what is the naming convention?
14:18:47 devoid: it's covered in the template
14:18:57 juno/approved/my-blueprint.rst
14:19:12 where my-blueprint is https://blueprints.launchpad.net/nova/+spec/my-blueprint
14:19:13 russellb: do we want to leave the template open to comments till Monday, for those who are interested?
14:19:38 johnthetubaguy: i think we should open it now, and say watch for updates
14:19:45 i suspect we may have a continuous flow of updates through the cycle
14:19:58 I know that the operators group is working to better lay out end-user and deployer impact.
14:20:12 devoid: oh? for this?
14:20:16 sure, that'd be great
14:20:22 for blueprint approvals
14:20:31 and can just be submitted to gerrit as an update to our template
14:20:38 yup.
14:20:50 any questions or concerns?
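[An illustrative sketch of the naming convention discussed above, using the example path given in the meeting; the template.rst in the nova-specs repo is the authoritative reference, so treat the layout here as an assumption rather than a guarantee. A spec for the Launchpad blueprint https://blueprints.launchpad.net/nova/+spec/my-blueprint would be proposed to gerrit as a file named after the blueprint:

    nova-specs/
        juno/
            approved/
                my-blueprint.rst    <- a copy of template.rst, filled in and reviewed like any other change

The filename matching the Launchpad blueprint name is what ties the gerrit review back to the blueprint entry.]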
14:21:01 russellb: OK, so I am keen to wait for that feedback? or do we want to go now?
14:21:01 otherwise i'll post to the mailing list today drawing more attention to the progress
14:21:11 one concern is if blueprints are approved too quickly there's no time for a broad set of people to review.
14:21:44 johnthetubaguy: sure, I guess the update this week can be "take a look at our template and provide feedback", that's fine with me
14:21:45 Good point, I'd like to see some guidelines on that point.
14:21:56 that seems fair
14:22:02 russellb: yeah, sorry to add a delay
14:22:03 russellb: even with this new system, we're planning not to approve until there is some actual code as we discussed in UT right?
14:22:08 johnthetubaguy: no worries
14:22:16 so, about delays
14:22:21 dansmith: good question
14:22:27 i think there's two things
14:22:29 1) approving the spec
14:22:38 2) approving a blueprint into a milestone based on that spec
14:22:40 oh, targeting
14:22:41 gotcha
14:22:42 maybe we should separate those things
14:22:44 yes
14:22:46 that sounds fine to me
14:22:52 sometimes we need to push little things through, I don't like adding a big delay, but blueprint delays feel better than code.
14:22:54 johnthetubaguy: which makes the "originally approved for" thing a bit more difficult
14:22:55 just need to make it clear what "approving the spec" means
14:23:08 russellb: yeah, true
14:23:25 johnthetubaguy: makes me want to go back to removing it ...
14:23:41 russellb: I wonder about a proposed folder, then move into approved when code goes up, but that feels bad...
14:24:02 sounds like tracking work
14:24:06 and i'm hoping to keep tracking separate
14:24:10 russellb: how about just saying approved in Juno-1, but first target might be Juno-2, I am ok with that
14:24:16 shouldn't you approve a blueprint separate from code?
14:24:31 russellb: yeah, tracking in one place, lp, makes most sense
14:24:51 johnthetubaguy: OK, but maybe a note to the template and wiki page that clarifies the difference between approving the spec, and targeting to a milestone
14:24:52 devoid: we sure will, but don't want to approve too much that no one will ever work on
14:25:02 I'm +1 on removing the milestone stuff from the template fwiw
14:25:21 alaski: yeah +1 i think ... it's confusing
14:25:22 Maybe we could try and capture what would constitute having a wide range of feedback into the spec review - so for example that there should be some review from an operator, etc? Feels that's more what's needed than just (has it been open for review X days)
14:25:39 alaski: I just worry there is no way to track the history, but yeah, it seems simpler to remove it at this point, it's just confusing :(
14:25:40 PhilD: perhaps, but i think it also depends on the blueprint
14:25:48 there's a lot of blueprint stuff ops aren't going to care about
14:25:53 or things that are really just not controversial
14:25:59 major refactorings
14:26:11 agreed, it depends on the bp.
14:26:11 maybe case by case, ensure we have sufficient input based on what it is
14:26:21 PhilD: neutron patches we wait for neutron core, it seems similar to that kind of thing, just do it case by case?
14:26:35 posts to mailing list and operators can help for things that clearly need operator input.
14:27:55 Yeah, it's hard to get the balance I know. Just feels that as John noted a delay here is much better (or at least much less bad) than a delay/rework later on - so we shouldn't be shy of holding off approval in those cases.
I'd rather see this stage lean to slower
14:28:15 good point
14:28:27 it's much less costly to rework a spec than code
14:28:37 personally, any new process should feel more lightweight than before, I think letting people do the "right thing" and see what breaks is best here
14:28:39 so we need to ensure we get it right
14:28:46 johnthetubaguy: agreed
14:28:50 PhilD: +1
14:28:54 but i think the things we've talked about are good principles
14:29:14 yeah, we should evolve those review guidelines
14:29:20 indeed
14:29:37 #link https://wiki.openstack.org/wiki/Blueprints#Blueprint_Review_Criteria
14:29:39 I think it's not always easy for devs to judge what does and doesn't have an impact on an operator (see some of the recent reverts) - and part of the point of this is to have BPs expressed to the extent that a non-coder can understand what's intended
14:30:23 #link http://git.openstack.org/cgit/openstack/nova-specs/tree/template.rst
14:30:23 If they can't then the BP isn't really complete enough IMO
14:30:25 PhilD: I would love to see operators join nova-drivers, by participating in lots of nova-specs reviews, just let's see how that goes I think
14:30:44 I'll be there ;-)
14:31:01 PhilD: the big issue was not having a blueprint that made it clear there was an impact, hopefully we will now be better at that!
14:31:45 any more on blueprints?
14:31:54 it will be an evolving process i'm sure
14:31:57 no from me
14:32:05 but appreciate willingness to try it out and evolve
14:32:16 russellb: +1
14:32:19 helps when nobody likes the current situation, heh
14:32:20 +1
14:32:27 #topic open discussion
14:32:37 plenty of time for other topics if anyone would like
14:32:41 Keystone V3?
14:33:08 sure
14:33:32 Mostly this is a client issue - I was just wondering what plans were for getting V3 support into the client
14:33:44 no plans on my radar
14:34:11 there's been a bit more broad hierarchical multi-tenancy discussion
14:34:19 Well V2 becomes deprecated in Icehouse, so we'll need to do something
14:34:21 which would require v3
14:34:34 i think there's some coordination fail in there
14:34:49 i think it's absurd to mark something deprecated when almost every project doesn't support the new thing yet
14:35:28 russellb: agreed
14:35:31 * russellb pings to see if dolphm happens to be around
14:35:37 perhaps that should be on the project meeting for next week
14:35:44 i think it's their way of cracking the whip...
14:35:55 We want to be able to use domains in Keystone, which means you only have to be able to use the V3 API into Keystone from Horizon, say, but other clients on V2 now need to be able to auth users when their name isn't unique. You can kind of do this via the V2 API by going to ID-based auth, but that's a bit klutzy (but OK as a short term move)
14:36:01 sdague: +1
14:36:02 russellb: especially when keystone's cli client doesn't support v3 either.
14:36:17 I have patches up for that in nova and neutron client at the moment
14:36:18 devoid: v3 support comes from openstack client
14:36:40 And I see that some other piecemeal changes seem to be trying to land
14:37:03 there's some keystone v3 coordination needed across projects, so let's plan to cover that in the next cross project meeting (tuesday)
14:37:08 mrodden: yes, but the keystoneclient middleware doesn't support v3 i believe
14:37:08 please join if you're interested and able
14:37:19 Ok - what time?
14:37:38 @browne - I think it does
14:37:46 #link https://wiki.openstack.org/wiki/Meetings/ProjectMeeting
14:37:47 browne: yeah i'm not sure about that
14:37:51 2100 UTC
14:38:02 Ok, I'll see what I can do
14:38:08 sorry for the rough time
14:38:19 browne, mrodden, the middleware supports v3, just not the cli. but openstack client doesn't have packages available yet.
14:38:24 #note need to discuss keystone v3 support across projects in the next cross project meeting
14:39:14 i may be wrong, but i think the auth_token middleware still only supports getting v2 tokens
14:40:14 dolph doesn't seem to be around
14:40:19 another option would be to start a ML thread
14:40:30 anyone interested in doing that?
14:40:38 I thought it could be configured to work with v2 and v3, but I could be wrong too - it would be good to get some clarity about what the planned migration is for a system running V2 to one running V3
14:41:00 Yeah I could do that
14:41:12 ok perfect
14:41:15 much appreciated
14:41:22 * johnthetubaguy wonders how long it will be before all clients migrate to v3 keystone
14:41:25 yes, need to figure out what the migration is expected to look like
14:42:02 https://github.com/openstack/python-keystoneclient/blob/master/keystoneclient/middleware/auth_token.py#L764
14:43:09 I'm trying to see if we can get our Keystone folks to tackle the v3 for all clients - seems like it would make more sense for them to do it than for each project to have to work out what to do. Most clients don't even include the keystone client at the moment though
14:43:18 PhilD: +1
14:43:47 I said "trying" - I still have some way to go to convince them ;-)
14:44:00 well, personally that's what i expect from all projects
14:44:03 Of course if only we had a single converged client .....
14:44:12 just like with nova v3, i expect nova devs to reach out and help do the work to migrate users of nova
14:44:24 PhilD: that's looking closer now right?
14:44:27 our client of course, but also help fix horizon, trove, heat
14:45:35 sorry guys forgot about this
14:45:42 ndipanov: so you had an issue you wanted to bring up?
14:45:42 oops
14:45:50 and my calendar was chillin as well
14:46:17 so yeah... I looked into https://blueprints.launchpad.net/nova/+spec/graceful-shutdown
14:46:46 we are a bit closer on that one these days right?
14:46:56 and apart from a bug in nova that is easy to fix... that makes this dead after switching to oslo
14:47:01 I still don't think this is done
14:47:28 ndipanov: incomplete or fundamentally broken?
14:47:30 ndipanov: you got that bug?
14:47:42 russellb, I'd say incomplete
14:48:23 once the service receives one of these signals - for this to work properly it needs to really wait for every gt to finish
14:48:39 and also finish any rpc stuff it has going on
14:48:50 but not accept any other connections
14:48:57 does that sound sane?
14:49:06 johnthetubaguy, yes - it's really tiny
14:49:30 ndipanov: I thought that was done already
14:49:49 ndipanov: stops getting new rpc messages at least, and waits for current stuff to finish
14:50:05 ndipanov: I thought it used to anyway...
14:50:28 leifz: ping
14:50:44 johnthetubaguy: I think he's saying that post oslo-messaging things are different now
14:50:47 johnthetubaguy, I don't think so - it just calls (well should) rpc.cleanup() which eventually calls connection.close
14:51:00 PhilD: http://lists.openstack.org/pipermail/openstack-dev/2014-March/030403.html
14:51:02 dansmith, I think they aren't fundamentally
14:51:08 right I got you, so it got regressed by the oslo changes
14:51:12 I think we need to stop listening to compute.$host, but leave everything else until all the GTs die, right?
14:51:29 dansmith, right
14:51:39 dansmith, and that's not what's happening
14:51:42 otherwise conductor things will just fail
14:51:44 yeah
14:52:01 johnthetubaguy, it did regress from not working to not working even more :)
14:52:52 ndipanov: ok… I got the impression the stop-listening-to-the-queue stuff got implemented in oslo, then synced across, but I have not looked in detail at that… oops
14:52:52 well I don't think it is anyway...
14:53:51 ndipanov: that was the intention at least, so I am certainly agreed with you there
14:54:27 dansmith, would you say that in order to test this - you could 1) kill the conductor just to cause a call delay
14:54:32 2) boot an instance
14:54:45 3) send SIGINT to compute and restart conductor.
14:55:03 4) see that the boot finishes and then compute dies
14:55:08 ?
14:55:22 ndipanov: manual testing? I'd say start a tempest largeops run against devstack and then kill your compute and see some logs that show it cleaning things up
14:55:35 ndipanov: because killing conductor will hide whether you're properly still open for rpc replies I think
14:56:00 hmmm
14:56:25 it's not going to be an easy test
14:56:45 yeah, just kill compute, see the rpc count rise as you issue terminate commands to all the vms on the compute?
14:56:53 maybe just "nova boot foo; sleep 0.5; killall nova-compute"
14:57:10 ideally during a snapshot, so you have to wait for the snapshot to finish, then nova-compute to die
14:57:16 and then check in the db that it got to active
14:57:34 that bp doesn't do any tidy up
14:57:49 it's just about a kill that waits for current things to finish
14:58:10 (to avoid the need for any tidy up)
14:58:52 anyway, we're about out of time, but sounds like some thinkin' needs doin' on how to make sure this works
14:59:03 :)
14:59:06 any other topics with 1 minute left?
14:59:31 johnthetubaguy, afaict it doesn't even do that ... it will call service.stop() which just kills gts
14:59:45 k, back to #openstack-nova we go! :)
14:59:47 thanks everyone!
14:59:49 #endmeeting
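[A minimal sketch of the graceful-stop ordering discussed above, assuming an eventlet green-thread pool and an oslo.messaging-style RPC server object; the class and attribute names here are illustrative, not nova's actual Service implementation. The point is the ordering ndipanov and dansmith describe: stop consuming from compute.$host first, let in-flight green threads finish (they may still need to make RPC calls, e.g. to conductor), and only then tear the connection down.

    import eventlet


    class GracefulService(object):
        """Illustrative wrapper showing only the stop ordering, not real nova code."""

        def __init__(self, rpc_server):
            # rpc_server is assumed to expose stop()/wait(), as an
            # oslo.messaging RPC server does.
            self.rpc_server = rpc_server
            self.pool = eventlet.GreenPool()

        def spawn(self, func, *args, **kwargs):
            # Run every unit of work in the pool so stop() can wait on it.
            return self.pool.spawn(func, *args, **kwargs)

        def stop(self):
            # 1) Stop accepting new work: stop consuming from the
            #    compute.$host queue so no new requests arrive.
            self.rpc_server.stop()
            # 2) Let every in-flight green thread run to completion;
            #    they may still issue outbound RPC calls, so the
            #    connection must stay usable here.
            self.pool.waitall()
            # 3) Only now finish shutting the RPC server down.
            self.rpc_server.wait()

A rough manual check along the lines suggested in the meeting: start a long operation such as a snapshot, send SIGINT to nova-compute, and confirm in the database that the operation completes before the process exits.]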