14:00:33 #startmeeting nova
14:00:33 Meeting started Thu Dec 10 14:00:33 2015 UTC and is due to finish in 60 minutes. The chair is johnthetubaguy. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:34 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:37 The meeting name has been set to 'nova'
14:00:40 o/
14:00:53 o/
14:01:00 o/
14:01:01 \o
14:01:02 o/
14:01:05 o/
14:01:06 o/
14:01:09 \o
14:01:11 welcome all
14:01:11 hi
14:01:14 p/
14:01:15 o/
14:01:18 #topic Release Status
14:01:23 \o
14:01:33 #info Jan 21: Nova non-priority feature freeze
14:01:42 #info Thursday December 17th: non-priority feature review bash day
14:01:50 #info Jan 19-21: mitaka-2
14:01:52 o/
14:01:53 'sup
14:01:53 hi
14:01:58 * kashyap waves
14:02:00 #info Blueprint and Spec Freeze was last week
14:02:26 so lots of dates there, just to share them
14:02:26 o/
14:02:26 #link https://wiki.openstack.org/wiki/Nova/Mitaka_Release_Schedule
14:02:26 o/
14:02:34 so just a thought about the freeze
14:02:49 I have been through all the exception requests, and commented on the specs
14:02:56 I need to loop back on some that made revisions
14:03:00 #link https://etherpad.openstack.org/p/mitaka-nova-spec-review-tracking
14:03:07 one spec-less one I wanted to raise
14:03:25 #link https://blueprints.launchpad.net/nova/+spec/osprofiler-support-in-nova
14:03:37 now there is some new code up to add osprofiler into nova
14:03:56 it's nowhere near as invasive as it was last time we saw that code
14:04:08 so I am tempted to approve that, let me know if that's crazy
14:04:14 any more on process or release status?
14:04:22 haven't looked at the osprofiler changes
14:04:24 mriedem: how is v3.0 of python-novaclient heading?
14:04:30 i think it is pretty clear that osprofiler is going to be the standard mechanism for openstack projects
14:04:45 johnthetubaguy: i think we're waiting on a keystoneauth change from mordred
14:04:53 so even if there are problems with it, i feel the best thing is to take it and work on improving osprofiler where needed
14:04:58 this https://review.openstack.org/#/c/245200/
14:05:04 mriedem: ack, didn't see an update last time
14:05:36 as we don't want nova to be the odd one out with profiling - we've already seen people propose other nova-specific solutions for profiling, such as the spec to hijack notifications as the mechanism
14:05:42 danpb: yeah, we pushed back last time because there was a huge heap of profiler code in Nova, that looks to have been fixed, seems OK now
14:05:57 this is the code btw https://review.openstack.org/#/c/254703/
14:06:03 it's mostly middleware and config options
14:06:37 i've not looked in detail, but on the surface the proposed code is pretty reasonable imho and similar to what i would have proposed for my own profiling attempts
14:07:07 if that's an optional middleware, that seems good to me
14:07:14 if it's basically out of the way when off, it seems reasonable to get it in and try it
14:07:29 bauzas: it's actually better than that, but agreed
14:07:31 from a design PoV, should I say
14:07:45 alaski: IIUC, it is a no-op unless it is enabled for an API call
14:07:47 sounds like we're ok with specless bp approval
14:08:01 * alex_xu waves hands late
14:08:01 danpb: that's my understanding as well, but I haven't looked closely yet
14:08:08 works for me
14:08:14 mriedem: agreed
14:08:23 yep, let's move on
14:08:25 IMHO there's no sense in a nova spec, as that'd just split the discussion from the cross-project spec
14:08:31 reviewing the code is welcome
14:08:44 danpb: +1, totally why I am pushing for specless
14:09:06 as per the newly approved pattern from the summit discussions on cross-project stuff
14:09:14 cool, so any more on freeze stuff?
14:09:21 just a clarification
14:09:28 API bugs that need specs are still allowed
14:09:46 that's proper bugs, not "add 17 new API calls" kinds of bugs
14:09:53 i have a question about the non-priority freeze
14:09:59 andrearosa: sure
14:10:07 considering the Xmas time can we extend
14:10:12 the deadline
14:10:22 maybe for 2 more weeks?
14:10:30 now it is Jan 21: Nova non-priority feature freeze
14:10:36 andrearosa: it's already extended for christmas by mitaka-2 being further away
14:10:40 andrearosa: honestly, it was communicated since the beginning
14:10:50 bauzas, so what?
14:10:56 yeah i don't want to extend,
14:11:00 ok
14:11:03 i feel like we spent most of m-1 on spec reviews
14:11:13 so?
14:11:26 I mean if we hadn't already extended the release schedule an extra week for christmas, I would consider it
14:11:32 johnthetubaguy, bauzas, mriedem, danpb - thank you guys for the approval of the bp
14:11:43 we focus on specs so we should leave as little time as possible to write code
14:11:48 looking at number of weeks per milestone
14:11:59 m-2 is definitely the highest
14:12:10 ndipanov: we aimed to get specs merged before the summit, that's the aim, but it always spills over a little
14:12:12 because it takes account of Xmas
14:12:27 people can also write POC code while working on a spec
14:12:32 ++
14:12:35 mriedem, lol
14:12:43 that's not what we tell them
14:12:45 anyway
14:12:47 anyways, to be clear, it's our deadline, and it's not aligned with the tagging
14:12:47 it's not?
14:12:54 well we may say that
14:12:59 but not what we encourage
14:13:03 i encourage it
14:13:04 "Thursday December 17th: non-priority feature review bash day" - will that be a day when we can ask for review on non-priority features?
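The osprofiler change discussed above was described as "mostly middleware and config options" that is "a no-op unless it is enabled for an API call". A minimal sketch of that optional-middleware pattern follows; the class and flag names here are illustrative, not the actual osprofiler API.

```python
# Hedged sketch of an optional WSGI middleware that is effectively a
# no-op when disabled, as described for osprofiler in the discussion.
# 'ProfilerMiddleware' and 'enabled' are hypothetical names.

class ProfilerMiddleware(object):
    def __init__(self, app, enabled=False):
        self.app = app
        self.enabled = enabled

    def __call__(self, environ, start_response):
        if not self.enabled:
            # Disabled: pass the request straight through, adding no
            # per-request overhead beyond this boolean check.
            return self.app(environ, start_response)
        # Enabled: wrap the call with trace bookkeeping (real tracing
        # logic elided; this dict is a stand-in).
        trace = {"request": environ.get("PATH_INFO")}
        try:
            return self.app(environ, start_response)
        finally:
            trace["done"] = True  # stand-in for emitting trace data
```

The design point raised in the meeting is visible here: when `enabled` is false the wrapped application behaves identically to the unwrapped one, which is why reviewers considered it low-risk to merge and iterate on.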
14:13:07 for the reason you're bringing up
14:13:12 I'm in the minority here so I'll just step away
14:13:29 if anyone is discouraging writing POC code in parallel with a spec, i'd say they are wrong
14:13:30 but the reality is
14:13:33 we merged a ton of specs
14:13:39 raildo: well it's a day I want everyone to focus on reviewing the low-priority blueprints that have been up for review the longest
14:13:39 most of them won't make it
14:13:49 a lot will be close and people will get super frustrated
14:13:55 come Nandos
14:13:57 we start over
14:13:58 johnthetubaguy: nice :)
14:14:02 makes no sense imho
14:14:04 * ndipanov stops
14:14:15 we also have a gate on fire,
14:14:16 ndipanov: ahem, N is Nutella :P
14:14:21 and people need to work on those
14:14:27 or Nachos
14:14:28 mriedem, because we focus on the wrong things in the gate
14:14:38 let's move on
14:14:39 but that's a whole other topic we won't agree on
14:14:59 so, we have 121 blueprints approved, I expect us to have bandwidth to merge 70
14:15:02 ish
14:15:12 assuming we don't get loads more folks doing quality reviews
14:15:18 we won't
14:15:25 johnthetubaguy: maybe we can extend for another day, if it is not enough? :)
14:15:35 hence the spec deadline
14:16:00 anyways, this isn't a discussion that works well on IRC
14:16:02 let's move on
14:16:37 final point is if you are curious why we do the process this way, it's mostly documented in here: https://wiki.openstack.org/wiki/Nova/Process
14:17:00 #topic Regular Reminders
14:17:17 so, we have subteams with lots of code ready for core review
14:17:23 #link https://etherpad.openstack.org/p/mitaka-nova-priorities-tracking
14:17:39 #topic Bugs
14:17:43 mriedem: so the gate
14:18:03 waiting on https://review.openstack.org/#/c/253901/
14:18:11 multinode grenade fails >50% of the time
14:18:20 a bad gate is almost always the sign of unstable code, so fixing this really matters to me
14:18:42 mriedem: actually, that's mostly a nicety
14:18:53 so this is about pinning liberty?
14:18:54 upper-constraints comes from the requirements repo
14:18:56 sdague: i see it's dropped off http://status.openstack.org/elastic-recheck/gate.html
14:19:00 sdague: oh right
14:19:02 yeah so we're good now
14:19:10 as of last night
14:19:15 oh, OK, so this is more defensive
14:19:22 well, it's the g-r sync
14:19:27 we blacklisted oslo.messaging 3.1.0
14:19:28 oslo.messaging started having a much higher failure rate after service restart
14:19:34 oh, so there was a workaround for the job?
14:19:41 it's a 3.1.0 bug
14:19:45 oh I see
14:19:53 good to hear \o/
14:20:00 we still have a bunch of volume-related failures in the gate, at least one of which should be fixed by https://review.openstack.org/#/c/254428/
14:20:07 some default timeouts were changed in a way that service restarts would often fail to reconnect to rabbit
14:20:09 or made less frequent by
14:20:09 ah, I see !=3.1.0
14:20:22 ah, so that makes sense, ouch
14:20:47 markus_z: any more bug-related things you wanted to highlight today?
14:21:03 yepp
14:21:07 #info: zero known criticals. High-prio bugs which are not in progress are steady at 41 bugs
14:21:12 I haven't yet found time to clean up the very old ones
14:21:16 #info: latest bug stats at http://lists.openstack.org/pipermail/openstack-dev/2015-December/081415.html
14:21:25 The number of bug reports has been rising heavily since mitaka-1. We have 76 "new" bugs overall.
14:21:31 23 bugs out of the 76 are without any first bug triage and I'm out of bandwidth to do it. This will be a heavy load later when we want to be RC ready.
14:21:36 #help: anyone (or more) volunteering for a "bug skimming duty" for a week?
14:21:40 mriedem: btw, do we have people working the top couple of gate issues, or are there some specific ones folks need to jump on?
14:21:50 * markus_z drops mic
14:22:02 na, seriously, we need more people triaging
14:22:06 johnthetubaguy: i have that patch up for the volumes one
14:22:07 markus_z: Sure. Some of the week count? :)
14:22:17 more help with bug triage would be a good thing!
14:22:37 johnthetubaguy: i think http://status.openstack.org/elastic-recheck/gate.html#1522488 was fixed by infra, the 2 nodes had the same hostname
14:22:42 dansmith figured that out
14:22:43 doffm: What do you mean?
14:23:06 markus_z: I mean I'm happy to help with triaging and bug skimming.
14:23:19 mriedem, I see there was some good discussion on that patch so will look at it now
14:23:32 ndipanov: yeah, alaski and i looked into the locking more yesterday
14:23:36 markus_z: are there instructions?
14:23:42 doffm: cool! the more the better.
14:24:04 lxsli: I tried to summarize it once but didn't get it merged into any of the manuals
14:24:19 mriedem: yeah, the fail rates are getting back under control - http://tinyurl.com/zbg788y
14:24:30 lxsli: That's all I know: http://markuszoeller.github.io/posts/openstack-bugs/
14:24:41 markus_z: let's try to get it into ours again
14:24:58 markus_z: thanks!
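The oslo.messaging fix discussed above was done as a blacklist in the requirements repo (the "!=3.1.0" noted in the log), which then flows into projects via the g-r sync and upper-constraints. A hedged illustration of what such an exclusion pin looks like; the actual entry's minimum-version bound and comment are not shown in this log:

```
# openstack/requirements global-requirements.txt -- illustrative only.
# 3.1.0 changed default timeouts so service restarts often failed to
# reconnect to rabbit, hence the exclusion.
oslo.messaging!=3.1.0
```

Because upper-constraints is generated from this repo, excluding the broken release here keeps it out of all gate jobs without each project needing its own pin.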
14:25:32 so let's keep moving
14:25:52 #topic Stuck Reviews
14:26:05 something tells me I didn't clean these out, I could be wrong
14:26:14 the uefi one still needs discussion i think
14:26:23 i see there was an exception request, and danpb is +2
14:26:27 https://review.openstack.org/#/c/235983/
14:26:32 mriedem: OK
14:26:34 the nova-manage one is just a quick reminder, I added it on purpose
14:26:44 doffm: lxsli: ping me in #openstack-nova after the meeting if you have further questions
14:26:44 i think i'm ok with this not having testing as long as there is a warning when it's used that it's experimental
14:27:37 it currently says add a flag for this in tempest, which sounds wrong
14:28:19 that's wrong
14:28:21 are we writing somewhere that an experimental feature means that we can remove it without a deprecation cycle?
14:28:32 that implies testing upstream, which we can't do until we have new enough libvirt
14:28:42 bauzas: no
14:28:43 bauzas: so that was my feature classification thing
14:28:50 oh
14:28:55 mriedem: well it's just not a flag in tempest right, it's an image setting
14:29:04 mriedem: I'd feel that it would help people balance the benefits vs. the risks
14:29:10 johnthetubaguy: right, per my ML thread, you'd configure tempest with the uefi image id
14:29:12 to enable that test
14:29:23 but you need an env with libvirt 1.2.9 and the ovmf package
14:29:28 which we don't have
14:29:47 what about https://review.openstack.org/#/c/228778/ ? it's pretty much a bugfix but I posted a spec because of the API change
14:29:49 mriedem: bauzas: well the feature classification doesn't quite do that, just helps us audit what is untested, and think about removing some of it
14:29:54 intel could provide 3rd party ci, but i'm not sure how much we care as long as there is a warning about it being untested and experimental
14:30:06 johnthetubaguy: I certainly agree, I was just thinking of something more formal like what we say for our supported APIs
14:30:10 cells v1 is experimental but if we removed that w/o a deprecation cycle people's heads would explode
14:30:14 right, with a warning, it seems an edge case, that's OK
14:30:38 mriedem: I know, but that's just something we want to explain, that it's risky
14:31:04 mriedem: bauzas: yeah, I should clarify, any feature removal would need a deprecation cycle, I misread that, I am thinking about deprecating stuff that's just not being maintained
14:31:04 i think this is why we need to make the bar to inclusion high
14:31:06 honestly, considering cells v1 as possibly being removed without further notice doesn't really make me afraid :)
14:31:16 bauzas: doesn't it mean "not guaranteed to work 100% of the time"?
14:31:19 not the exit criteria, because just removing stuff pisses people off
14:31:29 edleafe: right
14:31:33 like zookeeper in nova
14:31:36 edleafe: well, it sounds like people are not afraid of experimental things
14:31:38 there are lots of untested things
14:31:53 bauzas: the thrill of living on the edge! :)
14:31:54 right, there was a plan to deprecate that, it's just not moving very quickly
14:32:04 johnthetubaguy: the tooz thing?
14:32:07 yeah
14:32:08 anyway, just an example
14:32:15 a very valid one
14:32:33 so I'm fine with adding some notion of bleeding edge, but that should be very explicit that it's not really fully covered, nor supported
14:32:47 johnthetubaguy: so for this uefi one, i just asked that the spec point out there will be a warning that it's untetsed
14:32:47 we're getting stuck here
14:32:49 *untested
14:32:51 then i'm +2
14:32:59 johnthetubaguy: fair to say that
14:33:12 mriedem: yeah, I would be happy with that too
14:33:29 it doesn't seem big enough to worry about any more than that
14:33:35 agreed
14:33:56 rgerganov: you linked a bug fix that is a spec, that's not affected by the freeze
14:34:19 johnthetubaguy, ah ok then
14:34:21 there was another one on the list, right
14:34:40 andrearosa: the nova-manage volume thingy
14:34:45 johnthetubaguy: yes
14:34:52 I do not want to discuss it here
14:35:01 sdague: started a ML thread to get more info
14:35:07 we didn't want an API for a DB hack, so we said nova-manage, but the current nova-manage command looks a bit like an API
14:35:15 or something like that
14:35:18 I tried to recap the long discussions we had
14:35:37 if people interested could follow up on the ML thread, I think that will help in moving on
14:35:39 andrearosa: ... where is that recap?
14:35:51 I don't see anything back on the mailing list
14:36:01 recap: http://lists.openstack.org/pipermail/openstack-dev/2015-December/081119.html
14:36:19 http://lists.openstack.org/pipermail/openstack-dev/2015-December/081119.html
14:36:24 the message got chopped a bit
14:36:31 hmm... and broke threading
14:36:33 scottda: thanks
14:36:44 so I didn't see the response
14:36:46 there he is
14:36:55 sdague: yes sorry I blame MS outlook
14:37:25 as I said I do not want to discuss it here, the ML seems more appropriate
14:38:14 scottda: DuncanT: andrearosa: you might also be interested in https://review.openstack.org/#/c/254428/
14:38:15 well are there any points people want to discuss here?
14:38:19 * sdague takes todo to go read the other part of the thread
14:38:24 OK
14:38:32 sdague: thanks.
14:38:53 making the DELETE API just work for when it's stuck in the deleting state seems to make sense to me
14:39:21 not totally sure how we sort that, feels like the thing that failed should have moved the state from deleting back to attached
14:39:28 but like you say, let's take that offline
14:39:35 seems like:
14:39:46 * should the API just do the correct thing?
14:40:01 can cinder help with this? we're going to change the API a bit to store connector_info in the DB
14:40:03 thought you were taking it offline? :)
14:40:03 * do we need some other API or DB hack to fix up some other edge cases that are left?
14:40:14 mriedem: yeah
14:40:28 we also discussed race-related volume fails in https://review.openstack.org/#/c/254428/
14:40:35 like moving volume create to api or conductor rather than compute
14:40:42 #topic open discussion
14:40:43 so we always have the volume_id before we build the instance
14:40:56 mriedem: oh good point, also related to detaching a volume for a shelved instance
14:41:21 e0ne: we also store the connection_info in the nova db
14:41:41 so I have two things for open discussion
14:41:59 the meetings on 24 December and 31 December
14:42:03 mriedem: I didn't know that. need to dig deeper into the code
14:42:13 I was thinking about cancelling those ones, or do people still want to do those?
14:42:26 nope, I won't be here
14:42:32 xmas eve and new year's eve
14:42:40 cancel them
14:42:43 mriedem: yeah, seems like prime skip candidates
14:42:44 we should maybe force people to take those days off?
14:42:46 maybe on some other earlier days?
14:42:47 ++
14:42:54 mriedem: not a terrible idea
14:43:02 so yeah, let's skip those meetings
14:43:05 like make the -nova channel moderated? :)
14:43:12 dansmith: heh
14:43:20 proxy all discussion through twitter
14:43:22 kick anyone discussing on those days?
14:43:37 so I don't think anyone jumped for joy, but does someone fancy the cross-project CPL job?
14:44:35 OK, offer is still out there
14:44:40 so is that someone sitting in the cross-project specs repo?
14:44:59 mriedem: yeah, and attending the cross-project meeting, should it happen
14:45:17 it's not very europe-timezone friendly though
14:45:43 so that's the open discussion stuff
14:45:49 I am guessing we are done
14:46:00 not alaski or mriedem quick, but a touch early
14:46:03 thanks all
14:46:12 #endmeeting
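The volume discussion earlier in the meeting ("feels like the thing that failed should have moved the state from deleting back to attached") describes a classic rollback-on-failure pattern. A minimal sketch follows; this is a hedged illustration of the idea raised, not actual Nova or Cinder code, and `delete_attachment` and the dict-based volume are hypothetical.

```python
# Hedged sketch: if a detach/delete step fails partway through, restore
# the previous volume status instead of leaving it stuck in 'deleting',
# which is the failure mode discussed in the meeting.

def delete_attachment(volume, detach_fn):
    """Attempt a detach; roll the status back on failure."""
    previous_status = volume["status"]  # e.g. 'attached'
    volume["status"] = "deleting"
    try:
        detach_fn(volume)
    except Exception:
        # Without this rollback the volume stays stuck in 'deleting'
        # and only a DB hack (or a DELETE that "just works") frees it.
        volume["status"] = previous_status
        raise
    volume["status"] = "deleted"
```

The alternative floated in the meeting, making the DELETE API itself tolerate the stuck `deleting` state, addresses the same problem from the other end: cleanup on retry rather than rollback at failure time.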