21:03:19 #startmeeting nova
21:03:20 Meeting started Thu Dec 5 21:03:19 2013 UTC and is due to finish in 60 minutes. The chair is russellb. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:03:20 hey
21:03:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:03:22 hi
21:03:24 \o
21:03:25 The meeting name has been set to 'nova'
21:03:25 o/
21:03:25 hi
21:03:26 o/
21:03:26 :D
21:03:26 hey everyone! sorry for starting a couple minutes late
21:03:28 .
21:03:32 hi
21:03:34 Hi
21:03:38 o/
21:03:38 hi
21:03:43 hi
21:03:49 o/
21:03:56 awesome, lots of folks
21:04:00 #topic general announcements
21:04:00 hi
21:04:05 icehouse-1 is out!
21:04:12 #link https://launchpad.net/nova/+milestone/icehouse-1
21:04:16 13 blueprints, > 200 bugs fixed
21:04:29 release is going by fast already
21:04:31 scary.
21:04:47 we'll talk more about icehouse-2 planning in a bit
21:04:52 other thing ... mid-cycle meetup
21:04:59 #link https://wiki.openstack.org/wiki/Nova/IcehouseCycleMeetup
21:05:03 6 people signed up so far, heh
21:05:05 # of blueprints is not expected :)
21:05:16 I remember last time it was 65?
21:05:25 russellb: 6 people?
21:05:28 well ... that's because *everything* was targeted at icehouse-1
21:05:35 and now most of it is on icehouse-2, some icehouse-3 ...
21:05:38 6?
21:05:43 * jog0 signs up
21:05:49 i suspect some folks are still working on getting budget approval
21:05:56 or just haven't filled out the registration thingy
21:05:58 so just a reminder
21:05:59 * n0ano tentative signup
21:06:00 yeah, should we hold off on signing up before that?
21:06:10 mriedem: yeah probably
21:06:14 k
21:06:27 but just making sure everyone saw all the info there
21:06:28 hotel info added
21:06:55 look at that, 2 people just signed up
21:07:00 meh, I have Marriott points.
21:07:18 :-)
21:07:22 heh, that's what's recommended anyway
21:07:29 just let me know if you have any questions about that
21:07:32 #topic sub-teams
21:07:36 let's do sub-teams first!
21:07:40 whee!
21:07:40 hartsocks: you!
21:07:51 (others raise your virtual hand so i know you're in line)
21:07:54 We're getting our act together for i2
21:08:00 * n0ano hand up
21:08:16 o/
21:08:21 hartsocks: cool, anything urgent that needs attention?
21:08:21 We have 2 BPs now in flight for i2 and I think we'll be proposing more stuff for i3
21:08:34 hartsocks: just approved one BP for i2 for you
21:08:41 danke
21:08:55 I'm trying to figure out how to move more of our stuff into Oslo.
21:08:58 BTW
21:09:09 cool.
21:09:12 or at least seeing what's appropriate to share there.
21:09:14 sure
21:09:31 I would like to do less *special* work in our driver and do smarter work overall.
21:09:47 We'll be getting our act together the next few weeks on that.
21:09:47 i don't think i'll argue with that :)
21:09:52 'sall from me.
21:09:55 thanks!
21:10:08 n0ano: what's up in scheduler land
21:10:47 much discussion about boris' no_db scheduler work, everyone likes the idea, issues about maintaining compatibility while transitioning to it.
21:11:20 yeah, good point, i haven't thought about the migration path too much on that
21:11:22 also a lot of talk about the forklift of the scheduler code, didn't have much time on that, will probably discuss on the email list & next week
21:11:31 ok, was curious if you guys got to that ...
21:11:36 i think we're about ready to start
21:11:48 need someone to do the initial heavy lifting of the code export
21:11:51 do we have a BP for the work involved in the forklift
21:12:02 lifeless: talking about scheduler forklift if you're around
21:12:04 we do have a blueprint
21:12:16 https://blueprints.launchpad.net/nova/+spec/forklift-scheduler-breakout
21:12:47 russellb: i
21:12:49 I'll go look at that, looks like there's lots of people signed up to do the work so it should go
21:12:53 russellb: sorry, I am totally around
21:12:58 lifeless: all good
21:13:00 I think we're ready to start indeed
21:13:01 we need:
21:13:08 - infra definitions for two projects
21:13:28 I was going to use openstack/python-commonschedulerclient and openstack/commonscheduler
21:13:43 - some filter-tree work to get commonscheduler existing
21:13:44 heh, that's fine, assuming we can rename if we want to be clever later
21:13:48 holy long name
21:13:59 and I think just cookiecutter to get the python-commonschedulerclient one in place
21:14:02 * jog0 votes for oslo-scheduler
21:14:07 what's in a name, as long as the code is there and works I don't care that much
21:14:08 * russellb was thinking gantt
21:14:11 mriedem: +1
21:14:18 n0ano: +1 ;)
21:14:28 what's in a name can be annoying for packagers....
21:14:32 think quantum/neutron
21:14:36 jog0: so, no - not oslo, unless we want to trigger all the integration/incubation questions up now
21:14:44 sure, real name to be decided before it's released/packaged
21:14:47 mriedem: we can rename anytime in the next three months.
21:14:48 * n0ano I'm a developer, what's packaging :-)
21:14:56 yes please, name change so bad
21:14:57 n0ano: it's one way that users get your code :)
21:14:57 gantt!
21:15:01 lifeless: point taken, sorry for side tracking
21:15:13 I'm fine with gantt
21:15:25 I'll put the infra patch up today
21:15:41 do we have volunteers to do the git filtering for the api server tree?
21:15:53 I'll do the client tree, I've it mostly canned here already anyhow
21:16:11 client tree is basically 1 file i think
21:16:15 nova/scheduler/rpcapi.py
21:16:23 russellb: yes, 95% will be cookiecutter stuff
21:16:29 lifeless, if the work involved is well defined we should be able to get people to do it
21:16:32 making it installable, tests for it, etc.
21:16:42 and there's probably a nova/tests/scheduler/test_rpcapi.py
21:16:48 n0ano: it's defined in the etherpad; I don't know if it's well enough defined
21:16:52 russellb: right :)
21:17:16 lifeless, indeed, I guess we'll have to just start at some point in time
21:17:36 n0ano: yep ... and we'll have to periodically sync stuff ... it'll be a pain for a while
21:17:44 that's why we need people looking after it regularly until it's done
21:17:55 kinda like nova-volume -> cinder
21:17:58 ok, so n0ano are you volunteering to do the git filter?
21:18:16 lifeless, sure, either me or I can always delegate someone
21:18:30 heh :) - whatever works
21:18:31 * n0ano what did I just sign up for!!
21:18:35 n0ano: thanks!
21:18:55 #note n0ano (or a delegate) to start working on the new scheduler git tree
21:18:56 :)
21:18:59 lifeless, send me emails with any details you need if necessary
21:19:20 it's in the meeting minutes, you have to now
21:19:21 n0ano: +1
21:19:31 russellb, NP
21:19:35 alright, let's move on
21:19:41 melwitt: hey! python-novaclient
21:19:46 anything on fire?
21:19:55 haha no, fortunately
21:19:59 good.
21:20:06 here is the weekly report:
21:20:06 open bugs, 117 !fix released, 81 !fix committed
21:20:06 24 new bugs
21:20:06 0 high bugs
21:20:06 22 patches up, 7 are WIP, https://review.openstack.org/#/c/51136/ could use more reviewers
21:20:56 for API merged into nova it says, so should be fine
21:20:59 assuming code is sane
21:21:29 looks sane at a glance, i'll take another look afterwards
21:21:42 the count of new bugs is lower than i remember, you must have been working on cleaning that up :)
21:21:55 yes I have :)
21:22:01 excellent!
21:22:42 as usual, if anyone wants to help with novaclient tasks, please talk to melwitt !
21:22:56 some bugs to be fixed i'm sure
21:23:04 melwitt: anything else you wanted to bring up today?
21:23:14 no, I think that's it
21:23:17 great thanks
21:23:20 #topic bugs
21:23:32 lifeless: any comments on nova bugs for today?
21:23:35 ruh roh
21:23:35 so
21:23:39 ha
21:23:43 200 New bugs
21:23:46 I have some stats but it's not quite right yet
21:23:50 72 untagged
21:24:05 * russellb has falled behind on tagging and could use some help
21:24:12 nice grammar
21:24:16 I falled behind too
21:24:20 damn you
21:24:29 plenty of critical bugs too (I am to blame for most I think)
21:24:36 #note dansmith to catch us up on bug triage this week
21:24:42 oof
21:24:46 #undo
21:24:47 https://review.openstack.org/#/c/58903/ <- is the thing I have put together
21:24:49 Removing item from minutes:
21:24:53 whew
21:25:01 http://pastebin.com/raw.php?i=vj4FuErC
21:25:02 it removed the topic, not the note ... wat
21:25:10 russellb: LOL
21:25:19 lolbug
21:25:25 so my intent is to get this fixed this week, and be able to actually frame workloads sensibly.
21:25:27 oh well, yes, stats!
21:25:39 Also, next week I'll have drafted a proposed update to the triage workflow
21:25:51 so - I know I'm not crushing this, but I am working on it :)
21:26:05 lifeless: thanks for working it
21:26:07 cool, appreciate the stats work, that's been a hole
21:26:17 and for anyone who has some bandwidth, our current process is:
21:26:17 but if the stats are even vaguely right, we get from 30 to 80 bugs a week
21:26:17 not crushing > nothing
21:26:20 #link https://wiki.openstack.org/wiki/Nova/BugTriage
21:26:50 which is ~ 10 a day to do 'is this urgent or backburner' assessment on - which I think is a pretty light workload really
21:26:52 i wonder how many of those are devs filing right before they post a patch
21:26:59 this is across both nova and novaclient
21:27:04 russellb: i think that happens a lot
21:27:04 should I report them separately?
21:27:19 well, the ones that go to In Progress immediately don't really need to be triaged in the same sense
21:27:21 i think
21:27:37 russellb: Do we ask them to do that? Is it valuable? [agreed that in progress immediately isn't a triage burden]
21:27:42 agreed, but priority isn't set either sometimes, or backport potential
21:28:00 mriedem: good point, so there's still some triage work to do
21:28:05 I usually create it in Confirmed state if I know I've got a bug, but haven't yet got a patch
21:28:07 i retract my comment then
21:28:10 yea and sometimes they eventually ended up abandoned by the original developer so need to go back into triage
21:28:27 cyeoh: true, that's one of the tasks listed on https://wiki.openstack.org/wiki/BugTriage
21:28:33 yeah, would be nice if launchpad had an expiration feature or something
21:28:36 though i don't know how often we make it past looking at the New ones ...
21:28:46 or knew when the patch was abandoned so it could set the bug to confirmed or something
21:28:48 or new
21:28:57 we should automate that, right?
21:29:06 probably
21:29:10 mriedem: it does
21:29:19 lifeless: after how long?
21:29:23 mriedem: if a bug is marked incomplete with no follow on comments, and on one project, it will expire
21:29:28 i've seen abandoned patches with in progress bugs long after they were abandoned
21:29:29 while on bugs can we talk about some of the gate bugs?
21:29:36 lifeless: ah, that's different
21:29:44 bug is in progress but patch is abandoned
21:29:44 jog0: yep
21:29:46 the bug doesn't change
21:29:48 mriedem: but if we need a different policy it's an api script away.
21:29:55 mriedem: oh, you want to unassign idle bugs?
21:29:55 in short http://lists.openstack.org/pipermail/openstack-dev/2013-December/021127.html
21:30:00 lifeless: yeah
21:30:02 something like that
21:30:09 lifeless: be nice if it had an auto-warn first
21:30:11 we have a lot of bugs on that list and most don't have anyone working on them
21:30:11 mriedem: sounds like an infra feature request.
21:30:15 (to the person it is assigned to)
21:30:23 including some neutron blocking bugs
21:30:37 cyeoh: they can always toggle it back, it's non destructive
21:31:13 russellb: hoping for some volunteers to fix some gate bugs
21:31:22 lifeless: yea was just thinking of avoiding races where someone else picks it up and ends up duplicating work already done
21:31:33 alright guys, let's not leave jog0 hanging. who can help with some gate issues over the next week?
21:31:48 (and really, it's all of openstack-dev, not just jog0) :-)
21:31:59 * jog0 can't, he will be at two conferences
21:32:15 looks like most of it is nova+neutron
21:32:25 yeah and a few nova + tempest
21:32:26 crap, there was another one i opened last night that has been happening but wasn't reported, scheduler fail
21:32:43 mriedem: that one isn't on the list yet but yeah
21:32:45 phil day had a patch that should help at least make it more obvious in the logs when it fails
21:32:47 I commented on it
21:32:59 jog0: k, haven't read the bug emails yet
21:33:03 i wonder how to give more incentive to work on these ...
21:33:07 mriedem: https://bugs.launchpad.net/nova/+bug/1257644
21:33:09 Launchpad bug 1257644 in nova "gate-tempest-dsvm-postgres-full fails - unable to schedule instance" [Critical,Triaged]
21:33:13 no other patches land until these fixed?
21:33:18 heh that's one way
21:33:19 and we have libvirt still stacktracing all the time
21:33:20 stop-the-line in LEAN terms
21:33:28 and i'm willing to use that hammer in desperate times
21:33:42 jog0: is that the 'instance not found' libvirt stacktrace?
21:33:43 russellb: Just a thought, but if the expectation is that that hammer will come out
21:33:45 russellb lifeless: we aren't there yet (I think)
21:33:52 russellb: perhaps folk will respond more quickly in non desperate times.
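(For context on the "api script away" remark above: a sweep like the one discussed — finding In Progress nova bugs whose assignee has gone idle and returning them to the triage queue — might look roughly like the sketch below. This is purely illustrative, not anything the team agreed on: the 90-day threshold, the move back to Confirmed, and the script name are assumptions, and a real version would want the auto-warn step mriedem asked for.)

```python
#!/usr/bin/env python
# Illustrative sketch only: list nova bugs that have sat "In Progress"
# with no activity for a while, so a human can decide whether to
# unassign them. Threshold and "Confirmed" target are assumptions.
from datetime import datetime, timedelta

import pytz
from launchpadlib.launchpad import Launchpad


def main():
    lp = Launchpad.login_with('nova-idle-bug-sweep', 'production')
    nova = lp.projects['nova']
    cutoff = datetime.now(pytz.utc) - timedelta(days=90)

    for task in nova.searchTasks(status=['In Progress']):
        bug = task.bug
        if task.assignee is not None and bug.date_last_updated < cutoff:
            print('Idle: #%d assigned to %s, last updated %s' % (
                bug.id, task.assignee.name, bug.date_last_updated))
            # Uncomment to actually apply the (hypothetical) policy:
            # task.assignee = None
            # task.status = 'Confirmed'
            # task.lp_save()


if __name__ == '__main__':
    main()
```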
21:34:20 true
21:34:37 mriedem: mriedem thats a different thing
21:34:39 let me dig
21:35:02 might be worth a test run
21:35:08 https://bugs.launchpad.net/nova/+bug/1255624
21:35:10 Launchpad bug 1255624 in nova "libvirtError: Unable to read from monitor: Connection reset by peer" [Critical,Triaged]
21:35:11 re: stop the line thing
21:35:13 https://bugs.launchpad.net/nova/+bug/1254872
21:35:15 Launchpad bug 1254872 in nova "libvirtError: Timed out during operation: cannot acquire state change lock" [Critical,Triaged]
21:35:16 i think if failure rates pass some really bad threshold we should do that
21:35:31 it was 25% the other day
21:35:31 (the hammer)
21:35:39 in gate not check
21:35:48 Can we link MC Hammer in the email where you say it's happened?
21:35:51 thoughts on where the "stop everything" line should be?
21:35:53 sure
21:35:56 hammertime
21:35:59 * lifeless is satisfied
21:36:08 mrodden: go on?
21:36:28 http://goo.gl/25j6nx
21:36:49 i think the idea is just no +A until we get criticals in gate fixed, threshold down etc.
21:36:50 gate-tempest-dsvm-postgres-full = 20.00 failure rate as of now
21:37:07 is there an idea of how fast these start to build up, i.e. once you hit a certain % it's exponential fails after that?
21:37:09 yeah, that's high
21:37:09 so fixing gate issues is a several step process
21:37:25 1) fingerprint the bug, and add to e-r
21:37:26 probably not (no more +A) high yet
21:37:31 2) identify root cause
21:37:34 3) fix it
21:37:46 step 1 is what I have been focusing on
21:37:53 steps 2 and 3 are a lot more work
21:37:57 is 'reverify no bug' dead yet?
21:38:13 jog0: hugely helpful, then people know which individual things to dive deep on
21:38:24 jog0: the libvirt things, danpb had some feedback on collecting more logs, any progress on that?
21:38:34 russellb: been backed up so no
21:38:45 ok, not expecting you to do it, just asking in general
21:39:25 alright, well, work needed here, but let's move on for the meeting
21:39:27 #topic blueprints
21:39:36 #link https://launchpad.net/nova/+milestone/icehouse-2
21:39:37 so gate pass rate = 80% right now
21:39:46 #link https://blueprints.launchpad.net/nova/icehouse
21:39:55 113 total icehouse blueprints right now
21:40:07 jog0, which jobs are failing consistently?
21:40:11 which already seems over our realistic velocity
21:40:19 let's continue the gate chat in #openstack-nova
21:40:24 k
21:40:36 now that icehouse-1 is done, we need to get our icehouse-2 and icehouse-3 lists cleaned up
21:40:40 87 blueprints on icehouse-2
21:40:43 which is *far* from realistic
21:40:44 if we get a lot of stuffs in rc2 or rc3, does that mean we will get a lot of bugs in the end?
21:41:00 shane-wang: i'm not sure what you mean?
21:41:34 russellb: I mean do we need some time in rc to clean the bugs?
21:41:49 shane-wang: icehouse will be feature frozen after icehouse-3
21:41:56 and there will be a number of weeks where only bug fixes can be merged
21:42:01 ok
21:42:06 #link https://wiki.openstack.org/wiki/Icehouse_Release_Schedule
21:42:11 just a little worried.
21:42:18 from March 6 to April 17
21:42:20 bug fixes only
21:42:26 good, thanks.
21:42:29 sure np
21:42:39 so, 40 blueprints still need some review attention
21:42:44 you can see it pretty well on the full icehouse list
21:42:49 anything not prioritized is still in the review process
21:43:01 russellb: how do our plans change if we decide to unfreeze nova-network?
21:43:16 Pending Approval (waiting on review), Review (waiting on submitter, but we need to follow up to check if updates have been posted)
21:43:27 Drafting (not ready for review), Discussion (review pending some discussion in progress)
21:43:44 so any help on those is appreciated, and generally expected of nova-drivers :-)
21:44:00 jog0: so, after icehouse-2 i'd like to have the nova-network discussion
21:44:11 our plans don't change *too* much at this point
21:44:20 other than we can accept more patches
21:44:29 jog0: we will have a bunch of v3 API work if we decide to support nova-network again, but I think it'll be manageable
21:44:37 cyeoh: great point.
21:44:52 well lets hope it doesn't come to that
21:44:56 agreed
21:45:03 it seems they are making much better progress than pre-summit
21:45:11 so i remain optimistic that it won't come to that
21:45:18 n0ano: whats your email address?
21:45:31 lifeless, donald.d.dugger@intel.com
21:45:46 russellb: I am not optimistic
21:45:54 if it does come to that, unfreezing nova-network will spawn a much more significant discussion about neutron that is beyond just nova
21:46:00 n0ano: ack, thanks
21:46:03 so ... not for just us to work out the details
21:46:25 any specific blueprints anyone wants to cover?
21:47:26 well, if you have stuff targeted at i2, please consider moving it to i3 if you don't think you'll have code in the next few weeks
21:47:36 #topic open discussion
21:47:44 the agenda on the wiki had something for open discussion
21:47:48 mriedem: that was you right?
21:47:48 upgrades are broken!
21:47:52 oh
21:47:54 russellb: yeah
21:47:55 dansmith: but you're fixing them
21:48:05 dansmith: we can talk about that in a sec though
21:48:08 russellb: ndipanov probably cares but don't see him here
21:48:20 basically the ML thread on mox/mock, sounds like that's mainly sorted out
21:48:36 use mock for new tests, move mox to mox3 over time, and there are exceptions to not using mock
21:48:48 exceptional cases sound like backports or big ugly hacks
21:49:09 the other thing on the agenda i was pointing out was our plethora of hacking guides
21:49:09 yeah
21:49:18 jog0 was cleaning up the keystone hacking guide to point at the global one
21:49:22 i think we should do the same for nova
21:49:29 i thought we already point to the global one?
21:49:29 i.e. nova/tests/README is super stale
21:49:37 and then have addition of our own nova specific rules
21:49:49 yeah, but i think the nova-specific stuff might need to be reviewed again
21:49:54 OK
21:49:58 nova/tests/README is its own animal i think
21:50:10 and I think you're talking about something more broad than what people generally think of when you say HACKING around here
21:50:13 you mean ... dev docs
21:50:14 and last point being the horizon guys have a great testing guide
21:50:18 yeah
21:50:20 there's also docs/source/devref/
21:50:27 well, there are wikis, devref, readmes, etc
21:50:30 yeah.
21:50:31 it's everywhere
21:50:43 docs/source/devref/ should probably be where we focus right?
21:50:43 ++ to streamlined dev docs
21:50:46 i think we should try to get as much of that as possible into the global hacking docs (devref)
21:50:51 yes
21:50:55 ok cool, that works for me
21:50:59 I support this effort, heh
21:51:03 and i think we want to integrate horizon's testing guide into the global hacking devref
21:51:09 horizon has a great testing guide
21:51:19 and then once that's done, we add our stance on mox/mock
21:51:30 * mriedem needs to breathe
21:51:31 you planning to work on this?
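(For context on the mox/mock guidance above — "use mock for new tests" — the sketch below shows the kind of minimal mock-based unit test the mailing list thread recommends, the sort of thing the consolidated dev docs could point reviewers at. It is purely illustrative: the WidgetDriver class and its methods are made up, not real nova code; only the pattern of patching a collaborator and asserting on the mock is the point.)

```python
# Illustrative only: WidgetDriver and its methods are hypothetical.
# The pattern shown (mock.patch.object + assert_called_once_with) is
# what new tests are expected to use instead of mox.
import unittest

import mock


class WidgetDriver(object):
    """Stand-in for the class a new test would exercise."""

    def fetch_widget(self, name):
        return self._call_backend('GET', '/widgets/%s' % name)

    def _call_backend(self, method, path):
        raise NotImplementedError()  # real I/O would live here


class WidgetDriverTestCase(unittest.TestCase):

    @mock.patch.object(WidgetDriver, '_call_backend')
    def test_fetch_widget(self, mock_call):
        # Stub out the backend call instead of recording/replaying with mox.
        mock_call.return_value = {'name': 'w1'}

        driver = WidgetDriver()
        result = driver.fetch_widget('w1')

        mock_call.assert_called_once_with('GET', '/widgets/w1')
        self.assertEqual({'name': 'w1'}, result)


if __name__ == '__main__':
    unittest.main()
```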
21:51:38 christ, so....
21:51:48 i think i can get the ball rolling
21:51:52 mriedem: I can get behind that idea, a little doc re-org is needed but very doable.
21:51:55 ndipanov seems to be passionate about it
21:52:02 so do you :)
21:52:06 yeah, this isn't rocket science, just takes time to do it
21:52:25 i care about it simply because i get tired of having to explain it in reviews
21:52:32 shall i write a letter to your boss? we have work for you to do here
21:52:33 i want to just point to a single location
21:53:00 plus it will be easier to tell people in a review 'sorry, use mock, link here'
21:53:07 works for me
21:53:09 then it's not a question of my personal preference, it's dogma
21:53:14 don't think anyone would argue with this
21:53:17 just need to do it :)
21:53:39 yeah, so i'll start to get some work items in place and then i (and maybe others) can help knock them off
21:54:17 alright sounds good
21:54:26 dansmith: you're up
21:54:48 well, there's not much to say, really, but geekinutah broke live upgrades
21:54:53 he broke them all to hell and back
21:54:57 haha
21:55:00 all his fault?
21:55:03 now youre backtracking
21:55:05 of course, it wasn't his fault at all
21:55:08 lol
21:55:12 but he DID break them
21:55:28 anyway, basically just infrastructure that we don't have all worked out yet
21:55:43 I've got a set up that fixes it, but it'll require a teensy backport to havana in order to make it work
21:56:03 however, with that, I'm back to being able to run tempest smoke against a deployment on master with havana compute nodes
21:56:05 did you have to hack rpc too?
21:56:09 to make your test work?
21:56:19 https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:object_compat,n,z
21:56:23 if anyone is interested
21:56:30 related: https://review.openstack.org/60361 and https://review.openstack.org/60362
21:56:34 posted 3 seconds ago
21:56:36 russellb: no, I see some like cert or console fails, but they're not blockers
21:56:59 russellb: so that set, with the conductor fix backported to havana makes it all work
21:57:14 dansmith: oh, hm, i guess it's just a mixed havana-icehouse compute node env that broke?
21:57:20 i guess that's right
21:57:31 russellb: right, that's the big breakage, which doesn't affect me
21:57:36 you just can't have both at the same time right now ^^^ without those fixes
21:57:41 right
21:57:44 k
21:57:57 and then your object magic
21:58:04 and the gate job
21:58:17 hey guys, live upgrades are hard
21:58:21 yeah, we really need the gate job,
21:58:36 but at least I have a one-button "are we broke or not" thing I can check until then
21:58:52 that is all
21:59:05 so next time I break things I'll have to fix them :-)?
21:59:22 geekinutah: next time you break them, I quit :)
21:59:26 alright, we're about out of time
21:59:30 thank you all for your time!
21:59:33 #endmeeting