14:00:41 <johnthetubaguy> #startmeeting nova
14:00:42 <openstack> Meeting started Thu Apr 2 14:00:41 2015 UTC and is due to finish in 60 minutes. The chair is johnthetubaguy. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:43 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:45 <openstack> The meeting name has been set to 'nova'
14:00:49 <ndipanov> o/
14:00:49 <neiljerram> o/
14:00:50 <mriedem> hi
14:00:51 <dims> o/
14:00:51 <alex_xu> \o
14:00:51 <bmwiedemann> o/
14:00:52 <alaski> o/
14:00:53 <markus_z> o/
14:00:56 <dansmith> o/
14:00:56 <edleafe> o/
14:00:59 <bauzas> \o
14:01:04 <claudiub> o/
14:01:06 <johnthetubaguy> #topic Kilo Release Status
14:01:12 <johnthetubaguy> hello hello
14:01:34 <johnthetubaguy> so its the final burn down to RC
14:01:51 <johnthetubaguy> basically lets merge lots of bug fixes and test kilo as much as we can
14:02:12 <johnthetubaguy> lets leave the bug talk to the bug section
14:02:17 <johnthetubaguy> #topic Gate status
14:02:30 <mriedem> stable was busted but that's fixed now
14:02:35 <johnthetubaguy> there was a gate fix around policy issues, but anything else cropping up?
14:02:51 <johnthetubaguy> cool, fixed now is better than not fixed
14:02:54 <mriedem> there was a bad trace with n-net and deallocate after some tempest changes but that's fixed and i'll backport to stable today
14:03:04 <mriedem> i've added links to some non-voting jobs
14:03:04 <sdague> o/
14:03:06 <andreykurilin> o/
14:03:09 <johnthetubaguy> mriedem: anything you need people to jump on
14:03:16 <ndipanov> the pci bug with compute not being able to start has a fix re-posted and it looks better
14:03:17 <mriedem> yes
14:03:17 <mriedem> aiocpu (multi-node devstack job that runs live migration testing) fails a lot, see bug https://bugs.launchpad.net/nova/+bug/1438803
14:03:18 <openstack> Launchpad bug 1438803 in OpenStack Compute (nova) "libvirt error libvirtError: Requested operation is not valid: nwfilter is in use hit in _post_live_migration" [Undecided,In progress] - Assigned to Matt Riedemann (mriedem)
14:03:31 <ndipanov> https://review.openstack.org/#/c/131321/ <<== this
14:03:33 <mriedem> i have a patch up for ^ but it's hacky and i haven't seen it demonstrate that it hits the flow
14:03:44 <johnthetubaguy> we are talking gate bugs here
14:03:47 <bauzas> we have the cells job that is going to be fixed soon
14:03:56 <ndipanov> ah ok
14:04:03 <mriedem> johnthetubaguy: i realize, but these are also jobs running on nova code
14:04:08 <mriedem> so if they don't work, it means nova doesn't work
14:04:11 <bauzas> but let's discuss about the cells job on the bugs section
14:04:14 <ndipanov> sorry bout that
14:04:21 <johnthetubaguy> mriedem: totally
14:04:31 <mriedem> oh, nvm :)
14:04:34 <mriedem> i got defensive for nothing
14:04:41 <mriedem> moving on - ceph job: ceph non-voting job fails hard on test_volume_boot_pattern: https://bugs.launchpad.net/cinder/+bug/1439371
14:04:42 <openstack> Launchpad bug 1439371 in Cinder "Volume creation from image fails for UEC+Ceph" [Undecided,New] - Assigned to Jon Bernard (jbernard)
14:04:55 <mriedem> dansmith: and jbernard looked at that yesterday, looks like it's in cinder's court now?
14:05:09 <dansmith> yeah, although I doubt there will be a fix for kilo
14:05:14 <mriedem> ouch
14:05:15 <dansmith> jbernard is going to try
14:05:18 <johnthetubaguy> mriedem: not got priorities on these, what do you want to set them as?
14:05:20 <dansmith> but I think it's probably not going to happen
14:05:25 <dansmith> however, it's not a huge deal, IMHO
14:05:33 <dansmith> it's only ceph with UEC images AND boot from volume
14:05:39 <dansmith> easy to release note that and fix it
14:05:48 <johnthetubaguy> ah, OK, so we can probably mark that with the tag: https://launchpad.net/nova/+bugs?field.tag=kilo-rc-potential
14:05:54 <dansmith> I am working on a change to the gate to use the disk image for ceph jobs
14:05:58 <mriedem> for the live migration one, i don't know if that's an rc blocker or not
14:05:59 <dansmith> so we can get actual runs again
14:06:19 <johnthetubaguy> mriedem: just thinking we should triage the bug, thats all really
14:06:28 <mriedem> if i can get a run with my workaround retry patch for live migration and show it's hitting the retry loop, we could merge that as a workaround for kilo
14:06:39 <sdague> mriedem: on livemigration it's something we only recently got testing, so I'm not surprised we're exposing issues
14:06:42 <mriedem> johnthetubaguy: well, we were told it's due to super old libvirt/qemu
14:06:47 <sdague> mostly we should consider them for backporting
14:07:05 <johnthetubaguy> sdague: yeah, thats a good point
14:07:11 <dansmith> if they're not regressions, they're less important I think
14:07:12 <mriedem> yeah, like i said, not an rc blocker imo
14:07:29 <johnthetubaguy> +1 to all those comments
14:07:31 <mriedem> the other thing i had listed was the cells job, but sounds like bauzas was going to talk about that later
14:07:42 <johnthetubaguy> I trust mriedem to make that clear on the bug
14:07:48 <johnthetubaguy> cools…
14:07:48 <bauzas> mriedem: well, all the related bugs are RC1 related
14:08:12 <bauzas> but we can give a status here
14:08:15 <johnthetubaguy> #topic Bugs
14:08:24 <johnthetubaguy> so before we start… a quick process reminder
14:08:41 <johnthetubaguy> we can release RC1 once all the bugs listed here are merged:
14:08:54 <johnthetubaguy> https://launchpad.net/nova/+milestone/kilo-rc1
14:09:00 <johnthetubaguy> …so
14:09:07 <mriedem> rc1 is sort of scheduled for 4/9 right?
14:09:10 <mriedem> a week from now
14:09:16 <johnthetubaguy> please only target bugs to rc1 if you think we should block the release on that
14:09:21 <mriedem> or is 4/9 just when the project / release manageres get really antsy?
14:09:34 <johnthetubaguy> mriedem: yes, https://wiki.openstack.org/wiki/Kilo_Release_Schedule
14:09:43 <sdague> 4/9 is the last possible day
14:09:43 <ndipanov> well block seems strong here
14:09:52 <bauzas> mriedem: question was answered by mikal saying "the sooner is the better"
14:10:02 <bauzas> ideally before 4/9 IIUC
14:10:04 <sdague> yeh, earlier is better
14:10:09 <mriedem> yeah, we can also do rc2 if needed
14:10:10 <johnthetubaguy> well, we don't want to cut RC1 too early, so we force RC2, but yeah
14:10:20 <johnthetubaguy> anyways, so thats the exit criteria
14:10:28 <ndipanov> we have the importance - so I would imageine we target what we think is reasonable to land
14:10:29 <johnthetubaguy> all the RC1 targeted bugs merged
14:10:43 <johnthetubaguy> so if you target a bug, you are saying "please block the release on this"
14:10:48 <johnthetubaguy> now we also have a tag
14:10:56 <johnthetubaguy> if you want to say, please consider blocking on this one
14:10:59 <johnthetubaguy> then use this:
14:11:07 <johnthetubaguy> https://launchpad.net/nova/+bugs?field.tag=kilo-rc-potential
14:11:18 <johnthetubaguy> OK, so does that all sound like people expected?
14:11:35 <johnthetubaguy> its the same juno, as far as I remember
14:11:37 <ndipanov> yes mostly
14:11:40 <sdague> johnthetubaguy: yep
14:11:48 <johnthetubaguy> cools
14:12:04 <johnthetubaguy> #info targeting to kilo-rc1 means please block the kilo release on my bug
14:12:12 <johnthetubaguy> lets talk bugs
14:12:25 <johnthetubaguy> there was a cells one?
14:13:06 <bauzas> yup
14:13:11 <bauzas> so, just a quick status
14:13:39 <bauzas> chain starting with https://review.openstack.org/#/c/168294/3 is marked as RC1 and targets to reduce the failures down to 3
14:13:57 <sdague> bauzas: and the last 3 are going to be whitelisted out right?
14:14:05 <bauzas> and https://review.openstack.org/#/c/166396/ is whitelisting those 3
14:14:22 <bauzas> with a Depends-On tag on the last patch from the series
14:14:59 <sdague> yeh, I'm not sure why we do depends on there
14:15:02 <bauzas> all the patches but one are good to review, I should privode a last update for https://review.openstack.org/#/c/169400/3 by the 2 next hours
14:15:23 <johnthetubaguy> sdague: if we did depends the other way around we should see the cells test pass I guess?
14:15:29 <bauzas> sdague: because it will remove from the whitelist the failures that are fixed by the series
14:15:42 <sdague> johnthetubaguy: no we wouldn't
14:16:02 <johnthetubaguy> OK, I guess we can't depend on that repo
14:16:02 <sdague> bauzas: we can do it offline, but it's not needed here, I'm going to delete it
14:16:10 <johnthetubaguy> OK, cool
14:16:14 <bauzas> sdague: okay, let's discuss offline, sure
14:16:30 <johnthetubaguy> any more on those? they look good but not quite blockers I guess
14:16:54 <bauzas> johnthetubaguy: we actually would like to see the cells job green by Kilo hence the RC1 tag
14:17:13 <bauzas> if not, it would require backports
14:17:28 <johnthetubaguy> bauzas: OK
14:17:37 <sdague> bauzas: if so, you need to respin patches faster when alaski -1s them :)
14:17:52 <bauzas> sdague: yey, I know...
14:17:53 <johnthetubaguy> any more for this before we go onto the open discussion bits?
14:18:11 <edleafe> bauzas: I can help if needed
14:18:23 <sdague> honestly, I don't think the cells patches should hold up rc, because they are iterating too slow
14:18:53 <alaski> if it comes to it I agree, but I would like to see them in
14:18:57 <johnthetubaguy> sdague: I am with you, we can remove those when we get closer
14:18:58 <bauzas> sdague: eh, it was -1'd yesterday evening my time
14:18:58 <alaski> I can help iterate too
14:19:05 <dansmith> not holding up rc makes sense
14:19:12 <dansmith> continuing to merge until the line does as well, IMHO
14:19:24 <sdague> dansmith: ++
14:19:37 <dansmith> they *have* made amazing progress in a short time,
14:19:44 <johnthetubaguy> +1
14:19:45 <dansmith> progress that we've been looking for for years :)
14:20:05 <johnthetubaguy> …. to be clear
14:20:24 <johnthetubaguy> we can merge any non-violating bug fixes until we try to cut RC1
14:20:50 <johnthetubaguy> so no need to target a bug to be able to merge it right now
14:20:58 <johnthetubaguy> we are trusting all cores to do the right thing here
14:21:01 <bauzas> okay, lemme try to fix the -1 in that hour, that's doable
14:21:01 <johnthetubaguy> anyways, moving on
14:21:13 <johnthetubaguy> #topic open discussion
14:21:32 <johnthetubaguy> so skipping stuck reviews
14:21:36 <johnthetubaguy> as there are non in the agenda
14:21:41 <johnthetubaguy> lets go through the agenda here
14:22:04 <johnthetubaguy> #link http://lists.openstack.org/pipermail/openstack-dev/2015-April/060360.html
14:22:10 <johnthetubaguy> systemz fun
14:22:11 <mriedem> markus_z: ^?
14:22:22 <markus_z> yepp, that's me
14:22:24 <dansmith> aren't we using fakelibvirt in the tests?
14:22:27 <mriedem> we should be
14:22:36 <dansmith> are they not running them in tox or something?
14:22:45 <sdague> dansmith: yeh, I thought so, it might be falling back through under some cases?
14:22:53 <bauzas> johnthetubaguy: I just added an item for the Bugs section in the agenda just before the meeting
14:22:56 <bauzas> (my badf)
14:22:58 <mriedem> are we requireing zkvm CI to run unit tests?
14:22:59 <dansmith> I didn't think it ... would.
14:23:00 <mriedem> i thought it was just tempest?
14:23:12 <dansmith> generally that's the case, yeah
14:23:16 <mriedem> we don't require other virt drivers to run unit tests
14:23:21 <mriedem> *CI i mean
14:23:27 <dansmith> right
14:23:42 <dansmith> saying the unit tests pass on platform X (or Z heh) doesn't mean much, IMHO
14:23:46 <dansmith> just a waste of resources I think
14:23:49 <mriedem> yeah
14:23:56 <mriedem> markus_z: so don't run unit tests on z :)
14:23:59 <mriedem> focus on tempest
14:24:00 <johnthetubaguy> dansmith: +1
14:24:01 <dansmith> problem solved!
14:24:03 <markus_z> Thanks, that was easy :)
14:24:05 <dansmith> heh
14:24:06 <johnthetubaguy> cools
14:24:25 <johnthetubaguy> so, hypervisor support matrix
14:24:30 <mriedem> also z
14:24:38 <markus_z> That's also from my side.
14:24:48 <mriedem> markus_z: for the support matrix, i think you just need to get a patch up and then it'll be iterated in review
14:24:55 <johnthetubaguy> right
14:25:02 <markus_z> Is the CI a precondition?
14:25:05 <mriedem> if something is partial, i think there is a place to add notes
14:25:07 <mriedem> no
14:25:13 <mriedem> well, i assume not
14:25:17 <mriedem> zkvm blueprint is merged and in tree
14:25:19 <markus_z> Some items are unclear to me, what do I do with them?
14:25:22 <johnthetubaguy> markus_z: CI is more about A vs C catagory
14:25:32 <markus_z> johnthetubaguy: Ah, ok
14:25:41 <mriedem> markus_z: maybe put TBD in the review and then -W it
14:25:42 <dansmith> yeah
14:25:47 <mriedem> then iterate in review
14:25:55 <mriedem> TBD in the table cell i mean for the item in question
14:25:56 <johnthetubaguy> yeah, lets get the known ticks and known crosses
14:25:58 <jichen> johnthetubaguy: does that means any hypervisor can submit it without CI ?code in stackforge is also ok?
14:26:04 <mriedem> jichen: no
14:26:14 <mriedem> powervc and zvm in stackforge don't apply here, or nova-docker :)
14:26:19 <mriedem> zkvm is in tree in the libvirt driver
14:26:20 <dansmith> right
14:26:22 <johnthetubaguy> jichen: we are talking about in three drivers here
14:26:29 <jichen> mriedem: ok, got it
14:26:30 <johnthetubaguy> s/three/tree/
14:26:40 <jichen> johnthetubaguy: ok, thanks
14:26:41 <mriedem> jichen: you should probably have your own support matrix for zvm in the stackforge repo though
14:26:51 <johnthetubaguy> markus_z: have we unblocked you now?
14:27:05 <johnthetubaguy> markus_z: I guess just get the review up, and we can talk more in gerrit
14:27:06 <markus_z> Not yet, sorry.
14:27:26 <johnthetubaguy> np, whats the next question?
14:27:28 <markus_z> Some of the items itself are unclear. Even in the description there is "something something, dark side".
14:27:34 <markus_z> That's what I meant
14:27:52 <dansmith> markus_z: let's talk about them off the meeting
14:28:00 <dansmith> assuming you're talking about the matrix...
14:28:01 <johnthetubaguy> markus_z: right, lets talk about that offline
14:28:06 <markus_z> dansmith: yes I do
14:28:12 <dansmith> cool
14:28:13 <johnthetubaguy> just catch us in #openstack-nova as normal
14:28:16 <johnthetubaguy> cools
14:28:19 <markus_z> OK, pushing it next week and talk in review
14:28:26 <johnthetubaguy> so a message from mikal…
14:28:26 <mriedem> no family guy jokes here!
14:28:37 <markus_z> okidoki
14:28:43 <johnthetubaguy> #link http://lists.openstack.org/pipermail/openstack-dev/2015-April/060360.html
14:28:52 <johnthetubaguy> arg
14:28:59 <johnthetubaguy> #link https://etherpad.openstack.org/p/liberty-nova-summit-ideas
14:29:03 <johnthetubaguy> thats what I meant
14:29:07 <johnthetubaguy> the flood gates are open
14:29:23 <mriedem> well, you can probably throw nova-net -> neutron migration on there
14:29:25 <mriedem> and evacuate
14:29:26 <johnthetubaguy> but I would expect to be requested to have a spec up for review, if you want to discussion your feature at the summit
14:30:05 <bauzas> johnthetubaguy: how many fishbowl sessions ?
14:30:09 <bauzas> johnthetubaguy: 2 days ?
14:30:11 <johnthetubaguy> I think the deadline for sessions will be announced post election
14:30:15 <johnthetubaguy> bauzas: I think so
14:30:26 <mriedem> fwiw, i hope everything is scheduled similar to paris
14:30:27 <cfriesen> mriedem: evacuate?
14:30:31 <johnthetubaguy> same as last time, is what I remember we agreed
14:30:37 <mriedem> with 2 back to back sessions for hairy issues
14:30:42 <johnthetubaguy> right
14:30:44 <dansmith> cfriesen: yeah, as in "effing fix it"
14:30:45 <bauzas> ok, then the contrib meetup, fair
14:31:04 <mriedem> and a thing thurs afternoon to decide release priorities
14:31:06 <sdague> yeh, that worked well last time
14:31:12 <bauzas> agrteed
14:31:15 <mriedem> then the orgy on friday
14:31:16 <mriedem> :)
14:31:32 * dansmith bites his tongue
14:31:43 <johnthetubaguy> yes...
14:31:52 <mriedem> i'll throw the evacuate wip spec in here
14:31:52 <bauzas> so the etherpad is the source for the sessions ? no longer the website ?
14:31:55 <mriedem> since it was controversial
14:31:57 <johnthetubaguy> bauzas: you had something to discuss?
14:32:04 <johnthetubaguy> bauzas: I believe so, just like paris
14:32:07 <bauzas> johnthetubaguy: ok
14:32:13 <dansmith> mriedem: I don't think the spec is controversial
14:32:20 <bauzas> johnthetubaguy: yeah, just re: http://lists.openstack.org/pipermail/openstack-dev/2015-April/060448.html
14:32:20 <mriedem> dansmith: not the spec
14:32:24 <dansmith> mriedem: the "do we do anything in the meantime" was the controversial bit
14:32:25 <dansmith> okay
14:32:36 <mriedem> the thing that led to the spec
14:32:49 <dansmith> well, I think that's sailed now
14:33:22 <mriedem> in a sea of f bombs
14:33:22 <mriedem> yes
14:33:29 <dansmith> heh
14:33:40 <johnthetubaguy> we could add a log message saying not to do it
14:33:43 * johnthetubaguy ducks
14:34:06 <dansmith> not to do evacuate?
14:34:07 <johnthetubaguy> bauzas: so you had something about that thread?
14:34:26 <sdague> seems like path forward on quotas might be useful as well
14:34:28 <bauzas> johnthetubaguy: I was just thinking we should maybe discuss with the bug monkeys for helping checking it
14:34:33 <dansmith> LOG.warning('Oh jeez, you just effed up your system')
14:34:34 <sdague> because that's a ball of mud
14:34:34 <mriedem> sdague: quotas is in the etherpad
14:34:37 <johnthetubaguy> dansmith: i was kinda joking, but yeah
14:34:38 <dansmith> sdague: yeah
14:34:49 <bauzas> I looked and Launchpad is buggy
14:34:50 <cfriesen> bauzas: I liked James Bottomley's suggestion of having it automatically change state when more info provided
14:34:58 <sdague> mriedem: oh I missed it
14:35:09 <bauzas> cfriesen: Launchpad doesn't do this
14:35:16 <johnthetubaguy> sdague: +1 on fixing quotas, using compare and swap in the DB was the biggest lead I heard, but yeah
14:35:30 <sdague> cfriesen: that assumes all kinds of things about launchpad without knowing any of the launchpad limitations
14:35:44 <bauzas> cfriesen: and it's buggy because if you check 'Incomplete with response", it will give you bugs that are replied, but not from the owner only
14:35:46 <sdague> it was exceptionally unproductive response honestly
14:35:48 <bauzas> s/owner/reporter
14:36:18 <bauzas> http://goo.gl/YPMUf3 is the list of incomplete bugs having replies
14:36:22 <bauzas> 49 IIRC
14:36:40 <bauzas> and most of them have replies, but not from the reporter, just as follow-up messages
14:37:02 <bauzas> so we would need to iterate over them
14:37:16 <bauzas> as said, it needs to be done manually until someone enough lazy automates it
14:37:38 <bauzas> I was thinking that the trivial bug monkeys were a good gang for helping that
14:37:52 <sdague> yeh, I did go through a couple hundred incomplete bugs yesterday, closed a lot of them, it's just a lot of manual work
14:38:16 <bauzas> sdague: I tried during lunch to play with launchpadlib, that's doable but risky
14:38:35 <sdague> bauzas: I don't trust launchpadlib
14:38:42 <bauzas> sdague: agreed
14:38:48 <sdague> given how often launchpad times out rest calls
14:38:49 <bauzas> sdague: as I said, the report is wrong
14:38:52 <sdague> yep
14:39:02 <kashyap> Oh, good, /me was about play with launchpad lib
14:39:18 <bauzas> kashyap: that's fine but that's really buggy
14:39:23 <kashyap> Also, auto-closing some bugs can close off valid bugs as ttx noted :-(
14:39:43 <bauzas> kashyap: ^ hence my point, we need to do this by hand :(
14:39:49 <kashyap> But, I think given the volume, I think this auto-expiration needs to be in place I guess
14:39:58 <garyk> manully is a valid option.
14:40:11 <sdague> garyk: only if I'm not the only one doing it :P
14:40:11 <johnthetubaguy> the rub is, we need people to do this work
14:40:19 <bauzas> johnthetubaguy: exactly
14:40:21 <johnthetubaguy> sdague: +1
14:40:23 <sdague> so auto expire is back on
14:40:24 <kashyap> johnthetubaguy: Yes, and it's utterly unthankful
14:40:24 <garyk> sdague: you should certainly not be the one doing it.
14:40:33 <ttx> If you end up expiring manually all bugs after a given abandon time, you should just autoexpire and be done with them :)
14:40:37 <johnthetubaguy> kashyap: agreed
14:40:37 <garyk> it should be the onus of people who are responsible for tagged bugs
14:40:46 <bauzas> sdague: that's why I think we should call the bug monkeys to look at them
14:40:48 <sdague> garyk: I clearly am, there were 200+ incomplete bugs when I looked yesterday
14:40:55 <ttx> My point on the ML was that the tool is not really helping having a conversation with the reporter anyway
14:41:17 <kashyap> Also, hard as it might be to digest, sometimes some bugs will take a year or so to get to a proper resolution.
14:41:22 <johnthetubaguy> sdague: so I think auto abandon on is the right choice
14:41:28 <kashyap> I've seen such bugs in other community projects I participate in.
14:41:30 <sdague> johnthetubaguy: agreed
14:41:34 <garyk> sdague: i know. but you should not be responsible for doing this by yourself.
14:41:54 <garyk> it is our responsibility so we should get our act tigether
14:41:56 <bauzas> ttx: that's a tool issue, we should have a bug system that should put back the ticket in another state if the reporter is replying
14:42:19 <bauzas> at least if he's ticking that he answers the question
14:42:46 <johnthetubaguy> #help we need more folks to help with bug triage
14:42:56 <johnthetubaguy> OK, I don't see lots of us stepping up to help here
14:43:01 <johnthetubaguy> thats what we need
14:43:05 <anteaya> bauzas: I don't think you are helping support getting someone to do the work by using the phrase bug monkeys
14:43:12 <johnthetubaguy> maybe need some kind of bounty
14:43:25 <bauzas> anteaya: just because it worked for trivial bugs
14:43:29 <johnthetubaguy> but I think this conversation has run its course a bit...
14:43:38 <sdague> johnthetubaguy: agreed, move on
14:43:43 <johnthetubaguy> anything more before we move back over to #openstack-nova ?
14:43:48 <bauzas> anteaya: https://etherpad.openstack.org/p/kilo-nova-priorities-tracking L148 and below
14:43:51 <mriedem> end it!
14:43:58 <johnthetubaguy> mriedem: agreed
14:43:59 <bauzas> fair enough
14:44:01 <johnthetubaguy> thanks all
14:44:19 <johnthetubaguy> happy bug triage and fixing
14:44:23 <johnthetubaguy> #endmeeting