14:00:41 <johnthetubaguy> #startmeeting nova
14:00:42 <openstack> Meeting started Thu Apr  2 14:00:41 2015 UTC and is due to finish in 60 minutes.  The chair is johnthetubaguy. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:43 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:45 <openstack> The meeting name has been set to 'nova'
14:00:49 <ndipanov> o/
14:00:49 <neiljerram> o/
14:00:50 <mriedem> hi
14:00:51 <dims> o/
14:00:51 <alex_xu> \o
14:00:51 <bmwiedemann> o/
14:00:52 <alaski> o/
14:00:53 <markus_z> o/
14:00:56 <dansmith> o/
14:00:56 <edleafe> o/
14:00:59 <bauzas> \o
14:01:04 <claudiub> o/
14:01:06 <johnthetubaguy> #topic Kilo Release Status
14:01:12 <johnthetubaguy> hello hello
14:01:34 <johnthetubaguy> so it's the final burn down to RC
14:01:51 <johnthetubaguy> basically let's merge lots of bug fixes and test kilo as much as we can
14:02:12 <johnthetubaguy> let's leave the bug talk to the bug section
14:02:17 <johnthetubaguy> #topic Gate status
14:02:30 <mriedem> stable was busted but that's fixed now
14:02:35 <johnthetubaguy> there was a gate fix around policy issues, but anything else cropping up?
14:02:51 <johnthetubaguy> cool, fixed now is better than not fixed
14:02:54 <mriedem> there was a bad trace with n-net and deallocate after some tempest changes but that's fixed and i'll backport to stable today
14:03:04 <mriedem> i've added links to some non-voting jobs
14:03:04 <sdague> o/
14:03:06 <andreykurilin> o/
14:03:09 <johnthetubaguy> mriedem: anything you need people to jump on
14:03:16 <ndipanov> the pci bug with compute not being able to start has a fix re-posted and it looks better
14:03:17 <mriedem> yes
14:03:17 <mriedem> aiocpu (multi-node devstack job that runs live migration testing) fails a lot, see bug https://bugs.launchpad.net/nova/+bug/1438803
14:03:18 <openstack> Launchpad bug 1438803 in OpenStack Compute (nova) "libvirt error libvirtError: Requested operation is not valid: nwfilter is in use hit in _post_live_migration" [Undecided,In progress] - Assigned to Matt Riedemann (mriedem)
14:03:31 <ndipanov> https://review.openstack.org/#/c/131321/ <<== this
14:03:33 <mriedem> i have a patch up for ^ but it's hacky and i haven't seen it demonstrate that it hits the flow
14:03:44 <johnthetubaguy> we are talking gate bugs here
14:03:47 <bauzas> we have the cells job that is going to be fixed soon
14:03:56 <ndipanov> ah ok
14:04:03 <mriedem> johnthetubaguy: i realize, but these are also jobs running on nova code
14:04:08 <mriedem> so if they don't work, it means nova doesn't work
14:04:11 <bauzas> but let's discuss the cells job in the bugs section
14:04:14 <ndipanov> sorry bout that
14:04:21 <johnthetubaguy> mriedem: totally
14:04:31 <mriedem> oh, nvm :)
14:04:34 <mriedem> i got defensive for nothing
14:04:41 <mriedem> moving on - ceph job: ceph non-voting job fails hard on test_volume_boot_pattern: https://bugs.launchpad.net/cinder/+bug/1439371
14:04:42 <openstack> Launchpad bug 1439371 in Cinder "Volume creation from image fails for UEC+Ceph" [Undecided,New] - Assigned to Jon Bernard (jbernard)
14:04:55 <mriedem> dansmith: and jbernard looked at that yesterday, looks like it's in cinder's court now?
14:05:09 <dansmith> yeah, although I doubt there will be a fix for kilo
14:05:14 <mriedem> ouch
14:05:15 <dansmith> jbernard is going to try
14:05:18 <johnthetubaguy> mriedem: not got priorities on these, what do you want to set them as?
14:05:20 <dansmith> but I think it's probably not going to happen
14:05:25 <dansmith> however, it's not a huge deal, IMHO
14:05:33 <dansmith> it's only ceph with UEC images AND boot from volume
14:05:39 <dansmith> easy to release note that and fix it
14:05:48 <johnthetubaguy> ah, OK, so we can probably mark that with the tag: https://launchpad.net/nova/+bugs?field.tag=kilo-rc-potential
14:05:54 <dansmith> I am working on a change to the gate to use the disk image for ceph jobs
14:05:58 <mriedem> for the live migration one, i don't know if that's an rc blocker or not
14:05:59 <dansmith> so we can get actual runs again
14:06:19 <johnthetubaguy> mriedem: just thinking we should triage the bug, that's all really
14:06:28 <mriedem> if i can get a run with my workaround retry patch for live migration and show it's hitting the retry loop, we could merge that as a workaround for kilo
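The workaround mriedem describes, retrying when libvirt reports that the nwfilter is still in use, could look roughly like the sketch below. This is not the patch under review: `FilterInUseError`, `retry_when_in_use`, and the attempt and delay values are hypothetical names chosen for illustration, and real code would presumably catch the `libvirtError` named in the bug title instead.

```python
import time


class FilterInUseError(Exception):
    """Hypothetical stand-in for libvirt's 'nwfilter is in use' error."""


def retry_when_in_use(operation, attempts=3, delay=0.0):
    """Call operation(), retrying while it raises FilterInUseError.

    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except FilterInUseError:
            if attempt == attempts:
                raise
            time.sleep(delay)  # give the filter time to be released
```

Logging each retry inside the `except` branch would be one way to show in a gate run that the retry loop is actually being hit.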
14:06:39 <sdague> mriedem: on livemigration it's something we only recently got testing, so I'm not surprised we're exposing issues
14:06:42 <mriedem> johnthetubaguy: well, we were told it's due to super old libvirt/qemu
14:06:47 <sdague> mostly we should consider them for backporting
14:07:05 <johnthetubaguy> sdague: yeah, that's a good point
14:07:11 <dansmith> if they're not regressions, they're less important I think
14:07:12 <mriedem> yeah, like i said, not an rc blocker imo
14:07:29 <johnthetubaguy> +1 to all those comments
14:07:31 <mriedem> the other thing i had listed was the cells job, but sounds like bauzas was going to talk about that later
14:07:42 <johnthetubaguy> I trust mriedem to make that clear on the bug
14:07:48 <johnthetubaguy> cools…
14:07:48 <bauzas> mriedem: well, all the related bugs are RC1 related
14:08:12 <bauzas> but we can give a status here
14:08:15 <johnthetubaguy> #topic Bugs
14:08:24 <johnthetubaguy> so before we start… a quick process reminder
14:08:41 <johnthetubaguy> we can release RC1 once all the bugs listed here are merged:
14:08:54 <johnthetubaguy> https://launchpad.net/nova/+milestone/kilo-rc1
14:09:00 <johnthetubaguy> …so
14:09:07 <mriedem> rc1 is sort of scheduled for 4/9 right?
14:09:10 <mriedem> a week from now
14:09:16 <johnthetubaguy> please only target bugs to rc1 if you think we should block the release on that
14:09:21 <mriedem> or is 4/9 just when the project / release managers get really antsy?
14:09:34 <johnthetubaguy> mriedem: yes, https://wiki.openstack.org/wiki/Kilo_Release_Schedule
14:09:43 <sdague> 4/9 is the last possible day
14:09:43 <ndipanov> well block seems strong here
14:09:52 <bauzas> mriedem: question was answered by mikal saying "the sooner the better"
14:10:02 <bauzas> ideally before 4/9 IIUC
14:10:04 <sdague> yeh, earlier is better
14:10:09 <mriedem> yeah, we can also do rc2 if needed
14:10:10 <johnthetubaguy> well, we don't want to cut RC1 too early, so we force RC2, but yeah
14:10:20 <johnthetubaguy> anyways, so that's the exit criteria
14:10:28 <ndipanov> we have the importance - so I would imagine we target what we think is reasonable to land
14:10:29 <johnthetubaguy> all the RC1 targeted bugs merged
14:10:43 <johnthetubaguy> so if you target a bug, you are saying "please block the release on this"
14:10:48 <johnthetubaguy> now we also have a tag
14:10:56 <johnthetubaguy> if you want to say, please consider blocking on this one
14:10:59 <johnthetubaguy> then use this:
14:11:07 <johnthetubaguy> https://launchpad.net/nova/+bugs?field.tag=kilo-rc-potential
14:11:18 <johnthetubaguy> OK, so does that all sound like people expected?
14:11:35 <johnthetubaguy> it's the same as juno, as far as I remember
14:11:37 <ndipanov> yes mostly
14:11:40 <sdague> johnthetubaguy: yep
14:11:48 <johnthetubaguy> cools
14:12:04 <johnthetubaguy> #info targeting to kilo-rc1 means please block the kilo release on my bug
14:12:12 <johnthetubaguy> let's talk bugs
14:12:25 <johnthetubaguy> there was a cells one?
14:13:06 <bauzas> yup
14:13:11 <bauzas> so, just a quick status
14:13:39 <bauzas> chain starting with https://review.openstack.org/#/c/168294/3 is marked as RC1 and aims to reduce the failures down to 3
14:13:57 <sdague> bauzas: and the last 3 are going to be whitelisted out right?
14:14:05 <bauzas> and https://review.openstack.org/#/c/166396/ is whitelisting those 3
14:14:22 <bauzas> with a Depends-On tag on the last patch from the series
14:14:59 <sdague> yeh, I'm not sure why we do depends on there
14:15:02 <bauzas> all the patches but one are good to review, I should provide a last update for https://review.openstack.org/#/c/169400/3 within the next 2 hours
14:15:23 <johnthetubaguy> sdague: if we did depends the other way around we should see the cells test pass I guess?
14:15:29 <bauzas> sdague: because it will remove from the whitelist the failures that are fixed by the series
14:15:42 <sdague> johnthetubaguy: no we wouldn't
14:16:02 <johnthetubaguy> OK, I guess we can't depend on that repo
14:16:02 <sdague> bauzas: we can do it offline, but it's not needed here, I'm going to delete it
14:16:10 <johnthetubaguy> OK, cool
14:16:14 <bauzas> sdague: okay, let's discuss offline, sure
14:16:30 <johnthetubaguy> any more on those? they look good but not quite blockers I guess
14:16:54 <bauzas> johnthetubaguy: we actually would like to see the cells job green by Kilo hence the RC1 tag
14:17:13 <bauzas> if not, it would require backports
14:17:28 <johnthetubaguy> bauzas: OK
14:17:37 <sdague> bauzas: if so, you need to respin patches faster when alaski -1s them :)
14:17:52 <bauzas> sdague: yey, I know...
14:17:53 <johnthetubaguy> any more for this before we go onto the open discussion bits?
14:18:11 <edleafe> bauzas: I can help if needed
14:18:23 <sdague> honestly, I don't think the cells patches should hold up rc, because they are iterating too slowly
14:18:53 <alaski> if it comes to it I agree, but I would like to see them in
14:18:57 <johnthetubaguy> sdague: I am with you, we can remove those when we get closer
14:18:58 <bauzas> sdague: eh, it was -1'd yesterday evening my time
14:18:58 <alaski> I can help iterate too
14:19:05 <dansmith> not holding up rc makes sense
14:19:12 <dansmith> continuing to merge until the line does as well, IMHO
14:19:24 <sdague> dansmith: ++
14:19:37 <dansmith> they *have* made amazing progress in a short time,
14:19:44 <johnthetubaguy> +1
14:19:45 <dansmith> progress that we've been looking for for years :)
14:20:05 <johnthetubaguy> …. to be clear
14:20:24 <johnthetubaguy> we can merge any non-violating bug fixes until we try to cut RC1
14:20:50 <johnthetubaguy> so no need to target a bug to be able to merge it right now
14:20:58 <johnthetubaguy> we are trusting all cores to do the right thing here
14:21:01 <bauzas> okay, lemme try to fix the -1 in that hour, that's doable
14:21:01 <johnthetubaguy> anyways, moving on
14:21:13 <johnthetubaguy> #topic open discussion
14:21:32 <johnthetubaguy> so skipping stuck reviews
14:21:36 <johnthetubaguy> as there are none in the agenda
14:21:41 <johnthetubaguy> let's go through the agenda here
14:22:04 <johnthetubaguy> #link http://lists.openstack.org/pipermail/openstack-dev/2015-April/060360.html
14:22:10 <johnthetubaguy> systemz fun
14:22:11 <mriedem> markus_z: ^?
14:22:22 <markus_z> yepp, that's me
14:22:24 <dansmith> aren't we using fakelibvirt in the tests?
14:22:27 <mriedem> we should be
14:22:36 <dansmith> are they not running them in tox or something?
14:22:45 <sdague> dansmith: yeh, I thought so, it might be falling back through under some cases?
14:22:53 <bauzas> johnthetubaguy: I just added an item for the Bugs section in the agenda just before the meeting
14:22:56 <bauzas> (my bad)
14:22:58 <mriedem> are we requiring zkvm CI to run unit tests?
14:22:59 <dansmith> I didn't think it ... would.
14:23:00 <mriedem> i thought it was just tempest?
14:23:12 <dansmith> generally that's the case, yeah
14:23:16 <mriedem> we don't require other virt drivers to run unit tests
14:23:21 <mriedem> *CI i mean
14:23:27 <dansmith> right
14:23:42 <dansmith> saying the unit tests pass on platform X (or Z heh) doesn't mean much, IMHO
14:23:46 <dansmith> just a waste of resources I think
14:23:49 <mriedem> yeah
14:23:56 <mriedem> markus_z: so don't run unit tests on z :)
14:23:59 <mriedem> focus on tempest
14:24:00 <johnthetubaguy> dansmith: +1
14:24:01 <dansmith> problem solved!
14:24:03 <markus_z> Thanks, that was easy :)
14:24:05 <dansmith> heh
14:24:06 <johnthetubaguy> cools
14:24:25 <johnthetubaguy> so, hypervisor support matrix
14:24:30 <mriedem> also z
14:24:38 <markus_z> That's also from my side.
14:24:48 <mriedem> markus_z: for the support matrix, i think you just need to get a patch up and then it'll be iterated in review
14:24:55 <johnthetubaguy> right
14:25:02 <markus_z> Is the CI a precondition?
14:25:05 <mriedem> if something is partial, i think there is a place to add notes
14:25:07 <mriedem> no
14:25:13 <mriedem> well, i assume not
14:25:17 <mriedem> zkvm blueprint is merged and in tree
14:25:19 <markus_z> Some items are unclear to me, what do I do with them?
14:25:22 <johnthetubaguy> markus_z: CI is more about A vs C category
14:25:32 <markus_z> johnthetubaguy: Ah, ok
14:25:41 <mriedem> markus_z: maybe put TBD in the review and then -W it
14:25:42 <dansmith> yeah
14:25:47 <mriedem> then iterate in review
14:25:55 <mriedem> TBD in the table cell i mean for the item in question
14:25:56 <johnthetubaguy> yeah, lets get the known ticks and known crosses
14:25:58 <jichen> johnthetubaguy: does that mean any hypervisor can submit it without CI? code in stackforge is also ok?
14:26:04 <mriedem> jichen: no
14:26:14 <mriedem> powervc and zvm in stackforge don't apply here, or nova-docker :)
14:26:19 <mriedem> zkvm is in tree in the libvirt driver
14:26:20 <dansmith> right
14:26:22 <johnthetubaguy> jichen: we are talking about in three drivers here
14:26:29 <jichen> mriedem: ok, got it
14:26:30 <johnthetubaguy> s/three/tree/
14:26:40 <jichen> johnthetubaguy: ok, thanks
14:26:41 <mriedem> jichen: you should probably have your own support matrix for zvm in the stackforge repo though
14:26:51 <johnthetubaguy> markus_z: have we unblocked you now?
14:27:05 <johnthetubaguy> markus_z: I guess just get the review up, and we can talk more in gerrit
14:27:06 <markus_z> Not yet, sorry.
14:27:26 <johnthetubaguy> np, what's the next question?
14:27:28 <markus_z> Some of the items themselves are unclear. Even in the description there is "something something, dark side".
14:27:34 <markus_z> That's what I meant
14:27:52 <dansmith> markus_z: let's talk about them off the meeting
14:28:00 <dansmith> assuming you're talking about the matrix...
14:28:01 <johnthetubaguy> markus_z: right, let's talk about that offline
14:28:06 <markus_z> dansmith: yes I do
14:28:12 <dansmith> cool
14:28:13 <johnthetubaguy> just catch us in #openstack-nova as normal
14:28:16 <johnthetubaguy> cools
14:28:19 <markus_z> OK, I'll push it next week and we can talk in review
14:28:26 <johnthetubaguy> so a message from mikal…
14:28:26 <mriedem> no family guy jokes here!
14:28:37 <markus_z> okidoki
14:28:43 <johnthetubaguy> #link http://lists.openstack.org/pipermail/openstack-dev/2015-April/060360.html
14:28:52 <johnthetubaguy> arg
14:28:59 <johnthetubaguy> #link https://etherpad.openstack.org/p/liberty-nova-summit-ideas
14:29:03 <johnthetubaguy> that's what I meant
14:29:07 <johnthetubaguy> the flood gates are open
14:29:23 <mriedem> well, you can probably throw nova-net -> neutron migration on there
14:29:25 <mriedem> and evacuate
14:29:26 <johnthetubaguy> but I would expect to be requested to have a spec up for review, if you want to discuss your feature at the summit
14:30:05 <bauzas> johnthetubaguy: how many fishbowl sessions ?
14:30:09 <bauzas> johnthetubaguy: 2 days ?
14:30:11 <johnthetubaguy> I think the deadline for sessions will be announced post election
14:30:15 <johnthetubaguy> bauzas: I think so
14:30:26 <mriedem> fwiw, i hope everything is scheduled similar to paris
14:30:27 <cfriesen> mriedem: evacuate?
14:30:31 <johnthetubaguy> same as last time, is what I remember we agreed
14:30:37 <mriedem> with 2 back to back sessions for hairy issues
14:30:42 <johnthetubaguy> right
14:30:44 <dansmith> cfriesen: yeah, as in "effing fix it"
14:30:45 <bauzas> ok, then the contrib meetup, fair
14:31:04 <mriedem> and a thing thurs afternoon to decide release priorities
14:31:06 <sdague> yeh, that worked well last time
14:31:12 <bauzas> agreed
14:31:15 <mriedem> then the orgy on friday
14:31:16 <mriedem> :)
14:31:32 * dansmith bites his tongue
14:31:43 <johnthetubaguy> yes...
14:31:52 <mriedem> i'll throw the evacuate wip spec in here
14:31:52 <bauzas> so the etherpad is the source for the sessions ? no longer the website ?
14:31:55 <mriedem> since it was controversial
14:31:57 <johnthetubaguy> bauzas: you had something to discuss?
14:32:04 <johnthetubaguy> bauzas: I believe so, just like paris
14:32:07 <bauzas> johnthetubaguy: ok
14:32:13 <dansmith> mriedem: I don't think the spec is controversial
14:32:20 <bauzas> johnthetubaguy: yeah, just re: http://lists.openstack.org/pipermail/openstack-dev/2015-April/060448.html
14:32:20 <mriedem> dansmith: not the spec
14:32:24 <dansmith> mriedem: the "do we do anything in the meantime" was the controversial bit
14:32:25 <dansmith> okay
14:32:36 <mriedem> the thing that led to the spec
14:32:49 <dansmith> well, I think that's sailed now
14:33:22 <mriedem> in a sea of f bombs
14:33:22 <mriedem> yes
14:33:29 <dansmith> heh
14:33:40 <johnthetubaguy> we could add a log message saying not to do it
14:33:43 * johnthetubaguy ducks
14:34:06 <dansmith> not to do evacuate?
14:34:07 <johnthetubaguy> bauzas: so you had something about that thread?
14:34:26 <sdague> seems like path forward on quotas might be useful as well
14:34:28 <bauzas> johnthetubaguy: I was just thinking we should maybe discuss with the bug monkeys to get help checking it
14:34:33 <dansmith> LOG.warning('Oh jeez, you just effed up your system')
14:34:34 <sdague> because that's a ball of mud
14:34:34 <mriedem> sdague: quotas is in the etherpad
14:34:37 <johnthetubaguy> dansmith: i was kinda joking, but yeah
14:34:38 <dansmith> sdague: yeah
14:34:49 <bauzas> I looked and Launchpad is buggy
14:34:50 <cfriesen> bauzas: I liked James Bottomley's suggestion of having it automatically change state when more info provided
14:34:58 <sdague> mriedem: oh I missed it
14:35:09 <bauzas> cfriesen: Launchpad doesn't do this
14:35:16 <johnthetubaguy> sdague: +1 on fixing quotas, using compare and swap in the DB was the biggest lead I heard, but yeah
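For readers unfamiliar with the "compare and swap in the DB" idea johnthetubaguy mentions, a minimal sketch follows. It is only an illustration of the general technique, not Nova's quota code: the `quota_usage` table, its columns, and the `reserve` function are hypothetical, SQLite is used purely so the example is self-contained, and a row for the project is assumed to exist.

```python
import sqlite3


def reserve(conn, project, amount, limit, max_retries=5):
    """Try to add `amount` to the project's usage without exceeding `limit`.

    Returns True on success, False if the quota would be exceeded.
    """
    for _ in range(max_retries):
        (current,) = conn.execute(
            "SELECT in_use FROM quota_usage WHERE project = ?", (project,)
        ).fetchone()
        if current + amount > limit:
            return False
        # Compare and swap: the UPDATE only matches if in_use is unchanged
        # since we read it, so a concurrent writer makes rowcount 0.
        cursor = conn.execute(
            "UPDATE quota_usage SET in_use = ? "
            "WHERE project = ? AND in_use = ?",
            (current + amount, project, current),
        )
        conn.commit()
        if cursor.rowcount == 1:
            return True
    raise RuntimeError("gave up after repeated concurrent updates")
```

The point of the pattern is that no row lock is held between the read and the write; a lost race simply shows up as zero updated rows and the caller re-reads and tries again.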
14:35:30 <sdague> cfriesen: that assumes all kinds of things about launchpad without knowing any of the launchpad limitations
14:35:44 <bauzas> cfriesen: and it's buggy because if you check 'Incomplete with response', it will give you bugs that have replies, but not only from the owner
14:35:46 <sdague> it was exceptionally unproductive response honestly
14:35:48 <bauzas> s/owner/reporter
14:36:18 <bauzas> http://goo.gl/YPMUf3 is the list of incomplete bugs having replies
14:36:22 <bauzas> 49 IIRC
14:36:40 <bauzas> and most of them have replies, but not from the reporter, just as follow-up messages
14:37:02 <bauzas> so we would need to iterate over them
14:37:16 <bauzas> as said, it needs to be done manually until someone lazy enough automates it
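The check bauzas describes doing by hand, whether an Incomplete bug has a reply from its reporter rather than just from anyone, can be sketched as below. The bugs are plain dictionaries with invented sample values, not launchpadlib objects, and the field names (`reporter`, `marked_incomplete`, `comments`) are assumptions made for the example.

```python
from datetime import date


def reporter_has_replied(bug):
    """True if the reporter commented after the bug was marked Incomplete."""
    return any(
        author == bug["reporter"] and when > bug["marked_incomplete"]
        for author, when in bug["comments"]
    )


# Made-up sample data: bug 1 only has a follow-up from someone else,
# bug 2 has an answer from its reporter.
bugs = [
    {"id": 1, "reporter": "alice", "marked_incomplete": date(2015, 3, 1),
     "comments": [("bob", date(2015, 3, 5))]},
    {"id": 2, "reporter": "carol", "marked_incomplete": date(2015, 3, 1),
     "comments": [("carol", date(2015, 3, 9))]},
]

needs_triage = [bug["id"] for bug in bugs if reporter_has_replied(bug)]
```

Only the bugs in `needs_triage` would need a human to look again; the rest stay Incomplete until they expire.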
14:37:38 <bauzas> I was thinking that the trivial bug monkeys were a good gang for helping with that
14:37:52 <sdague> yeh, I did go through a couple hundred incomplete bugs yesterday, closed a lot of them, it's just a lot of manual work
14:38:16 <bauzas> sdague: I tried during lunch to play with launchpadlib, that's doable but risky
14:38:35 <sdague> bauzas: I don't trust launchpadlib
14:38:42 <bauzas> sdague: agreed
14:38:48 <sdague> given how often launchpad times out rest calls
14:38:49 <bauzas> sdague: as I said, the report is wrong
14:38:52 <sdague> yep
14:39:02 <kashyap> Oh, good, /me was about to play with launchpadlib
14:39:18 <bauzas> kashyap: that's fine but that's really buggy
14:39:23 <kashyap> Also, auto-closing some bugs can close off valid bugs as ttx noted :-(
14:39:43 <bauzas> kashyap: ^ hence my point, we need to do this by hand :(
14:39:49 <kashyap> But given the volume, I think this auto-expiration needs to be in place, I guess
14:39:58 <garyk> manually is a valid option.
14:40:11 <sdague> garyk: only if I'm not the only one doing it :P
14:40:11 <johnthetubaguy> the rub is, we need people to do this work
14:40:19 <bauzas> johnthetubaguy: exactly
14:40:21 <johnthetubaguy> sdague: +1
14:40:23 <sdague> so auto expire is back on
14:40:24 <kashyap> johnthetubaguy: Yes, and it's utterly thankless
14:40:24 <garyk> sdague: you should certainly not be the one doing it.
14:40:33 <ttx> If you end up expiring manually all bugs after a given abandon time, you should just autoexpire and be done with them :)
14:40:37 <johnthetubaguy> kashyap: agreed
14:40:37 <garyk> it should be the onus of people who are responsible for tagged bugs
14:40:46 <bauzas> sdague: that's why I think we should call the bug monkeys to look at them
14:40:48 <sdague> garyk: I clearly am, there were 200+ incomplete bugs when I looked yesterday
14:40:55 <ttx> My point on the ML was that the tool is not really helping us have a conversation with the reporter anyway
14:41:17 <kashyap> Also, hard as it might be to digest, sometimes some bugs will take a year or so to get to a proper resolution.
14:41:22 <johnthetubaguy> sdague: so I think auto abandon on is the right choice
14:41:28 <kashyap> I've seen such bugs in other community projects I participate in.
14:41:30 <sdague> johnthetubaguy: agreed
14:41:34 <garyk> sdague: i know. but you should not be responsible for doing this by yourself.
14:41:54 <garyk> it is our responsibility so we should get our act together
14:41:56 <bauzas> ttx: that's a tool issue, we should have a bug system that should put back the ticket in another state if the reporter is replying
14:42:19 <bauzas> at least if he's ticking that he answers the question
14:42:46 <johnthetubaguy> #help we need more folks to help with bug triage
14:42:56 <johnthetubaguy> OK, I don't see lots of us stepping up to help here
14:43:01 <johnthetubaguy> that's what we need
14:43:05 <anteaya> bauzas: I don't think you are helping support getting someone to do the work by using the phrase bug monkeys
14:43:12 <johnthetubaguy> maybe need some kind of bounty
14:43:25 <bauzas> anteaya: just because it worked for trivial bugs
14:43:29 <johnthetubaguy> but I think this conversation has run its course a bit...
14:43:38 <sdague> johnthetubaguy: agreed, move on
14:43:43 <johnthetubaguy> anything more before we move back over to #openstack-nova ?
14:43:48 <bauzas> anteaya: https://etherpad.openstack.org/p/kilo-nova-priorities-tracking L148 and below
14:43:51 <mriedem> end it!
14:43:58 <johnthetubaguy> mriedem: agreed
14:43:59 <bauzas> fair enough
14:44:01 <johnthetubaguy> thanks all
14:44:19 <johnthetubaguy> happy bug triage and fixing
14:44:23 <johnthetubaguy> #endmeeting