#openstack-meeting log

21:00:22 <mriedem> #startmeeting nova
21:00:23 <openstack> Meeting started Thu Sep  1 21:00:22 2016 UTC and is due to finish in 60 minutes.  The chair is mriedem. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:26 <openstack> The meeting name has been set to 'nova'
21:00:31 <alaski> o/
21:00:35 <takashin> o/
21:00:37 <bauzas> \o
21:00:37 <raj_singh> o/
21:00:39 <melwitt> o/
21:00:39 <dansmith> o-
21:00:41 <Gibi> o/
21:00:48 <mriedem> #link agenda https://wiki.openstack.org/wiki/Meetings/Nova
21:00:55 <auggy> o/
21:01:04 <mriedem> let's make this a fast one since we are watching some changes go through the gate for n-3
21:01:08 <mriedem> #topic release status
21:01:22 <mriedem> n-3 and feature freeze is today
21:01:44 <mriedem> we are watching for 3 things to get in before FF and n-3
21:01:46 <mriedem> 1. https://review.openstack.org/#/c/364498/
21:01:50 <mriedem> for placement allocations
21:02:05 <mriedem> 2. https://review.openstack.org/#/c/356138/ for listing instances with cells v2
21:02:17 <alaski> that merged
21:02:22 <mriedem> 3. https://review.openstack.org/#/c/326906 for cells mq switching
21:02:23 <mriedem> sweet
21:02:45 * edleafe wanders in
21:02:55 <mriedem> once those are merged we'll tag n-3
21:03:17 <mriedem> novaclient 6.0.0 was released this week and is in upper-constraints
21:03:23 <mriedem> #info Sep 12-16: PTL self-nominations open for Ocata
21:03:27 <mriedem> fyi on that ^
21:03:34 <mriedem> #info Sep 15: RC1
21:03:40 <mriedem> so we have 2 weeks to rc1
21:03:47 <mriedem> be on the lookout for critical bugs
21:04:02 <mriedem> questions?
21:04:23 <mriedem> i'm going to skip bugs and gate status, nothing critical to report there
21:04:37 <mriedem> #topic summit planning
21:04:40 <tonyb> mriedem: can I patch nova to work better with os-brick 1.6.X?
21:04:46 <tonyb> or is that better for ocata?
21:05:04 <mriedem> tonyb: that's a bug fix, so ok
21:05:10 <tonyb> \o/
21:05:11 <mriedem> although i already put up a patch for that today
21:05:25 <tonyb> mriedem: okay, the problem with having stale context :(
21:05:35 <mriedem> https://review.openstack.org/#/c/364454/
21:05:46 <mriedem> there might be other issues, but let's just bug fix those afterward
21:05:57 <mriedem> so back to summit planning
21:05:58 <mriedem> Etherpad for topics: https://etherpad.openstack.org/p/ocata-nova-summit-ideas
21:06:06 <mriedem> We'll have 13 fishbowl sessions (all of Thurs and Fri morning), and the meetup style on Friday afternoon. Compared to Austin where we had 18 FB sessions and a full Friday meetup.
21:06:28 <mriedem> we probably had a few too many FB sessions in austin so i think this is ok
21:06:38 <mriedem> #link Ocata is a shorter cycle with a tentative schedule here: https://review.openstack.org/#/c/357214/
21:06:57 <mriedem> Because of this we'll probably limit to one unconference session and focus the majority of the release on priority items carried over from Newton, like cells v2, placement API, and libvirt storage pools.
21:07:28 <mriedem> i'm thinking planning meetings for sessions after rc1
21:07:39 <mriedem> questions about the summit?
21:07:55 <mriedem> #topic open discussion
21:08:04 <mriedem> (xavvior) Resume Guests State automatically rebuild on error
21:08:15 <mriedem> BUG: https://bugs.launchpad.net/nova/+bug/1585494
21:08:15 <openstack> Launchpad bug 1585494 in OpenStack Compute (nova) "Rebuild instance, if the physical data is missing (at start)" [Undecided,In progress] - Assigned to Alex Szarka (xavvior)
21:08:48 <xavvior> Hi all, about this patch:
21:08:56 <xavvior> This patch automatically rebuilds the instance if we get any error during resume. At rebuild we restore all attributes of the instance from the database (e.g. IP addresses, networks).
21:09:05 <xavvior> The basic idea: we automatically rebuild the instance if its physical files are missing from the disk. For example, after a compute blade replacement (due to hardware failure), nova can't recover or resume the instance.
21:09:06 <alaski> this is what evacuate is intended for right?
21:09:19 <xavvior> Also, we kept the original restart behaviour and extended it: we introduce a new config flag that turns the fully automated rebuild on or off.
21:09:27 <xavvior> If this flag is False (the default value), then behaviour stays the same as before: the instance state is set to error and we can do nothing with the instance. If the flag is True (the system operator can set this), then we automatically try to rebuild every instance in error state. If that also fails, the instance state is set to error.
21:09:34 <xavvior> Normally we don't store persistent data on the instance, so I think this is not dangerous; but that is not enforced, so we leave this decision to the system administrator.
21:09:38 <xavvior> The restart is already automated in nova, and we'd like to automate the rebuild in the same way. Also, automated restarting can cause damage too (if we try to reboot from a corrupt disk, that can cause more corruption -> data loss)
21:10:15 <bauzas> isn't that a feature rather?
21:10:20 <mriedem> xavvior: so i think this is at least a blueprint, probably needs a spec discussion
21:10:24 <mriedem> i definitely don't think this is a bug fix
21:10:27 <alaski> yeah, this is more a feature than bug
21:10:44 <mriedem> a spec would be a good place to explain the use case and why this is different from evacuate
21:10:48 <mriedem> or evacuate doesn't work for this
21:10:55 <alaski> and it seems like evacuate covers this, though it's not automated
21:10:58 <mriedem> xavvior: so i think you should work on a blueprint and start a spec for ocata
21:11:12 <bauzas> xavvior: http://docs.openstack.org/developer/nova/process.html#how-do-i-get-my-code-merged can help understanding the process
21:11:48 <gibi> is it still a feature if the resume fails only because the instance is not defined in libvirt but the local disk exists (as it was on shared storage)?
21:12:58 <mriedem> "If physical data is missing from the disk (e.g.: compute blade replacement due to hardware failure), nova can't recover instances. But we should try to rebuild instances with the original data (which are stored in the database)."
21:13:05 <mriedem> that reads to me as the scenario for using evacuate
21:13:21 <mriedem> unplanned outage and you have to resurrect an instance from a failed compute
21:13:22 <bauzas> gibi: if the operator tramples the disks, then I don't think Nova should help him
21:13:24 <alaski> if somehow Nova is causing the instance to be incorrectly removed from libvirt that's a bug. anything outside of that needs discussion
21:13:40 <bauzas> yeah that
21:14:07 <raj_singh> mriedem: Will you remove your -2's from centralize-config-option patches next week?
21:14:18 <mriedem> raj_singh: maybe
21:14:33 <gibi> alaski: in this case it is not nova that removes the instance from libvirt, the hw replacement removes it
21:14:35 <mriedem> with 2 weeks to rc1 i don't want a ton of distraction really
21:14:48 <raj_singh> mriedem: Ok
21:15:01 <alaski> gibi: I wouldn't consider that a bug. If Nova has to deal with external changes like that it's a feature
21:15:16 <bauzas> +1
21:15:22 <gibi> alaski: OK. Let's spec this up
21:15:25 <mriedem> but my pet!
21:15:29 <alaski> gibi: sounds good, thanks!
21:15:34 <mriedem> xavvior: did you follow that?
21:15:40 <gibi> mriedem: yeah, this is about pets
21:15:44 <mriedem> if you have questions just hit us up in #openstack-nova in irc
21:15:48 <xavvior> yes, i follow the talk
21:15:51 <mriedem> preferably after FF
21:15:56 * dansmith is losing focus quick
21:16:09 <mriedem> oh shit we're losing him
21:16:13 * mriedem grabs paddles
21:16:18 <mriedem> anything else?
21:16:30 <dansmith> hah
21:16:32 <mriedem> ok then let's wrap up newton
21:16:33 <mriedem> CLEAR!
21:16:36 <mriedem> #endmeeting