04:00:25 <samP> #startmeeting masakari
04:00:26 <openstack> Meeting started Tue Jul 11 04:00:25 2017 UTC and is due to finish in 60 minutes.  The chair is samP. Information about MeetBot at http://wiki.debian.org/MeetBot.
04:00:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
04:00:29 <openstack> The meeting name has been set to 'masakari'
04:00:36 <samP> hi all o/
04:00:42 <Dinesh_Bhor> hi all
04:00:44 <abhishekk> o/
04:00:48 <rkmrHonjo> hi
04:00:57 <samP> let's start..
04:01:13 <samP> #topic Critical Bugs
04:01:20 <samP> any bugs to discuss?
04:01:51 <samP> BTW, review request is done..
04:02:23 <rkmrHonjo> Thanks!
04:02:50 <samP> If no bugs to discuss, then let's move to discussion points.
04:03:03 <abhishekk> samP: ok
04:03:16 <rkmrHonjo> ok.
04:03:18 <samP> #topic Discussion points
04:03:41 <samP> 1. Make nova on_shared_storage configurable
04:04:34 <rkmrHonjo> We discussed this topic last week.
04:05:02 <samP> ah..
04:05:07 <Dinesh_Bhor> In the last meeting it was decided to make the "on_shared_storage" option configurable. I will submit a patch soon.
04:05:22 <samP> The agenda is not up to date then.. :)
04:05:34 <samP> Dinesh_Bhor: rkmrHonjo thanks..
04:06:09 <samP> sorry, I should have updated it..
04:06:36 <rkmrHonjo> Dinesh_Bhor: Do you want to discuss this now?
04:06:55 <Dinesh_Bhor> Just have one question related to this: in a real deployment, can there be a situation where half of the instances are on shared storage and half are not?
04:08:00 <tpatil> example: host aggregates, with some host aggregates on shared storage and others on non-shared storage
04:09:14 <samP> First, in order to use Masakari, the instance must be on shared storage.
04:09:56 <samP> if the instance is not on shared storage, evacuate = rebuild, and we cannot rescue it.
04:11:42 <tpatil> Then there is no point in making this option configurable
04:11:46 <samP> Can you pass the on_shared_storage option through the nova API?
04:11:57 <tpatil> Yes
04:12:06 <rkmrHonjo> samP: What do you think about users who use boot-from-volume (and non-shared storage)?
04:12:30 <tpatil> You can evacuate an instance if it's not on shared storage
04:13:09 <samP> rkmrHonjo: in that case, on_shared_storage does not make any difference.
04:13:46 <samP> rkmrHonjo: because the volume will be detached and re-attached after evacuate, right?
04:14:36 <tpatil> Dinesh: please give an update on our findings about the on_shared_storage parameter set to true and false
04:14:53 <rkmrHonjo> samP: I got it.
04:15:02 <Dinesh_Bhor> tpatil: yes
04:15:35 <Dinesh_Bhor> if the on_shared_storage option is True and the instance files are not actually on shared storage, then the evacuate call fails
04:15:43 <samP> I remember I fixed something related to this in the past..
04:15:48 <samP> #link https://review.openstack.org/#/c/320231/
04:16:32 <Dinesh_Bhor> whereas if the on_shared_storage option is False then there is no issue
04:18:59 <samP> Dinesh_Bhor: true
04:19:27 <samP> #link https://developer.openstack.org/api-ref/compute/?expanded=evacuate-server-evacuate-action-detail
04:19:34 <samP> Please see ^^
04:19:43 <samP> Starting since version 2.14, Nova automatically detects whether the server is on shared storage or not.
04:19:52 <samP> Therefore this parameter was removed.
04:20:31 <samP> Am I going in the wrong direction?
04:20:55 <rkmrHonjo> samP: Should we bump the nova API version to 2.14 or later in masakari? The current version is 2.9.
04:21:11 <rkmrHonjo> #link https://github.com/openstack/masakari/blob/master/masakari/compute/nova.py#L42
04:21:12 <tpatil> in which microversion was this parameter removed?
04:22:44 <Dinesh_Bhor> #link https://github.com/openstack/nova/blob/master/releasenotes/notes/remove-on-shared-storage-flag-from-evacuate-api-76a3d58616479fe9.yaml
04:23:03 <Dinesh_Bhor> it is removed in microversion 2.14
04:23:39 <tpatil> What if the instances are booted from images?
04:24:48 <samP> tpatil: glance images?
04:25:29 <Dinesh_Bhor> yes
04:26:06 <samP> sorry, I still can't see the problem..
04:27:26 <samP> for an image-booted instance, if the ephemeral disk is on shared storage, then it works in the normal way, right?
04:28:01 <tpatil_> samP: you said the on_shared_storage parameter is removed; what if the instances are booted from images? The data stored on the instances will be lost after evacuation, right?
04:28:56 <tpatil_> samP: if the instances path is not on shared storage
04:29:00 <samP> nova automatically detects whether it is in shared storage or not..
04:29:09 <samP> tpatil_: in that case, yes
04:29:31 <tpatil_> samP: If it's not on shared storage, does it fail to evacuate if the instance is booted from an image?
04:30:20 <samP> tpatil_: I have to check, but I think it will not fail. The instance will still be evacuated (= rebuilt)
04:30:44 <tpatil_> samP: ok, do we want to allow masakari to evacuate instances in such cases?
04:31:14 <Dinesh_Bhor> samP: I have checked this. It evacuates = rebuilds
04:31:21 <abhishekk> samP: i.e. if instance_path is not on shared storage
04:33:26 <samP> tpatil_: if the operator defines a non-shared-storage cluster in masakari, then masakari will evacuate those instances..
04:33:41 <samP> tpatil_: which the operator should not do..
04:34:44 <samP> if we can prevent that by setting the on_shared_storage option, then that would be good.
04:35:15 <tpatil_> samP: in future, we might need to use nova API version 2.14 or above, so making this option configurable is not a good solution
04:35:15 <samP> However, my point is we can not control it after API v2.14
04:35:57 <tpatil_> samP: so let's not make on_shared_storage option configurable
04:36:12 <samP> tpatil_: agree.
04:36:31 <rkmrHonjo> tpatil_: +1
04:36:57 <samP> do we have a bug report for this issue?
04:37:07 <tpatil_> samP: but the question remains: if instances are booted from an image, there will be data loss. I don't think that's acceptable to users
04:37:46 <tpatil_> samP: I think we can document that masakari should be used only if the instances path is on shared storage
04:38:18 <samP> tpatil_: understand.. but instances booted from image on non-shared storage... how do we rescue them when the compute node is no longer there..
04:39:41 <tpatil_> samP: I understand that we don't have any control, but I just wanted to point out from the user's perspective that there will be data loss
04:39:56 <samP> tpatil_: agree. we should put this in README.rst
04:40:07 <tpatil_> samP: sure
04:41:45 <samP> tpatil_: that is a good point.. all the presentations we did in the past, this was an unspoken agreement..
04:42:24 <rkmrHonjo> tpatil: Will you modify the masakari code? I think that we should bump the nova API version to 2.14 and remove the on_shared_storage parameter from nova.py.
04:43:08 <tpatil_> rkmrHonjo: Sure, we will submit a patch soon
04:43:19 <abhishekk> 2. Instance gets auto-confirmed (uses new flavor) if masakari evacuates an instance which was partially resized (resize-confirm is not performed)
04:43:22 <rkmrHonjo> tpatil: thanks.
04:43:23 <samP> tpatil_: thanks..
04:43:35 <abhishekk> samP: regarding the second point, I have sent a mail to the operators mailing list but haven't got any constructive feedback yet, #link http://lists.openstack.org/pipermail/openstack-operators/2017-July/013905.html
04:43:43 <samP> abhishekk: thanks..
04:44:22 <samP> I asked some ops to take a look at this..
04:44:28 <abhishekk> samP: I am also discussing the same on the operators IRC channel, but two people said they haven't hit a situation where they needed to evacuate a resized instance
04:44:34 <samP> currently only Saverio has replied ..
04:45:20 <samP> It is a very rare situation
04:45:57 <samP> But still critical to us..
04:46:52 <abhishekk> samP: #link http://eavesdrop.openstack.org/irclogs/%23openstack-operators/%23openstack-operators.2017-07-10.log.html#t2017-07-10T08:37:13
04:47:01 <samP> On the other hand, most people do not use evacuate ..
04:47:23 <Dinesh_Bhor> In the meantime, until nova makes changes to the evacuate API to address this issue,
04:47:24 <Dinesh_Bhor> on the masakari side I have updated the patch, which evacuates the resized instance and stops it after evacuation on the basis of the instance's power_state: https://review.openstack.org/#/c/469029/
04:48:18 <tpatil_> samP: IMO, we should fix this issue in masakari first and keep following up with nova community to address this issue in nova
04:48:26 <samP> Dinesh_Bhor: Thanks
04:48:44 <samP> tpatil_: agree.. fixing nova will take time
04:49:24 <tpatil_> samP: Ok, I will review Dinesh's patch taking this point into consideration
04:49:39 <samP> Let's review and merge this..
04:49:43 <samP> tpatil_: thanks..
04:49:48 <tpatil_> samP: Sure
04:50:16 <samP> 3. Remove ERROR instances from recovery targets when a host failure happens
04:50:28 <rkmrHonjo> I want to add a configurable option. Error instances will be removed from recovery targets if the option is set.
04:50:33 <rkmrHonjo> Because some users don't want to launch error instances after recovery.
04:50:40 <rkmrHonjo> Of course, there is a possibility that the following patch resolves this issue, but that will take time.
04:50:47 <rkmrHonjo> #link https://review.openstack.org/#/c/469029/
04:52:03 <samP> rkmrHonjo: do you want to remove only "ERROR" instances?
04:52:55 <rkmrHonjo> samP: Yes. But I think that there is another solution. Writing the rescuable statuses in masakari.conf.
04:53:17 <Dinesh_Bhor> With the above patch, error instances will be stopped after evacuation
04:54:05 <Dinesh_Bhor> In the master code error instances will be evacuated and the final state will be active
04:54:27 <samP> Dinesh_Bhor: rkmrHonjo's proposal is not to evacuate error instances, which is slightly different from stop after evacuate, right?
04:54:44 <Dinesh_Bhor> yes
04:55:03 <rkmrHonjo> samP: yes.
04:55:35 <samP> rkmrHonjo: "rescuable statuses" in config is much like Recovery method customization
04:55:40 <abhishekk> so even an instance which is in error state and marked as HA_Enabled True will be ignored in this case, right?
04:55:51 <samP> abhishekk: right
04:56:24 <abhishekk> samP: thanks
04:56:28 <rkmrHonjo> abhishekk: yes, that is my wish.
04:57:14 <abhishekk> 3 minutes left
04:57:16 <samP> I do not think it is a good idea to list all rescuable statuses in config when we have a spec for "Recovery method customization"
04:57:23 <samP> abhishekk: yep
04:57:36 <samP> Let's continue this discussion in ML
04:57:38 <abhishekk> I will fix the comment given by rkmrHonjo on the API specification patch
04:57:56 <rkmrHonjo> samP: OK. I'll send a mail.
04:58:03 <samP> I will send a mail with my thoughts..
04:58:07 <rkmrHonjo> abhishekk: thanks a lot! I'll check it.
04:58:14 <samP> ah...ok
04:58:20 <samP> rkmrHonjo: please..
04:58:24 <abhishekk> I have submitted updated specs for recovery method customization #link https://review.openstack.org/458023
04:58:31 <samP> abhishekk: thanks..
04:58:43 <abhishekk> in this spec I have mentioned which actions we need to add in mistral
04:58:49 <abhishekk> please have a look at it
04:58:58 <samP> abhishekk: great.. I will review this..
04:59:04 <samP> 1m  left
04:59:06 <abhishekk> samP: thank you
04:59:34 <samP> please offload to #openstack-masakari or ML with [masakari] for further discussions..
04:59:46 <samP> thank you all
04:59:51 <rkmrHonjo> thank you.
04:59:55 <samP> #endmeeting