04:00:05 <samP> #startmeeting masakari
04:00:06 <openstack> Meeting started Tue Jun  4 04:00:05 2019 UTC and is due to finish in 60 minutes.  The chair is samP. Information about MeetBot at http://wiki.debian.org/MeetBot.
04:00:07 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
04:00:09 <openstack> The meeting name has been set to 'masakari'
04:00:15 <samP> Hi all for Masakari
04:01:43 <samP> #topic Critical bugs and patches
04:02:08 <samP> Please share any items you need to discuss here.
04:02:42 <tashiromt> Hi
04:02:47 <samP> tashiromt: Hi
04:03:50 <samP> Seems like we do not have any items to discuss today.
04:04:29 <tpatil> Sorry to join late
04:04:40 <samP> tpatil: no problem
04:05:01 <samP> We are now on the topic of Critical bugs and patches
04:05:16 <samP> Any issues to share?
04:05:53 <tpatil> samP: No critical bugs to be discussed from my side
04:05:59 <samP> tpatil: thanks
04:06:08 <samP> #topic Train work items
04:06:24 <samP> Please share if you have any update on Train work items
04:07:03 <tpatil> 1) Add devstack support to install host-monitors
04:07:18 <tpatil> We are working on understanding how to configure a STONITH resource
04:07:49 <tpatil> I think in host-monitor, it's possible to use STONITH_TYPE ssh and external/ipmi
04:08:05 <tpatil> Any idea how ssh works?
04:09:39 <tpatil> samP: Also, in our dev env we don't have support for IPMI, so we couldn't make any progress. I would like to discuss the dev env with you offline.
04:11:18 <samP> tpatil: Sure, I will make time to do that. And I think we have IPMI for bare metal nodes. I will share those details later
04:11:24 <tpatil> #link : https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/hostmonitor/hostmonitor.sh#L707
04:11:48 <tpatil> The above link is for the ssh STONITH_TYPE
04:13:03 <tpatil> I have one question about STONITH_TYPE: ssh. If one compute node is down, how is it possible to interact with that node using SSH from another compute node?
04:13:04 <samP> tpatil: Thanks for the link. Before using that STONITH method, we need to configure the pacemaker cluster with ssh settings
04:13:41 <samP> tpatil: you can't. STONITH is used to kill a node.
04:15:17 <tpatil> samP: Kill a node proactively or after pacemaker detects an issue in that node?
04:16:06 <samP> tpatil: For example, if the operator (or Pacemaker) finds something wrong with a node, then he (or it) tries to kill that node and isolate it from the other nodes
04:16:25 <samP> tpatil:  correct.
04:18:32 <tpatil> samP: The job of the STONITH resource device is to kill that node completely, correct?
04:19:10 <samP> tpatil: correct
04:19:34 <tpatil> samP: in that sense, my main question is how STONITH_TYPE ssh works, because it's possible that the failed node may fail to respond to ssh
04:21:04 <tpatil> samP: We can discuss the supported STONITH_TYPEs in host-monitors and how they work separately, as I'm completely new to this area
04:21:24 <samP> tpatil: yes, there is a possibility. Normally we configure several STONITH_TYPEs, such as IPMI, ssh, etc.
04:22:01 <samP> if ssh fails, then it will try to use IPMI or any other configured method
04:22:24 <tpatil> samP: Ok
04:22:35 <samP> each STONITH method has its own advantages and disadvantages
04:23:25 <samP> with ssh, you can execute commands, such as collecting the logs or waiting for a core dump.
04:23:37 <samP> on the other hand, it will take time.
04:24:22 <samP> with IPMI, you can kill a node very quickly. However, you may lose the core dump on servers with large RAM
04:25:28 <samP> tpatil: I will write a detailed doc for this and make time to explain it.
04:25:42 <tpatil> samP: That will help, Thank you.
04:25:54 <samP> I will contact you offline for this.
04:26:10 <tpatil> samP: Ok, Thank you
04:26:11 <tpatil> 2) Real time rendering of recovery workflow in masakari-dashboard
04:26:58 <tpatil> as explained in the last weekly meeting, the fasteners commit will be reverted: https://github.com/harlowja/fasteners/issues/36
04:28:04 <tpatil> So it's a blocker for implementing real time rendering, as we need to add more progress details for the evacuation of instances during execution of the workflow
04:28:57 <tpatil> We haven't yet tried the workaround of monkey patching the method from fasteners to make this work in masakari
04:31:01 <tpatil> 3) Improve documentation
04:31:02 <samP> tpatil: Thanks for update.
04:31:33 <tpatil> Unfortunately, not yet started. Will take up this task sometime this week
04:31:49 <samP> tpatil: no problem
04:32:14 <tpatil> That's all the updates on Train items from my end
04:32:31 <samP> tpatil: thanks
04:33:13 <samP> I have no update from my side.
04:33:45 <samP> I will make time to discuss pacemaker and STONITH settings. I will let you know the details
04:33:52 <samP> #topic AOB
04:34:06 <samP> Any other issues to discuss?
04:34:35 <tpatil> sam: OK
04:34:44 <tpatil> samP: OK
04:35:10 <tpatil> samP: No other topics for discussion from my side
04:35:19 <samP> tpatil: thanks
04:35:52 <samP> OK then, let's finish today's meeting here
04:35:59 <samP> Thank you all!
04:36:17 <tpatil> samP: Ok, Thanks
04:36:23 <samP> please use the ML or the #openstack-masakari IRC channel on freenode for further discussions.
04:36:33 <samP> Thank you all again..
04:36:37 <samP> #endmeeting