06:06:14 <yoctozepto> #startmeeting masakari
06:06:15 <opendevmeet> Meeting started Tue Jun  1 06:06:14 2021 UTC and is due to finish in 60 minutes.  The chair is yoctozepto. Information about MeetBot at http://wiki.debian.org/MeetBot.
06:06:16 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
06:06:18 <opendevmeet> The meeting name has been set to 'masakari'
06:07:04 <yoctozepto> #topic Roll-call
06:07:29 <yoctozepto> bot broke
06:07:41 <yoctozepto> \o/
06:07:41 <jopdorp> \o/ \O/
06:07:46 <yoctozepto> hi jopdorp
06:07:50 <yoctozepto> hi suzhengwei
06:07:51 <suzhengwei> 0/
06:08:01 <yoctozepto> glad to see you in the new network
06:08:44 <jopdorp> oftc!
06:08:49 <yoctozepto> #topic Agenda
06:08:55 <yoctozepto> * Roll-call
06:08:55 <yoctozepto> * Agenda
06:08:55 <yoctozepto> * Announcements
06:08:55 <yoctozepto> * Review action items from the last meeting
06:08:55 <yoctozepto> * CI status
06:08:57 <yoctozepto> * Backports pending reviews
06:08:57 <yoctozepto> * Xena planning -> https://etherpad.opendev.org/p/masakari-xena-ptg
06:08:59 <yoctozepto> * Open discussion
06:09:24 <yoctozepto> #topic Announcements
06:09:57 <yoctozepto> it's kind of obvious but we moved IRC channels to the new network - OFTC - that we are currently on
06:10:19 <yoctozepto> this information probably makes more sense to people reading the saved logs externally, because you clearly know it already, being here ;-)
06:10:35 <yoctozepto> #topic Review action items from the last meeting
06:10:38 <yoctozepto> there were none
06:10:44 <yoctozepto> #topic CI status
06:10:58 <yoctozepto> I saw it green recently
06:11:03 <yoctozepto> #topic Backports pending reviews
06:11:08 <yoctozepto> none
06:11:15 <yoctozepto> #topic Xena planning -> https://etherpad.opendev.org/p/masakari-xena-ptg
06:11:22 <yoctozepto> ok, this is the interesting part
06:11:29 <yoctozepto> have we got more progress?
06:13:16 <jopdorp> not from my side
06:13:47 <yoctozepto> ok
06:15:51 <suzhengwei> update https://review.opendev.org/c/openstack/masakari/+/788382
06:15:59 <yoctozepto> I saw suzhengwei commented on my comment
06:16:02 <yoctozepto> oh, right, that one
06:16:08 <yoctozepto> I have not read it yet
06:17:10 <suzhengwei> no hurry
06:17:50 <yoctozepto> read it
06:18:05 <yoctozepto> but I'm not sure if you agree with me or disagree
06:18:32 <yoctozepto> was there any issue with the flow I presented in my comment?
06:19:27 <suzhengwei> In my opinion, there is no need to wait for the result when we disable or force-down the nova-compute service.
06:20:21 <suzhengwei> There is no asynchronous call inside nova.
06:21:06 <yoctozepto> well, if we check if it's up and it is, then we should try checking for it going down; if nova still sees the host, then we could say either abort or force it down and continue (this could be up to the user's choice)
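[editor's note] The flow yoctozepto describes above could be sketched roughly as follows. This is only an illustrative sketch, not Masakari code: `handle_host_failure`, `StillUpPolicy`, and the `is_host_up` / `force_down` callables are hypothetical names standing in for whatever checks and calls the real implementation would make against nova.

```python
import time
from enum import Enum
from typing import Callable


class StillUpPolicy(Enum):
    """Hypothetical user choice for a host that nova still reports as up."""
    ABORT = "abort"
    FORCE_DOWN = "force_down"


def handle_host_failure(
    is_host_up: Callable[[], bool],   # assumption: asks nova whether the host is up
    force_down: Callable[[], None],   # assumption: forces the compute service down
    policy: StillUpPolicy,
    retries: int = 3,
    interval: float = 0.0,
) -> str:
    """Return "continue" if evacuation may proceed, "abort" otherwise."""
    # Check the host, then keep checking for it going down.
    for attempt in range(retries + 1):
        if not is_host_up():
            return "continue"
        if attempt < retries and interval > 0:
            time.sleep(interval)
    # Nova still sees the host: abort, or force it down and continue.
    if policy is StillUpPolicy.FORCE_DOWN:
        force_down()
        return "continue"
    return "abort"
```

With `policy=StillUpPolicy.ABORT` the recovery stops when the host never goes down; with `FORCE_DOWN` the host is forced down once and the recovery continues.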
06:21:57 <suzhengwei> The return of calls to nova-api can show whether the status/state has changed successfully.
06:23:15 <suzhengwei> we check if it's up just because we are not sure it has been fenced.
06:25:50 <suzhengwei> whether nova-compute is disabled or not will not stop evacuation, but whether it is down or not matters.
06:26:14 <yoctozepto> suzhengwei: yes, we don't know if it's down precisely ;-)
06:26:17 <yoctozepto> also
06:26:32 <yoctozepto> if nova thinks it's up and it's actually down, then it will fail disabling the service as well
06:27:00 <yoctozepto> I think disabling the service is an extra such that the host does not come into play if it gets back on and needs operator's intervention
06:27:23 <yoctozepto> the order of actions is unfortunate though, I pointed you at the relevant bug reports
06:31:06 <suzhengwei> yep, we can disable the compute service first, but I don't think we need to wait. the return of the enable_disable_service call shows something.
06:31:45 <suzhengwei> If the call fails, it will raise an exception.
06:32:25 <yoctozepto> suzhengwei: but it waits
06:32:35 <yoctozepto> last time I checked, there was some serious delay
06:32:53 <yoctozepto> because the disable wanted to speak via mq to the nova-compute service
06:37:57 <suzhengwei> but these are synchronous calls inside nova. think about it: if nova-compute is already down, which one disables or forces down the compute service?
06:38:16 <suzhengwei> not nova-compute, it is all in nova-api.
06:38:50 <suzhengwei> if it is delayed, the call's response would be delayed too.
06:40:14 <yoctozepto> suzhengwei: yeah, it's synchronously waiting for an answer on mq
06:40:34 <yoctozepto> if nova-api thinks nova-compute is up
06:40:38 <yoctozepto> but it's actually not
06:40:46 <yoctozepto> then disabling it is going to time out
06:41:08 <yoctozepto> try it out locally
06:41:39 <yoctozepto> just without masakari, ensure nova-compute has just been confirmed to be up, then firewall that host away and try disabling the service in nova
06:41:50 <suzhengwei> I am very sure.
06:42:52 <suzhengwei> Even when the compute node is down, I can still disable it or force it down freely.
06:43:11 <yoctozepto> if nova knows it's down, then yes
06:43:26 <yoctozepto> if nova thinks it's up but it's not, timeout
06:43:36 <yoctozepto> unless they changed that in recent series
06:43:56 <yoctozepto> because that's what it was in ussuri for sure
06:46:49 <yoctozepto> I suggest you check and I will check as well
06:49:55 <suzhengwei> the code is in the nova project: the service_update_by_host_and_binary function in nova/compute/api.py
06:50:49 <suzhengwei> It just changes the db in the nova-api process.
06:53:35 <yoctozepto> suzhengwei: but it finally calls https://github.com/openstack/nova/blob/5cf06bf33d8f187d444f812177946e134e4c9932/nova/compute/api.py#L5863
06:53:49 <yoctozepto> which has the unfortunate self.rpcapi.set_host_enabled
06:54:09 <yoctozepto> and it usually just times out
06:57:36 <suzhengwei> oh, I see. It needs to synchronize with placement. But it saves the status first, then calls rpcapi.set_host_enabled.
06:59:04 <yoctozepto> suzhengwei: yeah, I would say it's an edge situation on nova's side
06:59:26 <yoctozepto> ok, the point is we are disabling the service as a service for the operator and not because we require it
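[editor's note] Since disabling the service is described here as a convenience for the operator rather than a requirement, one way to treat it is as a best-effort step whose failure (for example a timeout) is logged and does not stop the recovery. A minimal sketch under that assumption follows; `disable_best_effort` and the `disable_service` callable are hypothetical and do not correspond to an actual Masakari or nova function.

```python
import logging
from typing import Callable

LOG = logging.getLogger(__name__)


def disable_best_effort(disable_service: Callable[[], None]) -> bool:
    """Try to disable the compute service; never let a failure propagate.

    Returns True if the call succeeded, False if it raised (e.g. a timeout).
    """
    try:
        # Assumption: this callable wraps the request that disables the service.
        disable_service()
    except Exception as exc:
        # The disable is optional, so record the failure and carry on.
        LOG.warning("Could not disable compute service, continuing: %s", exc)
        return False
    return True
```

The caller can ignore the return value and proceed with the rest of the workflow, or surface it to the operator as a hint that manual intervention may be needed.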
06:59:41 <yoctozepto> and we have to end the meeting
07:00:00 <yoctozepto> thanks for the discussion, that was a fruitful meeting
07:00:07 <yoctozepto> see you next time
07:00:10 <yoctozepto> #endmeeting