06:06:14 <yoctozepto> #startmeeting masakari
06:06:15 <opendevmeet> Meeting started Tue Jun 1 06:06:14 2021 UTC and is due to finish in 60 minutes. The chair is yoctozepto. Information about MeetBot at http://wiki.debian.org/MeetBot.
06:06:16 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
06:06:18 <opendevmeet> The meeting name has been set to 'masakari'
06:07:04 <yoctozepto> #topic Roll-call
06:07:29 <yoctozepto> bot broke
06:07:41 <yoctozepto> \o/
06:07:41 <jopdorp> \o/ \O/
06:07:46 <yoctozepto> hi jopdorp
06:07:50 <yoctozepto> hi suzhengwei
06:07:51 <suzhengwei> 0/
06:08:01 <yoctozepto> glad to see you in the new network
06:08:44 <jopdorp> oftc!
06:08:49 <yoctozepto> #topic Agenda
06:08:55 <yoctozepto> * Roll-call
06:08:55 <yoctozepto> * Agenda
06:08:55 <yoctozepto> * Announcements
06:08:55 <yoctozepto> * Review action items from the last meeting
06:08:55 <yoctozepto> * CI status
06:08:57 <yoctozepto> * Backports pending reviews
06:08:57 <yoctozepto> * Xena planning -> https://etherpad.opendev.org/p/masakari-xena-ptg
06:08:59 <yoctozepto> * Open discussion
06:09:24 <yoctozepto> #topic Announcements
06:09:57 <yoctozepto> it's kind of obvious but we moved IRC channels to the new network - OFTC - that we are currently on
06:10:19 <yoctozepto> this information probably makes more sense in the saved logs read externally because you clearly know it being here ;-)
06:10:35 <yoctozepto> #topic Review action items from the last meeting
06:10:38 <yoctozepto> there were none
06:10:44 <yoctozepto> #topic CI status
06:10:58 <yoctozepto> I saw it green recently
06:11:03 <yoctozepto> #topic Backports pending reviews
06:11:08 <yoctozepto> none
06:11:15 <yoctozepto> #topic Xena planning -> https://etherpad.opendev.org/p/masakari-xena-ptg
06:11:22 <yoctozepto> ok, this is the interesting part
06:11:29 <yoctozepto> have we got more progress?
06:13:16 <jopdorp> not from my side
06:13:47 <yoctozepto> ok
06:15:51 <suzhengwei> update https://review.opendev.org/c/openstack/masakari/+/788382
06:15:59 <yoctozepto> I saw suzhengwei commented on my comment
06:16:02 <yoctozepto> oh, right, that one
06:16:08 <yoctozepto> I have not read it yet
06:17:10 <suzhengwei> no hurry
06:17:50 <yoctozepto> read it
06:18:05 <yoctozepto> but I'm not sure if you agree with me or disagree
06:18:32 <yoctozepto> was there any issue with the flow I presented in my comment?
06:19:27 <suzhengwei> My opinion, no need to wait for the result if it disable or force-down nova-compute service.
06:20:21 <suzhengwei> There is no asynchronous call in nova inside.
06:21:06 <yoctozepto> well, if we check if it's up and it is, then we should try checking for it going down; if nova still sees the host, then we could say either abort or force it down and continue (this could be up to the user's choice)
06:21:57 <suzhengwei> The return of calls to nova-api can show whether the status/state has changed successfully.
06:23:15 <suzhengwei> we check if it's up just because we are not sure it has been fenced.
06:25:50 <suzhengwei> the nova-compute disable or not will not stop evacuation. but down or not matters.
06:26:14 <yoctozepto> suzhengwei: yes, we don't know if it's down precisely ;-)
06:26:17 <yoctozepto> also
06:26:32 <yoctozepto> if nova thinks it's up and it's actually down, then it will fail disabling the service as well
06:27:00 <yoctozepto> I think disabling the service is an extra such that the host does not come into play if it gets back on and needs operator's intervention
06:27:23 <yoctozepto> the order of actions is unfortunate though, I pointed you at the relevant bug reports
06:31:06 <suzhengwei> yep, we can disable the compute firstly, but I don't think need to wait. the enable_disable_service call returns show something.
06:31:45 <suzhengwei> If the call failed, it will raise exception.
06:32:25 <yoctozepto> suzhengwei: but it waits
06:32:35 <yoctozepto> last time I checked, there was some serious delay
06:32:53 <yoctozepto> because the disable wanted to speak via mq to the nova-compute service
06:37:57 <suzhengwei> but it is synchronous calls in nova inside. think that if nova-compute already down, which one to disable or force-down the compute service.
06:38:16 <suzhengwei> not nova-compute, all in the nova-api.
06:38:50 <suzhengwei> if it dellay, the call would respond dellay too.
06:39:37 <suzhengwei> delay
06:40:14 <yoctozepto> suzhengwei: yeah, it's synchronously waiting for an answer on mq
06:40:34 <yoctozepto> if nova-api thinks nova-compute is up
06:40:38 <yoctozepto> but it's actually not
06:40:46 <yoctozepto> then disabling it is going to timeout
06:41:08 <yoctozepto> try it out locally
06:41:39 <yoctozepto> just without masakari, ensure nova-compute has just been confirmed to be up, then firewall that host away and try disabling the service in nova
06:41:50 <suzhengwei> I am very sure.
06:42:52 <suzhengwei> Even the compute node down. I can still disable or force-down it freely.
06:43:11 <yoctozepto> if nova knows it's down, then yes
06:43:26 <yoctozepto> if nova thinks it's up but it's not, timeout
06:43:36 <yoctozepto> unless they changed that in recent series
06:43:56 <yoctozepto> because that's what it was in ussuri for sure
06:46:49 <yoctozepto> I suggest you check and I will check as well
06:49:55 <suzhengwei> the code in nova project. service_update_by_host_and_binary function in nova/compute/api.py
06:50:49 <suzhengwei> It just changes db in the nova-api process.
06:53:35 <yoctozepto> suzhengwei: but it finally calls https://github.com/openstack/nova/blob/5cf06bf33d8f187d444f812177946e134e4c9932/nova/compute/api.py#L5863
06:53:49 <yoctozepto> which has the unfortunate self.rpcapi.set_host_enabled
06:54:09 <yoctozepto> and it usually just timeouts
06:57:36 <suzhengwei> oh, i see. It need synchronous with pacement. But it save status first, then rpcapi.set_host_enabled.
06:57:56 <suzhengwei> placement
06:59:04 <yoctozepto> suzhengwei: yeah, I would say it's an edge situation on nova's side
06:59:26 <yoctozepto> ok, the point is we are disabling the service as a service for the operator and not because we require it
06:59:41 <yoctozepto> and we have to end the meeting
07:00:00 <yoctozepto> thanks for the discussion, that was a fruitful meeting
07:00:07 <yoctozepto> see you next time
07:00:10 <yoctozepto> #endmeeting
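
Editor's note: the flow discussed under the Xena planning topic (check whether nova still sees the host as up, wait for it to go down, then either abort or force it down depending on the user's choice, and treat disabling the service as a best-effort extra that may time out) could be sketched roughly as below. This is only an illustration under stated assumptions, not masakari's or nova's actual code: ComputeServiceClient, HostStillUp, prepare_host_for_evacuation and all parameter names are hypothetical, and the ordering of the steps is one possible reading of the discussion.

```python
import time
from typing import Callable, List, Protocol


class ComputeServiceClient(Protocol):
    """Hypothetical interface; NOT the python-novaclient API."""

    def is_up(self, host: str) -> bool: ...
    def force_down(self, host: str) -> None: ...
    def disable(self, host: str) -> None: ...


class HostStillUp(Exception):
    """Raised when the host never went down and forcing was not allowed."""


def prepare_host_for_evacuation(
    client: ComputeServiceClient,
    host: str,
    *,
    wait_timeout: float = 30.0,
    poll_interval: float = 1.0,
    force_down_on_timeout: bool = False,  # assumed user-facing choice
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Return the list of steps taken, in order."""
    steps: List[str] = []
    deadline = clock() + wait_timeout

    # Poll until the compute service is reported down or the wait expires.
    while client.is_up(host):
        if clock() >= deadline:
            if not force_down_on_timeout:
                raise HostStillUp(f"{host} is still reported up")
            client.force_down(host)
            steps.append("forced_down")
            break
        sleep(poll_interval)
    else:
        # Loop ended without break: the service was seen down.
        steps.append("seen_down")

    # Disabling is a convenience for the operator, not a requirement,
    # so a timeout here is recorded and does not abort the flow.
    try:
        client.disable(host)
        steps.append("disabled")
    except TimeoutError:
        steps.append("disable_timed_out")
    return steps
```

The sleep and clock parameters exist only so the sketch can be exercised without real waiting; a real implementation would also need to decide how to report the timed-out disable to the operator.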