15:03:53 <bnemec> #startmeeting oslo
15:03:53 <bnemec> Courtesy ping for bnemec, smcginnis, moguimar, johnsom, stephenfin, bcafarel, kgiusti, jungleboyj
15:03:53 <bnemec> #link https://wiki.openstack.org/wiki/Meetings/Oslo#Agenda_for_Next_Meeting
15:03:54 <openstack> Meeting started Mon Aug 10 15:03:53 2020 UTC and is due to finish in 60 minutes. The chair is bnemec. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:55 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:58 <openstack> The meeting name has been set to 'oslo'
15:04:11 <johnsom> o/
15:07:43 <bnemec> #topic Red flags for/from liaisons
15:07:56 <johnsom> Nothing from Octavia
15:08:07 <bnemec> I was out last week so I have no idea what's going on. Hopefully someone else can fill us in. :-)
15:08:59 <moguimar> nothing from Barbican
15:09:08 <moguimar> we didn't have a meeting last week
15:09:39 <moguimar> now Hervé is also on PTO
15:10:14 <bnemec> Might be a quick meeting then.
15:10:15 <moguimar> and we need to come to a decision about kevko_ 's patch
15:10:24 <bnemec> Which is okay since I have a ton of emails to get through. :-)
15:10:35 <moguimar> this: https://review.opendev.org/#/c/742193/
15:11:56 <bnemec> I've added it to the agenda.
15:12:22 <bnemec> #topic Releases
15:12:37 <bnemec> I'll try to take care of these this week since Herve is out.
15:13:41 <bnemec> I guess that's all I have on this topic.
15:13:45 <bnemec> #topic Action items from last meeting
15:14:07 <bnemec> "kgiusti to retire devstack-plugin-zmq"
15:14:41 <kgiusti> in progress
15:14:51 <bnemec> Cool, thanks.
15:14:59 <bnemec> "hberaud to sync oslo-cookiecutter contributing template with main cookiecutter one"
15:15:08 <bnemec> Pretty sure I voted on this patch.
15:15:40 <bnemec> Yep.
15:15:41 <bnemec> #link https://review.opendev.org/#/c/743939/
15:15:50 <bnemec> It's blocked on CI.
15:16:22 <bnemec> Which is fixed by https://review.opendev.org/#/c/745304
15:16:49 <bnemec> So, all in progress, which is good.
15:17:09 <bnemec> #topic zuulv3 migration
15:17:22 <bnemec> The zmq retirement is related to this.
15:17:40 <bnemec> I thought I saw something about migrating grenade jobs too.
15:17:48 <tosky> yep
15:18:01 <tosky> line 213: https://etherpad.opendev.org/p/goal-victoria-native-zuulv3-migration
15:18:04 <kgiusti> Yeah - that's one of the "retirement" tasks
15:18:21 <kgiusti> https://review.opendev.org/#/q/status:open+project:openstack/project-config+branch:master+topic:retire-devstack-plugin-zmq
15:18:25 <tosky> - the openstack/devstack-plugin-zmq jobs are covered by repository retirement
15:18:40 <openstackgerrit> Sean McGinnis proposed openstack/oslo-cookiecutter master: sync oslo-cookiecutter contributing template https://review.opendev.org/743939
15:18:42 <tosky> - oslo.versionedobjects is fixed by https://review.opendev.org/745183
15:19:02 <tosky> - and clarkb provided patches to port the pbr jobs (https://review.opendev.org/745171, https://review.opendev.org/745189, https://review.opendev.org/745192 )
15:19:10 <bnemec> \o/
15:19:47 <bnemec> So basically we have changes in flight to address all of the remaining oslo jobs.
15:20:03 <tosky> correct
15:20:34 <bnemec> stephenfin: See the above about the pbr jobs. I know you had looked at that too.
15:21:16 <bnemec> Okay, we're on track for this goal.
15:21:20 <bnemec> Thanks for the updates!
15:21:25 <bnemec> And all the patches!
15:21:55 <bnemec> #topic oslo.cache flush patch
15:22:02 <bnemec> #link https://review.opendev.org/#/c/742193/
15:22:19 <bnemec> moguimar: kevko_: You're up!
15:22:39 <bnemec> cc lbragstad since he had thoughts on this too.
15:22:53 <moguimar> I think the patch is pretty much solid
15:23:11 <lbragstad> i need to look at it again
15:23:18 <moguimar> but I'm concerned about Keystone expecting the default behavior to be True and us flipping it to False
15:24:16 <bnemec> If we do go ahead with the patch, we must have a way for keystone to default that back to true, IMHO.
15:24:40 <lbragstad> imo - it seems like they need to scale up their memcached deployment
15:24:50 <bnemec> And since Keystone is one of the main consumers of oslo.cache, I'm unclear how much it will help to turn it off only in other places.
15:25:04 <lbragstad> because it appears the root of the issue is that a network event causes memcached to spiral into an unrecoverable error
15:26:01 <lbragstad> i need to stand up an environment with caching configured to debug the issue where you don't flush, because i'm suspicious that stale authorization data will be returned
15:26:39 <lbragstad> (e.g., when memcached is unreachable, the user revokes their token or changes their password, but their tokens are still in memcached)
15:27:04 <moguimar> what if the default value was True instead?
15:27:25 <bnemec> I think it's just one server going down in the pool, then the token getting revoked on a different one, then the original server coming back up that is the problem.
15:27:39 <bnemec> IIUC it can result in a bad cached value for the server that disconnected.
15:27:50 <openstackgerrit> Merged openstack/oslo-cookiecutter master: Add ensure-tox support. https://review.opendev.org/745304
15:27:54 <lbragstad> right - you could have inconsistent data across servers
15:28:01 <lbragstad> and we don't really handle that in keystone code
15:28:14 * bnemec proposes that we just rm -rf memcache_pool
15:28:52 <lbragstad> well - that's essentially what we assume since we flush all memcached data (valid and invalid) when the client reconnects
15:29:22 <lbragstad> (we're not sure what happened when you were gone, but rebuild the source of truth)
15:29:56 <lbragstad> rebuild from keystone's database, which is the source of truth *
15:32:31 <lbragstad> i need to dig into this more, but i haven't had the time
15:32:50 <lbragstad> so i don't want to hold things up if it's in a reasonable place (where keystone can opt into the behavior we currently have today)
15:34:13 <moguimar> the patch does two things: turns it into a config option and flips the default behavior
15:39:36 <bnemec> I'm curious what happens in the affected cluster if they just restart all of their services. Doesn't it trigger the same overload?
15:39:46 <bnemec> Maybe on a rolling restart it's spread out enough to not cause a problem?
15:41:07 <openstackgerrit> Moisés Guimarães proposed openstack/oslo.cache master: Bump dogpile.cache's version for Memcached TLS support https://review.opendev.org/745509
15:42:20 <bnemec> Okay, I've left a review that reflects our discussion here. Let me know if I misrepresented anything.
15:43:17 <openstackgerrit> Merged openstack/oslo-cookiecutter master: sync oslo-cookiecutter contributing template https://review.opendev.org/743939
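For readers following along: the change under discussion (https://review.opendev.org/#/c/742193/) makes memcached's flush-on-reconnect behavior configurable in oslo.cache and, as proposed, flips the default. A minimal sketch of how such a toggle might be registered with oslo.config is below; the option name, group, and default shown here are illustrative assumptions for this note, not the actual contents of the patch.

```python
# Illustrative sketch only -- the option name, group, and default are
# assumptions, not the actual change proposed in review 742193.
from oslo_config import cfg

_memcache_pool_opts = [
    cfg.BoolOpt(
        'memcache_pool_flush_on_reconnect',  # hypothetical option name
        default=True,  # True preserves the flush behavior Keystone relies on
        help='If True, flush all cached data when a memcached server in the '
             'pool reconnects after a disconnect. Flushing avoids serving '
             'stale authorization data at the cost of a cache rebuild storm.'),
]


def register_opts(conf):
    """Register the sketched option under the [cache] group."""
    conf.register_opts(_memcache_pool_opts, group='cache')
```

With a default of True, deployments that hit the reconnect storm could opt out explicitly, while Keystone would keep its current flush-on-reconnect semantics unless it overrides the value, which reflects the concern raised above.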
15:43:29 <bnemec> #topic enable oslo.messaging heartbeat fix by default?
15:43:44 <bnemec> This came up the week before I left.
15:43:59 <kgiusti> seems like a safe bet at this point
15:44:08 <bnemec> Related to the oslo.messaging ping endpoint change.
15:44:34 <kgiusti> yeah, that change I'm not so thrilled about.
15:45:11 <kgiusti> I was thinking of -2'ing that change, but wanted to discuss it here first.
15:45:24 <kgiusti> too bad herve is off having a life :)
15:45:38 <bnemec> Yeah, related only insofar as it came up in the discussion as an issue with checking liveness of services.
15:45:43 <kgiusti> I wanted his opinion
15:45:55 <kgiusti> bnemec: +1
15:46:11 <bnemec> We can probably wait until next week. This option has been around for quite a while now so it's not critical that we do it immediately.
15:46:28 <bnemec> I'm not aware of anyone reporting issues with it though.
15:46:46 <kgiusti> neither am I
15:47:10 <kgiusti> but I think we do need to make a final decision on that ping patch
15:47:14 <bnemec> Okay, I'll just leave it on the agenda for next week.
15:47:21 <kgiusti> https://review.opendev.org/#/c/735385/
15:47:31 <bnemec> Was there more discussion on that after I logged off?
15:47:46 * bnemec has not been through openstack-discuss yet
15:48:01 <kgiusti> Lemme check...
15:48:44 <kgiusti> http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016229.html
15:49:17 <kgiusti> and the start of the discussion: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016097.html
15:49:45 <bnemec> I think it's in the same place as when I left. :-/
15:50:05 <kgiusti> KK
15:50:23 <bnemec> It feels like a bug in Nova if a compute node can stop responding to messaging traffic and still be seen as "up".
15:50:59 <kgiusti> Agreed. Seems like that proposed feature is out of scope for the oslo.messaging project IMHO.
15:51:43 <kgiusti> Other than internal state monitoring, o.m. isn't intended to be a healthcheck solution
15:52:03 <bnemec> I feel weird arguing against this when I've been advocating for good-enough healthchecks in the api layer though. :-/
15:52:36 <kgiusti> heck I _wanted_ this, but for my own selfish "don't blame me" reasons :)
15:53:14 <kgiusti> Having dan's opinion made me rethink that from a more user-driven perspective.
15:54:36 <kgiusti> Anyhow, that's where we stand at the moment.
15:55:12 <kgiusti> I was wondering if any folks in Oslo felt differently.
15:55:30 <bnemec> Unfortunately we're a bit short on Oslo folks today.
15:56:16 <bnemec> I'm going to reply to the thread and ask if fixing the service status on the Nova side would address the concern here. That seems like a better fix than adding a bunch of extra ping traffic on the rabbit bus (which is already a bottleneck in most deployments).
15:56:38 <kgiusti> +1
15:57:04 <bnemec> #action bnemec to reply to rpc ping thread with results of meeting discussion
15:57:12 <kgiusti> thanks bnemec
15:58:43 <bnemec> Okay, we're basically at time now so I'm going to skip the wayward review and open discussion.
15:59:07 <bnemec> I think we had some good discussions this week though, so it was a productive meeting.
15:59:29 <bnemec> If there's anything else we need to discuss, feel free to add it to the agenda for next week or bring it up in regular IRC.
15:59:42 <bnemec> Thanks for joining everyone!
15:59:45 <moguimar> not on my end
15:59:58 <moguimar> o/
16:00:00 <bnemec> #endmeeting