18:00:12 <gouthamr> #startmeeting tc
18:00:12 <opendevmeet> Meeting started Tue Jul 16 18:00:12 2024 UTC and is due to finish in 60 minutes. The chair is gouthamr. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:12 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:12 <opendevmeet> The meeting name has been set to 'tc'
18:00:57 <gouthamr> Welcome to the weekly meeting of the OpenStack Technical Committee. A reminder that this meeting is held under the OpenInfra Code of Conduct available at https://openinfra.dev/legal/code-of-conduct.
18:01:03 <gouthamr> Today's meeting agenda can be found at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee.
18:01:10 <gouthamr> #topic Roll Call
18:01:11 <gmann> o/
18:01:15 <dansmith> o/
18:01:17 <frickler> \o
18:01:28 <slaweq> o/
18:01:55 <gtema> o/
18:02:02 <spotz[m]> o/
18:02:23 <JayF> o/
18:02:46 <gouthamr> courtesy-ping: noonedeadpunk
18:02:53 <noonedeadpunk> o/
18:03:03 <noonedeadpunk> sorry, got a bit late :)
18:03:37 <gouthamr> awesome, full house
18:03:42 <spotz[m]> Just starting :)
18:03:47 <gouthamr> lets get started
18:03:57 <gouthamr> #topic AIs from last week
18:04:46 <gouthamr> we took a few regarding inactive/emerging projects
18:04:50 <gouthamr> and we have some reviews pending
18:04:56 <gouthamr> i'll bring them up in a bit
18:05:08 <gouthamr> Masakari: noonedeadpunk to follow up on the status and progress related to SQLAlchemy/oslo.db issues
18:05:12 <opendevreview> Slawek Kaplonski proposed openstack/governance master: Update criteria for the inactive projects to become active again  https://review.opendev.org/c/openstack/governance/+/921500
18:05:26 <gouthamr> any update here, noonedeadpunk?
18:06:03 <noonedeadpunk> frankly - no :(
18:06:28 <gouthamr> ty, no problem.. we can pursue this next week
18:06:31 <noonedeadpunk> trailing behind the plan dramatically
18:06:54 <gouthamr> dalees responded to the ML post regarding adjutant's pending patches and broken CI
18:07:27 <gouthamr> #link https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/RJD75IM5OTDTI4SUUTZQ7F23J4ZSZCSG/
18:08:07 <gouthamr> i saw some patches merging with his help; so hopefully this looks good/trending better for the release team's concerns as well
18:09:25 <gouthamr> any concerns regarding adjutant?
18:09:34 <noonedeadpunk> just checked masakari results from the last week recheck - and they're utterly broken :(
18:09:36 <noonedeadpunk> https://review.opendev.org/c/openstack/masakari/+/920034/1#message-30bd512eedaae5eff4268d1460b00cf78a9cf252
18:11:10 <gouthamr> ah; we probably need suzhengwei's attention
18:11:30 <noonedeadpunk> yeah, for sure
18:11:39 <frickler> adjutant looks ok-ish to me after a quick glance at gerrit
18:11:49 <noonedeadpunk> I'll try to propose a fix though
18:12:27 <gouthamr> noonedeadpunk: thank you..
18:12:32 <gouthamr> frickler: ack; ty
18:13:32 <gouthamr> the other pending items/AIs we took last week were regarding retirement of kuryr-kubernetes and the eventlet removal goal proposal
18:14:27 <gouthamr> we can run through the gate health topic and get to these as part of our usual tracking
18:14:33 <gouthamr> .. discussion
18:14:45 <gouthamr> did i miss any other AIs?
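For context on the SQLAlchemy/oslo.db action item above: that work generally means moving off SQLAlchemy 1.x query idioms. A minimal, self-contained sketch of the 1.x-to-2.x change follows; the Host model here is a hypothetical stand-in, not Masakari's actual schema or its actual fix.

    from sqlalchemy import Boolean, Column, Integer, String, create_engine, select
    from sqlalchemy.orm import Session, declarative_base

    Base = declarative_base()

    class Host(Base):
        # hypothetical stand-in model, not Masakari's schema
        __tablename__ = "hosts"
        id = Column(Integer, primary_key=True)
        name = Column(String(255))
        deleted = Column(Boolean, default=False)

    engine = create_engine("sqlite://")  # in-memory db so the sketch runs standalone
    Base.metadata.create_all(engine)

    with Session(engine) as session:
        # legacy 1.x style: session.query(Host).filter_by(deleted=False).all()
        # 2.x style, needed once the legacy Query API goes away:
        stmt = select(Host).where(Host.deleted.is_(False))
        hosts = session.execute(stmt).scalars().all()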
:)
18:15:31 <gouthamr> #topic A check on gate health
18:15:56 <dansmith> I've been very consumed lately,
18:16:17 <dansmith> (and was out last week) but the big CVE push from the week before definitely had a lot of rechecking, but things did eventually move through
18:16:40 <dansmith> lots of the usual, timeouts doing volume things, failures to ssh to instances that were otherwise healthy, etc
18:17:01 <slaweq> regarding ssh timeouts we have some new issue recently in neutron
18:17:45 <frickler> also zuul itself was having multiple issues in recent days
18:17:46 <slaweq> https://review.opendev.org/c/openstack/neutron/+/924213 this should hopefully make it better
18:17:54 <gouthamr> curious about that; do the tests in nova/integrated-gate/neutron always use cirros? or some other OS images too?
18:18:09 <dansmith> only cirros
18:18:20 <slaweq> but this is also for the ml2/ovs or lb jobs, so not the default configuration from devstack
18:18:26 <dansmith> for the base jobs anyway.. I think maybe ironic and tripleo used other OSes
18:18:54 <dansmith> gouthamr: in general, real OSes have hard-coded memory requirements that mean they won't even boot in 128MB instance flavors
18:19:07 <JayF> We use cirros as the image we install on the nodes. We use other OSes for ramdisks (similar to octavia having amphora with non-cirros)
18:19:18 <frickler> iiuc neutron uses ubuntu for some advanced tests
18:19:23 <gouthamr> manila does ubuntu images; and SSH timeouts are a frequent problem..
18:19:37 <JayF> I'll note I've been working on DIB support for gentoo, and Ironic will be looking into doing IPA ramdisks with gentoo. It's likely also possible to create a minimal alternative to cirros if we ever had a reason to do so.
18:19:41 <slaweq> frickler: true, but only in the neutron-tempest-plugin jobs and in some specific tests
18:19:50 <slaweq> in most of them we use cirros
18:20:12 <noonedeadpunk> yeah, we felt quite some instability in zuul recently
18:20:28 <noonedeadpunk> including timeouts reaching the API during docs promotion
18:20:40 <noonedeadpunk> (basically fetching the artifacts)
18:20:48 <gmann> there was some proposal to test other images in tempest as part of the integrated gate, but we have not made much progress on that
18:21:24 <noonedeadpunk> fwiw dib has a couple of outstanding patches for fixing fedora builds (the non-container way)
18:21:26 <gouthamr> JayF++ nice; i'd be interested to try that! gentoo has a decent package manager and nfs/samba/ceph clients!
18:22:02 <JayF> gouthamr: happy to chat about it sometime in another venue; I am a gentoo contributor and pretty expert level at it :)
18:22:05 <fungi> noonedeadpunk: if it was over the weekend, that was due to some changes in the query planner leading to sub-optimal queries that bogged down the servers
18:22:16 <noonedeadpunk> I think it was today
18:22:18 <frickler> fungi: no, that was 2h ago
18:22:26 <fungi> oh, so something new presumably
18:22:27 <noonedeadpunk> https://zuul.opendev.org/t/openstack/build/6caf416c14c44ca6b20bc834556d1d3a
18:22:46 <noonedeadpunk> I also timed out with the same query locally, just in case
18:25:11 <dansmith> I'm sure that wouldn't matter at this stage, all it has done is clone requirements and try to install it.. I think all that is disabled but let me confirm
18:25:39 <dansmith> oops, sorry
18:25:50 <gouthamr> ack; a few issues here :) ty for highlighting the SSH issue dansmith.. since the image wasn't changed, could the problem be related to the sporadic issue that the reverted neutron dhcp change may have improved? Or has this been seen before?
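As an illustration of the SSH failure mode discussed above, here is a minimal sketch of the kind of readiness polling these jobs rely on (a hypothetical helper, not tempest's actual code). When the guest port is simply unreachable, every connect attempt times out on its own, so raising the overall deadline only multiplies the wait.

    import socket
    import time

    def wait_for_ssh(host, port=22, deadline=300, connect_timeout=10):
        """Poll until the instance accepts TCP connections on the SSH port."""
        stop = time.monotonic() + deadline
        while time.monotonic() < stop:
            try:
                with socket.create_connection((host, port), timeout=connect_timeout):
                    return True  # reachable; key-injection problems surface later
            except OSError:
                time.sleep(2)  # guest may still be booting or waiting on DHCP
        return False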
18:26:40 <dansmith> we see it all the time
18:26:44 <gouthamr> ah
18:27:15 <gouthamr> so it needs some more investigation :/
18:27:24 <gmann> the SSH issue has been consistent for many years :)
18:27:28 <dansmith> it's been ten years, which is why I say "the usual"
18:27:31 <dansmith> right
18:28:00 <gouthamr> and these timeouts are pretty large too? 300 seconds iirc
18:28:29 <dansmith> it's not a timeout problem,
18:28:35 <dansmith> it's a no-connectivity-so-timeout problem
18:28:39 <dansmith> no amount of waiting will fix it
18:28:42 <gouthamr> makes sense
18:29:46 <gouthamr> or doesn't - unless we root-cause this to some misconfigured ports or dhcp/network flows
18:29:51 <slaweq> in some of those cases I noticed issues with metadata which wasn't reachable, thus the long boot times and timeouts
18:30:03 <slaweq> but even if we waited longer in such cases it won't help
18:30:25 <slaweq> because there will be no ssh key inserted into the guest vm
18:30:38 <dansmith> nah, that's not the issue we normally see, we see connection timeouts
18:30:41 <dansmith> either way,
18:30:49 <dansmith> no need to triage these old issues here I think
18:30:56 <slaweq> I was investigating that issue already but did not find anything
18:30:57 <dansmith> I really just mentioned it as "the usual general failures"
18:31:11 <slaweq> dansmith: ok
18:31:22 <gouthamr> ack; i wonder if we can create a bug nevertheless and keep piling on logs/triage information
18:31:37 <gmann> yeah, we have investigated, debugged, and worked around it a lot, but nothing worked well or fixed the root cause
18:31:38 <dansmith> I bet there are plenty already
18:31:44 <dansmith> tbh I'm surprised to hear that anyone is surprised about this :)
18:31:47 <gmann> yes, there are many logged in tempest too
18:31:59 <gouthamr> sheesh, newbies, dansmith
18:32:28 <gouthamr> ack; ty.. i'll track some down and poke you after the meeting
18:32:37 <gouthamr> any other gate discussions to be had today?
18:32:58 <gouthamr> regarding non-cirros images, glad to see JayF's experiments with dib and gentoo; please feel free to post any updates here in the coming weeks
18:33:13 <JayF> For the newbies: Ironic has a similar issue with random failures, usually associated with dnsmasq crashes. It's been reported upstream and is unplanned to be addressed, afaict
18:33:22 <JayF> So we'll have to implement support for something like kea to get outta that one :(
18:33:47 <gouthamr> we hammer things pretty hard
18:34:20 <JayF> I think you get the general impression of why they are uninterested in fixing the hard memory-safety bug, yeah :D It only really happens in OpenStack CI
18:36:43 <gouthamr> perhaps you'll be chatting about the zuul query issues in #opendev
18:36:53 <gouthamr> hehe, you are..
18:36:56 <gouthamr> lets move on
18:37:10 <gouthamr> #topic 2024.2 TC Tracker
18:37:59 <gouthamr> i wanted to start with eating the frog
18:38:24 <gouthamr> #link https://review.opendev.org/c/openstack/governance/+/902585 (Remove eventlet from OpenStack)
18:38:32 <gouthamr> is this one ready to merge?
18:39:01 <noonedeadpunk> it doesn't have enough votes, at the very least...
18:39:34 <JayF> only 3 of 9 TC members have voted on it by my count
18:40:01 <gouthamr> not by the TC rules though, it has enough to merge.. but i don't like to get it in without more buy-in
18:40:03 <gtema> I'll review this tomorrow
18:40:12 <noonedeadpunk> +1 to ^
18:40:19 <frickler> I have a question about the procedure: this is going to be merged as a "proposed" goal; is the expectation that it will move to "accepted" unchanged?
18:40:42 <gouthamr> not really; and that's something we should brainstorm
18:40:46 <gmann> remember, this is a goal proposal, which does not require a formal-vote, but selection of the goal needs the motion
18:40:48 <JayF> frickler: right now, honestly, I think the point is more to get a roadmap down for people to follow, regardless of whether we commit to it as a goal or not
18:41:19 <dansmith> well, (not) committing to it as a formal goal is pretty important to earn my vote
18:41:21 <noonedeadpunk> ah, then probably it's ready to merge
18:41:30 <gmann> yeah, selection of a goal is a separate process, and the goal document can be improved many times based on feedback
18:41:33 <frickler> +1
18:42:34 <gouthamr> for one, i'd move all of hberaud's lovely explanation into a doc :D
18:42:51 <spotz[m]> I'm sure he would if we asked
18:42:56 <gouthamr> and suggest a roadmap
18:43:12 <gouthamr> and a structure around getting this done across openstack, like the proposal seeks
18:44:55 <gouthamr> spotz[m]: yeah.. for now this is okay.. i think it's a good read; sometimes the dire warnings are repetitive, you get the point after a while.. but the length of this may be deterring people from reviewing
18:45:13 <gouthamr> but it's been evolving this way for months
18:45:27 <gtema> indeed - the length is pretty scary
18:45:37 <noonedeadpunk> yeah, 1500 lines needs quite some determination to finish :D
18:45:38 <spotz[m]> Possibly, and there are different ways to configure the review view, so you might get all the comments or only some of them
18:46:02 <JayF> gtema: yeah, the original was more like "this is the solution" and was shorter; this version is more of a menu of solutions
18:46:22 <gtema> it's all clear, but ...
18:46:29 <gouthamr> haha :D
18:47:53 <gouthamr> so would we need a few more days to look at/iterate on this?
18:48:15 <gtema> yeah, just days, not weeks
18:49:10 <gouthamr> ty, please do; we'll try to get the proposal in, or out, by next week..
18:49:35 <gouthamr> when you comment, please do note if something needs to be clarified when we accept the goal
18:50:17 <gouthamr> noonedeadpunk frickler dansmith gmann slaweq - is this okay? :)
18:50:40 <dansmith> I honestly can't promise anything.. most of my world is still on fire and I'm pretty grumpy about this topic
18:50:44 <gouthamr> sorry for the ping/mention.. i want to understand the process myself and not let this stagnate
18:51:23 <gouthamr> dansmith: ack, your past concerns regarding enforcing asyncio across the board may be addressed
18:51:36 <gouthamr> so i'll tease you to look at those parts and keep us honest
18:51:36 <slaweq> Good for me
18:51:37 <noonedeadpunk> yup, sounds good
18:51:46 <noonedeadpunk> can I ask to land it like... Monday or smth? As there's a chance I'll have time for a good read only during the weekend...
18:51:53 <gouthamr> ++
18:51:54 <gmann> ok for me.
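To make the goal's scope concrete: one migration pattern in the proposal's "menu of solutions" is swapping eventlet green threads for native threads. A minimal sketch with a hypothetical worker; real services would pick threading, asyncio, or another model case by case.

    # before (eventlet):
    #   import eventlet
    #   eventlet.monkey_patch()
    #   gt = eventlet.spawn(handle_request, request)
    #   gt.wait()

    # after (stdlib threads):
    from concurrent.futures import ThreadPoolExecutor

    def handle_request(request):
        return f"handled {request}"  # hypothetical stand-in for real work

    with ThreadPoolExecutor(max_workers=8) as pool:
        future = pool.submit(handle_request, "request-1")
        print(future.result())  # propagates exceptions, much like GreenThread.wait()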
18:52:26 <gouthamr> ++
18:53:13 <gouthamr> okay then; i'll follow up async
18:53:20 <gouthamr> lets go down the tracker
18:53:52 <gouthamr> #link https://review.opendev.org/c/openstack/governance/+/923441 (Inactive state extensions: Freezer)
18:54:08 <gouthamr> #link https://review.opendev.org/c/openstack/governance/+/923919 (Inactive state extensions: Monasca)
18:54:42 <gouthamr> ^ these two will of course conflict; don't know if it's worth rebasing right away, if folks have concerns with one or the other
18:55:04 <gouthamr> #link https://review.opendev.org/c/openstack/governance/+/924109 (move skyline out from the "Emerging Projects" list)
18:56:17 <gouthamr> these should address the project inactivity/emerging projects concerns that we brought up in the past weeks
18:56:57 <gouthamr> pending noonedeadpunk's investigation into masakari, i think we're looking good.. please take a look and express any concerns on these patches
18:56:57 <frickler> regarding skyline, I saw this patch and it made me wonder whether we should be concerned: openstack/skyline-console master: Lock setuptools version as 69.2.0  https://review.opendev.org/c/openstack/skyline-console/+/924130
18:57:16 <noonedeadpunk> the masakari fix was really easy: https://review.opendev.org/c/openstack/masakari/+/924278
18:57:37 <noonedeadpunk> but now someone should merge that :D
18:57:41 <gouthamr> noonedeadpunk: haha, you put that together during the meeting
18:58:22 <gouthamr> frickler: yikes; but it's in the makefile.. do we expect that to be adhered to by distros?
18:59:04 <frickler> well, distros will have to execute the build somehow, too
18:59:27 <JayF> At a minimum, at least they're documenting it for distros.
18:59:32 <gouthamr> https://github.com/pypa/setuptools/issues/4300
18:59:35 <JayF> Packagers should be looking at Makefiles even if not directly using them.
19:00:21 <gouthamr> sorry; we're at the hour
19:00:29 <JayF> o/ ty gouthamr
19:00:39 <gouthamr> frickler: can you post this to the skyline graduation patch as well please?
19:00:46 <gouthamr> thanks for noticing it
19:00:56 <gouthamr> thank you all for attending
19:01:07 <gouthamr> we'll hopefully reprise Open Discussion in future meetings
19:01:12 <gouthamr> #endmeeting