18:00:12 <gouthamr> #startmeeting tc
18:00:12 <opendevmeet> Meeting started Tue Jul 16 18:00:12 2024 UTC and is due to finish in 60 minutes.  The chair is gouthamr. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:12 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:12 <opendevmeet> The meeting name has been set to 'tc'
18:00:57 <gouthamr> Welcome to the weekly meeting of the OpenStack Technical Committee. A reminder that this meeting is held under the OpenInfra Code of Conduct available at https://openinfra.dev/legal/code-of-conduct.
18:01:03 <gouthamr> Today's meeting agenda can be found at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee.
18:01:10 <gouthamr> #topic Roll Call
18:01:11 <gmann> o/
18:01:15 <dansmith> o/
18:01:17 <frickler> \o
18:01:28 <slaweq> o/
18:01:55 <gtema> o/
18:02:02 <spotz[m]> o/
18:02:23 <JayF> o/
18:02:46 <gouthamr> courtesy-ping: noonedeadpunk
18:02:53 <noonedeadpunk> o/
18:03:03 <noonedeadpunk> sorry, I'm a bit late :)
18:03:37 <gouthamr> awesome, full house
18:03:42 <spotz[m]> Just starting:)
18:03:47 <gouthamr> lets get started
18:03:57 <gouthamr> #topic AIs from last week
18:04:46 <gouthamr> we took a few regarding inactive/emerging projects
18:04:50 <gouthamr> and we have some reviews pending
18:04:56 <gouthamr> i'll bring them up in a bit
18:05:08 <gouthamr> Masakari: noonedeadpunk to follow up on the status and progress related to SQLAlchemy/Oslo.db issues
18:05:12 <opendevreview> Slawek Kaplonski proposed openstack/governance master: Update criteria for the inactive projects to become active again  https://review.opendev.org/c/openstack/governance/+/921500
18:05:26 <gouthamr> any update here, noonedeadpunk?
18:06:03 <noonedeadpunk> frankly - no :(
18:06:28 <gouthamr> ty, no problem.. we can pursue this next week
18:06:31 <noonedeadpunk> trailing behind the plan dramatically
18:06:54 <gouthamr> dalees responded to the ML post regarding adjutant's pending patches and broken CI
18:07:27 <gouthamr> #link https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/RJD75IM5OTDTI4SUUTZQ7F23J4ZSZCSG/
18:08:07 <gouthamr> i saw some patches merging with his help; so hopefully this looks good/trending better for the release team's concerns as well
18:09:25 <gouthamr> any concerns regarding adjutant?
18:09:34 <noonedeadpunk> just checked masakari results from the last week recheck - and they're utterly broken :(
18:09:36 <noonedeadpunk> https://review.opendev.org/c/openstack/masakari/+/920034/1#message-30bd512eedaae5eff4268d1460b00cf78a9cf252
18:11:10 <gouthamr> ah; we probably need suzhengwei's attention
18:11:30 <noonedeadpunk> yeah, for sure
18:11:39 <frickler> adjutant looks ok-ish to me after a quick glance at gerrit
18:11:49 <noonedeadpunk> I'll try to propose a fix though
18:12:27 <gouthamr> noonedeadpunk: thank you..
18:12:32 <gouthamr> frickler: ack; ty
18:13:32 <gouthamr> the other pending items/AIs we took last week were regarding retirement of kuryr-kubernetes and the eventlet removal goal proposal
18:14:27 <gouthamr> we can run through the gate health topic and get to these as part of our usual tracking
18:14:33 <gouthamr> .. discussion
18:14:45 <gouthamr> did i miss any other AIs? :)
18:15:31 <gouthamr> #topic A check on gate health
18:15:56 <dansmith> I've been very consumed lately,
18:16:17 <dansmith> (and was out last week), but the big CVE push from the week before definitely involved a lot of rechecking; things did eventually move through
18:16:40 <dansmith> lots of the usual, timeouts doing volume things, failures to ssh to instances that were otherwise healthy, etc
18:17:01 <slaweq> regarding ssh timeouts we have some new issue recently in neutron
18:17:45 <frickler> also zuul itself was having multiple issues in recent days
18:17:46 <slaweq> https://review.opendev.org/c/openstack/neutron/+/924213 this should hopefully make it better
18:17:54 <gouthamr> curious about that; do the tests in nova/integrated-gate/neutron use cirros always? or some other OS images too?
18:18:09 <dansmith> only cirros
18:18:20 <slaweq> but this is also for the ml2/ovs or lb jobs, so not the default configuration from devstack
18:18:26 <dansmith> for the base jobs anyway.. I think maybe ironic and tripleo used other OSes
18:18:54 <dansmith> gouthamr: in general, real OSes have hard-coded memory requirements that mean they won't even boot in 128MB instance flavors
18:19:07 <JayF> We use cirros as the image we install on the nodes. We use other OSes for ramdisks (similar to octavia having amphora with non-cirros)
18:19:18 <frickler> iiuc neutron uses ubuntu for some advanced tests
18:19:23 <gouthamr> manila uses ubuntu images; and SSH timeouts are a frequent problem..
18:19:37 <JayF> I'll note I've been working on DIB support for gentoo, and Ironic will be looking into doing IPA ramdisks with gentoo. It's likely also possible to create a minimal alternative to cirros if we ever had a reason to do so.
18:19:41 <slaweq> frickler: true, but only in the neutron-tempest-plugin jobs and in some specific tests
18:19:50 <slaweq> in most of them we use cirros
18:20:12 <noonedeadpunk> yeah, we felt quite some instability in zuul recently
18:20:28 <noonedeadpunk> including timeouts reaching the API during docs promotion
18:20:40 <noonedeadpunk> (basically fetching the artifacts)
18:20:48 <gmann> there was a proposal to test other images in tempest as integrated, but we have not made much progress on that
18:21:24 <noonedeadpunk> fwiw dib has a couple of outstanding patches for fixing fedora builds (non-container way)
18:21:26 <gouthamr> JayF++ nice; i'd be interested to try that! gentoo has a decent package manager and nfs/samba/ceph clients!
18:22:02 <JayF> gouthamr: happy to chat about it sometime in another venue; I am a gentoo contributor and pretty expert level at it :)
18:22:05 <fungi> noonedeadpunk: if it was over the weekend, that was due to some changes in the query planner leading to sub-optimal queries that bogged down the servers
18:22:16 <noonedeadpunk> I think it was today
18:22:18 <frickler> fungi: no, that was 2h ago
18:22:26 <fungi> oh, so something new presumably
18:22:27 <noonedeadpunk> https://zuul.opendev.org/t/openstack/build/6caf416c14c44ca6b20bc834556d1d3a
18:22:46 <noonedeadpunk> I also timed out with the same query locally, just in case
18:25:11 <dansmith> I'm sure that wouldn't matter at this stage, all it has done is clone requirements and try to install it.. I think all that is disabled but let me confirm
18:25:39 <dansmith> oops, sorry
18:25:50 <gouthamr> ack; a few issues here :) ty for highlighting the SSH issue dansmith.. since the image wasn't changed, could the problem there be related to the sporadic issue that the reverted neutron dhcp change may have improved? Or has this been seen before
18:26:40 <dansmith> we see it all the time
18:26:44 <gouthamr> ah
18:27:15 <gouthamr> so needs some more investigation :/
18:27:24 <gmann> the SSH issue has been consistent for many years :)
18:27:28 <dansmith> it's been ten years, which is why I say the usual
18:27:31 <dansmith> right
18:28:00 <gouthamr> and these timeouts are pretty large too? 300 seconds iirc
18:28:29 <dansmith> it's not a timeout problem,
18:28:35 <dansmith> it's a no-connectivity-so-timeout problem
18:28:39 <dansmith> no amount of waiting will fix it
18:28:42 <gouthamr> makes sense
18:29:46 <gouthamr> or doesn't - unless we root cause this to some misconfigured ports or dhcp/network flows
18:29:51 <slaweq> in some of those cases I noticed issues with metadata, which wasn't reachable, thus there were long boot times and timeouts
18:30:03 <slaweq> but even if we wait longer in such a case it won't help
18:30:25 <slaweq> because there will be no ssh key inserted into the guest vm
18:30:38 <dansmith> nah, that's not the issue we normally see, we see connection timeout
18:30:41 <dansmith> either way,
18:30:49 <dansmith> no need to triage these old issues here I think
18:30:56 <slaweq> I was investigating that issue already but did not find anything
18:30:57 <dansmith> I really just mentioned it as "the usual general failures"
18:31:11 <slaweq> dansmith ok
18:31:22 <gouthamr> ack; i wonder if we can create a bug nevertheless and keep piling up logs/triage information
18:31:37 <gmann> yeah, we have investigated and debugged and worked around it a lot, but nothing worked well or fixed the root cause.
18:31:38 <dansmith> I bet there's plenty already
18:31:44 <dansmith> tbh I'm surprised to hear that anyone is surprised about this :)
18:31:47 <gmann> yes, there are many logged in tempest as well
18:31:59 <gouthamr> sheesh newbies, dansmith
18:32:28 <gouthamr> ack; ty.. i'll track some down and poke you after the meeting
18:32:37 <gouthamr> any other gate discussions to be had today?
18:32:58 <gouthamr> regarding non-cirros images, glad to see JayF's experiments with dib and gentoo, please feel free to post any updates here in the coming weeks
18:33:13 <JayF> For the newbies, Ironic has a similar issue with random failures, usually associated with dnsmasq crashes. It's been reported upstream and there's no plan to address it afaict
18:33:22 <JayF> So we'll have to implement support for something like kea to get outta that one :(
18:33:47 <gouthamr> we hammer things pretty hard
18:34:20 <JayF> I think you get the general impression of why they are uninterested in fixing the hard memory safety bug, yeah :D Only really happens in OpenStack CI
18:36:43 <gouthamr> perhaps you'll be chatting about the zuul query issues in #opendev
18:36:53 <gouthamr> hehe, you are..
18:36:56 <gouthamr> lets move on
18:37:10 <gouthamr> #topic 2024.2 TC Tracker
18:37:59 <gouthamr> i wanted to start with eating the frog
18:38:24 <gouthamr> #link  https://review.opendev.org/c/openstack/governance/+/902585 (Remove eventlet from OpenStack)
18:38:32 <gouthamr> is this one ready to merge?
18:39:01 <noonedeadpunk> it doesn't have enough votes, at the very least...
18:39:34 <JayF> only 3 of 9 TC members have voted on it by my count
18:40:01 <gouthamr> by the TC rules though, it has enough to merge.. but, i don't like to get it in without more buy-in
18:40:03 <gtema> I'll review this tomorrow
18:40:12 <noonedeadpunk> +1 to ^
18:40:19 <frickler> I have a question about the procedure, it is going to be merged as "proposed" goal, is the expectation that it will move to "accepted" unchanged?
18:40:42 <gouthamr> not really; and that's something we should brainstorm
18:40:46 <gmann> remember this is a goal proposal, which does not require a formal-vote, but selection of the goal needs the motion
18:40:48 <JayF> frickler: right now, honestly, I think the point is more to get a roadmap down for people to follow, regardless of if we commit to it as a goal or not
18:41:19 <dansmith> well, (not) committing to it as a formal goal is pretty important to earn my vote
18:41:21 <noonedeadpunk> ah, then probably it's ready for merge
18:41:30 <gmann> yeah, goal selection is a separate process and the goal document can be improved many times based on feedback
18:41:33 <frickler> +1
18:42:34 <gouthamr> for one, i'd move all of hberaud's lovely explanation into a doc :D
18:42:51 <spotz[m]> I'm sure he would if we asked
18:42:56 <gouthamr> and suggest a roadmap
18:43:12 <gouthamr> and a structure around getting this done across openstack like the proposal seeks
18:44:55 <gouthamr> spotz[m]: yeah.. for now this is okay.. i think it's a good read, sometimes the dire warnings are repetitive, you get the point after a while.. but, the length of this may be deterring people from reviewing..
18:45:13 <gouthamr> but, it's been evolving this way for months
18:45:27 <gtema> indeed  -the length is pretty scary
18:45:37 <noonedeadpunk> yeah, 1500 lines needs quite some determination to finish :D
18:45:38 <spotz[m]> Possibly, and there are different ways to configure the view so you might get all the comments or only some of them
18:46:02 <JayF> gtema: yeah, the original was more like "this is the solution" and was shorter, this version is more of a menu of solutions
18:46:22 <gtema> it's all clear, but ...
18:46:29 <gouthamr> haha :D
18:47:53 <gouthamr> so would we need a few more days to look/iterate on this?
18:48:15 <gtema> yeah, just days, not weeks
18:49:10 <gouthamr> ty, please do; we'll try to get the proposal in, or out, by next week..
18:49:35 <gouthamr> when you comment please do note if something needs to be clarified when we accept the goal
18:50:17 <gouthamr> noonedeadpunk frickler dansmith gmann slaweq - is this okay? :)
18:50:40 <dansmith> I honestly can't promise anything.. most of my world is still on fire and I'm pretty grumpy about this topic
18:50:44 <gouthamr> sorry for the ping/mention.. i want to understand the process myself and not let this stagnate
18:51:23 <gouthamr> dansmith: ack, your past concerns regarding enforcing asyncio across the board may be addressed
18:51:36 <gouthamr> so i'll tease you to look at those parts and keep us honest
18:51:36 <slaweq> Good for me
18:51:37 <noonedeadpunk> yup, sounds good
18:51:46 <noonedeadpunk> can I ask to land it like... Monday or something? as there's a chance I will have time for a good read only during the weekend...
18:51:53 <gouthamr> ++
18:51:54 <gmann> ok for me.
18:52:26 <gouthamr> ++
18:53:13 <gouthamr> okay then; i'll follow up async
18:53:20 <gouthamr> lets go down the tracker
18:53:52 <gouthamr> #link https://review.opendev.org/c/openstack/governance/+/923441 (Inactive state extensions: Freezer)
18:54:08 <gouthamr> #link https://review.opendev.org/c/openstack/governance/+/923919 (Inactive state extensions: Monasca)
18:54:42 <gouthamr> ^ these two will of course conflict; don't know if it's worth rebasing right away, if folks have concerns with one or the other
18:55:04 <gouthamr> #link https://review.opendev.org/c/openstack/governance/+/924109 (move skyline out from the "Emerging Projects" list)
18:56:17 <gouthamr> these should address the project inactivity/emerging projects concerns that we brought up in the past weeks
18:56:57 <gouthamr> pending noonedeadpunk's investigation into masakari, i think we're looking good.. please take a look and express any concerns on these patches
18:56:57 <frickler> regarding skyline I saw this patch and it made me wonder whether we should be concerned: openstack/skyline-console master: Lock setuptools version as 69.2.0  https://review.opendev.org/c/openstack/skyline-console/+/924130
18:57:16 <noonedeadpunk> masakari fix was really easy: https://review.opendev.org/c/openstack/masakari/+/924278
18:57:37 <noonedeadpunk> but now someone should merge that :D
18:57:41 <gouthamr> noonedeadpunk: haha, you put that together during the meeting
18:58:22 <gouthamr> frickler: yikes; but, it's in the Makefile.. do we expect that to be adhered to by distros?
18:59:04 <frickler> well, distros will have to execute the build somehow, too
18:59:27 <JayF> At a minimum, they're documenting it for distros.
18:59:32 <gouthamr> https://github.com/pypa/setuptools/issues/4300
18:59:35 <JayF> Packagers should be looking at Makefiles even if not directly using them.
19:00:21 <gouthamr> sorry; we're at the hour
19:00:29 <JayF> o/ ty gouthamr
19:00:39 <gouthamr> frickler: can you post this to the skyline graduation patch as well please?
19:00:46 <gouthamr> thanks for noticing it
19:00:56 <gouthamr> thank you all for attending
19:01:07 <gouthamr> we'll hopefully reprise Open Discussion in future meetings
19:01:12 <gouthamr> #endmeeting