14:00:31 <haleyb> #startmeeting networking
14:00:31 <opendevmeet> Meeting started Tue Sep 10 14:00:31 2024 UTC and is due to finish in 60 minutes.  The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:31 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:31 <opendevmeet> The meeting name has been set to 'networking'
14:00:33 <haleyb> Ping list: bcafarel, elvira, frickler, mlavalle, mtomaska, obondarev, slaweq, tobias-urdin, ykarel, lajoskatona, jlibosva, averdagu, amotoki, haleyb, ralonsoh
14:00:34 <mlavalle1> \o
14:00:38 <elvira> o/
14:00:41 <rubasov> o/
14:00:41 <ralonsoh> hello
14:00:50 <mtomaska> o/
14:00:50 <obondarev> hi
14:00:53 <slaweq> o/
14:01:01 <ihrachys> o/
14:01:04 <lajoskatona> o/
14:01:23 <bcafarel> o/
14:01:38 <haleyb> hi everyone, let's get started
14:01:39 <haleyb> #topic announcements
14:01:47 <haleyb> We are now in Dalmatian release week (R - 3)
14:02:00 <haleyb> Focus should be on finding and fixing release-critical bugs, so that release candidates and final versions of the 2024.2 Dalmatian deliverables can be proposed, well ahead of the final 2024.2 Dalmatian release date
14:02:11 <haleyb> All deliverables released under a cycle-with-rc model should have a first release candidate by the end of the week, from which a stable/2024.2 branch will be cut
14:02:18 <haleyb> This branch will track the 2024.2 Dalmatian release
14:02:28 <haleyb> Once stable/2024.2 has been created, the master branch will be ready to switch to 2025.1 Epoxy development. While the master branch will no longer be feature-frozen, please prioritize any work necessary for completing 2024.2 Dalmatian plans
14:02:35 <frickler> o/
14:02:37 <haleyb> Release-critical bugfixes will need to be merged in the master branch first, then backported to the stable/2024.2 branch before a new release candidate can be proposed
14:03:00 <haleyb> Ok, so that was a lot of pasting from the wiki
14:03:42 <haleyb> There is an initial RC1 candidate proposed
14:03:46 <haleyb> #link https://review.opendev.org/c/openstack/releases/+/928550
14:03:57 <haleyb> I'm assuming master has moved since that was proposed
14:04:26 <haleyb> I will check and update today, but is there anything in the gate that i should be waiting for?
14:05:16 <lajoskatona> thanks for taking care
14:05:27 <haleyb> As once that merges, stable/2024.2 will be cut and anything critical will need to be cherry-picked
14:06:22 <haleyb> We do have until Friday at the latest, so we can look at getting any last minute fixes in
14:07:27 <haleyb> very quiet, so i'll move on
14:07:33 <ralonsoh> This patch (I found today): the cycle highlights https://review.opendev.org/c/openstack/releases/+/928289
14:07:45 <haleyb> yes, that was next on my list
14:07:49 <ralonsoh> sorry
14:08:14 <haleyb> np, it does need a formatting update, but if there's something i missed, or something that doesn't need to be there please comment
14:08:41 <haleyb> i just went through commit messages for ideas
14:09:58 <haleyb> and regarding the RC1 deadline - there are a number of other networking-* projects that also have releases proposed
14:10:13 <haleyb> #link https://review.opendev.org/q/project:openstack/releases
14:10:38 <haleyb> if you contribute to any of those (too many to list), please check and +1
14:10:57 <haleyb> i will go through them later today and verify hashes with master branches
14:11:34 <haleyb> So just to re-state the deadlines
14:11:40 <haleyb> RC1 deadline: September 13th, 2024 (R-3 week) (this week)
14:11:47 <haleyb> Final RC deadline: September 26th, 2024 (R-1 week)
14:11:53 <haleyb> Final 2024.2 Dalmatian release: October 2nd, 2024
14:12:40 <haleyb> Thanks for everyone's hard work during the cycle, we're almost done :)
14:13:11 <haleyb> that was all the announcements i had, any others or questions?
14:14:16 <haleyb> ok, let's move onto bugs
14:14:19 <haleyb> #topic bugs
14:14:28 <haleyb> there were unfortunately a lot of bugs this week
14:14:50 <haleyb> ihrachys was the bug deputy, his report is at
14:14:54 <haleyb> #link https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/6UHEMOMG3CM5JN4FHOETKSFGAJOBSYAK/
14:15:17 <ihrachys> some of these in the report are not NEW but changed their state though (e.g. more info added or requested)
14:16:17 <haleyb> ihrachys: ack
14:16:22 <haleyb> first one in the list is
14:16:26 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2078846
14:16:34 <haleyb> OVN fails to transition router port to ACTIVE, claims L2 provisioning block is still present
14:17:40 <ihrachys> i reported it; it took several hours of my time and i still couldn't figure it out fully. i dumped all i found in the bug though. if someone is familiar with provisioning blocks / ovn router port transition to active, your help would be welcome.
14:19:17 <haleyb> ihrachys: thanks for looking. did you see it more than once?
14:19:38 <ihrachys> it was once in gate on my patch, hence I had to look and report :)
14:21:09 <haleyb> https://review.opendev.org/c/openstack/neutron/+/927631 is the patch i guess, based on the link
14:22:51 <haleyb> well, if someone has the cycles to look that would be good, but lets get through some of the others
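For reference, the mechanism mentioned above is neutron's provisioning-blocks machinery, where ML2 keeps a port out of ACTIVE until every registered entity reports completion. A minimal sketch of the two calls involved follows; the context and port id are placeholders, and in the real flow these calls are made by ML2 / the OVN mech driver against a configured neutron database, not by user code.

    # Sketch only: illustrates the provisioning-blocks API named in the bug.
    # Requires a running neutron deployment/DB; the port id is a placeholder.
    from neutron.db import provisioning_blocks
    from neutron_lib.callbacks import resources
    from neutron_lib import context as n_context

    ctx = n_context.get_admin_context()
    port_id = 'PORT-UUID'  # placeholder

    # Register a block: the port cannot transition to ACTIVE until the L2
    # entity reports completion for it.
    provisioning_blocks.add_provisioning_component(
        ctx, port_id, resources.PORT, provisioning_blocks.L2_AGENT_ENTITY)

    # Later, once the backend considers the port wired up, the block is
    # released; when all entities have completed, ML2 flips the port ACTIVE.
    provisioning_blocks.provisioning_complete(
        ctx, port_id, resources.PORT, provisioning_blocks.L2_AGENT_ENTITY)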
14:22:54 <haleyb> next is
14:23:09 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2079047
14:23:58 <ralonsoh> yeah, I saw this job failing every time in the periodic queue
14:24:20 <ihrachys> oh is it in periodic after all? thought it was experimental.
14:24:24 <ralonsoh> I have no time to review this specific configuration
14:24:35 <ralonsoh> all periodic jobs are also experimental
14:24:41 <ralonsoh> (not the other way around)
14:24:45 <ihrachys> if so then I think opensearch exposes the approximate time when it started to fail
14:25:04 <ihrachys> because it was clear on the chart (link in the bug report) when it started to spike
14:25:18 <slaweq> maybe we should ping LiuYulong about it as he is main author of this distributed dhcp feature and those jobs
14:26:03 <frickler> opensearch only goes back 10 days, so if the failures start then, that's just how far the data goes, not necessarily when the issue started
14:26:11 <ralonsoh> and is always the same test
14:26:23 <ihrachys> frickler: ah! :(
14:27:04 <haleyb> slaweq: yes, pinging him would be good as it could be related to those changes
14:27:16 <ralonsoh> I thought that could be related to the eventlet/wsgi change
14:27:17 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/927958
14:27:21 <ralonsoh> but this is not the case
14:28:45 <haleyb> I will add Liu and ask about his thoughts
14:30:28 <haleyb> ok, next one
14:30:35 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2079048
14:30:43 <haleyb> Metadata functional tests failing due to connection timeout
14:30:51 <ihrachys> I think we merged a patch that may help investigate the next time it hits
14:31:30 <haleyb> yes, there was a patch to enable iptables debug
14:31:49 <haleyb> AssertionError: metadata proxy unreachable on http://[fe80::a9fe:a9fe%port19caee]:80 before timeout
14:31:55 <slaweq> yes, it was just merged today
14:32:25 <slaweq> it will log iptables rules applied by the agent, maybe this will help understand why there is no connectivity there
14:32:48 <slaweq> so far I have not found anything wrong in the logs from the job runs I was checking
14:33:20 <haleyb> so is it always the rate-limiting test that fails?
14:33:43 <slaweq> as far as I saw, yes
14:33:53 <slaweq> and it is always the same test probably
14:34:07 <slaweq> related to ipv6 metadata
14:35:36 <haleyb> right. i'm just thinking of ideas, but wonder if there is something different wrt ipv6 where we hit the limit too soon? or there's just another bug hiding in here
14:36:37 <slaweq> but this is failing on the first attempt to reach the metadata server, so no limit should have been hit yet at all
14:37:37 <haleyb> oh, so either not listening or packets got "lost"
14:37:48 <slaweq> IMO yes
14:38:01 <slaweq> that's why I added this iptables debug to the tests
14:38:16 <slaweq> as metadata in router namespace should be reachable through the iptables rules
14:38:16 <haleyb> i will watch it and try to look
14:38:35 <slaweq> thx haleyb
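For context on the failure mode discussed above: the test gives up if the metadata proxy never answers on the IPv6 link-local address before a timeout. The readiness probe looks roughly like the sketch below; the URL, timeout and interval are illustrative, not the functional test's actual values, and the interface zone id in the address is percent-encoded for urllib (zone-id handling may vary by Python version).

    # Rough sketch of the kind of probe the failing functional test performs:
    # retry an HTTP GET against the metadata proxy until it answers or the
    # timeout expires.
    import time
    import urllib.error
    import urllib.request


    def wait_for_metadata(url, timeout=60, interval=2):
        """Return True once the metadata proxy answers, False on timeout."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                urllib.request.urlopen(url, timeout=interval)
                return True
            except urllib.error.HTTPError:
                # the proxy answered, even if with an error status
                return True
            except (urllib.error.URLError, OSError):
                time.sleep(interval)
        return False


    # e.g. wait_for_metadata('http://[fe80::a9fe:a9fe%25port19caee]:80')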
14:38:51 <haleyb> and we have one more gate failure
14:38:55 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2079831
14:39:13 <ralonsoh> that is related to the OVN wsgi migration patch
14:39:15 <haleyb> this one i took last week, and added tempest as that's where the code lives
14:39:18 <ralonsoh> thanks for taking it
14:39:40 <haleyb> ralonsoh: oh, so did you only see the failure with that last wsgi patch?
14:39:43 <ralonsoh> yes
14:39:53 <haleyb> i've been trying to see the "not ACTIVE" string unsuccessfully
14:40:16 <haleyb> ok, i might need to add another depends-on to my test patch
14:40:19 <ralonsoh> actually the logs I provided are from this patch, some CI executions
14:40:28 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/924317
14:40:49 <ralonsoh> actually ^^ this patch should be updated with yours
14:40:56 <haleyb> ralonsoh: and i will update the tempest patch based on your comments, was thinking about it last night but ran out of time
14:41:08 <ralonsoh> thanks
14:41:24 <ihrachys> hm. do you have a theory of how the wsgi patch affects the speed of the port active transition?
14:41:56 <ralonsoh> I think (I need to confirm that) it replies faster (the API call)
14:42:29 <ihrachys> ah. so it's vice versa - we list ports more quickly?
14:42:34 <ralonsoh> so we receive the event (chassis binding) and we process it at the same time
14:42:44 <ralonsoh> right
14:42:51 <ralonsoh> but as I said, I need to confirm that
14:43:00 <ralonsoh> (wsgi is supposed to be faster)
14:43:19 <ihrachys> ack. but the test change seems sane regardless. we should not assume a particular slowness / snappiness of port transitions.
14:44:24 <ralonsoh> the problem is that after the nova VM creation call, the ports might not be ready yet
14:44:43 <ralonsoh> so I think this check is correct there
14:45:21 <ralonsoh> actually we usually call https://github.com/openstack/tempest/blob/0a0e1070e573674332cb5126064b95f17099307e/tempest/scenario/test_network_basic_ops.py#L124
14:45:49 <ralonsoh> but not in this case because of project_networks_reachable=False
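As a reference for the tempest change being discussed, not assuming any particular transition speed usually means polling instead of asserting once. A generic sketch follows; the client callable and timings are illustrative, not the exact tempest API.

    # Illustrative only: poll until the port goes ACTIVE rather than asserting
    # its status right after the nova boot call. `show_port` stands in for
    # whatever client call returns the port dict; timeout/interval are arbitrary.
    import time


    def wait_for_port_active(show_port, port_id, timeout=120, interval=5):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            status = show_port(port_id)['port']['status']
            if status == 'ACTIVE':
                return
            if status == 'ERROR':
                raise RuntimeError('port %s went to ERROR' % port_id)
            time.sleep(interval)
        raise TimeoutError('port %s not ACTIVE after %ss' % (port_id, timeout))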
14:49:22 <haleyb> alright, i'll move on since there's still more unassigned bugs :(
14:49:32 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2078856
14:49:39 <haleyb> OVN invalid syntax '' in networks
14:50:01 <ralonsoh> waiting for more info
14:50:08 <haleyb> right, see that now
14:51:33 <haleyb> there were also some vpnaas ones
14:51:40 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2080072
14:51:48 <haleyb> Failed to delete vpnaas ipsec-site-connections with 502 error, ORM session: SQL execution without transaction in progress
14:53:15 <haleyb> i will try and ping bodo, don't see him on channel
14:53:24 <ihrachys> noticed it in gate once; seems like a clear case of sqlalchemy api not used correctly
14:53:31 <lajoskatona> I will also keep an eye on it
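That error message generally means ORM calls ran outside a reader/writer transaction. A minimal sketch of the expected neutron-lib pattern follows; the helper and its arguments are illustrative, not the actual vpnaas code.

    # Minimal sketch, assuming neutron-lib's enginefacade context managers.
    # The point is that session work must happen inside CONTEXT_READER/WRITER,
    # otherwise SQLAlchemy complains about execution without a transaction.
    from neutron_lib import context as n_context
    from neutron_lib.db import api as db_api


    def delete_db_object(model, object_id):
        ctx = n_context.get_admin_context()
        with db_api.CONTEXT_WRITER.using(ctx):
            # ctx.session is only valid inside the writer transaction
            obj = ctx.session.query(model).filter_by(id=object_id).first()
            if obj is not None:
                ctx.session.delete(obj)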
14:55:29 <haleyb> alright, any other bugs to discuss? we're running out of time
14:55:50 <haleyb> this week's deputy is lucasgomes, next week is jlibosva
14:56:20 <haleyb> i remember someone pinging lucas and he said he's good for this week
14:56:42 <haleyb> and fwiw bug count is staying stable
14:56:46 <haleyb> Current bug count this week: 717, up 5 from last week
14:56:51 <mlavalle1> haleyb: I confirmed with lucas that he will be triaging bugs this week
14:57:17 <haleyb> mlavalle1: thanks
14:57:18 <mlavalle1> we even had a follow up chat about it at the end of last week
14:57:31 <haleyb> i'll skip over community
14:57:34 <haleyb> #topic on-demand
14:57:41 <haleyb> anything else to discuss?
14:59:13 <haleyb> so just as a reminder, please only merge bug fixes on master until the stable/2024.2 branch is created, or ask for an exception
14:59:48 <haleyb> as i mentioned, i'll be looking at the release patches today/tomorrow
15:00:36 <haleyb> thanks for attending and have a good week fixing bugs :)
15:00:40 <haleyb> #endmeeting