14:00:31 <haleyb> #startmeeting networking
14:00:31 <opendevmeet> Meeting started Tue Sep 10 14:00:31 2024 UTC and is due to finish in 60 minutes. The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:31 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:31 <opendevmeet> The meeting name has been set to 'networking'
14:00:33 <haleyb> Ping list: bcafarel, elvira, frickler, mlavalle, mtomaska, obondarev, slaweq, tobias-urdin, ykarel, lajoskatona, jlibosva, averdagu, amotoki, haleyb, ralonsoh
14:00:34 <mlavalle1> \o
14:00:38 <elvira> o/
14:00:41 <rubasov> o/
14:00:41 <ralonsoh> hello
14:00:50 <mtomaska> o/
14:00:50 <obondarev> hi
14:00:53 <slaweq> o/
14:01:01 <ihrachys> o/
14:01:04 <lajoskatona> o/
14:01:23 <bcafarel> o/
14:01:38 <haleyb> hi everyone, let's get started
14:01:39 <haleyb> #topic announcements
14:01:47 <haleyb> We are now in Dalmatian release week (R-3)
14:02:00 <haleyb> Focus should be on finding and fixing release-critical bugs, so that release candidates and final versions of the 2024.2 Dalmatian deliverables can be proposed, well ahead of the final 2024.2 Dalmatian release date
14:02:11 <haleyb> All deliverables released under a cycle-with-rc model should have a first release candidate by the end of the week, from which a stable/2024.2 branch will be cut
14:02:18 <haleyb> This branch will track the 2024.2 Dalmatian release
14:02:28 <haleyb> Once stable/2024.2 has been created, the master branch will be ready to switch to 2025.1 Epoxy development. While the master branch will no longer be feature-frozen, please prioritize any work necessary for completing 2024.2 Dalmatian plans
14:02:35 <frickler> o/
14:02:37 <haleyb> Release-critical bugfixes will need to be merged in the master branch first, then backported to the stable/2024.2 branch before a new release candidate can be proposed
14:03:00 <haleyb> Ok, so that was a lot of pasting from the wiki
14:03:42 <haleyb> There is an initial RC1 candidate proposed
14:03:46 <haleyb> #link https://review.opendev.org/c/openstack/releases/+/928550
14:03:57 <haleyb> I'm assuming master has moved since that was proposed
14:04:26 <haleyb> I will check and update today, but is there anything in the gate that i should be waiting for?
14:05:16 <lajoskatona> thanks for taking care
14:05:27 <haleyb> As once that merges, stable/2024.2 will be cut and anything critical will need to be cherry-picked
14:06:22 <haleyb> We do have until Friday at the latest, so we can look at getting any last minute fixes in
14:07:27 <haleyb> very quiet, so i'll move on
14:07:33 <ralonsoh> This patch (I found today): the cycle highlights https://review.opendev.org/c/openstack/releases/+/928289
14:07:45 <haleyb> yes, that was next on my list
14:07:49 <ralonsoh> sorry
14:08:14 <haleyb> np, it does need a formatting update, but if there's something i missed, or something that doesn't need to be there, please comment
14:08:41 <haleyb> i just went through commit messages for ideas
14:09:58 <haleyb> and regarding the RC1 deadline - there are a number of other networking-* projects that also have releases proposed
14:10:13 <haleyb> #link https://review.opendev.org/q/project:openstack/releases
14:10:38 <haleyb> if you contribute to any of those (too many to list), please check and +1
14:10:57 <haleyb> i will go through them later today and verify hashes with master branches
14:11:34 <haleyb> So just to re-state the deadlines
14:11:40 <haleyb> RC1 deadline: September 13th, 2024 (R-3 week) (this week)
14:11:47 <haleyb> Final RC deadline: September 26th, 2024 (R-1 week)
14:11:53 <haleyb> Final 2024.2 Dalmatian release: October 2nd, 2024
14:12:40 <haleyb> Thanks for everyone's hard work during the cycle, we're almost done :)
14:13:11 <haleyb> that was all the announcements i had, any others or questions?
14:14:16 <haleyb> ok, let's move onto bugs
14:14:19 <haleyb> #topic bugs
14:14:28 <haleyb> there were unfortunately a lot of bugs this week
14:14:50 <haleyb> ihrachys was the bug deputy, his report is at
14:14:54 <haleyb> #link https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/6UHEMOMG3CM5JN4FHOETKSFGAJOBSYAK/
14:15:17 <ihrachys> some of these in the report are not NEW but changed their state though (e.g. more info added or requested)
14:16:17 <haleyb> ihrachys: ack
14:16:22 <haleyb> first one in the list is
14:16:26 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2078846
14:16:34 <haleyb> OVN fails to transition router port to ACTIVE, claims L2 provisioning block is still present
14:17:40 <ihrachys> i reported it; it took several hours of my time and i still couldn't figure it out fully. i dumped all i found in the bug though. if someone is familiar with provisioning blocks / ovn router port transition to active, your help would be welcome.
14:19:17 <haleyb> ihrachys: thanks for looking. did you see it more than once?
14:19:38 <ihrachys> it was once in gate on my patch, hence I had to look and report :)
14:21:09 <haleyb> https://review.opendev.org/c/openstack/neutron/+/927631 patch i guess based on the link
14:22:51 <haleyb> well, if someone has the cycles to look that would be good, but let's get through some of the others
14:22:54 <haleyb> next is
14:23:09 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2079047
14:23:58 <ralonsoh> yeah, I saw this job failing all the time in the periodic queue
14:24:20 <ihrachys> oh is it in periodic after all? thought it was experimental.
14:24:24 <ralonsoh> I have no time to review this specific configuration
14:24:35 <ralonsoh> all periodic jobs are also experimental
14:24:41 <ralonsoh> (not the other way around)
14:24:45 <ihrachys> if so then I think opensearch exposes the approximate time when it started to fail
14:25:04 <ihrachys> because it was clear on the chart (link in the bug report) when it started to spike
14:25:18 <slaweq> maybe we should ping LiuYulong about it as he is the main author of this distributed dhcp feature and those jobs
14:26:03 <frickler> opensearch only goes back 10 days, so if it starts then, that's just how much data there is, not necessarily when the issue started
14:26:11 <ralonsoh> and it is always the same test
14:26:23 <ihrachys> frickler: ah! :(
14:27:04 <haleyb> slaweq: yes, pinging him would be good as it could be related to those changes
14:27:16 <ralonsoh> I thought that could be related to the eventlet/wsgi change
14:27:17 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/927958
14:27:21 <ralonsoh> but this is not the case
14:28:45 <haleyb> I will add Liu and ask about his thoughts
14:30:28 <haleyb> ok, next one
14:30:35 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2079048
14:30:43 <haleyb> Metadata functional tests failing due to connection timeout
14:30:51 <ihrachys> I think we merged a patch that may help to investigate the next time it hits
14:31:30 <haleyb> yes, there was a patch to enable iptables debug
14:31:49 <haleyb> AssertionError: metadata proxy unreachable on http://[fe80::a9fe:a9fe%port19caee]:80 before timeout
14:31:55 <slaweq> yes, it was just merged today
14:32:25 <slaweq> it will log the iptables rules applied by the agent, maybe this will help us understand why there is no connectivity there
14:32:48 <slaweq> so far I did not find anything wrong in the logs from the job runs which I was checking
14:33:20 <haleyb> so is it always the rate-limiting test that fails?
14:33:43 <slaweq> as far as I saw, yes
14:33:53 <slaweq> and it is probably always the same test
14:34:07 <slaweq> related to ipv6 metadata
14:35:36 <haleyb> right. i'm just thinking of ideas, but wonder if there is something different wrt ipv6 where we hit the limit too soon? or there's just another bug hiding in here
14:36:37 <slaweq> but this is failing on the first attempt to reach the metadata server, so there should be no limit hit yet at all
14:37:37 <haleyb> oh, so either not listening or packets got "lost"
14:37:48 <slaweq> IMO yes
14:38:01 <slaweq> that's why I added this iptables debug to the tests
14:38:16 <slaweq> as metadata in the router namespace should be reachable through the iptables rules
14:38:16 <haleyb> i will watch it and try to look
14:38:35 <slaweq> thx haleyb
14:38:51 <haleyb> and we have one more gate failure
14:38:55 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2079831
14:39:13 <ralonsoh> that is related to the OVN wsgi migration patch
14:39:15 <haleyb> this one i took last week, and added tempest as that's where the code lives
14:39:18 <ralonsoh> thanks for taking it
14:39:40 <haleyb> ralonsoh: oh, so did you only see the failure with that last wsgi patch?
14:39:43 <ralonsoh> yes
14:39:53 <haleyb> i've been trying to see the "not ACTIVE" string unsuccessfully
14:40:16 <haleyb> ok, i might need to add another depends-on to my test patch
14:40:19 <ralonsoh> actually the logs I provided are from this patch, some CI executions
14:40:28 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/924317
14:40:49 <ralonsoh> actually ^^ this patch should be updated with yours
14:40:56 <haleyb> ralonsoh: and i will update the tempest patch based on your comments, was thinking about it last night but ran out of time
14:41:08 <ralonsoh> thanks
14:41:24 <ihrachys> hm. do you have a theory of how the wsgi patch affects the speed of the port active transition?
14:41:56 <ralonsoh> I think (I need to confirm that) it replies faster (the API call)
14:42:29 <ihrachys> ah. so it's vice versa - we list ports more quickly?
14:42:34 <ralonsoh> so we receive the event (chassis binding) and we process it at the same time
14:42:44 <ralonsoh> right
14:42:51 <ralonsoh> but as I said, I need to confirm that
14:43:00 <ralonsoh> (wsgi is supposed to be faster)
14:43:19 <ihrachys> ack. but the test change seems sane regardless. we should not assume a particular slowness / snappiness of port transitions.
14:44:24 <ralonsoh> the problem is that after the nova VM creation call, the ports might not be ready yet
14:44:43 <ralonsoh> so I think this check is correct there
14:45:21 <ralonsoh> actually we usually call https://github.com/openstack/tempest/blob/0a0e1070e573674332cb5126064b95f17099307e/tempest/scenario/test_network_basic_ops.py#L124
14:45:49 <ralonsoh> but not in this case because of project_networks_reachable=False
14:49:22 <haleyb> alright, i'll move on since there's still more unassigned bugs :(
14:49:32 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2078856
14:49:39 <haleyb> OVN invalid syntax '' in networks
14:50:01 <ralonsoh> waiting for more info
14:50:08 <haleyb> right, see that now
14:51:33 <haleyb> there were also some vpnaas ones
14:51:40 <haleyb> #link https://bugs.launchpad.net/neutron/+bug/2080072
14:51:48 <haleyb> Failed to delete vpnaas ipsec-site-connections with 502 error, ORM session: SQL execution without transaction in progress
14:53:15 <haleyb> i will try and ping bodo, don't see him on channel
14:53:24 <ihrachys> noticed it in gate once; seems like a clear case of the sqlalchemy api not being used correctly
14:53:31 <lajoskatona> I will keep an eye on it too
14:55:29 <haleyb> alright, any other bugs to discuss, running out of time
14:55:50 <haleyb> this week's deputy is lucasgomes, next week is jlibosva
14:56:20 <haleyb> i remember someone pinging lucas and he said he's good for this week
14:56:42 <haleyb> and fwiw the bug count is staying stable
14:56:46 <haleyb> Current bug count this week: 717, up 5 from last week
14:56:51 <mlavalle1> haleyb: I confirmed with lucas that he will be triaging bugs this week
14:57:17 <haleyb> mlavalle1: thanks
14:57:18 <mlavalle1> we even had a follow up chat about it at the end of last week
14:57:31 <haleyb> i'll skip over community
14:57:34 <haleyb> #topic on-demand
14:57:41 <haleyb> anything else to discuss?
14:59:13 <haleyb> so just as a reminder, please only merge bug fixes on master until the stable/2024.2 branch is created, or ask for an exception request
14:59:48 <haleyb> as i mentioned, i'll be looking at the release patches today/tomorrow
15:00:36 <haleyb> thanks for attending and have a good week fixing bugs :)
15:00:40 <haleyb> #endmeeting