16:00:12 <johnsom> #startmeeting Octavia
16:00:13 <openstack> Meeting started Wed Jul 15 16:00:12 2020 UTC and is due to finish in 60 minutes. The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:16 <openstack> The meeting name has been set to 'octavia'
16:00:20 <cgoncalves> hi
16:00:22 <johnsom> Hi everyone
16:00:23 <ataraday_> hi
16:00:28 <gthiemonge> hi
16:00:28 <aannuusshhkkaa> hi
16:00:38 <ZhuXiaoYu> hi
16:01:07 <rm_work> o/
16:01:13 <johnsom> #topic Announcements
16:01:27 <johnsom> OpenStack 10 years celebration July 16th 1500 UTC
16:01:35 <johnsom> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-July/015839.html
16:02:06 <johnsom> I wanted to make sure everyone knew the foundation is having a virtual celebration of ten years of OpenStack
16:02:24 <johnsom> That is the only announcement I have this week. Any others?
16:02:59 <johnsom> #topic Brief progress reports / bugs needing review
16:03:46 <johnsom> I have been deep into porting the failover flow updates to the amphora v2 driver. It's slow going, but I think I have the tasks all ported now. I have started on the amphora failover flow.
16:03:58 <ataraday_> I would like to point out that there are 8 changes for review about amphorav2 required to be ready/merged listed in https://etherpad.openstack.org/p/octavia-priority-reviews
16:04:25 <ataraday_> And I will have a vacation next week
16:04:38 <ataraday_> will we make till end of July?
16:05:09 <johnsom> Nice, I will take a look at those.
16:05:42 <johnsom> I assume you are talking about making Amphora v2 default decision for MS2 milestone?
16:05:43 <cgoncalves> I've completed the failover flows backports to Ussuri and Train, now passing Python 2 CI jobs too. work on the jobboard on/off continued (thank you for the reviews thus far, ataraday_!)
16:06:07 <johnsom> I have added a topic later on the agenda to talk about that. I think we have some problems in the Amphora v2 driver we need to work on.
16:06:25 <ataraday_> johnsom, yeah, ok lets discuss later
16:07:35 <cgoncalves> I also posted small but important patches, and would appreciate reviews on those. I added them to the review priority list as well as cleaned it up by removing already merged patches
16:07:53 <cgoncalves> #link https://etherpad.opendev.org/p/octavia-priority-reviews
16:08:13 <johnsom> cgoncalves Thanks for working on the priority list!
16:08:23 <aannuusshhkkaa> We are working on changing amphora stats to report deltas instead of absolutes. We have written code for the amphora, still have to write code for the other side.
16:09:05 <johnsom> Nice
16:09:41 <johnsom> #topic Queens and Rocky End Of Life
16:09:55 <johnsom> Someone added this topic to the agenda. Please run with it
16:10:07 <cgoncalves> o/
16:11:00 <cgoncalves> Adam proposed some time ago a patch to EOL Queens.
16:11:42 <cgoncalves> Adam Workflow-1'd because, per my understanding, we could still want to make a final release. I think we are past that time since it has entered extended maintenance.
16:12:09 <cgoncalves> if that is accurate, would we be good to officially EOL queens?
16:12:46 <johnsom> We can still merge the patches on Queens if we want even though it is in extended maintenance. It just will not have anymore releases
16:13:17 <cgoncalves> there are a few open stable/queens patches. however, queens and rocky CI pipelines are broken due to devstack/wsgi/... issues
16:13:36 <johnsom> I am good with moving forward with the EOL. I don't think we have any reasons to not move forward.
16:13:48 <johnsom> Ugh, yeah, those are ... troublesome
16:14:34 <cgoncalves> there was a patch in devstack project to fix issues but I don't think that has made much progress since
16:15:44 <cgoncalves> so, Rocky is mostly also in the same boat: broken CI, in extended maintenance mode (no point releases, no CI commitments). should we EOL Rocky too?
16:16:57 <johnsom> I think we need to finish the EOL of Queens first, but I am also very open to EOL Rocky.
16:17:03 <rm_work> link it and i will un-minus-workflow
16:17:25 <johnsom> #link https://review.opendev.org/#/c/719099/
16:17:30 <cgoncalves> 2 sec. finding your patch
16:17:37 <cgoncalves> #link https://review.opendev.org/#/c/719099/
16:18:01 <rm_work> So, no more release, go ahead with queens EOL?
16:18:05 <rm_work> sounds good to me
16:18:41 <johnsom> Should we just abandon the open patches?
16:19:04 <johnsom> I guess if devstack and the gates are broken they aren't going to be able to merge any time soon
16:19:25 <cgoncalves> +1 to abandon
16:19:53 <johnsom> Ok, maybe we wait for the EOL tag to land, then abandon these.
16:21:45 <johnsom> Does someone want to propose EOL of Rocky? Are we ready for that?
16:22:46 <johnsom> I think most of the distributions are using Queens as LTS and not Rocky, so maybe that is ok
16:22:53 <cgoncalves> I can propose if you would like. I think coming from the PTL or stable liaison would be best
16:23:38 <johnsom> Yeah, I agree. I expect Rocky will have some commentary... grin
16:25:05 <cgoncalves> the fact devstack Rocky is broken for most if not everyone for weeks is already half-way to have it accepted
16:25:16 <johnsom> Not a lot of comments here. I guess go ahead and propose it and see if anyone comments on the proposal.
16:25:49 <cgoncalves> OK. I'll do it asap
16:25:53 <johnsom> Yeah, I had the same comment about Queens. Both of the recent CVEs in other projects didn't bother to backport to queens.
16:26:23 <johnsom> Ok, thank you cgoncalves
16:26:39 <johnsom> #topic Active/Active specification
16:26:48 <johnsom> #link https://review.opendev.org/#/c/723864
16:27:06 <johnsom> I want to highlight we have an open specification that has had a recent update.
16:27:35 <johnsom> Also ZhuXiaoYu has joined the meeting today if you have any questions/comments on the specification
16:28:34 <johnsom> I have not yet had a chance to review the update. How are the proposals for changes to neutron going?
16:28:43 <ZhuXiaoYu> yes, please speak to me if you have any advise with this
16:29:19 <ZhuXiaoYu> I remove the same subnet limit
16:30:03 <ZhuXiaoYu> and continue using existing update router api
16:30:57 <ZhuXiaoYu> and about the FIP, sorry but it is necessary
16:31:14 <johnsom> #link https://review.opendev.org/#/c/729532
16:31:21 <johnsom> That is the ECMP in neutron patch
16:31:55 <johnsom> Hmm, yeah, if FIPs are necessary, that might be an issue. We would at a minimum need to document somehow.
16:32:15 <ZhuXiaoYu> yes, our neutron team is planning to use HMARK in iptables instead of normal routes
16:33:15 <ZhuXiaoYu> this could support not to redistribute flows when remove a amphora
16:34:19 <johnsom> I have put down a task of re-reviewing the comments/specification. I hope others in the Octavia community can also comment/review.
16:35:14 <johnsom> ZhuXiaoYu Are there any questions you have for the Octavia team we can answer now? Or just the items in the comments of the specification?
16:37:45 <ZhuXiaoYu> In fact, we're more concerned about how to get it passed.
16:38:18 <johnsom> Ok, so what you need is reviews. Thanks!
16:38:35 <johnsom> #topic State of Amphora v2 driver
16:38:48 <ataraday_> I would like to mention that first 3 changes from the list - are actually important fixes for amphorav2 issues and should be backported.
16:39:27 <johnsom> So this is my topic. As I have been working on the failover port, I have found some issues I think we need to work on.
16:39:38 <johnsom> I have created an etherpad to track:
16:39:43 <johnsom> #link https://etherpad.opendev.org/p/octavia-amphora-v2
16:40:14 <johnsom> Most are around the fact that when a flow fails and is re-dispatched, it's not bounded in any way.
16:40:58 <johnsom> So, for example, I have a code issue (expected during the work I'm doing) that causes a port data model to not serialize correctly.
16:41:26 <johnsom> This caused the controller to go into an endless loop trying to re-dispatch the flow to another worker.
16:41:53 <johnsom> The problem is that the revert isn't triggering, so each attempt creates a new port in neutron
16:42:20 <johnsom> In a matter of seconds I have 255 bogus ports and no more IP addresses left of the subnet.
16:43:51 <johnsom> So, this is a big concern for me with the v2 driver. I think we need to look into how to make sure it doesn't run away and consume all of the cloud resources.
16:44:58 <ataraday_> Could you put steps of repro for this on etherpad?
16:45:37 <ataraday_> or a story
16:45:47 <johnsom> Yes, I can. Granted they will be "artificial" steps, but I think some cloud outage scenarios will also trigger this behavior.
16:46:10 <johnsom> I will do this after the meeting.
16:46:26 <ataraday_> johnsom, thanks!
16:47:18 <johnsom> ataraday_ Thank you for the comment about the progress update notification. I also agree I need to look at that.
16:48:19 <johnsom> Let's work on these issues. I hope we can have some initial comments/ideas in a week or so. That way we can make an good decision on the "v2 default" at MS2
16:48:53 <johnsom> If you find any other concerns, please feel free to update the etherpad.
16:49:42 <johnsom> Any other discussion on the V2 amphora driver?
16:51:26 <johnsom> Ok. More work to do, but I don't think it's a show stopper yet.
16:51:29 <johnsom> #topic Open Discussion
16:51:37 <johnsom> Any other topics this week?
16:52:22 <cgoncalves> just a FYI that all our scenario CI jobs run on nested-virt nodepool instances with KVM rather than TCG
16:52:36 <johnsom> Nice!
16:52:44 <johnsom> That should help a lot.
16:53:00 <johnsom> I just hope we don't find many broken hypervisors.
16:53:17 <cgoncalves> this change improves CI job build times significantly. please keep an eye open for random failures, it could be issues with nested virtualization and do let us know
16:54:04 <johnsom> Also a quick mention, there are a lot of open patches on the tempest plugin. It would be nice to get some reviews on those.
16:54:06 <johnsom> #link https://review.opendev.org/#/q/project:openstack/octavia-tempest-plugin+status:open
16:54:47 <cgoncalves> also, I've observed the CentOS 8 job taking as low as ~45 minutes to run so great improvement considering before it was taking 2h30 to TIMOUT
16:55:11 <johnsom> Really? That is great news! Maybe we can re-add a centos job
16:55:29 <cgoncalves> sure! here:
16:55:30 <cgoncalves> #link https://review.opendev.org/#/c/741147/
16:55:56 <cgoncalves> job history:
16:55:58 <cgoncalves> #link https://zuul.opendev.org/t/openstack/builds?job_name=octavia-v2-dsvm-scenario-centos-8
16:56:01 <johnsom> Nice, it even passes
16:56:06 <cgoncalves> 41 mins 39 secs !!
16:56:13 <rm_work> noice
16:56:24 <cgoncalves> lol to "it even passes"
16:56:49 <johnsom> Well, it's been a while since we had a working job. I was worried there would be some bit-rot
16:58:24 <johnsom> Any other topics this week?
16:59:19 <johnsom> Ok, thank you everyone! Have a good week.
16:59:23 <cgoncalves> thank you
16:59:28 <johnsom> #endmeeting