16:00:12 <johnsom> #startmeeting Octavia
16:00:13 <openstack> Meeting started Wed Jul 15 16:00:12 2020 UTC and is due to finish in 60 minutes.  The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:16 <openstack> The meeting name has been set to 'octavia'
16:00:20 <cgoncalves> hi
16:00:22 <johnsom> Hi everyone
16:00:23 <ataraday_> hi
16:00:28 <gthiemonge> hi
16:00:28 <aannuusshhkkaa> hi
16:00:38 <ZhuXiaoYu> hi
16:01:07 <rm_work> o/
16:01:13 <johnsom> #topic Announcements
16:01:27 <johnsom> OpenStack 10 years celebration July 16th 1500 UTC
16:01:35 <johnsom> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-July/015839.html
16:02:06 <johnsom> I wanted to make sure everyone knew the foundation is having a virtual celebration of ten years of OpenStack
16:02:24 <johnsom> That is the only announcement I have this week. Any others?
16:02:59 <johnsom> #topic Brief progress reports / bugs needing review
16:03:46 <johnsom> I have been deep into porting the failover flow updates to the amphora v2 driver. It's slow going, but I think I have the tasks all ported now. I have started on the amphora failover flow.
16:03:58 <ataraday_> I would like to point out that there are 8 amphorav2 changes that need to be reviewed and merged, listed in https://etherpad.openstack.org/p/octavia-priority-reviews
16:04:25 <ataraday_> And I will have a vacation next week
16:04:38 <ataraday_> will we make it by the end of July?
16:05:09 <johnsom> Nice, I will take a look at those.
16:05:42 <johnsom> I assume you are talking about the decision to make Amphora v2 the default at the MS2 milestone?
16:05:43 <cgoncalves> I've completed the failover flows backports to Ussuri and Train, now passing Python 2 CI jobs too. work on the jobboard on/off continued (thank you for the reviews thus far, ataraday_!)
16:06:07 <johnsom> I have added a topic later on the agenda to talk about that. I think we have some problems in the Amphora v2 driver we need to work on.
16:06:25 <ataraday_> johnsom, yeah, ok lets discuss later
16:07:35 <cgoncalves> I also posted small but important patches, and would appreciate reviews on those. I added them to the review priority list as well as cleaned it up by removing already merged patches
16:07:53 <cgoncalves> #link https://etherpad.opendev.org/p/octavia-priority-reviews
16:08:13 <johnsom> cgoncalves Thanks for working on the priority list!
16:08:23 <aannuusshhkkaa> We are working on changing amphora stats to report deltas instead of absolutes. We have written code for the amphora, still have to write code for the other side.
16:09:05 <johnsom> Nice
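The delta reporting mentioned above can be sketched as follows. This is only a minimal illustration, not the code under review: the function name `stats_delta`, the counter name `bytes_in`, and the rule that a counter going backwards is treated as a reset are all assumptions made for the example.

```python
def stats_delta(previous, current):
    """Return per-counter deltas between two absolute snapshots.

    Hypothetical helper, not Octavia code. A counter that is lower than
    in the previous snapshot is assumed to have been reset (for example
    after a process restart), so its current value is used as the delta.
    """
    delta = {}
    for name, value in current.items():
        prior = previous.get(name, 0)
        delta[name] = value - prior if value >= prior else value
    return delta


# Example: the amphora side keeps the last snapshot and sends only the change.
last_sent = {"bytes_in": 100}
now = {"bytes_in": 150}
print(stats_delta(last_sent, now))
```

With deltas, the receiving side adds each report to its stored total instead of overwriting it, which is the "other side" work mentioned in the update.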
16:09:41 <johnsom> #topic Queens and Rocky End Of Life
16:09:55 <johnsom> Someone added this topic to the agenda. Please run with it
16:10:07 <cgoncalves> o/
16:11:00 <cgoncalves> Adam proposed some time ago a patch to EOL Queens.
16:11:42 <cgoncalves> Adam Workflow-1'd because, per my understanding, we could still want to make a final release. I think we are past that time since it has entered extended maintenance.
16:12:09 <cgoncalves> if that is accurate, would we be good to officially EOL queens?
16:12:46 <johnsom> We can still merge the patches on Queens if we want even though it is in extended maintenance. It just will not have any more releases
16:13:17 <cgoncalves> there are a few open stable/queens patches. however, queens and rocky CI pipelines are broken due to devstack/wsgi/... issues
16:13:36 <johnsom> I am good with moving forward with the EOL. I don't think we have any reasons to not move forward.
16:13:48 <johnsom> Ugh, yeah, those are ... troublesome
16:14:34 <cgoncalves> there was a patch in the devstack project to fix the issues but I don't think that has made much progress since
16:15:44 <cgoncalves> so, Rocky is mostly also in the same boat: broken CI, in extended maintenance mode (no point releases, no CI commitments). should we EOL Rocky too?
16:16:57 <johnsom> I think we need to finish the EOL of Queens first, but I am also very open to EOL Rocky.
16:17:03 <rm_work> link it and i will un-minus-workflow
16:17:25 <johnsom> #link https://review.opendev.org/#/c/719099/
16:17:30 <cgoncalves> 2 sec. finding your patch
16:17:37 <cgoncalves> #link https://review.opendev.org/#/c/719099/
16:18:01 <rm_work> So, no more release, go ahead with queens EOL?
16:18:05 <rm_work> sounds good to me
16:18:41 <johnsom> Should we just abandon the open patches?
16:19:04 <johnsom> I guess if devstack and the gates are broken they aren't going to be able to merge any time soon
16:19:25 <cgoncalves> +1 to abandon
16:19:53 <johnsom> Ok, maybe we wait for the EOL tag to land, then abandon these.
16:21:45 <johnsom> Does someone want to propose EOL of Rocky? Are we ready for that?
16:22:46 <johnsom> I think most of the distributions are using Queens as LTS and not Rocky, so maybe that is ok
16:22:53 <cgoncalves> I can propose if you would like. I think coming from the PTL or stable liaison would be best
16:23:38 <johnsom> Yeah, I agree. I expect Rocky will have some commentary... grin
16:25:05 <cgoncalves> the fact that devstack Rocky has been broken for most, if not everyone, for weeks is already half-way to having it accepted
16:25:16 <johnsom> Not a lot of comments here. I guess go ahead and propose it and see if anyone comments on the proposal.
16:25:49 <cgoncalves> OK. I'll do it asap
16:25:53 <johnsom> Yeah, I had the same comment about Queens. Both of the recent CVEs in other projects didn't bother to backport to queens.
16:26:23 <johnsom> Ok, thank you cgoncalves
16:26:39 <johnsom> #topic Active/Active specification
16:26:48 <johnsom> #link https://review.opendev.org/#/c/723864
16:27:06 <johnsom> I want to highlight we have an open specification that has had a recent update.
16:27:35 <johnsom> Also ZhuXiaoYu has joined the meeting today if you have any questions/comments on the specification
16:28:34 <johnsom> I have not yet had a chance to review the update. How are the proposals for changes to neutron going?
16:28:43 <ZhuXiaoYu> yes, please speak to me if you have any advice on this
16:29:19 <ZhuXiaoYu> I removed the same-subnet limit
16:30:03 <ZhuXiaoYu> and continue using existing update router api
16:30:57 <ZhuXiaoYu> and about the FIP, sorry but it is necessary
16:31:14 <johnsom> #link https://review.opendev.org/#/c/729532
16:31:21 <johnsom> That is the ECMP in neutron patch
16:31:55 <johnsom> Hmm, yeah, if FIPs are necessary, that might be an issue. We would at a minimum need to document somehow.
16:32:15 <ZhuXiaoYu> yes, our neutron team is planning to use HMARK in iptables instead of normal routes
16:33:15 <ZhuXiaoYu> this could avoid redistributing flows when an amphora is removed
16:34:19 <johnsom> I have put down a task of re-reviewing the comments/specification. I hope others in the Octavia community can also comment/review.
16:35:14 <johnsom> ZhuXiaoYu Are there any questions you have for the Octavia team we can answer now? Or just the items in the comments of the specification?
16:37:45 <ZhuXiaoYu> In fact, we're more concerned about how to get it passed.
16:38:18 <johnsom> Ok, so what you need is reviews. Thanks!
16:38:35 <johnsom> #topic State of Amphora v2 driver
16:38:48 <ataraday_> I would like to mention that the first 3 changes on the list are actually important fixes for amphorav2 issues and should be backported.
16:39:27 <johnsom> So this is my topic. As I have been working on the failover port, I have found some issues I think we need to work on.
16:39:38 <johnsom> I have created an etherpad to track:
16:39:43 <johnsom> #link https://etherpad.opendev.org/p/octavia-amphora-v2
16:40:14 <johnsom> Most are around the fact that when a flow fails and is re-dispatched, it's not bounded in any way.
16:40:58 <johnsom> So, for example, I have a code issue (expected during the work I'm doing) that causes a port data model to not serialize correctly.
16:41:26 <johnsom> This caused the controller to go into an endless loop trying to re-dispatch the flow to another worker.
16:41:53 <johnsom> The problem is that the revert isn't triggering, so each attempt creates a new port in neutron
16:42:20 <johnsom> In a matter of seconds I have 255 bogus ports and no more IP addresses left in the subnet.
16:43:51 <johnsom> So, this is a big concern for me with the v2 driver. I think we need to look into how to make sure it doesn't run away and consume all of the cloud resources.
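One way to picture the bounding discussed above is the sketch below. It is an illustration under stated assumptions, not the amphora v2 driver or the TaskFlow API: `dispatch_bounded`, `create_port`, `finish_flow`, `delete_port`, and the default of three attempts are all hypothetical names and values chosen for the example.

```python
def dispatch_bounded(create_port, finish_flow, delete_port, max_attempts=3):
    """Run a two-step flow at most max_attempts times.

    Hypothetical sketch. Each failed attempt deletes the port it created
    before retrying, so no ports are left behind, and the loop gives up
    after max_attempts instead of re-dispatching forever.
    """
    last_error = None
    for _ in range(max_attempts):
        port = create_port()
        try:
            return finish_flow(port)
        except Exception as exc:  # sketch only; real code would catch narrower errors
            delete_port(port)  # the "revert" step for this attempt
            last_error = exc
    raise RuntimeError(f"flow failed after {max_attempts} attempts") from last_error


# Example: a flow whose second step always fails, as in the serialization bug.
ports = []
created = []


def create_port():
    port_id = len(created) + 1
    created.append(port_id)
    ports.append(port_id)
    return port_id


def finish_flow(port):
    raise ValueError("port data model did not serialize")


try:
    dispatch_bounded(create_port, finish_flow, ports.remove)
except RuntimeError as err:
    print(err, "| ports created:", len(created), "| ports left:", len(ports))
```

The two properties that matter here are the attempt cap and the per-attempt cleanup; either one alone would not prevent the subnet from being exhausted or the loop from running indefinitely.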
16:44:58 <ataraday_> Could you put the repro steps for this on the etherpad?
16:45:37 <ataraday_> or a story
16:45:47 <johnsom> Yes, I can. Granted they will be "artificial" steps, but I think some cloud outage scenarios will also trigger this behavior.
16:46:10 <johnsom> I will do this after the meeting.
16:46:26 <ataraday_> johnsom, thanks!
16:47:18 <johnsom> ataraday_ Thank you for the comment about the progress update notification. I also agree I need to look at that.
16:48:19 <johnsom> Let's work on these issues. I hope we can have some initial comments/ideas in a week or so. That way we can make a good decision on the "v2 default" at MS2
16:48:53 <johnsom> If you find any other concerns, please feel free to update the etherpad.
16:49:42 <johnsom> Any other discussion on the V2 amphora driver?
16:51:26 <johnsom> Ok. More work to do, but I don't think it's a show stopper yet.
16:51:29 <johnsom> #topic Open Discussion
16:51:37 <johnsom> Any other topics this week?
16:52:22 <cgoncalves> just an FYI that all our scenario CI jobs run on nested-virt nodepool instances with KVM rather than TCG
16:52:36 <johnsom> Nice!
16:52:44 <johnsom> That should help a lot.
16:53:00 <johnsom> I just hope we don't find many broken hypervisors.
16:53:17 <cgoncalves> this change improves CI job build times significantly. please keep an eye open for random failures, which could be issues with nested virtualization, and do let us know
16:54:04 <johnsom> Also a quick mention, there are a lot of open patches on the tempest plugin. It would be nice to get some reviews on those.
16:54:06 <johnsom> #link https://review.opendev.org/#/q/project:openstack/octavia-tempest-plugin+status:open
16:54:47 <cgoncalves> also, I've observed the CentOS 8 job taking as little as ~45 minutes to run, a great improvement considering it previously took 2h30 to TIMEOUT
16:55:11 <johnsom> Really? That is great news! Maybe we can re-add a centos job
16:55:29 <cgoncalves> sure! here:
16:55:30 <cgoncalves> #link https://review.opendev.org/#/c/741147/
16:55:56 <cgoncalves> job history:
16:55:58 <cgoncalves> #link https://zuul.opendev.org/t/openstack/builds?job_name=octavia-v2-dsvm-scenario-centos-8
16:56:01 <johnsom> Nice, it even passes
16:56:06 <cgoncalves> 41 mins 39 secs !!
16:56:13 <rm_work> noice
16:56:24 <cgoncalves> lol to "it even passes"
16:56:49 <johnsom> Well, it's been a while since we had a working job. I was worried there would be some bit-rot
16:58:24 <johnsom> Any other topics this week?
16:59:19 <johnsom> Ok, thank you everyone! Have a good week.
16:59:23 <cgoncalves> thank you
16:59:28 <johnsom> #endmeeting