16:00:32 <gthiemonge> #startmeeting Octavia
16:00:32 <opendevmeet> Meeting started Wed Jul 20 16:00:32 2022 UTC and is due to finish in 60 minutes.  The chair is gthiemonge. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:32 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:32 <opendevmeet> The meeting name has been set to 'octavia'
16:00:35 <gthiemonge> Hi Folks
16:00:41 <johnsom> o/
16:00:50 <tweining> o/
16:01:55 <gthiemonge> #topic Announcements
16:02:02 <gthiemonge> ** PTL on vacation
16:02:16 <gthiemonge> I'm on PTO for the next 2 weeks
16:02:24 <tweining> I hope you'll find a cool spot ;)
16:02:38 <gthiemonge> I propose that we cancel the weekly meetings until I'm back
16:02:43 <gthiemonge> tweining: ;)
16:02:49 <johnsom> Enjoy the time away!
16:02:54 <oschwart> Enjoy gthiemonge!
16:03:03 <gthiemonge> if you're ok with that, the next meeting will be on Aug 10th
16:03:09 <tweining> ack
16:03:11 <johnsom> Yeah, things have been quiet, so a break from the weekly meetings is probably fine.
16:03:11 <gthiemonge> thanks
16:03:32 <gthiemonge> it gives you time for reviewing my patches
16:03:32 <oschwart> I am fine with Aug 10th
16:05:46 <gthiemonge> #topic CI Status
16:06:09 <gthiemonge> we had an issue this week with a new release of pyroute2
16:06:22 <gthiemonge> this release (0.7.1) has a bug in the ip "rule" module
16:06:48 <gthiemonge> it adds invalid rules (no error when adding a rule, but it doesn't work)
16:06:51 <johnsom> Yeah, that release broke neutron too. I hope the new 0.7.2 is in better shape
16:07:05 <gthiemonge> so it breaks the network connectivity in the amphora (unresponsive VIP)
16:07:25 <gthiemonge> yeah the bug is fixed, and they released 0.7.2
16:07:38 <gthiemonge> there's a patch to bump pyroute2
16:07:43 <gthiemonge> https://review.opendev.org/c/openstack/requirements/+/850301
16:07:57 <gthiemonge> so please keep an eye on the CI results in case there's another issue with it
16:08:36 <tweining> ok
16:08:59 <gthiemonge> in neutron, it was easy to spot where the error was (they got an exception) but in the amphora, I only saw a weird output when running "ip rule show" in the haproxy ns
16:11:23 <johnsom> Yeah, silent failures like that are horrible. Good catch on tracking it down!
16:13:03 <gthiemonge> #topic Brief progress reports / bugs needing review
16:13:18 <gthiemonge> well I was busy with this pyroute2 issue...
16:13:47 <tweining> I could give a short update from the cpu pinning front...
16:13:55 <tweining> https://review.opendev.org/q/topic:amp-cpu-pinning+-status:abandoned
16:14:50 <tweining> I didn't update the proposal anymore. it was pretty clear to me what to do, so I started the implementation already
16:15:37 <gthiemonge> I hope I will be able to test it soon!
16:15:40 <tweining> I extended the amphora API a bit and now octavia can set the cpumap setting in the haproxy config automatically
16:16:10 <tweining> in other words: the HAProxy part works
16:16:14 <johnsom> How is that going? I haven't had time to look at it recently
16:16:47 <tweining> I also worked on implementing the element recently
16:17:34 <tweining> it uninstalls irqbalance and installs tuned and tuna. I created a new "amphora" tuned profile based on the cpu-partitioning profile from Red Hat
16:18:14 <tweining> I thought I would need to get the total number of vCPUs in the amp in order to configure it, but today it turned out that I don't need it.
16:18:51 <tweining> so, all the IRQs and processes that are movable are pinned to vCPU0 now
16:19:38 <johnsom> NIC interrupts as well?
16:19:49 <tweining> but since HAProxy does its own pinning it uses the other vCPUs
16:21:30 <tweining> johnsom: https://paste.opendev.org/show/bWg97HanXFatbg4nkd6H/
16:22:06 <tweining> but that is an older version. I think it's only 3 IRQs now that couldn't get moved.
16:22:53 <johnsom> virtio3 is the lb-mgmt-net NIC?
16:23:11 <johnsom> I hate that about the virtio drivers, they aren't very descriptive.
16:23:35 <tweining> that should be the up-to-date list https://paste.opendev.org/show/b1k7qw4rNHW8KAYKEDiz/
16:24:00 <gthiemonge> we need to check what happens when we add a new NIC
16:24:10 <tweining> I didn't figure out how these interrupts are mapped
16:24:13 <tweining> yet
16:24:23 <johnsom> Ack
16:25:03 <johnsom> Yeah, it's the VIP/member NICs that are the important part for mapping the interrupts. lb-mgmt-net is low volume
16:27:29 <gthiemonge> hey, as time flies, it would be great if you could review/test this patch chain:
16:27:36 <gthiemonge> #link https://review.opendev.org/c/openstack/octavia/+/660239/
16:27:49 <tweining> wrt processes I was able to pin all by pinning systemd, which is quite convenient
16:27:49 <gthiemonge> (it contains the multi-VIP support and the fix for plugging many subnets from the same network on the member ports)
16:27:54 <gthiemonge> (huge patches)
16:28:37 <johnsom> It looks like the first patch in the chain is failing unit tests: https://review.opendev.org/c/openstack/octavia/+/812368/12
16:28:49 <johnsom> Well, functional actually
16:29:08 <gthiemonge> grmpf
16:29:28 <gthiemonge> yeah this is the mock for pyroute2 that I need to apply in my patch
16:29:51 <gthiemonge> so the tests fail because of the tests :D
16:31:56 <gthiemonge> #topic Open Discussion
16:33:19 <gthiemonge> anything else?
16:34:12 <johnsom> Nothing from me this week
16:34:14 <tweining> one note
16:34:24 <tweining> about irq pinning
16:34:29 <oschwart> Nothing from me
16:35:16 <tweining> I do that using a kernel cmdline parameter (irqaffinity AFAIR) so I would expect that it will apply to all new IRQs as well
16:35:32 <tweining> but that needs to be verified
16:35:50 <tweining> nothing else from me
16:36:18 <gthiemonge> ok
16:36:22 <gthiemonge> thank you folks!
16:36:25 <gthiemonge> #endmeeting
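
The open question from the log (whether IRQs created later, e.g. after plugging a new NIC, also end up pinned to vCPU0) could be checked inside an amphora with something like the sketch below. It is only an illustration and not part of the patches discussed: the function names are hypothetical, it assumes the standard Linux /proc/irq/<N>/smp_affinity_list files, and it only understands simple CPU lists such as "0", "0-3" or "0,2-3".

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a simple kernel CPU list ("0", "0-3", "0,2-3") into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def irqs_not_pinned_to(allowed, irq_root="/proc/irq"):
    """Return {irq_number: cpus} for every IRQ whose affinity leaves `allowed`.

    `irq_root` is a parameter only so the function can be pointed at a
    test directory; on a real system the default is the right location.
    """
    stray = {}
    for entry in sorted(Path(irq_root).glob("[0-9]*/smp_affinity_list")):
        try:
            cpus = parse_cpu_list(entry.read_text())
        except (OSError, ValueError):
            # Unreadable or unexpected content: skip rather than guess.
            continue
        if not cpus <= allowed:
            stray[int(entry.parent.name)] = cpus
    return stray


if __name__ == "__main__":
    # Assumption from the log: movable IRQs are expected on vCPU0 only.
    for irq, cpus in irqs_not_pinned_to({0}).items():
        print(f"IRQ {irq} may run on CPUs {sorted(cpus)}")
```

Running it before and after attaching a NIC and comparing the output would show whether the new interrupts picked up the same affinity. Some IRQs cannot be moved at all (the log mentions about 3), so a non-empty result is not necessarily a failure.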