14:00:33 <dulek> #startmeeting Kuryr
14:00:34 <openstack> Meeting started Mon Jul 29 14:00:33 2019 UTC and is due to finish in 60 minutes.  The chair is dulek. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:37 <openstack> The meeting name has been set to 'kuryr'
14:01:06 <dulek> Well, it's been a few weeks since I ran this. Sorry about that, crazy time with some deadlines for other projects.
14:01:56 <dulek> There are a few things that are going to change for Kuryr, so here they are.
14:03:27 <dulek> First of all dmellado cannot be involved too much in the project, so ltomasbo and I are probably the go-to people here. I should probably figure out how to do a PTL change during the cycle to make sure all the formal things are ironed out.
14:04:09 <ltomasbo> o/
14:04:13 <aperevalov> o/
14:04:44 <dulek> Second of all, there's not a lot of activity in these meetings, so I'd probably vote to switch to the office hours model, where at this time each week we'll simply welcome any questions on #openstack-kuryr.
14:05:21 <dulek> Any thoughts on that?
14:05:50 <dulek> If there are none and we're here, I guess aperevalov can talk a bit about the improve-pod-launch-time blueprint?
14:05:52 <aperevalov> no problem, it's ok for us.
14:06:35 <aperevalov> yes, it's a problem for us. I guess it's not a problem for you, since you are using OVN.
14:07:44 <dulek> aperevalov: Well, not really, we're mostly running nested.
14:07:54 <dulek> In that case ports are immediately ACTIVE, so the time is cut.
14:08:23 <dulek> aperevalov: But I totally agree there's room for improvement.
14:08:39 <aperevalov> nested - meaning you are plugging the port into a Nova instance?
14:09:08 <dulek> aperevalov: Yup - trunk ports and subports.
14:09:12 <dulek> Pods get the subports.
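For context, the nested pattern dulek describes maps to Neutron's trunk API: the worker VM's port acts as a trunk parent and each pod gets a VLAN subport, so nothing needs plugging on the hypervisor and the port goes ACTIVE almost immediately. A minimal sketch with openstacksdk (the cloud name, network and trunk names, and the VLAN ID are made up; Kuryr's real logic lives in its nested VIF drivers):

```python
import openstack

conn = openstack.connect(cloud='devstack')  # hypothetical clouds.yaml entry

# The VM hosting the pods owns a parent port that acts as a trunk.
trunk = conn.network.find_trunk('k8s-worker-trunk')

# Create a child port for the pod on the pod network...
pod_net = conn.network.find_network('pod-net')
port = conn.network.create_port(network_id=pod_net.id)

# ...and attach it as a VLAN subport. Pod traffic reaches the VM tagged
# with the segmentation ID, so no new VIF is plugged on the hypervisor.
conn.network.add_trunk_subports(
    trunk,
    [{'port_id': port.id,
      'segmentation_type': 'vlan',
      'segmentation_id': 101}],
)
```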
14:09:33 <aperevalov> dulek, thanks
14:09:36 <aperevalov> there are several parts in that blueprint. The first part is about direct RPC (kuryr-controller to kuryr-daemon), which should probably be faster than going through k8s.
14:10:06 <aperevalov> and keep storing state in k8s, as a fallback.
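To illustrate the direct-RPC part of the blueprint: instead of round-tripping VIF data through K8s annotations, kuryr-controller would notify the daemon on the pod's node directly, writing to K8s only for the fallback state. Everything below (endpoint, port, payload shape) is hypothetical; the real kuryr-daemon only serves CNI requests locally:

```python
import requests

def notify_daemon(node_ip, pod_uid, vif):
    """Hypothetical direct call from kuryr-controller to the kuryr-daemon
    on the pod's node, bypassing the K8s API on the hot path."""
    resp = requests.post(
        f'http://{node_ip}:5036/vif',           # made-up endpoint
        json={'pod_uid': pod_uid, 'vif': vif},  # made-up payload
        timeout=5,
    )
    resp.raise_for_status()
    # The VIF would still be persisted to the K8s API afterwards, so a
    # restarted daemon can fall back to reading the state from there.
```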
14:10:15 <dulek> aperevalov: So we had some ideas about that in the past; I even crafted some code.
14:10:26 <dulek> aperevalov: First of all… Are you using port pools?
14:10:35 <aperevalov> yes,
14:11:12 <aperevalov> we measured it; in the bare metal case it's 2x faster
14:11:23 <dulek> aperevalov: Direct?
14:11:51 <aperevalov> yes, those were direct ports, but we're still waiting for Neutron to report the ACTIVE status.
14:12:40 <aperevalov> and the more ports we requested, the longer the waiting time.
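That scaling is easy to see from the shape of the wait: it's a per-port poll of the Neutron API until each port reports ACTIVE, roughly like this openstacksdk sketch (timeout and interval values are arbitrary):

```python
import time

def wait_for_active(conn, port_ids, timeout=60, interval=1):
    """Poll Neutron until every port reports ACTIVE. With N ports and a
    busy Neutron server, total wall-clock time grows with N."""
    deadline = time.monotonic() + timeout
    pending = set(port_ids)
    while pending:
        if time.monotonic() > deadline:
            raise TimeoutError('ports still not ACTIVE: %s' % pending)
        for port_id in list(pending):
            if conn.network.get_port(port_id).status == 'ACTIVE':
                pending.discard(port_id)
        time.sleep(interval)

# Usage, given an openstacksdk connection:
#   wait_for_active(conn, [p.id for p in requested_ports])
```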
14:13:07 <dulek> aperevalov: Hm, okay, I don't think I'm immediately against direct communication.
14:13:26 <dulek> We would just need to make sure we're not abusing some K8s paradigms.
14:13:45 <aperevalov> it also depends on the OpenStack controller's performance
14:14:00 <dulek> aperevalov: But here's an idea we had a long while ago. I think it addresses all the issues you list in the blueprint.
14:14:13 <dulek> aperevalov: So basically we wanted to extend the port pools concept.
14:14:37 <aperevalov> do you have a blueprint for it?
14:14:41 <dulek> aperevalov: As we have the host in the pool key, pools are already "attached" to a kuryr-daemon.
14:14:44 <dulek> aperevalov: Yes, just a sec…
14:15:02 <dulek> aperevalov: https://blueprints.launchpad.net/kuryr-kubernetes/+spec/daemon-pool-port-choice
14:15:16 <aperevalov> so, do you want to "pre-bind" ports?
14:15:28 <aperevalov> wait a minute, I'm reading
14:15:30 <dulek> Yup, that was part of the idea.
14:15:55 <dulek> aperevalov: So the idea is that kuryr-controller would create ports for the pools and create a KuryrPort CRD object for each port.
14:16:52 <dulek> aperevalov: kuryr-daemon would watch for those and choose the ports on its own. So if there are ports in the pool, the daemon won't wait for kuryr-controller at all.
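A rough sketch of that daemon-side watch, using the official kubernetes Python client (the openstack.org/v1 group/version, the kuryrports plural, and the spec fields are assumptions - the CRD didn't exist yet at this point, and kuryr-kubernetes actually talks to the API server through its own small REST client):

```python
from kubernetes import client, config, watch

config.load_incluster_config()  # kuryr-daemon runs on the node, in-cluster
api = client.CustomObjectsApi()

# Watch KuryrPort objects that kuryr-controller creates for the pools.
for event in watch.Watch().stream(
        api.list_cluster_custom_object,
        group='openstack.org', version='v1', plural='kuryrports'):
    kuryrport = event['object']
    if event['type'] == 'ADDED' and not kuryrport['spec'].get('podUid'):
        # A free port in this node's pool: the daemon can claim it and
        # wire the pod right away, with no round-trip to the controller.
        pass  # claim it by patching spec.podUid, then plug the VIF
```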
14:17:37 <dulek> aperevalov: Then another improvement would be to "pre-bind" ports, so they become ACTIVE in Neutron even before pods get created. On the "real" binding, only the interface gets moved to another netns.
14:19:35 <aperevalov> if I understand you correctly, all ports from such a pool will have ACTIVE status
14:20:38 <dulek> aperevalov: Yup, so we won't wait for that.
14:21:58 <dulek> aperevalov: Does this make sense to you? I had a POC; it's listed on the blueprint.
14:22:24 <dulek> While it was working okay, I didn't notice much performance improvement back then.
14:22:48 <dulek> But probably I made some mistakes somewhere in there.
14:23:19 <aperevalov> but technically, neutron-openvswitch-agent is the controller for ports in OVS - it's responsible for the OpenFlow rules, and it sets those rules after we attach the tap into OVS. So you propose to create the OVS ports (os-vif plug) in batch, before the real pod launch happens.
14:24:04 <dulek> aperevalov: Yep, that was the idea. It was attaching those ports to a fake network namespace.
14:24:19 <dulek> aperevalov: And on pod creation that port was moved to pod namespace.
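Pictured with pyroute2 (interface and namespace names are made up; the real binding logic is in kuryr-kubernetes' CNI/os-vif drivers):

```python
from pyroute2 import IPRoute, NetNS, netns

# Pool-population time: park the freshly plugged interface in a
# placeholder namespace, so the OVS side is fully wired up and Neutron
# can flip the port to ACTIVE before any pod exists.
netns.create('kuryr-prebind')
with IPRoute() as ipr:
    idx = ipr.link_lookup(ifname='tap-pool-1')[0]
    ipr.link('set', index=idx, net_ns_fd='kuryr-prebind')

# Pod-creation time: the only remaining work is moving the interface
# into the pod's namespace (named here hypothetically) and configuring it.
with NetNS('kuryr-prebind') as ns:
    idx = ns.link_lookup(ifname='tap-pool-1')[0]
    ns.link('set', index=idx, net_ns_fd='pod-xyz-netns')
```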
14:25:35 <aperevalov> looks like it will work with SR-IOV too, but in non-pool mode we still have to wait for Neutron updates.
14:27:14 <dulek> aperevalov: Yes, that idea doesn't really make a lot of sense without pools.
14:27:48 <aperevalov> we tried to improve the wait mechanism, so please review https://review.opendev.org/#/c/669642/
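For reference, the general shape of that change - reacting to Neutron's port.update.end notifications on the message bus instead of polling the REST API - could look like the following oslo.messaging sketch; the transport URL is deployment-specific and this is not the actual patch:

```python
import oslo_messaging
from oslo_config import cfg

class PortStatusEndpoint(object):
    def info(self, ctxt, publisher_id, event_type, payload, metadata):
        # Neutron emits port.update.end whenever a port changes; react as
        # soon as the status flips to ACTIVE instead of polling for it.
        if event_type == 'port.update.end':
            port = payload.get('port', {})
            if port.get('status') == 'ACTIVE':
                print('port %s is ACTIVE' % port['id'])

transport = oslo_messaging.get_notification_transport(
    cfg.CONF, url='rabbit://guest:guest@controller:5672/')
listener = oslo_messaging.get_notification_listener(
    transport,
    [oslo_messaging.Target(topic='notifications')],
    [PortStatusEndpoint()],
    executor='threading',
    pool='kuryr-port-watcher',  # share one queue across controller replicas
)
listener.start()
listener.wait()
```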
14:29:25 <dulek> aperevalov: This seems to be broken for the containerized case?
14:30:03 <dulek> Hm, maybe just a coincidence.
14:30:18 <aperevalov> do you mean containerized kuryr? no, no, we checked it in the containerized case.
14:31:04 <aperevalov> the kuryr-kubernetes-tempest-containerized test passed
14:32:12 <dulek> aperevalov: I see. Okay, I'll take a look, but I'm not totally convinced, as it's a pretty unusual case to have access to RabbitMQ from where Kuryr runs.
14:32:55 <dulek> aperevalov: That would only happen in clouds you manage. If you wanted to run Kubernetes + Kuryr on any OpenStack public cloud, that would not work.
14:33:51 <aperevalov> ok, in that case kuryr will work as before, by request.
14:34:12 <dulek> aperevalov: I see it's falling back to that, sure.
14:34:36 <aperevalov> The fallback should be invisible.
14:35:46 <dulek> aperevalov: Yes, yes, I see.
14:36:32 <dulek> aperevalov: Okay, I'll take a look at that patch. Will you think about daemon-pool-port-choice? It seems a bit more Kubernetes-style than just allowing direct communication.
14:38:14 <aperevalov> it's a nice idea - who will finish the implementation? We have resources for this...
14:39:04 <dulek> aperevalov: I'm pretty sure me, ltomasbo and Maysa are unable to work on that now.
14:39:35 <dulek> So if you think it would help your use case, I'd be super happy if you could grab it.
14:41:15 <aperevalov> ok, this review https://review.opendev.org/#/c/527243 was about it, wasn't it?
14:42:55 <dulek> aperevalov: Yes, but it's probably super outdated.
14:43:24 <dulek> aperevalov: Back then we had some issues with CRD support, but that's definitely fixed by now.
14:43:41 <dulek> aperevalov: The K8s API wasn't working as it should when using CRDs.
14:49:36 <dulek> Okay, I guess I'll just close the meeting. Thanks all!
14:49:48 <aperevalov> Thanks!!!
14:50:20 <dulek> #endmeeting