14:00:33 <dulek> #startmeeting Kuryr
14:00:34 <openstack> Meeting started Mon Jul 29 14:00:33 2019 UTC and is due to finish in 60 minutes. The chair is dulek. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:37 <openstack> The meeting name has been set to 'kuryr'
14:01:06 <dulek> Well, it's been a few weeks since I last ran this. Sorry about that, it's been a crazy time with some deadlines for other projects.
14:01:56 <dulek> There are a few things that are going to change for Kuryr, so here they are.
14:03:27 <dulek> First of all, dmellado cannot be involved much in the project anymore, so ltomasbo and I are probably the go-to people here. I should probably figure out how to do a PTL change during the cycle to make sure all the formal things are ironed out.
14:04:09 <ltomasbo> o/
14:04:13 <aperevalov> o/
14:04:44 <dulek> Second of all, there's not a lot of activity in these meetings, so I'd probably vote to switch to the office hours model, where at this time each week we'll simply welcome any questions on #openstack-kuryr.
14:05:21 <dulek> Any thoughts on that?
14:05:50 <dulek> If there are none and we're here, I guess aperevalov can talk a bit about the improve-pod-launch-time blueprint?
14:05:52 <aperevalov> no problem, it's ok for us.
14:06:35 <aperevalov> yes, pod launch time is a problem for us. I guess it's not a problem for you, since you are using OVN.
14:07:44 <dulek> aperevalov: Well, not really, we're mostly running nested.
14:07:54 <dulek> In that case ports are immediately ACTIVE, so the time is cut.
14:08:23 <dulek> aperevalov: But I totally agree there's room for improvement.
14:08:39 <aperevalov> nested - meaning you are plugging ports into a Nova instance?
14:09:08 <dulek> aperevalov: Yup - trunk ports and subports.
14:09:12 <dulek> Pods get the subports.
14:09:33 <aperevalov> dulek, thanks
14:09:36 <aperevalov> there are several parts in that blueprint. The first part is about direct RPC (kuryr-controller to kuryr-daemon), which should probably be faster than going through k8s.
14:10:06 <aperevalov> and we keep storing state into k8s, for fallback.
14:10:15 <dulek> aperevalov: So we had some ideas about that in the past, I even crafted some code.
14:10:26 <dulek> aperevalov: First of all… Are you using port pools?
14:10:35 <aperevalov> yes,
14:11:12 <aperevalov> we measured it, in the bare metal case it's 2x faster
14:11:23 <dulek> aperevalov: Direct?
14:11:51 <aperevalov> yes, it was direct ports, but we are still waiting for Neutron's ACTIVE status.
14:12:40 <aperevalov> and the more ports we requested, the longer the waiting time
14:13:07 <dulek> aperevalov: Hm, okay, I don't think I'm immediately against direct communication.
14:13:26 <dulek> We would just need to make sure we're not abusing some K8s paradigms.
14:13:45 <aperevalov> it also depends on OpenStack controller performance
14:14:00 <dulek> aperevalov: But here's an idea we had a long while ago. I think it addresses all the issues you list in the blueprint.
14:14:13 <dulek> aperevalov: So basically we wanted to extend the port pools concept.
14:14:37 <aperevalov> do you have a blueprint for it?
14:14:41 <dulek> aperevalov: As we have the host in the pool key, pools are already "attached" to a kuryr-daemon.
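
[Editor's note: a rough sketch of the host-keyed pool idea dulek mentions above. Because the pool key contains the host, each kuryr-daemon can satisfy a pod's port request from "its" pool locally, with no controller round-trip. The key shape and names are assumptions for illustration, not Kuryr's actual code.]

    import collections

    pools = collections.defaultdict(list)  # pool key -> list of free Neutron ports

    def pool_key(host, project_id, security_groups):
        # Hedged guess at the key shape; the host being part of the key
        # is what ties a pool to one kuryr-daemon.
        return (host, project_id, tuple(sorted(security_groups)))

    def request_port(host, project_id, security_groups):
        key = pool_key(host, project_id, security_groups)
        if pools[key]:
            return pools[key].pop()  # port is ready; no waiting on the controller
        raise LookupError('pool empty, fall back to creating a port')
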
14:14:44 <dulek> aperevalov: Yes, just a sec…
14:15:02 <dulek> aperevalov: https://blueprints.launchpad.net/kuryr-kubernetes/+spec/daemon-pool-port-choice
14:15:16 <aperevalov> so, do you want to "pre-bind" ports?
14:15:28 <aperevalov> wait a minute, I'm reading
14:15:30 <dulek> Yup, that was part of the idea.
14:15:55 <dulek> aperevalov: So the idea is that kuryr-controller would create ports for the pools and create a KuryrPort CRD for each port.
14:16:52 <dulek> aperevalov: kuryr-daemon would watch for those and choose ports on its own. So if there are ports in the pool, the daemon won't wait for kuryr-controller at all.
14:17:37 <dulek> aperevalov: Then another improvement would be to "pre-bind" ports, so they become ACTIVE in Neutron even before pods get created. Then on the "real" binding only the interface gets moved to another netns.
14:19:35 <aperevalov> if I understand you correctly, all ports from such a pool will have ACTIVE status
14:20:38 <dulek> aperevalov: Yup, so we won't wait for that.
14:21:58 <dulek> aperevalov: Does this make sense to you? I had a POC, it's listed on the blueprint.
14:22:24 <dulek> While it was working okay, I didn't notice much performance improvement back then.
14:22:48 <dulek> But I probably made some mistakes somewhere in there.
14:23:19 <aperevalov> but technically, neutron-openvswitch-agent is the controller for ports in OVS: it's responsible for OpenFlow rules, and it sets those rules when/after we attach the tap into OVS. So you propose to create OVS ports (os_vif plug) in batch, before the real pod launch happens.
14:24:04 <dulek> aperevalov: Yep, that was the idea. It was attaching those ports to a fake network namespace.
14:24:19 <dulek> aperevalov: And on pod creation that port was moved to the pod's namespace.
14:25:35 <aperevalov> looks like it will work with SR-IOV too, but in non-pool mode we still have to wait for Neutron updates.
14:27:14 <dulek> aperevalov: Yes, that idea doesn't really make a lot of sense without pools.
14:27:48 <aperevalov> we tried to improve the wait mechanism, so please review https://review.opendev.org/#/c/669642/
14:29:25 <dulek> aperevalov: This seems to be broken for the containerized case?
14:30:03 <dulek> Hm, maybe just a coincidence.
14:30:18 <aperevalov> do you mean containerized kuryr? no, no, we checked it in the containerized case.
14:31:04 <aperevalov> the kuryr-kubernetes-tempest-containerized test passed
14:32:12 <dulek> aperevalov: I see. Okay, I'll take a look, but I'm not totally convinced, as it's a pretty unusual case to have access to RabbitMQ from where Kuryr runs.
14:32:55 <dulek> aperevalov: That would only happen in clouds you manage. If you wanted to run Kubernetes + Kuryr on any OpenStack public cloud, that would not work.
14:33:51 <aperevalov> ok, in that case kuryr will work as before, by request.
14:34:12 <dulek> aperevalov: I see it's falling back to that, sure.
14:34:36 <aperevalov> The fallback method should be invisible
14:35:46 <dulek> aperevalov: Yes, yes, I see.
14:36:32 <dulek> aperevalov: Okay, I'll take a look at that patch. Will you think about daemon-pool-port-choice? It seems a bit more Kubernetes-style than just allowing direct communication.
14:38:14 <aperevalov> it's a nice idea. Who will finish the implementation? We have resources for this...
14:39:04 <dulek> aperevalov: I'm pretty sure ltomasbo, Maysa and I are unable to work on that now.
14:39:35 <dulek> So if you think it would help your use case, I'd be super happy if you could grab it.
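
[Editor's note: a hedged sketch of the "pre-bind" trick discussed above, using pyroute2. An already-plugged interface is parked in a throwaway network namespace so neutron-openvswitch-agent wires it up and Neutron reports the port ACTIVE ahead of time; on pod creation only the netns move remains. HOLDING_NS and the function names are illustrative, not Kuryr's real code.]

    from pyroute2 import IPRoute, NetNS, netns

    HOLDING_NS = 'kuryr-prebind'  # hypothetical name for the fake namespace

    def park_port(ifname):
        # Move an interface that is already plugged into OVS on the host
        # into the holding namespace; the OVS agent then installs the
        # OpenFlow rules and the port goes ACTIVE before any pod exists.
        if HOLDING_NS not in netns.listnetns():
            netns.create(HOLDING_NS)
        with IPRoute() as ipr:
            idx = ipr.link_lookup(ifname=ifname)[0]
            ipr.link('set', index=idx, net_ns_fd=HOLDING_NS)

    def hand_over(ifname, pod_ns):
        # On the "real" binding, just move the pre-bound interface from
        # the holding namespace into the pod's namespace; no waiting.
        with NetNS(HOLDING_NS) as holding:
            idx = holding.link_lookup(ifname=ifname)[0]
            holding.link('set', index=idx, net_ns_fd=pod_ns)
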
14:41:15 <aperevalov> ok, this review https://review.opendev.org/#/c/527243 was about it, wasn't it?
14:42:55 <dulek> aperevalov: Yes, but it's probably super outdated.
14:43:24 <dulek> aperevalov: Back then we had some issues with CRD support, but that's definitely fixed by now.
14:43:41 <dulek> aperevalov: The K8s API wasn't working as it should when using CRDs.
14:49:36 <dulek> Okay, I guess I'll just close the meeting. Thanks all!
14:49:48 <aperevalov> Thanks!!!
14:50:20 <dulek> #endmeeting
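
[Editor's note: as a companion to the daemon-pool-port-choice discussion above, a minimal sketch of the daemon-side watch over KuryrPort CRDs using the kubernetes Python client. The CRD group/version/plural and the spec field names are assumptions for illustration, not necessarily the real definitions.]

    from kubernetes import client, config, watch

    config.load_incluster_config()
    api = client.CustomObjectsApi()
    node = 'worker-0'  # this daemon's host; illustrative value

    # Stream KuryrPort events and claim only ports pooled for this host
    # that are not yet bound to a pod, so no controller round-trip is
    # needed when the pool has free ports.
    for event in watch.Watch().stream(
            api.list_cluster_custom_object,
            group='openstack.org', version='v1', plural='kuryrports'):
        kp = event['object']
        spec = kp.get('spec', {})
        if spec.get('host') == node and not spec.get('podName'):
            print('free port available:', kp['metadata']['name'])
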