17:23:12 <flaviof> #startmeeting ovn-community-development-discussion
17:23:13 <openstack> Meeting started Thu May 21 17:23:12 2020 UTC and is due to finish in 60 minutes. The chair is flaviof. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:23:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:23:16 <openstack> The meeting name has been set to 'ovn_community_development_discussion'
17:23:27 <flaviof> well then, I know that much. ;)
17:23:33 <zhouhan> thx flaviof
17:23:39 <flaviof> anyone want to go first?
17:23:47 <zhouhan> I can go first
17:24:51 <zhouhan> I sent a fix for the dp_hash issue. imaximets: could you take a look: #link https://patchwork.ozlabs.org/project/openvswitch/patch/1589527067-91901-1-git-send-email-hzhou@ovn.org/
17:25:14 <zhouhan> I did some reviews, most for numans's I-P
17:25:40 <zhouhan> I had a question on _lore_'s patch for the SRC_IP_POLICY
17:25:57 <_lore_> zhouhan: sure
17:26:38 <zhouhan> by GW router, do you mean the non-distributed GW router, or a distributed router with a distributed gateway port?
17:26:57 <_lore_> 'distributed router with a distributed gateway port'
17:27:23 <_lore_> gw router port on a given chassis
17:27:39 <zhouhan> _lore_: ok, then the ARP is sent from which component to that router?
17:27:44 <_lore_> yes
17:27:52 <_lore_> for non-FIP case
17:28:04 <_lore_> this is what we want to avoid
17:28:36 <_lore_> for this reason we need to change reg1/eth.src after table=9,10,11
17:28:49 <zhouhan> Sorry, my question was: from which component is the ARP sent to the distributed router with DGP?
17:29:04 <_lore_> zhouhan: I did not get you
17:29:18 <zhouhan> _lore_: I am still not clear about the scenario, before going to the solution.
17:29:39 <zhouhan> _lore_: Let's discuss in the email offline :)
17:29:44 <zhouhan> That's my update
17:30:15 <_lore_> zhouhan: sure, but the scenario is this one:
17:30:45 <_lore_> the chassis where we have the FIP has a direct connection to the underlay network using a localnet port
17:31:03 <_lore_> so we want to send the ARP out to that port
17:31:16 <_lore_> agree?
17:32:06 <zhouhan> well, depends on the logical topology. I am not sure what's the source and destination, and the logical components connecting them.
17:32:27 <_lore_> local chassis has a direct connection to the ToR
17:32:32 <_lore_> switch
17:33:29 <_lore_> OpenStack want the possibility to send traffic directly avoid going through the tunnel since the chassis has a direct connection to the external world
17:33:50 <zhouhan> _lore_: yes, if there is just a single localnet logical switch connecting a VIF and the TOR, then it should work, without worrying about logical routers and distributed gateway ports. But I guess your scenario is more complex than that.
17:34:40 <_lore_> on the local chassis you mean?
17:34:49 <zhouhan> _lore_: I wasn't sure if it is for the typical k8s scenario or openstack. If it is for openstack, maybe I have some clue now.
17:35:18 <_lore_> I think k8s does not use FIP so far, just OpenStack
17:35:29 <_lore_> I guess
17:35:30 <_lore_> not sure
17:36:47 <zhouhan> _lore_: let's see if someone else wants to report. After that we can continue. Or discuss offline.
17:36:47 <_lore_> the goal is: if you have a FIP associated to a given logical switch port you want to send traffic directly and not going through the tunnel to the gw router
17:36:53 <_lore_> ack
17:37:07 * _lore_ is on mute
17:37:34 <zhouhan> so, anyone else?
17:38:00 <_lore_> it seems not :)
17:38:10 <zhouhan> ok, let's continue
17:38:11 <panda> I can go last
17:38:21 <zhouhan> ok, panda please go ahead
17:38:34 <panda> zhouhan: thanks.
17:38:45 <panda> mine is not an update, but a presentation, I'd like to start contributing to the project. I already studied the architecture and I'm now studying the code
17:39:02 <panda> I plan to propose a patch on the documentation with my list of tasks that helped me to start. But I'll have some questions for the mailing list.
17:39:23 <zhouhan> panda: welcome!
17:39:24 <panda> In the meantime I'm looking for low hanging fruit bugs or tasks to give a direction to my studies. If you have anything to propose I'd like to hear
17:39:43 <panda> _lore_ already helped me to bootstrap, and I have a long term task from him.
17:39:56 <panda> zhouhan: thanks :)
17:40:10 <flaviof> welcome panda!
17:40:18 <panda> flaviof: thanks!
17:40:25 <flaviof> I can go in next, really quick.
17:40:44 <flaviof> I have not been doing a lot on core OVN, but have been implementing a cool functionality in Openstack that is based on OVN.
17:41:03 <flaviof> It is called port forwarding. For folks who don't know, it uses OVN load balancers to carve out a single FIP into multiple internal VM based on proto+port.
17:41:18 <flaviof> Have POC running great and now moving onto integration tests.
17:41:18 <flaviof> #link https://review.opendev.org/#/c/723863/23/doc/source/ovn/port_forwarding.rst Port forwarding functionality from ML2/OVN
17:41:32 <flaviof> If any of you have interest on that or any other OVN related integration matters with Openstack, please do not be shy to say hi!
17:41:39 <flaviof> Including you, panda ! ;)
17:41:49 <flaviof> That is all from me.
17:42:05 <zhouhan> flaviof: very cool!
17:42:32 <flaviof> zhouhan: thanks. It mostly works because of people like you, so thanks to _you_!
17:43:21 <panda> flaviof: interesting :)
17:44:20 <flaviof> anyone got something he/she want to say here?
17:45:48 <_lore_> zhouhan: do you think we can proceed?
17:45:57 <zhouhan> _lore_: sure
17:46:01 <_lore_> :)
17:46:03 * panda will reserve the questions for the mailing list.
17:46:35 <zhouhan> _lore_: firstly, how does the normal IP traffic work?
17:46:54 <_lore_> non-FIP?
17:47:18 <zhouhan> _lore_: in the email you said IP traffic works as expected but just ARP doesn't work
17:47:20 <_lore_> e.g. for external world? going through the gw router
17:47:27 <_lore_> ah ok
17:47:48 <_lore_> normal FIP traffic is going through the localnet port
17:48:06 <_lore_> like IP using FIP as src IP
17:48:36 <zhouhan> logically, it is going through the LR, and SNAT is done by the LR, right?
17:48:52 <_lore_> yes, locally
17:49:07 <_lore_> s/locally/logically
17:49:45 <zhouhan> when the VM send the packets to external, the nexthop is the LR, and LR's next hop is the external GW (on the TOR)
17:50:00 <_lore_> yes
17:50:49 <zhouhan> now the ARP is for the LR's IP, why should it be sent out through localnet?
17:51:06 <zhouhan> or do you mean the ARP from LR to the TOR's IP?
17:51:56 <_lore_> nope, the ARP has src IP the FIP
17:52:04 <_lore_> not the LR external IP
17:52:29 <zhouhan> So you mean the ARP from LR to the TOR, right?
17:53:40 <_lore_> yes
17:54:03 <_lore_> let's say your VM is pinging 1.1.1.1
17:55:08 <_lore_> the external network from logical router to the external network is 172.16.0.0/24 and you have associated the FIP 172.16.0.100 to the VM
17:56:06 <_lore_> you want system sends an ARP req to the gw of the network using 172.16.0.100 as src IP and dnat_snat external mac as src mac
17:56:41 <_lore_> sending the ARP using the localnet port on the chassis
17:57:18 <_lore_> the scenario is a little bit tricky, I agree :)
17:57:32 <zhouhan> In the logical pipeline, the IP packet from VM should first hit the LR, which then triggers the ARP to the external GW IP. Now which packet is observed on the tunnel?
17:57:35 <flaviof> _lore_: if you don't mind also add the next hop (TOR's) mac address in your example.
17:57:47 <_lore_> flaviof: sure
17:58:04 <_lore_> zhouhan: this is the point
17:58:09 <_lore_> no packet on the tunnel
17:58:30 <_lore_> the local logical router pipeline manages the arp
17:58:51 <_lore_> without sending the packet to gw router
17:59:15 <_lore_> flaviof: let's say the next hop is 172.16.0.254
17:59:24 <zhouhan> _lore_: but you said the problem is some packets were seen on the tunnel, and the patch is to avoid that, right? My question is, which packet was on the tunnel? The IP packet? Or just ARP packets?
17:59:42 <_lore_> zhouhan: before the commit that introduce the issue you reported
18:00:07 <_lore_> with the patch I send this week no packets are sent to the tunnel
18:00:16 <zhouhan> _lore_: yes I am talking about the original patch, not the later one.
18:00:34 <_lore_> zhouhan: even with the original one no packets are sent to the tunnel
18:00:42 <zhouhan> _lore_: still trying to understand the original problem :)
18:00:57 <_lore_> you are right, I have not been so clear :)
18:01:05 <_lore_> the original case is:
18:01:48 <_lore_> in the scenario I described before the packet for 172.16.0.254 is sent to the gw router and the gw router is sending the ARP
18:01:51 <_lore_> right?
18:03:03 <_lore_> then when the ARP reply arrives the packets start flowing
18:03:47 <_lore_> this is the behaviour before the offending commit
18:04:43 <zhouhan> ok, do you mean when ARP reply arrives to the GW node, the IP packets start being sent through local chassis directly?
18:05:21 <_lore_> correct
18:05:47 <_lore_> this is the original FIP behaviour
18:06:19 <_lore_> with the offending commit or the last patch the ARP is sent by the local node and not by the GW
18:06:19 <zhouhan> So before the ARP is sent, which packet is sent through the tunnel to the GW node? The IP packet or the ARP packet?
18:06:47 <_lore_> the first IP packet that triggers the ARP
18:07:01 <zhouhan> ok, that's clear now. Thanks
18:07:01 <_lore_> just this packet
18:07:25 <zhouhan> And all these nodes are on same L2 (e.g. under the TOR), right?
18:07:30 <_lore_> yes
18:07:47 <_lore_> the issue that when the ARP arrives this first IP packet is re-injected but on the GW
18:08:20 <_lore_> while the second is sent by the local device so the ToR is confused
18:08:30 <_lore_> are we on the same page now?
18:08:43 <zhouhan> Yes, I think so.
18:08:46 <_lore_> ok, cool
18:08:48 <flaviof> +1 ;)
18:08:57 <_lore_> sorry to be not so clear
18:08:58 <zhouhan> So the actual problem is the reinjection that confuses TOR
18:09:00 <_lore_> anyway
18:09:09 <_lore_> yes
18:09:46 <zhouhan> If there is no reinjection (sacrifice the first packet), then there is no real problem, but can be optimized to avoid the tunnel for the first packet.
18:10:04 <_lore_> yes, I think so
18:10:10 <_lore_> but I am not 100% sure
18:10:33 <zhouhan> I see. Let me revisit your patch. Thanks for the explain!
18:10:57 <_lore_> sure, thank to you for be so patient :)
18:11:26 <_lore_> another possible solution could be add nat info to port_binding table
18:11:42 <_lore_> but the issue is we have no access to the db in pinctrl thread
18:12:53 <_lore_> so we came up adding a new stage to logical router pipeline in order to overwrite reg1/eth.src just for FIP
18:13:19 <_lore_> doing so we can manage even ARP and first IP packet locally
18:14:25 <_lore_> last this week I added the possibility to attach strace or perf to ovn-scale-test
18:14:25 <zhouhan> "doing so" you mean the current patch, right?
18:14:32 <_lore_> zhouhan: yes
18:15:06 <zhouhan> _lore_: cool. I will take look. Thanks!
18:15:16 <_lore_> zhouhan: basically in the last patch I reverted offending commit and added this new stage
18:16:08 <_lore_> I think now we are all the same page :)
18:16:52 <zhouhan> _lore_: yes, I think so.
18:17:07 <zhouhan> flaviof: are we still in the meeting?
18:17:21 <flaviof> yes. but we can end if you think we should
18:17:31 <flaviof> _lore_: that is the table called S_ROUTER_IN_IP_SRC_POLICY, right?
18:17:41 <_lore_> flaviof: right
18:17:55 <_lore_> maybe the name is not the best one :)
18:18:12 <flaviof> ack. just wanted to mention it here to have a quick way to search for it. This discussion is an integral part of it. ;)
18:18:23 <flaviof> good discussion. Thank you both for doing it here. Anything else to talk about or shall we call it a meeting?
18:19:43 <zhouhan> flaviof: I think maybe we are done
18:19:49 <_lore_> I guess si
18:19:51 <_lore_> *so
18:19:59 <flaviof> yeah. si ! ;)
18:20:04 <zhouhan> bye everyone :)
18:20:08 <flaviof> bye all
18:20:10 <_lore_> si == yes in Italian :)
18:20:16 <flaviof> <3
18:20:19 <flaviof> #endmeeting
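
Note on the FIP scenario discussed between 17:54 and 17:59: the addresses _lore_ gives (external network 172.16.0.0/24, FIP 172.16.0.100, next hop 172.16.0.254) can be put into a small Python model of the ARP request he wants the local chassis to emit. This is only an illustrative sketch, not OVN code: the `ArpRequest` class, the `build_fip_arp` helper and the MAC address (taken from the documentation range) are hypothetical, and the model simply records that the request uses the FIP as source IP and leaves through the localnet port instead of the tunnel.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class ArpRequest:
    """Hypothetical model of the ARP request described in the meeting."""
    src_ip: str     # the FIP, not the logical router's external IP
    src_mac: str    # the dnat_and_snat external MAC (placeholder value here)
    target_ip: str  # the external gateway (next hop on the ToR)
    egress: str     # "localnet" (desired) or "tunnel" (what should be avoided)


def build_fip_arp(fip: str, fip_mac: str, next_hop: str, external_net: str) -> ArpRequest:
    """Build the ARP request the local chassis should send for a FIP."""
    net = ip_network(external_net)
    # Both the FIP and the next hop must sit on the external network.
    for addr in (fip, next_hop):
        if ip_address(addr) not in net:
            raise ValueError(f"{addr} is not on {external_net}")
    return ArpRequest(src_ip=fip, src_mac=fip_mac, target_ip=next_hop, egress="localnet")


# Values from the discussion; the MAC is an assumed placeholder.
req = build_fip_arp("172.16.0.100", "00:00:5e:00:53:01", "172.16.0.254", "172.16.0.0/24")
print(req)
```

The point of the model is the pair (`src_ip`, `egress`): per the discussion, the ToR gets confused when the re-injected first packet leaves from the gateway chassis while later packets leave from the local chassis, so the ARP and the first IP packet are meant to use the same local path.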