#openvswitch log

17:21:22 <mmichelson> #startmeeting ovn_community_development_meeting
17:21:33 <openstack> Meeting started Thu Apr 15 17:21:22 2021 UTC and is due to finish in 60 minutes.  The chair is mmichelson. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:21:33 <imaximets> hi
17:21:33 <mmichelson> uh, openstack?
17:21:34 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:21:36 <openstack> The meeting name has been set to 'ovn_community_development_meeting'
17:21:37 <mmichelson> ah there we go
17:26:17 <mmichelson> OK, so, I finally have a new patch series up for the floating IP issue that I started working on a long time ago
17:26:49 <mmichelson> It's gone from a 2 patch series to a 5 patch series in an attempt to both fix the issue and to make things more efficient by no longer requiring ARPs to be sent in cases where they shouldn't be necessary
17:27:09 <mmichelson> We also released new versions of OVN 20.03 and 20.06 for Ubuntu's purposes.
17:27:28 <numans> Hello
17:27:43 <mmichelson> I was hoping blp would be here today so I could ask about the bug report I sent for ovn-northd-ddlog, but I guess I'll just need to bump the email I sent.
17:28:04 <mmichelson> Um...I think that's it for me.
17:28:53 <numans> I can go real fast
17:29:10 <numans> I've submitted a couple of patches for review related to conntrack improvement and on usage of ct.inv.
17:29:20 <numans> zhouhan_hzhou8, ^ appreciate if you can take a look.
17:29:38 <numans> I also addressed zhouhan_hzhou8's comments and submitted another version of physical flow split patch.
17:29:47 <numans> I did some code reviews.
17:29:52 <numans> That's it from me.
17:30:58 <imaximets> I have a small update too.
17:31:25 <imaximets> I finished and posted v2 of stream record/replay functionality with integration to ovsdb-server.
17:31:31 <imaximets> #link https://patchwork.ozlabs.org/project/openvswitch/list/?series=238830
17:32:27 <imaximets> Once this is accepted into OVS, we will be able to integrate it into ovn daemons, i.e. northd or ovn-controller, for debugging and performance testing purposes.
17:33:43 <imaximets> Dumitru and zhouhan_hzhou8 reviewed bundles support for ofctrl.  So, I guess, the patch is good to go now. :)
17:33:53 <imaximets> that's it from my side.
17:33:55 <_lore_> can I go next? quite fast
17:34:32 <_lore_> this week I worked on skip_force_snat patch, thx mark for the review
17:34:53 <_lore_> then I started working on 2 items and I would like to have your opinion:
17:35:55 <_lore_> 1- I noticed whenever we wake ovn-controller main thread from pinctrl we run all the handlers even if just one of them has useful work to do
17:36:19 <_lore_> is it worth running just the related handler in pinctrl_run()?
17:37:12 <_lore_> any opinions?
17:38:52 <mmichelson> _lore_, so you want to add incremental processing for pinctrl, essentially?
17:39:01 <_lore_> mmichelson: nope
17:39:08 <imaximets> _lore_, I'm not sure about the idea (I just do not know that code), but I'm also not sure how thread-safe ovn-controller is.  Is it?
17:39:50 <_lore_> there is a mutex between ovn-controller main thread and pinctrl_thread
17:39:53 <_lore_> imaximets: ..^
17:40:19 <imaximets> _lore_, ack.
17:40:32 <_lore_> mmichelson: what I mean is having a mask to run just the handlers in pinctrl_run() that have been set in pinctrl_thread()
17:41:34 <_lore_> it is just an idea, I need to review the code to see if it is feasible
17:42:04 <mmichelson> _lore_, If it's causing a noticeable problem, then sure I'd say to go ahead and pursue that as a possible option. I'm not sure how much of a problem this actually poses right now though.
17:42:45 <_lore_> mmichelson: I do not have any report, I just noticed it looking at the code
17:43:05 <numans> I agree with mmichelson.
17:43:26 * zhouhan finally got passwd back on a new computer
17:43:41 <_lore_> numans: mmichelson: ok, I will see if it is easy to implement
17:43:55 <numans> zhouhan, cool
17:44:09 <_lore_> 2- we have an issue related to garp that can make ovn-northd run at 100% cpu
17:44:17 <zhouhan> _lore_ I don't remember exactly but shouldn't pinctrl run just one handler based on the message type?
17:44:38 <_lore_> zhouhan: what I mean is not pinctrl_thread
17:45:02 <_lore_> I mean if we can optimize the main thread
17:45:07 <_lore_> but it is just an idea
17:45:10 <_lore_> I am not sure
17:45:55 <_lore_> related to 2, I was wondering if we can implement something specific to arp replies to limit the rate of wakeups, or if we can just use a meter on the action
17:46:09 <_lore_> what do you think?
17:46:09 <zhouhan> _lore_ oh, sorry, just re-read your message and I got your point now. Yes, you are right, but the most costly part is in I-P and since there is no input change, the I-P engine will not do any compute.
17:46:47 <_lore_> zhouhan: maybe there is no difference, I spotted this just looking at the code
17:46:49 <zhouhan> _lore_ If there are handlers too costly to be run on each main thread wakeup, then we should put them in the I-P engine
17:46:57 <_lore_> ack
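A minimal sketch of the "mask" idea from point 1, written in Python for brevity rather than the C that pinctrl uses. The assumption is that the pinctrl thread marks which kind of work it queued, and the main thread then runs only the handlers whose bit is set. Every name here (`Pending`, `HandlerMask`, `mark`, `run`, and the flag names) is hypothetical and not taken from the OVN code.

```python
import threading
from enum import IntFlag


class Pending(IntFlag):
    """Hypothetical work categories; not the real list of pinctrl handlers."""
    NONE = 0
    MAC_BINDING = 1
    IGMP = 2
    BFD = 4


class HandlerMask:
    """Tracks which handlers the main thread needs to run on its next wakeup."""

    def __init__(self, handlers):
        # handlers: dict mapping a single Pending flag to a zero-argument callable.
        self._handlers = handlers
        self._lock = threading.Lock()  # stands in for the main/pinctrl mutex
        self._pending = Pending.NONE

    def mark(self, flag):
        """Called from the packet-in thread when it queues work of this kind."""
        with self._lock:
            self._pending |= flag

    def run(self):
        """Called from the main thread: run only the marked handlers."""
        with self._lock:
            # Take and clear the mask in one step so no mark() is lost.
            pending, self._pending = self._pending, Pending.NONE
        ran = []
        for flag, handler in self._handlers.items():
            if pending & flag:
                handler()
                ran.append(flag)
        return ran
```

With this shape, a wakeup caused only by IGMP work would skip the MAC binding and BFD handlers entirely. Whether that saves anything measurable depends on how cheap the skipped handlers already are when they have nothing to do, which is the open question in the discussion above.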
17:47:48 <_lore_> any opinion about point 2?
17:48:25 <mmichelson> _lore_, can you be more specific about why the garps are causing ovn-northd to run at 100% cpu?
17:48:35 <mmichelson> is it because mac_bindings are being updated too often?
17:49:00 <_lore_> according to the bz, keepalived keeps moving the vip from one place to another and sending garps for it
17:49:06 <_lore_> yes
17:49:31 <_lore_> mac_bindings is updated very often by the garp
17:49:47 <_lore_> this will end up in a sb update
17:49:49 <zhouhan> but mac_bindings are not handled by northd, right?
17:50:19 <zhouhan> ok, so northd just got woken up, and there is no I-P there in the C version :)
17:50:28 <_lore_> yes
17:50:33 <_lore_> correct
17:50:40 <zhouhan> Oh, wait, northd shouldn't even monitor mac_binding
17:51:06 <mmichelson> zhouhan, northd monitors mac_bindings and deletes them if the logical port has been deleted.
17:51:28 <mmichelson> zhouhan, see cleanup_mac_bindings()
17:51:49 <zhouhan> oh, ok, I recall this being added some time ago
17:51:50 <_lore_> this is the related bz: https://bugzilla.redhat.com/show_bug.cgi?id=1947913
17:51:52 <openstack> bugzilla.redhat.com bug 1947913 in ovn2.13 "[OVN][RFE] Add protection mechanism against gARPs / flapping ports" [High,New] - Assigned to lorenzo.bianconi
17:51:57 <mmichelson> What's happening here is that the MAC_Bindings are being updated, but since no ports are being changed, it means that northd is doing useless work.
17:52:22 <_lore_> yes
17:52:39 <zhouhan> Does the ddlog northd help?
17:52:45 <mmichelson> So a possible workaround that dceara discussed at one point was making ovn-northd stop prematurely if the only change was a MAC_Binding. But that's kind of a poor man's I-P
17:53:05 <mmichelson> zhouhan, presumably, it would.  But that's not being used in production
17:53:24 <_lore_> the bz even says ovs-db is quite loaded, so maybe it is better to filter them in ovn-controller?
17:53:31 <mmichelson> But the flip side to this is that the garps themselves should probably be metered. And I think that's the part you're bringing up here _lore_
17:53:44 <_lore_> mmichelson: correct
17:53:53 <_lore_> my question is:
17:54:15 <_lore_> is it better to use a meter for the action (need to check the code) or to implement something in C for mac_binding?
17:57:21 <zhouhan> I remember dumitru started some work 1 - 2 years ago for control plane ratelimiting in general
17:57:22 <mmichelson> _lore_, I think probably both are good ideas, personally.
17:58:12 <_lore_> ok, I will come up with something
17:59:38 <zhouhan> but can't remember how it went. In general, there was a dilemma for ratelimiting. Either it could block real requests or it could require a huge amount of meters, causing performance problems.
17:59:44 <imaximets> mmichelson, _lore_:  about rate limiting, don't we need to just ban certain garps that cause problems instead of limiting all of them?  We will lose some valid binding events due to limiting and that may be bad.
18:00:16 <_lore_> imaximets: yes, this is what I would like to do in c
18:00:21 <_lore_> in pinctrl code
18:00:51 <_lore_> zhouhan: maybe it is better to just ratelimit a given IP
18:00:57 <_lore_> not all the IPs
18:01:26 <zhouhan> _lore_ hmm, but how do you know which IP to meter? Through config?
18:01:43 <_lore_> in the GARP we have the src mac and IP
18:01:47 <_lore_> right?
18:01:57 <_lore_> or in the arp reply in general
18:02:35 <_lore_> I will try to come up with a PoC
18:02:39 <mmichelson> But how do you know how to program which IPs to meter on? Don't you have to know that before the garps arrive?
18:02:46 <_lore_> and then we can continue the discussion on the ml
18:02:48 <zhouhan> I mean, we can't predict what IPs could appear in GARPs, i.e. which GARPs to apply ratelimiting to
18:04:03 <zhouhan> (same as what mmichelson said :)
18:05:23 <_lore_> I mean when we receive the first request we can have a map for them
18:05:35 <_lore_> and so keep track of the incoming requests
18:05:40 <_lore_> it is just an idea
18:06:14 <mmichelson> _lore_, if you can put together a PoC, then I agree that we can continue the discussion on the mailing list
18:06:36 <_lore_> ack thx
18:06:37 <_lore_> :)
18:07:06 <zhouhan> _lore_: so when lots of new GARPs come in with different IPs, that map can grow, which would require a huge amount of meters, right?
18:07:27 <zhouhan> yep, maybe a POC is great. Some tradeoffs could be made
18:07:31 <_lore_> I will not use meters
18:08:03 <_lore_> I mean, I will implement the logic of a meter, but in pinctrl thread
18:08:14 <_lore_> and the map will have a max size
18:08:36 <_lore_> when the map is full we will discard the new packets since we are under attack
18:08:39 <_lore_> right?
18:09:26 <zhouhan> _lore_: ok, but discarding new packets basically is blocking "healthy" ones (as a result of DDoS)
18:10:10 <_lore_> zhouhan: yes, but if the map is full it means something not right is happening, so it is better just to discard packets and not run at 100% cpu right
18:10:13 <_lore_> ?
18:10:15 <_lore_> it is a tradeoff
18:11:18 <_lore_> let me review the code better and we can continue discussing it
18:11:18 <zhouhan> yes, I believe some tradeoff has to be made, to solve the dilemma
18:11:30 <zhouhan> sorry I have to run for another meeting. ttyl
18:11:34 <mmichelson> _lore_, but if the map gets full, and we stop processing new garps, then wouldn't that still result in a DoS since we're not processing garps any longer?
18:11:35 <_lore_> zhouhan: this is a perfect world :D
18:12:01 <mmichelson> I'll wait for a PoC since it's hard to know exactly how this will work.
18:12:16 <_lore_> mmichelson: it is better to stop processing just the arps while the system continues working, right?
18:12:43 <mmichelson> _lore_, Ok I see what you're saying
18:13:32 <_lore_> I will keep you updated :)
18:13:37 <_lore_> that's all from me
18:13:40 <_lore_> thx
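A minimal sketch of the bounded-map idea from point 2, written in Python for brevity rather than the C that pinctrl uses. It assumes a fixed-window counter keyed by the IP seen in the GARP, a hard cap on the number of tracked IPs, and a policy of dropping packets from new IPs when the map is full. `GarpLimiter`, `allow`, and all parameters and defaults are hypothetical; nothing here reflects an actual OVN implementation or the eventual PoC.

```python
import time


class GarpLimiter:
    """Per-IP fixed-window rate limiter with a bounded tracking map."""

    def __init__(self, max_entries=1024, max_per_window=5, window_s=1.0,
                 clock=time.monotonic):
        self._max_entries = max_entries
        self._max_per_window = max_per_window
        self._window_s = window_s
        self._clock = clock      # injectable so the logic can be tested
        self._entries = {}       # ip -> [window_start, count]

    def allow(self, ip):
        """Return True if a GARP for this IP should be processed."""
        now = self._clock()
        entry = self._entries.get(ip)
        if entry is None:
            if len(self._entries) >= self._max_entries:
                self._expire(now)
            if len(self._entries) >= self._max_entries:
                # Map is full of active entries: treat it as an attack and
                # drop, accepting that a healthy new IP may be dropped too.
                return False
            self._entries[ip] = [now, 1]
            return True
        if now - entry[0] >= self._window_s:
            # Window elapsed: start a new one for this IP.
            entry[0], entry[1] = now, 1
            return True
        if entry[1] < self._max_per_window:
            entry[1] += 1
            return True
        return False             # this IP exceeded its budget for the window

    def _expire(self, now):
        """Drop entries whose window has elapsed so the map can recover."""
        stale = [ip for ip, (start, _) in self._entries.items()
                 if now - start >= self._window_s]
        for ip in stale:
            del self._entries[ip]
```

The sketch makes the tradeoff raised by zhouhan and mmichelson concrete: once `max_entries` distinct IPs are active within one window, GARPs for any further IP are dropped until older entries expire, so legitimate bindings can be delayed in exchange for bounding the work that reaches the southbound database.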
18:14:28 <mmichelson> Anybody else?
18:15:18 <mmichelson> OK I guess that's it for today
18:15:23 <mmichelson> #endmeeting