*** zhouhan_ has joined #openvswitch | 01:14 | |
*** zhouhan has quit IRC | 01:17 | |
*** dholler has quit IRC | 02:20 | |
*** dholler has joined #openvswitch | 02:34 | |
*** troulouliou_div2 has quit IRC | 02:50 | |
*** anilvenkata has joined #openvswitch | 04:45 | |
*** atpa8a has joined #openvswitch | 04:49 | |
*** links has joined #openvswitch | 05:08 | |
*** armax has quit IRC | 05:18 | |
*** ralonsoh has joined #openvswitch | 05:43 | |
*** atpa8a has quit IRC | 06:07 | |
*** atpa8a has joined #openvswitch | 06:09 | |
*** jaicaa has quit IRC | 06:24 | |
*** jaicaa has joined #openvswitch | 06:27 | |
*** maciejjozefczyk has joined #openvswitch | 06:52 | |
*** slaweq has joined #openvswitch | 07:10 | |
*** rcernin has quit IRC | 07:32 | |
*** eelco has joined #openvswitch | 07:34 | |
*** ak77_ has quit IRC | 08:04 | |
*** ak77_ has joined #openvswitch | 08:05 | |
*** Madkiss_ has left #openvswitch | 08:13 | |
*** Madkiss has joined #openvswitch | 08:13 | |
*** ktraynor has joined #openvswitch | 08:22 | |
*** rcernin has joined #openvswitch | 08:30 | |
*** rcernin has quit IRC | 08:34 | |
*** links has quit IRC | 08:55 | |
*** links has joined #openvswitch | 09:06 | |
*** imaximets__ is now known as imaximets | 09:39 | |
*** zhouhan_ has quit IRC | 09:43 | |
*** matteo has joined #openvswitch | 09:57 | |
*** psahoo has joined #openvswitch | 10:28 | |
*** rcernin has joined #openvswitch | 10:38 | |
*** rcernin has quit IRC | 10:54 | |
*** thaller has quit IRC | 11:12 | |
*** thaller has joined #openvswitch | 11:13 | |
*** psahoo has quit IRC | 12:18 | |
*** psahoo has joined #openvswitch | 12:32 | |
*** thaller has quit IRC | 12:38 | |
*** thaller has joined #openvswitch | 12:42 | |
*** bostondriver has joined #openvswitch | 12:53 | |
*** dcbw has joined #openvswitch | 13:52 | |
*** psahoo has quit IRC | 13:55 | |
*** panda has quit IRC | 15:21 | |
*** eelco has quit IRC | 15:22 | |
*** armax has joined #openvswitch | 15:23 | |
*** zhouhan has joined #openvswitch | 15:41 | |
*** zhouhan has quit IRC | 15:42 | |
*** zhouhan has joined #openvswitch | 15:44 | |
*** livelace has joined #openvswitch | 15:52 | |
*** anilvenkata has quit IRC | 16:31 | |
*** anilvenkata has joined #openvswitch | 16:31 | |
*** slaweq has quit IRC | 16:37 | |
*** links has quit IRC | 16:44 | |
*** dceara has joined #openvswitch | 16:44 | |
*** zhouhan_ has joined #openvswitch | 16:45 | |
*** zhouhan has quit IRC | 16:48 | |
*** zhouhan_ has quit IRC | 17:11 | |
*** zhouhan has joined #openvswitch | 17:12 | |
*** mmichelson_ is now known as mmichelson | 17:13 | |
mmichelson | Hi everyone. I'm going to go ahead and start the meeting | 17:14 |
---|---|---|
mmichelson | #startmeeting ovn_community_development_discussion | 17:14 |
openstack | Meeting started Thu Jul 30 17:14:16 2020 UTC and is due to finish in 60 minutes. The chair is mmichelson. Information about MeetBot at http://wiki.debian.org/MeetBot. | 17:14 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 17:14 |
openstack | The meeting name has been set to 'ovn_community_development_discussion' | 17:14 |
mmichelson | Normally I'd start the meeting by giving my update first, but I need to step away for a couple of minutes | 17:14 |
mmichelson | So if anyone else wants to go ahead, I'll be back in a bit. | 17:14 |
dceara | Hi | 17:15 |
*** anilvenkata has quit IRC | 17:16 | |
*** Franky_T has joined #openvswitch | 17:16 | |
_lore_ | hi all | 17:17 |
dceara | I can start, I have a quick update: we've been hitting some (probably) raft related issues lately. In ovn-k8s deployments, in specific conditions, the SB database ends up in an inconsistent state, i.e., on a follower the raft logs try to modify/delete records that are not in the snapshot. We're still investigating to figure out what the trigger is. It sounds a bit similar to what zhouhan reported a month or so ago. I was wondering if we got a | 17:21 |
dceara | root cause of that until now. | 17:21 |
dceara | Once the DB ends up in this situation it will refuse any write transactions from clients. | 17:23 |
dceara | That's it on my side for today. Thanks. | 17:25 |
mmichelson | OK, and I'm back now. | 17:25 |
mmichelson | I can go next | 17:25 |
mmichelson | Easy things first: I got the ECMP symmetric reply patch merged. Thanks numans for reviews. And thanks zhouhan for fixing the compile error introduced. | 17:26 |
mmichelson | Next, if you're an OVN committer you've probably seen my messages with Jeremy Kerr of Patchwork. It looks like we're going to have OVN as a separate project in patchwork from OVS. This will make it significantly easier to spot relevant patch series and get them reviewed. | 17:27 |
mmichelson | And having a separate patchwork project is also going to simplify the existing CI (i.e. 0-day robot) processing. | 17:27 |
mmichelson | If you are a committer and have an objection to moving OVN to its own patchwork project, please speak up in the email thread. | 17:28 |
numans | mmichelson++ | 17:28 |
zhouhan | mmichelson: sounds great | 17:28 |
mmichelson | And finally, we've had a number of fixes go into 20.06 and I think we're verging on the need for another release. Right now, all regressions and other bugs found by ovn-kubernetes have been fixed. However, one thing that's worth talking about is whether we think it is appropriate to put any "flow explosion" fixes into the 20.06 branch. | 17:29 |
mmichelson | ovn-kubernetes is looking at changing to a shared gateway mode, and they have flow explosion concerns. So the question is, are these changes (those that have gone in, as well as those that are still up for review) candidates for branch-20.06? | 17:30 |
zhouhan | if ovn-k8s can't wait for 20.09, then I think it is ok to add them to 20.06 | 17:31 |
zhouhan | otherwise, it would be better to avoid backporting, because those are not new features | 17:32 |
dceara | mmichelson: For the arp responder flow explosion patches, even though they're quite large, I think we can argue that they are bug fixes. | 17:32 |
zhouhan | s/not new features/not bug fixes | 17:33 |
zhouhan | but as dceara said, some of them were bug fixes. However, are those critical bugs? | 17:34 |
numans | I'm fine if they need to be backported | 17:34 |
mmichelson | zhouhan, I think criticality is in the eye of the beholder :) | 17:36 |
zhouhan | I'm fine with backporting, but I just want to make sure we can always keep released branches stable enough. We should be cautious about backporting any change that could impact existing features. | 17:37 |
mmichelson | zhouhan, +1. Yeah, that's why I wanted to float the idea in here. | 17:38 |
mmichelson | Anyways, the backporting idea doesn't have any hard vetoes, so that's good to see. | 17:39 |
mmichelson | And that's all I had wanted to bring up. Whoever wants to go next, feel free. | 17:39 |
numans | I can go real quick. | 17:39 |
numans | I worked on stabilizing the 20.06 branch as ovn-k8s CI reported issues. | 17:39 |
numans | All the issues are addressed now and I hope this will be the last regression because of I-P patches. | 17:40 |
dceara | numans++ | 17:40 |
numans | Last week I submitted a 2 patch series to improve conntrack usage in OVN. Would appreciate some reviews on it - https://patchwork.ozlabs.org/project/openvswitch/list/?series=191630 | 17:41 |
zhouhan | mmichelson: for a release, we freeze for weeks to make sure what's released is stable. We may have same criteria if we want to backport features - give some time for it to stay in master branch so that we have more confidence of its stability | 17:41 |
numans | zhouhan, I couldn't get the chance to review the other 2 patches of yours. I'll get back to them soon. Hopefully by tomorrow. | 17:41 |
mmichelson | zhouhan, that makes sense. I'd argue that maybe we need more hardened CI so that we can get more immediate feedback as patches are merged to master. | 17:42 |
numans | zhouhan, I've one point here. Right now no CMS, be it openstack or ovn-k8s, is running its CI tests on top of OVN master. | 17:42 |
zhouhan | numans: thanks numans | 17:42 |
numans | and hence we are not able to catch any regressions on master | 17:42 |
numans | And our test coverage is definitely not covering many things. | 17:42 |
numans | All the I-P patch series regressions were caught once 20.06 was consumed by our internal QE testing and ovn-k8s testing. | 17:43 |
numans | We need to improve more test coverage on master. | 17:43 |
numans | in order for us to be sure that new features don't cause regression. | 17:43 |
numans | May be we should run ovn-k8s kind tests when we commit a patch to ovn master branch. | 17:44 |
numans | any thoughts here ? | 17:44 |
numans | I think that should be possible with github actions. | 17:44 |
mmichelson | +`1 | 17:45 |
mmichelson | +1, I mean | 17:45 |
numans | mmichelson, you had some plans on the upstream CI right ? | 17:45 |
zhouhan | numans: mmichelson: yes, that's a problem. We should improve test in master. But still, there is more chance to find bugs in master when people keep developing on it. Otherwise, if we completely trust CI and then release, there is not much point to keep a released branch :) | 17:46 |
numans | mmichelson, may be github actions can be considered. | 17:46 |
numans | zhouhan, Agree. But as a developer we definitely miss out on edge cases and some scenarios :) | 17:47 |
mmichelson | numans, github actions could be a good idea. The only problem I have is that since we don't use PRs, the CI would run after the change is already pushed | 17:47 |
numans | mmichelson, github actions would also run once we push a patch. | 17:48 |
numans | So may be patchwork based CI (if you're planning on those lines) can test a patch before applying. | 17:48 |
numans | and once a patch is committed we can run ovn-k8s tests for example. | 17:48 |
numans | But I guess we can discuss about it in the ML too :) | 17:49 |
zhouhan | numans: yes, I mean, we should do both: 1) improve testing on master, e.g. borrow CI from ovn-k8s/networking-ovn to test against OVN master. 2) give more time for a new feature on master before backporting to released branch | 17:49 |
mmichelson | numans, sure. | 17:49 |
stintel | hi all. I'm getting this error every minute: Jul 30 20:48:12 ministore ovs-vswitchd[13839]: ovs|02297|odp_util(handler13)|ERR|internal error parsing flow key | 17:49 |
stintel | recirc_id(0x1),dp_hash(0xa90679df),skb_priority(0x7),in_port(7),skb_mark(0),ct_state(0x21),ct_zone(0),ct_mark(0),ct_label(0),ct_tuple4(src=10.50.18.6,dst=239.0.0.250,proto=2,tp_src=0,tp_dst=0),eth(src=b2:1d:c3:86:4d:33,dst=01:00:5e:00:00:fa),eth_type(0x8100),vlan(vid=18,pcp=0),encap(eth_type(0x0800),ipv4(src=10.50.18.6,dst=239.0.0.250,proto=2,tos=0xc0,ttl=1,frag=no)) | 17:49 |
numans | zhouhan, agree on both. | 17:49 |
numans | I'm done with the update. If some one wants to go next. | 17:49 |
numans | stintel, Hi. this is on OVN deployment ? | 17:50 |
stintel | openvswitch-2.13.0 on kernel 5.7.8 (using kernel openvswitch modules) | 17:50 |
numans | we are in the middle of OVN meeting. Probably we can discuss after it. | 17:50 |
stintel | numans: I am seeing this permanently | 17:50 |
stintel | ah sorry about that | 17:50 |
numans | or you can bring up next if you want :) | 17:50 |
zhouhan | may I go next? | 17:51 |
numans | sure. | 17:51 |
zhouhan | I was working on scale testing last week. | 17:51 |
zhouhan | I found that there was a regression between 2.12 and later branches. The northd CPU utilization almost doubled in 20.03/20.06 compared to 2.12. | 17:52 |
zhouhan | I was testing the creating and bind 12K ports in 1200 HVs scenario | 17:53 |
mmichelson | zhouhan, ouch | 17:53 |
zhouhan | I am also reworking on the separate nb_cfg in Chassis/Chassis_private. Will send the patch soon. | 17:54 |
zhouhan | I'll do more testing and analysis, and this is my update. | 17:55 |
numans | I want to discuss a bit on the ovn-northd. Any idea on the ovn-northd-ddlog ? | 17:55 |
* zhouhan have the same question | 17:56 | |
numans | I feel may be we should add I-P support to ovn-northd (may be a rudimentary one to start with) | 17:56 |
numans | With my last work on the I-P patches, I feel more confident in it. | 17:56 |
zhouhan | numans: do you mean I-P without DDlog? | 17:57 |
numans | And this could relieve a bit of CPU for ovn-northd | 17:57 |
numans | zhouhan, yes. | 17:57 |
numans | until we have ddlog ready | 17:57 |
numans | not a full I-P support, but start with some basic scenarios | 17:57 |
numans | Just a thought and wanted to check what everyone here thinks ? | 17:58 |
numans | Is it worth it ? | 17:58 |
zhouhan | numans: but it seems blp and leonid have brought ddlog very close for northd | 17:58 |
zhouhan | numans: I wonder if this would be a big waste of effort | 17:58 |
numans | zhouhan, That's the concern I have too. | 17:58 |
zhouhan | I think the DDlog problem is (I guess) that northd code keeps changing and then it would be hard for Ben to catch up with | 17:59 |
zhouhan | If we do I-P, would it be the same problem? | 17:59 |
numans | zhouhan, yes. that's the problem. I think sooner we have ddlog better it is. | 17:59 |
*** ralonsoh has quit IRC | 18:00 | |
numans | zhouhan, probably not. Because we are not adding new feature to northd right ? So the ddlog version doesn't need to catch up on it. | 18:00 |
numans | Anyway I wanted to check on this :) | 18:01 |
zhouhan | numans: sorry, what do you mean "we are not adding new feature to northd"? I think we kept adding :) | 18:01 |
mmichelson | Adding I-P doesn't add features that ddlog cares about | 18:01 |
numans | zhouhan, I think I misunderstood your comment- If we do I-P, would it be the same problem? | 18:01 |
numans | yes. | 18:02 |
numans | If some one wants to jump in and update please do so. Looks like I'm taking more time :) | 18:03 |
zhouhan | oh, I meant, if we do I-P manually (without DDlog), would we face the same problem that northd keeps changes and our I-P implementation can't catch up? | 18:03 |
mmichelson | I think it depends to what degress we add I-P | 18:03 |
mmichelson | *degree | 18:03 |
zhouhan | numans: BTW, do you have any idea why northd CPU doubled after 2.12? | 18:03 |
numans | zhouhan, no idea on that. | 18:04 |
zhouhan | ok | 18:04 |
dceara | zhouhan: what scale scenario are you testing with? | 18:04 |
zhouhan | dceara: I was testing the creating and bind 12K ports in 1200 HVs scenario | 18:04 |
dceara | zhouhan: without ACLs/LBs I assume, right? | 18:05 |
dceara | zhouhan: one thing that comes to mind is the hairpin flows for LBs on logical switches. | 18:05 |
zhouhan | Oh, it seems northd is costing CPU when system is idle (not running any tests). This didn't happen before. | 18:06 |
zhouhan | dceara: no, not ACLs/LBs | 18:06 |
dceara | zhouhan: ack | 18:06 |
zhouhan | I'll dig more on this. Please continue if anyone wants to update | 18:07 |
mmichelson | I'm guessing by the silence that there's nobody else wanting to update | 18:10 |
mmichelson | So I'll end the meeting here. Thanks everyone | 18:10 |
numans | Bye | 18:10 |
mmichelson | #endmeeting | 18:10 |
openstack | Meeting ended Thu Jul 30 18:10:14 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 18:10 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/ovn_community_development_discussion/2020/ovn_community_development_discussion.2020-07-30-17.14.html | 18:10 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/ovn_community_development_discussion/2020/ovn_community_development_discussion.2020-07-30-17.14.txt | 18:10 |
zhouhan | bye | 18:10 |
openstack | Log: http://eavesdrop.openstack.org/meetings/ovn_community_development_discussion/2020/ovn_community_development_discussion.2020-07-30-17.14.log.html | 18:10 |
numans | stintel, I'd suggest to send an email to the ovs-dev ML | 18:10 |
numans | stintel, from what little I know, this normally happens when there is a mismatch in the way flow key is seen by vswitchd and the kernel datapath. | 18:11 |
*** livelace has quit IRC | 18:11 | |
*** matteo has quit IRC | 18:18 | |
*** matteo has joined #openvswitch | 18:18 | |
stintel | numans: ok, I'll try that, probably after my 2w holiday. it's always the same multicast destination address. not seeing this for any other traffic | 18:35 |
*** livelace has joined #openvswitch | 18:36 | |
*** Franky_T has quit IRC | 18:47 | |
*** maciejjozefczyk has quit IRC | 19:13 | |
*** maciejjozefczyk has joined #openvswitch | 19:13 | |
*** zhouhan has quit IRC | 19:46 | |
*** zhouhan has joined #openvswitch | 19:46 | |
*** zhouhan_ has joined #openvswitch | 19:47 | |
*** dceara has quit IRC | 19:50 | |
*** zhouhan has quit IRC | 19:51 | |
*** maciejjozefczyk has quit IRC | 19:58 | |
*** zhouhan_ has quit IRC | 20:51 | |
*** zhouhan has joined #openvswitch | 20:52 | |
*** zhouhan_ has joined #openvswitch | 21:14 | |
*** zhouhan has quit IRC | 21:14 | |
*** livelace has quit IRC | 21:14 | |
*** zhouhan_ has quit IRC | 21:39 | |
*** zhouhan has joined #openvswitch | 21:46 | |
*** zhouhan_ has joined #openvswitch | 21:54 | |
*** zhouhan has quit IRC | 21:55 | |
*** zhouhan has joined #openvswitch | 22:07 | |
*** zhouhan_ has quit IRC | 22:10 | |
*** matteo has quit IRC | 22:14 | |
*** zhouhan has quit IRC | 22:22 | |
*** zhouhan has joined #openvswitch | 22:23 | |
*** zhouhan has quit IRC | 23:04 | |
*** rcernin has joined #openvswitch | 23:14 | |
*** zhouhan has joined #openvswitch | 23:37 | |
*** bostondriver has quit IRC | 23:38 |