#openvswitch log

17:15:30 <mmichelson> #startmeeting ovn_community_development_discussion
17:15:30 <openstack> Meeting started Thu May 13 17:15:30 2021 UTC and is due to finish in 60 minutes.  The chair is mmichelson. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:15:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:15:33 <openstack> The meeting name has been set to 'ovn_community_development_discussion'
17:15:50 <mmichelson> I changed the meeting name back to the original so the link in the topic should work now :)
17:16:04 <_lore_> hi all
17:16:34 <ihrachys> o/
17:17:00 <blp> hi!
17:17:05 <mmichelson> Last Friday was soft freeze for 21.06. That means that any new feature patches that have been posted since then are not candidates for 21.06. I've been trying to do some code review this week. There are a lot of patches out there
17:17:05 <zhouhan> hi
17:17:21 <blp> I see that Ihar asked a question on the list about ddlog debugging. I'll try to look at that today.
17:17:31 <mmichelson> Including a 27 patch series from a certain someone...
17:17:32 <ihrachys> thanks :)
17:18:00 <mmichelson> I'm also working on v8 of the ARP and floating IP fixes series.
17:18:14 <mmichelson> The next version is going to be quite different based on feedback I received from dceara and numans
17:18:33 <mmichelson> And that's really about all I can report.
17:19:01 <blp> I don't think anyone should feel obligated to review that whole 27-part series. I try to order these things so that a review of any number of patches starting from the beginning still provides value.
17:19:17 <blp> So, if you can review the first patch, or the first 5 patches, or whatever, that's great.
17:19:20 <mmichelson> blp, that's a good point. When I did my first pass, I focused more on the patches at the beginning
17:20:16 <blp> I post the whole series because the whole thing is ready. It probably looks intimidating that way, though.
17:20:39 <mmichelson> blp, also I'm kind of still in the mindset with ddlog patches from you that I'll look at them, and do my best to understand them, but I'm not in a good position to critique them, especially when it comes to best practices.
17:22:12 <blp> It takes some experience.
17:22:42 <blp> I think that the actual ddlog changes in the series take less review and are less risky than the other patches.
17:23:28 <mmichelson> Yeah with a lot of them, it's easy to see what's different, but I have to just take it on trust that, for instance, interning fields is actually improving performance.
17:23:47 <blp> I did provide measurements.
17:24:00 <mmichelson> I could run a test of my own to verify, but I don't have the instinct just from looking at code to go "ah yes, that'll do it!"
17:25:28 <mmichelson> Anyway, this digression has gone on a bit. Anyone else, feel free to share.
17:25:44 <_lore_> can I go next? quite fast
17:26:38 <_lore_> this week I worked on a I-P issue for localport, actually when we delete a localport for the ovs bridge the flows in table 65 are not updated
17:26:50 <_lore_> I posted a fix today
17:27:36 <mmichelson> Excellent
17:27:46 <_lore_> I would like to ask to zhouhan about this commit: db41da34323c
17:27:59 <_lore_> inc-proc-eng: Call clear_tracked_data before recompute.
17:28:13 <_lore_> because in my case it triggers some issues
17:29:09 <zhouhan> _lore_: it was a refactor from what was already used in one of the engine handlers
17:29:20 <zhouhan> _lore_: to make it generic
17:29:22 <_lore_> in particular when the update for the ovs_interface table is "lost" by this commit since the ct_zone node trigger a full recompute and clear the tracked data
17:29:43 <_lore_> zhouhan: got my point?
17:30:20 <zhouhan> In theory, recompute should not need any tracked data, right?
17:30:46 <zhouhan> I will take a look at the issue in your case
17:31:04 <_lore_> zhouhan: thx
17:31:08 <_lore_> the flow is:
17:32:25 <_lore_> ovs_interface change (we remove the localport) --> physical_dlow_change_ovs_iface_handler --> en_ct_zone full recompute --> flow_outpu_physical_flow_changes_handler
17:33:10 <zhouhan> I did notice some dependency problems in ovn-controller, which are potential issues. Not sure if they are related to your case. I was planning to fix them as soon as I get some time.
17:33:11 <_lore_> in flow_outpu_physical_flow_changes_handler we do not run physical_handle_ovs_iface_changes() since ovs_ifaces_changed is set to false by the full recompute of the ct_zone
17:33:37 <zhouhan> thanks, I will take a note on this
17:33:40 <_lore_> so I had to changes:
17:33:44 <_lore_> 1- revert your commit
17:34:07 <_lore_> 2- ovs_ifaces_changed = true; in en_physical_flow_changes_run
17:34:20 <_lore_> I went for the second one
17:34:57 <_lore_> that's all from me
17:35:11 <_lore_> ah no, I posted a trivial patch
17:35:35 <_lore_> http://patchwork.ozlabs.org/project/ovn/patch/fdb9df310c76168f7813a1c75dd3ad26bb531da6.1620849904.git.lorenzo.bianconi@redhat.com/
17:35:39 <zhouhan> en_physical_flow_changes_run is very tricky. numans has a patch that removes it, and I have been reviewing it and some comments are still open
17:35:40 <_lore_> thx
17:36:08 <_lore_> that one is an issue hit by openstack
17:39:11 <numans> zhouhan, Sorry. I'm not able to address your comments yet. Hopefully I'll address them and post the patch next week
17:39:35 <zhouhan> numans: no worries.
17:40:47 <zhouhan> I have a quick update
17:41:06 <zhouhan> A small patch pending for review: https://patchwork.ozlabs.org/project/ovn/patch/20210510215938.369355-1-hzhou@ovn.org/
17:42:00 <zhouhan> In fact dumitru reviewed it but he also helped on adding the ddlog part fix, so I added him as co-author. Better that someone else take a quick look
17:42:25 <imaximets> zhouhan, AFAICT, numans acked it already.
17:42:27 <mmichelson> zhouhan, looks like numans acked it
17:42:50 <zhouhan> Oh, sorry, let me check. Maybe I just need dumitru's signoff
17:42:57 <mmichelson> he signed off too :)
17:43:02 <zhouhan> :D
17:43:27 <zhouhan> Great then. Nothing else to update
17:44:02 <mmichelson> Cool, anyone else?
17:44:09 <ihrachys> ok can I?.. real quick :)
17:44:13 <imaximets> zhouhan, I'm wondering if we can add an strstr(actions, "_MC_") to the lflow_add function and assert?
17:45:31 <ihrachys> more of a heads up re: https://mail.openvswitch.org/pipermail/ovs-dev/2021-May/382967.html two things there: 1) there are test suite failures in master so don't be scared if you see them and 2) looking for tips on how to debug ddlog (mis?)behavior, but blp already said he'll take a look so thanks for that.
17:46:14 <zhouhan> imaximets: it may be helpful. But I really want to fix the issue in I-P, so that we won't need that workaround in northd.
17:46:45 <imaximets> zhouhan, sure.
17:47:49 <zhouhan> ihrachys: thanks for digging into it. You also mentioned the ACL priority issue for the allow-stateless patch. Just want to know are you working on a fix?
17:47:56 <ihrachys> yes
17:48:34 <zhouhan> ok, thanks. It seems to me we will have to add more ACL flows in the pre-acl stage to fix the priority problem, right?
17:48:36 <ihrachys> some flow number increase may induce though, need to try to keep it under control
17:48:49 <ihrachys> you are right, at least my initial thinking
17:49:14 <blp> Does anyone have comments on renaming the "master" branch to something else? I didn't receive any feedback via email.
17:49:44 <zhouhan> blp: sounds good to me
17:50:10 <blp> Maybe, when I get a chance, I will take a stab at writing up a commit to do it for OVS.
17:50:11 <mmichelson> blp, I'm fine with it. I noticed numans has been calling it the "main" branch on the ML lately
17:50:36 <imaximets> blp, I missed this email somehow, found it today in the archive.  Sounds OK to me.
17:51:18 <ihrachys> will master be an alias or smth? thinking about 3parties pulling master in CI.
17:51:45 <imaximets> "main" seems to be a default choice.
17:52:33 <zhouhan> ihrachys: probably we can make sure the extra flows are needed only if there are mixed (allow-related and allow-stateless) ACLs, so it won't do any harm for existed usecases.
17:52:59 <blp> There is some support for forwarding the old name to the new one, but it doesn't work for everything.
17:53:15 <ihrachys> zhouhan: right, what I meant saying to keep it under control. though may be harder to do in e.g. ddlog, maybe not though, not much experience with that
17:55:47 <ihrachys> zhouhan: it's sad matches are not structured but text, otherwise we could compare sets and detect where a conflict may happen.
17:56:42 <ihrachys> blp: so maybe trying that if it's easy, and giving an official heads up of several weeks
17:57:31 <mmichelson> One nice thing about the master branch is that there are likely a lot of places that don't even specify it since it's the default. But you can't rely on everyone to do that.
17:57:32 <blp> ihrachys: Is that about master->main?
17:57:43 <ihrachys> blp: yes
17:58:11 <mmichelson> I'll have to retrain some muscle memory regarding rebasing branches, but eh, no biggie.
17:58:14 <blp> OK. I will make a mental note to work out the details.
18:00:21 <zhouhan> ihrachys: for "it's sad matches are not structured but text, otherwise we could compare sets and detect where a conflict may happen.", could you explain more?
18:01:10 <mmichelson> zhouhan, I can relate to what ihrachys is talking about. ACL and router policy matches are just strings, rather than parsed expressions
18:01:22 <ihrachys> in the test case in question, there are two rules with different priority, one for allow-stateless and another for allow-related for the same match. if we could somehow safely detect that matches are overlapping / the same, we could then skip adding some flows in same cases.
18:01:38 <blp> It's difficult to make structured relational data as powerful as a language.
18:02:05 <ihrachys> yes I don't even know it's possible
18:02:36 <blp> It's better if you want to have rigidly structured features, but that wasn't the original goal for OVN.
18:03:36 <zhouhan> ihrachys: ok, so you meant analysing the ACLs in ovn-northd instead of adding them directly to flows
18:04:03 <ihrachys> yes, doing some pre-processing for example to detect redundant ACLs
18:04:55 <ihrachys> but that's too much to ask, I didn't mean it like an actual proposal :)
18:06:17 <mmichelson> I've completely lost track of whose turn it was and who all has gone at this point. Anyone left?
18:06:32 <imaximets> I have a small update
18:07:32 <imaximets> I mostly worked on trying to fix CI.  GHA strted to put random text to /etc/hosts, so the final fix was to remove this stuff.
18:07:40 <imaximets> Now CI seems to work fine.
18:08:03 <zhouhan> thanks imaximets!
18:08:07 <mmichelson> imaximets, that is so strange. I love how your fix was to remove the line that says "don't remove this line"
18:08:28 <zhouhan> lol
18:08:57 <imaximets> numans, accepted my patch to combine ARP responder flows.  dceara reviewed my stream record/replay and 2-Tier ovsdb series.  I'll follow up on them later.
18:09:05 <imaximets> mmichelson, I was in a bad mood. :)
18:09:29 <mmichelson> imaximets, :)
18:09:33 <imaximets> I also have a question:
18:10:15 <imaximets> container management software likes to use SIGTERM in order to stop containers.
18:10:49 <imaximets> but OVS-based applications doesn't perform a fully gracefull shutdown on SIGTERM.
18:11:11 <imaximets> i.e. northd will not release the lock and ovsdb-server will not transfer leadership.
18:11:36 <blp> ovsdb locks get released automatically when the connection drops
18:12:00 <imaximets> So, I'm thinking, maybe we need to catch SIGTERM from the daemon, set exiting=true and re-raise at the very end?
18:12:21 <blp> leadership should transfer quickly when the other servers see the connection dropped (it would be better to transfer it beforehand though)
18:13:09 <blp> It's a good idea to handle SIGTERM more gracefully.
18:13:21 <blp> I think we have some code for that...
18:13:40 <imaximets> blp, yes, the point is that it's probably better to terminate gracefully instead of relying on a failover.
18:14:23 <blp> Oh yes, lib/fatal-signal.c. It doesn't address this use case perfectly and will probably need an update.
18:14:44 <imaximets> blp, we can register a custom handler and in the very end call a default handler form the fatal-signal.c and execute fatal_signal_run().
18:15:57 <blp> A custom signal handler you mean? Yes, fatal-signal does that already.
18:16:50 <imaximets> blp, yes.  fatal-signal doesn't register its own handler if it was already registered by the daemon in the main code.
18:18:03 <blp> imaximets: Yes. The intention behind that was different (it was meant to preserve signals that had been set to SIG_IGN by the process that exec()'d the daemon) but it has that effect.
18:19:05 <blp> fatal_signal_run() is called multiple places automatically so that you don't get a daemon blocking somewhere unexpected and failing to die.
18:19:58 <imaximets> blp, ok.  I'll think a bit more about this and will, probably, prepare some patches.
18:21:06 <blp> Signals are one of the most awful parts of POSIX.
18:21:26 <imaximets> There is one more problem around SIGTERM:  sometimes raise() that called to re-raise the original signal and kill the process fails.
18:22:00 <imaximets> in this case fatal_signal_run() falls into OVS_NOT_REACHED() that calls abort.
18:22:29 <blp> imaximets: How/when would that happen?
18:22:45 <blp> This is a new case for me.
18:23:02 <imaximets> but raise(SIGABRT) fails inside the glibc and glibc ends up with ABORT_INSTRUCTION that effectively triggers a segfault to kill the process.
18:23:27 <blp> WTF?
18:24:02 <imaximets> blp, this happens sometimes during some complex upgrade of containers alng with the host OS, so I don't really know what the hell is happened and why raise() fails.
18:24:29 <imaximets> I suspect some glibc incompatibility or kernel issue.
18:25:53 <imaximets> blp, you can find some backtraces here: https://bugzilla.redhat.com/show_bug.cgi?id=1957030
18:25:53 <openstack> bugzilla.redhat.com bug 1957030 in ovsdb2.13 "core dumps for ovsdb-sever and northd seen in OCP 4.7->4.8 upgrade" [High,New] - Assigned to ovs-team
18:26:08 <blp> If signal(SIGTERM, SIG_DFL); raise(SIGTERM); fails, then all bets are all, everything's fucked.
18:27:01 <imaximets> I don't think that we can fix that, but we at least could call VLOG_FATAL and avoid segfault with coredump.
18:27:38 <imaximets> And we could find some more information by printing errno.
18:28:02 <imaximets> VLOG_FATAL will result in exit(EXIT_FAILURE)
18:28:07 <blp> I think I want a segfault with coredump, it's the only way this will ever be debuggable.
18:29:36 <imaximets> But we pretty much sure that segfault is because of the failure of raise().  IF raise() failed we will have a segfault in the end, and debugging of glibc is not really useful.
18:31:46 <blp> If raise() fails, why are we confident that exit() will work?
18:33:05 <imaximets> we're not.  But, if it will not work, we will reach OVS_NOT_REACHED and will have a segfault anyway.  So, it's just an extra attempt to save a "normal" termination.
18:35:05 <blp> It shouldn't be a normal termination, it should be a signal termination.
18:37:22 <blp> There's a bug here and it's not our bug. Can we get that bug fixed?
18:38:11 <mmichelson> blp, we can report it, but in the meantime it would be best if we could make our stuff try to avoid triggering it if possible.
18:38:29 <imaximets> That would be ideal to fix, of course.
18:39:09 <imaximets> For now, I guess, we can at least print an error if raise() failed and allow it to reach the OVS_NOT_REACHED.  This should not make any harm.
18:40:04 <imaximets> The issue itself should be reported to appropriate project for investigation.
18:40:17 <imaximets> That's it form my side.
18:41:23 <blp> Logging an error seems reasonable. Or we could VLOG_ABORT.
18:42:41 <imaximets> blp, yep.  More explicit version of OVS_NOT_REACHED. :)
18:46:11 <imaximets> Sorry, my small update took 35 minutes.
18:46:27 <mmichelson> lol
18:46:30 <blp> I think everyone is gone now, but... anyone else?
18:47:04 <mmichelson> I think we're done. 90 minutes this week, wow.
18:47:11 <mmichelson> Bye everyone!
18:47:14 <imaximets> bye
18:47:15 <mmichelson> #endmeeting