*** anilvenkata has quit IRC | 03:55 | |
*** hexa- has quit IRC | 04:20 | |
*** hexa- has joined #openvswitch | 04:22 | |
*** hexa- has quit IRC | 04:29 | |
*** hexa- has joined #openvswitch | 04:30 | |
*** ralonsoh has joined #openvswitch | 04:31 | |
*** rcernin has quit IRC | 05:13 | |
*** moldorcoder7 has quit IRC | 05:14 | |
*** rcernin has joined #openvswitch | 05:19 | |
*** avishnoi has quit IRC | 05:20 | |
*** avishnoi has joined #openvswitch | 05:22 | |
*** moldorcoder7 has joined #openvswitch | 05:41 | |
*** anilvenkata has joined #openvswitch | 06:24 | |
*** slaweq has joined #openvswitch | 06:26 | |
*** anilvenkata has quit IRC | 06:49 | |
*** mdgray has joined #openvswitch | 07:13 | |
*** avishnoi has quit IRC | 07:17 | |
*** elvira has joined #openvswitch | 07:24 | |
*** avishnoi has joined #openvswitch | 07:27 | |
*** __lore__ has joined #openvswitch | 07:35 | |
*** _lore_ has quit IRC | 07:35 | |
*** __lore__ has quit IRC | 07:37 | |
*** _lore_ has joined #openvswitch | 07:40 | |
*** dholler has joined #openvswitch | 07:58 | |
*** __lore__ has joined #openvswitch | 08:08 | |
*** _lore_ has quit IRC | 08:08 | |
*** jaicaa has quit IRC | 08:36 | |
*** avishnoi has quit IRC | 08:58 | |
*** avishnoi has joined #openvswitch | 09:00 | |
*** istokes has joined #openvswitch | 09:00 | |
*** blahdodo has quit IRC | 09:26 | |
*** dholler has quit IRC | 09:27 | |
*** blahdodo has joined #openvswitch | 09:30 | |
*** avishnoi has quit IRC | 09:46 | |
*** anilvenkata has joined #openvswitch | 09:47 | |
*** avishnoi has joined #openvswitch | 10:08 | |
*** anilvenkata has quit IRC | 10:11 | |
*** __lore__ has quit IRC | 10:27 | |
*** _lore_ has joined #openvswitch | 10:54 | |
*** anilvenkata has joined #openvswitch | 11:51 | |
*** avishnoi has quit IRC | 12:06 | |
*** avishnoi has joined #openvswitch | 12:07 | |
*** anilvenkata has quit IRC | 12:27 | |
*** anilvenkata has joined #openvswitch | 12:27 | |
*** anilvenkata has quit IRC | 12:33 | |
*** subuk has joined #openvswitch | 12:44 | |
*** bostondriver has joined #openvswitch | 12:46 | |
*** links has joined #openvswitch | 12:53 | |
*** avishnoi has quit IRC | 13:15 | |
*** anilvenkata has joined #openvswitch | 13:18 | |
*** rcernin has quit IRC | 13:19 | |
*** dcbw has joined #openvswitch | 13:34 | |
*** avishnoi has joined #openvswitch | 13:37 | |
*** psahoo has joined #openvswitch | 13:38 | |
*** _lore_ has quit IRC | 13:42 | |
*** _lore_ has joined #openvswitch | 13:45 | |
*** anilvenkata has quit IRC | 13:47 | |
*** anilvenkata has joined #openvswitch | 13:52 | |
*** KpuCko has quit IRC | 14:14 | |
*** anilvenkata has quit IRC | 14:44 | |
*** fbl has joined #openvswitch | 14:48 | |
*** _lore_ has quit IRC | 15:44 | |
*** elvira has quit IRC | 16:02 | |
*** links has quit IRC | 16:32 | |
*** psahoo has quit IRC | 16:44 | |
*** _lore_ has joined #openvswitch | 16:49 | |
*** blp has joined #openvswitch | 17:14 | |
mmichelson | Hi everyone | 17:15 |
---|---|---|
mmichelson | #startmeeting ovn_community_development_discussion | 17:15 |
openstack | Meeting started Thu May 13 17:15:30 2021 UTC and is due to finish in 60 minutes. The chair is mmichelson. Information about MeetBot at http://wiki.debian.org/MeetBot. | 17:15 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 17:15 |
openstack | The meeting name has been set to 'ovn_community_development_discussion' | 17:15 |
mmichelson | I changed the meeting name back to the original so the link in the topic should work now :) | 17:15 |
_lore_ | hi all | 17:16 |
ihrachys | o/ | 17:16 |
*** zhouhan has joined #openvswitch | 17:16 | |
blp | hi! | 17:17 |
mmichelson | Last Friday was soft freeze for 21.06. That means that any new feature patches that have been posted since then are not candidates for 21.06. I've been trying to do some code review this week. There are a lot of patches out there | 17:17 |
zhouhan | hi | 17:17 |
blp | I see that Ihar asked a question on the list about ddlog debugging. I'll try to look at that today. | 17:17 |
mmichelson | Including a 27 patch series from a certain someone... | 17:17 |
ihrachys | thanks :) | 17:17 |
mmichelson | I'm also working on v8 of the ARP and floating IP fixes series. | 17:18 |
mmichelson | The next version is going to be quite different based on feedback I received from dceara and numans | 17:18 |
mmichelson | And that's really about all I can report. | 17:18 |
blp | I don't think anyone should feel obligated to review that whole 27-part series. I try to order these things so that a review of any number of patches starting from the beginning still provides value. | 17:19 |
blp | So, if you can review the first patch, or the first 5 patches, or whatever, that's great. | 17:19 |
mmichelson | blp, that's a good point. When I did my first pass, I focused more on the patches at the beginning | 17:19 |
blp | I post the whole series because the whole thing is ready. It probably looks intimidating that way, though. | 17:20 |
mmichelson | blp, also I'm kind of still in the mindset with ddlog patches from you that I'll look at them, and do my best to understand them, but I'm not in a good position to critique them, especially when it comes to best practices. | 17:20 |
blp | It takes some experience. | 17:22 |
blp | I think that the actual ddlog changes in the series take less review and are less risky than the other patches. | 17:22 |
mmichelson | Yeah with a lot of them, it's easy to see what's different, but I have to just take it on trust that, for instance, interning fields is actually improving performance. | 17:23 |
blp | I did provide measurements. | 17:23 |
mmichelson | I could run a test of my own to verify, but I don't have the instinct just from looking at code to go "ah yes, that'll do it!" | 17:24 |
mmichelson | Anyway, this digression has gone on a bit. Anyone else, feel free to share. | 17:25 |
_lore_ | can I go next? quite fast | 17:25 |
_lore_ | this week I worked on a I-P issue for localport, actually when we delete a localport for the ovs bridge the flows in table 65 are not updated | 17:26 |
_lore_ | I posted a fix today | 17:26 |
mmichelson | Excellent | 17:27 |
_lore_ | I would like to ask to zhouhan about this commit: db41da34323c | 17:27 |
_lore_ | inc-proc-eng: Call clear_tracked_data before recompute. | 17:27 |
_lore_ | because in my case it triggers some issues | 17:28 |
zhouhan | _lore_: it was a refactor from what was already used in one of the engine handlers | 17:29 |
zhouhan | _lore_: to make it generic | 17:29 |
_lore_ | in particular when the update for the ovs_interface table is "lost" by this commit since the ct_zone node trigger a full recompute and clear the tracked data | 17:29 |
_lore_ | zhouhan: got my point? | 17:29 |
*** KpuCko has joined #openvswitch | 17:30 | |
zhouhan | In theory, recompute should not need any tracked data, right? | 17:30 |
zhouhan | I will take a look at the issue in your case | 17:30 |
_lore_ | zhouhan: thx | 17:31 |
_lore_ | the flow is: | 17:31 |
_lore_ | ovs_interface change (we remove the localport) --> physical_dlow_change_ovs_iface_handler --> en_ct_zone full recompute --> flow_outpu_physical_flow_changes_handler | 17:32 |
zhouhan | I did notice some dependency problems in ovn-controller, which are potential issues. Not sure if they are related to your case. I was planning to fix them as soon as I get some time. | 17:33 |
_lore_ | in flow_outpu_physical_flow_changes_handler we do not run physical_handle_ovs_iface_changes() since ovs_ifaces_changed is set to false by the full recompute of the ct_zone | 17:33 |
zhouhan | thanks, I will take a note on this | 17:33 |
_lore_ | so I had to changes: | 17:33 |
_lore_ | 1- revert your commit | 17:33 |
_lore_ | 2- ovs_ifaces_changed = true; in en_physical_flow_changes_run | 17:34 |
_lore_ | I went for the second one | 17:34 |
_lore_ | that's all from me | 17:34 |
_lore_ | ah no, I posted a trivial patch | 17:35 |
_lore_ | http://patchwork.ozlabs.org/project/ovn/patch/fdb9df310c76168f7813a1c75dd3ad26bb531da6.1620849904.git.lorenzo.bianconi@redhat.com/ | 17:35 |
zhouhan | en_physical_flow_changes_run is very tricky. numans has a patch that removes it, and I have been reviewing it and some comments are still open | 17:35 |
_lore_ | thx | 17:35 |
_lore_ | that one is an issue hit by openstack | 17:36 |
numans | zhouhan, Sorry. I'm not able to address your comments yet. Hopefully I'll address them and post the patch next week | 17:39 |
zhouhan | numans: no worries. | 17:39 |
zhouhan | I have a quick update | 17:40 |
zhouhan | A small patch pending for review: https://patchwork.ozlabs.org/project/ovn/patch/20210510215938.369355-1-hzhou@ovn.org/ | 17:41 |
zhouhan | In fact dumitru reviewed it but he also helped on adding the ddlog part fix, so I added him as co-author. Better that someone else take a quick look | 17:42 |
imaximets | zhouhan, AFAICT, numans acked it already. | 17:42 |
mmichelson | zhouhan, looks like numans acked it | 17:42 |
zhouhan | Oh, sorry, let me check. Maybe I just need dumitru's signoff | 17:42 |
mmichelson | he signed off too :) | 17:42 |
zhouhan | :D | 17:43 |
zhouhan | Great then. Nothing else to update | 17:43 |
*** ralonsoh has quit IRC | 17:43 | |
mmichelson | Cool, anyone else? | 17:44 |
ihrachys | ok can I?.. real quick :) | 17:44 |
imaximets | zhouhan, I'm wondering if we can add an strstr(actions, "_MC_") to the lflow_add function and assert? | 17:44 |
ihrachys | more of a heads up re: https://mail.openvswitch.org/pipermail/ovs-dev/2021-May/382967.html two things there: 1) there are test suite failures in master so don't be scared if you see them and 2) looking for tips on how to debug ddlog (mis?)behavior, but blp already said he'll take a look so thanks for that. | 17:45 |
zhouhan | imaximets: it may be helpful. But I really want to fix the issue in I-P, so that we won't need that workaround in northd. | 17:46 |
imaximets | zhouhan, sure. | 17:46 |
zhouhan | ihrachys: thanks for digging into it. You also mentioned the ACL priority issue for the allow-stateless patch. Just want to know are you working on a fix? | 17:47 |
ihrachys | yes | 17:47 |
zhouhan | ok, thanks. It seems to me we will have to add more ACL flows in the pre-acl stage to fix the priority problem, right? | 17:48 |
ihrachys | some flow number increase may induce though, need to try to keep it under control | 17:48 |
ihrachys | you are right, at least my initial thinking | 17:48 |
*** mdgray has quit IRC | 17:48 | |
blp | Does anyone have comments on renaming the "master" branch to something else? I didn't receive any feedback via email. | 17:49 |
zhouhan | blp: sounds good to me | 17:49 |
blp | Maybe, when I get a chance, I will take a stab at writing up a commit to do it for OVS. | 17:50 |
mmichelson | blp, I'm fine with it. I noticed numans has been calling it the "main" branch on the ML lately | 17:50 |
imaximets | blp, I missed this email somehow, found it today in the archive. Sounds OK to me. | 17:50 |
ihrachys | will master be an alias or smth? thinking about 3parties pulling master in CI. | 17:51 |
imaximets | "main" seems to be a default choice. | 17:51 |
zhouhan | ihrachys: probably we can make sure the extra flows are needed only if there are mixed (allow-related and allow-stateless) ACLs, so it won't do any harm for existed usecases. | 17:52 |
blp | There is some support for forwarding the old name to the new one, but it doesn't work for everything. | 17:52 |
ihrachys | zhouhan: right, what I meant saying to keep it under control. though may be harder to do in e.g. ddlog, maybe not though, not much experience with that | 17:53 |
ihrachys | zhouhan: it's sad matches are not structured but text, otherwise we could compare sets and detect where a conflict may happen. | 17:55 |
ihrachys | blp: so maybe trying that if it's easy, and giving an official heads up of several weeks | 17:56 |
mmichelson | One nice thing about the master branch is that there are likely a lot of places that don't even specify it since it's the default. But you can't rely on everyone to do that. | 17:57 |
blp | ihrachys: Is that about master->main? | 17:57 |
ihrachys | blp: yes | 17:57 |
mmichelson | I'll have to retrain some muscle memory regarding rebasing branches, but eh, no biggie. | 17:58 |
blp | OK. I will make a mental note to work out the details. | 17:58 |
zhouhan | ihrachys: for "it's sad matches are not structured but text, otherwise we could compare sets and detect where a conflict may happen.", could you explain more? | 18:00 |
mmichelson | zhouhan, I can relate to what ihrachys is talking about. ACL and router policy matches are just strings, rather than parsed expressions | 18:01 |
ihrachys | in the test case in question, there are two rules with different priority, one for allow-stateless and another for allow-related for the same match. if we could somehow safely detect that matches are overlapping / the same, we could then skip adding some flows in same cases. | 18:01 |
blp | It's difficult to make structured relational data as powerful as a language. | 18:01 |
ihrachys | yes I don't even know it's possible | 18:02 |
blp | It's better if you want to have rigidly structured features, but that wasn't the original goal for OVN. | 18:02 |
zhouhan | ihrachys: ok, so you meant analysing the ACLs in ovn-northd instead of adding them directly to flows | 18:03 |
ihrachys | yes, doing some pre-processing for example to detect redundant ACLs | 18:04 |
ihrachys | but that's too much to ask, I didn't mean it like an actual proposal :) | 18:04 |
mmichelson | I've completely lost track of whose turn it was and who all has gone at this point. Anyone left? | 18:06 |
imaximets | I have a small update | 18:06 |
imaximets | I mostly worked on trying to fix CI. GHA strted to put random text to /etc/hosts, so the final fix was to remove this stuff. | 18:07 |
imaximets | Now CI seems to work fine. | 18:07 |
zhouhan | thanks imaximets! | 18:08 |
mmichelson | imaximets, that is so strange. I love how your fix was to remove the line that says "don't remove this line" | 18:08 |
zhouhan | lol | 18:08 |
imaximets | numans, accepted my patch to combine ARP responder flows. dceara reviewed my stream record/replay and 2-Tier ovsdb series. I'll follow up on them later. | 18:08 |
imaximets | mmichelson, I was in a bad mood. :) | 18:09 |
mmichelson | imaximets, :) | 18:09 |
imaximets | I also have a question: | 18:09 |
imaximets | container management software likes to use SIGTERM in order to stop containers. | 18:10 |
imaximets | but OVS-based applications doesn't perform a fully gracefull shutdown on SIGTERM. | 18:10 |
imaximets | i.e. northd will not release the lock and ovsdb-server will not transfer leadership. | 18:11 |
blp | ovsdb locks get released automatically when the connection drops | 18:11 |
imaximets | So, I'm thinking, maybe we need to catch SIGTERM from the daemon, set exiting=true and re-raise at the very end? | 18:12 |
blp | leadership should transfer quickly when the other servers see the connection dropped (it would be better to transfer it beforehand though) | 18:12 |
blp | It's a good idea to handle SIGTERM more gracefully. | 18:13 |
blp | I think we have some code for that... | 18:13 |
imaximets | blp, yes, the point is that it's probably better to terminate gracefully instead of relying on a failover. | 18:13 |
blp | Oh yes, lib/fatal-signal.c. It doesn't address this use case perfectly and will probably need an update. | 18:14 |
imaximets | blp, we can register a custom handler and in the very end call a default handler form the fatal-signal.c and execute fatal_signal_run(). | 18:14 |
blp | A custom signal handler you mean? Yes, fatal-signal does that already. | 18:15 |
imaximets | blp, yes. fatal-signal doesn't register its own handler if it was already registered by the daemon in the main code. | 18:16 |
blp | imaximets: Yes. The intention behind that was different (it was meant to preserve signals that had been set to SIG_IGN by the process that exec()'d the daemon) but it has that effect. | 18:18 |
blp | fatal_signal_run() is called multiple places automatically so that you don't get a daemon blocking somewhere unexpected and failing to die. | 18:19 |
imaximets | blp, ok. I'll think a bit more about this and will, probably, prepare some patches. | 18:19 |
blp | Signals are one of the most awful parts of POSIX. | 18:21 |
imaximets | There is one more problem around SIGTERM: sometimes raise() that called to re-raise the original signal and kill the process fails. | 18:21 |
imaximets | in this case fatal_signal_run() falls into OVS_NOT_REACHED() that calls abort. | 18:22 |
blp | imaximets: How/when would that happen? | 18:22 |
blp | This is a new case for me. | 18:22 |
imaximets | but raise(SIGABRT) fails inside the glibc and glibc ends up with ABORT_INSTRUCTION that effectively triggers a segfault to kill the process. | 18:23 |
blp | WTF? | 18:23 |
imaximets | blp, this happens sometimes during some complex upgrade of containers alng with the host OS, so I don't really know what the hell is happened and why raise() fails. | 18:24 |
imaximets | I suspect some glibc incompatibility or kernel issue. | 18:24 |
imaximets | blp, you can find some backtraces here: https://bugzilla.redhat.com/show_bug.cgi?id=1957030 | 18:25 |
openstack | bugzilla.redhat.com bug 1957030 in ovsdb2.13 "core dumps for ovsdb-sever and northd seen in OCP 4.7->4.8 upgrade" [High,New] - Assigned to ovs-team | 18:25 |
blp | If signal(SIGTERM, SIG_DFL); raise(SIGTERM); fails, then all bets are all, everything's fucked. | 18:26 |
imaximets | I don't think that we can fix that, but we at least could call VLOG_FATAL and avoid segfault with coredump. | 18:27 |
imaximets | And we could find some more information by printing errno. | 18:27 |
imaximets | VLOG_FATAL will result in exit(EXIT_FAILURE) | 18:28 |
blp | I think I want a segfault with coredump, it's the only way this will ever be debuggable. | 18:28 |
imaximets | But we pretty much sure that segfault is because of the failure of raise(). IF raise() failed we will have a segfault in the end, and debugging of glibc is not really useful. | 18:29 |
blp | If raise() fails, why are we confident that exit() will work? | 18:31 |
imaximets | we're not. But, if it will not work, we will reach OVS_NOT_REACHED and will have a segfault anyway. So, it's just an extra attempt to save a "normal" termination. | 18:33 |
blp | It shouldn't be a normal termination, it should be a signal termination. | 18:35 |
blp | There's a bug here and it's not our bug. Can we get that bug fixed? | 18:37 |
mmichelson | blp, we can report it, but in the meantime it would be best if we could make our stuff try to avoid triggering it if possible. | 18:38 |
imaximets | That would be ideal to fix, of course. | 18:38 |
*** acidfu has quit IRC | 18:38 | |
imaximets | For now, I guess, we can at least print an error if raise() failed and allow it to reach the OVS_NOT_REACHED. This should not make any harm. | 18:39 |
*** acidfu has joined #openvswitch | 18:39 | |
imaximets | The issue itself should be reported to appropriate project for investigation. | 18:40 |
imaximets | That's it form my side. | 18:40 |
blp | Logging an error seems reasonable. Or we could VLOG_ABORT. | 18:41 |
imaximets | blp, yep. More explicit version of OVS_NOT_REACHED. :) | 18:42 |
imaximets | Sorry, my small update took 35 minutes. | 18:46 |
mmichelson | lol | 18:46 |
blp | I think everyone is gone now, but... anyone else? | 18:46 |
mmichelson | I think we're done. 90 minutes this week, wow. | 18:47 |
mmichelson | Bye everyone! | 18:47 |
imaximets | bye | 18:47 |
mmichelson | #endmeeting | 18:47 |
openstack | Meeting ended Thu May 13 18:47:15 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 18:47 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/ovn_community_development_discussion/2021/ovn_community_development_discussion.2021-05-13-17.15.html | 18:47 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/ovn_community_development_discussion/2021/ovn_community_development_discussion.2021-05-13-17.15.txt | 18:47 |
openstack | Log: http://eavesdrop.openstack.org/meetings/ovn_community_development_discussion/2021/ovn_community_development_discussion.2021-05-13-17.15.log.html | 18:47 |
*** blp has left #openvswitch | 18:47 | |
*** thaller has quit IRC | 18:50 | |
*** thaller has joined #openvswitch | 18:50 | |
*** istokes has quit IRC | 18:53 | |
*** donhw has quit IRC | 19:00 | |
*** donhw has joined #openvswitch | 19:37 | |
*** donhw has quit IRC | 19:44 | |
*** warewolf has quit IRC | 20:09 | |
*** warewolf has joined #openvswitch | 20:09 | |
*** donhw has joined #openvswitch | 20:48 | |
*** slaweq has quit IRC | 20:49 | |
*** fbl has quit IRC | 20:49 | |
*** zhouhan has quit IRC | 21:02 | |
*** dcbw has quit IRC | 21:54 | |
*** bostondriver has quit IRC | 22:02 | |
*** aconole has quit IRC | 22:23 | |
*** rcernin has joined #openvswitch | 22:57 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!