15:03:11 #startmeeting oslo
15:03:12 Meeting started Mon Feb 11 15:03:11 2019 UTC and is due to finish in 60 minutes. The chair is bnemec. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:15 The meeting name has been set to 'oslo'
15:03:18 courtesy ping for amotoki, amrith, ansmith, bnemec, dansmith, dhellmann, dims
15:03:18 courtesy ping for dougwig, e0ne, electrocucaracha, flaper87, garyk, gcb, haypo
15:03:18 courtesy ping for hberaud, jd__, johnsom, jungleboyj, kgiusti, kragniz, lhx_
15:03:18 courtesy ping for moguimar, njohnston, raildo, redrobot, sileht, sreshetnyak, stephenfin
15:03:19 courtesy ping for stevemar, therve, thinrichs, toabctl, zhiyan, zxy, zzzeek
15:03:23 o/
15:03:29 ol
15:03:32 o/
15:03:34 \o
15:03:36 xD
15:03:36 o/
15:03:44 I was waiting in the wrong channel
15:03:56 My previous meeting ran over a bit.
15:04:05 #link https://wiki.openstack.org/wiki/Meetings/Oslo#Agenda_for_Next_Meeting
15:04:07 o/
15:05:20 #topic Red flags for/from liaisons
15:05:53 I've been kind of out of touch for the past two weeks, so if there's anything going on let me know.
15:05:59 Nothing to report here.
15:06:41 * jungleboyj has been out of touch as well. :-)
15:07:04 jungleboyj: Yours involved a lot more bbq than mine. :-)
15:07:22 He he he. Yeah, and the scale shows it this morning. :-(
15:08:34 Okay, as always you don't have to wait for the meeting to bring up issues, so if there is anything just let the Oslo team know.
15:08:37 #topic Releases
15:09:04 I released oslo.utils before I left on PTO so we could get the EventletEvent fix out there.
15:09:20 Skimming my emails this morning I didn't see that it made anything explode.
15:09:34 I'll do the usual set of releases today.
15:10:02 #topic Action items from last meeting
15:10:10 "kgiusti to try to reproduce https://bugs.launchpad.net/oslo.messaging/+bug/1800957 with higher thread count"
15:10:11 Launchpad bug 1800957 in oslo.messaging "Upgrading to pike version causes rabbit timeouts with ssl" [High,Confirmed] - Assigned to Ken Giusti (kgiusti)
15:10:19 I believe this is done.
15:10:35 And IIRC there was even a fix posted to the bug.
15:11:35 And that was it for action items.
15:11:52 #topic Privsep and code that forks
15:11:59 #link http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001831.html
15:12:41 This came up because the Neutron team was having another issue with the threaded privsep change.
15:13:00 It turns out that the library they called had a fork in it.
15:13:16 I think we ran into this as well. Someone was attempting to add privsep to our agent but was having problems.
15:13:27 This very much does not play nicely with threads, as you can see from the links in my email.
15:14:44 I added this to the agenda because I don't have a great answer to it.
15:15:28 We may just need to come up with a way to make a fork-safe privsep call (whether that's running in the main thread or a completely separate daemon) and deal with these issues as we find them.
15:15:49 But if anyone has a better idea please share. :-)
15:16:04 is the fork happening inside secure side of the privsep call?
15:16:19 dhellmann: Yes
15:16:26 It was in a library called by the privileged function.
15:16:27 * stephenfin tends to bury his head in the sand when it comes to all things privsep and is of absolutely no help :(
15:16:50 what is the fork doing?
15:16:55 stephenfin: This is why "privsep expert" is high on my list of wants in every Oslo project update. ;-)
15:17:58 is it the ssh that triggers the failure?
15:18:03 It was something to do with netns management: https://github.com/svinota/pyroute2/blob/master/pyroute2/netns/nslink.py#L146-L147
15:18:49 Ah: https://github.com/svinota/pyroute2/blob/master/pyroute2/netns/nslink.py#L108
15:18:51 ok, well, the whole point of privsep is to avoid having to do forks
15:18:56 about : https://bugs.launchpad.net/oslo.messaging/+bug/1800957
15:18:58 we fixed the problem here: https://github.com/celery/py-amqp/pull/247
15:18:58 Launchpad bug 1800957 in oslo.messaging "Upgrading to pike version causes rabbit timeouts with ssl" [High,Confirmed] - Assigned to Ken Giusti (kgiusti)
15:18:58 That's why it's forking.
15:19:23 gsantomaggio: Yep, thanks for looking into that!
15:19:33 it sounds like maybe the privsep API needs an "init hook" to set stuff like this up
15:19:58 The problem is it isn't our code.
15:20:04 well, it's the structure
15:20:19 we need to give the secure side of the privsep api a chance to do some setup before we start making threads
15:20:48 I'm not sure this is something that could be done ahead of time though.
15:20:57 oh, no?
15:20:59 Neutron doesn't have a complete list of netns's at startup.
15:21:03 ah
15:21:18 At least I don't think so, since they can be created by new networks and such.
15:21:22 oh, sure
15:22:02 I guess we need to give the secure functions a way to do something in the "main" thread then
15:22:20 or maybe we just say this is not a good candidate for privsep, I don't know
15:23:08 Yeah, it seems like we could add an in-process command to the privsep protocol.
15:23:33 I guess the trick is that the secure code in the thread has to be able to trigger it
15:23:39 Alternatively, neutron got around this by just synchronizing all of the privileged calls in this module.
15:23:50 that seems like a reasonable approach, too
15:23:58 at least as a stop-gap
15:24:03 I don't know whether that's entirely safe either, but it at least keeps multiple calls from stepping on each other.
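[Editor's sketch] The stop-gap mentioned at 15:23:39 (serializing every privileged call in one module) could look roughly like the following. This is a minimal illustration using only the Python standard library; it is not neutron's actual code. The names `_module_lock`, `serialized`, and `list_netns_links` are hypothetical, and the lock is assumed to live in the same process as the worker threads that run the privileged functions.

```python
import functools
import threading

# Hypothetical module-level lock shared by every privileged call in the module.
_module_lock = threading.Lock()


def serialized(func):
    """Hypothetical decorator: hold the module lock for the whole call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Only one decorated call runs at a time, so two threads can never
        # be inside the forking library code simultaneously.
        with _module_lock:
            return func(*args, **kwargs)
    return wrapper


@serialized
def list_netns_links(namespace):
    # Placeholder body standing in for a library call that forks internally.
    return "links in %s" % namespace
```

As noted at 15:24:03, this only prevents decorated calls from overlapping with each other; it does not make forking from a threaded process safe in general.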
15:24:52 * dhellmann shrugs
15:25:46 Okay, sounds like adding in-process execution is probably our best bet here.
15:26:14 #action Investigate adding main thread execution to privsep
15:26:35 in-process really isn't the right term since all the calls run in the same process, just different threads.
15:26:38 oh, maybe we can just flag certain privsep functions to run in process instead of in threads
15:26:49 that might be simpler than having something in the main thread that the other threads can communicate with
15:27:03 sorry, flag them to run in the main thread instead of a worker
15:28:05 Yeah, I'm wondering if we could add a privileged_synchronous decorator or something to indicate that we can't run the call asynchronously.
15:28:17 right, I think that's likely to be the simplest approach
15:28:39 I was thinking originally we'd have something the worker thread could do to cause work to happen in the main thread, but that's silly
15:28:52 at some point we'd be building an operating system scheduler
15:29:17 Yeah, I _think_ we can do it this way, but I don't know the privsep code well enough to say for sure.
15:29:44 it should just be an if statement at the point where we dispatch the work to a thread on the secure side
15:30:05 either do the work immediately, or throw it into a thread
15:30:05 Yeah
15:31:06 Okay, sounds like we have a plan then.
15:31:13 I'll reply to the mailing list thread too for visibility.
15:31:31 #action bnemec to update openstack-discuss about privsep/fork issue
15:32:08 #topic Weekly Wayward Review
15:32:44 #link https://review.openstack.org/579186
15:33:01 Slightly different this week.
15:33:22 it looks like we need a recheck there to build new versions of the docs, since the old logs have expired
15:33:23 We need someone to take this patch over and eliminate the duplication in the docs.
15:33:58 also that, yes
15:34:54 Basically it's a good change that needs a bit of massaging. I've been meaning to do that but I keep not having time, so I'm putting out a call for help. :-)
15:35:44 o/
15:36:16 I can take it dhellmann
15:36:26 moguimar: Thanks!
15:36:31 thanks, moguimar!
15:36:49 #action moguimar to take over https://review.openstack.org/579186
15:37:09 #topic Open discussion
15:37:17 Anything else?
15:37:26 yep
15:37:38 I spoke about oslo.config last week at FOSDEM
15:37:40 https://fosdem.org/2019/schedule/event/python_application_configuration/
15:37:58 * dhellmann still has that video queued up to watch
15:38:13 Nice
15:38:18 it is basically the same talk I gave at Python Brasil
15:38:39 I thought that I had enough rehearsal but it was in Portuguese
15:39:00 so I kinda translated on the fly and next time will do some English rehearsal too
15:39:06 :-)
15:39:23 :-)
15:41:06 Any other topics?
15:41:59 not on my end
15:43:11 Okay, we'll give everyone 15 minutes back then.
15:43:14 Thanks for joining!
15:43:19 #endmeeting
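
[Editor's sketch] The plan reached under the privsep topic (flag certain functions, then use a single if statement at dispatch time to either do the work immediately or throw it into a thread) might be sketched as below. This is not oslo.privsep code and makes no claim about how the library is structured internally. `synchronous`, `Dispatcher`, `forking_call`, and `ordinary_call` are hypothetical names; the `privileged_synchronous` decorator named in the meeting was only a proposal.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def synchronous(func):
    """Hypothetical marker: flag func so it is never sent to a worker thread."""
    func._run_synchronously = True
    return func


class Dispatcher:
    """Toy stand-in for the point where work is handed to threads."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def dispatch(self, func, *args, **kwargs):
        if getattr(func, "_run_synchronously", False):
            # Flagged call: do the work immediately in the dispatching thread
            # and wrap the outcome in an already-completed Future.
            future = Future()
            try:
                future.set_result(func(*args, **kwargs))
            except Exception as exc:
                future.set_exception(exc)
            return future
        # Unflagged call: throw it into a worker thread as before.
        return self._pool.submit(func, *args, **kwargs)

    def shutdown(self):
        self._pool.shutdown(wait=True)


@synchronous
def forking_call():
    # Placeholder for a call into a library that forks.
    return threading.current_thread().name


def ordinary_call():
    return threading.current_thread().name
```

Returning a Future in both branches keeps the caller's code identical for flagged and unflagged functions; the trade-off is that a flagged call blocks the dispatching thread until it finishes.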