21:00:25 <timburke> #startmeeting swift
21:00:26 <openstack> Meeting started Wed Sep 23 21:00:25 2020 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:29 <openstack> The meeting name has been set to 'swift'
21:00:36 <timburke> who's here for the swift meeting?
21:00:51 <kota_> hello
21:02:16 <timburke> maybe it's just you and me
21:02:26 <kota_> oh yeah
21:02:28 <timburke> agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:03:10 <timburke> main thing i wanted to mention was some follow up on the state of our gate
21:03:24 <kota_> okay
21:03:26 <clayg> o/
21:03:40 <timburke> #topic busted gates
21:04:10 <kota_> oh, it looks bunch of branches were broken...
21:04:18 <timburke> i *think* swift's ussuri gate is fixed now -- at any rate, i stopped seeing emails about docs failure
21:05:06 <timburke> assuming it's moving, i'll try to get some stable releases out for ussuri and train this week
21:06:10 <timburke> swift client's gate is better now! the fix landed after the deadline to branch for victoria, though, so i might need to reach out to the stable team to sort out how best to fix that one
21:06:39 <timburke> (the fix involved some requirements changes, so i worry a little that a simple backport may not be great)
21:06:45 <kota_> i see
21:07:48 <timburke> i discovered pyeclib's gate was broken after seeing p 744623
21:07:48 <patchbot> https://review.opendev.org/#/c/744623/ - pyeclib - [goal] Migrate testing to ubuntu focal (ABANDONED) - 4 patch sets
21:08:25 <timburke> p 753472 fixed it, but disabled the two jobs we had to test against tip-of-master libec
21:08:26 <patchbot> https://review.opendev.org/#/c/753472/ - pyeclib - Fix gate (MERGED) - 1 patch set
21:08:41 <clayg> focal is gunna be so great - i'm sure I'll try upgrading to it at some point
21:09:22 <kota_> clayg!
21:09:44 <clayg> kota_: i snuck in 😁
21:10:10 <timburke> at some point we should dig into how those fail, but they're both such low-volume repos that i'm fairly certain they still work well together
21:10:55 <timburke> while i was looking at pyeclib, i also pushed in p 753421 to test against py38 on focal and py36 on centos8
21:10:56 <patchbot> https://review.opendev.org/#/c/753421/ - pyeclib - Update gate jobs (MERGED) - 4 patch sets
21:11:12 <kota_> libec-pyeclib-unit said `/bin/bash: line 17: tox: command not found` :(
21:11:28 <kota_> at p 753472
21:11:28 <patchbot> https://review.opendev.org/#/c/753472/ - pyeclib - Fix gate (MERGED) - 1 patch set
21:11:49 <kota_> no p 744623
21:11:49 <patchbot> https://review.opendev.org/#/c/744623/ - pyeclib - [goal] Migrate testing to ubuntu focal (ABANDONED) - 4 patch sets
21:12:40 <timburke> i love how snappy pyeclib's jobs are -- at 2-4 mins per job, i feel like we can add more target platforms all day long!
21:12:56 <kota_> sounds good
21:14:07 <timburke> but all of this reminded me that i should check on the state of libec's gate; will report back next week
21:14:49 <timburke> that's all i've got for the gate stuff; any questions or comments?
21:15:45 <kota_> nothing so far. thanks for your effort to keep the gate to work.
21:16:31 <clayg> timburke: 👏
21:16:37 <timburke> all right, i've just got one other topic on my mind lately
21:16:46 <timburke> #topic hung proxy servers
21:17:32 <timburke> there have been two distinct issues that came up recently and are somewhat related
21:18:23 <timburke> one is https://bugs.launchpad.net/swift/+bug/1895739
21:18:24 <openstack> Launchpad bug 1895739 in OpenStack Object Storage (swift) "Proxy server sometimes deadlocks while logging client disconnect" [Undecided,In progress]
21:20:44 <timburke> the nitty-gritty is in the bug, but the summary is that while we're down in logging, garbage collection may cause us to try to grab the same (non-reentrant) lock twice in the same (green)thread
21:21:09 <timburke> the other is https://github.com/eventlet/eventlet/pull/498
21:21:59 <timburke> where eventlet sees that there's a fd ready to read, but then doesn't wake anyone up to read it
21:23:24 <timburke> good news is that the second one is already merged (and tagged!) following https://github.com/eventlet/eventlet/pull/645 -- thanks for cleaning it up clayg!
21:23:40 <clayg> tight poll loop keeps asking for the same fd, and it says it's ready - but it just keeps polling
21:24:16 <timburke> the first one has a patch at p 752593
21:24:16 <patchbot> https://review.opendev.org/#/c/752593/ - swift - Replace threading._active_limbo_lock with a re-ent... - 3 patch sets
21:24:52 <timburke> i think both of these issues can affect other services, it's just acutely bad on proxies
21:25:56 <timburke> as much as anything, i just wanted to raise awareness in case anyone else sees similar issues, and maybe see if i could get someone to look at the swift patch ;-)
21:28:04 <clayg> does lp bug #1895739 only affect py3?
21:28:05 <openstack> Launchpad bug 1895739 in OpenStack Object Storage (swift) "Proxy server sometimes deadlocks while logging client disconnect" [Undecided,In progress] https://launchpad.net/bugs/1895739
21:28:27 <timburke> i've only *observed it* on py3 -- and i'm not sure why :-(
21:28:55 <timburke> looking at py2's code, it seems like it *could* happen there, too... but again, i've not actually seen it
21:29:10 <timburke> maybe there was some change in GC algo?
21:29:33 <clayg> what kind of lock *is* _active_limbo_lock in cpython? does eventlet patch it by default?
21:29:41 <timburke> i still haven't found a good way to reliably reproduce the problem, either :-(
21:29:51 <clayg> 😢
21:31:05 <timburke> clayg, so in cpython it's a pretty low-level lock -- uses https://docs.python.org/2/library/thread.html#thread.allocate_lock as i recall
21:32:26 <timburke> eventlet *does* patch it; it gets replaced with a Semaphore
21:32:42 <clayg> neato!
21:33:46 <timburke> which seems like a reasonable replacement given the semantics
21:35:08 <timburke> i tried to go over some of the weirdness that leads to this in the bug -- it's not really clear to me whether we're to blame, eventlet's to blame, or cpython's to blame :-/
21:35:55 <timburke> swapping out for our own reentrant lock seems like the most-reasonable approach, though, especially since it's already getting patched
21:36:51 <timburke> clayg, since you've already put some effort into thinking about eventlet and our PipeMutex, mind taking a look this week?
21:37:33 <clayg> i'm sure it's fine - but without a repro it's hard to say exactly
21:38:22 <timburke> all right, that's all i've got planned
21:38:26 <timburke> #topic open discussion
21:38:34 <timburke> what else should we talk about this week?
21:38:41 <clayg> are we still stalled out on pyeclib?
21:39:43 <timburke> pyeclib's good now, afaik -- maybe you're thinking of p 738959 though?
21:39:44 <patchbot> https://review.opendev.org/#/c/738959/ - liberasurecode - Be willing to write fragments with legacy crc - 2 patch sets
21:41:09 <timburke> i still haven't circled back on it -- i'm coming around to wanting to at least treat set-to-the-empty-string the same as unset, but beyond that i'm not sure
21:44:18 <timburke> i think my main question is: which falsey values should we look for?
21:46:25 <timburke> kota_, clayg any thoughts there? keeping in mind that the check'll have to be written in C
21:47:11 <clayg> i like 0 and 1 for true and false in C
21:47:55 <kota_> clayg: agree. plus empty value seems False.
21:48:32 <clayg> anyone have any idea why making a request that uses acl's results in the env getting copied? p 752770
21:48:32 <patchbot> https://review.opendev.org/#/c/752770/ - swift - Log error processing manifest as ServerError - 1 patch set
21:48:38 <timburke> ok, i'll code that up this week
21:48:58 <clayg> we end up losing the storage policy index from the req.environ as well
21:57:03 <timburke> i have no idea. sorry. went looking
21:57:23 <timburke> i'll see about digging into it more on the patch, though
21:58:06 <timburke> all right, i think that'll do it
21:58:18 <timburke> thank you all for coming, and thank you for working on swift!
21:58:27 <timburke> #endmeeting
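
Note on the "hung proxy servers" topic: the failure mode timburke describes (cleanup code triggered by garbage collection trying to take a non-reentrant lock that the same thread already holds) can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the patch under review: it uses plain threading primitives rather than eventlet's patched Semaphore, the class and function names are hypothetical, and it relies on CPython running `__del__` as soon as the last reference is dropped. The nested acquire is non-blocking so that the demo reports the failure instead of hanging the way a real proxy would.

```python
import threading


class NeedsLockOnCleanup:
    """Hypothetical object whose finalizer wants the shared lock."""

    def __init__(self, lock, outcome):
        self._lock = lock
        self._outcome = outcome

    def __del__(self):
        # Non-blocking acquire: a blocking call here is what would
        # deadlock when the same thread already holds a plain Lock.
        got_it = self._lock.acquire(blocking=False)
        self._outcome.append(got_it)
        if got_it:
            self._lock.release()


def finalize_while_holding(lock):
    """Drop the last reference to an object while `lock` is held.

    Returns True if the finalizer managed to take the lock.
    Assumes CPython, where `del` on the last reference runs
    __del__ immediately.
    """
    outcome = []
    obj = NeedsLockOnCleanup(lock, outcome)
    with lock:
        del obj  # finalizer runs here, inside the locked region
    return outcome[0]


if __name__ == "__main__":
    print("Lock :", finalize_while_holding(threading.Lock()))
    print("RLock:", finalize_while_holding(threading.RLock()))
```

With a plain `threading.Lock` the finalizer cannot take the lock, while a reentrant `threading.RLock` lets the same thread take it again, which is the motivation given in the meeting for swapping in a reentrant lock.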