21:00:11 <timburke> #startmeeting swift
21:00:11 <opendevmeet> Meeting started Wed Apr 26 21:00:11 2023 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:11 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:11 <opendevmeet> The meeting name has been set to 'swift'
21:00:26 <timburke> who's here for the swift team meeting?
21:01:13 <acoles> o/
21:01:19 <kota> o/
21:02:46 <mattoliver> o/
21:03:16 <timburke> sorry that it's been so long since i actually held a meeting
21:03:37 <timburke> there were a couple items from whenever that last time was that i wanted to follow up on
21:03:47 <timburke> #topic keepalive timeouts
21:03:59 <timburke> #link https://review.opendev.org/c/openstack/swift/+/873744
21:04:19 <timburke> i finally got around to reworking that to be a plumbing-only patch!
21:04:33 <timburke> thanks zaitcev for reviewing it
21:04:54 <mattoliver> Oh cool! I'll take a look at it too then
21:06:12 <timburke> i forget whether we (nvidia) have just recently started running with that and latest eventlet, or if that's going out next week, but it's looking good so far
21:06:30 <timburke> #topic ssync and data with offsets
21:06:41 <timburke> both patches have merged!
21:06:51 <acoles> \o/
21:07:33 <timburke> but... they introduced a flakey probe test. keep an eye out for it in gate results
21:07:50 <timburke> speaking of...
21:07:57 <timburke> #topic gate issues
21:08:30 <timburke> we recently had a busted lower-constraints job, after virtualenv dropped support for creating py2 envs
21:09:13 <timburke> that's been fixed (and i've got a follow-up to fix it better) -- but it still impacts stable branches
21:09:50 <timburke> once the follow-up lands, i'll propose some backports with the two patches squashed together
21:10:38 <timburke> i've also seen some flakey tests lately
21:10:48 <timburke> #link https://bugs.launchpad.net/swift/+bug/2017024
21:10:57 <mattoliver> Kk, thanks for all that work timburke
21:11:31 <timburke> was easy to reproduce locally, and not too bad to fix once i looked at some of the other tests in the file
21:11:57 <timburke> see https://review.opendev.org/c/openstack/swift/+/881142
21:12:16 <timburke> but the probe test i mentioned...
21:12:29 <timburke> #link https://bugs.launchpad.net/swift/+bug/2017021
21:13:03 <timburke> i still haven't reproduced locally, and i haven't found any smoking guns in the gate job logs
21:14:12 <timburke> if anyone has some cycles to spare on it, i'd appreciate any insights you can figure out -- this seems to be the leading cause of rechecks the past week or two
21:14:42 <acoles> :/
21:15:06 <timburke> alternatively, we could consider removing the flakey test, but that doesn't seem great
21:15:12 <acoles> subjectively, it does seem to be causing a lot of rechecks
21:16:37 <timburke> lastly, i wanted to draw attention to a recent proxy error we've been dogpiling on
21:16:43 <timburke> #topic ec frag iter errors
21:16:47 <mattoliver> I'll keep an eye out for it, and if it happens to me I'll dig in; in the meantime I'll also add it to the bottom of my todo list and hope to get to it at some point.
21:17:49 <timburke> indianwhocodes was investigating some differences in py2/py3 proxy behaviors, and started pulling at this "generator already executing" error
21:18:02 <timburke> #link https://review.opendev.org/c/openstack/swift/+/880356
21:19:37 <timburke> (note that the error would happen under both py2 and py3, but the tracebacks got much noisier in py3, as it started adding more context about what other errors were in the process of being handled)
21:21:08 <timburke> the more we thought about it, the weirder it seemed; eventually, we pieced together that one greenthread was trying to close out a generator that was currently executing (and blocked on IO) in another greenthread
21:22:12 <timburke> this has led to a few different refactorings from clayg and acoles -- i'm really optimistic about where the EC GET code will wind up
21:22:25 <timburke> two questions i've got about it, though:
21:22:37 <timburke> 1. do we have an upstream bug about the error already?
21:24:10 <timburke> and 2. do we have a fix yet? i think i heard that we've got a good idea of what needs to happen, but idk whether we've got a patch that could include a "closes-bug"
21:24:48 <mattoliver> 1. not that I've seen, though I haven't looked. maybe indianwhocodes could write one? it'll be a good educational experience.
21:26:10 <timburke> maybe these will be better questions when indianwhocodes and clayg are around ;-)
21:26:28 <kota> +1
21:26:30 <timburke> i can also bring it up out-of-band
21:27:03 <mattoliver> yeah, sorry I haven't been following the work.
21:27:57 <timburke> just wanted to (1) bring it to people's attention in case they wanted to help out or better understand it, and (2) call out the good work in digging deep on a complicated part of the proxy
21:28:11 <mattoliver> +100
21:28:38 <timburke> that's all i've got
21:28:41 <timburke> #topic open discussion
21:28:50 <timburke> anything else we should bring up this week?
21:29:14 <mattoliver> I've spent a bunch of time, and there will be a bunch more, adding unit tests to tracing. You might have seen some activity
21:30:15 <mattoliver> I'm basically adding some tracing asserts, ie what spans should be created, to tests of middlewares that I've actually gone in and instrumented a little (that have extra spans added)
21:30:50 <timburke> whoo! i still need to give tracing a spin
21:31:20 <mattoliver> It could be a never-ending scope of how many and what type of unit tests, but I just want to do something to get the code into a more upstream-mergeable state.
21:31:34 <mattoliver> you should give it a whirl! (when you have time)
21:32:52 <mattoliver> just yesterday I was adding tests to tempurl and saw clearly that we run _get_hmac 4 times on HEADs. Then looking in the code, yup, it does (and it's supposed to), but it was obvious in the spans that were created; it was pretty cool to see it
21:33:17 <timburke> i remember that it requires py3 -- but can we merge it with the caveat that you should only configure it under py3 (but everything will still run fine under py2 as long as it's *not* configured)?
21:33:44 <mattoliver> yeah, when we do, there will be an impact.
21:34:05 <mattoliver> I'll double-check; the concrete tracer implementations were definitely py3.
21:35:04 <opendevreview> Shreeya Deshpande proposed openstack/swift master: Error logs changed for ChunkWriteTimeout https://review.opendev.org/c/openstack/swift/+/881648
21:36:08 <mattoliver> anyway, that's a good test, I can create a python2 venv and give it a whirl
21:37:03 <mattoliver> that's all I have
21:37:12 <timburke> oh, duh! i should just look at the zuul results!
21:37:37 <mattoliver> oh yeah! lol
21:37:44 <timburke> all right, i think i'll call it early then
21:37:54 <timburke> thank you all for coming, and thank you for working on swift!
21:37:59 <timburke> #endmeeting
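
A minimal sketch of the "generator already executing" race timburke described under the "ec frag iter errors" topic, assuming eventlet greenthreads; frag_iter and the timings here are illustrative stand-ins, not Swift's actual EC GET code. A generator whose frame is suspended mid-execution (e.g. blocked on IO) in one greenthread cannot be close()d from another; CPython raises a ValueError:

    import eventlet

    def frag_iter():
        # stand-in for a fragment iterator that blocks on IO mid-iteration
        while True:
            eventlet.sleep(1)  # frame suspended here; gi_running stays True
            yield b'frag'

    gen = frag_iter()
    reader = eventlet.spawn(next, gen)  # greenthread A enters the generator...
    eventlet.sleep(0.1)                 # ...and blocks inside its frame
    try:
        gen.close()  # greenthread B tries to close it out
    except ValueError as err:
        print(err)   # "generator already executing"
    reader.kill()    # clean up the still-blocked reader

One natural remedy (an assumption here, not something stated in the meeting) is to ensure the generator is only closed from the greenthread that consumes it, or only after that greenthread has finished or been killed.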