zaitcev | timburke: Are you saying that I don't need to be concerned with keystonemiddleware because we use our fork anyway? | 00:23 |
opendevreview | Matthew Oliver proposed openstack/swift master: WIP: internal_client: Add iter_shard_ranges interface https://review.opendev.org/c/openstack/swift/+/877584 | 02:45 |
timburke | zaitcev, well... i know *my* clusters have used the swift-tree middleware for years now -- idk what *your* users are using -- but at least there's a migration path | 03:56 |
timburke | presumably, keystone will be happy to be rid of the thing at some point | 03:57 |
zaitcev | Of what thing? s3_token? | 04:00 |
zaitcev | Obviously Nova uses ec2_token, so it's not going anywhere. | 04:00 |
opendevreview | Matthew Oliver proposed openstack/swift master: WIP: Internalclient gatekeeper restore header shim https://review.opendev.org/c/openstack/swift/+/878188 | 04:37 |
mcape | hey all! one of three servers' controller failed, and all its drives are inaccessible now. the problem is that the cluster was at 95% of capacity, and now it's trying to rebalance, as I understand it. How can I stop the partition movement to handoff nodes (which will quickly overfill the whole cluster)? | 07:46 |
mcape | it has 2 regions, with 1 zone in the first, and 2 zones in the second | 08:05 |
mcape | the failed server is in region 1... and the two other servers in region 1 are heading to 98-99% disk utilization... while the norm is 94-95% (as in the second zone) | 08:06 |
opendevreview | Merged openstack/python-swiftclient master: Use SLO by default for segmented uploads if the cluster supports it https://review.opendev.org/c/openstack/python-swiftclient/+/864444 | 16:00 |
edausq | hello, I have opened a bug report https://bugs.launchpad.net/swift/+bug/2012531 | 16:57 |
edausq | it is both impactful and tricky, I hope my report is clear enough, so a coredev can take a look | 16:58 |
timburke | mcape, i think you've got two options: stop all replicators until you can get hardware replaced (which exacerbates your current durability troubles), or reduce the ring to 2 replicas and remove all devices from the failed node in the ring (which may cause some further shuffling of partitions and/or complicate bringing the disks back into the cluster) | 16:59 |
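The two options timburke lists could look roughly like this as ops commands (a hedged sketch only: it assumes the object ring builder file is named `object.builder` and that the failed node can be identified by IP in the ring — substitute the real builder file and device search values for the actual cluster, and check the current ring state before rebalancing):

```shell
# Option 1: pause replication until the hardware is replaced
# (run on each object node; durability suffers while stopped)
swift-init object-replicator stop

# Option 2: drop to 2 replicas and pull the failed node's devices out
swift-ring-builder object.builder set_replicas 2
swift-ring-builder object.builder remove <ip-of-failed-node>
swift-ring-builder object.builder rebalance
# ...then distribute the new ring.gz to all nodes
```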
timburke | edausq, looking at it now -- will try to keep you updated. just to double-check: it's the same version of swift under both py2 and py3, yeah? | 17:06 |
opendevreview | ASHWIN A NAIR proposed openstack/swift master: allow x-open-expired on POST requests https://review.opendev.org/c/openstack/swift/+/877434 | 17:16 |
edausq | timburke: yes, same version of swift. thank you! | 17:23 |
timburke | edausq, i haven't been able to repro yet -- i've definitely spent some time thinking about exactly this sort of a problem, though, and thought we had all our bases covered :-/ just to make sure i've got my environment right: which 2.29.x release is this? which version of python? eventlet? | 18:19 |
timburke | is there anything special i should know about how the object-server's deployed? (for example, i know some people have tried getting it running using mod_wsgi or uwsgi instead of eventlet's wsgi server) | 18:19 |
timburke | oh! and do you have encryption enabled? i just realized: i *do*, and that's probably throwing off my testing so far... | 18:26 |
timburke | yup, that'll do it. *sigh* | 18:28 |
edausq | we don't have encryption enabled. I am so glad to read you were able to reproduce! And you have a traceback too. I don't understand how come we don't, but that's another topic | 19:52 |
edausq | timburke: since you can reproduce, I am guessing you don't need our details about python/eventlet and swift version. | 19:55 |
timburke | edausq, yeah, i'm good -- thanks | 20:03 |
kota | good morning | 20:57 |
mattoliver | morning | 21:02 |
acoles | kota: mattoliver good morning! | 21:02 |
kota | acoles: mattoliver o/ | 21:02 |
indianwhocodes | o/ | 21:05 |
mattoliver | timburke: you around? | 21:07 |
timburke | oh, right! | 21:07 |
timburke | #startmeeting swift | 21:07 |
opendevmeet | Meeting started Wed Mar 22 21:07:49 2023 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot. | 21:07 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 21:07 |
opendevmeet | The meeting name has been set to 'swift' | 21:07 |
timburke | main things this week | 21:08 |
timburke | #topic vPTG | 21:08 |
timburke | it's next week! | 21:08 |
mattoliver | already! | 21:08 |
timburke | also, i accidentally scheduled a vacation at the same time 😳 | 21:09 |
kota | wow | 21:09 |
mattoliver | sure sure :P | 21:09 |
mattoliver | yeah, no stress | 21:09 |
timburke | but it sounds like mattoliver is happy to lead discussions | 21:09 |
mattoliver | yeah, I ain't no timburke but I can talk, so happy to lead. But I need people to help me discuss stuff :) | 21:10 |
mattoliver | So put your topics down! | 21:10 |
mattoliver | timburke: do we have rooms scheduled etc? | 21:11 |
timburke | no, not yet -- i'd suggest going for this timeslot, M-Th | 21:11 |
timburke | sorry acoles. there isn't really a good time :-( | 21:12 |
timburke | take good notes! i'll read through the etherpad when i get back :-) | 21:12 |
mattoliver | kk | 21:13 |
mattoliver | is there a place I'm supposed to suggest/register the rooms, or do I just register them via the bot like I did for the ops feedback last time? | 21:13 |
timburke | via the bot, like last time. anyone should be able to book rooms over in #openinfra-events by messaging "#swift book <slot ref>" | 21:14 |
mattoliver | cool, I'll come up with something | 21:15 |
timburke | #topic py3 metadata bug | 21:15 |
timburke | #link https://bugs.launchpad.net/swift/+bug/2012531 | 21:15 |
mattoliver | So long as acoles is ok with it. Or maybe we have an earlier one for ops feedback.. I'll come up with something | 21:15 |
mattoliver | oh this seems like an interesting bug | 21:15 |
timburke | so... it looks like i may have done too much testing with encryption enabled | 21:15 |
timburke | (encryption horribly mangles metadata anyway, then base64s it so it's safer -- which also prevented me from bumping into this earlier) | 21:17 |
timburke | but the TLDR is that py3-only clusters would write down object metadata as WSGI strings (that crazy str.encode('utf8').decode('latin1') dance). they'd be able to round-trip them back out just fine, but if you had data on-disk already that was written under py2, *that* data would cause the object-server to bomb out | 21:19 |
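The "dance" timburke describes can be shown in a few lines of plain Python (a standalone illustration of the encoding problem, not swift's actual diskfile code):

```python
# A "WSGI string" is utf-8 bytes mis-decoded as latin-1 -- the form a
# py3-only cluster was writing into object metadata on disk.
native = 'snowman \u2603'                        # what the client sent
wsgi = native.encode('utf-8').decode('latin-1')  # what the WSGI layer sees
print(repr(wsgi))  # mojibake: 'snowman \xe2\x98\x83'

# Metadata written as a WSGI string round-trips back out just fine...
assert wsgi.encode('latin-1').decode('utf-8') == native

# ...but py2 wrote the *native* utf-8 form to disk, and a native string
# with characters outside latin-1 can't be pushed back through the dance:
try:
    native.encode('latin-1')
except UnicodeEncodeError:
    print('boom: py2-written metadata trips up the py3 object-server')
```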
acoles | sorry guys I need to drop off, I'll do my best to make the PTG - mattoliver let me know what you work out with times | 21:20 |
mattoliver | acoles: kk | 21:21 |
timburke | my thinking is that the solution should be to ensure that diskfile only reads & writes proper strings, not WSGI ones -- but it will be interesting trying to deal with data that was written in a py3-only cluster | 21:21 |
mattoliver | timburke: oh bummer | 21:21 |
mattoliver | so diskfile will need to know how to return potential utf8 strings as wsgi ones, so another wsgi str dance. | 21:22 |
mattoliver | but I guess it's only for the metadata? | 21:22 |
timburke | yeah, should only be metadata. and (i think) only metadata from headers -- at the very least, metadata['name'] comes out right already | 21:23 |
timburke | hopefully it's a reasonable assumption that no one would actually *want* to write metadata that's mis-encoded like that, so my plan is to try the wsgi_to_str transformation as we read meta -- if it doesn't succeed, assume it was written correctly (either under py2 or py3-with-new-swift) | 21:24 |
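That try-then-fall-back plan could be sketched like so (a hypothetical standalone version — the actual patch would use swift's own wsgi_to_str helper and live in diskfile, and as noted above it leans on the assumption that nobody deliberately stores metadata that happens to look mis-encoded):

```python
def maybe_wsgi_to_str(value):
    """Decode-on-read: undo the WSGI-string encoding if it applies,
    otherwise assume the value was already written correctly."""
    try:
        # reverse of the str.encode('utf8').decode('latin1') dance
        return value.encode('latin-1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # either py2-written utf-8 read back as a proper str, or data
        # written by py3 after the fix -- leave it alone
        return value
```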
mattoliver | yeah, kk | 21:24 |
mattoliver | let me know how you go or if you need me to poke at anything, esp while you're away | 21:25 |
timburke | thanks mattoliver, i'll try to get a patch up for that later today | 21:25 |
mattoliver | and thanks for digging into it. that's a bugger of a bug. | 21:25 |
timburke | makes me wish i'd had the time/patience to get func tests running against a cluster with mixed python versions years ago... | 21:27 |
timburke | anyway | 21:27 |
timburke | #topic swiftclient release | 21:27 |
timburke | we've had some interesting bug fixes in swiftclient since our last release! | 21:27 |
timburke | #link https://review.opendev.org/c/openstack/python-swiftclient/+/874032 Retry with fresh socket on 499 | 21:29 |
timburke | #link https://review.opendev.org/c/openstack/python-swiftclient/+/877110 service: Check content-length before etag | 21:29 |
timburke | #link https://review.opendev.org/c/openstack/python-swiftclient/+/877424 Include transaction ID on content-check failures | 21:29 |
timburke | #link https://review.opendev.org/c/openstack/python-swiftclient/+/864444 Use SLO by default for segmented uploads if the cluster supports it | 21:30 |
timburke | so i'm planning to get a release out soon (ideally this week) | 21:30 |
mattoliver | ok cool | 21:30 |
timburke | thanks clayg in particular for the reviews! | 21:30 |
timburke | that's most everything i wanted to cover for this week | 21:32 |
mattoliver | nice. If there is anything else anyone wants to cover, put it in the PTG etherpad ;) | 21:33 |
timburke | other initiatives seem to be making steady progress (recovering expired objects, per-policy quotas, ssync timestamp-with-offset fix) | 21:34 |
timburke | #topic open discussion | 21:34 |
timburke | anything else we should talk about this week? | 21:34 |
mattoliver | We did have some proxies with very large memory usage > 10G | 21:34 |
mattoliver | so not sure if there is a bug there. maybe some memory leak with connections.. but it's too early to tell. I'm attempting to dig in. but just a heads up. | 21:35 |
timburke | right! this was part of our testing with py3, right? | 21:35 |
mattoliver | may or may not turn into anything | 21:35 |
mattoliver | yup | 21:35 |
timburke | i'm anxious to see a repro; haven't had a chance to dig into it more yet, myself | 21:35 |
mattoliver | there seem to be a lot of CLOSE_WAIT connections, so I wonder if it's a socket leak or connections not closing properly or something. | 21:36 |
mattoliver | I'll try and dig in some more today | 21:36 |
kota | nice | 21:37 |
mattoliver | I am also working on an internalclient interface for getting shard ranges, as more and more things may need to become shard aware. | 21:38 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/877584 | 21:38 |
mattoliver | but it's still a WIP, like other things, let's see how we go. | 21:38 |
mattoliver | if there is a gatekeeper added to the internal client it'll break the function though. Al has suggested one possible fix, I came up with a middleware shim in internal client, clayg seems to think we should just error hard. | 21:39 |
mattoliver | break the interface I mean. | 21:40 |
mattoliver | So dicsussions are happening about that.. might start with the simplest and error loud I guess, but let's see where it goes. | 21:40 |
mattoliver | That's all I have | 21:41 |
timburke | i'm surprised there'd be any internal clients that would want a gatekeeper... huh | 21:42 |
mattoliver | well there aren't | 21:43 |
mattoliver | but if someone creates one with allow_modify_pipeline=True (or whatever it's called), one will be added | 21:43 |
mattoliver | and this would break sharding.. in fact it might already, as the sharder uses internal client to get shards already; the interface just needs to be unified | 21:44 |
mattoliver | or a mis configuration from an op. | 21:44 |
timburke | i'll blame it on clayg ;-) https://review.opendev.org/c/openstack/swift/+/77042/1/swift/common/internal_client.py | 21:49 |
mattoliver | So yeah, I could just be going down an edge case that doesn't really matter. But it is still a shoot-yourself-in-the-foot edge case, and do we attempt to avoid it, or assume people will do the right thing? | 21:49 |
mattoliver | lol | 21:49 |
timburke | well, i think i'll call it | 21:49 |
mattoliver | kk | 21:49 |
mattoliver | thats all I have anyway :) | 21:49 |
timburke | thank you all for coming, and thank you for working on swift! | 21:49 |
timburke | #endmeeting | 21:49 |
opendevmeet | Meeting ended Wed Mar 22 21:49:31 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 21:49 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/swift/2023/swift.2023-03-22-21.07.html | 21:49 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/swift/2023/swift.2023-03-22-21.07.txt | 21:49 |
opendevmeet | Log: https://meetings.opendev.org/meetings/swift/2023/swift.2023-03-22-21.07.log.html | 21:49 |
opendevreview | ASHWIN A NAIR proposed openstack/swift master: allow x-open-expired on POST requests https://review.opendev.org/c/openstack/swift/+/877434 | 21:49 |
timburke | huh. longer than normal meeting-end delay | 21:49 |