21:00:03 <timburke_> #startmeeting swift
21:00:04 <openstack> Meeting started Wed Feb 3 21:00:03 2021 UTC and is due to finish in 60 minutes. The chair is timburke_. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:07 <openstack> The meeting name has been set to 'swift'
21:00:14 <timburke_> who's here for the swift meeting?
21:00:19 <mattoliverau> o/
21:00:23 <seongsoocho> o/
21:00:37 <kota_> o/
21:00:41 <rledisez> o/
21:01:04 <clayg> o/
21:01:26 <acoles> o/
21:01:37 <timburke_> as usual, the agenda's at
21:01:39 <timburke_> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:01:51 <timburke_> first up
21:01:59 <timburke_> #topic vagrant swift all in one
21:02:10 <timburke_> #link https://github.com/swiftstack/vagrant-swift-all-in-one
21:02:54 <timburke_> i know clayg and i have been pretty heavy users of this tooling to set up our dev environments
21:03:34 <clayg> 🍴 IT WHILE IT'S 🔥
21:03:45 <timburke_> but in the near future, it's likely to go away (as part of some long-standing cleanup necessary to get rid of the old swiftstack github account)
21:03:47 <rledisez> me and alecuyer (and some other colleagues) used it a lot too. very useful
21:04:43 <clayg> timburke_: update as of today - there's a non-zero chance it will just end up living at github.com/nvidia/vagrant-swift-all-in-one 🤷‍♂️
21:04:54 <clayg> everyone here that uses it should still fork it now just in case
21:05:39 <timburke_> i do wonder if it'd be good for us to have something like it moved in-tree...
21:06:03 <clayg> virtualbox yes, vagrant maybe, chef NO (like HELL NO)
21:06:03 <timburke_> could probably re-use a lot of the ansible playbooks we've already got for setting up probe tests
21:06:11 <clayg> timburke_: SOLD
21:06:44 <rledisez> it's so handy that i'm wondering if there are swift devs that don't use it :D
21:08:10 <seongsoocho> I really love this tool :-)
21:08:20 <clayg> rledisez: apparently mattoliverau has "better things" that he keeps to himself
21:08:21 <timburke_> it's the sort of thing i'm always a bit torn about -- it's way easier to stand up an environment, but there's definite value in having a range of setups that we each use. vsaio's definitely got some opinions baked in
21:08:41 <mattoliverau> lol, I don't know about better :P
21:09:31 <mattoliverau> I just have a dodgy bash script I wrote back when I needed to deploy a bunch on rackspace cloud that sets up SAIO the way I want it (and am now used to). But it doesn't do all the things vsaio does. (like s3api).
21:09:58 <timburke_> fwiw, i remember notmyname linking https://gist.github.com/notmyname/40b8131963346676dd18817aeb5ef799 a while back if anyone wanted to go virtualbox with no vagrant ;-)
21:10:13 <zaitcev> I always follow our in-tree manual to set up SAIO.
21:10:42 <clayg> zaitcev: you are a hero
21:11:17 <zaitcev> https://knowyourmeme.com/memes/no-take-only-throw
21:11:18 <timburke_> 🤔 i wonder if we could generate some bits of the manual based on in-tree playbooks....
21:11:47 <zaitcev> "IT dog, how do you automate?" "No automate" "Only type"
21:12:01 <clayg> timburke_: the super weird thing about vsaio is the configuration - like it's opinionated, but supports *some* options (which in a few cases are like... flip this ONE config option to false 🤨)
21:12:29 <timburke_> py3?
yes/no
21:13:06 <clayg> right, then there's other options that are super useful - maybe when we port to in-tree ansible we can square that up to something more sensible
21:13:19 <timburke_> (i *still* wish i had good tooling for working with mixed py2/py3 development)
21:13:29 <clayg> it'd be nice to have a change that requires some crazy config (servers-per-port) and you can include the saio stuff for reviewers to check it out!
21:15:00 <mattoliverau> my dodgy one is: https://github.com/matthewoliver/simple_saio but ignore the ubuntu in the readme. I'm not sure what it really supports. I'm mainly using centos SAIOs atm and Opensuse probably works, well it did a few months ago.
21:15:00 <acoles> IMHO a common dev and CI ansible setup would be great
21:15:15 <rledisez> one thing that always bothered me is that it seems there is stuff in the code that is specific so that it works in SAIO. I'm wondering if the one server/vm is still relevant in a world where docker is everywhere. I can imagine creating a real cluster just with some docker-compose file
21:15:19 <mattoliverau> but yeah, vsaio + ansible would be nicer.
21:15:31 <mattoliverau> I have some chef experience, and I didn't enjoy it :P
21:15:49 <timburke_> anyway, i guess we covered what needed to be said. namely, vsaio is a repo that (might) go away, so if you want to keep using it, it's probably a good idea to fork it sooner rather than later
21:16:16 <clayg> rledisez: i think that's quite reasonable - the trick is porting probetests 🤔
21:16:42 <timburke_> i think it was also (vaguely) what notmyname was trying to do with runway
21:17:09 <clayg> s/might/probably/ go away - s/sooner rather than later/like right now... during this meeting/
21:17:32 <clayg> oh right runway!!!
that one might already be gone 🤔
21:17:48 <timburke_> yeah, that one's already gone
21:18:50 <clayg> https://gitlab.com/nvidia/proxyfs-ci/runway
21:19:05 <timburke_> we can keep thinking about how best to do dev envs and how similar they need to be to CI vs prod vs some other crazy thing, but i think we should probably keep moving
21:19:38 <timburke_> #topic sharding in train
21:19:59 <timburke_> zaitcev, i haven't seen patches yet, how's it going?
21:20:04 <zaitcev> I put together a stack of 18 patches, they pass unit tests.
21:20:16 <zaitcev> The 19th was much too hard, so I gave up.
21:20:16 <timburke_> \o/
21:20:37 <clayg> didn't sinatra do a song about "sharding in the rain"?
21:20:38 <zaitcev> Unfortunately, I must focus on RBAC this week.
21:21:01 <timburke_> no worries, and thank you for taking on the RBAC work!
21:21:14 <timburke_> i just wanted to check in, make sure you weren't blocked
21:21:17 <zaitcev> So, I'm tempted to throw them into Gerrit in one big stack, just so they're not locked in my laptop.
21:21:52 <zaitcev> After we talked about it last week, I wanted to feed them in batches of 4 or 5, to make them easier to review.
21:21:58 <timburke_> that's fine by me. i'll try to get through them quickly once they're up
21:22:07 <zaitcev> I replaced the Change-Id at least
21:23:01 <timburke_> if possible, try to include the cherry-picked sha in the commit message; makes it a little easier for me to compare master vs stable
21:23:24 <zaitcev> Yes, I changed the old Change-Id to Cherry-Picked-From.
21:23:52 <zaitcev> That's all
21:24:07 <timburke_> #topic eventlet and ssl
21:24:22 <timburke_> #link http://lists.openstack.org/pipermail/openstack-discuss/2021-January/020100.html
21:25:12 <timburke_> i was catching up on the mailing list recently and saw zigo has been having trouble with eventlet and ssl in swift-proxy
21:25:39 <zigo> timburke_: I found out that the issue is dnspython 2.0.
21:25:47 <timburke_> i wanted to check if anyone has an ssl-enabled keystone handy to try to repro
21:26:12 <timburke_> oh, curious
21:26:13 <zigo> timburke_: The issue is swift-proxy connecting to Keystone to check credentials...
21:26:14 <zaitcev> Ironically I don't
21:26:33 <zaitcev> I created a different region for Keystone and set Swift to talk using that region.
21:26:39 <zigo> So the problem is not a swift-proxy binding over SSL.
21:26:47 <rledisez> timburke_: we do have that, swift talking to a Keystone over SSL
21:26:51 <timburke_> ...i guess maybe dnspython imports ssl before eventlet's monkey-patched it?
21:27:12 <zigo> timburke_: I can't tell, but it's definitely a monkey patching issue between dnspython and eventlet.
21:27:34 <timburke_> i know i've seen similar recursion errors before, and it's been a matter of not monkey-patching early enough
21:27:42 <zigo> The same issue happens when Neutron tries to tell Nova (over the Nova API) that a VM port is up.
21:28:29 <zigo> Well, I'd prefer if there was a strong movement to get out of this madness.
21:28:44 <zigo> Monkey patching is a terrible idea.
21:29:21 <zigo> It has bitten us numerous times, and still bites hard...
21:30:08 <zigo> https://github.com/eventlet/eventlet/issues/619 <--- The issue has been open since the 25th of June ...
21:30:30 <timburke_> https://github.com/rthalley/dnspython/blob/v2.0.0/dns/query.py#L48 i guess? i don't see an ssl import in 1.16.0 (at a quick glance, anyway)
21:31:29 <timburke_> yeah, monkey-patching's... not great. one more reason we ought to look at that PUT+POST(+POST) patch again... it bugs me that we're so tied to eventlet
21:32:58 <zigo> My guess is that we do eventlet monkey patching early, but then dnspython does monkey patching *after*, and then accessing stuff on the SSLContext object breaks hard.
21:33:05 <zigo> (I'm not sure, just guessing)
21:33:56 <timburke_> oh -- it does its own monkey patching or something, is that it?
ick
21:34:34 <zigo> Isn't that what the code you've just linked does?
21:34:55 <zigo> (ie: creating an SSLSocket object...)
21:35:17 <timburke_> does anyone have bandwidth to try to repro/fix the issue? having a pin on a two-year-old version of dnspython doesn't seem sustainable
21:35:47 <zigo> It's also currently completely broken in both Fedora and Debian (both have dnspython 2.0.x).
21:36:28 <zigo> I'm trying to push to revert to 1.16.0, but I'm not sure I'll be successful.
21:36:31 <clayg> timburke_: does anyone remember why we depend on dnspython?
21:36:46 <zigo> clayg: eventlet does depend on it ...
21:37:18 <clayg> cname_lookup
21:37:35 <zigo> python-eventlet (master)$ cat setup.py | grep dns
21:37:35 <zigo> 'dnspython >= 1.15.0, < 2.0.0',
21:38:17 <timburke_> so we *do* use it for cname_lookup, but the bigger issue seems to be that if you have it installed for the sake of something else, it'll break ssl in eventlet-ified processes
21:38:54 <zigo> Indirectly, yes.
21:39:07 <zigo> We use keystoneauth, which calls requests, which calls urllib3.
21:39:35 <zigo> urllib3 accesses the SSL socket's SSLContext.options object, and when it does ... big crash !
21:40:59 <zigo> I believe the issue is because this:
21:40:59 <zigo> https://github.com/eventlet/eventlet/blob/master/eventlet/green/ssl.py#L449
21:40:59 <zigo> isn't in use because of dnspython overriding the eventlet monkey patching.
21:41:10 <zigo> I may be wrong, but so far, that's where I am...
21:41:56 <timburke_> all right, i'll look into it. the dnspython tip was useful, looks like i might have a repro now!
21:42:20 <zigo> What's the intention behind this:
21:42:20 <zigo> https://github.com/rthalley/dnspython/blob/v2.0.0/dns/query.py#L58
21:42:20 <zigo> ?
21:42:20 <timburke_> #topic orphaned shard ranges
21:42:39 <zigo> Ok, thanks.
21:42:50 <zigo> timburke_: Feel free to ping me anytime and we can discuss this later.
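The import-ordering hazard discussed above can be shown with a small, eventlet-free sketch: a module that captures a reference to an ssl object at import time (the way dnspython 2.0's dns/query.py imports ssl at module load) never sees a patch applied afterwards. The names `PatchedSSLContext` and `late_monkey_patch` are made up for illustration.

```python
# Eventlet-free sketch of the monkey-patching ordering hazard.
# PatchedSSLContext / late_monkey_patch are illustrative names only.
import ssl

# Simulate an "early" import that captures a reference to the real
# class, the way dnspython 2.0's dns/query.py touches ssl objects
# at module load time.
captured_context_cls = ssl.SSLContext

class PatchedSSLContext(ssl.SSLContext):
    """Stand-in for a green/patched replacement class."""

def late_monkey_patch():
    # Patching *after* the capture only rebinds the module attribute;
    # code holding the old reference keeps using the unpatched class.
    ssl.SSLContext = PatchedSSLContext

late_monkey_patch()

# The module attribute changed, but the early capture did not follow:
assert ssl.SSLContext is PatchedSSLContext
assert captured_context_cls is not ssl.SSLContext

ssl.SSLContext = captured_context_cls  # undo the patch for tidiness
```

The same logic runs in both directions: whichever party patches second (eventlet or dnspython), any module that already grabbed a reference keeps the other version, which is consistent with the recursion errors described in the thread.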
21:43:21 <timburke_> zigo, i think it's just trying to stub out enough to prevent NameErrors and the like in case ssl isn't available
21:43:52 * zigo will try to patch this out, just to see if it continues to work ...
21:45:03 <timburke_> i know we picked up https://review.opendev.org/c/openstack/swift/+/771086 recently to prevent us from running into this orphaned-shard situation...
21:46:13 <timburke_> is https://review.opendev.org/c/openstack/swift/+/770529 still viable for cleaning up any orphans that may already be on disk? or should we abandon that?
21:46:23 <mattoliverau> yup, and that stops them being created.
21:46:42 <mattoliverau> There is work on getting the new shrink code working.
21:47:15 <mattoliverau> acoles: has a chain starting: https://review.opendev.org/c/openstack/swift/+/771885
21:47:19 <timburke_> yeah, maybe i should change the agenda item to cover shrinking generally ;-)
21:47:23 <mattoliverau> yeah
21:48:16 <mattoliverau> the start of the chain will allow the root to provide the final shard acceptor as itself when collapsing. (to keep it root driven)
21:49:09 <mattoliverau> acoles recently did an awesome job of simplifying that with an auditing state
21:49:16 <acoles> I prefer the root-driven shrink & delete approach rather than the shard self-determination in https://review.opendev.org/c/openstack/swift/+/770529
21:49:33 <mattoliverau> +1
21:49:59 <mattoliverau> later in that chain is the new compact swift-manage-shard-ranges command
21:50:10 <mattoliverau> which reworks how shrinking works in the sharder.
21:50:30 <acoles> in general the shrinking and overlap repair is coming along nicely, but we uncovered a couple of bugs along the way
21:50:49 <mattoliverau> Once all these pieces are done, orphan shards will be dealt with. initially manually, but it's moving in the right direction for automatically too.
21:50:58 <acoles> mattoliverau: has fixed one https://review.opendev.org/c/openstack/swift/+/773832
21:51:57 <acoles> I'm about to push another fix, these are both bug fixes in addition to the new feature in swift-manage-shard-ranges commands
21:52:11 <acoles> so it's been an interesting journey :)
21:52:31 <mattoliverau> :)
21:52:33 <timburke_> cool! sounds like things are moving right along. blocked on anything?
21:54:03 <acoles> timburke_: not enough hours in a day ! :)
21:54:04 <mattoliverau> not right now, just testing, reviews, coding.. and confidence we've fixed all the bugs :P
21:54:38 <acoles> there's the other bug fix https://review.opendev.org/c/openstack/swift/+/774002
21:54:52 <timburke_> all right then
21:54:59 <timburke_> #topic relinker
21:55:06 <acoles> note: these bugs would not impact *sharding*, just shrinking of shards
21:55:07 <timburke_> i'm still working on some relinker enhancements, and wanted to call out a couple things
21:55:17 <timburke_> first, my patches are (mostly) in a single chain now, so it's easier to try out all the improvements at once by checking out the end of the chain
21:56:12 <timburke_> second, the first couple patches in the chain make it so you can point the relinker at a config file to read most (but not quite all) of the cli flags
21:57:01 <timburke_> that was mostly because i realized we've got a bunch of options that should already be in the [DEFAULT] section of your object-server.conf (--swift-dir, --devices, --skip-mount-check)
21:57:36 <timburke_> with https://review.opendev.org/c/openstack/swift/+/772419, --user
21:58:40 <timburke_> i was mainly wondering if the config file seemed like a reasonable idea to everyone; i can't think of any other one-off tools we have in the swift repo that would go read a conf file...
21:59:15 <mattoliverau> the config file is still optional tho right? you can still use cli opts if need be.
21:59:57 <mattoliverau> So I think it makes sense to allow a config, esp when most of the options may already be set. But also it's optional so it won't stop how others may already be using the tool.
21:59:59 <timburke_> yup -- existing cli tooling should all still work
22:00:21 <mattoliverau> great job
22:00:42 <timburke_> with that in mind, should i make sure that all config options get CLI args?
22:00:44 <acoles> timburke_: +1 I think it is very reasonable
22:01:41 <acoles> +1 was for having a conf file. not sure if you *must* expose all the options if they are beyond current cli
22:02:04 <acoles> if the defaults are sensible
22:02:04 <mattoliverau> I think we're at time :(
22:02:20 <timburke_> yeah, i was noticing that, too ;-)
22:02:27 <timburke_> thank you all for coming, and thank you for working on swift!
22:02:30 <timburke_> #endmeeting
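As a footnote on the relinker discussion: the [DEFAULT] section options that timburke_ says correspond to the --swift-dir, --devices, --skip-mount-check, and --user flags would look roughly like the sketch below. This is an illustrative object-server.conf fragment, not the exact shape the relinker patches landed with; the values are examples only.

```ini
# Hypothetical object-server.conf fragment; values are examples.
[DEFAULT]
# maps to --swift-dir: where ring files etc. live
swift_dir = /etc/swift
# maps to --devices: root of the mounted object devices
devices = /srv/node
# maps to --skip-mount-check (inverted sense)
mount_check = true
# maps to --user (per the 772419 review linked above)
user = swift
```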