*** diablo_rojo is now known as Guest2181 | 03:19 | |
*** diablo_rojo__ is now known as diablo_rojo | 13:13 | |
clarkb | Anyone else here for the meeting? We will get started in a few minutes | 18:58 |
ianw | o/ | 19:00 |
fungi | ohai | 19:00 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Jun 15 19:01:07 2021 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2021-June/000254.html Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
clarkb | I will not be around next week. We will either need a volunteer meeting chair or we can skip | 19:01 |
clarkb | I'll leave that up to those who will be around to decide :) | 19:01 |
clarkb | #topic Actions from last meeting | 19:02 |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-06-08-19.01.txt minutes from last meeting | 19:02 |
clarkb | #action clarkb Followup with OpenStack on ELK retirement | 19:02 |
clarkb | I have not done this yet | 19:03 |
clarkb | #action someone write spec to replace Cacti with Prometheus | 19:03 |
clarkb | I have not seen a spec for this either. I assume it hasn't been done | 19:03 |
clarkb | ianw: did centos ppc packages get cleaned up? | 19:03 |
ianw | not yet sorry | 19:03 |
clarkb | #action ianw Push change to cleanup ppc packages in our CentOS mirrors | 19:04 |
clarkb | no worries i think we had a number of distractions last week | 19:04 |
clarkb | Lets jump in and talk about them :) | 19:04 |
clarkb | #topic Topics | 19:04 |
clarkb | #topic Eavesdrop and Limnoria | 19:04 |
clarkb | I wanted to call out that we had to fix a bug in limnoria to handle joins to many channels properly | 19:05 |
clarkb | This morning fungi discovered that limnoria doesn't seem to aggressively flush files to disk, but there is a config option we can toggle to have it do that | 19:05 |
ianw | hrm i'm pretty sure i turned that on | 19:05 |
fungi | we don't know for certain this will fix the observed behavior | 19:05 |
clarkb | And gmann was asking about the ptg.openstack.org etherpad lists which were/are hosted on eavesdrop01.openstack.org | 19:05 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/796513/ Limnoria flush channel logs | 19:06 |
fungi | yeah, those are in ptgbot's sqlite database | 19:06 |
clarkb | On the whole we seem to be tackling these issues as they pop up so I'm not super worried, but wanted to call them out in case people want to look into any of them | 19:06 |
fungi | the channel log flushing i'm not so sure is what we think it is. i watched some channels updating rather readily, while others it decided to not flush the log files for a day or so | 19:06 |
clarkb | on a related note at ~00:30UTC today freenode killed itself and split into a new freenode network with no user or channel migration | 19:07 |
clarkb | if there was any question about us making the right choice to move I think that is settled now. | 19:07 |
ianw | https://review.opendev.org/c/opendev/system-config/+/795978 is what i was thinking of | 19:08 |
ianw | re flushing | 19:08 |
clarkb | ah a different flush setting | 19:09 |
ianw | ohhh, that's flushing the config file | 19:09 |
clarkb | hopefully fungi's change sorts this problem out | 19:09 |
clarkb | yup | 19:09 |
ianw | yeah, ++ | 19:09 |
fungi | i'm not convinced, but we'll see | 19:09 |
mordred | today's freenode-splosion is one of the most fascinating things to have happened in a while | 19:09 |
clarkb | fungi: I doubt it will hurt anything at least so seems safe to try | 19:10 |
fungi | agreed | 19:10 |
clarkb | also upstream has been super responsive which means if we can find and fix bugs pushing back upstream is worthwhile | 19:10 |
clarkb | alright, anything else on the topic of IRC and IRC bots? | 19:10 |
fungi | the pattern of what it was writing to disk and what it had seemingly decided to just no longer flush at all was not consistent | 19:10 |
ianw | templates/supybot.conf.erb:supybot.plugins.ChannelLogger.flushImmediately: False is in the old config | 19:11 |
fungi | but maybe there was more going on behind the scenes with fs write caching | 19:11 |
clarkb | fungi: it is running on a relatively new kernel inside a container with bind mounts too | 19:11 |
fungi | yeah | 19:12 |
ianw | there's also supybot.debug.flushVeryOften: False | 19:12 |
fungi | so lots of things can have changed under the covers | 19:12 |
fungi | supybot.debug.flushVeryOften seems to be about flushing its debug logs | 19:12 |
ianw | "automatically flush all flushers" | 19:12 |
fungi | which i figured was independent from channel logging | 19:13 |
fungi | but who knows how many toilets it flushes | 19:13 |
clarkb | we don't need to debug it in the meeting :) just want to make sure it is called out as a problem with a potential fix pushed up | 19:13 |
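For reference, the flush-related settings discussed above look roughly like this in a limnoria/supybot config file. This is a sketch based only on the option names quoted in the discussion; defaults and exact semantics should be checked against the limnoria documentation:

```
# Flush channel log files to disk as lines are written; the old
# config had this set to False, and fungi's change toggles it.
supybot.plugins.ChannelLogger.flushImmediately: True

# Separate knob ianw found: "automatically flush all flushers",
# which appears to be debug-log oriented rather than channel logging.
supybot.debug.flushVeryOften: False
```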
ianw | sorry the ptg thing, is there something to migrate off the old server? i'm not super familiar with that bit | 19:13 |
clarkb | I am not super familiar with it either. For some reason I thought the foundation was hosting that site | 19:14 |
clarkb | but dns said that was wrong when I asked dns about it | 19:14 |
fungi | ptgbot's ptg.openstack.org website was served from the old eavesdrop server, and the bot maintained some state in an sqlite file (mainly for things people set/updated via irc messages) | 19:14 |
ianw | hrm so it's some sort of cgi? | 19:14 |
clarkb | fungi: was the site cgi/wsgi then as part of the bot install? | 19:14 |
fungi | gmann was looking for the list of team etherpads from the last ptg, which the ptg.o.o site would still have been serving from data in ptgbot's sqlite db | 19:15 |
fungi | yeah, the puppet-ptgbot module handles all of that | 19:15 |
fungi | in theory it should get rewritten as ansible (+docker) including the website configuration | 19:16 |
clarkb | in that case I guess we can grab the sqlite db file then query against it until the site is up again if people need info from it? | 19:16 |
fungi | yeah, that was my thinking | 19:16 |
clarkb | fungi: yup and diablo_rojo volunteered to look at it starting next week | 19:16 |
diablo_rojo_phone | Yep! | 19:16 |
clarkb | *to look at the ansible (+docker) bits | 19:16 |
fungi | but also archive.org might be indexing that site, in which case there could be a list we can point people to there in the meantime | 19:16 |
diablo_rojo_phone | Almost down to that section of my to-do list. Probably by tomorrow. | 19:17 |
fungi | we can't confirm whether archive.org has an old copy until their power maintenance is over though | 19:17 |
ianw | it sounds like i should probably leave it alone then, but happy to help migrate things etc. | 19:18 |
ianw | it looks like possibly it's a javascript thing. i'm not seeing cgi/wsgi | 19:18 |
fungi | yeah, mainly let's just not delete the old server yet | 19:18 |
clarkb | ya I think worst case we'll look at the instance disk that is shutdown and/or backups and pull the data off | 19:18 |
clarkb | but waiting a few days is probably fine too | 19:18 |
ianw | yeah it's only shut down, essentially to avoid accidentally restarting the daemons twice | 19:18 |
fungi | cool, thanks for confirming | 19:20 |
clarkb | sounds like that may be it for this topic. Lets move on. | 19:20 |
clarkb | #topic Gerrit Account Cleanup | 19:20 |
clarkb | Has anyone had a chance to look at this info yet? I think I need to go through it again myself just to page context back in. But it would be nice to disable more accounts when we have had people take a look at the lists so that we can let them sit for a few weeks before permanently cleaning them up | 19:20 |
fungi | i've lost track of whether there was something which needed reviewing on this, sorry | 19:20 |
clarkb | ya there is a file on review in my homedir. Let me dig it up | 19:21 |
fungi | i'll try to look through it after dinner, apologies | 19:21 |
clarkb | ~clarkb/gerrit_user_cleanups/notes/proposed-cleanups.20210416 I think | 19:22 |
clarkb | but I need to repage things in myself too | 19:22 |
clarkb | anyway if you can take a look that would be helpful | 19:22 |
clarkb | #topic Server Upgrades | 19:22 |
clarkb | I have not made progress on the listserv upgrade testing as I have been distracted by things like server reboots and irc and all the things | 19:23 |
clarkb | it is still on my list but at this point I likely won't make progress on this until after next week | 19:23 |
clarkb | ianw: I think you have been making progress with review02. Anything new to report? | 19:23 |
ianw | i have just approved the container db bits that you've been working on and will monitor closely | 19:24 |
fungi | mnaser mentioned that the server got rebooted due to a host outage, so double-check things are still sane there i guess | 19:24 |
clarkb | fungi: ya that also took care of the reboot I was going to do on it :) | 19:24 |
clarkb | ianw: sounds good , thanks for pushing that along | 19:24 |
ianw | after that doesn't do anything to production, i will apply it to review02 and get the server mounting its storage and ready | 19:24 |
ianw | i think we'll be very close to deciding when to sync data and move dns at that point | 19:25 |
clarkb | exciting | 19:25 |
ianw | i also had something to up the heap | 19:25 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/784003 | 19:25 |
clarkb | And then after that we can resurrect the gerrit 3.3 and 3.4 changes (there are some threads about 3.4 and ssh host key problems, but 3.3 looks like it should be good for us at this point) | 19:25 |
clarkb | any other server upgrade notes to make? | 19:27 |
clarkb | #topic Draft Matrix Spec | 19:28 |
clarkb | corvus: did you want to introduce this topic? | 19:28 |
corvus | oh hi | 19:28 |
corvus | incoming prepared text dump: | 19:28 |
corvus | first i want to share a little update: | 19:28 |
corvus | i spent some time today talking with folks from the ansible, fedora, and gnome communities, all of which have serious plans to adopt matrix (they either have a homeserver or have plans to). | 19:28 |
mordred | I was there too | 19:28 |
corvus | #link gnome matrix spaces https://discourse.gnome.org/t/experimenting-with-matrix-spaces/6571 | 19:29 |
corvus | #link gnome matrix sovereignty https://blog.ergaster.org/post/20210610-sovereignty-federated-system-gnome/ | 19:29 |
corvus | #link fedora irc/matrix https://communityblog.fedoraproject.org/irc-announcement/ | 19:29 |
corvus | #link fedora matrix plan https://discussion.fedoraproject.org/t/matrix-server-channel-setup/29844/7 | 19:29 |
corvus | #link ansible matrix plan https://hackmd.io/FnpIUIrrRuec-gT3lrv-rQ?view#Current-plan-as-of-2021-06-14 | 19:29 |
corvus | so we've got some really good company here, and people to collaborate with as we figure stuff out. | 19:29 |
corvus | just today i've learned way too much to share here in full, but in short: there are even more options for managing transitions from irc to matrix (including ways to take full admin control of the existing matrix.org portal rooms, rename those to :opendev.org rooms, and either maintain or retire the bridge at any point). all of that to say that we'll make some choices that are appropriate for zuul, but there are other choices that may be | 19:29 |
corvus | more appropriate for other opendev projects which are equally valid. | 19:29 |
corvus | no matter what we do, the next step is for opendev to host a homeserver, so, on to the topic at hand, i uploaded a spec: https://review.opendev.org/796156 | 19:29 |
corvus | and there are 2 big questions from my pov: | 19:29 |
corvus | 1) does opendev want a homeserver? | 19:29 |
corvus | 2) if so, does opendev want to run one itself or pay EMS to run it? | 19:29 |
corvus | and so, what order do we want to answer these questions, and how do we want to decide the second one? | 19:29 |
corvus | (fwiw, i advocate for OIF paying EMS to host the homeserver) | 19:29 |
corvus | [eof] | 19:30 |
mordred | in the realm of "learned too much"... | 19:30 |
mordred | I recommend very strongly reading the matrix sovereignty post above | 19:30 |
clarkb | for 2) I'm pretty strongly in the have someone else run it if at all possible | 19:30 |
fungi | any feel for what the recurring opex is on paying ems to host a homeserver? | 19:30 |
mordred | it made me pretty well convinced that there are way more sharp edges around having a homeserver that also has user accounts | 19:31 |
mordred | than value | 19:31 |
fungi | just wondering what additional research we need to do there before appealing to have funds applied | 19:31 |
corvus | fungi: could be as little as $10/mo. i think it's wise for someone from the oif to talk to them and determine if that's appropriate. | 19:31 |
mordred | and so I think a homeserver that just hosts rooms and not users is what we'd be wanting | 19:31 |
clarkb | mordred: does that article cover the advantages of running a homeserver in that case without user accounts? eg why not just use matrix.org in that case? | 19:32 |
mordred | the $10/mo actually technically could do that - but we might be weasel-word reading the intent of that price tier - so we should likely talk to them | 19:32 |
mordred | clarkb: it does ... but I can summarize real quick | 19:32 |
mordred | if we have an opendev.org homeserver then we have control over things that brand themselves as being opendev channels | 19:32 |
clarkb | got it, its about channel management then. Makes sense | 19:33 |
mordred | so someone can be sure that #zuul:opendev.org is the zuul channel hosted by opendev - whereas #zuul:matrix.org might or might not have any relationship to us | 19:33 |
mordred | yeah | 19:33 |
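The naming scheme mordred describes is mechanical: a Matrix room alias embeds the homeserver that owns it after the colon. A hypothetical helper (an illustration, not part of any Matrix SDK) makes the split explicit:

```python
def parse_room_alias(alias):
    """Split a Matrix room alias like '#zuul:opendev.org' into
    (localpart, server_name). The server name is the homeserver that
    owns the alias, which is why '#zuul:opendev.org' is verifiably
    OpenDev's room while '#zuul:matrix.org' need not be related."""
    if not alias.startswith("#") or ":" not in alias:
        raise ValueError("not a room alias: %r" % alias)
    localpart, server = alias[1:].split(":", 1)
    return localpart, server
```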
mordred | we would also have a few more integration possibilities | 19:33 |
mordred | allowing us to think about things like logging bots slightly differently - or not, we could still use bots | 19:33 |
clarkb | ya and integration with other chat systems | 19:33 |
fungi | what is the process for moving herds of users from one matrix channel to a replacement channel? like could we use a matrix.org channel and later "forward" that to an opendev.org channel? | 19:33 |
corvus | (or slack bridges....) | 19:34 |
corvus | fungi: one of the options is to actually just rename the channel :) | 19:34 |
fungi | so "renames" are a thing then | 19:34 |
clarkb | and I guess the background on this topic is that Zuul has decided they would like to use matrix for primary synchronous comms rather than irc | 19:35 |
corvus | i just learned (moments ago) that's actually a possibility for the oftc portal rooms! | 19:35 |
fungi | and #zuul:matrix.org could be "renamed" to #zuul:opendev.org? | 19:35 |
corvus | well, there's no #zuul:matrix.org to my knowledge; i have no intention of creating any rooms :matrix.org | 19:35 |
clarkb | Looking at element, pricing is done per user. The $10 option is for 5 users. I suspect we'd end up with ~500 users at any one time? | 19:35 |
mordred | clarkb: nope. | 19:36 |
mordred | clarkb: we'd just have rooms | 19:36 |
mordred | users would not log in to our homeserver | 19:36 |
clarkb | oh I see | 19:36 |
corvus | fungi: but if there were, we could rename that room. more to the point, we can rename the `#_oftc_#zuul:matrix.org` portal room, with some help from the matrix.org admins. | 19:36 |
clarkb | there would be ~500 users interacting with the channels on that homeserver but none of them would be opendev.org users | 19:36 |
mordred | (email winds up being an excellent analogy fwiw) | 19:36 |
mordred | yah | 19:36 |
fungi | yeah, if we need to have foundation staff talking to matrix about pricing, we probably should be clear on what wording is relevant | 19:36 |
corvus | we may want a handful of admin/bot accounts, that's it. 5 accounts is the right number to be thinking of. | 19:37 |
clarkb | corvus: got it | 19:37 |
mordred | exactly - that's where I'd want the EMS folks to be cool with our intended use | 19:37 |
mordred | but also - it's other homeservers that would be federating with it | 19:37 |
fungi | so my user might be fungi:yuggoth.org if i decide to run my own homeserver | 19:38 |
mordred | yah | 19:38 |
fungi | which wouldn't count against the opendev.org user count | 19:38 |
clarkb | fungi: yup. I'm currently Clark:matrix.org or something | 19:38 |
corvus | otoh, it's like "hey we're going to increase your load on matrix.org by X hundred/thousand users". they may be ":(" or they may hear "we're going to expose X hundred/thousand more people to matrix technology" and be ":)". i dunno. | 19:38 |
fungi | right, i have fungicide:matrix.org because fungi was taken by someone else cool enough to like the name | 19:38 |
clarkb | if I understand correctly what we would want to ask about is whether or not the $10/month (or maybe even the $75/month) options fit our use case of running a homeserver where the vast majority of users are authenticating with their own homeservers or matrix.org | 19:39 |
clarkb | The hosted homeserver would primarily be used to set channel ownership and manage those channels | 19:40 |
corvus | ++ | 19:40 |
mordred | yup | 19:40 |
clarkb | fungi and I should probably go read the spec and bring that up with OIF then | 19:40 |
clarkb | and then based on what we learn we can bring that feedback to the spec | 19:41 |
corvus | that sounds like a great next step -- you can do that, and then we can revise the spec to only include one hosting option | 19:41 |
fungi | clarkb: that would be great, happy to help in that discussion, and we can certainly involve anyone else who wants to be in on those conversations too | 19:41 |
mordred | I think corvus and I would be happy to chat with our OIF friends if that would be helpful | 19:42 |
mordred | you could tell sparky that he's welcome to come to Pal's and talk with me about it there | 19:42 |
corvus | yes, i am fully prepared to be a resource as needed :) | 19:42 |
clarkb | sounds good. I'll try to get started on that when I get back on Thursday | 19:42 |
clarkb | Anything else to bring up on the subject of Matrix? Or should we see where we end up after talking to OIF? | 19:43 |
fungi | is Pal's a bar? | 19:43 |
corvus | i think that's good for me | 19:43 |
clarkb | #topic arm64 cloud status | 19:44 |
clarkb | This wasn't on the agenda but it should've been so I'm adding it :{ | 19:44 |
clarkb | er :P | 19:44 |
fungi | chef's choice | 19:44 |
clarkb | When I rebooted servers the osuosl mirror node did not come back with working openafs. Googling found that ianw had run into this in the past but I couldn't find out how we got past it previously. For this reason we ended up disabling osuosl in nodepool | 19:45 |
mordred | fungi: yes | 19:45 |
fungi | more specifically, it's throwing a kernel oops in cache setup | 19:45 |
clarkb | since then we've discovered that linaro has a bunch of leaked nodes limiting our total capacity there. That cloud is functioning just not at full capacity. I have emailed kevinz with those details | 19:45 |
ianw | sorry i must have missed this | 19:45 |
clarkb | I expect kevinz will be able to clean up the nodes i listed as leaked and we'll be back to happy again in linaro. But I'm not sure what the next steps for us in osuosl are | 19:46 |
fungi | ianw: it's partly my fault for being so scattered i forgot to mention it | 19:46 |
ianw | the usual case i've found is that the /var/cache/openafs is corrupt, and removing it helps | 19:46 |
clarkb | ianw: no worries. I think I remember from your initial query to the openafs list that this is focal specific. I suppose one option is to downgrade to bionic on the mirror | 19:46 |
fungi | we've tried a few things there, clearing the old cache files, reformatting and even recreating the cache volume in case it was a block level problem, manually loading the lkm before starting afsd... | 19:46 |
clarkb | ianw: we've cleared out the cache multiple times without it helping unfortunately. fungi even completely replaced the cinder volume that backed it | 19:46 |
fungi | yeah, still the same oops every time we try to start up afsd | 19:47 |
ianw | do you have a link to the oops, i can't even remember sorry | 19:47 |
clarkb | openafs upstream mentioned that 1.8.7 should include the expected fix | 19:48 |
clarkb | let me see if I can find it in scrollback | 19:48 |
fungi | i can scrape it from dmesg on the server, sure | 19:48 |
ianw | anyway, we can debug this today | 19:48 |
clarkb | ianw: https://www.mail-archive.com/openafs-info@openafs.org/msg41186.html should match the dmesg if I got the right thing | 19:48 |
clarkb | ianw: that would be great. Thanks! | 19:48 |
fungi | #link http://paste.openstack.org/show/806651 openafs kernel oops | 19:49 |
clarkb | #topic Gerrit Project Renames | 19:49 |
clarkb | fungi: do we have a change to update the playbook for this yet? | 19:49 |
fungi | i have not, no | 19:49 |
clarkb | ok, lets skip it for now then | 19:49 |
fungi | meant to do it late last week | 19:49 |
fungi | sorry! | 19:49 |
clarkb | no worries. It has been a fun few weeks | 19:49 |
clarkb | #topic Open Discussion | 19:49 |
clarkb | Is there anything else to talk about? | 19:50 |
clarkb | Sounds like that may be it. Thank you everyone! | 19:52 |
clarkb | #endmeeting | 19:52 |
opendevmeet | Meeting ended Tue Jun 15 19:52:57 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:52 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2021/infra.2021-06-15-19.01.html | 19:52 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2021/infra.2021-06-15-19.01.txt | 19:52 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2021/infra.2021-06-15-19.01.log.html | 19:52 |
fungi | thanks clarkb! | 19:54 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!