clarkb | just about meeting time | 18:59 |
fungi | so it is | 18:59 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue May 9 19:01:02 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/BQ5T6VULIAYPCN6LPWSEMA4XITIXTSZB/ Our Agenda | 19:01 |
clarkb | I didn't have any announcements. We can dive right into our topics | 19:01 |
clarkb | #topic Migrating to Quay | 19:01 |
clarkb | A bunch of progress has been made on this since ~Friday | 19:01 |
clarkb | #link https://etherpad.opendev.org/p/opendev-quay-migration-2023 Plan/TODO list | 19:01 |
clarkb | I wrote this plan / todo list document to keep everything straight | 19:01 |
clarkb | since then I've pushed like 25 changes and a good number of them have landed. At this point I think about 12 of 34 images are being pushed to and pulled from quay.io | 19:02 |
clarkb | That said, late yesterday I realized we were pushing change tags to quay.io, which we didn't want to do | 19:02 |
clarkb | There are two potential fixes for this | 19:02 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/882628 Don't run the upload role at all | 19:02 |
clarkb | In this one we modify our jobs to not run the upload role in the gate at all (the upload role in the gate is what is pushing those change tags) | 19:03 |
clarkb | #link https://review.opendev.org/c/zuul/zuul-jobs/+/882724 Upload role skip the push | 19:03 |
clarkb | In this option we modify the upload role to check the flag for promotion behavior and only push if the promotion method that needs the push is selected | 19:03 |
clarkb | corvus: likes 882724 more. I do as well, as it simplifies reorganizing all of the jobs as we go through this process. The changes to swap image publication over to quay.io are pretty minimal with this approach, which is nice | 19:04 |
clarkb | If you have time to review both of these changes and weigh in that would be great. If we get one of them landed I can continue with my progress in migrating things (depending on which option we choose I will need to do different things, so I don't want to move ahead right now) | 19:04 |
ianw | the only thing about that is all the documentation now says "use build in check and gate" and then promote for the IR path | 19:04 |
ianw | except then the main example of it doesn't do that :) | 19:05 |
clarkb | ianw: ya and I think we could possibly switch it over to that after the migration too. I'm mostly concerned that the amount of stuff to change to swap the images increases quite a bit if we go that route now. But both approaches should work and I'm only preferring one for the amount of work it creates :) | 19:05 |
clarkb | but ya if we prefer the more explicit approach that's fine. I will just need to modify some changes and land the fixup change that ianw started | 19:06 |
corvus | i think things changed once we accepted the idea of being able to switch the promote mechanism with a variable | 19:06 |
corvus | i'm not sure it makes as much sense to switch that with a variable and then have to change the job structure too... | 19:07 |
clarkb | ya though it may be confusing to have two different jobs that behave identically if the flag is set | 19:08 |
clarkb | I can really go either way on this. | 19:08 |
clarkb | but making a decision between one approach and the other is the next step in migrating so that we don't create more of a cleanup backlog. I can also do the cleanup once this is sorted out. It's only a few extra tags that can be removed by hand | 19:09 |
corvus | arguably, we should probably not have the role switch but instead have the job switch which role it includes... but this is sprawling and complex enough that putting the logic in the role seems like a reasonable concession. | 19:09 |
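For reference, here is a minimal Python sketch of the role-level switch being discussed: the upload step still runs in the gate, but it only pushes when the selected promotion method actually needs a change tag in the final registry. The method names and values below are hypothetical illustrations, not the real zuul-jobs role variables.

```python
# Hypothetical sketch of the conditional in 882724: the upload step still
# runs in the gate, but the registry push only happens when the selected
# promotion method relies on a change tag existing in the final registry.
# "retag" and "intermediate-registry" are illustrative values, not the
# actual zuul-jobs variable names or values.

def upload_image(image: str, promote_method: str) -> None:
    if promote_method == "retag":
        # Promotion will later retag this change tag to latest, so the gate
        # push is required.
        print(f"pushing {image} change tag")
    else:
        # Promotion copies the image out of the intermediate registry instead,
        # so a gate push would only leak change tags (the problem noted above).
        print(f"skipping push of {image} (promote method: {promote_method})")


if __name__ == "__main__":
    upload_image("quay.io/opendev/grafyaml:change_882724", "intermediate-registry")
```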
clarkb | maybe ianw and fungi can weigh in on the two changes and we can take it from there? | 19:10 |
fungi | sounds good | 19:10 |
clarkb | I'll use grafyaml's image as confirmation that whatever choice we make is functional | 19:10 |
clarkb | it is next in my list | 19:10 |
clarkb | anything else related to the container images moving to quay? | 19:11 |
clarkb | #topic Bridge Updates | 19:13 |
clarkb | #link https://review.opendev.org/q/topic:bridge-backups | 19:13 |
clarkb | This topic could still use an additional infra-root to sanity check it | 19:13 |
clarkb | Other than that I'm not aware of any changes to bridge. It seems to be happy | 19:13 |
clarkb | #topic Mailman 3 | 19:14 |
clarkb | fungi: looks like you have some updates on this one | 19:14 |
fungi | nothing exciting yet. new held node at 23.253.160.97 and the list filtering per-domain is working correctly by default, but there is a singular list domain name which gets displayed in the corner of the hyperkitty archive pages | 19:15 |
fungi | think i've figured out how that's piped in through the settings, though it also impacts the domain name used for admin communications the server sends out | 19:16 |
clarkb | yes I don't think that is going to be configurable unfortunately | 19:16 |
fungi | the current stack of changes tries to turn that default domain name into one which isn't one of the list domains, but i missed that setting i just found so it needs another revision | 19:16 |
clarkb | or at least not with current mailman 3. It has all of the info it needs to properly configure that though it would require updates to mailman to do so | 19:16 |
fungi | well, or it needs more advanced django site partitioning, below the mailman services layer | 19:17 |
clarkb | I think that template is in mailman itself | 19:17 |
fungi | postorius and hyperkitty are just delegating to django for that stuff, and this is bubbling up from django site info in settings and database | 19:18 |
fungi | so creating the sites in django and having per-site settings files would probably work around it | 19:18 |
fungi | but yes, using a singular settings.py is part of the problem, i think | 19:19 |
clarkb | mailman/hyperkitty/hyperkitty/templates/hyperkitty/*.html fwiw | 19:19 |
fungi | we could possibly just hide that string displayed in the webui | 19:19 |
fungi | to reduce confusion. though i still need to check whether it does the right thing with headers in messages | 19:19 |
fungi | which, with the held node, means fishing deferred copies out of the exim queue | 19:20 |
fungi | (and injecting them locally to start with) | 19:20 |
fungi | hopefully i'll get to that shortly. i'm coming out of a dark tunnel of other work finally | 19:21 |
fungi | anyway, i didn't have any real updates on it this week | 19:21 |
clarkb | let us know if/when we can help or review things | 19:21 |
clarkb | #topic Gerrit Updates | 19:21 |
fungi | will do | 19:21 |
clarkb | #link https://review.opendev.org/c/openstack/project-config/+/882075 last little bit of ACL cleanup work | 19:21 |
clarkb | This change (and its child) are unmerged due to documentation/ease of use concerns. I like the suggestion of having a tox target for it | 19:22 |
clarkb | since we expect people to be able to do that generally to run local tooling | 19:22 |
clarkb | Not urgent but also something we can probably clear out of the backlog quickly if that works for others | 19:23 |
fungi | would be easy to add a tox testenv to run that command in a followup change | 19:24 |
fungi | i'll try to knock that out really quick | 19:24 |
clarkb | thanks! | 19:24 |
clarkb | for replication tasks leaking I haven't seen any movement on the issues I filed. I'm || that close to creating a proper discord account to join the server to try and have a conversation about it (I think the matrix bridge may be losing messages and they are discussing using discord for the community meeting next month after May's was effectively cancelled due to no googler | 19:25 |
clarkb | being present to let people into the google meet...) | 19:25 |
clarkb | Assuming I can get through some higher priority items I can also spin up a gerrit dev env again and try to fix it myself. | 19:25 |
clarkb | The good news is the growth isn't insane. Currently just less than 16k files | 19:26 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/880672 Dealing with leaked replication tasks on disk | 19:26 |
clarkb | happy for feedback on my poor attempt at working around this locally too | 19:26 |
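As an aside, the sort of local workaround 880672 takes could look roughly like the sketch below, pruning replication task files that have sat on disk too long. The directory path and age threshold here are assumptions for illustration, not the values used in the actual change.

```python
#!/usr/bin/env python3
# Rough sketch of pruning leaked Gerrit replication task files on disk.
# The directory path and age threshold are assumptions for illustration,
# not the values used in change 880672.
import time
from pathlib import Path

TASK_DIR = Path("/home/gerrit2/review_site/data/replication/ref-updates/waiting")
MAX_AGE_SECONDS = 7 * 24 * 3600  # one week, chosen arbitrarily


def prune_leaked_tasks(task_dir: Path = TASK_DIR, max_age: int = MAX_AGE_SECONDS,
                       dry_run: bool = True) -> None:
    cutoff = time.time() - max_age
    for task in task_dir.glob("*"):
        if task.is_file() and task.stat().st_mtime < cutoff:
            print(f"{'would remove' if dry_run else 'removing'} {task}")
            if not dry_run:
                task.unlink()


if __name__ == "__main__":
    prune_leaked_tasks()
```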
clarkb | And finally for Gerrit we should plan to restart the server on the new image once we move things to quay. That will also pick up the theming changes that were done for 3.8 upgrade prep | 19:26 |
clarkb | #topic Upgrading older servers | 19:27 |
clarkb | We've upgraded all of the servers we had been working on and need to pick up some new ones | 19:28 |
clarkb | #link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades Notes | 19:28 |
clarkb | mirror nodes, registry, meetpad, etc could all be done. | 19:28 |
clarkb | I'll be looking at this once quay is off my plate. Help very much welcome if you have time to do some as well | 19:28 |
clarkb | This also ties into the docker-compose stuff we hit last week | 19:29 |
clarkb | basically if we can get all the container based services running on focal or newer it looks like we can pretty safely switch from pip docker-compose to distro docker compose | 19:30 |
* tonyb has some time to look at migrating some of the older servers | 19:30 | |
clarkb | there is also a newer docker compose tool written in go that we could switch to, but it changes container names and other things, so we need to be more careful with that one | 19:30 |
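For context on the naming concern: the Python docker-compose (v1) joins project, service, and index with underscores, while the Go "docker compose" v2 plugin defaults to hyphens (it offers a --compatibility flag to keep the old scheme), so anything that hard-codes the old names needs checking first. A tiny sketch:

```python
# Illustration of the container-name difference between docker-compose v1
# and the Go "docker compose" v2 plugin. Project/service names below are
# just examples.

def container_name(project: str, service: str, index: int = 1, v2: bool = True) -> str:
    sep = "-" if v2 else "_"
    return sep.join([project, service, str(index)])


print(container_name("gerrit", "mariadb", v2=False))  # gerrit_mariadb_1
print(container_name("gerrit", "mariadb", v2=True))   # gerrit-mariadb-1
```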
clarkb | tonyb: thanks! we can sync up on that and go through it. But one of the first things that can be done is updating the CI for a service to run against focal or jammy (jammy would be preferred) and ensuring everything is working there. Then an infra root can deploy a new server and add it to inventory | 19:31 |
clarkb | tonyb: you should be able to do everything up to the point of deploying the replacement server and adding it to inventory | 19:31 |
tonyb | clarkb: sounds good | 19:32 |
clarkb | tonyb: system-config/zuul.d/system-config-run.yaml is the interesting file to start on that as it defines the nodesets for each service under testing | 19:32 |
clarkb | #topic OpenAFS disk utilization | 19:32 |
clarkb | in unexpected news utilization is down slightly from last week | 19:33 |
clarkb | this is a good thing. It isn't drastic but it is noticeable on the grafana dashboard | 19:33 |
clarkb | I also started the discussion about winding down fedora. Either just the mirrors for the distro or the test images entirely | 19:33 |
tonyb | Is there a timeline we'd like to hit to begin winding things down? | 19:34 |
clarkb | So far I haven't seen any objections to removing the mirrors. Some libvirt and fedora folks are interested in keeping the images to help ensure openstack works with fedora and new virtualization stuff. But new virtualization stuff is being built for centos stream as well, so that's less important I think | 19:34 |
clarkb | tonyb: I think I'll give the thread a bump later today and maybe give it another week for feedback just to be sure anyone with an important use case and/or willingness to help hasn't missed it. But then I suspect we can start next week if nothing in the feedback changes | 19:35 |
clarkb | My main concern is moving too quickly and someone missing the discussion. 2 weeks seems like plenty to avoid that problem | 19:36 |
tonyb | Okay. I have it on my TODO for today to raise awareness inside RH, as well | 19:37 |
clarkb | I think even if we just drop the mirrors that would be a big win on the opendev side | 19:37 |
fungi | also if libvirt/fedora folks find it useful, then they clearly haven't been taking advantage of it for a while given it's not even the latest fedora any longer | 19:37 |
clarkb | rocky seems to do well enough without "local" mirrors since we don't run a ton of jobs on it and I think fedora is in a similar situation | 19:37 |
clarkb | but ya given the current situation I think we can probably go as far as removal. But we'll see where the feedback takes us | 19:38 |
tonyb | It can be a staged thing, right? | 19:38 |
clarkb | tonyb: yes we could start with mirror removal first as well. That won't solve fedora being 2 releases behind though | 19:38 |
tonyb | we can clean up the mirrors and then work on winding down Fedora and/or focusing on stream | 19:38 |
clarkb | step one would be configuring jobs to not use the mirrors, then delete the mirrors, then $somethingelse | 19:39 |
clarkb | yup | 19:39 |
tonyb | okay, got it | 19:39 |
fungi | if the objections to removing fedora images are really because they provide newer $whatever then i don't mind keeping it around, but that implies actually having newest fedora which we don't, and nobody but us seems to have even noticed that | 19:39 |
clarkb | we will free up 400 GB of disk doing that, or about 10% | 19:40 |
clarkb | will have a big impact | 19:40 |
tonyb | fungi: Yup I agree, seems like a theoretical objection | 19:41 |
clarkb | #topic Quo vadis Storyboard | 19:41 |
clarkb | There continues to be a steady trickle of projects moving off of storyboard | 19:41 |
clarkb | I haven't seen evidence of collaboration around tooling for that. I think the bulk of moves are just creating a line in the sand and switching | 19:42 |
clarkb | which is fine I guess | 19:42 |
fungi | nothing new to report on my end, but i have been making a point of switching projects to inactive and updating their descriptions to link to their new bug trackers. if anyone spots a "move off sb" change in project-config please make sure it's come to my attention | 19:42 |
clarkb | can do! | 19:42 |
fungi | i comment in them once i've done any relevant post-deploy cleanup | 19:43 |
fungi | just for tracking purposes | 19:43 |
clarkb | #topic Open Discussion | 19:44 |
clarkb | The wiki cert will need to be renewed. Historically I've done that with about 7 days remaining on it. | 19:44 |
clarkb | Apologies for the email it generates until then | 19:44 |
fungi | thanks for handling that | 19:45 |
clarkb | it is our last remaining non-LE cert :/ | 19:46 |
clarkb | but one cert a year isn't so bad | 19:46 |
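For anyone following along, a quick way to check how many days are left on a certificate before renewing could look like this sketch; the hostname below is a placeholder, not necessarily the wiki server's actual name.

```python
# Minimal sketch for checking a server certificate's remaining validity.
# The hostname is a placeholder for illustration.
import socket
import ssl
import time


def days_remaining(host: str, port: int = 443) -> float:
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    # notAfter looks like "May  9 19:49:53 2024 GMT"; the ssl module parses it.
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    return (expires - time.time()) / 86400


if __name__ == "__main__":
    print(f"{days_remaining('wiki.example.org'):.1f} days remaining")
```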
clarkb | Last call for anything else | 19:47 |
clarkb | Thank you everyone. genekuo tonyb feel free to ping me directly if I'm not reviewing things or if you have questions about where you can help. I'm definitely feeling like I've got too many things to think about at once right now, and I appreciate the help you have offered, so I don't want you to feel ignored | 19:49 |
clarkb | and with that I think we can end the meeting a little early | 19:49 |
clarkb | thanks again! | 19:49 |
clarkb | #endmeeting | 19:49 |
opendevmeet | Meeting ended Tue May 9 19:49:53 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:49 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2023/infra.2023-05-09-19.01.html | 19:49 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-05-09-19.01.txt | 19:49 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2023/infra.2023-05-09-19.01.log.html | 19:49 |
fungi | thanks clarkb! | 19:50 |
tonyb | Thanks everyone | 19:50 |