19:00:14 <clarkb> #startmeeting infra
19:00:14 <opendevmeet> Meeting started Tue Oct 29 19:00:14 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:14 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:14 <opendevmeet> The meeting name has been set to 'infra'
19:00:27 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/WYXMA7GIFIAQYHIISBVJ2JEHEWARQ4JN/ Our Agenda
19:00:31 <clarkb> #topic Announcements
19:00:38 <clarkb> #link https://www.socallinuxexpo.org/scale/22x/events/open-infra-days CFP for Open Infra Days event at SCaLE is open until November 1
19:00:48 <clarkb> That is this week so get your proposals in before it is too late if you are interested
19:01:30 <clarkb> Also Europe switched off of DST last weekend and North America is doing so this weekend. Australia did it at the beginning of October (though they went onto DST not off of it)
19:01:50 <clarkb> all that is a reminder that you should check your schedules and meetings and make sure that you don't have any conflicts and that times are correct
19:01:57 <frickler> yes, finally not quite so late meeting time for me :)
19:03:02 <clarkb> I also wanted to call out that November is a busy month for me outside of work (and probably for the rest of us in the USA as well). Next week we have our big election and then I'm out friday/monday ish (monday is a holiday) and then at the end of the month is Thanksgiving
19:03:26 <clarkb> I've looked at my calendar and I don't believe any of our regularly scheduled meetings will be cancelled as a result, but I just wanted to make sure people are aware of everything going on
19:04:21 <clarkb> anything else to announce?
19:05:20 <clarkb> #topic Zuul-launcher image builds
19:05:36 <clarkb> I believe the main thing here is adding more image build jobs
19:05:55 <clarkb> tonyb has previously volunteered to write those jobs. Not sure if tonyb is awake yet, but any questions or concerns?
19:07:38 <clarkb> I guess not. We can proceed and if there are questions later get back to them at the tail end of the meeting
19:07:43 <clarkb> #topic Backup Server Pruning
19:08:03 <clarkb> Last week I went ahead and cleaned up ethercalc02 backups on the vexxhost server. As far as I can tell automation didn't recreate them
19:08:10 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/933354 Documentation of cleanup process applied to ethercalc02 backups.
19:08:23 <fungi> oh, right, i was going to look at this today
19:08:25 <clarkb> I wrote up what I did in this change so that we can have a concrete set of steps and also review what I did
19:09:11 <clarkb> ianw has some comments in there I want to respond to as well
19:10:03 <clarkb> but if that looks good then we can proceed with additional cleanup
19:10:26 <clarkb> we're at 92% disk utilization on that backup server so we should consider regular pruning if this cleanup doesn't proceed super quickly
19:10:36 <clarkb> any other backup feedback/concerns/etc?
19:11:19 <clarkb> #topic Updating Gerrit Cache Sizes
19:11:49 <clarkb> Moving on, we discussed this last week as well and tl;dr is gerrit was complaining about caches being super small compared to their actual size and needing to do a lot of pruning on startup
19:11:58 <clarkb> we didn't restart anything last week due to the distractions of the ptg
19:12:05 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/932763 increase sizes for four gerrit caches
19:12:32 <clarkb> this week is much better for that. I can probably do that first thing tomorrow (or some other time tomorrow) if the change looks good
19:12:39 <fungi> seems like we could do it any time this week, yep
19:12:48 <fungi> tomorrow sounds fine to me
19:13:14 <frickler> +1
19:13:42 <clarkb> #topic Upgrading old servers
19:14:04 <clarkb> anything new on this topic? I haven't seen anything myself and I'm hoping I'll be able to start looking at gerrit upgrade stuff in November
19:15:03 <fungi> i don't think so
19:15:23 <clarkb> #topic Docker compose plugin with podman service for servers
19:15:58 <clarkb> I don't think there is anything new for this either and it is related to the previous topic in that picking a small service like paste and upgrading its server and also updating how we run docker/podman/compose in the process is the proposed path forward
19:16:17 <clarkb> I think we can get that all working in the CI jobs first too to get a first early set of feedback on top of what corvus has already done
19:18:25 <clarkb> ok we're going to make record time today
19:18:31 <clarkb> #topic Etherpad 2.2.6 Upgrade
19:18:39 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/933618 Upgrade Etherpad to version 2.2.6
19:18:49 <clarkb> Held node at 158.69.73.1. Need to test pad creator deletion and if it works with our no auth setup
19:19:02 <fungi> awesome
19:19:02 <clarkb> there is a new etherpad release which I have a change up for and a held node to do testing.
19:19:16 <clarkb> I haven't tested it yet and the only thing called out in the changelog is the new ability for pad creators to delete a pad
19:19:31 <clarkb> I want to use the held node to see if/how that works with our lack of authentication before we upgrade
19:20:07 <clarkb> after this meeting I'll eat lunch and then start looking at that. I'll also pop out for a bike ride so not sure if that happens immediately after the meeting or not yet
19:20:23 <clarkb> and it's a good time to upgrade etherpad with the ptg being over
19:21:16 <clarkb> #topic Open Discussion
19:21:18 <clarkb> Anything else?
19:21:29 <frickler> one thing to note is the recurring (twice at least so far) issue of zuul-registry ceasing to work
19:21:46 <clarkb> ya though the most recent one seemed to be at a host level maybe?
19:21:53 <frickler> corvus did a restart on friday and it worked for a day or so and today it was broken again
19:22:00 <fungi> not strictly opendev, but i have some ptgbot fixes up for review too which should get jobs for it passing again
19:22:01 <clarkb> If that happens again I wonder if we need to tcpdump and check if packages are reaching the server?
19:22:12 <clarkb> s/packages/packets/
19:22:42 <frickler> I still have a hold active for the sdk job
19:23:12 <frickler> so once it fails again, we can recheck the push command and see the output. currently it is hidden by no_log I think
19:23:28 <clarkb> ++ between tcpdump and service logs hopefully we can figure it out
19:24:09 <frickler> also maybe the swift container could use some cleanup
19:24:31 <frickler> about 4M objects and 16 TB of space
19:24:39 <clarkb> as noted the easiest thing may be to just switch to a new container, force changes to recheck if necessary then figure out deletion of the old container
19:25:03 <clarkb> or test the prune command and fall back to ^
19:25:42 <clarkb> frickler: maybe you can add this to the meeting agenda for next week and we can think it over this week/debug further if necessary then decide on a plan going forward?
19:25:59 <frickler> yes, I can do that
19:26:08 <clarkb> thanks!
19:26:40 <frickler> also a quick ref to the discussion on the tc meeting earlier about the issues with the zanata job
19:26:58 <frickler> just in case someone reading here might want to help with that
19:27:55 <clarkb> tl;dr on that is openstack is dropping python3.8 support but zanata requires java8 which runs on the python3.8 platform
19:28:21 <clarkb> so there is a mismatch in tooling. The ideal solution is to finish the weblate migration and leave zanata behind but I think the tc said there would be followup on the i18n list?
19:28:33 <fungi> (to be clear, ubuntu focal is the last lts to provide java8, and its default python is 3.8)
19:28:42 <frickler> well requirements has dropped py3.8 specific constraints from u-c on the master branch, to be more specific
19:29:03 <fungi> yeah, stable branch translation jobs are still working as a resuly
19:29:04 <fungi> result
19:29:55 <frickler> and yes, switching to weblate would be the best path going forward IMO
19:31:38 <clarkb> also meetpad did upgrade
19:31:41 <clarkb> and it still works
19:32:51 <clarkb> I'll leave things open until 19:35 but if there is nothing else we can stop at that point
19:33:52 <fungi> oh, as mentioned earlier in #opendev, the recent mailman upgrade makes it so list owners can now delete posts and threads, so we no longer need a super user intervention for spam cleanup
19:34:19 <frickler> I also tested the admin interface fwiw, seems to work fine
19:34:50 <frickler> I noticed there are some bounce handling options, maybe we can discuss activating that for some lists?
19:35:14 <frickler> I can add it to the agenda for next time, so ppl have time to think about it
19:35:18 <clarkb> frickler: you mean things like auto unsubscribe?
19:35:22 <clarkb> ++ to adding that to the agenda
19:35:56 <frickler> clarkb: yes, auto unsubscribe, but with tunable parameters
19:36:11 <frickler> https://lists.openstack.org/mailman3/lists/openstack-discuss.lists.openstack.org/settings/bounce_processing
19:36:21 <clarkb> ack would be a good topic for next week
19:36:27 <clarkb> I'll go ahead and end here now
19:36:31 <clarkb> thank you everyone for your time
19:36:44 <clarkb> and we'll be back next tuesday same time and location (though relative to your local timezone it may shift)
19:36:46 <clarkb> #endmeeting