19:00:14 <clarkb> #startmeeting infra
19:00:14 <opendevmeet> Meeting started Tue Oct 29 19:00:14 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:14 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:14 <opendevmeet> The meeting name has been set to 'infra'
19:00:27 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/WYXMA7GIFIAQYHIISBVJ2JEHEWARQ4JN/ Our Agenda
19:00:31 <clarkb> #topic Announcements
19:00:38 <clarkb> #link https://www.socallinuxexpo.org/scale/22x/events/open-infra-days CFP for Open Infra Days event at SCaLE is open until November 1
19:00:48 <clarkb> That is this week so get your proposals in before it is too late if you are interested
19:01:30 <clarkb> Also Europe switched off of DST last weekend and North America is doing so this weekend. Australia did it at the beginning of October (though they went onto DST not off of it)
19:01:50 <clarkb> all that is a reminder that you should check your schedules and meetings and make sure that you don't have any conflicts and that times are correct
19:01:57 <frickler> yes, finally not quite so late meeting time for me :)
19:03:02 <clarkb> I also wanted to call out that November is a busy month for me outside of work (and probably for the rest of us in the USA as well). Next week we have our big election and then I'm out friday/monday ish (monday is a holiday) and then at the end of the month is Thanksgiving
19:03:26 <clarkb> I've looked at my calendar and I don't believe any of our regularly scheduled meetings will be cancelled as a result, but I just wanted to make sure people are aware of everything going on
19:04:21 <clarkb> anything else to announce?
19:05:20 <clarkb> #topic Zuul-launcher image builds
19:05:36 <clarkb> I believe the main thing here is adding more image build jobs
19:05:55 <clarkb> tonyb has previously volunteered to write those jobs. Not sure if tonyb is awake yet, but any questions or concerns?
19:07:38 <clarkb> I guess not. We can proceed and if there are questions later get back to them at the tail end of the meeting
19:07:43 <clarkb> #topic Backup Server Pruning
19:08:03 <clarkb> Last week I went ahead and cleaned up ethercalc02 backups on the vexxhost server. As far as I can tell automation didn't recreate them
19:08:10 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/933354 Documentation of cleanup process applied to ethercalc02 backups.
19:08:23 <fungi> oh, right, i was going to look at this today
19:08:25 <clarkb> I wrote up what I did in this change so that we can have a concrete set of steps and also review what I did
19:09:11 <clarkb> ianw has some comments in there I want to respond to as well
19:10:03 <clarkb> but if that looks good then we can proceed with additional cleanup
19:10:26 <clarkb> we're at 92% disk utilization on that backup server so we should consider regular pruning if this cleanup doesn't proceed super quickly
19:10:36 <clarkb> any other backup feedback/concerns/etc?
19:11:19 <clarkb> #topic Updating Gerrit Cache Sizes
19:11:49 <clarkb> Moving on we discussed this last week as well and tl;dr is gerrit was complaining about caches being super small compared to their actual size and needing to do a lot of pruning on startup
19:11:58 <clarkb> we didn't restart anything last week due to the distractions of the ptg
19:12:05 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/932763 increase sizes for four gerrit caches
19:12:32 <clarkb> this week is much better for that. I can probably do that first thing tomorrow (or some other time tomorrow) if the change looks good
19:12:39 <fungi> seems like we could do it any time this week, yep
19:12:48 <fungi> tomorrow sounds fine to me
19:13:14 <frickler> +1
19:13:42 <clarkb> #topic Upgrading old servers
19:14:04 <clarkb> anything new on this topic? I haven't seen anything myself and I'm hoping I'll be able to start looking at gerrit upgrade stuff in November
19:15:03 <fungi> i don't think so
19:15:23 <clarkb> #topic Docker compose plugin with podman service for servers
19:15:58 <clarkb> I don't think there is anything new for this either and it is related to the previous topic in that picking a small service like paste and upgrading its server and also updating how we run docker/podman/compose in the process is the proposed path forward
19:16:17 <clarkb> I think we can get that all working in the CI jobs first too to get a first early set of feedback on top of what corvus has already done
19:18:25 <clarkb> ok we're going to make record time today
19:18:31 <clarkb> #topic Etherpad 2.2.6 Upgrade
19:18:39 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/933618 Upgrade Etherpad to version 2.2.6
19:18:49 <clarkb> Held node at 158.69.73.1. Need to test pad creator deletion and if it works with our no auth setup
19:19:02 <fungi> awesome
19:19:02 <clarkb> there is a new etherpad release which I have a change up for and a held node to do testing.
19:19:16 <clarkb> I haven't tested it yet and the only thing called out in the changelog is the new ability for pad creators to delete a pad
19:19:31 <clarkb> I want to see if/how that works with our lack of authentication before we upgrade using the held node
19:20:07 <clarkb> after this meeting I'll eat lunch and then start looking at that. I'll also pop out for a bike ride so not sure if that happens immediately after the meeting or not yet
19:20:23 <clarkb> and it's a good time to upgrade etherpad with the ptg being over
19:21:16 <clarkb> #topic Open Discussion
19:21:18 <clarkb> Anything else?
19:21:29 <frickler> one thing to note are the recurring (twice at least so far) issues with zuul-registry stopping to work
19:21:46 <clarkb> ya though the most recent one seemed to be at a host level maybe?
19:21:53 <frickler> corvus did a restart on friday and it worked for a day or so and today it was broken again
19:22:00 <fungi> not strictly opendev, but i have some ptgbot fixes up for review too which should get jobs for it passing again
19:22:01 <clarkb> If that happens again I wonder if we need to tcpdump and check if packages are reaching the server?
19:22:12 <clarkb> s/packages/packets/
19:22:42 <frickler> I still have a hold active for the sdk job
19:23:12 <frickler> so once it fails again, we can recheck the push command and see the output. currently it is hidden by no_log I think
19:23:28 <clarkb> ++ between tcpdump and service logs hopefully we can figure it out
19:24:09 <frickler> also maybe the swift container could use some cleanup
19:24:31 <frickler> about 4M objects and 16 TB of space
19:24:39 <clarkb> as noted the easiest thing may be to just switch to a new container, force changes to recheck if necessary then figure out deletion of the old container
19:25:03 <clarkb> or test the prune command and fallback to ^
19:25:42 <clarkb> frickler: maybe you can add this to the meeting agenda for next week and we can think it over this week/debug further if necessary then decide on a plan going forward?
19:25:59 <frickler> yes, I can do that
19:26:08 <clarkb> thanks!
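Note on the Swift container figures quoted by frickler above: as a rough sanity check, the two numbers imply an average object size of about 4 MB. The sketch below only does that division. It assumes decimal units (1 TB = 10^12 bytes) and treats both figures as approximate; the function name is illustrative and not part of any OpenDev tooling.

```python
def average_object_size_mb(total_bytes: float, object_count: int) -> float:
    """Return the mean object size in megabytes (decimal, 10**6 bytes)."""
    if object_count <= 0:
        raise ValueError("object_count must be positive")
    return total_bytes / object_count / 1e6


# Rough figures quoted in the meeting; both are approximations.
TOTAL_BYTES = 16e12        # "16 TB", assuming decimal terabytes
OBJECT_COUNT = 4_000_000   # "about 4M objects"

if __name__ == "__main__":
    size = average_object_size_mb(TOTAL_BYTES, OBJECT_COUNT)
    print(f"{size:.1f} MB per object on average")
```

An average in the low megabytes is only an estimate of scale; it says nothing about how sizes are distributed or how many objects are still referenced.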
19:26:40 <frickler> also a quick ref to the discussion on the tc meeting earlier about the issues with the zanata job
19:26:58 <frickler> just in case someone reading here might want to help with that
19:27:55 <clarkb> tl;dr on that is openstack is dropping python3.8 support but zanata requires java8 which runs on the python3.8 platform
19:28:21 <clarkb> so there is a mismatch in tooling. The ideal solution is to finish the weblate migration and leave zanata behind but I think the tc said there would be followup on the i18n list?
19:28:33 <fungi> (to be clear, ubuntu bionic is the last lts to provide java8, and its default python is 3.8)
19:28:42 <frickler> well requirements has dropped py3.8 specific constraints from u-c on the master branch, to be more specific
19:29:03 <fungi> yeah, stable branch translation jobs are still working as a resuly
19:29:04 <fungi> result
19:29:55 <frickler> and yes, switching to weblate would be the best path going forward IMO
19:31:38 <clarkb> also meetpad did upgrade
19:31:41 <clarkb> and it still works
19:32:51 <clarkb> I'll leave things open until 19:35 but if there is nothing else we can stop at that point
19:33:52 <fungi> oh, as mentioned earlier in #opendev, the recent mailman upgrade makes it so list owners can now delete posts and threads, so we no longer need a super user intervention for spam cleanup
19:34:19 <frickler> I also tested the admin interface fwiw, seems to work fine
19:34:50 <frickler> I noticed there's some bounce handling options, maybe we can discuss activating that for some lists?
19:35:14 <frickler> I can add it to the agenda for next time, so ppl have time to think about it
19:35:18 <clarkb> frickler: you mean things like auto unsubscribe?
19:35:22 <clarkb> ++ to adding that to the agenda
19:35:56 <frickler> clarkb: yes, auto unsubscribe, but with tunable parameters
19:36:11 <frickler> https://lists.openstack.org/mailman3/lists/openstack-discuss.lists.openstack.org/settings/bounce_processing
19:36:21 <clarkb> ack would be a good topic for next week
19:36:27 <clarkb> I'll go ahead and end here now
19:36:31 <clarkb> thank you everyone for your time
19:36:44 <clarkb> and we'll be back next tuesday same time and location (though relative to your local timezone it may shift)
19:36:46 <clarkb> #endmeeting