19:01:46 <clarkb> #startmeeting infra
19:01:46 <opendevmeet> Meeting started Tue Jan 24 19:01:46 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:46 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:46 <opendevmeet> The meeting name has been set to 'infra'
19:01:52 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/SAMQHW6WCCF4LKQ2IADJ4VJGZZENI72D/ Our Agenda
19:01:58 <clarkb> #topic Announcements
19:02:16 <clarkb> I sent email last week and made the Service Coordinator nomination period that begins on January 31, 2023 official
19:02:22 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/32BIEDDOWDUITX26NSNUSUB6GJYFHWWP/
19:03:13 <clarkb> I'll send a reminder email on the 31st that things have opened up too
19:03:55 <clarkb> #topic Bastion Host Updates
19:04:09 <clarkb> I suspect there hasn't been much change here after the week we had last week...
19:04:37 <clarkb> But on the todo list we need to shutdown the old bridge and clean it up when we are satisfied doing so is fine
19:04:42 <clarkb> #link https://review.opendev.org/q/topic:bridge-backups
19:05:07 <clarkb> We need to review ^ that stack of changes (I need to personally do it, but anything around backups and encryption demands time and attention and I haven't had a ton of that recently)
19:05:09 <ianw> oh right, yes everyone has signed off on that
19:05:34 <clarkb> and then finally once we've dealt with those items we can start looking at parallel infra-prod jobs again:
19:05:36 <clarkb> #link https://review.opendev.org/q/topic:prod-bastion-group Remaining changes are part of parallel ansible runs on bridge
19:06:25 <ianw> (i've shutdown the old bridge. it's already in emergency. i'll work on inventory changes, etc.)
19:06:42 <clarkb> thanks!
19:06:45 <clarkb> anything else to add to this topic?
19:06:56 <ianw> not today!
19:07:16 <clarkb> #topic Mailman 3
19:07:40 <clarkb> fungi: I know we've all been underwater with various security things, you more than others. Did you manage to make any progress on the outstanding mailman3 items?
19:07:51 <clarkb> For those following along these are the major todo items that I recall:
19:07:59 <clarkb> We need a service restart to set the site_owner config
19:08:03 <fungi> i've started catching back up on this, i added some initial notes to the bottom of https://etherpad.opendev.org/p/mm3migration in the todo section
19:08:13 <fungi> oh, right, i should add the restart
19:08:25 <clarkb> We need to figure out domain vhosting and likely change domain configuration in the mm3 django install to do this
19:08:34 <clarkb> and we need to fix the root email alias on the server
19:09:44 <fungi> i think all that's captured in the pad now
19:09:44 <clarkb> #link https://etherpad.opendev.org/p/mm3migration live todo list for mailman3 work.
19:09:51 <clarkb> excellent. Anything else to mention on this topic?
19:10:25 <fungi> nope, my focus will be the restart (should be able to do that after the meeting) and troubleshooting the job failures on 867987
19:10:45 <clarkb> sounds good, thanks. Again let me know if I can help
19:10:51 <fungi> sure thing
19:10:51 <clarkb> #topic Gerrit Updates
19:11:09 <clarkb> This has sort of morphed out of the Gerrit 3.6 post upgrade task tracking into a bigger set of items
19:11:17 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/870114 Add Gerrit 3.6 -> 3.7 Upgrade test job
19:11:52 <clarkb> this change is a post upgrade item. ianw I responded to your comment there and basically indicated I feel like punting on that for now is ok/desirable since we aren't trying to fully automate the gerrit upgrade in production (yet)
19:12:11 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/870874 Convert Gerrit to run on our base python image
19:12:40 <clarkb> This one looks super straightforward, but is actually fairly involved. It turns out that the old openjdk images on dockerhub aren't really something we should use going forward.
19:13:19 <clarkb> I've elected to address that by switching gerrit over to our base python image and installing java from debian repos. The reason for this is it will allow us to update python on those images to 3.10 or 3.11 for jeepyb in a straightforward manner
19:13:42 <clarkb> Debian bullseye includes java 11 (what we currently run on) and java 17 (what we'll eventually move to) which is nice too
19:13:48 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/870877 Run Gerrit under Java 17
19:14:28 <clarkb> this change is a followup to the previous change that switches us from java 11 to 17 as gerrit 3.6 release notes say 17 is fully supported as of 3.6. That said, I have had to add a workaround for a bug in running gerrit under java 17 (the bug is linked to in that change)
19:14:52 <clarkb> I should write the gerrit mailing list today asking them about that because "fully supported" and "use this workaround for the jvm" seem to be in conflict with one another
19:15:25 <clarkb> And finally ianw has a change to convert us away from deprecated copy conditions in 3.6 (this needs to be done before we upgrade to 3.7 along with other things like conversion to submit-requirements)
19:15:27 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/867931 Cleaning up deprecated copy conditions in project ACLs
19:16:29 <clarkb> I need to review ^ that one and will try to do that today. Then we can land that when we're happy with the state of acls generally (which I think we may already be)
19:16:38 <clarkb> Any other Gerrit updates?
19:16:50 <ianw> yeah i think we're fine on that -- to just log what happened i re-loaded the acl's per
19:16:59 <ianw> #link https://etherpad.opendev.org/p/760YNeM5OEFS1hlr7bE5
19:17:27 <ianw> someone else should probably double check the logs but the only "errors" were for projects that were retired and in R/O mode
19:18:01 <ianw> i saw some discussion on that in #opendev and wondering if we should still make the change to jeepyb to stop on acl failures
19:18:16 <ianw> i think we can, because we won't try to load ACL's for retired projects (normally?)
19:18:33 <clarkb> ianw: we will if we haven't already cached that we've updated them
19:18:38 <ianw> although i guess it does mean we can't run the mass reload?
19:18:42 <clarkb> which is my concern since the cache may not persist forever
19:18:45 <clarkb> ya exactly
19:18:49 <ianw> yeah
19:19:13 <clarkb> I think we need to handle errors for updating RO projects or remove RO projects from projects.yaml or something if we want to make errors more forceful
19:19:29 <clarkb> I'm open to ideas if people want to leave them in that jeepyb review
19:19:38 <clarkb> but it's not something we can land as is so I WIP'd it
19:20:40 <clarkb> I can use your captured error logging for hints too
19:20:47 <clarkb> I should go look at that for multiple reasons :)
19:21:20 <ianw> ok, yeah something to think about. they're all using the retired acl file
19:21:26 <fungi> i suppose our projects.yaml cleanup job could also propose removals of read-only projects $somehow
19:21:38 <ianw> i guess we need to probe though if it's been applied ...
19:22:13 <clarkb> well we know the retired acl does apply cleanly which would imply anything using that acl that fails is very likely to have failed due to being ro
19:22:30 <clarkb> I want to say gerrit says something like "you can't modify this RO project" which we could use as an indication to ignore too
19:23:10 <ianw> yeah it says
19:23:11 <ianw> openstack-attic_compute-api.txt- ! [remote rejected] HEAD -> refs/meta/config (prohibited by Gerrit: project state does not permit write)
19:23:35 <clarkb> cool, I can work on a new patchset that checks for that error specifically and only ignore it if that is returned else error
19:23:44 <ianw> ++
19:24:24 <clarkb> anything else related to Gerrit before we go to the next thing?
19:25:36 <clarkb> #topic Gitea 1.18 upgrade
19:25:50 <clarkb> yesterday we did a minor upgrade from 1.17.3 to 1.17.4
19:26:08 <clarkb> That was in preparation for an upgrade to 1.18.x which has now made it to 1.18.3
19:26:10 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/870851 Upgrade to 1.18.3
19:26:27 <clarkb> There is a held node against the child change of ^ that can be used to preview things. It seems like it is working happily.
19:27:04 <clarkb> reviews and double checking the changelog and the held node are much appreciated. I'm happy to watch that go in
19:27:21 <clarkb> (this is something that has been on my todo list since like December so will be glad to have it done :) )
19:27:45 <ianw> ++ will do
19:28:13 <clarkb> #topic Pruning backups on the rax server
19:28:36 <fungi> yeah, i keep meaning to get to that
19:28:38 <clarkb> The rax backup server is warning us that we're at 92% of capacity and we should run the pruning tool
19:28:42 <fungi> doesn't seem dire yet
19:28:52 <clarkb> fungi: oh thanks.
19:29:03 <clarkb> I wanted to bring it up here just to make sure it didn't get completely forgotten under the fun of last week :)
19:29:24 <ianw> yeah i can clear that out
19:29:53 <clarkb> sounds like we've got a couple volunteers so should get done soon enough. Thanks again
19:30:00 <clarkb> #topic Linaro Cloud Updates
19:30:06 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/871196 Remove old linaro cloud from Nodepool
19:30:32 <clarkb> I've reviewed that stack and I think it can go in whenever we are ready. Also note that the ssl cert for that cloud expires in 2 days so sooner than later is a good idea
19:31:03 <clarkb> I know ianw is actively debugging use of the new cloud, but any chance you can give us a quick update on the modifications you made to that cloud?
19:31:56 <ianw> yep, basically we have 2tb to play with on the cloud, but it was all assigned to a cinder pool
19:32:10 <frickler> do we need a delay between the above cleanup patches?
19:32:38 <clarkb> frickler: yes, ideally we land the first one then wait until nodepool is done cleaning everything up before landing the second
19:33:07 <frickler> o.k., I just approved the first and will only review the next one, then
19:33:45 <clarkb> thanks!
19:34:13 <ianw> anyway, i deleted that cinder pool, and made it only 150gb which is enough room for the cache volume we attach to the mirror node
19:34:50 <ianw> the rest of the storage i just attached as a regular file system, and moved the glance image storage/libvirt storage into volumes mounted from there
19:35:29 <frickler> the new mirror seems to need another deployment run to recreate some things on the new volume
19:35:32 <ianw> so, basically we have enough room now to store our uploaded images and run i think as many vm's as we have floating ip's for
19:36:18 <ianw> frickler: ahh, that may be, yes i did delete its cache volume and recreate it. i probably should have done a manual run of the mirror deployment against it. will double check soon
19:36:35 <clarkb> sounds like good progress. Anything else to add on the arm cloud migration?
19:37:45 <clarkb> #topic Upgrading servers
19:37:58 <clarkb> this is reasonably high on my todo list but things like git security patching took precedence...
19:38:13 <clarkb> I'm hoping once I've got my backlog of gitea and gerrit things out of the way I'll be able to focus on this more
19:38:19 <clarkb> No real updates on this one.
19:38:28 <clarkb> #topic Quo vadis Storyboard
19:38:33 <clarkb> and same story on this topic :(
19:39:06 <clarkb> Which takes us to the end of our scheduled agenda
19:39:10 <clarkb> #topic Open Discussion
19:39:19 <clarkb> There are a couple things I wanted to mention here.
19:39:50 <clarkb> First is that we discovered gitea does cross repo searching (only on the primary branch) similar to hound. This caused us to wonder if we could drop hound as a result, but ianw pointed out that hound does regex search and gitea does not
19:40:12 <clarkb> That said I've been using it for simple searches and it seems to work well.
19:40:48 <clarkb> And second I've pushed tox -> nox conversion changes for bindep, jeepyb, git-review, and system-config now. For at least some of these (git-review) I don't think tox is working at all.
19:40:50 <fungi> and it sounds like the underlying search library gitea uses supports regex, so it may just be simple glue/ui patching to gitea to add that?
19:40:50 <ianw> yeah, i use both all the time :)
19:41:22 <clarkb> fungi: yes there is opportunity to improve gitea to expose regex searching as both bleve (the default we use) and elasticsearch appear to support regex searches
19:41:43 <ianw> i think hound is pretty low maintenance, i personally would like to keep it. it probably doesn't need its own host as it does now, but not sure where else it would live
19:42:16 <clarkb> ianw: ya I think as long as it has more functionality keeping it makes sense
19:42:41 <fungi> if we get approximate feature parity in gitea, then dropping one more redundant ancillary service will still be good though
19:42:45 <clarkb> ++
19:43:23 <fungi> we can always host a static redirect from the codesearch name to gitea's explore search
19:43:30 <frickler> I like that I can just type "co" in the browser address bar and it will autocomplete and then focus in the search field
19:44:14 <frickler> two more things from me:
19:44:21 <clarkb> I've been trying to do the nox stuff when I've got a hole in my schedule that is too short to really dive into more involved tasks. I haven't seen any really strong reactions either way on nox. Please say something if you think I'm wasting my time and I'll try to fill those odd blocks of time with something else.
19:44:26 <fungi> https://opendev.org/explore/code also seems to focus on the search field, so a redirect would preserve that experience
19:44:58 <frickler> ah, o.k.
19:45:09 * frickler will be holidaying the upcoming two weeks and mostly be offline
19:45:22 <frickler> also I didn't get to add AFS to the agenda as promised. seems there was another cleanup in fedora, so we are good for now except maybe for some centos+ubuntu quota adjustments
19:46:00 <fungi> thanks for keeping an eye on that
19:46:00 <clarkb> ya I almost added it, but the dashboard looks pretty good right now so figured it could wait
19:46:44 <clarkb> frickler: I hope you are able to do something fun for your holidays
19:46:54 <fungi> i think i'm moments away from having updated screenshots to see if the latest revision of the donor logos addition works
19:47:56 <fungi> though looks like the gitea image build takes a while, so it will probably still spill over into my next meeting
19:48:29 <frickler> well as much fun as it gets with the weather and everything
19:50:30 <clarkb> Oh one last thought, should we offer to help debian with the git stuff since we already worked through much of it? fungi already gave them pointers, but I'm worried there hasn't been any movement there after a week. I just don't know what all might be involved particularly since that package isn't in salsa?
19:51:23 <fungi> i've been keeping an eye on https://repo.or.cz/w/git/debian.git/ and haven't seen any movement on the debian branches there either
19:51:43 <frickler> maybe ask some debian people like zigo or kevko for their judgement?
19:52:00 <fungi> that sounds reasonable
19:52:36 <fungi> i'm not a dd so couldn't nmu anything without a sponsor, but also git is central enough to so many things i'd be a little uneasy being the one to nmu that anyway
19:52:59 <clarkb> frickler: excellent idea
19:53:00 <ianw> i am a dd ... but would probably not do that for git! :)
19:53:10 <ianw> well not without consent, anyway
19:53:28 <ianw> but ... if we find the right people happy to help
19:54:11 <fungi> ianw: oh! you're a dd? i have something else i need official dds to weigh in on at some point, but it's not directly related to opendev
19:54:21 <fungi> i'll follow up with you later on it
19:55:05 <ianw> haha well yeah, i maintain a few things; i was much more active back in itanium days on ia64 things
19:55:33 <clarkb> ianw: when I worked at intel we actually had some of those racked up doing I forget what
19:55:35 <ianw> but ... well that's a slice of history now :)
19:55:51 <ianw> clarkb: probably making a lot of noise and heat
19:55:54 <clarkb> ha
19:56:11 <clarkb> sounds like that is everything for today. I've got another meeting in a few minutes so stopping here and getting time between would be great
19:56:14 <clarkb> thanks everyone!
19:56:25 <ianw> thanks clarkb!
19:56:27 <clarkb> both for your time today and all the hard work everyone does to make this machine roll forward
19:56:32 <clarkb> #endmeeting
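
Editor's note: during the Gerrit topic (19:23:35) clarkb proposed a jeepyb patchset that ignores a failed ACL push only when Gerrit's stderr contains the read-only rejection ianw quoted at 19:23:11, and errors on anything else. A minimal sketch of that idea follows; the function names and the stderr-matching approach are illustrative assumptions, not jeepyb's actual code.

```python
# Hypothetical sketch (not jeepyb's real implementation): when pushing an
# ACL to HEAD:refs/meta/config fails, treat it as ignorable only if Gerrit
# rejected the push because the project is read-only/retired. Gerrit's
# rejection for that case, per the log ianw captured, contains this text:
RO_REJECTION = "prohibited by Gerrit: project state does not permit write"

def is_readonly_rejection(push_stderr: str) -> bool:
    """Return True only for Gerrit's read-only project rejection."""
    return RO_REJECTION in push_stderr

def handle_acl_push_failure(project: str, push_stderr: str) -> None:
    """Skip read-only projects; make every other failure a hard error."""
    if is_readonly_rejection(push_stderr):
        print(f"skipping {project}: project state is read-only")
    else:
        raise RuntimeError(f"ACL update failed for {project}: {push_stderr}")
```

Matching the exact rejection string keeps the "stop on acl failures" behavior discussed in the meeting forceful for real errors, while letting a mass reload pass over retired projects that still appear in projects.yaml.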