19:01:06 <clarkb> #startmeeting infra
19:01:06 <opendevmeet> Meeting started Tue Feb 15 19:01:06 2022 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:06 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:06 <opendevmeet> The meeting name has been set to 'infra'
19:01:12 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2022-February/000319.html Our Agenda
19:01:18 <clarkb> #topic Announcements
19:01:21 <ianw> o/
19:01:40 <clarkb> As discussed last week I sent email to the list explicitly for the service coordinator nomination period. I haven't seen any nominations yet. Does that mean I'm "it" again?
19:03:01 <clarkb> I'll make it official after lunch today if no one else indicates interest
19:03:28 <clarkb> #topic Actions from last meeting
19:03:35 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2022/infra.2022-02-08-19.01.txt minutes from last meeting
19:03:42 <clarkb> #action frickler propose mergeability check reenablement change
19:03:55 <clarkb> I don't see that change yet, but also know frickler isn't around today. Hopefully soon though
19:04:36 <clarkb> #topic Topics
19:04:43 <clarkb> #topic Improving OpenDev's CD throughput
19:05:02 <clarkb> This item and the next one continue to be deprioritized due to other things for me :/
19:05:25 <clarkb> ianw: not sure there are any updates directly related to this though your gpg encrypted logs setup may be related
19:05:44 <clarkb> did you want to talk about that really quickly? Or is it still in refinement mode?
19:05:57 <fungi> it looked like a really neat idea
19:06:10 <fungi> i think he said it was ready for evaluation?
19:06:12 <ianw> it's ready for review.  it falls under the general heading of "making bridge not special"
19:06:16 <ianw> #link https://review.opendev.org/q/topic:bridge-encrypt-logs
19:06:53 <clarkb> the tldr is encrypt ansible logs with root gpg keys so that they can be uploaded as zuul artifacts?
19:07:24 <ianw> yes, but i think the key is not just root gpg keys -- anyone could add a key per prod job
19:07:36 <clarkb> ah neat
19:08:27 <ianw> but for root users who can already see the logs, the benefit is that instead of having to dig into bridge.o.o:/var/logs/... when we see a failed job, we can grab the logs for that run directly
19:09:19 <clarkb> right, it's more consistent with the typical zuul experience. My only concern was I had to fiddle with gpg recently to test signed tag pushes with Gerrit and gpg really didn't want to work headlessly. But I imagine most of the time this will be done on my desktop/laptop and I won't notice
19:09:28 <clarkb> (mostly me just grumping that gpg doesn't work without a display)
19:10:14 <ianw> that is indeed correct, and there are about 8-10 changes in the encrypt-files role where i was figuring out how to get gpg to import a public key and trust it headlessly :)  there are some comments around that section about several things that *don't* work :)
19:11:13 <clarkb> This is a neat idea. Thank you for putting it together. Anything else on this topic?
19:11:18 <ianw> in terms of decrypting -- i repurposed the download logs script we have that pulls things via the manifest
19:11:37 <ianw> as long as you have gpg-agent setup, it "just works" to decrypt; you run that script and it does it all for you
19:12:01 <ianw> but you're welcome to just grab the .gpg file directly too
19:12:30 <ianw> that is all, we can discuss in reviews, thanks! :)
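For context on the headless gpg discussion above, one approach that generally works non-interactively is to run gpg in batch mode, import the public key, and either record its ownertrust or pass --trust-model always when encrypting. This is only an illustrative sketch with made-up file names and a placeholder fingerprint, not necessarily what the encrypt-files role ends up doing:

    # import the recipient's public key without prompting (hypothetical file name)
    gpg --batch --no-tty --import prod-job-log.pub
    # mark the key as ultimately trusted via ownertrust (placeholder fingerprint)
    echo "<FINGERPRINT>:6:" | gpg --batch --import-ownertrust
    # encrypt the log for that recipient; --trust-model always avoids trust prompts
    gpg --batch --yes --trust-model always --recipient "<FINGERPRINT>" \
        --output ansible-run.log.gpg --encrypt ansible-run.log

Decrypting afterwards only needs the matching private key available via gpg-agent, e.g. gpg --decrypt ansible-run.log.gpg, which is what the repurposed download-logs script wraps for you.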
19:12:41 <clarkb> #topic Container Maintenance
19:12:58 <clarkb> No real movement here. I keep forgetting to do the limnoria update on a thursday afternoon when meetings don't happen
19:13:13 <clarkb> And then haven't had time to dig into the dedicated user work. But I really do want to :/
19:13:26 <clarkb> #topic Nodepool image cleanup
19:13:37 <clarkb> Fedora 34 is now completely gone including its mirror content
19:14:12 <clarkb> I did notice that we still have the centos-8 python wheel mirror, but that content is relatively small. I'm not too worried about it
19:14:36 <clarkb> I think we can call this topic done for now? frickler looked into xenial cleanup and there is a bit of work to be done
19:14:36 <ianw> oh, i can clean that up
19:14:48 <clarkb> #link https://etherpad.opendev.org/p/ubuntu-xenial-jobs Ubuntu Xenial cleanup
19:15:00 <clarkb> I think we need to push on ^ those items a bit before we are anywhere near removing the image
19:15:06 <clarkb> I did try to add notes about things there though
19:15:45 <clarkb> This went pretty smoothly even with the shorter than expected centos-8 mirror grace period
19:15:55 <clarkb> Thank you to everyone that updated jobs and helped with this cleanup
19:16:16 <clarkb> #topic New Nodepool Images
19:16:23 <clarkb> This is a meeting chair addition to the agenda :)
19:16:29 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/828435 Add rocky linux images to nodepool
19:16:45 <clarkb> I think that change is about ready to go. I half suspect we'll get image build failures and then pause the image while we sort through them
19:16:58 <clarkb> If we are ok with that iterative process reviews would be good
19:17:29 <clarkb> and I think we'll run these without mirrors and see how that does
19:17:40 <ianw> just on the previous one; i have a couple of low-priority py3 and centos wheel build updates with
19:17:43 <ianw> #link https://review.opendev.org/q/topic:8-stream-wheel
19:18:22 <clarkb> thanks
19:18:36 <ianw> i plan to do 9-stream, that's why i'm dropping the virtualenv requirement
19:19:04 <ianw> the goal for rocky is more kolla and things, rather than full devstack runs?
19:19:33 <clarkb> ianw: yes aiui kolla specifically is asking for it. I think it would be up to the qa team or new volunteers to step up and do devstack work
19:20:09 <clarkb> ianw: given the way the TC describes platform support, it might make sense for them to, say, replace centos stream with rocky/euler/alma
19:20:21 <fungi> though always possible that goal will expand in scope if there are more centos stream regressions
19:20:23 <clarkb> (the real target is rhel 8 and those three may be more representative?)
19:21:15 <clarkb> I do think it would be good to avoid having 3 different rhel8 clones (this rocky change gets us to 2). If it becomes an issue we should discuss with interested parties why we'd need the whole set
19:21:17 <fungi> right, part of why openstack yoga is being tested with python 3.6 even though centos stream 9 is out with newer python is that there is no rhel 9 yet
19:22:09 <fungi> openstack zed may end up in the same situation
19:22:19 <clarkb> But ya after centos-8-stream stopped working at the beginning of the year for a bunch of jobs and they didn't revert nor push a fix I can understand why people want something more stable
19:22:29 <clarkb> so having an option or two like euler and rocky is a good idea imo
19:22:34 <fungi> they want there to be at least one release overlap where a user can move from rhel 8 to rhel 9 while running the same version of openstack
19:23:12 <fungi> so we're probably looking at openstack wanting to test zed on a rhel 8 clone as well, the way rhel 9's schedule seems to be progressing
19:23:39 <clarkb> ya I think from our position we're trying to enable interested parties rather than prescribe what they should test
19:23:59 <clarkb> in this case people are interested in replacing stream with a proper clone to get closer to what people are likely to use in production aiui
19:24:06 <clarkb> (kolla specifically)
19:25:00 <fungi> though it's worth noting, on the openstack development pain points front, continuing to test with python 3.6 when lots of libs on pypi are dropping support for it is causing a lot of headache
19:25:17 <clarkb> anyway I just wanted to call out the change and the fact that we might have to pause builds and iterate. If we are ok with that then we can proceed. If we want more upfront vetting then we'll have to sort out local builds
19:25:57 <ianw> i think it's close enough that submitting and pausing if failing is reasonable
19:26:08 <fungi> that seems fine to me
19:26:15 <clarkb> me as well I've already +2'd
19:26:19 <clarkb> and thanks for listening :)
19:26:29 <ianw> (both close enough to working by itself, and close enough theoretically to things that already work :)
19:26:58 <clarkb> #topic Cleaning up old reviews
19:27:05 <clarkb> #link https://etherpad.opendev.org/p/opendev-repo-retirements List of repos to retire. Please double check
19:27:20 <clarkb> Based on this list I went ahead and pushed some changes to start the retirement process
19:27:25 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/829119 Removes unused repos from integration testing.
19:27:32 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/829121 Sets noop jobs on unused repos so that they can be retired.
19:28:06 <clarkb> The first one has the reviews it needs as does its parent. I guess we approve that when we are able to watch it for any unexpected puppet fallout
19:28:15 <clarkb> I can do that probably tomorrow (today is a busy one already)
19:28:35 <clarkb> Then the second change ensures we can push up and land all the retire this repo changes in the repos themselves
19:29:06 <clarkb> If you notice something doesn't look right please let me/us know. Cleanup like this is always a little scary, particularly since our test coverage for the puppet bits is not as good
19:30:08 <clarkb> #topic Gitea 1.16.1
19:30:15 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/828184 Change to upgrade to 1.16.1 when we are ready
19:30:20 <clarkb> #link https://104.130.74.7:3081/opendev/system-config Test site via held node here
19:30:43 <clarkb> ianw has looked this over and called out one weird behavior that seemed to correct itself. I suspect a race in when the information is queried and the state of classification of the repo contents
19:31:15 <ianw> yeah, it was odd that it changed when i was looking, as the host had been up for a few days.  i guess looking triggers some sort of refresh
19:31:58 <clarkb> If others can look at it carefully that would be helpful as the gitea 1.16 changelog is quite large. The thing I'm most concerned about is ssh functionality as they specifically call out breaking changes to that. 828184 attempts to accommodate the ssh changes and updates our testing to push via ssh to mimic gerrit replication
19:32:59 <clarkb> #topic Gerrit Gitea Weblinks
19:33:28 <clarkb> I've spent a good chunk of the last day and a half figuring out how to java better. I'm hopeful the fixups for this upstream are now in a mergeable state
19:33:50 <clarkb> #link https://gerrit-review.googlesource.com/c/gerrit/+/329279 Allow for sha1 to be specified in filelinks
19:33:54 <clarkb> #link https://gerrit-review.googlesource.com/c/plugins/gitiles/+/330361 related gitiles plugin update
19:34:04 <clarkb> Assuming those land we can rebuild our gerrit and restart with some gitea link config
19:34:41 <clarkb> I got a separate bug fix for the gerrit ls-members --recursive ssh command landed, which a rebuild would now pick up if we want to do that (it isn't urgent now that we know the bug exists and can check via http instead)
19:35:51 <ianw> so it's already in 3.4 branch?
19:35:53 <clarkb> It is nice to see that upstream is responsive to these issues and is willing to guide us through fixing them
19:36:04 <clarkb> ianw: ya ls-members was fixed on 3.3 and the merge to 3.4 happened yesterday
19:36:15 <clarkb> the other two are proposed to 3.4
19:36:57 <ianw> ++; i'm happy to restart later this afternoon if we have a change bumping it
19:37:19 <clarkb> I haven't pushed a change for that yet. I also don't think it is urgent if we just want to wait for the gitea link work to hopefully land
19:37:34 <clarkb> until then do group membership listings via rest or the web ui :)
19:38:00 <clarkb> #topic Open Discussion
19:38:02 <clarkb> Anything else?
19:38:15 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/829141 improve haproxy checks for gitea
19:38:45 <clarkb> fungi called out that our old checks weren't quite right as apache could be up and gitea could be down and haproxy would think the service was still up. This change attempts to address that by doing http checks which should check both apache and gitea are functional as a unit
19:40:01 <fungi> yesterday i moved production zanata and refstack over to authenticating with id.openinfra.dev instead of openstackid.org, in case anyone hears of problems logging into those
19:40:54 <fungi> i haven't touched the keycloak poc yet, because we might want to redeploy it soonish anyway, i'm guessing, as we put more of the configuration under management
19:42:39 <clarkb> Sounds like that may be it?
19:42:48 <fungi> i'm still a teensy bit worried i missed something on refstack since i had to resort to using sed on a mysqldump to work around its use of openid fields as foreign key constraints, but so far people seem to be having no trouble with it
19:42:58 <clarkb> that's good
19:43:09 <fungi> and no, i didn't have anything else at the moment
19:43:32 <clarkb> I think we can call it there and we can all go find food :) I need to get ready for my next meeting.
19:43:38 <clarkb> thank you everyone. We'll see you here next week
19:43:42 <fungi> thanks!
19:43:43 <clarkb> #endmeeting