Tuesday, 2022-02-01

clarkbAnyone else here for the meeting?19:00
clarkbI'm still getting things together, but we'll start momentarily19:00
clarkb#startmeeting infra19:01
opendevmeetMeeting started Tue Feb  1 19:01:05 2022 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
opendevmeetThe meeting name has been set to 'infra'19:01
ianwo/19:01
fricklero/19:01
fungiohai19:01
clarkb#link http://lists.opendev.org/pipermail/service-discuss/2022-January/000316.html Our Agenda19:02
clarkb#topic Announcements19:02
clarkbService coordinator nominations run January 25, 2022 - February 8, 2022  You have another week :)19:02
clarkbAs always let me know if you have questions about that and I'd be happy to answer them19:02
clarkbOpenInfra Summit CFP needs your input: https://openinfra.dev/summit/19:03
clarkbIf you'd like to talk at the open infra summit there is a ci/cd track as well as other tracks you may be interested in proposing towards. I think you have until february 9 for that19:03
clarkbAnd finally Zuul v5 released today! The culmination of much long term planning and effort. Thank you everyone who helped make that possible19:03
clarkbside note our zuul install still says v4.12.something but we're running the same commits that were tagged v519:04
clarkb#topic Actions from last meeting19:04
clarkb#link http://eavesdrop.openstack.org/meetings/infra/2022/infra.2022-01-25-19.01.txt minutes from last meeting19:05
clarkbThere were no actions recorded19:05
clarkb#topic Topics19:05
clarkb#topic Improving Opendev's CD throughput19:05
clarkb#link https://review.opendev.org/c/opendev/infra-specs/+/821645 -- spec outlining some of the issues with secrets19:05
clarkb#link https://review.opendev.org/c/opendev/system-config/+/821155 -- sample of secret writing; more info in changelog19:05
clarkbUnfortunately the Gerrit upgrade and server patching had me far more distracted last week than I would've liked19:06
clarkbI haven't had time to look at these yet. They are still on my todo list though...19:06
clarkbmaybe I should make an action for everyone to review those :)19:06
clarkb#action infra-root Review OpenDev CD throughput related spec for secrets management: https://review.opendev.org/c/opendev/infra-specs/+/82164519:06
clarkbianw: is there anything else to add to this topic?19:06
ianwno, no work has been done on this one 19:07
clarkb#topic Container Maintenance19:08
clarkb#link https://etherpad.opendev.org/p/opendev-container-maintenance19:08
clarkbMy time for this last week was largely sidelined by server patching19:08
clarkbI don't really have anything new to add to this unfortunately.19:08
clarkb#topic Nodepool Image Cleanup19:08
clarkbChanges to remove CentOS 8 have been pushed as promised by the end of January. However, at least one project (OSA) is still struggling with removing centos 8 so we can hold off until they are ready since they are actively working to correct this19:09
clarkbOnce projects like OSA are ready we can land the changes in this order:19:09
clarkb#link https://review.opendev.org/c/opendev/base-jobs/+/82718119:09
clarkb#link https://review.opendev.org/c/openstack/project-config/+/82718419:09
clarkb#link https://review.opendev.org/c/opendev/system-config/+/82718619:09
ianwso we found yesterday that the centos mirror infrastructure stopped returning links19:10
clarkbIt looks like centos 8 upstream is starting to archive itself which is causing some issues here and there and people will be motivated to start moving19:10
clarkbya that19:10
ianwyeah, so things are really only working because we run in a little mirror bubble19:11
fungibless this bubble19:11
ianwi've stopped the image builds (https://review.opendev.org/c/openstack/project-config/+/827195) because they will fail19:12
ianwif upstream modifies their mirror bits and we rsync that on a run, then we will be totally broken19:13
ianwso tbh i feel like we could probably pull the images now, and if jobs fail people need to switch them to 8-stream and make them non-voting if it doesn't work ootb and fix it19:13
fungithe problem with dropping our images is that if something happens we can't upload them again19:14
clarkbianw: do we know why the rsyncing hasn't broken yet?19:14
clarkbI think I'm ok with leaving this up until projects migrate and removing sooner if the upstream infrastructure is no longer tenable19:15
fungioh, i see what you mean, drop the images in our providers and stop providing centos-8 nodes, full stop19:15
ianwclarkb: when i last checked, they hadn't moved the 8/ directories into vault.centos.org19:15
ianwbut that may happen at any time i guess, which would make them disappear from the mirror and we'd pull that19:16
funginot that we should have to handhold anyone, but i know we've been focused mostly on openstack's use of centos-8 nodes... does anyone happen to know if starlingx is also impacted? (or did they never finish moving off centos-7?)19:16
fricklermirror.centos dropped like 90G in size at around 11:30 today19:16
fungiand yeah, this is probably good to make folks on openstack-discuss aware of. centos-8 is going away even if we do nothing. your jobs are breaking today, sorry!19:17
clarkbfungi: ++ maybe the thing to do is respond to my thread that warned people about the removal. Indicate that centos-8 doesn't work if you talk to upstream anymore and as a result we're going to remove things?19:18
clarkbjrosser isn't here but was one who wanted to keep them up19:20
ianwahh, yeah, http://mirror.iad.rax.opendev.org/centos/8/os isn't there ...19:20
ianwso that might have happened overnight19:20
clarkbianw: I assume os/ includes important packages :)19:20
ianw(or today, depending on how you look at it :)19:20
clarkbI definitely think we should accelerate the removal given ^19:20
ianwlooks like from the logs it largely cleared itself out @ 2022-02-01T10:43:44,980764868+00:0019:22
clarkbanyone want to volunteer to respond to the thread? The changes should be ready to go once we're ready19:23
ianwi can chase up on it, reply and merge those things through today19:24
clarkbthank you19:24
fungii've got a few other deadlines looming so probably can't give it the immediate attention it deserves19:24
fungithanks ianw!19:24
clarkb#topic Cleaning up old reviews19:25
clarkb#action clarkb to produce a list of repos that can be retired. We can then retire the repos and abandon their open reviews as step 019:25
clarkbI'll go ahead and record this now as an explicit action for my todo list. I think we still start with ^ which should take out a chunk of reviews then reevaluate when this is done19:25
clarkbfrickler: anything else to add to this topic?19:25
fricklernope, didn't do anything on that yet19:25
clarkb#topic Gerrit mergeability checking19:26
clarkbWhen we upgraded to gerrit 3.4 we lost mergeability checking by default. Gerrit disabled this functionality by default as it can use a disproportionate amount of resources to calculate merge conflicts19:26
clarkbThe functionality is still in Gerrit though and we can opt into it via a config switch19:26
clarkbA few users have mentioned that the information was useful to them.19:27
fungiparticularly folks using it to omit unmergeable changes from their review dashboards, seems like19:27
clarkbI'm not opposed to reenabling the functionality but do have some minor concerns. The biggest is that this will likely make reindexing projects take longer now. But we were ok with that in the past.19:27
clarkbOther concerns are that gerrit has a tendency to remove functionality entirely after disabling it by default so we may have to accept it going away one day. But no automated tooling relies on this functionality so the cost remains with humans. If they remove it entirely we should be fine other than having some sad users19:28
clarkbThere is also the potential that we'd be exposing ourselves to bugs in that functionality since few other users are going to use it. If that happens we can always disable it again19:28
fricklermaybe we could send them (gerrit devs) feedback that we would like not to lose that functionality?19:29
clarkbAll that to say I think despite my concerns there are good solutions should the concerns become a problem which means I'm ok with reenabling this19:29
clarkbfrickler: ya that is another option. Basically "we're toggling to non default here please keep it working"19:29
fungii'm also fine with bringing it back if someone proposes a patch19:29
ianw++ i agree with having it on, i always found it useful, and also agree with sending some feedback that we're turning it on19:30
fricklerI can look at doing a patch, since I'm one of the users who like to have it19:30
ianwit seems like we could probably create conflicting changes and push them during the testing at least19:30
clarkbI guess there is the small matter of whether or not we need to offline reindex after enabling it. But I suspect that it will just start adding the info to new changes19:30
clarkbfrickler: thank you!19:30
clarkband ya our testing should largely cover major concerns with enabling it19:31
ianwit would be super cool to check that with selenium but probably just seeing in a screenshot is enough19:31
fricklerclarkb: can you #action me on that or can I do that myself?19:31
fricklerjust so I don't forget it19:31
clarkbfrickler: you should be able to do it yourself.19:31
frickler#action frickler propose patch to re-enable Gerrit mergeability checking19:32
clarkb(the bot doesn't give a lot of feedback though so I guess we'll find out after the meeting is done)19:32
clarkb#topic Gerrit issues we are tracking19:32
clarkbFirst up is the regression with gerrit ignoring signed tag acls for pushing tags. My patch to fix this which I tested manually on a held test node landed upstream and we have restarted Gerrit with that code and removed our workaround19:33
clarkbWe are just waiting on someone to push a signed tag and confirm it is happy now. Once that is done I'll merge the 3.4 fix into gerrit 3.5 as well19:33
clarkbNext is url text substitution for gitweb links doesn't provide the hash value for ${commit} in all cases and gitea needs that19:34
clarkb#link https://bugs.chromium.org/p/gerrit/issues/detail?id=1558919:34
clarkbfungi and I are working to test a fix that I pushed upstream. Still no review comments though19:34
clarkbOne neat thing we are trying to do with our testing though is depends-on against upstream gerrit and running that code in our test jobs19:35
fungialso i'm somewhat blocked on zuul-client autohold's --change option not working as advertised19:35
clarkbIt seems to work from what I've seen so far19:35
fungii'm testing a workaround with --ref instead, but we're probably going to need to fix zuul-client to be able to continue doing change-specific autoholds (or go back to the rpc client in the meantime)19:35
clarkbAnd finally yesterday we noticed that git pulls over ssh can backlog in gerrit where the tcp connection is made and gerrit recognizes there is a pull waiting but the tasks remain in waiting and are not processed by a thread19:36
clarkbIf this happens long enough and the backlog grows eventually it leads to Zuul being very backlogged with its mergers19:36
clarkb#link https://bugs.chromium.org/p/gerrit/issues/detail?id=1564919:36
clarkbUpstream asked for a thread dump which we have. We just need to audit it for any over exposure of sensitive info19:37
clarkbI'll try to work on that I guess since I've been working upstream with Gerrit more and more19:37
clarkb#topic Open Discussion19:38
clarkbAnything else?19:38
fricklernot sure where to discuss, but lp is nearing the 2000000 bug count19:39
fricklerso overlap with storyboard ids will happen19:39
clarkbfun. I'm not sure I personally grasp the impact of that. fungi would probably know better19:39
fungithat's a good reminder19:39
fricklerwe will lose the option to migrate existing bugs keeping their ids19:40
fungiit basically means we can no longer migrate projects from lp to sb and expect a 1:1 correlation between imported bug numbers19:40
fungiyeah, exactly19:40
clarkbI see19:40
clarkbsolvable but with degraded ease of migration19:40
clarkb(since we would have to map to new numbers)19:40
fricklerbut then I also don't see a tendency to further do migrations to sb19:40
fungiwe can probably do some logic to uprev any imported bugs in the 20k+ range and continue to import earlier reports the way we did in the past19:41
clarkbya thats a good point19:41
clarkbIf anything it seems like projects are looking at github issues more than anything else19:41
fungiand asking us to turn on gitea's issues feature, yeah19:41
clarkbthank you for calling that out19:42
fungi(but that means fixing the clustering problem, account management, and a host of other challenges)19:42
clarkbAs a general heads up my availability over the next week may be spotty. I'm going to do my best to be around but not sure what my availability will be like.19:42
clarkbfungi: yup not an easy task.19:42
ianwhave upstream fixed the clustering issues?19:43
ianwor not so much fixed, but added?19:43
clarkbyes I think the elasticsearch backend is there. Not sure if it will work with opensearch though19:44
clarkbunfortunate that the gitea effort happened while elasticsearch became less open but at least in theory we could run gitea with an opensearch cluster, a mariadb cluster, and a shared cephfs fs19:45
ianwinteresting ... another one for the todo list :)19:45
ianwall that running together ... sounds like kubernetes might fit in ...19:46
fungithat's what we tried to use the first time!19:46
clarkbya that was the original goal with gitea19:46
clarkbbut when we realized the indexes weren't distributed it wasn't tenable until that got fixed (and they addressed that with the elasticsearch backend option)19:46
fungii think that kubernetes cluster might still exist even, but we'd almost certainly want to rebuild it from scratch if we take it in that direction19:47
clarkb++19:47
fungiif memory serves, we also ran into trouble with rook19:47
clarkbfiguring out how to manage a k8s cluster is probably step 019:48
fungibut that's almost certainly improved in the meantime19:48
fungiyeah, managing ceph within kubernetes was a struggle back then19:48
clarkbsince there are many options and every option we tried previously had its downsides (magnum didn't do upgrades via the api and you couldn't upgrade directly because there wasn't enough disk to grab two copies of the k8s images)19:48
clarkbWe don't need to solve that in this meeting though. But if people want to investigate that again now might be a good time to start looking into it19:49
clarkbSounds like we may be winding down. I'll give it a couple more minutes for any last minute items then call it a meeting19:50
clarkbThank you everyone!19:52
clarkb#endmeeting19:52
opendevmeetMeeting ended Tue Feb  1 19:52:33 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:52
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2022/infra.2022-02-01-19.01.html19:52
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-02-01-19.01.txt19:52
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2022/infra.2022-02-01-19.01.log.html19:52
fungithanks clarkb!19:53
