19:01:12 #startmeeting infra
19:01:12 Meeting started Tue May 24 19:01:12 2022 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:12 The meeting name has been set to 'infra'
19:01:19 #link https://lists.opendev.org/pipermail/service-discuss/2022-May/000337.html Our Agenda
19:01:23 o/
19:01:28 \o
19:01:28 #topic Announcements
19:01:41 A reminder that the Summit starts in 2 weeks, running June 7-9
19:01:55 I think we should go ahead and cancel the meeting on June 7 (I don't anticipate being able to attend)
19:02:16 i support that
19:02:23 who is going there? I'm not
19:02:37 I think fungi and myself will be there
19:03:02 if others want to hold a meeting that tuesday i'm not opposed, i'll just catch up from the meeting log
19:03:22 ya that's fine too. I won't be able to run it
19:04:15 maybe we can decide next week?
19:04:19 sure
19:04:57 Also keep the event in mind when making changes to things like etherpad, which will be used by the forum
19:05:49 #topic Actions from last meeting
19:05:54 #link http://eavesdrop.openstack.org/meetings/infra/2022/infra.2022-05-17-19.01.txt minutes from last meeting
19:06:14 This wasn't recorded explicitly as an action but I do recall saying I would get it done this week. wiki.openstack.org has a new cert now. Thank you fungi for installing it
19:06:23 The email spam about that should go away now
19:06:29 yay
19:06:35 thanks for ordering it
19:07:25 #topic Improving CD throughput
19:07:54 With the summit and everything else going on recently (jammy and afs and pip and so on) this has taken a backseat for me when I meant to work on some related items
19:08:02 Hopefully next week I'll have made progress
19:08:27 Did anyone else have anything to mention on this topic?
19:09:54 Sounds like no.
We can continue
19:09:59 #topic Container Maintenance
19:10:12 ianw did end up testing the mariadb upgrade as part of the gerrit 3.5 upgrade
19:10:39 ianw: I left comments on the etherpad but was curious if you tried the container image env var or if you decided not to use it for some reason?
19:10:59 Just trying to figure out if running the upgrade command by hand is preferable to using the automation built into the container image for some reason
19:11:44 sorry, haven't looped back to that yet. i can check on the env var way of doing things
19:11:47 either way the process looked straightforward, and while we aren't in a rush to do this due to mariadb support periods it is probably good hygiene
19:11:51 ianw: thanks
19:13:35 #topic Support For Jammy Jellyfish
19:13:46 frickler has made some more progress on spinning up the wheel mirrors for jammy
19:13:57 #link https://review.opendev.org/c/openstack/project-config/+/842841 add jobs to publish to openafs
19:14:12 I think that is the last remaining piece? frickler do we need to create a new openafs volume for that or is that done already?
19:14:37 I think I did that, let me check
19:15:13 I guess it is two volumes. One for each architecture
19:15:52 seems I created the volumes, but didn't mount them
19:15:55 doesn't seem mounted @ http://mirror.iad.rax.opendev.org/wheel/
19:17:13 maybe we review and +2 the change, then you can +A when it is ready on the afs side?
19:17:13 will need to check that later
19:17:35 i can push it through today if you like to get it going
19:18:55 cool, I'm sure we'll sort it out. Thank you for getting this together
19:19:03 #topic Gerrit 3.5 upgrade planning
19:19:11 #link http://lists.opendev.org/pipermail/service-announce/2022-May/000039.html Scheduled for 20:00 UTC June 19, 2022
19:19:18 the announcement for the upgrade went out.
19:19:22 #link https://etherpad.opendev.org/p/gerrit-upgrade-3.5
19:19:27 ianw also has notes on that etherpad.
19:19:42 I've reviewed them and tried to leave notes/questions/comments for stuff but this seems pretty straightforward
19:20:02 the main thing I think we're missing now is explicit config to enable collision checking. We should be able to land that on 3.4 and gerrit will ignore it until 3.5
19:20:08 ianw: ^ is there a change for that yet?
19:21:27 no sorry, there's a couple of those todo items that need changes. i will do those
19:21:47 #link https://etherpad.opendev.org/p/gerrit-upgrade-3.5
19:21:50 is what we're talking about
19:22:41 feel free to ping me when they're up. Happy to review that stuff and I tend to have just enough gerrit context paged in to do a decent job :)
19:23:04 Anything else to bring up on the gerrit upgrade?
19:24:34 #topic Move requirements propose-updates job to py3.8 + 3.9
19:24:55 frickler: want to fill us in on this one? I've reviewed at least one change but I think I'm lacking some context
19:26:29 I believe what is going on is openstack is trying to update requirements handling for newer python
19:26:45 and there were issues in the first attempt because all the needed python interpreters weren't installed
19:26:47 That should be fixed now
19:27:21 In the process frickler discovered that enqueuing periodic jobs manually is not straightforward and was asking how to do that properly
19:27:50 I want to say we have to use enqueue-ref of master for periodic jobs but periodic triggers have been weird. corvus is there anything else to know about that?
19:28:28 clarkb: i don't know off hand either way, sorry
19:29:36 I'll give it a couple more minutes in case frickler is still around to add anything I missed
19:30:11 i have deja vu on starting periodic jobs
19:30:45 http://lists.zuul-ci.org/pipermail/zuul-discuss/2019-May/000909.html
19:30:51 Thinking out loud, the script to dump queues is probably a good one to cross reference
19:31:00 i don't think i ever really found the right thing
19:31:02 since it will produce enqueue commands for the periodic pipelines
19:31:26 the difference seems to be a "commit 00000" added to the manual trigger
19:31:40 and then jobs end up in error
19:32:43 yeah, i think, and from re-reading that message, you need to set newrev as the HEAD
19:32:56 sounds like this problem deserves more debugging. I wonder if we can use the zuul client testing or a similar setup to get testing going for it and then encode that in docs?
19:34:07 this might be a good topic to resurrect on the zuul mailing list if there continues to be confusion
19:34:12 the other thing I noticed: the docs talk about adding the --trigger option
19:34:23 but our zuul-client doesn't know that option
19:35:02 the old built-in zuul enqueue commands did know trigger but I think it became unnecessary because each pipeline has a specific trigger already
19:35:10 i think that must be cruft, i haven't needed to specify it. in theory zuul should just get it from the config
19:35:30 o.k., so just a docs update needed then
19:35:34 right, so zuul-client just dropped it as unnecessary where it was required with the old zuul command
19:35:35 yeah, trigger is dead. the other stuff certainly warrants more debugging and should be testable within zuul itself.
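The manual enqueue discussed above might look like the following sketch. The tenant, pipeline, and project names are illustrative placeholders, not taken from the log; the one concrete detail from the thread is that --newrev must carry the branch HEAD the timer trigger would normally fill in, otherwise jobs end up in error. This only builds and prints the command rather than running it against a live Zuul:

```shell
# Sketch: assemble the manual enqueue command for a periodic pipeline.
# Per the discussion, pass the current branch HEAD as --newrev; without
# it the event carries "commit 00000" and the jobs error out.
# Tenant/pipeline/project names here are illustrative assumptions.
periodic_enqueue_cmd() {
  # $1 = project, $2 = branch ref, $3 = current HEAD sha for that ref
  echo "zuul-client enqueue-ref --tenant openstack --pipeline periodic" \
       "--project $1 --ref $2 --newrev $3"
}

# In practice the sha would come from e.g. `git rev-parse origin/master`;
# 1234abcd is a placeholder.
periodic_enqueue_cmd openstack/requirements refs/heads/master 1234abcd
```

Once the printed command looks right, it can be copied and run by hand on the scheduler host.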
19:36:29 I wanted to test on my local zuul but didn't get to that yet
19:37:28 i'm having some more memories, based on http://lists.zuul-ci.org/pipermail/zuul-discuss/2019-May/000914.html
19:37:47 which was all rpc client at the time, so things might have changed a bit
19:38:53 but there wasn't a way to create an event that looked *exactly* like the timer trigger remotely, as that didn't fill in the oldrev/newrev
19:39:39 but iirc setting the newrev value to the current HEAD was enough to make it work
19:39:55 makes sense. And if we have time to add a test for that in zuul somewhere that would be great
19:40:10 but i never became convinced enough this was the real solution to update the docs
19:40:11 thx for the pointer, I'll test that @home
19:41:53 #topic Zuul changing default Ansible to v5
19:42:15 Zuul supports ansible v5 now. Eventually v5 will become the default ansible version and then sometime after that old EOL ansible versions will stop being supported by zuul
19:42:37 What that means is now is the time to start sorting out Ansible v5 support so that we can hopefully uplift onto that version in opendev for our tenants.
19:43:03 we've found a number of issues doing that uplift so far. In particular the include task is no longer valid. You need to use include_* or import_*
19:43:40 We've also had issues where ansible running as zuul wants to become another unprivileged user and this fails because ansible can't setfacl or chown/chmod the temporary files it copies to do work on the remote
19:43:52 it is looking like ensuring that setfacl is installed may be the fix for this
19:44:14 installed on the job nodes
19:44:21 re setfacl: oy ... that's such a weird thing to have changed....
19:45:11 i've been working through some stuff for the zuul tenant, mostly around container jobs at this point. not quite done yet.
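One way to find playbooks affected by the first issue above (bare "include" tasks no longer being valid under Ansible 5, per the discussion) is a simple grep over the job content. The function name and regex below are illustrative, not from the log:

```shell
# Sketch: list YAML files that still use a bare "include:" task, which
# the discussion above notes is no longer valid under Ansible 5; each
# hit needs converting to include_tasks/import_tasks (or include_role
# and friends). Helper name and pattern are this editor's assumptions.
find_bare_include() {
  # $1 = directory tree of playbooks/roles to scan
  grep -rl --include='*.yml' --include='*.yaml' \
    -E '^[[:space:]]*-?[[:space:]]*include:' "$1"
}
```

Usage would be e.g. `find_bare_include playbooks/`, then rewriting each reported task as `include_tasks:` (dynamic) or `import_tasks:` (static) as appropriate.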
19:45:46 ya, reading the code diff it isn't clear to me how old ansible worked and new ansible fails if setfacl isn't installed
19:45:57 it looks like we'll be dropping support for focal in the ensure-podman role, because there is no longer any place (that i know of) to get podman for focal
19:46:01 but installing it seems to make new ansible happy so that is probably a reasonable thing for us to add to our base images
19:46:12 we previously relied on kubic for that, but it's now gone...
19:46:51 that may be something to keep in mind for future potential uses of kubic...
19:47:01 yeah, i just saw that they have restored that
19:47:08 https://github.com/containers/podman/issues/14336
19:47:13 they note that it has known security bugs though
19:47:21 not sure we should set users up with that by default
19:47:23 but an old version. it's definitely a better idea to switch to jammy for that
19:47:29 oh neat
19:47:44 (side note, this is why tools like docker end up being so popular. It's the solaris problem all over again... access and simplicity)
19:47:54 i guess once we get the wheels sorted we will have a fully operational deathstar^W jammy environment
19:47:59 well, the current version of my change is just to throw an error if you use ensure-podman on focal... that may still be the best approach....
19:48:37 probably not a bad idea. i'm not sure there's a lot of users
19:49:02 system-config-run jobs are happy with new ansible
19:49:36 I'm wondering if we should start forcing ansible v5 on in places like that and slowly just keep adding to the list. Or is that problematic because if v6 happens we have to go and update all those explicit entries?
19:50:07 i really like keeping it out of job defs because of ^
19:51:14 yeah, that just seems like a way of inflicting pain on our future selves
19:51:19 I guess the ideal then is to test what we can, set a date to update to v5 by default (we can do that in opendev by tenant before zuul changes) and then anything that breaks after gets fixed
19:51:20 i think in the past, we've gotten like 90% confidence with targeted testing, then switch the default and folks can pin to the old version if something comes up...
19:51:39 ya ok. Maybe end of june is a good target for that?
19:51:44 yeah, pin or fix depending on severity :)
19:52:18 (thinking we want it to be long enough for people to give testing a go before we switch, and with summit going on that means some time, but also early enough we don't majorly impact say the openstack release cycle)
19:52:43 sounds good. i don't know what zuul's schedule will end up being, but probably not before end of june. but it doesn't really matter because opendev can set the tenant defaults regardless of what zuul does -- just as long as we communicate and know when it's happening
19:52:53 yup
19:53:19 and for setfacl I think adding that to our base images is reasonable. It's a system utility that helps bootstrap ansible.
19:53:32 agree
19:53:33 rather than force everyone to go and install the package themselves if they are using become
19:54:06 (also, facls are awesome and should just be installed everywhere anyway)
19:54:36 fyi devstack passed with acl installed https://zuul.opendev.org/t/openstack/build/113f568178fc4012b7cb862713a48130
19:55:26 yup, I suspect if we update our base images this problem largely goes away
19:55:49 probably infra-package-needs is the thing to modify
19:56:01 I can try to collect all the different setfacl package names later today and push a change for that
19:56:05 except maybe for some weird distro ...
but we'll see
19:56:52 #topic Open Discussion
19:57:05 We are almost out of time so want to give any other business a quick opportunity
19:57:31 [i plan to be at the summit]
19:58:37 cool. It will be an interesting experience to do a conference again.
19:59:03 interesting is what i'm worried about! :)
19:59:31 * fungi hopes it remains safely boring
20:00:18 and we are at time
20:00:20 thanks everyone
20:00:20 thanks clarkb!
20:00:22 #endmeeting