clarkb | Meeting time in a couple minutes | 18:59 |
---|---|---|
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Feb 7 19:01:06 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/QJK7E7D7HG5ZNT4UE7T5QIQ5TARIAXP6/ Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
clarkb | The service coordinator nomination period is currently open. You have until February 14 to put your name into the hat. I'm happy to chat about it if there is interest too before any decisions are made | 19:02 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/32BIEDDOWDUITX26NSNUSUB6GJYFHWWP/ | 19:02 |
clarkb | Also, I'm going to be out tomorrow (just a heads up) | 19:02 |
clarkb | #topic Topics | 19:04 |
clarkb | #topic Bastion Host Updates | 19:04 |
clarkb | #link https://review.opendev.org/q/topic:bridge-backups | 19:04 |
clarkb | I truly feel bad for not getting to this. I should schedule an hour on my calendar just for this already. But too many fires keep coming up | 19:04 |
clarkb | ianw: fungi: were there any other bastion host updates you wanted to call out? | 19:05 |
fungi | i don't think so | 19:05 |
ianw | sorry, woke up to a dead vm, back now :) | 19:06 |
clarkb | you haven't missed much. Just wanted to make sure there wasn't anything else bastion related before continuing on | 19:06 |
ianw | no changes related to that this week | 19:06 |
clarkb | #topic Mailman 3 | 19:06 |
clarkb | The restart of containers to pick up the new site owner email landed and fungi corrected the root alias email situation | 19:07 |
fungi | current state is that i need to work out how to create new sites in django using ansible so that the mailman domains can be associated with them | 19:07 |
clarkb | Fixing the vhosting is still a WIP though I think fungi roughly understands the set of steps that need to be taken and now it's just a matter of figuring out how to automate django things | 19:07 |
fungi | and yeah, this is really designed to be done from the django webui. if i were a seasoned django app admin i'd have a better idea of what makemigrations could do to ease that from the command line | 19:08 |
clarkb | I wonder if we've got any of those in the broader community? Might be worth reaching out to the openstack mailing list? | 19:09 |
fungi | but it's basically all done behind the scenes by creating database migrations which prepopulate new tables for the site you're creating | 19:09 |
fungi | databases were never my strong suit to begin with, and db migrations are very much a black box for me still. django seems to build on that as a fundamental part of its management workflow | 19:10 |
clarkb | ya I suspect what we might end up with is having a templated migration file in ansible that gets written out to $dir for mailman for each site and then ansible triggers the migrations | 19:10 |
clarkb | and future migrations should just ensure that steady state without changing much | 19:10 |
clarkb | the tricky bit will be figuring out what goes into the migration file definition | 19:10 |
fungi | yeah, django already templates the migrations, as i loosely understand it, which is what manage.py makemigrations is for | 19:11 |
fungi | it seems you're expected to tell django to build the migrations necessary for the new site, and then to apply those migrations it's made | 19:11 |
fungi | which results in bringing the new site up | 19:12 |
ianw | it sort of seemed like you needed a common settings.py, and then each site would have its own settings.py but with a different SITE_ID? | 19:12 |
fungi | i think so, but then mailman when it runs needs SITE_ID=0 instead | 19:12 |
clarkb | ianw: I think that's for normal django multi sites. But mailman doesn't quite do it that way? You don't have a true extra site, it just uses the site db info to vhost its single deployment | 19:12 |
fungi | which is a magic value telling it to infer the site from the web requests | 19:13 |
clarkb | ya so ultimately we run a single site with ID=0 but the db has entries for a few sites | 19:13 |
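For context, a minimal sketch of what the Ansible automation might end up driving to create a site and associate a mailman domain with it. The `Site` API is stock Django; the `MailDomain` association is assumed from django-mailman3, and the container/service name, manage.py invocation, and domain are illustrative rather than the actual deployment layout:

```shell
# Hypothetical: create a django Site for the list domain and tie the mailman
# domain to it, without going through the django webui.
docker-compose exec mailman-web python manage.py shell -c "
from django.contrib.sites.models import Site
from django_mailman3.models import MailDomain

site, _ = Site.objects.get_or_create(domain='lists.example.org',
                                     name='lists.example.org')
MailDomain.objects.get_or_create(mail_domain='lists.example.org', site=site)
"
```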
fungi | the other related tidbit is i need to update docker on lists01 and restart the containers | 19:14 |
fungi | which i plan to do first on a held node i have that pre-dates the new docker release | 19:14 |
clarkb | cool sounds like we know what needs to happen just a matter of sorting through it. Anything else? | 19:15 |
fungi | i don't have anything else, no | 19:16 |
clarkb | #topic Git updates | 19:16 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/873012 Update our base images | 19:16 |
fungi | i restacked the mm3 version upgrades change behind the vhosting work | 19:16 |
clarkb | The base python images did end up updating. Then I realized we use the -slim images which don't include git so this isn't really useful other than as a semi periodic update to the other things we have installed | 19:17 |
clarkb | I was looking at the non slim images to see if git had updated not realizing we only have git where we explicitly install it. All that to say next week we can drop this topic. | 19:17 |
clarkb | And that change is not urgent, but probably also a reasonable thing to do | 19:18 |
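Since the -slim base images don't ship git, a rough sketch of the kind of explicit install our image Dockerfiles rely on (base image tag and layout are illustrative, not a specific opendev Dockerfile):

```dockerfile
# Illustrative only: git is only present where a Dockerfile installs it.
FROM docker.io/opendevorg/python-base:3.11-bullseye
RUN apt-get update \
    && apt-get install -y --no-install-recommends git \
    && rm -rf /var/lib/apt/lists/*
```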
clarkb | #topic New Debuntu Releases Preventing sudo pip install | 19:18 |
clarkb | fungi called out that debian bookworm and consequently ubuntu 23.04 and after will prevent `sudo pip install` from working on those systems | 19:19 |
clarkb | For OpenDev we've shifted a lot of things into docker images built on our base python images. These don't use debian packaging for python and I suspect will be fine. However if they are not we should be able to modify the installation system on the image to use a single venv that gets added to $PATH | 19:19 |
clarkb | I think this means the risk to us is relatively low | 19:20 |
clarkb | Additionally ansible is already in a venv on bridge and we use venvs on our test images | 19:20 |
ianw | docker-compose isn't though. that's one i've been meaning to get to | 19:20 |
clarkb | good call | 19:20 |
clarkb | definitely anything you can think of that is still running outside of a venv should be moved. We can do that ahead of the system server upgrades that will break us since old stuff can handle venvs | 19:21 |
ianw | ++ i'm sure we can work around it, but it's a good push to do things better | 19:21 |
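A minimal sketch of the workaround being discussed, using docker-compose as the example tool; the paths are illustrative, not where opendev actually installs it:

```shell
# Install into a dedicated venv instead of `sudo pip install`, then expose
# the entry point on $PATH.
python3 -m venv /usr/local/lib/docker-compose-venv
/usr/local/lib/docker-compose-venv/bin/pip install docker-compose
ln -s /usr/local/lib/docker-compose-venv/bin/docker-compose \
    /usr/local/bin/docker-compose
```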
clarkb | Elsewhere we should expect projects like openstack and probably starlingx to struggle with this change | 19:22 |
clarkb | in particular tools like devstack are not venv ready | 19:22 |
fungi | yeah, i posted to openstack-discuss about it as well, just to raise awareness | 19:22 |
ianw | yeah there have been changes floating around for years, that we've never quite finished | 19:22 |
clarkb | and ya I think talking about it semi regularly is a good way to keep encouraging people to chip away at it | 19:23 |
clarkb | for a lot of stuff we should be able to make small measurable progress with minimal impact over time | 19:23 |
clarkb | #topic Gerrit Updates | 19:25 |
clarkb | A number of Gerrit related changes have landed over the last week. In particular our use of submit requirements was cleaned up and we have a 3.7 upgrade job | 19:25 |
clarkb | That expanded testing was used to land the base image swap for gerrit | 19:25 |
clarkb | this base image swap missed (at least) one thing: openssh-client installation | 19:25 |
clarkb | this broke jeepyb as it uses ssh to talk to gerrit for new repo creation via the manage-projects tool | 19:26 |
clarkb | Apologies for that. | 19:26 |
clarkb | fungi discovered that even after fixing openssh jeepyb's manage-projects wedges itself for projects if the initial creation fails. The reason for this is that no branch is created in gerrit if manage-projects fails on the first run. This causes subsequent runs to clone from gerrit and not be able to checkout master | 19:26 |
clarkb | To work around this fungi manually pushed a master branch to starlingx/public-keys | 19:27 |
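A rough sketch of what that kind of manual recovery looks like for a repo left without any branches. The repo name comes from the discussion; the admin account name is hypothetical and the port is the usual Gerrit SSH default:

```shell
# Seed an initial master branch so subsequent manage-projects runs can
# clone and checkout master.
git init seed && cd seed
git commit --allow-empty -m "Initial commit"
git push ssh://admin.user@review.opendev.org:29418/starlingx/public-keys \
    HEAD:refs/heads/master
```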
fungi | and discovered in the process that you need an account which has agreed to a cla in gerrit in order to do that to a cla-enforced repository | 19:27 |
fungi | my fungi.admin account had not (as i suspect most/all of our admin accounts haven't) | 19:28 |
clarkb | I've only had a bit of time today to think about that, but part of me thinks this may be desirable as I'm not sure we can fully automate around all the causes of gerrit repo creation failures? | 19:28 |
fungi | the bootstrapping account is in the "System CLA" group, which seems to be how it gets around that | 19:28 |
clarkb | in this specific case we could just fall back to re-initing from scratch but I'm not sure that is appropriate for all cases | 19:28 |
clarkb | fungi: ya I wonder if we should just go ahead and add the admin group to system cla or something like that | 19:28 |
fungi | or add project bootstrappers to it | 19:29 |
clarkb | ah yup | 19:29 |
fungi | as an included group | 19:29 |
clarkb | with that all sorted I think ianw's change to modify acls is landable once communicated | 19:29 |
clarkb | #link https://review.opendev.org/c/openstack/project-config/+/867931 Cleaning up deprecated copy conditions in project ACLs | 19:29 |
clarkb | it would've had a bad time with no ssh :( | 19:30 |
fungi | indeed | 19:30 |
fungi | thanks for fixing it! | 19:30 |
ianw | yeah sorry, will send something up about that | 19:30 |
clarkb | Other Gerrit items include a possible upgrade to java 17 | 19:30 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/870877 Run Gerrit under Java 17 | 19:30 |
clarkb | I'd still like to hunt down someone who can explain the workaround that is necessary for that to me a bit better | 19:31 |
clarkb | but I'm finding that the new discord bridge isn't as heavily trafficked as the old slack system. I may have to break down and sign up for discord | 19:31 |
clarkb | And yesterday we had a few users reporting issues with large repo fetches | 19:31 |
clarkb | ianw did some debugging on that and it resulted in this issue for MINA SSHD | 19:32 |
clarkb | #link https://github.com/apache/mina-sshd/issues/319 Gerrit SSH issues with flaky networks. | 19:32 |
ianw | oh, that just got a comment a few minutes ago :) | 19:32 |
ianw | ... sounds like whatever we try is going to involve a .java file :/ | 19:34 |
clarkb | ya looks like tomas has a theory but we need to update gerrit to better instrument things in order to confirm it | 19:34 |
clarkb | Progress at least | 19:34 |
clarkb | Anything else gerrit related before we move on? | 19:35 |
ianw | jayf was the first to mention it, but it is a pretty constant thing in the logs | 19:35 |
clarkb | if it is a race the change in jdk could be exposing it more too | 19:36 |
clarkb | since that may affect underlying timing of actions | 19:36 |
fungi | and others are still reporting connectivity issues to gerrit today (jrosser at least) | 19:36 |
clarkb | oh side note: users can use https if necessary. Its maybe a bit more clunky if using git-review but is a fallback | 19:37 |
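A hedged sketch of that HTTPS fallback with git-review; it assumes an HTTP password has been generated under the user's Gerrit settings, and the project path is just an example:

```shell
# Point the gerrit remote at HTTPS so pushes avoid port 29418 entirely.
git remote set-url gerrit https://<username>@review.opendev.org/openstack/nova
git review
```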
ianw | i think it would be easy-ish to add the close logging suggested there in the same file | 19:37 |
ianw | (if it is) i could try sending that upstream, and if it's ok, we could build with a patch | 19:38 |
clarkb | yup and we could even patch that into our image if upstream doesn't want the extra debugging (though ideally we'd be upstream first as I like not having a fork) | 19:38 |
ianw | yeah. although we haven't had a lot of response on upstream things lately :/ but that was mail, not patches | 19:38 |
clarkb | ianw: oh also March 2 at a terrible time of day for you (8am for me) they have their community meeting. Why don't I go ahead and throw this on the agenda and I'll do my best to attend | 19:39 |
clarkb | I can ask about java 17 too | 19:39 |
clarkb | (not that we have to wait that long just figure having a direct conversation might help move some of these things forward) | 19:40 |
ianw | ++ | 19:40 |
clarkb | #topic Python 2 removal from test images | 19:40 |
clarkb | 20 minutes left lets keep things moving | 19:41 |
clarkb | some projects have noticed the python2 removal. It turns out listing python2 as a dependency in bindep was not something everyone understood as necessary | 19:41 |
clarkb | some projects like nova and swift are fine. Others like glance and cinder and tripleo-heat-templates are not | 19:41 |
clarkb | When this came up earlier today I had three ideas for addressing this. A) revert the python2 removal from test images B) update things to fix buggy bindep.txt C) have -py27 jobs explicitly install python2 | 19:42 |
clarkb | I'm beginning to wonder if we should do A) then announce we'll remove it again after the antelope release so openstack should do either B or C in the meantime? | 19:42 |
fungi | per a post to the openstack-discuss ml, tripleo seems to have gone ahead with option b | 19:42 |
ianw | yeah i'm just pulling it up ... | 19:43 |
ianw | i think maybe we have openstack-tox-py27 install it | 19:43 |
fungi | apparently stable branch jobs supporting python 2.7 are very urgent to some of their constituency | 19:43 |
clarkb | my main concern here is that openstack isn't using bindep properly | 19:43 |
ianw | i agree on that | 19:43 |
ianw | if we put it back in the images, i feel like we just have to do a cleanup again at some point | 19:44 |
clarkb | ianw: yup I think we'd remove python2 again, say late April after the openstack release? | 19:44 |
ianw | at least if it's in the job, when the job eventually is unreferenced, we don't have to think about it again | 19:44 |
fungi | what is properly in this case? they failed to specify a python version their testing requires... i guess that means they should include python3 as well | 19:44 |
clarkb | thats a good point | 19:44 |
clarkb | fungi: yes python3 should be included too | 19:44 |
ianw | yeah, i mean the transition point between 2->3 was/is a bit of a weird time | 19:45 |
ianw | they *should* probably specify python3, but practically that's on all images | 19:45 |
ianw | at least until python4 | 19:45 |
clarkb | I suspect that nova and swift have/had users using bindep outside of CI | 19:45 |
fungi | also a chicken-and-egg challenge for our jobs running bindep to find out they already have the python3 requested | 19:46 |
clarkb | and that is why theirs are fine. But the others never used bindep except for in CI and once things went green they shipped it | 19:46 |
clarkb | So maybe the fix is update openstack -py27 jobs to install python2 and encourage openstack to update their bindep files to include runtime dependencies | 19:46 |
fungi | basically we can't really have images without python3 on them, because ansible needs it even before anything runs bindep | 19:46 |
fungi | so, yeah, i agree including python3 in bindep.txt is a good idea, it just can't be enforced by ci through exercising the file itself (a linting rule could catch it though) | 19:48 |
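As a hypothetical illustration of what "using bindep properly" might look like here, a bindep.txt fragment declaring the interpreters the tests need; the package names and the py27 profile are illustrative, not the actual openstack-zuul-jobs arrangement:

```
# Declare interpreters instead of relying on what happens to be on the image.
python3 [platform:dpkg]
python3-dev [platform:dpkg test]
python2.7 [platform:dpkg py27]
```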
clarkb | we also don't need to solve that in the meeting (lack of time) but I wanted to make sure everyone was aware of the speed bump they hit | 19:48 |
ianw | ++ i'll have a suggested patch to openstack-zuul-jobs for that in a bit | 19:48 |
clarkb | thanks | 19:48 |
clarkb | #topic Docker 23 | 19:48 |
clarkb | Docker 23 released last week (skipping 21 and 22) and created some minor issues for us | 19:48 |
clarkb | In particular they have an unlisted hard dependency on apparmor which we've worked around in a couple of places by installing apparmor | 19:49 |
clarkb | Also things using buildx need to explicitly install buildx as it has a separate package now (docker 23 makes buildx the default builder for linux too, I'm not sure how that works if buildx isn't even installed by default though) | 19:49 |
fungi | hard dependency on apparmor for debian-derivatives anyway | 19:49 |
clarkb | right | 19:50 |
clarkb | and maybe on opensuse but we don't use opensuse much | 19:50 |
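The workarounds discussed boil down to something like the following on Debian/Ubuntu; a rough sketch, assuming Docker's own apt repository is configured (docker-buildx-plugin is the separate buildx package there):

```shell
apt-get update
apt-get install -y apparmor docker-buildx-plugin
```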
clarkb | at this point I think the CI situation is largely sorted out and ianw has started a list for working through prod updates | 19:50 |
clarkb | prod updates are done manually because upgrading docker implies container restarts | 19:50 |
clarkb | Mostly just a call out topic since these errors have been hitting things all across our world | 19:51 |
ianw | #link https://etherpad.opendev.org/p/docker-23-prod | 19:51 |
clarkb | thank you to everyone who has helped sort it out | 19:51 |
ianw | most done, have to think about zuul | 19:51 |
clarkb | ya zuul might be easiest in small batches | 19:52 |
ianw | i'm thinking maybe the regular restart playbook, but with a forced docker update | 19:52 |
ianw | rolling restart playbook | 19:52 |
clarkb | ya that could work too. A one off playbook modification? | 19:52 |
ianw | yeah, basically just run a custom playbook | 19:52 |
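A hypothetical shape of that one-off playbook modification; the host group, package names, and task are illustrative, not the actual zuul restart playbook:

```yaml
# Rolling restart with a forced docker upgrade on each host first.
- hosts: zuul
  serial: 1
  tasks:
    - name: Upgrade docker before restarting containers
      apt:
        name:
          - docker-ce
          - docker-ce-cli
        state: latest
        update_cache: true
```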
fungi | the pad contains list.katacontainers.io (what are we using docker for there?) but not lists.openstack.org | 19:52 |
clarkb | fungi: we're not. I think the entire inventory went in there and has been edited to reflect reality? | 19:53 |
corvus | that seems like it should work | 19:53 |
fungi | oh, i see lists.openstack.org is in the not using list | 19:53 |
fungi | list.katacontainers.io probably just hasn't been checked yet | 19:53 |
ianw | yeah sorry, i didn't | 19:53 |
ianw | what i would like to do after this is rework things so we have one docker group | 19:53 |
fungi | no worries, i'll take a look | 19:53 |
ianw | so hosts that run install-docker now are all in that group. will take a bit of playbook swizzling | 19:54 |
clarkb | ok running out of time and I want to get to ade_lee's topic | 19:54 |
clarkb | #topic FIPS jobs | 19:54 |
ade_lee | :) | 19:54 |
clarkb | speaking of swizzling | 19:54 |
fungi | at this point 866881 needs a second zuul/zuul-jobs reviewer | 19:55 |
fungi | the rest of the changes are ready to merge once that does? | 19:55 |
clarkb | #link https://review.opendev.org/c/zuul/zuul-jobs/+/866881 | 19:55 |
clarkb | #link https://review.opendev.org/c/openstack/project-config/+/872222 | 19:55 |
ade_lee | I think so yes | 19:55 |
clarkb | https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/872223 | 19:55 |
fungi | ianw and i +2'd the later changes ready to approve once the zuul-jobs change is in | 19:56 |
clarkb | and the tldr here is the jobs are getting reorganized to handle pass to parent and early fips reboot needs. They should emulate how our jobs for docker images are set up | 19:56 |
clarkb | right? | 19:56 |
ade_lee | yup | 19:56 |
fungi | more to handle the need for secret handling in the new role that handles ubuntu advantage subscriptions | 19:56 |
clarkb | ah right thats the bit that needs the secret and uses pass to parent | 19:57 |
fungi | ua just ends up being a prerequisite for fips on ubuntu | 19:57 |
fungi | since it requires a license to get the packages | 19:57 |
fungi | (which opendev has been granted by canonical in order to make this work) | 19:58 |
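For readers unfamiliar with the pass-to-parent mechanism mentioned above, a hedged Zuul job sketch; the job, secret, and playbook names are illustrative and not the actual zuul-jobs/project-config definitions:

```yaml
# A job attaches its secret with pass-to-parent so playbooks inherited from
# parent jobs (e.g. the UA subscription role) can use it.
- job:
    name: enable-fips-base
    pre-run: playbooks/enable-fips/pre.yaml
    secrets:
      - name: ua_credentials
        secret: ubuntu_advantage
        pass-to-parent: true
```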
clarkb | sounds like mostly just need reviews at this point. I'll try to review today if I don't run out of time. | 19:58 |
clarkb | #topic Open Discussion | 19:58 |
clarkb | Any last minute concerns or topics before we can all go find a meal? | 19:58 |
ade_lee | clarkb, that would be great - thanks! | 19:58 |
fungi | we're running into dockerhub tag pruning issues which are blocking deployment from image updates | 19:59 |
clarkb | ianw has a change to aid in debugging that | 19:59 |
fungi | just a heads up to people who haven't seen the discussion around that yet | 19:59 |
clarkb | #link https://review.opendev.org/c/zuul/zuul-jobs/+/872842 | 19:59 |
fungi | as soon as that's worked out we'll have donor logos on the main opendev.org page | 20:00 |
ianw | also speaking of distro deprecated things | 20:00 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/872808 | 20:00 |
ianw | was one to stop using apt-key for the docker install ... it warns on jammy now | 20:00 |
fungi | thanks for fixing that | 20:00 |
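For reference, the usual replacement for apt-key that such a change moves toward: keep the key in a keyring file and reference it via signed-by. A sketch using Docker's published key; paths and suite name are the commonly documented defaults, not necessarily what the change uses:

```shell
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
    | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [signed-by=/etc/apt/keyrings/docker.gpg] \
    https://download.docker.com/linux/ubuntu jammy stable" \
    > /etc/apt/sources.list.d/docker.list
```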
clarkb | and reminder I'll be afk tomorrow | 20:00 |
clarkb | thats our hour. Thanks everyone | 20:01 |
clarkb | #endmeeting | 20:01 |
opendevmeet | Meeting ended Tue Feb 7 20:01:18 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:01 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2023/infra.2023-02-07-19.01.html | 20:01 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-02-07-19.01.txt | 20:01 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2023/infra.2023-02-07-19.01.log.html | 20:01 |
clarkb | We didn't get to a couple of topics but the sqlalchemy one isn't urgent and the others didn't really have updates | 20:01 |
clarkb | but feel free to bring them up in #opendev or on the mailing list if I'm mistaken | 20:01 |