19:01:12 <clarkb> #startmeeting infra
19:01:13 <openstack> Meeting started Tue Jan 21 19:01:12 2020 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:16 <openstack> The meeting name has been set to 'infra'
19:01:22 <clarkb> #link http://lists.openstack.org/pipermail/openstack-infra/2020-January/006577.html Our Agenda
19:01:24 <zbr> o/
19:01:30 <clarkb> #topic Announcements
19:01:46 <ianw> o/
19:01:56 <clarkb> I did not have any announcements
19:02:33 <clarkb> #topic Actions from last meeting
19:02:42 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-01-14-19.01.txt minutes from last meeting
19:03:07 <clarkb> I had an action item to start making the opendev governance changes official. This has been done, but let's talk about it during our OpenDev topic
19:03:14 <clarkb> #topic Priority Efforts
19:03:18 <clarkb> #topic OpenDev
19:03:26 <clarkb> we can jump into that immediately :)
19:04:09 <clarkb> There are two related changes to make the OpenDev governance changes official. The first removes OpenDev from OpenStack Governance. The second creates a new self-bootstrapped governance system for OpenDev
19:04:12 <clarkb> #link https://review.opendev.org/#/c/703134/ Split OpenDev out of OpenStack Governance
19:04:20 <clarkb> #link https://review.opendev.org/#/c/703488/ Update OpenDev docs with new Governance
19:04:41 <clarkb> Both have received a fair amount of review, thank you all for that. If you haven't yet had a chance to review them your feedback is very welcome
19:05:11 <clarkb> Are there any questions or concerns at this point that I (or others) can help with?
19:06:03 <fungi> seems like a fairly straightforward evolution to me
19:06:24 <clarkb> One thing to point out is that I think the first change needs to merge before the second
19:06:29 <clarkb> otherwise there is a governance conflict
19:06:32 <fungi> i guess the main question i have is whether there will also be an openstack-infra governance defined after the opendev governance gets published in the place where that used to live
19:06:52 <clarkb> fungi: ajaeger brought that up too, and I don't think it needs anything beyond the default openstack system
19:07:02 <fungi> wfm
19:07:28 <fungi> also we could consider openstack-infra becoming a sig, but one step at a time
19:07:30 <clarkb> fungi: the repos/tools/systems that remain in openstack are fairly self-contained and won't require subteam organization I don't think
19:08:24 <corvus> yeah, if it ends up needing more formal organization, maybe a sig
19:09:18 <fungi> well, a sig would be *less* formal organization really
19:09:28 <corvus> i mean more formal than nothing :)
19:09:33 <fungi> (less organization than an openstack project team anyway)
19:09:39 <fungi> oh, sure ;)
19:09:46 <corvus> oh, yes.
19:10:06 <corvus> on the question of sig vs subteam, sig sounds right, due to the value of cross-subteam collaboration.
19:10:39 <corvus> (and relative dearth of release deliverables)
19:12:57 <clarkb> On the technical side of things, jrosser noticed that some jobs running between 1400 and 1500 UTC today had git clone errors. I've suggested that the job instead rely on zuul-managed repos (as they come from caches and allow for cross-repo testing). I also discovered that about half the total connections to gitea04 (which was the unhappy node at the time) came from huawei IPs
19:13:05 <clarkb> those IPs were all in the same /22 subnet
19:13:15 <clarkb> it looks like a CI system in that way
19:13:57 <clarkb> I think we have ~3 options. A) bigger gitea nodes with more memory to avoid swapping and OOMs B) work with huawei to not DDoS us C) try the least-connections load balancing method again (this is unlikely to be stable)
19:14:05 <fungi> we also had a massive load spike on lists.o.o which looked like it could be a pipermail python process pileup
19:14:15 <fungi> but i didn't find a smoking gun
19:14:40 <clarkb> at this point I think it's mostly a monitoring situation. If it continues and/or gets worse then we will need to sort out a plan
19:14:47 <clarkb> but maybe this is a one-off and we'll be ok
19:14:59 <corvus> clarkb: ++ let's get 2 or 3 data points
19:15:03 <fungi> and... same recommendation on the lists.o.o front for now
19:15:29 <clarkb> Anything else opendev related?
19:16:10 <fungi> nothing springs to mind
19:16:30 <clarkb> #topic Update Config Management
19:16:41 <clarkb> review-dev is running without puppet at this point \o/
19:17:10 <clarkb> we did discover a couple of gaps that need to be addressed though. The first is in disabling the old init script for gerrit once converted to podman. The other is configuring a webserver to front gerrit
19:17:32 <clarkb> I think for the first one we may just make that part of our migration document but the second will need updates to the ansible/container tooling
19:18:01 <fungi> yeah, extra ansible to delete an initscript seems like overkill
19:18:24 <clarkb> I don't think the webserver change(s) have been written yet and I don't think we have mordred here today
19:18:36 <fungi> or possibly he's just waking up
19:19:56 <clarkb> I guess keep an eye out for those changes as they are the next step in getting gerrit switched over
19:20:06 <clarkb> are there any other config management related changes to call out?
19:21:18 <clarkb> Sounds like no. Let's move on
19:21:22 <clarkb> #topic General Topics
19:21:51 <clarkb> ianw: do you want to start with static.opendev.org or arm64 cloud?
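[For reference on the gitea04 connection analysis discussed above under the OpenDev topic: client addresses can be grouped by /22 with a short script along these lines. This is a minimal sketch, assuming one source IP per line on stdin (e.g. extracted from haproxy or gitea access logs); it is not necessarily how the analysis was actually performed.]

```python
#!/usr/bin/env python3
"""Group client IPs by /22 to spot heavy subnets (illustrative sketch)."""
import collections
import ipaddress
import sys

counts = collections.Counter()
for line in sys.stdin:
    ip = line.strip()
    if not ip:
        continue
    try:
        # Collapse each address into the /22 network that contains it.
        net = ipaddress.ip_network(f"{ip}/22", strict=False)
    except ValueError:
        continue  # skip anything that is not a bare IP address
    counts[net] += 1

# Print the ten busiest /22 blocks with their connection counts.
for net, count in counts.most_common(10):
    print(f"{count:6d}  {net}")
```

[Aggregating by the containing /22 rather than by individual address is what makes a distributed CI farm inside a single provider block stand out, which is the pattern described above.]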
19:22:09 <ianw> i haven't made too much progress on static.o.o
19:22:18 <clarkb> looks like the changes have landed at least
19:23:04 <ianw> yep, i can loop back on that and i think starting a new server is probably the next step
19:23:14 <clarkb> cool
19:23:33 <ianw> then governance and some others should be publishing to afs and we can move them to be served from there
19:23:34 <clarkb> related to this topic, we threw out the idea last week that having a day or two to focus on getting rid of the last trusty servers would be worthwhile
19:24:02 <clarkb> suggestion is to use the 27th and 28th for that (the 28th and 29th in Australia)
19:24:18 <clarkb> I'm still on board for that and will plan to focus on this work on those days
19:24:28 <clarkb> everyone else is more than welcome to join in :)
19:25:23 <clarkb> ianw: we must be getting close to needing to set up dns records as well
19:25:33 <clarkb> ianw: does that need to happen before the server starts (so that LE can configure things properly)
19:26:19 <ianw> clarkb: umm, yes probably, i will loop back on that
19:26:34 <fungi> i can try to set aside those two days to make progress on wiki.o.o configuration management in hopes we can migrate to a replaceable xenial server with actual puppet-applied configuration
19:26:49 <fungi> (i think wiki-dev is close-ish)
19:26:59 <clarkb> sounds good, thanks
19:27:10 <clarkb> Next up is a new arm64 cloud
19:27:26 <clarkb> #link https://review.opendev.org/#/q/topic:linaro-us+status:open Changes to deploy new arm64 cloud
19:27:37 <clarkb> ianw: I left a minor -1 on the clouds.yaml changes
19:27:47 <corvus> ooh neat -- we'll have 2 now?
19:28:09 <clarkb> corvus: yup. This one looks to be quite a bit larger than the existing one as well
19:28:16 <ianw> clarkb: yep, will do comments on that small stack asap
19:28:25 <clarkb> 44 test instances (if using 8 vCPUs per instance)
19:28:47 <ianw> if we can get that in, i can look at bringing up a mirror, and then ... it should work? :)
19:29:06 <clarkb> ianw: ya I think we are less concerned with performance given it is arm and simply different
19:29:12 <clarkb> people running jobs there will have to accept it may be slower
19:29:23 <ianw> kevinz did mention it was ipv6-only, which is new territory for an arm64 cloud, so there may be wrinkles there
19:30:37 <clarkb> that is a good point.
19:30:54 <fungi> well, in #openstack-infra he just said there were a lot fewer ipv4 addresses than quota for test nodes, but yes in practice that means ipv6 global with v4 egress via overload nat
19:31:34 <fungi> (port address translation)
19:31:43 <clarkb> and dedicated ipv4 for the mirror
19:32:16 <frickler> this likely also means that we should set up some special routing for the v4 mirror like FN has
19:32:35 <frickler> iirc
19:32:50 <clarkb> frickler: that's a good point, we should double-check the traceroute from test nodes to the mirror
19:32:58 <clarkb> in case the ipv6 setup is similar and bottlenecks through the router
19:33:30 <fungi> yeah, or rather we end up overrunning the nat's port mapping pool with local connections to the mirror server
19:34:11 <fungi> (remembering that the tcp/udp port range is a mere 65k, so that's not a lot of parallel connections in reality)
19:35:02 <clarkb> Anything else on the topic of a new arm64 cloud?
19:36:19 <ianw> we did have a multi-arch meeting
19:36:38 <ianw> #link https://storyboard.openstack.org/#!/project/openstack/multi-arch-sig
19:36:46 <ianw> plan is to put interesting things to track @ ^
19:38:01 <fungi> awesome!
19:38:05 <fungi> i hope that goes somewhere
19:38:35 <clarkb> I guess that sig will be taking advantage of the new set of resources to do more testing of openstack on more than just x86?
19:39:20 <fungi> it's apparently in tonyb's interest to have better support for powerpc now
19:39:36 <fungi> so maybe even more than just x86 and aarch64
19:40:08 <ianw> :) yep, i think we start with a build-it-and-they-will-come approach
19:40:35 <fungi> indeed
19:40:38 <fungi> it's worked so far
19:41:51 <clarkb> Alright anything else?
19:42:54 <ianw> not from me
19:44:06 <clarkb> #topic Open Discussion
19:44:27 <clarkb> I'll give it a few minutes for other topics
19:47:24 <clarkb> not hearing any other business. I'll go ahead and call the meeting
19:47:26 <clarkb> Thank you everyone!
19:47:31 <clarkb> #endmeeting
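[Postscript to the clouds.yaml discussion under the arm64 cloud topic: a new cloud entry can be sanity-checked with openstacksdk before nodepool is pointed at it. This is a minimal sketch, assuming the cloud is defined in clouds.yaml under the name "linaro-us" (a guess based on the review topic, not a confirmed name).]

```python
import openstack

# Assumption: the new region appears in clouds.yaml as "linaro-us";
# adjust to whatever name the reviewed change actually uses.
conn = openstack.connect(cloud='linaro-us')

# Listing flavors confirms the credentials work and shows which
# vCPU/RAM combinations are available when sizing test instances.
for flavor in conn.compute.flavors():
    print(flavor.name, flavor.vcpus, flavor.ram)
```

[Checking the available flavors is one way to verify the 8 vCPU per instance sizing mentioned above before wiring the cloud into nodepool.]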