Friday, 2025-09-19

opendevreviewMerged openstack/diskimage-builder master: root password for dynamic-login made simpler  https://review.opendev.org/c/openstack/diskimage-builder/+/961449 00:44
sean-k-mooneyfungi: so the error Connection failed: [Errno 113] EHOSTUNREACH 12:30
fricklerinfra-root: seems we are getting nodes that have two ethernet interfaces in raxflex, see e.g. https://zuul.opendev.org/t/openstack/build/488552424c654e2e8a4d8c5cb82f02a0/log/compute1/logs/worlddump-latest.txt 12:30
sean-k-mooneyoften means arp or dns resolution is not working12:30
fricklerthis may be related to latest cloud config changes12:30
sean-k-mooneybut not always as it can be just a routing issue12:31
sean-k-mooneyhttps://acbbb7942ca0556fa51c-bd29254d8f6365fc838eabec881efe79.ssl.cf1.rackcdn.com/openstack/488552424c654e2e8a4d8c5cb82f02a0/compute1/logs/worlddump-latest.txt 12:31
sean-k-mooneyah you just pasted that too12:31
fricklersean-k-mooney: but there should only be one interface in the first place, I think that that is the root cause for the issue12:31
sean-k-mooneyso ya we have 2 routes for the same subnet12:31
sean-k-mooneyand the second interface is listed first12:31
sean-k-mooneyya12:32
sean-k-mooneyso we are expecting to use 10.0.16.0/20 dev ens4 proto kernel scope link src 10.0.16.132 12:32
sean-k-mooneyi think based on the ansible vars that i looked at12:32
fricklerI'm not sure yet whether this is a new issue in zuul-launcher or caused by our config changes, will wait for someone else to take a closer look12:33
sean-k-mooneyack, ill also wait for an update but what im thinking is this might also be related to how linux does filtering of reply path traffic12:36
sean-k-mooneyfiltering12:36
sean-k-mooneywe do not have world dumps from both hosts but we do have the routing info in https://acbbb7942ca0556fa51c-bd29254d8f6365fc838eabec881efe79.ssl.cf1.rackcdn.com/openstack/488552424c654e2e8a4d8c5cb82f02a0/zuul-info/zuul-info.compute1.txt and12:48
sean-k-mooneyhttps://acbbb7942ca0556fa51c-bd29254d8f6365fc838eabec881efe79.ssl.cf1.rackcdn.com/openstack/488552424c654e2e8a4d8c5cb82f02a0/zuul-info/zuul-info.controller.txt 12:48
sean-k-mooneywhile ens4 is the default in both cases the first 10.0.16.0/20 route on the compute is via ens3 and its via ens4 on the other host12:49
mnasiadkafrickler: that might also explain networking issues I see on multi node Kolla-Ansible jobs12:51
sean-k-mooneyso thats why i think it might be related to net.ipv4.conf.all.rp_filter12:52
sean-k-mooneyobviously if we only expect 1 port thats also a problem12:52
sean-k-mooneybut i think setting sysctl -w net.ipv4.conf.all.rp_filter=2 might have allowed it to work12:53
sean-k-mooneywell or 0 12:53
sean-k-mooneyhttps://github.com/torvalds/linux/blob/master/Documentation/networking/ip-sysctl.rst?plain=1#L1972-L1991 12:56
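A minimal sketch of the reverse-path filtering check being discussed above, assuming a Linux node with the usual /proc/sys layout; the helper names are illustrative, and the values follow ip-sysctl.rst (0 = off, 1 = strict, 2 = loose):

```python
# Minimal sketch: inspect (and optionally loosen) rp_filter, the setting
# suspected above. Per ip-sysctl.rst: 0 = no source validation,
# 1 = strict reverse path, 2 = loose reverse path.
from pathlib import Path

CONF = Path("/proc/sys/net/ipv4/conf")

def show_rp_filter():
    for iface in sorted(CONF.iterdir()):
        value = (iface / "rp_filter").read_text().strip()
        print(f"{iface.name}: rp_filter={value}")

def set_loose_rp_filter(iface="all"):
    # equivalent to: sysctl -w net.ipv4.conf.all.rp_filter=2 (needs root)
    (CONF / iface / "rp_filter").write_text("2\n")

if __name__ == "__main__":
    show_rp_filter()
```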
fungisean-k-mooney: our iptables rules may be configured to reject blocked connections with icmp "host unreachable" responses, which would account for that too, though we should be using the "administratively prohibited" code instead13:24
fungibut yeah, if we've suddenly grown a second network interface on those nodes with another ipv4 default route, that could cause quite a bit of chaos13:27
fungihttps://zuul.opendev.org/t/openstack/build/488552424c654e2e8a4d8c5cb82f02a0/log/zuul-info/zuul-info.compute1.txt#44-46 13:29
fungiyeah, so only one default route but the gateway is on the same lan as both ens3 and ens4 interfaces13:30
fungidepending on whether the kernel knows to always reply from the same interface the connection came in on, there could be quite a bit of craziness13:32
fungilooks like ens3 is probably the interface we're connecting to, but the default route is "via ens4" so the kernel is probably replying from the ens3 ip address but with the ens4 mac, which would cause constant arp overwrites on the gateway13:33
fungii think https://zuul.opendev.org/t/openstack/build/488552424c654e2e8a4d8c5cb82f02a0/log/zuul-info/inventory.yaml#196 indicates that the floating ip is bound to the ens3 interface but i'm not positive13:36
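A small sketch of the check fungi describes above, comparing the interface that carries the default route with the interface that owns the address being connected to; it assumes iproute2's JSON output is available on the node, and the sample address is a placeholder:

```python
# Compare the default-route interface with the interface owning a given
# local address, to spot the asymmetric reply path described above.
import json
import subprocess

def ip_json(*args):
    return json.loads(subprocess.check_output(["ip", "-j", *args], text=True))

routes = ip_json("route", "show", "default")
default_dev = routes[0]["dev"] if routes else None

addr_dev = {}
for iface in ip_json("addr", "show"):
    for a in iface.get("addr_info", []):
        addr_dev[a["local"]] = iface["ifname"]

connect_ip = "10.0.16.132"  # placeholder: the address zuul connects to
print("default route via:", default_dev)
print(connect_ip, "lives on:", addr_dev.get(connect_ip))
if default_dev and addr_dev.get(connect_ip) not in (None, default_dev):
    print("asymmetric: replies may leave via", default_dev)
```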
fricklerI'm pretty convinced https://review.opendev.org/c/opendev/system-config/+/961537 is the trigger for the extraneous interface13:40
fricklernot overriding the default, but adding another one13:41
fungibefore that, the flex clouds were refusing to boot anything, insisting we needed to now specify a network (we still don't know what changed a few days ago that caused it to start happening)13:43
fungii think none of us realized that specifying the existing network and setting it as the default interface would result in the instance getting a second interface on the same network13:46
fungifrickler: do you happen to know what the correct syntax would have been?13:46
fricklerI've never used that part of clouds.yaml, it might also be specific to how the zuul-launcher is invoking the sdk14:03
fungiwe discussed the option of setting it in zuul-launcher configuration rather than clouds.yaml, maybe that would have worked the way we expected...14:08
clarkbfungi: ya my take on this is the sdk or the cloud are doing something wrong and we're being treated poorly by the tools :)14:43
clarkbwell the problem originated with the cloud rejecting all boot attempts because multiple networks are present14:43
clarkbso we were getting 0 interfaces and failed boots as a result. Thats bug 1 14:44
clarkbwe assumed we could workaround this by specifying an explicit network and listing that as the default interface. Apparently that was wrong; that is bug 2 14:44
clarkbin both cases I don't think we the user have done anything wrong. We need to track down where the tools are breaking I guess14:44
clarkbthe first two questions I have are: do we have two fips or one? and did this work as expected when the change first landed and we've regressed in a new way or did the "fix" produce this behavior from the start?14:46
fungii guess we could find a build result that ran in a flex region just after the config change was restarted onto14:47
fungiin order to see if there was just one interface originally14:48
fungias to the multiple fip question, i guess we can just ask for details on those ports for a currently-booted instance?14:49
clarkbyup I think server list / server show would answer the fip count question14:49
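A rough sketch of that server list / server show check done with openstacksdk directly; the cloud name below is a placeholder for whatever the local clouds.yaml entry is actually called:

```python
# List each server's fixed and floating addresses to answer the
# "one fip or two, one interface or two" question above.
import openstack

conn = openstack.connect(cloud="raxflex-sjc3")  # placeholder cloud name
for server in conn.compute.servers():
    fixed, floating = [], []
    for net_name, addrs in (server.addresses or {}).items():
        for addr in addrs:
            kind = addr.get("OS-EXT-IPS:type")
            entry = f"{net_name}:{addr['addr']}"
            (floating if kind == "floating" else fixed).append(entry)
    print(server.name, "fixed:", fixed, "floating:", floating)
```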
clarkbalso do we know if this affects all of the rax-flex regions or just one?14:51
clarkbwe might be able to figure that out via server list/ server show too14:51
fungii've only seen the one example so far14:51
clarkbserver list seems to show it affecting dfw3 and sjc3. No instances in iad3 right now. And they only have one fip14:52
clarkbok zuul launcher does supply label.networks to the server instance creation. I wonder if sdk's latest release changed how it handles defaults for that and we went from nullish value means autoselect to null value means supply explicit null value and cloud can no longer automatically select14:56
clarkbbut then by using the clouds.yaml override we're again flipping back into some auto select behavior that has gone wrong14:57
clarkbhrm no we were already supplying the network value for this cloud14:57
opendevreviewClark Boylan proposed opendev/zuul-providers master: Stop supplying the network value for rax-flex  https://review.opendev.org/c/opendev/zuul-providers/+/961811 15:01
clarkbinfra-root ^ I think we can try that without restarting any services and see if it produces a better result15:01
clarkbI still don't understand why this is happening, but I figure that is a low cost change that is easy to revert etc to check if we get a happier state15:02
corvusthat makes me wonder why the clouds.yaml fix changed anything.15:02
clarkbcorvus: exactly15:02
clarkbI still suspect a bug in the recent sdk release changing behaviors around this stuff15:03
clarkbmaybe the launcher network value isn't supplying sufficient info like nat_destination or default_interface and the sdk is confused?15:03
corvusclarkb: yesterday i was ambivalent, but now i think all this stuff should go in zuul-providers so it's easier to change.15:03
opendevreviewMerged opendev/zuul-providers master: Stop supplying the network value for rax-flex  https://review.opendev.org/c/opendev/zuul-providers/+/961811 15:04
corvusi +3d that change, but i kind of think the next thing we do should be to move that stuff out of clouds.yaml.15:04
corvusi mean, after the dust settles.15:04
clarkbcorvus: I agree, except that it was already there and wasn't working so we need to figure out how to make it work (if this naive update does fix it)15:04
corvusyes, perhaps there are settings needed in clouds.yaml that we don't support in zuul-launcher.15:05
clarkbone upside to putting things like this in clouds.yaml is it makes it easier to manually try to reproduce, but I think having the config in zuul-providers is likely to be less confusing in the long run if we're consistent about it so am willing to sort out extra flags to openstack client when necessary15:05
corvusyeah.  i don't feel strongly about it.  just noting that it took a few hours to get the update in clouds.yaml and a few seconds in zuul-providers.  :)15:06
clarkbSJC3 is building a handful of nodes. Not sure if those would've used the old or new config15:06
clarkb++15:06
clarkbnp0bbc0c68f75f4 cloud uuid 67e69f3d-a259-421e-94db-d67e851a894f has one interface and one fip I think15:07
clarkbthat is in sjc315:07
clarkbso I think this did "fix" it15:07
clarkbI have no idea why at this moment. The change seemed to start after last weekend's zuul-launcher restart which would've picked up this new release of openstacksdk for the first time https://pypi.org/project/openstacksdk/4.7.1/15:08
clarkbas far as we can tell the network resources in the clouds themselves have not updated in this time period so cloud side changes to the resources themselves don't appear to be at fault15:09
clarkbcould be that cloud side changes to the api code did change in a meaningful way though15:09
clarkbfrickler: ^ do you know what if anything openstacksdk 4.7.1 might have changed around network selection and utilization15:09
clarkbcorvus: I wonder if we can simply change networks:\n - opendevzuul-network1 to networks:\n - name: opendevzuul-network1\n  default_interface: true\n  nat_destination: true in the zuul launcher config and have it pass through the same attributes as clouds.yaml?15:13
clarkbthere might be a schema update necessary first15:13
clarkbhttps://opendev.org/zuul/zuul/src/branch/master/zuul/driver/openstack/openstackendpoint.py#L712-L717 I think this explains why we got two interfaces15:14
clarkbwe explicitly asked for one nic on the zuul-provider listed network, then the clouds.yaml config must imply a second interface (possibly because I set default_interface or nat_destination and either one may trigger creation of a nic?)15:15
clarkbstill not clear why explicitly creating a nic like we did before would start failing at the beginning of the week.15:16
clarkboh I see we don't supply the zuul-provider network info beyond that explicit nic list so yes it seems very likely that the nic information is incomplete (at least according to the sdk or cloud)15:17
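For illustration, a sketch of the two create_server call shapes being contrasted here, using the sdk's cloud layer: when the caller passes an explicit nics list (as zuul-launcher does) the sdk is expected to use it as-is, and only fall back to expanding the networks configured in clouds.yaml when no nics are given. The resource names are placeholders:

```python
import openstack

conn = openstack.connect(cloud="raxflex-sjc3")  # placeholder cloud name

# zuul-launcher style: an explicit nic on the provider-listed network
net = conn.network.find_network("opendevzuul-network1")
server = conn.create_server(
    name="example-node",      # placeholder
    image="example-image",    # placeholder
    flavor="example-flavor",  # placeholder
    nics=[{"net-id": net.id}],
    wait=True,
)

# clouds.yaml style: no nics given, so the sdk derives them from the
# cloud's configured networks (default_interface / nat_destination hints)
server = conn.create_server(
    name="example-node",
    image="example-image",
    flavor="example-flavor",
    wait=True,
)
```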
clarkbopenstacksdk diff between 4.7.0 and 4.7.1 looks unremarkable so now I'm back to thinking something changed in the cloud15:19
clarkbcorvus: I'm beginning to wonder if maybe https://opendev.org/zuul/zuul/src/branch/master/zuul/driver/openstack/openstackendpoint.py#L714 is the source of the original issue. Specifically I'm wondering if we cached a network id value that was invalid somehow (possibly due to an api issue)15:25
corvusclarkb: that is cached indefinitely because the network id should never change15:26
corvus(so if it did, then that's certainly a very unexpected event)15:27
corvusthough... the exception is that we clear the cache on some errors15:28
corvusthat would handle the case where someone deleted the network and replaced it with a new network with the same name15:29
clarkbagreed. Part of my suspicion for that is reading through the network and nic related code in openstacksdk it seems like sdk converts the network values into nics similar to what launcher is already doing15:29
corvusso.. that is expected.  but since we didn't do that, then ... :)15:29
clarkbcorvus: we manage that opendevzuul-network1 network ourselves15:29
clarkbya that15:29
fungiheading out to run a lunch errand, shouldn't be too long but once i get back i'll start on the mirror cleanup tasks and then final prep for our 20:00 utc mailman server maintenance15:29
clarkbwe do have ansible automatically deploy that though so maybe something went wrong there?15:29
corvusif you list the networks thru the api is there a timestamp?15:30
clarkbchecking15:31
clarkbupdated_at                | 2025-06-25T15:55:47Z for DFW3 15:31
clarkbthat is within a minute of the SJC3 updated at value and IAD3 updated on August 20 15:32
clarkbcorvus: https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/cloud/_compute.py#L980-L983 this is the area of code where I think openstacksdk is now doing roughly the same thing we should've been doing with the launcher previously15:34
clarkbwhich is why I'm now wondering if we had a data error15:34
corvusif zuul got an invalid net-id and never got an openstack.exceptions.BadRequestException it would have kept using  it.15:37
corvushttps://opendev.org/zuul/zuul/src/branch/master/zuul/driver/openstack/openstackendpoint.py#L724 is the check for that15:38
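A loose sketch of the cache-and-clear pattern being discussed here: the network id lookup is memoized, and the memo is only dropped when the cloud rejects the request with a BadRequest. The function names are illustrative, not the actual zuul code:

```python
import functools

import openstack.exceptions


@functools.lru_cache(maxsize=None)
def find_network_id(conn, name):
    net = conn.network.find_network(name, ignore_missing=True)
    if net is None:
        raise Exception(f"Unable to find network {name}")
    return net.id


def create_server(conn, name, image, flavor, networks):
    try:
        nics = [{"net-id": find_network_id(conn, n)} for n in networks]
        return conn.create_server(name=name, image=image, flavor=flavor,
                                  nics=nics, wait=True)
    except openstack.exceptions.BadRequestException:
        # The cloud rejected what we sent; a stale cached network id is
        # one possible cause, so forget the memo before re-raising.
        find_network_id.cache_clear()
        raise
```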
clarkbIf we want I think we can remove the clouds.yaml update, revert 961811 then restart the launcher and see if things just work again15:39
corvusi think that's worth doing.  do you read the sdk code as suggesting that during the time we had both in place, we may have been using only the launcher-provided data?15:40
corvusbecause i don't think https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/cloud/_compute.py#L980-L983 will have run until just now, since the launcher will have been supplying the nics arg15:40
corvusand if that's true, that suggests one of two possibilities: 1) your change to clouds.yaml to add the other network settings had some effect other than what we're looking at here in the create server call, or 2) the thing that actually fixed it was the restart and cache clearing.15:42
Clark[m]I'm having what is now becoming a ritual morning ISP packet loss problem arg. corvus looking at that code findNetworks raises Exception if the network is nil. I think we may cache that nullish value? but since we raised exception and not bad request maybe we use that going forward15:42
Clark[m]I suspect that we want to clear the cache in that situation too15:42
Clark[m]The other thought is the comment there indicates that exception occurs on 400 errors. Maybe we also need to handle 500 errors?15:42
Clark[m]as for coinciding with the weekly restarts I guess the thought there is we could've had this issue during the startup process and if it was something persistent on the cloud side both launchers could've cached the same results15:44
Clark[m](this is all still a bunch of hunches, I think we have to revert back to the config state we were in previously and restart to really start to blame this stuff. But I think the story is feasible)15:45
corvusClark: the functools lru cache will not cache the exception15:45
corvusi agree that it may be worth expanding the cache clearing if we're seeing a new error that warrants it, but i want to be careful and not just add all 5xx errors -- i want to know it really could be related to sending bad data.  because with the behavior we see from some clouds, i worry if we're too aggressive we could just nullify the caches altogether.15:47
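A tiny standalone demo of the point corvus makes above: functools.lru_cache memoizes return values (including None) but never memoizes a raised exception, so a failed lookup is retried on the next call while a None result sticks:

```python
import functools

calls = 0

@functools.lru_cache(maxsize=None)
def lookup(name):
    global calls
    calls += 1
    if name == "missing":
        raise Exception(f"Unable to find network {name}")
    return None if name == "empty" else f"id-for-{name}"

for _ in range(2):
    try:
        lookup("missing")   # raises both times: exceptions are not cached
    except Exception:
        pass
lookup("empty")             # returns None and caches it
lookup("empty")             # served from the cache
print(calls)                # 3
```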
clarkblooks like client.get_network calls an internal find_network method with ignore_missing=True. That ignore_missing=True parameter is what causes a None return if no result is found rather than raising an exception. That explains why we have that test in _findNetwork()15:51
clarkbI thought my irc connection was happier but its still iffy...15:51
clarkbcorvus: I'm thinking maybe we can add debug logging to _findNetwork after line 790 since in theory we're calling that almost never. Then revert back to the old config state (no networks specified in clouds.yaml and network specified in zuul-provider) then restart on that and see what we get?15:53
clarkbthen if we run into a similar situation again that extra logging should hopefully expose what the source of the issue was?15:53
clarkbif we immediately start failing then we probably aren't on the right hunch. If things work then the hunch is probably a good one and we just need more info on where things are going sideways?15:53
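A rough sketch of the kind of extra logging being proposed for the network lookup (the actual change is the one pushed later as https://review.opendev.org/c/zuul/zuul/+/961817); the names here are illustrative rather than the real zuul code:

```python
import functools
import logging

log = logging.getLogger("zuul.OpenstackEndpoint")


@functools.lru_cache(maxsize=None)
def find_network(client, name):
    log.debug("Looking up network %s", name)
    network = client.get_network(name)  # returns None when not found
    log.debug("Network lookup for %s returned %s", name, network)
    if network is None:
        raise Exception(f"Unable to find network {name}")
    return network
```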
corvusclarkb: yep sounds good.  want me to monkeypatch that in?15:58
corvusoh15:58
corvusno yeah, i can do that15:58
corvusi'd monkeypatch it, then clear the caches to force it to re-run.15:58
corvusor do you just want to merge a change and let it go in with the restarts?15:59
clarkbI think the main issue is that we have to restart either way to clear out the clouds.yaml config?16:00
clarkbso its a question of do we update clouds.yaml and zuul-provider config and restart with the debug info or without then monkey patch?16:00
clarkbconsidering we're going to restart automatically in a few hours maybe we should rely on that process more so that we're not in its way (or we don't have it undo our work)16:01
corvusyeah, just not sure if you want to do that like right now real quick, or get both changes merged and restart (which is like... later this afternoon)16:01
corvusack.  if you can write those changes, i'm happy to review/approve16:01
clarkbI think later is fine. Things are working right now. I'll push up the two revert changes16:01
corvusand the debug line too pls.16:02
corvusmy workspace is not conducive to writing that atm.  :)16:02
clarkbwill do16:02
opendevreviewClark Boylan proposed opendev/system-config master: Revert "Select the network to use in raxflex"  https://review.opendev.org/c/opendev/system-config/+/961815 16:03
opendevreviewClark Boylan proposed opendev/zuul-providers master: Revert "Stop supplying the network value for rax-flex"  https://review.opendev.org/c/opendev/zuul-providers/+/961816 16:05
clarkbremote:   https://review.opendev.org/c/zuul/zuul/+/961817 Add debug logging for openstack network lookups16:12
clarkbmy opportunity for a bike ride for the next few days is basically in the next 15 minutes or so. I think I'm going to pop out nowish to take advantage of that but then should be back to help with ^ and lists stuff etc16:13
clarkbif anyone sees problems with those three changes feel free to push updates I don't mind16:14
corvusall lgtm and approved16:14
opendevreviewMerged opendev/zuul-providers master: Revert "Stop supplying the network value for rax-flex"  https://review.opendev.org/c/opendev/zuul-providers/+/961816 16:14
opendevreviewMerged opendev/system-config master: Revert "Select the network to use in raxflex"  https://review.opendev.org/c/opendev/system-config/+/961815 16:36
fungiback just in time to see that it's all back to a wait-and-see17:18
Clark[m]fungi: I can't recall if putting lists01 in the emergency file was on the plan doc but it might be a good idea to do so. Also due to a scheduling conflict I have to do the school run at ~2105 UTC18:13
fungiyep, it's step 1 in fact. i'll do that in a sec18:16
fungiplanning to do a penultimate rsync starting in a little over an hour18:17
fungiit's in the disable list now18:17
clarkbcorvus: my change is hitting a test error. I'm going to look into it `AttributeError: 'FakeOpenstackProviderEndpoint' object has no attribute 'provider'`18:34
clarkbI used the same attributes that the exception uses to generate its message but I guess we don't have it faked out?18:34
clarkbI've updated my zuul launcher change to fix it18:43
fungiinfra-root: in precisely one hour i'll be starting our lists01 maintenance as described at https://etherpad.opendev.org/p/2025-09-mailman-volume-maintenance 18:59
fungistatus notice All hosted mailing lists are undergoing maintenance for the next hour: https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/UTMXRWWTE5WA3IF6WS3BIEJAORI2D62V/ 19:01
fungiat 20:00 utc i'll send something like this ^ to irc19:01
Clark[m]Lgtm. I'm eating lunch now so that I'm not distracted by hunger in an hour19:01
fungii've cleaned up stretch and bullseye-backports from the mirror.debian volume, working on the same for mirror.debian-security now19:03
clarkbfwiw I thought about whether or not it is an issue to allow multiple interfaces on the same network to be attached to a node. I don't think it is as you may assign them to different namespaces or pass them through to VMs hosted by the instance19:14
clarkbthat said I do think there is a small bug in openstacksdk: I think explicit network lists should override not supplement the networks provided in clouds.yaml19:14
clarkbthe reason for this is that you can already override most clouds.yaml options by supplying explicit values (think api versions or even credentials)19:15
clarkbbut also if you want to override and not add you have to rewrite your clouds.yaml file which seems like a pain. That said this is a minor issue and I don't think the internals of openstacksdk can currently distinguish what was passed explicitly vs via clouds.yaml right now so would require some refactoring19:16
fungithere is a potential problem for ipv4 routing where you may connect remotely to the machine on a different address than the one through which its default route lies19:17
fungiipv6 doesn't have that problem19:18
clarkbfungi: I think if you're doing pass through or separate namespaces you avoid that problem though19:18
clarkbas each network stack is effectively decoupled from the other and they are both going to see that interface attached to their bubble as the default route19:19
fungiyeah, it's really just for services listening directly on the interface19:19
corvusclarkb: are you sure we weren't overriding clouds.yaml?  https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/cloud/_compute.py#L980-L983 reads like the caller wins if they supply nics, and that's should have been happening19:19
clarkbcorvus: I guess I'm not 100% certain but its the only explanation I can come up with for why we got two nics19:21
clarkbcorvus: oen from the clouds.yaml definition and the other from the zuul-provider network list (that gets passed to openstacksdk as a nics list)19:21
corvusyeaah -- maybe there's something happening at another level19:27
fungipenultimate mailman rsync is in progress now, should finish by the top of the hour19:33
fungi#status notice All hosted mailing lists are undergoing maintenance for the next hour: https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/UTMXRWWTE5WA3IF6WS3BIEJAORI2D62V/ 20:00
opendevstatusfungi: sending notice20:00
-opendevstatus- NOTICE: All hosted mailing lists are undergoing maintenance for the next hour: https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/UTMXRWWTE5WA3IF6WS3BIEJAORI2D62V/ 20:00
fungithe irony of linking to the ml archive in that isn't lost on me20:00
clarkbheh20:01
opendevstatusfungi: finished sending notice20:02
clarkblooks like things are shutdown at this point20:05
fungiyep, final rsync is already underway20:05
fungionce that's done, i'll start the containers again and send a test post for the maintenance conclusion20:05
fungii already have it queued up20:05
fungiif earlier rsyncs were any indication, this should finish around 20:25 utc20:07
fungimaybe sooner since the data shouldn't be changing this time20:07
fungihoping i'll have the maintenance wrapped up by half-past20:07
clarkback I'm following along just holler if I can be useful20:08
fungiespecially if the faster filesystem means quicker container startup20:08
fungiwill do, so far this is all going to plan20:08
fungidone and starting20:16
fungihttps://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/UTMXRWWTE5WA3IF6WS3BIEJAORI2D62V/ is loading for me now20:18
fungias is https://lists.opendev.org/mailman3/lists/service-announce.lists.opendev.org/ 20:19
fungiso that covers both hyperkitty and postorius20:19
fungisending the completion e-mail20:19
clarkblists.zuul-ci.org archives also load for me (just checking a different vhost for completeness)20:20
clarkbwhich list is the completion email being sent to? I'm not seeing it yet20:22
fungiservice-announce. i'm about to start cross-referencing logs20:22
fungi2025-09-19 20:19:40 1uzhZx-0003R2-Qh => service-announce@lists.opendev.org R=dnslookup T=remote_smtp H=lists.opendev.org [2001:4800:7813:516:be76:4eff:fe04:5423] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no DN="C=UK,O=Exim Developers,CN=lists01.opendev.org" C="250 OK id=1uzha0-006Dzl-A3"20:25
fungithat's my end20:25
clarkb'2025-09-19 20:19:40 1uzha0-006Dzl-A3 <= fungi@yuggoth.org' that is in exim's mainlog file20:25
fungi2025-09-19 20:25:44 1uzha0-006Dzl-A3 == service-announce@lists.opendev.org R=mailman_router T=mailman_transport defer (-54): retry time not reached for any host for 'lists.opendev.org'20:26
clarkband then ^ that ya20:26
clarkbis it possible that exim noticed that we shut things down and is simply waiting to try deliveries again?20:27
fungisince exim was up and running while mailman was down, i bet it tried to deliver some spam20:27
fungiand yeah, it'll retry in a bit20:27
clarkblooking at https://www.exim.org/exim-html-current/doc/html/spec_html/ch-retry_configuration.html I'm still not quite sure what I should be looking at in the exim config to know when it is likely to retry heh20:30
clarkbI think we retry every 15 minutes for 2 hours then back off20:31
corvusexim -qff if you want to process the queue20:32
clarkbso around 20:34 we should expect it to try again20:32
clarkbcorvus: thanks! I suspect it will try on its own in just a minute or two at this point20:32
clarkboh yup I just got the email20:32
fungiyeah, i'm not in any hurry, i blocked out to 21:00 in the announcement anyway20:32
fungiah perfect20:32
fungias did i20:32
fungihttps://lists.opendev.org/archives/list/service-announce@lists.opendev.org/thread/UTMXRWWTE5WA3IF6WS3BIEJAORI2D62V/#UTMXRWWTE5WA3IF6WS3BIEJAORI2D62V shows it now too20:33
clarkbnow we'll have to see if the performance is better. I noticed that iowait was somewhat high around the time things were starting up but that may be residual due to all the startup actions. But also the web ui was responsive for me during that time so could also be that we still have iowait but we're processing io requests quickly enough that we don't notice as much20:34
clarkbtime will tell20:34
fungiother than cleaning up the temporary /var/cache/var_lib_mailman.old directory and taking the server back out of the disable list, the maintenance is done20:34
fungisystem load seems a bit lower, and iowait, while bursty, is not sustained at a significant percent of cpu right now20:35
clarkbya that may be an indication that when we need the disk we really need it and now we're able to get through those requests more quickly20:36
fungii just went through the moderation queues for about a dozen lists discarding some spam and everything was snappy20:36
clarkbnice20:36
fungithere have been days recently where i'd tell my browser to load the moderation queue for a list, then wait 2 minutes for the page to render20:37
fungithen select some messages to discard, and wait a couple more minutes for it to do that20:38
fungii'm going to go ahead and self-approve https://review.opendev.org/961528 to clear the mirror.openeuler volume contents20:42
clarkbsounds good. Is that the last cleanup of the known cleanups that we can do at this point?20:43
fungii think so, other than maybe going through puppet/ceph mirrors and some of the wheel volumes20:44
fungiwe still have bionic arm64 wheels for example20:44
fungiand xenial amd6420:44
fungiopeneuler's the biggest cleanup opportunity though at the moment20:46
fungi337gb of data20:46
fungii've taken lists01 back out of the emergency disable list now, but haven't terminated the screen session nor deleted the moved original data directory yet20:49
clarkb337gb is not small (thats almost a whole centos 9 stream)20:51
fungiyeah, it's nearly 10% of our total data20:52
clarkbfor the wheel caches/mirrors we never added noble (or centos 10 stream or rocky linux etc) and it seems to work. I think that enough of the python ecosystem caught up with needing to publish wheels that we just don't have problems there anymore. We might even be able to look into cleaning up wheels for other things too20:52
clarkbI know as you go back in time in terms of python versions wheels were less common though so maybe its best to let them die on the vine instead20:52
fungiright, i think we stop adding new wheels and clean up the old ones when we drop images/nodes for those platforms20:53
fungiso could clean up the wheel volumes for xenial-amd64 and bionic-arm64 but probably makes sense to batch those up with other cleanups since they're small to begin with20:54
fungixenial is under 10gb and it's the largest of them20:54
fungipre-noble, we didn't add wheel mirrors for jammy either20:56
fungidebian bullseye, ubuntu focal and centos 9 stream are the newest20:56
fungioh, we have a wheel mirror volume for debian buster too20:57
clarkbhuh we'll be able to clear out everything but centos 9 pretty soon probably (relative to how long we've had up xenial)20:57
opendevreviewMerged opendev/system-config master: Stop updating and delete OpenEuler mirror content  https://review.opendev.org/c/opendev/system-config/+/961528 21:11
*** dmellado9 is now known as dmellado22:01
