*** darmach6 is now known as darmach | 00:12 | |
opendevreview | Tony Breeds proposed openstack/diskimage-builder master: Add new openstack/devstack based functional testing https://review.opendev.org/c/openstack/diskimage-builder/+/949942 | 01:06 |
opendevreview | Tony Breeds proposed openstack/diskimage-builder master: Add new openstack/devstack based functional testing https://review.opendev.org/c/openstack/diskimage-builder/+/949942 | 02:30 |
opendevreview | Tony Breeds proposed openstack/diskimage-builder master: Add new openstack/devstack based functional testing https://review.opendev.org/c/openstack/diskimage-builder/+/949942 | 02:32 |
opendevreview | Tony Breeds proposed openstack/diskimage-builder master: Add new openstack/devstack based functional testing https://review.opendev.org/c/openstack/diskimage-builder/+/949942 | 03:46 |
mnasiadka | Ramereth[m]: it seems you were able to fix it - it's now way better - thanks :) | 05:40 |
mnasiadka | morning | 05:59 |
opendevreview | Elod Illes proposed openstack/project-config master: Allow late EOL tagging of trailing projects https://review.opendev.org/c/openstack/project-config/+/951842 | 10:51 |
opendevreview | Elod Illes proposed openstack/project-config master: Allow late EOL tagging of trailing projects https://review.opendev.org/c/openstack/project-config/+/951842 | 11:08 |
clarkb | fungi: disks are full again | 14:55 |
clarkb | at least on gitea10 which I pulled up to look into doing a RO change | 14:55 |
clarkb | fungi: we could click buttons on gitea10 to clean it up then set the dir to ro then reboot and see how we do? | 14:55 |
clarkb | 10, 11, 13, and 14 seem to be in the bad state | 14:57 |
clarkb | infra-root ^ any other opinions on how to tackle that? | 14:57 |
clarkb | my concern is that gitea may just update the perms on the dir if it finds them unwriteable. Which I guess is fine as a test in prod. I suspect that we might generate more 500 errors otherwise which I think is also acceptable | 14:58 |
clarkb | so I think its fine to try that on gitea10 and see what happens? | 14:58 |
corvus | seems like a great time to try that experiment since the service is impacted anyway. | 14:58 |
clarkb | ok I'll start on that in a few | 14:59 |
fungi | yeah, no objection on that test | 15:00 |
fungi | gotta do something anyway | 15:00 |
fungi | maybe we can also block the actual url pattern or something | 15:01 |
clarkb | ya that might cut down on 500 errors | 15:02 |
fungi | .*/archive/[^/]+\.(tar\.gz|zip)$ | 15:02 |
clarkb | fungi: hrm I got a 500 error just trying to login did that happen to you? | 15:03 |
clarkb | maybe you had to free up some other disk space first? | 15:03 |
fungi | clarkb: yes, you need to free up a little space on the rootfs first | 15:03 |
clarkb | ack | 15:03 |
fungi | i did it with `sudo journalctl --vacuum-time=1d` | 15:04 |
fungi | free up a little space so it can write to the db, then log in, then clear the archive cache, then reboot | 15:04 |
fungi | archive/[^/]+ is too conservative, there's branches with / in their names, eg. https://opendev.org/openstack/nova/archive/stable/2025.1.tar.gz | 15:05 |
fungi | so something more like /archive/.+\.(tar\.gz|zip)$ i guess | 15:06 |
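The difference between the two patterns above can be checked with a short sketch. It uses Python's `re` module as a stand-in for Apache's regex engine, which is an assumption, and the x/wsme path is an illustrative guess at the URL shape; only the nova URL is taken from the discussion.

```python
import re

# The two patterns as written in the discussion above.
NARROW = re.compile(r".*/archive/[^/]+\.(tar\.gz|zip)$")
BROAD = re.compile(r"/archive/.+\.(tar\.gz|zip)$")

paths = [
    # Illustrative path for a branch without a slash in its name.
    "/x/wsme/archive/master.tar.gz",
    # Path of the nova URL quoted above; the branch name contains a slash.
    "/openstack/nova/archive/stable/2025.1.tar.gz",
]

# Map each path to (matched by NARROW, matched by BROAD).
results = {p: (bool(NARROW.search(p)), bool(BROAD.search(p))) for p in paths}

for path, (narrow, broad) in results.items():
    print(f"{path}: narrow={narrow} broad={broad}")
```

The narrow pattern misses the stable/2025.1 archive because `[^/]+` cannot span the slash in the branch name, while the broad pattern catches both paths.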
clarkb | dr-xr-xr-x 2 1000 1000 4096 Jun 5 15:07 repo-archive | 15:08 |
clarkb | I'm going to reboot and start services again (I shut them down before setting up the new archive perms) | 15:08 |
clarkb | things are up and perms haven't changed | 15:09 |
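A minimal sketch of what mode 555 looks like, run against a throwaway temporary directory rather than the real /var/gitea tree. It only shows that `chmod 555` produces the `dr-xr-xr-x` listing seen above; whether Gitea tolerates it is what the test in production was for. Note that a process running as root can still write into such a directory.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Stand-in for the repo-archive directory; the real one lives under
    # the Gitea data dir, not in a temp dir.
    archive_dir = os.path.join(base, "repo-archive")
    os.mkdir(archive_dir)

    # Read and execute for everyone, write for no one.
    os.chmod(archive_dir, 0o555)

    # Render the mode the way `ls -l` does.
    mode_string = stat.filemode(os.stat(archive_dir).st_mode)
    print(mode_string)
```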
clarkb | seems like normal operations continue to work on that host. I tried to download the tar.gz for x/wsme master and I see the client making the requests and getting a 200 back indicating the request has been accepted but I don't see anything changing in the repo-archive dir | 15:12 |
clarkb | so I think this means we're successfully breaking this | 15:12 |
fungi | did it actually download? | 15:13 |
clarkb | fungi: no | 15:13 |
clarkb | I think the way this works is you make a request to start the download process which is what creates the artifact and caches it then when the file is available the browser client initiates the actual download | 15:14 |
clarkb | let me know if you want to do any additional testing before I proceed with 11, 13, and 14 (then we can do 09 and 12 after) | 15:14 |
clarkb | and then we can retrigger more replication | 15:15 |
clarkb | I picked x/wsme because it is small (so should tarball quickly) and is an easy thing to grep for | 15:16 |
fungi | no, lgtm. i've got a ProxyPassMatch reject line to add to the apache config, just need to write the test for it | 15:16 |
clarkb | thanks | 15:16 |
clarkb | I'm actually going to manually trigger the admin task to delete the archives to see what happens there | 15:16 |
clarkb | since we run that in a daily ish cron too I want to make sure we don't make the server sad with the updated perms doing that | 15:17 |
clarkb | ok I think it naively succeeded because I created a new empty dir for repo-archive. Looking at admin/monitor/cron it says the trigger I just did succeeded. It must've listed the dir saw nothing to cleanup and exited immediately | 15:18 |
clarkb | which is perfect | 15:18 |
clarkb | proceeding with the other servers now | 15:19 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Block access to Gitea's archive feature https://review.opendev.org/c/opendev/system-config/+/951873 | 15:28 |
slittle | Hi, is opendev experiencing any issues propagating merged reviews from review.opendev.org to opendev.org ? | 15:30 |
slittle | e.g. https://review.opendev.org/c/starlingx/root/+/951867 merged 45 min ago, but isn't visible on opendev.org | 15:31 |
clarkb | slittle: yes, I'm busy correcting it now. If you pull up channel logs you'll see more details | 15:31 |
fungi | slittle: at the moment yes, some (presumably llm training) crawler is filling up the disk by trying to download tarballs and zipfiles of every repository state | 15:31 |
fungi | as soon as surgery is complete we'll force full re-replication from gerrit and they should catch up | 15:32 |
slittle | love those llm's | 15:33 |
fungi | we've got a couple of mitigations in flight to prevent it long-term (by disabling the gitea feature that lets users download tarballs and zips of repositories) | 15:33 |
slittle | I'll alert my community | 15:33 |
fungi | #status log Gitea archive caches filled up filesystems again today on most of the backends, cleanup is proceeding, Git repository states are lagging behind Gerrit by a few hours but will catch up soon. | 15:35 |
opendevstatus | fungi: finished logging | 15:35 |
clarkb | ok the four properly broken servers should be "fixed" now | 15:36 |
* fungi shakes fist at the rogue ai takeover of the internet | 15:36 | |
clarkb | I'm going to do 09 and 12 so they match and then we can trigger replication and then we can write a system-config change to match the chmod 555 | 15:37 |
fungi | sounds good, want that included in https://review.opendev.org/c/opendev/system-config/+/951873 or is it better as a separate change? | 15:37 |
fungi | looks like we have a lot of other chmods in docker/gitea-init/entrypoint.sh, is that a good place for this too or should ansible take care of it? | 15:39 |
clarkb | fwiw I moved aside the old dir as some of them still had like 400MB of content in them | 15:40 |
clarkb | fungi: I was going to have ansible take care of it. I think that entrypoint stuff may come from upstream | 15:40 |
fungi | makes sense. i can put together the ansible change | 15:40 |
clarkb | fungi: we create the data dir in ansible and just need to create the repo-archive dir which lives under there at /var/gitea/data/gitea/repo-archive and set the mode | 15:40 |
clarkb | thanks | 15:40 |
fungi | yep | 15:40 |
clarkb | you might need to create the data/gitea/ dir also so that we can set the mode on that to not 555 | 15:41 |
clarkb | which I think if ansible does it recursively it will apply the same mode everywhere? | 15:41 |
fungi | does the file task recursively create directories ? | 15:42 |
clarkb | I think it can in certain situations | 15:46 |
clarkb | all 6 giteas should be done now. I'm going to double check my work and make sure they all rebooted recently, have all containers running, still have mode 555 on repo-archive, and aren't consuming all the disk then I can trigger gerrit replication | 15:46 |
clarkb | yup that all checks out. Triggering replication now | 15:49 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Block access to Gitea's archive feature https://review.opendev.org/c/opendev/system-config/+/951873 | 15:50 |
fungi | 951873 failed system-config-run-gitea on a mariadb issue | 15:50 |
fungi | never got far enough to run the new testinfra test | 15:51 |
fungi | i should say, on a failure to download the mariadb container image | 15:52 |
clarkb | 502 response from quay I think? | 15:52 |
clarkb | ya | 15:52 |
fungi | looks like it | 15:52 |
fungi | ERROR: for mariadb received unexpected HTTP status: 502 Bad Gateway | 15:52 |
clarkb | down to 11k replication tasks. That seems to be steadily falling | 15:54 |
clarkb | fungi: another thought would be to disable the archive cleanup cron entirely too | 15:55 |
clarkb | fungi: that would be in the gitea app.ini config file if you want to add that to your change | 15:55 |
clarkb | or set it to run less often in case something sneaks through somehow? | 15:55 |
fungi | i'm adding a disable for it for now | 16:00 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Block access to Gitea's archive feature https://review.opendev.org/c/opendev/system-config/+/951873 | 16:01 |
fungi | if reviewers would prefer to just see the frequency reduced, i can adjust the change | 16:02 |
clarkb | I think this should be fine since we're no longer writing to those dirs | 16:02 |
clarkb | fungi: just posted a thought on the regex you're using too | 16:05 |
clarkb | slittle: I think we should be caught up now. Can you check your repo again? | 16:06 |
fungi | clarkb: i'm unclear on how your proposed regex would avoid matching attempts to browse a repository that has a directory named "archive" with files under it ending in those extensions | 16:08 |
fungi | also where did you see the bundle file downloads linked? | 16:08 |
clarkb | fungi: because its rooted with two sections under / one for org and one for repo eg opendev/system-config/ | 16:08 |
fungi | the archive feature in the webui only showed me .tar.gz and .zip | 16:08 |
clarkb | fungi: if you go to a repo page and click on the code dropdown it shows me .tar.gz .zip and .bundle | 16:09 |
fungi | aha, it doesn't do .bundle for the branch downloads, which is where i was looking | 16:09 |
clarkb | oh ya this is on the root page of a repo so its doing it for the master branch there | 16:09 |
fungi | can we assume that repositories we're hosting will always have one and only one "/" in their names? | 16:10 |
fungi | if so, then yes your suggestion works | 16:10 |
clarkb | that is a good question | 16:10 |
clarkb | I think all do today. And I'm not sure if gerrit and gitea handle deeper nesting | 16:10 |
clarkb | the paths that we want to not break are of the form /opendev/system-config/src/branch/master/archive/foo.tar.gz | 16:11 |
fungi | i don't think gerrit nests anything, does it? so this would only be a gitea question, as it has an org concept | 16:11 |
clarkb | another approach would be to ensure that src not occur before archive? | 16:11 |
fungi | that would need a negative lookahead, i guess apache's regex parser can handle those? | 16:12 |
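The two ideas floated here, rooting the pattern at org/repo and excluding paths that pass through `src`, can be compared in a small sketch. Both patterns are hypothetical illustrations written for this example, not the regex posted on the review, and Python's `re` is assumed to behave like Apache's engine for these constructs.

```python
import re

# Hypothetical: require exactly two path segments (org, repo) before /archive/.
ROOTED = re.compile(r"^/[^/]+/[^/]+/archive/.+\.(tar\.gz|zip|bundle)$")

# Hypothetical: allow any prefix, but use a negative lookahead to skip
# paths where /src/ occurs before /archive/.
NO_SRC = re.compile(
    r"^(?!.*/src/.*/archive/).*/archive/.+\.(tar\.gz|zip|bundle)$"
)

cases = {
    # Archive downloads that should be blocked.
    "/opendev/system-config/archive/master.tar.gz": True,
    "/openstack/nova/archive/stable/2025.1.tar.gz": True,
    # Browsing a file under a directory named "archive"; must not be blocked.
    "/opendev/system-config/src/branch/master/archive/foo.tar.gz": False,
}

# For each path, record what each pattern decided.
decisions = {
    path: (bool(ROOTED.match(path)), bool(NO_SRC.match(path)))
    for path in cases
}

for path, expected in cases.items():
    print(path, decisions[path], "expected block:", expected)
```

The rooted form depends on every repository having exactly one "/" in its name, which is the open question in the conversation; the lookahead form does not, at the cost of a harder-to-read pattern.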
clarkb | its possible that the git protocols also use "archive" in paths? | 16:12 |
fungi | probably git protocol urls would never include those file extensions at the very end | 16:12 |
clarkb | quickly approaching "it would be nice if gitea just let us disable this feature" | 16:12 |
fungi | yes, indeed | 16:12 |
fungi | i wonder if we can strip it out of the web template instead? | 16:12 |
clarkb | I suspect we can though that may not prevent smarter bots | 16:13 |
fungi | yeah that wouldn't prevent anyone who used the direct url, but the filesystem perms adjustment breaks those and we don't care if they're confused | 16:13 |
clarkb | true | 16:13 |
clarkb | let me see what I can find in the templates | 16:13 |
fungi | not providing the links in the ui would be the most user-friendly solution | 16:14 |
fungi | i need to step away from the keyboard for a few minutes, brb | 16:14 |
clarkb | {{if and (not $.DisableDownloadSourceArchives) $.RefName}} | 16:15 |
clarkb | maybe we can disable this! | 16:15 |
fungi | whoa! | 16:15 |
fungi | that must have sneaked in and not gotten a mention in the changelog? | 16:15 |
clarkb | its a per repo setting it looks like so maybe something that ya they added and since it isn't a system wide setting didn't get much attention? | 16:16 |
clarkb | but I suspect we may be able to update our repo management tooling to set the flag | 16:16 |
clarkb | or maybe it isn't exposed via the api at all? I'm having trouble finding it | 16:17 |
clarkb | fungi: left a comment with the way to disable them on your change | 16:23 |
clarkb | feature was added in august 2022 so somewhat new and ya doesn't seem to have gotten much attention that I see | 16:24 |
clarkb | ctx.Error(http.StatusNotFound) <- that appears to be what gitea will return if you make the request so I think your existing test cases may still be valid | 16:29 |
fungi | clarkb: should we also undo the chmod in that case, once this lands, instead of adding it to the ansible? | 16:46 |
fungi | or keep it in for belt-and-braces? | 16:46 |
clarkb | I could go either way on that. I guess undoing in prod is something we don't need to take a downtime like we just did to apply it since we're making things more permissive | 16:47 |
clarkb | so ya maybe drop the special config in ansible in your change to create the dir and chmod things, then after your change lands and applies we can re chmod that dir back to its old perms | 16:47 |
clarkb | and confirm that we're still not getting flooded then | 16:48 |
clarkb | that wfm | 16:48 |
fungi | looks like gitea doesn't do typical html error pages like apache does, so the response only contains the string "Not found." | 16:49 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Block access to Gitea's archive feature https://review.opendev.org/c/opendev/system-config/+/951873 | 16:51 |
clarkb | also note I'm not sure if updating app.ini will do automated restarts of gitea | 16:53 |
clarkb | I feel like it didn't at one time then I have vague memory of fixing that but not sure if that is a hallucination in the AI system | 16:54 |
fungi | glitch in the matrix | 16:55 |
corvus | 2025-06-05 17:37:09,450 DEBUG zuul.Launcher: [e: 1856baa99bb84856855a99b450571e2c] [req: f315607a0dce4b7a95e3e065907789c6] Selected request main provider <OpenstackProvider canonical_name=opendev.org%2Fopendev%2Fzuul-providers/openmetal-iad3-main> | 17:40 |
corvus | 2025-06-05 17:37:19,075 ERROR zuul.Launcher: [e: 1856baa99bb84856855a99b450571e2c] [req: f315607a0dce4b7a95e3e065907789c6] Error in creating the server. Compute service reports fault: No valid host was found. There are not enough hosts available. | 17:40 |
corvus | seeing that error a bit in openmetal iad3 | 17:40 |
corvus | (from niz) | 17:41 |
corvus | perhaps we just have more quota than actual hosts? | 17:41 |
corvus | 2025-06-05 17:37:19,773 DEBUG zuul.Launcher: [e: 1856baa99bb84856855a99b450571e2c] [req: f315607a0dce4b7a95e3e065907789c6] Provider quota including Zuul: {'instances': inf, 'cores': inf, 'ram': inf, 'volumes': inf, 'volume-gb': inf} | 17:42 |
corvus | yes! yes we do :) | 17:42 |
corvus | i guess we depend on max-servers with nodepool. and of course we could do the same with niz later. | 17:43 |
clarkb | corvus: should we set a quota in the cloud or set a limit on the zuul provider? | 17:43 |
clarkb | ya | 17:43 |
corvus | and in the interim, the two aren't going to be able to cooperate since they don't know the quota | 17:43 |
corvus | but also, i'm thinking that if we could set the quota, that would probably be good, because if we're going to allow node size diversity with niz, that will be important | 17:43 |
corvus | (ie, having 4, 8, 16gb nodes) | 17:44 |
clarkb | I think if we log into horizon in that cloud it gives us a quick overview of total resources which we can use to quickly math out some quotas | 17:44 |
clarkb | and setting quotas in horizon is probably easy too so one stop shopping there | 17:44 |
clarkb | I can look at doing that closer to lunch if no one beats me to it | 17:44 |
corvus | yeah, and if it somehow isn't, we can at least use those numbers to set niz max-resources (so basically, like max-servers, but fine-grained). but quota would be best. | 17:45 |
corvus | that sounds great, thanks! | 17:45 |
corvus | we stopped running puppet on cacti because of an rce, but that also means we're not getting new hosts added | 17:52 |
corvus | i wonder if we should update the config for the host to make the blocking permanent and turn it back on | 17:53 |
corvus | or should we just run the create-graphs script manually | 17:53 |
corvus | i am going to run create_graphs.sh for zl01 manually, but let's think about whether we should update the ansible for that host | 17:54 |
clarkb | I think I'd be ok with trying to make it permanent | 17:55 |
clarkb | then reenabling it for now | 17:55 |
clarkb | corvus: I think that can be done without any puppet actually since we use ansible to do the firewall rules? | 17:55 |
clarkb | so it might be fairly straightforward | 17:55 |
corvus | yeah | 17:55 |
corvus | #status log ran create_graphs.sh to create cacti graphs for zl01, review03, nl05-8, nb05-7, codesearch02, grafana02 | 18:01 |
opendevstatus | corvus: finished logging | 18:01 |
corvus | i just worked back in the history of the file until i found a host already created | 18:01 |
fungi | log stream says the testinfra test i added is failing, probably need to null-terminate the string or something | 18:11 |
fungi | yep, that, but also the string isn't quite the same as what i got from a random nonexistent url on gitea | 18:18 |
fungi | AssertionError: assert 'Not Found\n' == 'Not found.' | 18:18 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Block access to Gitea's archive feature https://review.opendev.org/c/opendev/system-config/+/951873 | 18:19 |
clarkb | I've updated the quota in openmetal to 50 instances (matches nodepool max servers though we are slightly over so may have leaked a small number of nodes), 520 vcpus (we seem to have capacity for 540 (I think this is with oversubscription)) that should give us some headroom for cloud services and ~2TB of memory. Note horizon seems to only let you modify the default quotas but since we | 18:23 |
clarkb | are the only users of the cloud I figured this was ok for now and we can be more specific later if necessary | 18:23 |
clarkb | corvus: ^ fyi | 18:23 |
clarkb | if anyone knows how to do this on a per project basis via horizon I'm happy to revert the defaults and set project specific values | 18:23 |
clarkb | oh I think I just found it actually. its a hidden menu option for projects | 18:24 |
corvus | clarkb: do you know how many ips we have? | 18:25 |
clarkb | corvus: its ~50. Thats where the max server size comes from | 18:26 |
clarkb | s/size/count/ | 18:27 |
clarkb | but I can see if I can find the actual pool allocation | 18:27 |
corvus | cool -- that's where i was going with that question. thanks :) | 18:28 |
clarkb | for the record I reverted the defaults back to unlimited instances, memory, and vcpus. Then modified the zuul project to a limit of 51 instances (in the project settings it wouldn't let me set a quota lower than current consumption), 500 vcpus (thats 10 per instance) , and 2TB of memory | 18:28 |
clarkb | corvus: ya in this cloud our primary limitation is IP addrs | 18:28 |
corvus | 2025-06-05 18:28:51,411 ERROR zuul.Launcher: [e: fd10e30d60474b68a40f8616bd7f1dec] [req: 18281e1a650b44b9b9c00f557759c4b6] Error in creating the server. Compute service reports fault: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 56fcb68f-206e-4aff-92bc-710417b30935. Last exception: sequence item 1: expected string, NoneType found | 18:29 |
corvus | that looks like a python error from openstack | 18:29 |
corvus | lemme check which cloud that is | 18:30 |
clarkb | corvus: we have a /26 which is 64 total IPs then you have overhead for your neutron routers and dhcp agents iirc. We might get away with something closer to 55? but not much larger | 18:30 |
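The address arithmetic can be checked with Python's standard `ipaddress` module. The network below is a documentation placeholder (192.0.2.0/26), not the cloud's real allocation, and the exact overhead for routers and DHCP agents is not given in the discussion, so only the raw counts are computed.

```python
import ipaddress

# Placeholder /26 from the documentation range; the real pool is not
# stated in the log.
pool = ipaddress.ip_network("192.0.2.0/26")

# Every address in the block, including network and broadcast.
total_addresses = pool.num_addresses

# Addresses that can be assigned to hosts (network and broadcast excluded).
usable_hosts = len(list(pool.hosts()))

# Headroom left for routers, DHCP agents and so on at two server counts.
headroom_at_50 = usable_hosts - 50
headroom_at_55 = usable_hosts - 55

print(total_addresses, usable_hosts, headroom_at_50, headroom_at_55)
```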
corvus | canonical_name=opendev.org%2Fopendev%2Fzuul-providers/rax-iad-main was the compute service fault | 18:30 |
clarkb | did the limit not land there? | 18:31 |
clarkb | there == zuul-provider/zuul-launcher since rax-iad is the one we wanted to disable | 18:31 |
corvus | i think that's a completely different error | 18:31 |
corvus | just bringing it up since it's new to me | 18:31 |
clarkb | well it should stop trying to boot instances with a limit of 0 right? | 18:31 |
corvus | oh that, sorry i misunderstood | 18:32 |
clarkb | so I'm mostly wondering why it tried to boot an instance. But maybe the error itself is manifesting in zuul launcher differently than it did with nodepool? | 18:32 |
corvus | i thought you were asking about the new limits from today, not the old limits from yesterday :) | 18:32 |
clarkb | same issue different symptom | 18:32 |
corvus | i'm checking on your question re resource-limits | 18:33 |
clarkb | corvus: one thing I noticed with your change is it applied to section not provider (or maybe vice versa) maybe that matters? | 18:33 |
corvus | yep, shouldn't, but could be a bug | 18:33 |
clarkb | #status log Set project quotas for instances, memory, and vcpus in the zuul openmetal cloud project. | 18:39 |
opendevstatus | clarkb: finished logging | 18:39 |
corvus | clarkb: i believe the setting is correct, but there is a bug in the launcher in provider selection that lets it slip through. working on it. | 18:52 |
corvus | clarkb: re openmetal, this is what zl sees now: {'instances': 51, 'cores': 500, 'ram': 2097152, 'volumes': inf, 'volume-gb': inf} | 18:54 |
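As a quick sanity check on those numbers, the sketch below assumes the launcher reports `ram` in MiB; that unit is an inference from the value 2097152 lining up with 2 TiB, not something stated in the log.

```python
# 1 TiB expressed in MiB.
MIB_PER_TIB = 1024 * 1024

# The 2TB memory quota, converted to the unit the launcher appears to report.
ram_quota_mib = 2 * MIB_PER_TIB

# 500 vcpus spread over the intended 50 instances.
cores_per_instance = 500 // 50

# Values as reported by the launcher in the line above.
reported = {"instances": 51, "cores": 500, "ram": 2097152}

print(ram_quota_mib == reported["ram"], cores_per_instance)
```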
Clark[m] | That looks correct. Sorry switched clients as I'm sorting out lunch now | 18:57 |
clarkb | apparently you can get mulch blown in like insulation. It is not quiet | 19:47 |
clarkb | fungi: AssertionError: assert 'Not Found\n' == 'Not found.' | 19:47 |
clarkb | fungi: I think you just need a newline on your test assertions then your change should pass | 19:48 |
fungi | that was what i fixed in the most recent patchset | 19:49 |
fungi | oh! i only fixed one occurrence of it | 19:49 |
clarkb | fungi: its still failing on the latest patchset | 19:50 |
clarkb | oh ok cool I wanted to make sure i wasn't missing something | 19:50 |
fungi | yeah, i needed to fix it three times, not just once :/ | 19:50 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Block access to Gitea's archive feature https://review.opendev.org/c/opendev/system-config/+/951873 | 19:51 |
* fungi sighs | 19:51 | |
fungi | is it friday yet? | 19:51 |
clarkb | not quite but I feel that | 19:51 |
opendevreview | Merged opendev/zuul-providers master: Use zuul-supplied image formats https://review.opendev.org/c/opendev/zuul-providers/+/949944 | 20:29 |
corvus | re zuul-launcher: 2 things: 1) there's an error where we don't check whether a provider could ever possibly have enough quota for a node before we assign it. that's an easy fix. | 20:41 |
corvus | 2) without the fix for #1 in place, we should see some nodes that are assigned to rax-iad and stay stuck there forever, but that didn't happen because at some point, the quota went crazy and it thought it actually could proceed. | 20:41 |
corvus | i can not find a code path for #2 to happen; i have no explanation. there is a bit more debugging info in the latest image. i'm going to restart zuul-launcher on that. perhaps something in my monkeypatching to fix the zero division error caused it. | 20:41 |
corvus | but if not, hopefully we'll see it again with more debug info. | 20:41 |
clarkb | ack | 20:41 |
mnasiadka | hello | 21:02 |
mnasiadka | clarkb: do you think we'll get a second core review on https://review.opendev.org/c/opendev/glean/+/941672? I'd like to move on a bit - I know we're waiting for tonyb on the DIB switch to use devstack for testing - but still :) | 21:03 |
clarkb | mnasiadka: my hunch is that fungi is the most likely extra reviewer. fungi do you think you'd like to review that or should we proceed with my review? | 21:04 |
mnasiadka | thanks :) | 21:13 |
fungi | looking | 21:25 |
fungi | big diff, but all the new codepaths look sufficiently gated by conditionals that at worst it will only break rhel10/centos10, which wasn't working anyway | 21:30 |
clarkb | and TheJulia acked that this is unlikely to create problems for ironic | 21:32 |
* TheJulia is summoned | 21:32 | |
TheJulia | oh hai | 21:32 |
* TheJulia reads | 21:32 | |
clarkb | TheJulia just noting that the glean update for centos 10 stream seems safe enough as written | 21:32 |
TheJulia | Oh yeah, I concur | 21:33 |
clarkb | and that you chimed in to agree when it comes to ironic | 21:33 |
TheJulia | ... there is an aspect I was made aware of yesterday in general, but more so a quirk regarding cloud-init with the metadata, but that is entirely unrelated. | 21:34 |
TheJulia | (and I think glean does it right.) | 21:34 |
* TheJulia may also be biased | 21:36 | |
* TheJulia steps back into the shadows | 21:37 | |
opendevreview | Merged opendev/glean master: Add support for CentOS 10 keyfiles https://review.opendev.org/c/opendev/glean/+/941672 | 22:08 |
mnasiadka | yay :) | 22:14 |
clarkb | I think we need a new release for that too, but one step at a time | 22:14 |
clarkb | fungi: https://review.opendev.org/c/opendev/system-config/+/951873 passes now \o/ I've double checked our ansible and I don't believe that we will restart gitea automatically after that change lands | 22:24 |
clarkb | thats ok I'm happy to work through it manually tomorrow (running out of energy for that today I think) if we land the change at some point between now and then | 22:24 |
clarkb | the process requires us to start the containers in the right order so that gerrit replication doesn't lose events | 22:24 |
clarkb | I'm also happy to walk anyone else through it if they are interested. Its straightforward just repetitive | 22:25 |
corvus | https://review.opendev.org/951919 should address the main issue with launcher attempting to create nodes in iad. the parent should also fix an issue with request priority (which we haven't seen yet, but i think we would as we scale up). | 22:41 |
corvus | now that cacti is in place: the load average on zl01 is negligible. it peaks at 0.1. | 22:42 |
corvus | it's not doing much yet, but we've had some bursts of activity. | 22:43 |
clarkb | corvus: for 951919 I'm not understanding where the limits come into play. It seems to only consider the provider's quota? | 22:47 |
clarkb | corvus: specifically getProviderQuota() doesn't seem to refer to limits | 22:48 |
clarkb | oh we overlay resource-limits over the quota looks like | 22:51 |
clarkb | ya ok I see it now. Basically we get the resource-limits and the cloud provided quota then take the minimum value and use that as our quota | 22:53 |
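The overlay described here, combined with the "could this provider ever fit the node" check mentioned earlier, can be sketched as follows. This is not Zuul's code: `effective_quota` and `could_ever_fit` are hypothetical helper names, the rax-iad quota and the request sizes are made-up values, and only the openmetal figures come from the log.

```python
import math


def effective_quota(cloud_quota, resource_limits):
    """Per-resource minimum of the cloud quota and configured limits.

    A resource missing from either side is treated as unlimited there.
    """
    keys = set(cloud_quota) | set(resource_limits)
    return {
        key: min(
            cloud_quota.get(key, math.inf),
            resource_limits.get(key, math.inf),
        )
        for key in keys
    }


def could_ever_fit(quota, request):
    """True if the request fits within the quota even on an idle provider."""
    return all(
        amount <= quota.get(key, math.inf) for key, amount in request.items()
    )


# Hypothetical node request.
request = {"instances": 1, "cores": 8, "ram": 8192}

# openmetal: quota values as reported in the log, no extra limits assumed.
openmetal = effective_quota(
    {"instances": 51, "cores": 500, "ram": 2097152}, {}
)

# A provider limited to zero instances (cloud quota assumed unlimited).
disabled = effective_quota(
    {"instances": math.inf, "cores": math.inf, "ram": math.inf},
    {"instances": 0},
)

print(could_ever_fit(openmetal, request), could_ever_fit(disabled, request))
```

Checking feasibility against the full quota before assignment is what keeps a request from being handed to a provider whose limit is 0.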
corvus | yep | 23:20 |