*** rlandy|ruck is now known as rlandy|out | 00:37 | |
kevinz | clarkb: fungi: That issue was due to one controller node not working properly, and I've fixed it. Sorry for the inconvenience. | 00:53 |
kevinz | please let me know if there is any further issue. | 00:54 |
fungi | oh, thanks kevinz! | 01:38 |
kevinz | fungi: welcome | 01:50 |
opendevreview | Jack Morgan proposed opendev/system-config master: Minor update to documentation. https://review.opendev.org/c/opendev/system-config/+/824274 | 01:54 |
jentoio | I found a minor issue with the documentation so using it as a practice for gerrit. | 01:55 |
fungi | thanks, i approved it | 02:28 |
opendevreview | Merged opendev/system-config master: Minor update to documentation. https://review.opendev.org/c/opendev/system-config/+/824274 | 02:35 |
fungi | jentoio: ^ it's merged, thanks! | 03:01 |
jentoio | fungi: great, thanks. I feel legit now ;) | 03:02 |
fungi | as you should | 03:02 |
*** ysandeep|out is now known as ysandeep | 03:31 | |
*** ysandeep is now known as ysandeep|afk | 03:38 | |
*** ysandeep|afk is now known as ysandeep | 05:08 | |
*** cloudnull5 is now known as cloudnull | 05:51 | |
*** ykarel_ is now known as ykarel | 06:13 | |
opendevreview | Ananya proposed opendev/elastic-recheck rdo: Updating gitignore to ignore local config files https://review.opendev.org/c/opendev/elastic-recheck/+/824283 | 06:32 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: Add Backport-Candidate label to openstack-ansible ACL https://review.opendev.org/c/openstack/project-config/+/824229 | 07:51 |
*** ysandeep is now known as ysandeep|lunch | 08:16 | |
*** jpena|off is now known as jpena | 08:37 | |
*** ysandeep|lunch is now known as ysandeep | 08:59 | |
*** elodille1 is now known as elodilles | 09:29 | |
*** ysandeep is now known as ysandeep|afk | 10:09 | |
*** rlandy|out is now known as rlandy|ruck | 11:04 | |
*** ysandeep|afk is now known as ysandeep | 11:10 | |
*** dviroel_ is now known as dviroel | 11:21 | |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: Add Backport-Candidate label to openstack-ansible ACL https://review.opendev.org/c/openstack/project-config/+/824229 | 11:27 |
*** ysandeep is now known as ysandeep|brb | 11:49 | |
*** sshnaidm|afk is now known as sshnaidm | 11:58 | |
*** ysandeep|brb is now known as ysandeep | 12:06 | |
*** ysandeep is now known as ysandeep|brb | 12:20 | |
*** ysandeep|brb is now known as ysandeep | 12:59 | |
frickler | the announcement of gerritbot "verification failed" messages has some false positives when rechecking a patch and there are results delivered from a non-voting pipeline like arm64, the latter leads to the message being repeated, see e.g. https://review.opendev.org/c/openstack/kolla-ansible/+/824153 vs. | 13:04 |
frickler | https://meetings.opendev.org/irclogs/%23openstack-kolla/%23openstack-kolla.2022-01-12.log.html#t2022-01-12T12:44:18 | 13:04 |
opendevreview | Shnaidman Sagi (Sergey) proposed zuul/zuul-jobs master: Include podman installation with molecule https://review.opendev.org/c/zuul/zuul-jobs/+/803471 | 13:36 |
*** ykarel_ is now known as ykarel | 14:06 | |
*** artom__ is now known as artom | 14:27 | |
*** dviroel is now known as dviroel|lunch | 15:13 | |
rlandy|ruck | clarkb: fungi: hello ... https://zuul.opendev.org/t/openstack/status has a bunch of patches that have jobs queued for 5 or 6 hours - with no jobs started | 15:59 |
rlandy|ruck | openstack/tripleo-validations seem to be queuing up for some time | 16:00 |
fungi | thanks, looking | 16:11 |
*** dviroel|lunch is now known as dviroel | 16:11 | |
*** ysandeep is now known as ysandeep|out | 16:16 | |
clarkb | I'm going to work on hrw's account cleanup. My plan is to double check both accounts to make sure there isn't some unexpected conflict, then assuming it is as expected retire the new account and delete its conflicting external ids | 16:24 |
fungi | rlandy|ruck: based on our utilization graphs we've been topped out at our maximum test capacity since roughly 08:00 utc. i definitely see changes getting nodes assigned, but keep in mind that zuul's "fair queuing" algorithm attempts to prevent projects from monopolizing available test resources by reducing their node request priority if they've already enqueued another change recently. so the | 16:24 |
fungi | more changes request testing for a project, the longer those changes will take to get nodes assigned | 16:24 |
clarkb | if any other infra-root would like to be walked through this process instead (it's a good exercise in interacting with gerrit as admin and with the new account system) let me know. You have a little bit as I settle in and catch up on email before I start :) | 16:25 |
rlandy|ruck | fungi: k, makes sense - I was just checking that we were not hanging on jobs | 16:25 |
clarkb | fungi: that was my take on it looking at the status page too | 16:25 |
fungi | rlandy|ruck: it appears the node request backlog peaked around 14:00 utc and has been rapidly falling, so unless there's another spike in node requests i expect the current burn rate to get us back to starting builds on-demand within the next few hours | 16:25 |
rlandy|ruck | if they start when capacity allows, we're ok | 16:25 |
rlandy|ruck | sounds good - thank you | 16:26 |
fungi | rlandy|ruck: if you're interested, graphs are here: https://grafana.opendev.org/d/5Imot6EMk/zuul-status?orgId=1&from=now-24h&to=now | 16:26 |
fungi | the most interesting ones in this case are probably "node requests" and "test nodes" | 16:26 |
rlandy|ruck | right - that is useful, it makes it easier to estimate if things are getting better | 16:27 |
fungi | yeah, i'm mostly looking at the node requests backlog there to attempt to predict when this should clear up | 16:27 |
fungi | when we're not maxed out, that graph should normally read ~0 with the occasional blip | 16:28 |
rlandy|ruck | thanks - next time I'll know where to look | 16:29 |
fungi | the other indication is that the used line in the test nodes graph is consistently plateaued, indicating our effective maximum node capacity (the "max" line is misleading, it's a theoretical capacity based on max-servers values but our providers often set lower quotas or we end up with leaked undeletable cruft they have to clean up manually in some cases, but also not all flavors use the same | 16:31 |
fungi | amount of ram/cpu/etc) | 16:31 |
clarkb | hrw's account setup seems pretty straightforward. The email hrw requested to add to the old account is indeed the email associated with the new account's openid so can't be removed by the user. This means the previously described plan should work fine. We retire the new account then delete its external ids containing the conflicting email. This includes the openid. Then hrw can add | 16:42 |
clarkb | that email address manually when logged into the old account | 16:42 |
clarkb | I'm going to proceed with that. Might take me a bit to page in how these cleanup scripts work, but shouldn't be any trouble once I'm back up to speed | 16:42 |
clarkb | the first step in the process has been completed | 16:47 |
clarkb | now I need to run the script to clean out the unwanted external ids for that account | 16:48 |
kopecmartin | clarkb: you may go ahead with https://review.opendev.org/c/opendev/system-config/+/821335 | 16:53 |
clarkb | kopecmartin: thank you for checking. I'll get to that next | 16:55 |
clarkb | the account modifications are done, I'm just recording my work in the usual location now | 16:55 |
clarkb | and removing my extra perms | 16:56 |
clarkb | infra-root ok I should be all done with the hrw account cleanup. hrw will need to manually add the email addr to the older account though as I didn't manually do that | 16:58 |
clarkb | fungi: frickler: any objections to approving the refstack image update now? I think frickler's rereview might be the most important thing at this point if that is still a possibility | 17:00 |
opendevreview | Clark Boylan proposed opendev/bindep master: Replace centos-8 with centos-8-stream https://review.opendev.org/c/opendev/bindep/+/824237 | 17:02 |
clarkb | fungi: frickler ^ also thank you for calling out the weird parenting there. That was unnecessary and not desirable so I have shifted it around with a rebase | 17:03 |
clarkb | we did get new arm64 centos 8 stream images overnight so I've rechecked the ozj centos 8 cleanup change in hopes that the openafs package will build now | 17:08 |
frickler | clarkb: +2 on refstack | 17:09 |
clarkb | thanks! | 17:09 |
fungi | clarkb: refstack image update seems fine to approve now, yeah. i can do it as soon as i find that change again | 17:09 |
clarkb | I'll get it now that frickler +2'd | 17:09 |
fungi | ahh, cool | 17:09 |
clarkb | for the rename plugin I have no idea why I can't figure out the ssh command for it. Well I mean the docs use text substitution which doesn't substitute in the repo but in some rendered form that I don't know how to see so that's part of it. Anyway I'm going to pause on that for a bit while I catch up on other stuff but probably need to hold a node and inspect it more closely | 17:11 |
frickler | the bindep change seems to have kept the +2s, so you can approve it once the checks succeed | 17:11 |
clarkb | excellent | 17:12 |
clarkb | Once I get a few of these changes out of my queue I'll feel better about doing some zuul reviews later today | 17:12 |
clarkb | woot the arm64 openafs build is working now | 17:19 |
opendevreview | Merged opendev/bindep master: Replace centos-8 with centos-8-stream https://review.opendev.org/c/opendev/bindep/+/824237 | 17:34 |
*** marios is now known as marios|out | 17:35 | |
* clarkb looks for breakfast while the refstack change gates | 17:37 | |
*** jpena is now known as jpena|off | 17:40 | |
opendevreview | Merged opendev/system-config master: Update refstack image to bullseye https://review.opendev.org/c/opendev/system-config/+/821335 | 17:50 |
fungi | clarkb: 824236 passes now! | 18:02 |
clarkb | fungi: ya I think it was the old arm64 centos 8 stream image causing the problem. With the new image all the packages aligned with the running kernel and we were good | 18:03 |
clarkb | refstack's container just restarted | 18:05 |
clarkb | it seems to be up and I can load the front page. kopecmartin not sure if there is any other checking you want to do | 18:05 |
clarkb | fungi: do note that that change bumps up the openafs version too, but I think that should be fine (we go from a prerelease to a bugfix release of the same release) | 18:07 |
fungi | yep | 18:08 |
rlandy|ruck | rcastillo|rover: hey | 18:17 |
rlandy|ruck | clarkb: fungi: hi again ... wanted to introduce rcastillo|rover (usually just rcastillo). He's joined the Red Hat TripleO CI team and would like to get involved in some infra projects | 18:18 |
clarkb | hi! | 18:18 |
rcastillo|rover | o/ nice to meet y'all | 18:18 |
rcastillo|rover | would love getting involved :) | 18:18 |
rlandy|ruck | yep - so if you have any work he can start with ... please be in touch | 18:19 |
clarkb | Right now the major things I'm working on are container maintenance as described by https://etherpad.opendev.org/p/opendev-container-maintenance, Zuulv5 reviews (hashtag:sos), CentOS 8 image removal now that it is EOL, and then supporting ianw's Gerrit 3.4 upgrade efforts and fungi's mailman 3 upgrade efforts | 18:20 |
clarkb | If you're interested in the dedicated container users effort described in the container maintenance doc I think there is enough room there for us to split that up. jentoio will be helping with that too likely | 18:21 |
fungi | welcome rcastillo! aside from general user support, my broader focus over the next few months is going to be split between a couple of specs for improving our services: https://docs.opendev.org/opendev/infra-specs/latest/specs/mailman3.html https://docs.opendev.org/opendev/infra-specs/latest/specs/central-auth.html | 18:33 |
fungi | if you're interested in helping with either of those, let me know | 18:33 |
rcastillo|rover | fungi: thanks! I'll take a look at both of those, the auth proposal interests me for sure | 18:47 |
fungi | rcastillo|rover: on that front, we have a poc keycloak deployment already we've been testing, so there's some progress on it | 18:51 |
fungi | #status log Restarted statusbot and gerritbot as they did not seem to gracefully cope with an apparent netsplit we experienced around 18:30 UTC | 18:53 |
opendevstatus | fungi: finished logging | 18:53 |
clarkb | thanks! | 18:54 |
fungi | infra-root: we got a ticket from rackspace to let us know they had to reboot the ethercalc server due to hypervisor host issues. it seems to be up and running fine so i'm going to close out the ticket they opened for it | 18:54 |
clarkb | ++ and thank you for following up on that too :) | 18:54 |
fungi | #status log The ethercalc server was rebooted at 11:17 UTC due to a hypervisor host problem in our donor provider | 18:57 |
opendevstatus | fungi: finished logging | 18:57 |
*** timburke__ is now known as timburke | 19:12 | |
*** sshnaidm is now known as sshnaidm|afk | 19:20 | |
fungi | also ianw's ticket about being unable to connect to the emergency console from fedora 33 with its default security settings was closed out claiming to be solved, though i'm not in a position to be able to test it | 19:29 |
clarkb | I don't think we have fedora 33 instances anymore either | 19:30 |
fungi | right, that's a big part of why i'm not in a position to be able to test it | 19:33 |
clarkb | https://github.com/unbit/uwsgi/pull/2384 is still open against uwsgi which means the hack in https://review.opendev.org/c/opendev/system-config/+/821339 is still our best bet for now | 19:38 |
clarkb | 821339 is our next step in bullseye updates for containers. Should I single core approve that? frickler you might have time to take a look? (it's late for you though and this can probably wait for tomorrow) | 19:38 |
*** dviroel is now known as dviroel|afk | 20:04 | |
fungi | rlandy|ruck we caught up on the node request backlog in the last few minutes, so in theory you should have nodes assigned to all those builds now | 20:31 |
rlandy|ruck | fungi: yep thanks - we seem good to go now | 20:32 |
rlandy|ruck | had a small panic this morning :) - but all the jobs are through now | 20:32 |
fungi | cool | 20:33 |
*** promethe- is now known as prometheanfire | 20:42 | |
*** rlandy|ruck is now known as rlandy|out | 23:16 |
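
The "fair queuing" behavior fungi describes at 16:24 can be illustrated with a toy model. This is only a sketch under stated assumptions, not Zuul's or Nodepool's actual implementation: the `NodeRequest` class, the `assign_order` function, and the project and change names are all hypothetical, and the model assumes a request's priority is simply the number of changes the same project already has waiting.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class NodeRequest:
    # Lower values are served first. Ties fall back to arrival order (seq).
    relative_priority: int
    seq: int
    project: str = field(compare=False)
    change: str = field(compare=False)


def assign_order(enqueued):
    """Return (project, change) pairs in the order nodes would be assigned.

    `enqueued` is a list of (project, change) pairs in arrival order.
    Assumption for this toy model: each request's relative priority equals
    the number of changes its project already has ahead of it.
    """
    ahead = defaultdict(int)
    seq = count()
    requests = []
    for project, change in enqueued:
        requests.append(NodeRequest(ahead[project], next(seq), project, change))
        ahead[project] += 1
    return [(r.project, r.change) for r in sorted(requests)]


if __name__ == "__main__":
    # Hypothetical workload: one busy project and one quiet project.
    backlog = [
        ("busy", "change-1"),
        ("busy", "change-2"),
        ("busy", "change-3"),
        ("quiet", "change-1"),
    ]
    for project, change in assign_order(backlog):
        print(project, change)
```

In this model the quiet project's only change is served right after the busy project's first change, even though it arrived last, which matches the point made in the log: the more changes a project has requesting testing, the longer its later changes wait for nodes.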