*** benj_71 is now known as benj_7 | 02:41 | |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: promote-image-container: do not delete tags https://review.opendev.org/c/zuul/zuul-jobs/+/878612 | 04:42 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [dnm] rough draft of deleting old quay tags https://review.opendev.org/c/zuul/zuul-jobs/+/878614 | 05:35 |
eandersson | Is github not mirroring properly right now? | 05:56 |
eandersson | Last update is 3 days old from what I can see | 05:57 |
eandersson | oh probably related to the keys that github revoked | 05:57 |
frickler | eandersson: infra-root: ack https://zuul.opendev.org/t/openstack/build/84a411295fae463fa58f676fd09d3da7 failing since friday | 06:56 |
frickler | https://opendev.org/openstack/project-config/src/branch/master/zuul.d/secrets.yaml#L648 | 07:02 |
frickler | airship and starlingx seem to have that in every single repo ... :-( | 07:02 |
opendevreview | Dr. Jens Harbott proposed openstack/project-config master: Update github ssh rsa hostkey for uploads https://review.opendev.org/c/openstack/project-config/+/878616 | 07:06 |
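For anyone reproducing that fix: a minimal sketch (not the actual tooling used here) of fetching GitHub's current RSA host key with ssh-keyscan and printing its SHA256 fingerprint, so it can be checked against GitHub's published fingerprints before pasting the key into zuul.d/secrets.yaml:

```python
import base64
import hashlib
import subprocess

# ssh-keyscan prints "host keytype base64-key" lines on stdout
out = subprocess.run(
    ["ssh-keyscan", "-t", "rsa", "github.com"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.splitlines():
    host, keytype, key_b64 = line.split()[:3]
    # OpenSSH-style fingerprint: unpadded base64 of SHA-256 over the key blob
    digest = hashlib.sha256(base64.b64decode(key_b64)).digest()
    fingerprint = base64.b64encode(digest).decode().rstrip("=")
    print(f"{host} {keytype} SHA256:{fingerprint}")
```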
frickler | StorPool OpenStack CI is also a bit noisy | 07:07 |
*** jpena|off is now known as jpena | 07:42 | |
amoralej | hi, there is some issue with the rackspace mirror we use as source for centos-stream sync http://mirror.rackspace.com/centos-stream/SIGs/ | 07:45 |
amoralej | any idea how we can report the issue or get in touch with the rackspace mirror admins? | 07:46 |
mnasiadka | Seems Gerrit has some issues, getting 503 on requests... | 07:54 |
frickler | mnasiadka: yes, looking already | 07:56 |
frickler | amoralej: this seems to be a neverending story. maybe something for redhat as a company to solve if they want to keep us supporting centos | 07:59 |
frickler | mnasiadka: restarted gerrit, should be better for now | 08:13 |
opendevreview | Merged openstack/project-config master: Update github ssh rsa hostkey for uploads https://review.opendev.org/c/openstack/project-config/+/878616 | 08:40 |
frickler | infra-root: ^^ not sure if someone would want to retrigger uploads or if we think having the next commit fix things would be good enough | 08:43 |
frickler | upload for the above merge itself succeeded https://zuul.opendev.org/t/openstack/build/28afc4daa44f447ebcc861e6b94beea4 | 08:49 |
opendevreview | Thierry Carrez proposed opendev/irc-meetings master: Move Large Scale SIG EU+APAC meeting hour https://review.opendev.org/c/opendev/irc-meetings/+/878634 | 09:08 |
opendevreview | Merged opendev/irc-meetings master: Move Large Scale SIG EU+APAC meeting hour https://review.opendev.org/c/opendev/irc-meetings/+/878634 | 09:31 |
*** amoralej is now known as amoralej|lunch | 12:16 | |
frickler | fyi, github jobs are failing again now, but with a different error. apparently gh is having a major outage | 13:03 |
frickler | not much green on https://www.githubstatus.com/ | 13:04 |
fungi | ouch | 13:09 |
*** amoralej|lunch is now known as amoralej | 13:20 | |
*** ykarel_ is now known as ykarel | 13:21 | |
fungi | github marked it resolved about 30 minutes ago: "an infrastructure change that has been rolled back" | 13:58 |
corvus | fungi: where did you find the logs for the strange node request failure? | 14:08 |
fungi | corvus: nl03:/var/log/nodepool/launcher-debug.log.2023-03-23_06 | 14:08 |
fungi | i was going to start looking at graphs for the zk servers when i get a break between ptg sessions | 14:10 |
corvus | and we're looking at 300-0020813181 | 14:10 |
fungi | correct | 14:10 |
fungi | nl04 took it first, gave up trying to boot something and declined, then nl03 took it next and that happened | 14:10 |
corvus | i think i see the bug | 14:14 |
fungi | oh! that's fast | 14:15 |
fungi | something new with the openstack statemachine driver rework? | 14:15 |
corvus | fungi: if you note, the osuosl thread is locking the request only 3 microseconds after the linaro thread unlocked it. that's certainly close enough for them to race these two lines: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/zk/zookeeper.py#L2174-L2175 | 14:17 |
corvus | linaro unlocks; osuosl locks; linaro sets the lock attribute to None; osuosl fails to unlock because the lock attribute is None | 14:18 |
fungi | oh, right | 14:18 |
corvus | this is a fairly ancient bug -- except that it only really became an issue when we started caching node request objects | 14:18 |
corvus | (when originally written, each thread would have gotten its own request object) | 14:19 |
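A minimal sketch of the interleaving corvus describes (illustrative stand-in code, not nodepool's actual implementation):

```python
import threading

class NodeRequest:
    """Stand-in for a cached node request shared between launcher threads."""
    def __init__(self):
        self.lock = None

def lock_request(req):
    # each locker installs its own lock object on the *shared* request
    req.lock = threading.Lock()
    req.lock.acquire()

def unlock_request(req):
    req.lock.release()
    req.lock = None  # the racy second line: may clobber another thread's lock

# The interleaving from the log, step by step:
req = NodeRequest()
lock_request(req)          # linaro locks
linaro_lock = req.lock
linaro_lock.release()      # linaro unlocks...
lock_request(req)          # ...osuosl locks 3 microseconds later...
req.lock = None            # ...then linaro clears the shared attribute
try:
    unlock_request(req)    # osuosl's unlock fails: req.lock is now None
except AttributeError as e:
    print("osuosl failed to unlock:", e)
```

One plausible shape of a fix is to stop storing the lock on the shared cached object, e.g. keep each thread's lock in its own structure keyed by request, so one thread's cleanup cannot clear another's reference.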
corvus | i can work on a fix today | 14:19 |
fungi | thanks, it's certainly not debilitating; this is the first one i noticed, and it eventually got dequeued naturally a few days later when someone uploaded a new patchset. just wanted to try to figure out what happened | 14:21 |
fungi | likely there have been more and we've never noticed | 14:21 |
corvus | yeah, stuck node requests are infrequent, but we've certainly seen them. | 14:28 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update Gerrit 3.7 to 3.7.2 https://review.opendev.org/c/opendev/system-config/+/878695 | 15:14 |
clarkb | infra-root ^ as noted in the commit message this is largely bookkeeping, but upstream gerrit did make a release to fix that related-changes issue, so we may as well reflect that we're pulling it locally | 15:14 |
clarkb | infra-root also I shut down gitea services on gitea01-04 on friday. I don't think we've had any trouble with gitea since? I'll look at deleting the servers in that case just as soon as the PTG tc meeting finishes later this morning | 15:17 |
clarkb | please say something if you have any concerns with that plan | 15:17 |
clarkb | gitea01 in particular was our backed-up server. But I ported the db out of it to the new servers and gitea09 is backed up now, so we should have continuity of the database. We can also refer back to the gitea01 backups as they don't disappear | 15:21 |
opendevreview | Alfredo Moralejo proposed opendev/system-config master: Use Red Hat managed mirror to sync CentOS Stream content https://review.opendev.org/c/opendev/system-config/+/878701 | 16:26 |
*** jpena is now known as jpena|off | 16:33 | |
*** amoralej is now known as amoralej|off | 16:34 | |
clarkb | infra-root I'll proceed with gitea01-04 deletions now | 18:08 |
clarkb | do we think we need to do anything else with gitea01 prior to its deletion given its special status? | 18:08 |
fungi | nothing i can think of | 18:12 |
clarkb | I guess what I can do is delete the server but not the backing disk volume | 18:15 |
clarkb | and make note of that volume being the old gitea01 volume and we can clean it up even later | 18:15 |
clarkb | since the main concern here is the running instance and its associated resources (the volume is one of those resources, but IPs and instance quota are the bigger issue I think) | 18:15 |
clarkb | ok I think I can delete gitea01.opendev.org then `openstack volume set --name gitea01-old-boot-volume` on its boot device to make that clear in the openstack api state | 18:21 |
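Roughly that flow as a hedged openstacksdk sketch; the clouds.yaml entry name is hypothetical and this is not the exact procedure run here:

```python
import openstack

# "opendevci" is a hypothetical clouds.yaml entry, not the real cloud name
conn = openstack.connect(cloud="opendevci")

server = conn.compute.find_server("gitea01.opendev.org", ignore_missing=False)
volume_ids = [v["id"] for v in server.attached_volumes]

# deleting the server leaves its boot volume behind unless it was created
# with delete_on_termination
conn.compute.delete_server(server)
conn.compute.wait_for_delete(server)

# rename the orphaned volume so a later `openstack volume list` makes its
# origin obvious
for vid in volume_ids:
    conn.block_storage.update_volume(vid, name="gitea01-old-boot-volume")
```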
clarkb | I'll proceed with that in a few minutes if no one objects (gitea02-04 are gone as are their volumes) | 18:22 |
clarkb | ok that appears to have worked. gitea01-04 are deleted. gitea01's boot disk remains and I set its name so that it's clear what that is/was in a volume listing | 18:28 |
clarkb | #status log Gitea01-04 have been deleted. Gitea is entirely running off of the new replacement servers at this point. | 18:28 |
opendevstatus | clarkb: finished logging | 18:29 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Remove gitea01-04 server DNS records https://review.opendev.org/c/opendev/zone-opendev.org/+/878710 | 18:29 |
clarkb | and that change will clean up dns to match | 18:29 |
clarkb | now to push and hold a node for gitea 1.19.0. Historically we've often not upgraded to the .0 releases though | 18:31 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM intentional gitea failure to hold a node https://review.opendev.org/c/opendev/system-config/+/848181 | 18:33 |
clarkb | I've put a hold in place for ^ | 18:35 |
clarkb | fungi: frickler do we need to send a service-announce email indicating projects with github replication may need to update the ssh host key? | 18:56 |
clarkb | ildikov: ^ fyi frickler found evidence that airship and starlingx have this issue | 18:56 |
ildikov | clarkb: thank you for the heads up! | 18:57 |
clarkb | hrm though I'm not finding it. I'm also probably not looking in the right way, or maybe it all got fixed | 18:57 |
clarkb | is it possible airship and starlingx use the openstack account and jobs maybe? so fixing openstack fixed it for them too | 18:58 |
clarkb | that may be it | 18:58 |
ildikov | I'm not 100% sure, but I think that's more than possible | 18:59 |
clarkb | https://104.130.127.14:3081/opendev/system-config is running gitea 1.19.0 and looks good | 19:55 |
clarkb | the main feature added by this release (actions) is something I have disabled in the change | 19:56 |
clarkb | I suspect we're mostly going to need to check for regressions of existing behavior | 19:57 |
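For reference, disabling the new feature likely comes down to the documented [actions] stanza in Gitea's app.ini; a sketch, not necessarily the exact change made:

```ini
; Gitea 1.19 ships Actions disabled by default; setting it explicitly
; avoids surprises if the default changes later
[actions]
ENABLED = false
```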
clarkb | with the horizon thing sorted out my tuesday ptg schedule is looking fairly light. I'm up for doing the meeting if others are | 20:02 |
clarkb | let me know and I'll put an agenda together later today if we decide to have one | 20:02 |
fungi | i can take it or leave it | 20:36 |
ianw | clarkb: with 878695 i think we can also revert download-commands to 3.7.2 tag -- mind if i push an update with that? | 20:42 |
clarkb | ianw: it's already on the 3.7.2 tag? | 20:44 |
clarkb | oh wait | 20:44 |
clarkb | I'm looking at my old checkout, not at that change. when did we flip it to master? | 20:44 |
clarkb | but ya 3.7.2 and master are the same, so that's a good cleanup | 20:45 |
clarkb | ianw: feel free | 20:45 |
ianw | yeah i had to flip it to rebuild for https://review.opendev.org/c/opendev/system-config/+/878042 | 20:46 |
opendevreview | Ian Wienand proposed opendev/system-config master: Update Gerrit 3.7 to 3.7.2 https://review.opendev.org/c/opendev/system-config/+/878695 | 20:48 |
ianw | there are no changes since https://23.253.56.187 deployed, so i think that's still a valid tester | 20:50 |
clarkb | ianw: that's true for gerrit itself as well (no changes since that deployment)? | 22:41 |
clarkb | ianw: any opinion on having a meeting tomorrow? | 22:54 |
clarkb | I suspect fungi and frickler are most involved in the ptg and would be happy to skip (fungi indicated this earlier actually) | 22:54 |
clarkb | I'm inclined to skip if you don't think there is an urgent reason to have it | 22:54 |
opendevreview | Merged opendev/system-config master: Update Gerrit 3.7 to 3.7.2 https://review.opendev.org/c/opendev/system-config/+/878695 | 23:07 |
ianw | ok, happy to skip | 23:17 |
ianw | i think we have enough going on to keep us busy :) | 23:17 |
* Clark[m] jumped onto the laptop to redo python3.11 things to match the desktop and hasn't loaded keys. I'll go ahead and send a note about skipping this week | 23:40 | |
Clark[m] | I should probably make a little ansible playbook to add a new python version and install the tools and update the symlinks | 23:41 |
fungi | i have a bash script, which i guess is approximately the same thing | 23:42 |
Clark[m] | ok I sent an email to the list to make it official | 23:47 |
fungi | thanks! | 23:49 |
*** dhill is now known as Guest9071 | 23:51 | |
Clark[m] | tomorrow I should run the zuul test suite and see if it goes zoom zoom under py311 compared to py310 since my local 311 installation has the extra cpu flags enabled | 23:51 |
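A quick-and-dirty way to get that wall-clock comparison, assuming zuul's tox.ini defines py310/py311 testenvs (a sketch, not an established benchmark):

```python
import subprocess
import time

# run each interpreter's testenv once and report elapsed wall-clock time
for env in ("py310", "py311"):
    start = time.monotonic()
    subprocess.run(["tox", "-e", env], check=False)
    print(f"{env}: {time.monotonic() - start:.1f}s wall clock")
```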