Tuesday, 2023-10-31

Clark[m]tonyb: I would leave mirror update alone for now since that is the backend update server and is a bit decoupled. The rest of it sounds right00:15
tonybClark[m]: Thanks.00:19
tonybI'll get that going first thing tomorrow.00:19
tkajinamlooks like some jobs are stuck in queued status for very long. Can it indicate some infra problems ?04:24
tkajinamhmm I see bunch of periodic jobs are triggered and I guess it consumes a lot of vms, I guess04:25
opendevreviewMerged openstack/project-config master: Update Grafana dashboard for Neutron master  https://review.opendev.org/c/openstack/project-config/+/89883609:32
opendevreviewMerged openstack/project-config master: sdk/osc: Rollback to LaunchPad for issuetracking  https://review.opendev.org/c/openstack/project-config/+/89428509:33
fricklerclarkb: fungi: ^^ I merged this because gtema had already changed the osc docs, but please double check that this works as expected. also I think one of you will want to disable the affected projects in sb?10:23
mnasiadkafrickler: do you remember if there was any conclusion with removing kayobe feature/zookeeper branch?10:39
fricklermnasiadka: I need to check logs since I restarted my client yesterday10:45
fricklermnasiadka: so clarkb wanted to check back with the release team, but there was no response to that. there was also general agreement that it would be ok-ish for infra-root to do this as a one-off, so I'm going to doublecheck which permissions I need and go ahead with it10:49
mnasiadkafrickler: thanks :)10:49
fricklerduckduckgo gives me https://docs.openstack.org/fuel-docs/newton/plugindocs/fuel-plugin-sdk-guide/create-environment/repository-branching/repository-branching-delete.html which isn't helpful when being on the receiving end of that request ;)10:52
frickler#status log deleted the "feature/zookeeper" branch from the openstack/kayobe which was an unwanted remainder from importing the project11:00
opendevstatusfrickler: finished logging11:00
fricklermnasiadka: ^^ I'm not sure how things work for the github mirror, it may take another change to get merged for the replication to trigger11:01
mnasiadkafrickler: thanks, we'll check after any merged change11:02
*** dhill is now known as Guest525912:25
fungifrickler: thanks, i had somehow missed the cli/sdk change, i'll get things cleaned up in the sb database momentarily12:49
fungifrickler: for branch deletion, simplest solution is to temporarily add your normal account to the administrators (or openstack release managers) group, then browse to configuration for the project in question, and on the branches tab you should see a delete button (there is also a rest api call for branch deletion, which is what the scripts used by the openstack release managers rely on)12:50
fungifrickler: gtema: just to be clear, that change makes it so openstack/ansible-collections-openstack bugs are now tracked in https://launchpad.net/openstacksdk12:56
fungithat's the intended url now, right?12:56
fungijust making sure you didn't mean to have a separate lp project for that12:57
gtemahmm, no, that was not the intention12:57
fungithere's no https://launchpad.net/ansible-collections-openstack so i expect that's correct12:57
fungiif there's a different lp project for it, you'll want to update the "groups" entry in gerrit/projects.yaml to whatever lp project name you want to use12:58
fungiand similarly, openstack/cliff is using https://launchpad.net/python-openstackclient12:59
fungigranted, https://launchpad.net/cliff seems to be something else entirely anyway13:00
gtemaright, creating change now13:02
gtemathanks for hint fungi13:02
fricklergtema: looking at the change I'd assumed you'd want to keep a-c-o in the openstacksdk bug group like it was for storyboard, do you want to create a dedicated project for that instead? from my consumer perspective both would be fine. all the other repos should certainly stick to using only sdk/osc as bug projects, right?13:19
fungionce that's decided, i'll go back and adjust the urls in the descriptions i set for the old projects in sb13:20
fungifor whichever ones are getting a different lp project than what they're currently set to13:21
opendevreviewArtem Goncharov proposed openstack/project-config master: Set proper LP project for ansible-collections-openstack  https://review.opendev.org/c/openstack/project-config/+/89967813:21
fungigtema: when you create https://launchpad.net/ansible-collections-openstack please make sure to set it as "part of" openstack, and also if you're creating any new teams for it then whatever team owns the project should itself be owned by ~openstack-admins13:22
gtemauhm, too late. Just created13:23
gtemacan it be moved or changed otherwise?13:23
fungiyou should be able to set the "part of" in the project settings still13:23
fungiit's just a piece of metadata really13:24
fungihttps://launchpad.net/ansible-collections-openstack isn't showing up for me yet, but maybe there's a delay13:24
gtemayes, changed, thanks. Noticing this is not trivial13:24
fricklerseems to be set already13:24
fricklerfungi: different name, which is a bit confusing, not sure what the idea is behind that, gtema? https://launchpad.net/openstack-collections-openstack13:25
fungioh, never mind, i see it's openstack-collections-openstack13:25
fungiassuming https://launchpad.net/openstack-collections-openstack is really intentional, it looks correct13:26
gtemaand there is no way to rename project (only its title) as far as I see13:28
gtemaok, created also https://launchpad.net/ansible-collections-openstack and added it to be part of openstack13:31
fungigtema: i guess you'll want to revise https://review.opendev.org/899678 in that case13:34
opendevreviewArtem Goncharov proposed openstack/project-config master: Set proper LP project for ansible-collections-openstack  https://review.opendev.org/c/openstack/project-config/+/89967813:35
gtemadone. I think I should stop today with LP - it is driving me crazy13:35
opendevreviewArtem Goncharov proposed openstack/project-config master: Set proper LP project for ansible-collections-openstack  https://review.opendev.org/c/openstack/project-config/+/89967813:36
opendevreviewArtem Goncharov proposed openstack/project-config master: Set proper LP project for ansible-collections-openstack  https://review.opendev.org/c/openstack/project-config/+/89967813:47
TheJuliaAny chance I can get a hold for job name "ironic-tempest-standalone-advanced" on openstack/ironic ? Change ID 898010 in particular.14:13
fungiTheJulia: can you provide a few words about what you're going to investigate on that held node, for context? i'll include it in the autohold comment14:16
TheJuliafungi: trying to figure out virtual media networking issues14:19
TheJuliabut can't reproduce them locally... which seems my luck14:19
fungiTheJulia: autohold has been created14:21
fungiis this related to your glean questions from last week?14:22
TheJuliaAlso, looks like cloud-init just tries dhcp and stomps config too14:22
TheJuliaso... all sorts of *fun*14:22
fungiso sounds like you've progressed to thinking it's not mounting the configdrive for some reason?14:22
TheJuliaWell, config drive gets mounted, config gets created, networkmanager sort of just spins and tries to dhcp anyway14:24
TheJuliacloud-init fires up shortly afterwards and goes "everything is dhcp!" and then, things just don't *seem* to work14:24
fungiso being mounted but not "found" (or not parsed successfully) perhaps14:25
TheJuliait sort of even looks like it is parsing correctly too, but can't know for sure from just console logs14:26
opendevreviewTony Breeds proposed opendev/system-config master: Add a jammy test node for regional mirrors  https://review.opendev.org/c/opendev/system-config/+/89971014:48
clarkbtkajinam: I think those builds and change are stuck due to requesting very particular types of nodes (in helm's case the 32GB flavors) and we must not be able to fulfill them for some reason and/or are stuck in the process of determining no cloud can fulfill them so NODE_FAILURE is not reported14:49
clarkbI've got meetings for like the next 5 hours though so not sure I'll be able to debug14:50
tkajinamclarkb, ok. these jobs are eventually started and completed and I agree with what you said.14:59
fricklerhttps://grafana.opendev.org/dashboards is giving me an error popup "permission needed: folders:read". listing dashboards on the home page is working fine, though15:02
clarkbI can confirm the behavior. Not sure why it happens. Maybe look at what the front page does permissions wise and we might need to update for the dashboards/ path?15:03
opendevreviewMerged openstack/project-config master: Set proper LP project for ansible-collections-openstack  https://review.opendev.org/c/openstack/project-config/+/89967815:12
* frickler really likes the Archived-At header that mm3 generates. earlier I always needed to look for the mail in the archive in order to give other ppl a link to it, now I can copy it straight from the mutt :-)15:23
tonybThat is nice.15:24
fungiyes, i make extensive use of it as well15:36
opendevreviewTony Breeds proposed opendev/system-config master: Add a jammy test node for regional mirrors  https://review.opendev.org/c/opendev/system-config/+/89971015:47
opendevreviewTony Breeds proposed opendev/system-config master: Add a jammy test node for regional mirrors  https://review.opendev.org/c/opendev/system-config/+/89971016:07
clarkbfollowing up on the gerrit plugin manager exception. I think it is a different issue to the one they fixed recently after comparing tracebacks. Makes me more confident we aren't accidentally building something wrong somehow and ending up with old versions of the plugins16:20
clarkband to followup on the commentlinks thing my plan is to restart gerrit this afternoon around school run scheduling16:27
clarkbI have too many meetings early in the day to give that proper attention16:27
fungisounds good. i'll go ahead and approve the two mailman cleanup changes16:30
fungiand send a heads up to service-announce about landing the upgrade change on thursday16:30
fricklerinfra-root: seems we have some issue with rax since end of last week https://grafana.opendev.org/d/a8667d6647/nodepool3a-rackspace?orgId=1&from=now-30d&to=now16:35
fricklerseems related to nodepool/sdk https://paste.opendev.org/show/bwFMydSlE3lyrBf2WKTd/16:37
fricklerlauncher container on nl01 is 4 days old, so that matches grafana16:38
clarkbd474eb84c605c429bb9cccb166cabbdd1654d73c is the likely issue given timing I think cc stephenfin 16:38
stephenfinmy guess is you're using cinder's v2 API (on RAX)?16:40
fungiyes, they don't support anything newer16:40
clarkbyup just noticed v2 doesn't have get_limits16:40
clarkbI think sdk is broken here for all v2 cinder api usage16:41
stephenfinwe've found a few of those lately. Should be a copy-paste job16:41
clarkbI'll propose a change to nodepool to exclude this openstacksdk version16:42
clarkblooks like 1.5.0 should be fine but 2.0.0 is not16:43
clarkbremote:   https://review.opendev.org/c/zuul/nodepool/+/899717 Exclude openstacksdk 2.0.016:46
fricklerclarkb: do we want to verify that within the container on nl01 first or did you test that locally?16:47
clarkbfrickler: I verified via git log which is pretty light verification. I guess we can downgrade in nl01 and restart the container16:48
clarkbmaybe. I'm not sure we have permissions to do that within the container?16:48
fricklerentering the container as root should work, let me do a quick test16:49
clarkblooks like the container was just restarted. Fwiw I did a user installation of the library and was going to test if pythonpath would cause that to get picked up16:51
clarkbit seems to be working now. Not sure if you did anything else. But maybe the user install and pythonpath made it all happy16:52
fricklerI just did "docker exec -u root bash" and then "pip install openstacksdk\<2"16:52
fricklerand then docker restart16:52
fricklerbut yes, 23 building already16:52
clarkbfrickler: ah ok, so unsure if user install would have worked16:52
clarkbbut either way 1.5.0 seems to be valid16:52
stephenfinclarkb: https://review.opendev.org/c/openstack/openstacksdk/+/899718 should be able to get that into 2.0.1/2.1.016:53
clarkbstephenfin: thanks!16:53
clarkbstephenfin: my nodepool side change did !=2.0.0 so we should automatically pick that up once available16:53
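To illustrate why a `!=2.0.0` exclusion picks up a fixed release without further changes, here is a minimal sketch in plain Python. The helper names (`parse_version`, `allowed`, `pick_latest`) are hypothetical and only handle simple dotted numeric versions; the real resolution is done by pip, not by this code.

```python
def parse_version(text):
    """Turn a simple dotted version such as '2.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def allowed(candidate, excluded=("2.0.0",)):
    """Return True unless the candidate exactly matches an excluded release."""
    blocked = {parse_version(version) for version in excluded}
    return parse_version(candidate) not in blocked


def pick_latest(available, excluded=("2.0.0",)):
    """Pick the newest available version that is not excluded."""
    usable = [version for version in available if allowed(version, excluded)]
    return max(usable, key=parse_version) if usable else None
```

With only 1.5.0 and 2.0.0 published, the sketch selects 1.5.0; once a hypothetical 2.0.1 appears in the list, it is selected instead, which is the behaviour described above.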
stephenfinclarkb: seeing as you have a credentials to a v2-having cloud, it would be helpful if you could validate that also. We don't have v2 in CI16:55
stephenfins/have a/have/16:56
clarkbstephenfin: I thought gtema did get rax credentials?16:56
clarkbbut we can probably manage to test that16:56
fricklerI was just about to say it would be interesting to see if that extra test would catch the issue16:57
gtemaclarkb - I remember I got the funny pair of secrets and I had troubles establishing proper tests with it (lack of privs or so). I will try to have another look on that later this week16:57
clarkbgtema: cool let us know and if that continues to be problematic we can probably cross check with how things are setup on our side and/or run a one off test16:59
clarkbfrickler: fwiw I did `pip install openstacksdk==1.5.0` and that put it in the users local lib path. pip freeze did report it as the right version afterwards so I think it may have worked without root anyway17:01
fricklerclarkb: ah, o.k., I will try to test that next time17:03
opendevreviewMerged opendev/system-config master: Merge production and test node mailman configs  https://review.opendev.org/c/opendev/system-config/+/89930417:03
fricklerclarkb: regarding testing the sdk fix we should be able to do it with a local venv on bridge, I'll see if I can get that done between dinner and TC meeting17:04
clarkbfrickler: a small script that makes a cloud object then fetches volume limits should be sufficient rather than running all of nodepool17:06
clarkbalso much safer since we don't want nodepool fighting with another instance17:06
clarkb(leak cleanups etc)17:06
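A small reproducer along the lines clarkb describes might look like the sketch below. It assumes the cloud-layer `get_volume_limits()` call in openstacksdk is the one that fails against a v2 cinder endpoint, and the cloud name passed to `main` is a placeholder for whatever entry exists in the local clouds.yaml. The helper names are hypothetical.

```python
def check_volume_limits(conn):
    """Try to fetch volume limits from a connection; return (ok, detail)."""
    try:
        limits = conn.get_volume_limits()
    except Exception as exc:  # broad on purpose: we only report the failure
        return False, f"{type(exc).__name__}: {exc}"
    return True, limits


def main(cloud_name):
    """Connect to the named cloud and report whether volume limits can be read."""
    # Imported lazily so check_volume_limits can be exercised without openstacksdk.
    import openstack

    conn = openstack.connect(cloud=cloud_name)
    ok, detail = check_volume_limits(conn)
    print("OK" if ok else "FAILED", detail)
    return 0 if ok else 1
```

Running `main("some-cloud")` in a throwaway venv with the candidate openstacksdk installed only reads limits, so it should not interfere with the running nodepool launchers.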
frickleryes, I wasn't planning to run nodepool there. my first would be to see if also getting limits via osc is affected17:09
clarkboh ya that would be a nice easy reproducer17:10
fricklerinfra-root: seems there is a new zuul config error for github? projects like ansible: "Will not fetch project branches as read-only is set"17:31
clarkbfrickler: yes I brought it up in the zuul matrix room last week17:33
clarkbfrickler: basically the way zuul startup should work is it checks for configs while things are starting up and notices that the configs aren't ready and marks it in error. Then later when it does load the config it should flip the state out of error with the new config17:34
clarkbfrickler: it appears that something is not causing it to flip over in this case and we need to do further debugging but I've been busy with other stuff recently17:34
fricklerah, o.k., I skipped reading the zuul channel during the ptg, sorry for the duplicate then17:36
clarkbif anyone has time to dig into the zuul logs to see if they can determine why it happened that would be great17:38
*** dhill is now known as Guest528118:15
fungimailman upgrade announcement sent18:16
clarkbfungi: does https://review.opendev.org/c/opendev/system-config/+/899305 need to be rebased due to the other config merge change?18:17
clarkbI see the announcement too fwiw18:17
fungiah, whoops, thanks--i just saw the -2 from that18:18
fungithey didn't originally conflict, but i think i ended up touching other files later that did18:18
clarkbfixing the rax provider allowed zuul and nodepool to clear out those helm builds18:18
clarkbthats one item that no longer needs debugging18:19
opendevreviewJeremy Stanley proposed opendev/system-config master: Clean up old Mailman v2 roles and vars  https://review.opendev.org/c/opendev/system-config/+/89930518:21
clarkbfungi: I went ahead and reapproved ^ after checking the diff18:22
fricklerthe neutron job that was stuck in periodic for days is gone as well18:44
opendevreviewMerged opendev/system-config master: Clean up old Mailman v2 roles and vars  https://review.opendev.org/c/opendev/system-config/+/89930518:59
TheJuliao/ Looks like my autohold is ready for me! Who wants my ssh key?19:09
fungiTheJulia: gimme19:10
fungiTheJulia: ssh root@
TheJuliaworks, awesome!19:11
fungimy pleasure19:11
fungii ran this on the old listserv just now:20:02
fungisudo su - root20:02
fungicd ~/kernel-stuff20:03
fungicp /boot/vmlinuz-5.4.0-165-generic ./20:03
fungibash extract-vmlinux vmlinuz-5.4.0-165-generic > vmlinuz-5.4.0-165-generic.extracted20:03
fungicp vmlinuz-5.4.0-165-generic.extracted /boot/vmlinuz-5.4.0-165-generic20:03
fungii think that should be sufficient to get it booting successfully on the latest kernel in /boot20:03
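One quick sanity check on the extracted file, sketched here as an assumption rather than something done in the log, is to confirm it starts with the ELF magic bytes, since `extract-vmlinux` is expected to emit an uncompressed ELF image. The function name is hypothetical.

```python
ELF_MAGIC = b"\x7fELF"


def looks_like_elf(path):
    """Return True when the file at path begins with the 4-byte ELF magic."""
    with open(path, "rb") as handle:
        return handle.read(4) == ELF_MAGIC
```

A passing check only shows the file has an ELF header; it says nothing about whether the hypervisor will actually boot it, which is what the reboot test is for.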
fungiclarkb: if that looks right to you, i'll perform a reboot test on it next20:04
fungiand assuming it comes back up, i'll shut it down tomorrow and make an image before deleting20:05
fungitomorrow will be the 13.5 year anniversary of that vm's creation20:06
fungier, 11.5 year i mean20:06
fungiit didn't quite live to reach its teens20:06
fungioriginally booted 2012-05-0120:07
clarkbfungi: the size is a bit bigger but in the same range of magnitude as the one we're booted off of20:22
clarkbso ya I think its good20:22
fungicool, rebooting it now20:23
fungicurrent uptime is 494 days20:23
clarkbwill services start that we don't want to start?20:23
clarkbspecifically mailman and exim20:23
fungiexim and mailman services are all disabled as part of the maintenance20:24
fungiK links in /etc/rc2.d20:24
fungi(created by `systemctl disable ...`)20:24
fungiso no, they should not start again on boot20:25
fungibut i'll double-check once i can log into it after it comes up20:25
fungiokay, actually rebooting it now-now20:26
fungii probably should have touched /fastboot since i guarantee it's running a fsck of the rootfs20:28
clarkbeither that or its found some new way to fail to boot :/20:28
*** elodilles is now known as elodilles_pto20:29
fungiif so, i'm tempted to just image it as-is and then we can attach it to a recovery boot or something down the road if we actually need any files from it20:29
clarkbit pings now20:30
fungijust started responding to ping20:30
clarkbI can login via ssh20:31
fungiand i can ssh in20:31
fungino mailman/exim services started on boot either20:31
fungihuh, of course, unattended-upgrades has staged linux-image-5.4.0-166-generic to be installed at the next shutdown20:38
fungii can probably just kill the unattended-upgrades daemon, or upgrade manually and do the extraction dance again followed by another reboot test20:39
clarkbthe computers are conspiring against us20:40
fungiokay, did the same thing again with the newer kernel20:57
fungihopefully it'll come back up faster this time20:57
fungiand i'm already ssh'd back in again20:57
fungiLinux lists 5.4.0-166-generic #183-Ubuntu SMP Mon Oct 2 11:28:33 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux20:58
fungitomorrow i'll power it off for good and archive an image, then delete20:58
fungiand also take it out of the disable list on bridge20:58
clarkbfungi: snapshot is in progress now?21:00
fungino, tomorrow21:04
clarkbhow does this look for 20 minutes from now #status notice Gerrit on review.opendev.org will be restarted to pick up a configuration change required as part of Gerrit 3.8 upgrade preparations.21:41
clarkbrough plan is `docker-compose down ; mv /home/gerrit2/review_site/data/replication/ref-updates/waiting /home/gerrit2/tmp/clarkb/waiting_20231031 ; docker compose up -d` we don't need any pulls or anything like that so should be straightforward21:45
fungiyou're not moving out of /home/gerrit2 so should be ~instantaneous21:51
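The reasoning behind "~instantaneous" is that a move within one filesystem is a rename rather than a copy. A minimal sketch of how that could be checked beforehand, assuming both paths already exist, compares the device IDs reported by `os.stat`. The function name is hypothetical.

```python
import os


def same_filesystem(src, dst_dir):
    """Return True when src and dst_dir are on the same device.

    When this holds, moving src into dst_dir is a rename and does not
    copy file contents, regardless of how much data src holds.
    """
    return os.stat(src).st_dev == os.stat(dst_dir).st_dev
```

Sharing a parent directory does not by itself guarantee a shared filesystem, since a mount point could sit in between, which is why the sketch compares devices rather than path prefixes.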
clarkbok time to start. I'll send that notice then proceed with the plan outlined above22:00
* fungi is on hand22:00
clarkb#status notice Gerrit on review.opendev.org will be restarted to pick up a configuration change required as part of Gerrit 3.8 upgrade preparations.22:00
opendevstatusclarkb: sending notice22:00
-opendevstatus- NOTICE: Gerrit on review.opendev.org will be restarted to pick up a configuration change required as part of Gerrit 3.8 upgrade preparations.22:00
opendevstatusclarkb: finished sending notice22:03
clarkbok proceeding now that the notice is done22:03
fungiwebui is loading for me now22:04
fungifile diffs aren't showing for me... very odd22:06
clarkbas a sanity check it doesn't look like gerrit.config was updated. its timestamp is from when the change merged to update the config22:06
clarkbfungi: I think we've seen that before. Its slow while it rebuilds caches22:06
clarkbthey should eventually load for you22:06
fungiaha, okay22:06
clarkbspotchecking some changes I already had opened their diffs load for me22:07
funginow it's working, yep22:07
clarkbhttps://review.opendev.org/c/opendev/system-config/+/898756/ and https://review.opendev.org/c/openstack/nova/+/899753 for example22:07
clarkbchange id comment links work for me. Not sure I have any examples of other commentlinks handy22:07
fungiyeah, i was looking22:08
fungidid test that already at least22:08
clarkbthe sha commentlink in https://review.opendev.org/c/opendev/system-config/+/899283 is working22:08
clarkband shas were the main one that got updated so I think I'm happy with this22:08
fungihttps://review.opendev.org/c/starlingx/utilities/+/897335 has a closes-bug22:08
fungistill functioning22:08
fungihttps://review.opendev.org/c/starlingx/tools/+/899742 has story and task footers22:10
fungialso working22:11
fungieverything lgtm22:11
clarkbI agree that your examples look good. So ya commentlink config updated to 3.8 compatible specification and we're still functional22:11
clarkbshould've scheduled the Gerrit upgrade for this Friday :)22:12
fungimy last comment there confirms the commit id links you redid still work22:14
clarkbya if we want to change that behavior I'm sure we can adjust the regex but I didn't want to confuse matters when shifting formats of config22:14
fungiagreed, it's fine22:15
TheJuliaso going back to glean, it looks like glean works as we would expect on a first pass, the challenge is if cloud-init also triggers and smashes what is there. At least that is my running theory at the moment.  Glean likely needs to move away from just using the network-script config format as well, but while deprecated, it still works at the moment22:15
clarkbI don't know that glean + cloud-init was ever considered a use case. It was always glean or cloud-init22:15
TheJuliawell, we have no elements to remove cloud-init22:16
TheJuliawe likely ought to22:16
clarkbas for configuring networks I think we've tried to take a path of least resistance with network manager on centos and friends which meant keep emitting the network script stuff and turn on the compat flag for network manager. It was my understanding that enabling the compat layer by default is deprecated but that the compat layer wasn't going away22:17
clarkbits just that we may have to explicitly enable it at some point. If this is wrong then yes we should investigate alternative network manager configuration methods22:17
TheJuliawell, the error that now gets spit out is we need to move to keyfiles22:18
TheJuliaand there is a command to do it for us22:18
TheJulialike nmcli connection migrate22:18
opendevreviewTony Breeds proposed opendev/system-config master: Add a jammy test node for regional mirrors  https://review.opendev.org/c/opendev/system-config/+/89971022:22
opendevreviewTony Breeds proposed opendev/system-config master: [testinfra] Add port into curl's --resolve arg.  https://review.opendev.org/c/opendev/system-config/+/89976222:22
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Testinfra debugging  https://review.opendev.org/c/opendev/system-config/+/89976322:22
tonybNot important but I noticed that the sign-in link/button doesn't work on zuul.openstack.org as the redirect_uri is invalid, it does happily work on zuul.opendev.org.  Are the valid redirect_uris managed somewhere in system-config or is that manually done?22:38
clarkbtonyb: looks like its is the redirect from zuul.openstack.org to keycloak.opendev.org that has failed so it must be the keycloak config which I think is manual still22:55
clarkbcorvus:  would know more22:55
clarkbtonyb: in general I think we try to encourage people to just use zuul.opendev.org. Maybe we should have the openstack.org vhost redirect to opendev.org22:56
tonybYeah, I think we could do that and we could just rewrite the url to include the '/t/openstack'22:57
clarkbI don't know why we did the proper vhost rather than a redirect. Maybe it was to prove you could do it with zuul multitenancy22:58
tonybHard to say.23:01
tonybI'll try to remember how apache and mod_rewrite work23:02
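The URL mapping tonyb describes can be sketched in Python to make the intended rewrite concrete. This is only an illustration of the mapping, not the Apache configuration itself, and it assumes whitelabel paths correspond one-to-one with paths under `/t/openstack` on the multi-tenant host. The function name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_opendev(url, tenant="openstack"):
    """Map a zuul.openstack.org URL onto the tenant path at zuul.opendev.org."""
    parts = urlsplit(url)
    if parts.netloc != "zuul.openstack.org":
        return url  # leave unrelated URLs untouched
    path = parts.path if parts.path.startswith("/") else "/" + parts.path
    new_path = "/t/" + tenant + path
    return urlunsplit(
        (parts.scheme, "zuul.opendev.org", new_path, parts.query, parts.fragment)
    )
```

Query strings and fragments are carried over unchanged, so links to filtered build listings would keep their filters after the redirect.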
corvusi don't really recall a desire to remove the openstack whitelabel23:10
corvusi'm certain part of the lack of desire comes from the fact that it is semi-helpful to zuul developers to have a whitelabel on hand for testing23:12
corvusbut i don't think that needs to be a blocker for removing the openstack whitelabel if we want to.  we could always add a zuul whitelabel and not publicize it.23:13
corvusand yeah, the keycloak urls are currently configured manually; the next step in that project would be to export that data (as json iirc) and then add automatic import/configuration to the deployment.  there's a model for that in the zuul tutorial system which uses keycloak (the bootstrap stage of that imports a serialized keycloak config)23:15
opendevreviewTony Breeds proposed opendev/system-config master: Add a jammy test node for regional mirrors  https://review.opendev.org/c/opendev/system-config/+/89971023:28
