| cloudnull | hey all, just reading scrollback. Are the API issues at RAX all on legacy OSPC? | 00:01 |
|---|---|---|
| cloudnull | anything we can help with ? | 00:03 |
| clarkb | cloudnull: yes! I'm drafting an email that summarizes if you want to wait a few more minutes | 00:05 |
| cloudnull | ++ | 00:05 |
| clarkb | cloudnull: sent you should see it momentarily | 00:08 |
| cloudnull | 👍 ack | 00:09 |
| clarkb | cloudnull: and I'm happy to followup here once you have a chance to digest | 00:12 |
| * cloudnull | replied. | 00:20 |
| cloudnull | tldr IDK what's happening in OSPC, but folks are looking | 00:20 |
| * cloudnull | looking at what we can do with your flex quota now | 00:21 |
| clarkb | cloudnull: thanks! re the nova keyerror thing I'd have to hand you over to melwitt and sean-k-mooney (sean isn't in here but is in #openstack-infra) | 00:22 |
| cloudnull | o/ melwitt | 00:24 |
| melwitt | hey, was just reading the backscroll. do we have a log to look at with the keyerror in it? | 00:25 |
| cloudnull | clarkb SJC and DFW quotas have increased. | 00:27 |
| cloudnull | * SJC - instances 100, memory 512, FIP 60 | 00:27 |
| cloudnull | * DFW - instances 100, memory 768, FIP 75 | 00:27 |
| clarkb | cloudnull: melwitt in sjc3 today 75355359-40d7-4005-aae4-13de2efa0c0b was one of the instances I manually deleted to clear out the listing error | 00:29 |
| clarkb | I don't know if it's only a subset that had the KeyError or not. I can record all of the uuids that were in an error state when I did my deletes | 00:29 |
| clarkb | cloudnull: re quota thanks! | 00:29 |
| clarkb | I think the zuul launcher should automatically use the extra room | 00:30 |
| cloudnull | melwitt here's what it looks like server side https://gisty.link/603eee0e3c31d9a0dfd23e8fce753dc4b7adca85/raw | 00:32 |
| melwitt | thanks | 00:33 |
| melwitt | hrm, looks like this might have caused a regression https://review.opendev.org/c/openstack/nova/+/939658 | 00:34 |
| cloudnull | any idea on how we might mitigate that ? | 00:37 |
| cloudnull | BTW - we're still on 2024.1 | 00:37 |
| cloudnull | updates to 2025.1 planned - but not done yet. | 00:37 |
| melwitt | hm ok, it's only been backported back to 2024.2 so far and 2024.1 hasn't merged yet https://review.opendev.org/c/openstack/nova/+/955305 | 00:39 |
| cloudnull | clarkb once you all are logged into IAD and ready to put work there, let me know and I'll get that quota adjusted too. | 00:39 |
| melwitt | oh lol .. this is actually a fix for the KeyError, not the other way around | 00:39 |
| * melwitt | can't read | 00:40 |
| cloudnull | oh, cool. so fixed in 2025.1 - potential backport for 2024.1 | 00:40 |
| melwitt | yes, 2024.1 backport is approved but hasn't made it through the gate yet | 00:40 |
| melwitt | looks like the root cause is when a server in the returned server list does not have a matching record in the nova_api.request_specs database table | 00:42 |
| cloudnull | ++ added a comment to the review. hopefully we can appease the gate-keeper | 00:43 |
| melwitt | afaict the only way to get around this for now is to delete the server the KeyError is being raised for | 00:43 |
| cloudnull | ++ that's what we've been doing, when it happens. | 00:44 |
| cloudnull | it's been a rare occurrence but good to know the fix is already in | 00:45 |
| clarkb | cloudnull: yup, not sure how quickly I'll get to that as we have to boot a new mirror node first, but will try to get to it sooner than later | 00:47 |
| melwitt | fingers crossed for the gate | 00:47 |
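The failure mode melwitt describes above can be illustrated with a short sketch. This is not nova's actual code; the second UUID and the flavor value below are made-up placeholders, and only the overall shape of the problem is taken from the discussion: the detailed server listing joins each instance against a request-spec record keyed by instance UUID, so a single instance with no matching nova_api.request_specs row breaks the whole listing until that server is deleted (or until the fix, which tolerates the missing record, lands).

```python
servers = [
    # The first UUID is the instance clarkb deleted above; the second one and
    # the flavor value are made-up placeholders.
    {"uuid": "75355359-40d7-4005-aae4-13de2efa0c0b", "name": "orphaned"},
    {"uuid": "11111111-2222-3333-4444-555555555555", "name": "healthy"},
]
# Simulate nova_api.request_specs missing a row for the first instance.
request_specs = {
    "11111111-2222-3333-4444-555555555555": {"flavor": "example-flavor"},
}


def list_servers_strict():
    # Pre-fix behaviour: direct indexing raises KeyError for the orphaned
    # instance, so the whole listing fails, not just that one server.
    return [(s["name"], request_specs[s["uuid"]]) for s in servers]


def list_servers_tolerant():
    # Post-fix style behaviour: tolerate the missing request spec.
    return [(s["name"], request_specs.get(s["uuid"])) for s in servers]


try:
    list_servers_strict()
except KeyError as exc:
    print(f"listing failed on orphaned instance {exc}")

print(list_servers_tolerant())
```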
| opendevreview | James E. Blair proposed opendev/zuul-providers master: Set rax-flex instance limits https://review.opendev.org/c/opendev/zuul-providers/+/957478 | 01:07 |
| corvus | clarkb: cloudnull ^ we don't detect fip limits in zuul (yet), so that will set an instance limit equal to the fip limits you specified, since that's our effective upper bound on instances. | 01:08 |
| clarkb | corvus: thanks I've approved it | 01:10 |
| opendevreview | Merged opendev/zuul-providers master: Set rax-flex instance limits https://review.opendev.org/c/opendev/zuul-providers/+/957478 | 01:10 |
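A back-of-the-envelope sketch of the reasoning corvus gives above for 957478. This is an assumption about the intent rather than the actual zuul-providers configuration: if the launcher cannot yet see floating-IP quotas and every test node consumes a floating IP, the usable instance limit per region is the smaller of the two quotas cloudnull listed earlier.

```python
# Quota numbers come from the discussion above; the dictionary layout and the
# "effective limit" naming are illustrative only.
quotas = {
    "SJC": {"instances": 100, "floating_ips": 60},
    "DFW": {"instances": 100, "floating_ips": 75},
}

for region, quota in quotas.items():
    effective_limit = min(quota["instances"], quota["floating_ips"])
    print(f"{region}: cap instances at {effective_limit}")
```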
| opendevreview | Merged openstack/project-config master: Retire Monasca project https://review.opendev.org/c/openstack/project-config/+/957063 | 01:18 |
| corvus | i deleted 3 nodes leaked by the old nodepool system | 01:25 |
| corvus | remote: https://review.opendev.org/c/zuul/zuul/+/957479 Replace getQuotaUsed with cache [NEW] | 02:01 |
| corvus | remote: https://review.opendev.org/c/zuul/zuul/+/957480 Only log quota messages when acted upon [NEW] | 02:01 |
| corvus | two more improvements that can wait for tomorrow | 02:01 |
| opendevreview | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/957483 | 02:58 |
| *** | mrunge_ is now known as mrunge | 05:41 |
| frickler | seems some changes got completely lost in the check pipeline yesterday? like https://review.opendev.org/c/openstack/kolla-ansible/+/957319 never saw a result, after rechecking I see fresh jobs getting started now | 10:45 |
| fungi | frickler: at 23:26 utc corvus said "i dequeued 2 kolla changes..." | 13:16 |
| fungi | likely those | 13:17 |
| corvus | launchers seem to be behaving better now; the periodic surge happened and we recovered. | 13:49 |
| corvus | the changes we're running in the ad-hoc build merged, so it should be safe to allow the normal reboot cycle | 14:08 |
| corvus | the request backlog is all arm64, as expected | 14:09 |
| frickler | fungi: oh, I missed that comment, thx. now I guess I need to find the second one ;-/ | 14:15 |
| fungi | hopefully there aren't too many kolla changes that have no verified vote | 14:17 |
| fungi | other than ones that are already in check/gate pipelines anyway | 14:17 |
| fungi | unless the other one was a recheck and had a prior check result | 14:17 |
| fungi | then it might be harder to track down | 14:18 |
| frickler | looks like it was https://review.opendev.org/c/openstack/kolla-ansible/+/957320 | 14:48 |
| fungi | i'm popping out to grab an early lunch and run some errands now that the openstack release meeting is done, will be back in a couple of hours | 14:49 |
| fungi | okay, headed out now | 14:56 |
| profcorey | Quick question, what is the process to receive edit permission to create wiki pages? I'd like to create a single wiki page for listing meeting info and etherpads on https://wiki.openstack.org/wiki/Cascade, thanks! | 15:29 |
| profcorey | I didn't see anything in the docs for requesting wiki permissions or perhaps I missed this | 15:32 |
| profcorey | The only mention is to add the wiki url to the reference/projects.yaml for not yet official OpenStack projects. | 15:51 |
| clarkb | profcorey: I think that you propose an edit then it goes into a moderation queue and once moderated and accepted as valid your user is marked as no longer needing moderation for future edits | 16:00 |
| clarkb | fungi: and tonyb have primarily been on top of that and can confirm (or deny) | 16:00 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Update Gitea to 1.24.5 https://review.opendev.org/c/opendev/system-config/+/957554 | 16:18 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Update Gerrit images to 3.10.8 and 3.11.5 https://review.opendev.org/c/opendev/system-config/+/957555 | 16:22 |
| profcorey | clarkb: where do I propose the edit? | 16:35 |
| clarkb | profcorey: on the wiki itself either via creating a new page or editing an existing one | 16:36 |
| profcorey | clarkb: I tried that but, I receive an "OpenID error: An error occurred: an invalid token was found" | 16:38 |
| clarkb | profcorey: ok I think that is an authentication problem with the system itself. I want to say that this is a bug in the openid implementation that restarting services typically clears out. (again fungi and tonyb are more on top of what is going on with the wiki than I am) | 16:39 |
| profcorey | clarkb: ahh ok, thanks | 16:40 |
| fungi | clarkb: profcorey: moderation on the wiki is post-facto, i keep an eye on what edits get made by new users and then add them to a list that filters them out of my patrol list if they've made legitimate edits, so this isn't moderation-related | 17:21 |
| profcorey | fungi: ok good to know | 17:25 |
| fungi | profcorey: maybe try again soon, for some reason the openid tokens issued by launchpad/ubuntuone sometimes get rejected by the openid client implementation in the mediawiki plugin we've got, i was never able to track down the exact reason but we're hoping once we get the wiki upgraded and/or migrate to our own keycloak-based sso that problem will go away | 17:26 |
| fungi | when it does crop up it only seems to happen with new account autocreation, so once you're able to log in the first time it should cease to be an issue | 17:27 |
| profcorey | fungi: ok I'll try later this evening and see if the error goes away, thanks! | 17:27 |
| profcorey | I'll keep you updated | 17:27 |
| fungi | appreciated | 17:27 |
| fungi | happy to help you out adding things in the short term if the problem persists too | 17:28 |
| profcorey | Oh thank you fungi! | 17:28 |
| fungi | once you're able to log in, creating new pages and editing existing ones should "just work" without needing special permissions (except for a very few existing pages we've locked to keep non-admins from changing them) | 17:30 |
| opendevreview | Merged openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/957483 | 17:37 |
| clarkb | looks like both the gerrit and gitea image update changes are happy. Not sure i'm in a great spot for restarting gerrit today, but gitea's changelog looked small if we want to do that | 17:57 |
| clarkb | and then I was going to attempt to log into voip.ms today and see what options there are for disabling/cancelling there | 17:57 |
| fungi | looking | 17:58 |
| fungi | yeah, nothing in the gitea 1.24.5 changelog looks relevant to us whatsoever, but it's good to keep the delta from upstream minimized | 17:59 |
| fungi | i approved it and can monitor deployment | 18:00 |
| clarkb | thanks | 18:04 |
| clarkb | the other thing on my radar is https://review.opendev.org/c/zuul/zuul-jobs/+/957188 I know corvus indicated being happy with that as the workaround for now but doesn't seem to have gotten around to reviewing it yet. I think the main thing is that we land that before we land the tenant ansible version default updates | 18:23 |
| gouthamr | hello o/ i was trying to erase the contents of the openstack/monasca-events-api repo as part of its retirement; i'm hitting an issue with zuul there: https://review.opendev.org/c/openstack/monasca-events-api/+/957065 - an incorrectly configured job by the looks of it.. any idea how i can tackle this? | 18:35 |
| gouthamr | zuul's finding such issues in other monasca repos as well | 18:36 |
| clarkb | it's saying openstack/monasca-events-api/.zuul.yaml@unmaintained/2023.1 uses that job and you're deleting it | 18:43 |
| clarkb | I think you need to retire the other branches before master | 18:43 |
| profcorey | fungi: ok thanks! | 18:55 |
| gouthamr | ah, ty... i don't think i know how to do that | 19:04 |
| * gouthamr | reads the repo retirement instructions | 19:04 |
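One way to spot this kind of retirement pitfall ahead of time is to check every branch's .zuul.yaml for uses of a job before deleting its definition on master. The helper below is only a sketch, not an official tool: the job name is hypothetical, it only inspects the check and gate pipelines of project stanzas, and it assumes it is run from a clone of the repository with an origin remote.

```python
import subprocess
import yaml  # PyYAML

JOB_NAME = "monasca-events-api-tempest"  # hypothetical job name to look for


def zuul_config(branch):
    """Return the parsed .zuul.yaml from a branch, or [] if it has none."""
    try:
        blob = subprocess.check_output(
            ["git", "show", f"{branch}:.zuul.yaml"], text=True)
    except subprocess.CalledProcessError:
        return []
    return yaml.safe_load(blob) or []


def remote_branches():
    out = subprocess.check_output(
        ["git", "for-each-ref", "--format=%(refname:short)",
         "refs/remotes/origin"], text=True)
    return [ref for ref in out.split() if not ref.endswith("/HEAD")]


for branch in remote_branches():
    for item in zuul_config(branch):
        if not isinstance(item, dict):
            continue
        project = item.get("project") or {}
        for pipeline in ("check", "gate"):
            jobs = (project.get(pipeline) or {}).get("jobs", [])
            # Job entries are either plain names or single-key dicts with
            # per-job variants, e.g. {"some-job": {"voting": false}}.
            names = [j if isinstance(j, str) else next(iter(j)) for j in jobs]
            if JOB_NAME in names:
                print(f"{branch} still uses {JOB_NAME}")
```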
| clarkb | #status log Deleted the old Asterisk PBX DID number and requested account cancellation via a ticket to the provider. | 19:09 |
| opendevstatus | clarkb: finished logging | 19:09 |
| opendevreview | Merged opendev/system-config master: Update Gitea to 1.24.5 https://review.opendev.org/c/opendev/system-config/+/957554 | 19:45 |
| fungi | infra-prod-service-gitea is already running | 19:49 |
| fungi | gitea09 is restarting | 19:49 |
| fungi | https://gitea09.opendev.org:3081/ reports v1.24.5 | 19:51 |
| fungi | i was able to clone bindep from it just fine | 19:52 |
| clarkb | sorry got sidetracked by lunch | 20:03 |
| fungi | ansible finished with gitea14 at 19:59 utc, so anything updated in gerrit as of the top of the hour was replicated with the new version | 20:03 |
| clarkb | looks like they all updated (I checked directly). Let me find a replication example | 20:04 |
| clarkb | https://opendev.org/starlingx/apt-ostree/commit/e4c6fcb9b0e1709e2ae949bad56339ac4a3312d2 from https://review.opendev.org/c/starlingx/apt-ostree/+/957289 ps3 seems to indicate replication worked | 20:05 |
| fungi | yep | 20:05 |
| fungi | lgtm | 20:05 |
| clarkb | cool I'm going to go review those zuul changes now | 20:06 |
| clarkb | looks like they were already approved but failed in CI. I'll review and check the prior results before deciding if they should be rechecked | 20:07 |
| clarkb | looks like the failure occurred in the quota cache but what we're modifying is the node cache. I think it's probably ok to recheck | 20:10 |