Tuesday, 2024-11-19

opendevreviewMerged openstack/project-config master: Disable raxflex cloud  https://review.opendev.org/c/openstack/project-config/+/93557500:12
clarkbok my edits to the meeting agenda are in. Let me know if I'm forgetting anything. Will send that before I call it a day00:17
fungii guess this is a good opportunity to recreate the networks in flex while there are no jobs running there00:18
clarkboh that's a good point00:18
clarkbin theory if we delete them then the cloud launcher cron will recreate them. Before we do that we should check that cloud launcher is happy00:19
*** elodilles_pto is now known as elodilles08:03
opendevreviewKarolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10  https://review.opendev.org/c/openstack/diskimage-builder/+/93404512:40
opendevreviewKarolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10  https://review.opendev.org/c/openstack/diskimage-builder/+/93404513:09
opendevreviewKarolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10  https://review.opendev.org/c/openstack/diskimage-builder/+/93404513:14
fungi#status log Pruned backups on backup02.ca-ymq-1.vexxhost.opendev.org reducing volume utilization from 91% to 77%13:47
opendevstatusfungi: finished logging13:48
fungiperhaps unrelated to the disappearing volume for mirror01.sjc3.raxflex.opendev.org, this morning i can't reach the server at all, openstack server list says it's in shutoff state13:49
fungialso a test vm i've got in flex under my personal rackspace account rebooted spontaneously about 14 hours ago13:50
fungiseems there's some major disruption occurring in there13:50
opendevreviewMerged opendev/irc-meetings master: Add eventlet-removal biweekly meeting  https://review.opendev.org/c/opendev/irc-meetings/+/93557314:24
opendevreviewMerged opendev/irc-meetings master: Update chair for manila meetings  https://review.opendev.org/c/opendev/irc-meetings/+/93557214:28
clarkbfungi: if it weren't for your additional test vm I would wonder if the kernel simply panicked on the mirror after trying to do some fs operation15:29
clarkbI've also been trying to do some local testing of docker pulls to see if I can better understand the rate limit changes. (Un)fortunately `for X in latest 3.20.3 3.19.4 3.18.9 3.17.10 3.16.9 3.15.11 3.14.10 3.13.12 3.12.12 3.11.13 3.10.9 3.9.6 ; do echo $X ; sudo docker image pull alpine:$X ; done` works for me locally without authentication which should pull ~12 unique images in a15:36
clarkbfairly short span of time which is >1015:36
clarkbmakes me wonder if indeed the library and open source images are not rate limited, but maybe once you trip over the rate limit all requests from your IP even to those library images are blocked?15:36
clarkbor they haven't completed the rollout yet and whatever geo I hit doesn't have the new limits?15:36
clarkbalso I couldn't figure out how to force a docker image pull without also running a container so I just listed out all the distinct tag labels instead...15:41
clarkbhttps://www.docker.com/blog/checking-your-current-docker-pull-rate-limits-and-status/ has info on directly checking your anonymous rate limit values via the token. I'll poke at that as meetings allow (I love their example is literally take your auth token and paste it into a third party web service...)15:42
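[editor's note] Docker's documentation describes the rate limit as being reported in registry response headers named ratelimit-limit and ratelimit-remaining, with values shaped like "100;w=21600" (a pull count followed by a window in seconds). The sketch below only parses such a value offline; the function name is hypothetical and the sample string is illustrative, not something captured in this channel.

```python
def parse_ratelimit(value: str) -> tuple[int, int]:
    """Parse a header value such as '100;w=21600' into (pulls, window_seconds).

    Assumes the 'count;w=seconds' layout; a missing window is returned as 0.
    """
    count, _, rest = value.partition(";")
    window = 0
    for field in rest.split(";"):
        key, _, val = field.strip().partition("=")
        if key == "w":
            window = int(val)
    return int(count), window


if __name__ == "__main__":
    pulls, window = parse_ratelimit("100;w=21600")
    print(f"{pulls} pulls per {window / 3600:g} hours")
```

Parsing the header locally avoids pasting a token into any third-party web form.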
corvusclarkb: maybe try pulling 12 different repositories (instead of 12 tags).  here's the list of library images, maybe try pulling :latest from 12 of these? https://github.com/docker-library/official-images/tree/master/library15:45
clarkbcorvus: ++15:48
clarkb`for X in debian ubuntu fedora python node almalinux rockylinux nginx perl mariadb cirros bash ; do echo $X ; sudo docker image pull $X:latest ; done` is a bit slower15:52
clarkbbut running now15:52
clarkbrockylinux doesn't have a :latest so that failed but its still 11 images which should be more than the limit. There were no other failures15:57
clarkbfor completeness a manual pull of rockylinux:9 does not hit the rate limit either15:58
clarkbso in total I've fetched over 20 distinct images half of which belong to the same alpine repo15:58
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: End gating for Murano/Senlin/Sahara OSA roles  https://review.opendev.org/c/openstack/project-config/+/93567116:20
clarkbnoonedeadpunk: I left a note on ^16:35
clarkbfungi: did you want to followup with rax on the server reboot cinder volume situation? Seems like you have additional info. Alternatively I can put together what I saw and take it from there. We may also want to try a manual reboot of the server via nova?16:38
noonedeadpunkclarkb: ok, fair. was following https://docs.opendev.org/opendev/infra-manual/latest/drivers.html#step-1-end-project-gating too literally I guess16:39
fungiclarkb: yeah, i'll test bringing it back online first, since i have a brief break between meetings16:39
funginoonedeadpunk: openstack has its own instructions on retiring things, with additional openstack-specific step16:39
fungis16:40
noonedeadpunkfair...16:40
funginoonedeadpunk: see the "note" at https://docs.openstack.org/project-team-guide/repository.html#step-1-end-project-gating16:40
noonedeadpunkyeah, just spotted it 16:40
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: End gating for Murano/Senlin/Sahara OSA roles  https://review.opendev.org/c/openstack/project-config/+/93567116:41
clarkbjwt.io posts this warning: "Warning: JWTs are credentials, which can grant access to resources. Be careful where you paste them! We do not record tokens, all validation and debugging is done on the client side." maybe we shouldn't have a website whose landing page is dedicated to this practice and instead ship a common tool to do it?16:43
clarkbI guess they would argue that they are shipping a common tool since it is supposedly client side but that is hard to verify and it is a webform which often does submit to a server16:43
fungifwiw, mirror01.sjc3.raxflex.opendev.org came up cleanly and connected to its cinder volume again16:44
fungilast line it wrote into syslog was timestamped 04:05:01 utc16:47
clarkbI guess the question now is if we should put it into service or redo networking first16:48
fungimy personal server, coincidentally, seems to have gone offline shortly after 04:56:10 utc (last entry in the journal) maybe 04:57:30 since my irc client there began to time out at that point16:51
fungiabout an hour apart16:52
fungimaybe they were doing some impactful upgrades or forced offline migrations?16:52
fungibut yeah, if we're going to recreate networks we need to delete the server's port and then reconnect it later, right?16:53
fungii don't expect i'll have time to fiddle with that until my meetings have wrapped up at 20z16:54
clarkbI'm not actually sure how straightforward it is to replace the networking of a server. One awkward component is we try to manage the networks with cloud launcher16:55
clarkbthe easiest thing might actually be to delete the mirror and all the networks/routers/subnets and then have cloud launcher redeploy all of that and then build a new mirror. But that is a lot of moving parts16:55
fungiyeah, and i'd rather avoid updating dns and inventory if we can just do a few osc calls to reconnect after16:57
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Remove Murano/Senlin/Sahara OSA roles from infra  https://review.opendev.org/c/openstack/project-config/+/93568316:58
clarkb'access': [{'actions': ['pull'], 'name': 'library/alpine', 'parameters': {'pull_limit': '100', 'pull_limit_interval': '21600'}, 'type': 'repository'}] <- is what i get for library/alpine17:01
clarkbsame limits for kolla/base so maybe there was a typo in the blog post and they meant 100 not 10?17:05
clarkbhowever 21600 is much greater than an hour17:05
clarkbI ended up using pyjwt and had to discover the don't validate option since I don't know docker's public key17:06
fungi6 hours17:34
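[editor's note] A minimal sketch of reading the claims clarkb quotes above without a third-party website. It uses only the standard library and, like the "don't validate" route, does NOT verify the signature, so the result is for inspection only. With PyJWT 2.x the equivalent is presumably jwt.decode(token, options={"verify_signature": False}); that mapping is an assumption, and the helper names below are hypothetical.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload_b64 = parts[1]
    # JWTs use unpadded URL-safe base64; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def pull_limits(claims: dict) -> list[tuple[str, int, float]]:
    """List (repository, pull_limit, interval_hours) for each access entry."""
    limits = []
    for entry in claims.get("access", []):
        params = entry.get("parameters", {})
        if "pull_limit" in params and "pull_limit_interval" in params:
            hours = int(params["pull_limit_interval"]) / 3600
            limits.append((entry["name"], int(params["pull_limit"]), hours))
    return limits
```

Applied to the access entry pasted above, this reports library/alpine with a limit of 100 pulls per 6.0 hours (21600 / 3600).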
mordredit depends on how the port was created ...17:40
mordredif it was created naturally as part of the server create call, then you can't delete it and reattach it later. If we created the port manually and fed it to the server create, then it should be no problem to delete it, and the network, then create a new port and attach it17:40
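[editor's note] The "created manually and fed to the server create" case mordred describes might look roughly like this with openstacksdk. This is a sketch under assumptions: conn is taken to be an openstack.connection.Connection, and the function name, argument names, and port naming are hypothetical rather than how the mirror was actually built.

```python
def boot_with_precreated_port(conn, name, image_id, flavor_id, network_id):
    """Create a port first, then boot a server attached to that port.

    Because the port is created separately, it is not tied to the server's
    lifecycle the way a port created implicitly by the boot request is.
    """
    # Explicit port on the target network (name is purely illustrative).
    port = conn.network.create_port(network_id=network_id, name=f"{name}-port")
    # Hand the existing port to the server create call instead of a network.
    server = conn.compute.create_server(
        name=name,
        image_id=image_id,
        flavor_id=flavor_id,
        networks=[{"port": port.id}],
    )
    return server, port
```

A server booted with only a network reference gets its port created implicitly, which is the "naturally" case discussed here.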
corvusclarkb: that's for an "unauthenticated" token, right?17:42
cardoeclarkb / fungi: you can poke klamath for flex stuff as well. he's actually part of the flex team.17:49
clarkbmordred: this is likely the "naturally" case17:49
clarkbcorvus: correct unauthenticated17:49
cardoeHe's suggesting to switch from the "Capacity" to "Standard" for volume types.17:53
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Add repository for OSA's httpd role  https://review.opendev.org/c/openstack/project-config/+/93569317:54
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Add repository for OSA's httpd role  https://review.opendev.org/c/openstack/project-config/+/93569317:58
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Add Zuul jobs for ansible-role-httpd project  https://review.opendev.org/c/openstack/project-config/+/93569518:00
mordred<clarkb> "mordred: this is likely the "..." <- Yeah, then you'll have to delete and recreate the server 18:03
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Add Zuul jobs for frrouting role  https://review.opendev.org/c/openstack/project-config/+/93569618:03
fungiklamath: ooh! maybe you know what happened around 0400-0500 utc that server instances in flex ended up shutoff or spontaneously rebooted? and also what might have caused one of our flex server instances to lose contact with an attached cinder volume earlier yesterday?18:13
fungithanks cardoe!18:13
clarkbcardoe: klamath I've confirmed the volume is of type capacity (from volume show: type | Capacity) we can create a new volume with type Standard and replace things on the server. Is there any more detail on what the difference between the two is?18:13
clarkbI believe that capacity must be the default too since I don't think I specified that18:14
fungiwe didn't18:18
klamathfungi: We have a RCA tomorrow, TLDR services got restarted and they should not have.  18:20
klamathclarkb: you can create new volumes of the standard type and those will land on the netapp. 18:21
fungiklamath: thanks for the confirmation! we were mainly trying to figure out if the environment has stabilized enough for us to re-add it to our ci/cd18:22
klamathfungi: it should be, I was the person helping with the swift SLO stuff last week.18:23
fungiawesome18:24
clarkbprobably best to reenable it for now then plan to turn the region off at some point, delete the networking and the mirror and its volume then rebuild it all. That will get bigger mtus and also we can use standard volume types when we rebuild the mirror18:24
opendevreviewClark Boylan proposed openstack/project-config master: Revert "Disable raxflex cloud"  https://review.opendev.org/c/openstack/project-config/+/93569818:29
opendevreviewMerged openstack/project-config master: End gating for Murano/Senlin/Sahara OSA roles  https://review.opendev.org/c/openstack/project-config/+/93567118:30
corvusclarkb: fungi can you look at https://review.opendev.org/935455 ? i'd like to start testing raw images18:30
fungisure thing18:30
fungilgtm18:31
opendevreviewJames E. Blair proposed zuul/zuul-jobs master: WIP: Add mirror-container-images role and job  https://review.opendev.org/c/zuul/zuul-jobs/+/93557418:35
clarkbfungi: it just occurred to me that the mirror and the test nodes are not in the same tenant and use different networks anyway so we could change the networking for test nodes. However if we do that the mtu for the test nodes will be larger than the mtu for the mirror which is maybe a bad thing given the utility of the mirror18:45
fungiright, i'd rather do them at the same time, and we need to not be booting nodes there when redoing the mirror's network connectivity18:46
opendevreviewMerged opendev/irc-meetings master: Oslo update meeting time.  https://review.opendev.org/c/opendev/irc-meetings/+/93464818:47
opendevreviewMerged opendev/infra-openafs-deb noble: Build 1.8.13 for Noble  https://review.opendev.org/c/opendev/infra-openafs-deb/+/93555618:57
opendevreviewMerged openstack/project-config master: Revert "Disable raxflex cloud"  https://review.opendev.org/c/openstack/project-config/+/93569819:05
opendevreviewMerged opendev/system-config master: Add vexxhost connection to zuul-launcher  https://review.opendev.org/c/opendev/system-config/+/93545519:38
opendevreviewOria Weng proposed opendev/lodgeit master: Captcha: Fix text layer location calculation  https://review.opendev.org/c/opendev/lodgeit/+/93571219:53
opendevreviewMerged openstack/diskimage-builder master: Upgrade curl-minimal for RHEL based images built from containers  https://review.opendev.org/c/openstack/diskimage-builder/+/92442420:13
opendevreviewOria Weng proposed opendev/lodgeit master: Captcha: Fix text layer location calculation  https://review.opendev.org/c/opendev/lodgeit/+/93571220:41
opendevreviewJay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile  https://review.opendev.org/c/openstack/diskimage-builder/+/92398521:05
opendevreviewClark Boylan proposed opendev/lodgeit master: Captcha: Fix text layer location calculation  https://review.opendev.org/c/opendev/lodgeit/+/93571221:06
opendevreviewClark Boylan proposed opendev/lodgeit master: Run python3.11 job on Jammy  https://review.opendev.org/c/opendev/lodgeit/+/93571921:06
clarkbinfra-root I'd like to consider disabling docker hub proxy caching for jobs but looking at https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/use-docker-mirror/tasks/main.yaml#L10 it seems like if we configure any mirror at all we expect to have a docker mirror too. Before I spend a lot of time on this do we A) think this is worth doing and B) if so should I add a flag to21:09
clarkbuse-docker-mirror to skip setup if set?21:09
clarkbcorvus: ^ you may have thoughts on both of those items21:10
fungipython 3.14.0a2 is out now21:20
fungialso i've received 15 uncaught bounce notifications to openstack-discuss-owner already, pretty much all deactivated or deleted red hat mail addresses (and a few for ibm)21:21
clarkbfungi: these are the notifications indicating that the threshold has been reached and subscription is paused?21:23
fungino, just copies of the ndr messages21:23
clarkbah21:24
opendevreviewJay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile  https://review.opendev.org/c/openstack/diskimage-builder/+/92398521:29
corvusclarkb: looks like use-docker-mirror doesn't use the new style of mirror config.  ideally we would solve this problem by migrating to that.  that may be slightly more work than adding a throwaway flag, but it would be forward progress.  may not be too hard for something like this.21:30
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror  https://review.opendev.org/c/zuul/zuul-jobs/+/93572221:55
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow all Ubuntu releases to use our OpenAFS PPA  https://review.opendev.org/c/opendev/system-config/+/93572321:55
opendevreviewClark Boylan proposed openstack/project-config master: Disable docker hub mirror use in jobs  https://review.opendev.org/c/openstack/project-config/+/93572522:02
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror  https://review.opendev.org/c/zuul/zuul-jobs/+/93572222:10
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Fully qualify openvswitch_bridge to make linter happy  https://review.opendev.org/c/zuul/zuul-jobs/+/93572622:10
fungiopenstack-discuss-owner has received 8 more uncaught bounce notifications22:11
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Cap the ansible version used by ansible-lint  https://review.opendev.org/c/zuul/zuul-jobs/+/93572622:28
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror  https://review.opendev.org/c/zuul/zuul-jobs/+/93572222:28
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror  https://review.opendev.org/c/zuul/zuul-jobs/+/93572222:51
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror  https://review.opendev.org/c/zuul/zuul-jobs/+/93572222:57
clarkbto catch people up here I think we do something like https://review.opendev.org/c/zuul/zuul-jobs/+/935722 and parent then https://review.opendev.org/c/openstack/project-config/+/935725 then https://review.opendev.org/c/zuul/zuul-jobs/+/849989 and then we keep an eye on stability without the proxy cache23:13
corvuslgtm23:29
