opendevreview | Merged openstack/project-config master: Disable raxflex cloud https://review.opendev.org/c/openstack/project-config/+/935575 | 00:12 |
---|---|---|
clarkb | ok my edits to the meeting agenda are in. Let me know if I'm forgetting anything. Will send that before I call it a day | 00:17 |
fungi | i guess this is a good opportunity to recreate the networks in flex while there are no jobs running there | 00:18 |
clarkb | oh thats a good point | 00:18 |
clarkb | in theory if we delete them then the cloud launcher cron will recreate them. Before we do that we should check that cloud launcher is happy | 00:19 |
*** elodilles_pto is now known as elodilles | 08:03 | |
opendevreview | Karolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10 https://review.opendev.org/c/openstack/diskimage-builder/+/934045 | 12:40 |
opendevreview | Karolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10 https://review.opendev.org/c/openstack/diskimage-builder/+/934045 | 13:09 |
opendevreview | Karolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10 https://review.opendev.org/c/openstack/diskimage-builder/+/934045 | 13:14 |
fungi | #status log Pruned backups on backup02.ca-ymq-1.vexxhost.opendev.org reducing volume utilization from 91% to 77% | 13:47 |
opendevstatus | fungi: finished logging | 13:48 |
fungi | perhaps unrelated to the disappearing volume for mirror01.sjc3.raxflex.opendev.org, this morning i can't reach the server at all, openstack server list says it's in shutoff state | 13:49 |
fungi | also a test vm i've got in flex under my personal rackspace account rebooted spontaneously about 14 hours ago | 13:50 |
fungi | seems there's some major disruption occurring in there | 13:50 |
opendevreview | Merged opendev/irc-meetings master: Add eventlet-removal biweekly meeting https://review.opendev.org/c/opendev/irc-meetings/+/935573 | 14:24 |
opendevreview | Merged opendev/irc-meetings master: Update chair for manila meetings https://review.opendev.org/c/opendev/irc-meetings/+/935572 | 14:28 |
clarkb | fungi: if it weren't for your additional test vm I would wonder if the kernel simply panicked on the mirror after trying to do some fs operation | 15:29 |
clarkb | I've also been trying to do some local testing of docker pulls to see if I can better understand the rate limit changes. (Un)fortunately `for X in latest 3.20.3 3.19.4 3.18.9 3.17.10 3.16.9 3.15.11 3.14.10 3.13.12 3.12.12 3.11.13 3.10.9 3.9.6 ; do echo $X ; sudo docker image pull alpine:$X ; done` works for me locally without authentication which should pull ~12 unique images in a | 15:36 |
clarkb | fairly short span of time which is >10 | 15:36 |
clarkb | makes me wonder if indeed the library and open source images are not rate limited, but maybe once you trip over the rate limit all requests from your IP even to those library images are blocked? | 15:36 |
clarkb | or they haven't completed the rollout yet and whatever geo I hit doesn't have the new limits? | 15:36 |
clarkb | also I couldn't figure out how to force a docker image pull without also running a container so I just listed out all the distinct tag labels instead... | 15:41 |
clarkb | https://www.docker.com/blog/checking-your-current-docker-pull-rate-limits-and-status/ has info on directly checking your anonymous rate limit values via the token. I'll poke at that as meetings allow (I love their example is literally take your auth token and paste it into a third party web service...) | 15:42 |
corvus | clarkb: maybe try pulling 12 different repositories (instead of 12 tags). here's the list of library images, maybe try pulling :latest from 12 of these? https://github.com/docker-library/official-images/tree/master/library | 15:45 |
clarkb | corvus: ++ | 15:48 |
clarkb | `for X in debian ubuntu fedora python node almalinux rockylinux nginx perl mariadb cirros bash ; do echo $X ; sudo docker image pull $X:latest ; done` is a bit slower | 15:52 |
clarkb | but running now | 15:52 |
clarkb | rockylinux doesn't have a :latest so that failed but it's still 11 images which should be more than the limit. There were no other failures | 15:57 |
clarkb | for completeness a manual pull of rockylinux:9 does not hit the rate limit either | 15:58 |
clarkb | so in total I've fetched over 20 distinct images half of which belong to the same alpine repo | 15:58 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: End gating for Murano/Senlin/Sahara OSA roles https://review.opendev.org/c/openstack/project-config/+/935671 | 16:20 |
clarkb | noonedeadpunk: I left a note on ^ | 16:35 |
clarkb | fungi: did you want to followup with rax on the server reboot cinder volume situation? Seems like you have additional info. Alternatively I can put together what I saw and take it from there. We may also want to try a manual reboot of the server via nova? | 16:38 |
noonedeadpunk | clarkb: ok, fair. was following https://docs.opendev.org/opendev/infra-manual/latest/drivers.html#step-1-end-project-gating too literally I guess | 16:39 |
fungi | clarkb: yeah, i'll test bringing it back online first, since i have a brief break between meetings | 16:39 |
fungi | noonedeadpunk: openstack has its own instructions on retiring things, with additional openstack-specific step | 16:39 |
fungi | s | 16:40 |
noonedeadpunk | fair... | 16:40 |
fungi | noonedeadpunk: see the "note" at https://docs.openstack.org/project-team-guide/repository.html#step-1-end-project-gating | 16:40 |
noonedeadpunk | yeah, just spotted it | 16:40 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: End gating for Murano/Senlin/Sahara OSA roles https://review.opendev.org/c/openstack/project-config/+/935671 | 16:41 |
clarkb | jwt.io posts this warning: "Warning: JWTs are credentials, which can grant access to resources. Be careful where you paste them! We do not record tokens, all validation and debugging is done on the client side." maybe we shouldn't have a website whose landing page is dedicated to this practice and instead ship a common tool to do it? | 16:43 |
clarkb | I guess they would argue that they are shipping a common tool since it is supposedly client side but that is hard to verify and it is a webform which often does submit to a server | 16:43 |
fungi | fwiw, mirror01.sjc3.raxflex.opendev.org came up cleanly and connected to its cinder volume again | 16:44 |
fungi | last line it wrote into syslog was timestamped 04:05:01 utc | 16:47 |
clarkb | I guess the question now is if we should put it into service or redo networking first | 16:48 |
fungi | my personal server, coincidentally, seems to have gone offline shortly after 04:56:10 utc (last entry in the journal) maybe 04:57:30 since my irc client there began to time out at that point | 16:51 |
fungi | about an hour apart | 16:52 |
fungi | maybe they were doing some impactful upgrades or forced offline migrations? | 16:52 |
fungi | but yeah, if we're going to recreate networks we need to delete the server's port and then reconnect it later, right? | 16:53 |
fungi | i don't expect i'll have time to fiddle with that until my meetings have wrapped up at 20z | 16:54 |
clarkb | I'm not actually sure how straightforward it is to replace the networking of a server. One awkward component is we try to manage the networks with cloud launcher | 16:55 |
clarkb | the easiest thing might actually be to delete the mirror and all the networks/routers/subnets and then have cloud launcher redeploy all of that and then build a new mirror. But that is a lot of moving parts | 16:55 |
fungi | yeah, and i'd rather avoid updating dns and inventory if we can just do a few osc calls to reconnect after | 16:57 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: Remove Murano/Senlin/Sahara OSA roles from infra https://review.opendev.org/c/openstack/project-config/+/935683 | 16:58 |
clarkb | 'access': [{'actions': ['pull'], 'name': 'library/alpine', 'parameters': {'pull_limit': '100', 'pull_limit_interval': '21600'}, 'type': 'repository'}] <- is what i get for library/alpine | 17:01 |
clarkb | same limits for kolla/base so maybe there was a typo in the blog post and they meant 100 not 10? | 17:05 |
clarkb | however 21600 is much greater than an hour | 17:05 |
clarkb | I ended up using pyjwt and had to discover the don't validate option since I don't know docker's public key | 17:06 |
fungi | 6 hours | 17:34 |
mordred | it depends on how the port was created ... | 17:40 |
mordred | if it was created naturally as part of the server create call, then you can't delete it and reattach it later. If we created the port manually and fed it to the server create, then it should be no problem to delete it, and the network, then create a new port and attach it | 17:40 |
corvus | clarkb: that's for an "unauthenticated" token, right? | 17:42 |
cardoe | clarkb / fungi: you can poke klamath for flex stuff as well. he's actually part of the flex team. | 17:49 |
clarkb | mordred: this is likely the "naturally" case | 17:49 |
clarkb | corvus: correct unauthenticated | 17:49 |
cardoe | He's suggesting to switch from the "Capacity" to "Standard" for volume types. | 17:53 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: Add repository for OSA's httpd role https://review.opendev.org/c/openstack/project-config/+/935693 | 17:54 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: Add repository for OSA's httpd role https://review.opendev.org/c/openstack/project-config/+/935693 | 17:58 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: Add Zuul jobs for ansible-role-httpd project https://review.opendev.org/c/openstack/project-config/+/935695 | 18:00 |
mordred | <clarkb> "mordred: this is likely the "..." <- Yeah, then you'll have to delete and recreate the server | 18:03 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: Add Zuul jobs for frrouting role https://review.opendev.org/c/openstack/project-config/+/935696 | 18:03 |
fungi | klamath: ooh! maybe you know what happened around 0400-0500 utc that server instances in flex ended up shutoff or spontaneously rebooted? and also what might have caused one of our flex server instances to lose contact with an attached cinder volume earlier yesterday? | 18:13 |
fungi | thanks cardoe! | 18:13 |
clarkb | cardoe: klamath I've confirmed the volume is of type capacity (from volume show: type | Capacity) we can create a new volume with type Standard and replace things on the server. Is there any more detail on what the difference between the two is? | 18:13 |
clarkb | I believe that capacity must be the default too since I don't think I specified that | 18:14 |
fungi | we didn't | 18:18 |
klamath | fungi: We have a RCA tomorrow, TLDR services got restarted and they should not have. | 18:20 |
klamath | clarkb: you can create new volumes of the standard type and those will land on the netapp. | 18:21 |
fungi | klamath: thanks for the confirmation! we were mainly trying to figure out if the environment has stabilized enough for us to re-add it to our ci/cd | 18:22 |
klamath | fungi: it should be, I was the person helping with the swift SLO stuff last week. | 18:23 |
fungi | awesome | 18:24 |
clarkb | probably best to reenable it for now then plan to turn the region off at some point, delete the networking and the mirror and its volume then rebuild it all. That will get bigger mtus and also we can use standard volume types when we rebuild the mirror | 18:24 |
opendevreview | Clark Boylan proposed openstack/project-config master: Revert "Disable raxflex cloud" https://review.opendev.org/c/openstack/project-config/+/935698 | 18:29 |
opendevreview | Merged openstack/project-config master: End gating for Murano/Senlin/Sahara OSA roles https://review.opendev.org/c/openstack/project-config/+/935671 | 18:30 |
corvus | clarkb: fungi can you look at https://review.opendev.org/935455 ? i'd like to start testing raw images | 18:30 |
fungi | sure thing | 18:30 |
fungi | lgtm | 18:31 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: WIP: Add mirror-container-images role and job https://review.opendev.org/c/zuul/zuul-jobs/+/935574 | 18:35 |
clarkb | fungi: it just occurred to me that the mirror and the test nodes are not in the same tenant and use different networks anyway so we could change the networking for test nodes. However if we do that the mtu for the test nodes will be larger than the mtu for the mirror which is maybe a bad thing given the utility of the mirror | 18:45 |
fungi | right, i'd rather do them at the same time, and we need to not be booting nodes there when redoing the mirror's network connectivity | 18:46 |
opendevreview | Merged opendev/irc-meetings master: Oslo update meeting time. https://review.opendev.org/c/opendev/irc-meetings/+/934648 | 18:47 |
opendevreview | Merged opendev/infra-openafs-deb noble: Build 1.8.13 for Noble https://review.opendev.org/c/opendev/infra-openafs-deb/+/935556 | 18:57 |
opendevreview | Merged openstack/project-config master: Revert "Disable raxflex cloud" https://review.opendev.org/c/openstack/project-config/+/935698 | 19:05 |
opendevreview | Merged opendev/system-config master: Add vexxhost connection to zuul-launcher https://review.opendev.org/c/opendev/system-config/+/935455 | 19:38 |
opendevreview | Oria Weng proposed opendev/lodgeit master: Captcha: Fix text layer location calculation https://review.opendev.org/c/opendev/lodgeit/+/935712 | 19:53 |
opendevreview | Merged openstack/diskimage-builder master: Upgrade curl-minimal for RHEL based images built from containers https://review.opendev.org/c/openstack/diskimage-builder/+/924424 | 20:13 |
opendevreview | Oria Weng proposed opendev/lodgeit master: Captcha: Fix text layer location calculation https://review.opendev.org/c/opendev/lodgeit/+/935712 | 20:41 |
opendevreview | Jay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile https://review.opendev.org/c/openstack/diskimage-builder/+/923985 | 21:05 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Captcha: Fix text layer location calculation https://review.opendev.org/c/opendev/lodgeit/+/935712 | 21:06 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Run python3.11 job on Jammy https://review.opendev.org/c/opendev/lodgeit/+/935719 | 21:06 |
clarkb | infra-root I'd like to consider disabling docker hub proxy caching for jobs but looking at https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/use-docker-mirror/tasks/main.yaml#L10 it seems like if we configure any mirror at all we expect to have a docker mirror too. Before I spend a lot of time on this do we A) think this is worth doing and B) if so should I add a flag to | 21:09 |
clarkb | use-docker-mirror to skip setup if set? | 21:09 |
clarkb | corvus: ^ you may have thoughts on both of those items | 21:10 |
fungi | python 3.14.0a2 is out now | 21:20 |
fungi | also i've received 15 uncaught bounce notifications to openstack-discuss-owner already, pretty much all deactivated or deleted red hat mail addresses (and a few for ibm) | 21:21 |
clarkb | fungi: these are the notifications indicating that the threshold has been reached and subscription is paused? | 21:23 |
fungi | no, just copies of the ndr messages | 21:23 |
clarkb | ah | 21:24 |
opendevreview | Jay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile https://review.opendev.org/c/openstack/diskimage-builder/+/923985 | 21:29 |
corvus | clarkb: looks like use-docker-mirror doesn't use the new style of mirror config. ideally we would solve this problem by migrating to that. that may be slightly more work than adding a throwaway flag, but it would be forward progress. may not be too hard for something like this. | 21:30 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror https://review.opendev.org/c/zuul/zuul-jobs/+/935722 | 21:55 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Allow all Ubuntu releases to use our OpenAFS PPA https://review.opendev.org/c/opendev/system-config/+/935723 | 21:55 |
opendevreview | Clark Boylan proposed openstack/project-config master: Disable docker hub mirror use in jobs https://review.opendev.org/c/openstack/project-config/+/935725 | 22:02 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror https://review.opendev.org/c/zuul/zuul-jobs/+/935722 | 22:10 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Fully qualify openvswitch_bridge to make linter happy https://review.opendev.org/c/zuul/zuul-jobs/+/935726 | 22:10 |
fungi | openstack-discuss-owner has received 8 more uncaught bounce notifications | 22:11 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Cap the ansible version used by ansible-lint https://review.opendev.org/c/zuul/zuul-jobs/+/935726 | 22:28 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror https://review.opendev.org/c/zuul/zuul-jobs/+/935722 | 22:28 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror https://review.opendev.org/c/zuul/zuul-jobs/+/935722 | 22:51 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror https://review.opendev.org/c/zuul/zuul-jobs/+/935722 | 22:57 |
clarkb | to catch people up here I think we do something like https://review.opendev.org/c/zuul/zuul-jobs/+/935722 and parent then https://review.opendev.org/c/openstack/project-config/+/935725 then https://review.opendev.org/c/zuul/zuul-jobs/+/849989 and then we keep an eye on stability without the proxy cache | 23:13 |
corvus | lgtm | 23:29 |
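
Editorial sketch for the 17:01-17:34 exchange about reading pull limits out of an unauthenticated Docker Hub token: the log says pyjwt was used with signature validation turned off. The snippet below is not that code; it is a standard-library stand-in that decodes the JWT payload segment without any verification, which is the same idea. The function names `decode_jwt_payload` and `pull_limits` are hypothetical, and the claim layout is assumed to match the `access` entry quoted in the log. An unverified payload is fine for inspection but must not be trusted for authorization decisions.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def pull_limits(claims: dict) -> list[tuple[str, int, int]]:
    """Collect (repository, pull_limit, interval_seconds) from the 'access' claim.

    Assumes the layout quoted in the log; entries without limits are skipped.
    """
    limits = []
    for entry in claims.get("access", []):
        params = entry.get("parameters", {})
        if "pull_limit" in params and "pull_limit_interval" in params:
            limits.append(
                (
                    entry["name"],
                    int(params["pull_limit"]),
                    int(params["pull_limit_interval"]),
                )
            )
    return limits
```

With the values quoted in the log, the interval of 21600 seconds works out to 21600 / 3600 = 6 hours, matching the "6 hours" reply at 17:34.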