SvenKieske | mhm, so I get this weird build error, do we have infra restrictions in place on which external servers I can reach? | 08:53 |
SvenKieske | Failed to fetch https://td-agent-package-browser.herokuapp.com/lts/5/debian/bookworm/dists/bookworm/InRelease Clearsigned file isn't valid, got 'NOSPLIT' (does the network require authentication?) | 08:53 |
SvenKieske | build log: https://zuul.opendev.org/t/openstack/build/3fefdffe6d7641a581f862350945e796/log/kolla/build/000_FAILED_fluentd.log | 08:53 |
SvenKieske | looks like apt can't reach this network from the opendev infra to me. works reasonably well locally. | 08:54 |
SvenKieske | another question: zigo: would it be possible to enable https for http://osbpo.debian.net ? I noticed it seems to only support http and we are installing gpg keys from it, without further verification which is kind of a problem | 08:57 |
SvenKieske | zigo, see this review for further context :) https://review.opendev.org/c/openstack/ansible-collection-kolla/+/852240/comment/db0db1e9_9608917b/ | 08:59 |
SvenKieske | I know it might not be possible to trivially enable https support, I just hope it is, thanks in advance :) | 09:00 |
zigo | SvenKieske: Well, the question is rather: why do you wget the GPG key, instead of using extrepo which does the job with authentication ? | 09:00 |
SvenKieske | zigo: good question! "I" don't do it. I was not familiar with this server to begin with. Could you kindly point me maybe to some code example where this is already done? I at least don't recall anything called "extrepo" currently, but that might be just my faulty brain. | 09:02 |
zigo | apt-get install extrepo | 09:02 |
zigo | extrepo enable openstack_bobcat | 09:02 |
zigo | apt-get update | 09:02 |
zigo | That's all there is to it ... :) | 09:02 |
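For reference, a minimal sketch of that workflow end to end, including a step to inspect what extrepo actually configured; the filename under /etc/apt/sources.list.d/ is an assumption about where extrepo writes its repository definition, not something stated in the conversation:

```sh
# Install extrepo and enable the unofficial Debian OpenStack backports repo
# (repo name "openstack_bobcat" as given above).
apt-get update
apt-get install -y extrepo
extrepo enable openstack_bobcat

# Inspect what extrepo wrote; the exact filename is an assumption, the
# definition should land somewhere under /etc/apt/sources.list.d/.
cat /etc/apt/sources.list.d/extrepo_openstack_bobcat.sources

# Refresh package lists against the newly enabled, signed repository.
apt-get update
```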
SvenKieske | ah I see, it's on the debian wiki | 09:02 |
SvenKieske | interesting | 09:02 |
SvenKieske | thank you, will look into it | 09:03 |
zigo | FYI, I was the one who contributed the extrepo-offline-data package and the --offlinedata option ... :) | 09:03 |
zigo | Though probably, at this point, we'd need an upload to bookworm-backports ... | 09:04 |
SvenKieske | I already noticed you are behind a lot of stuff when it comes to debian and openstack, so thanks for that :) | 09:04 |
zigo | Yeah ... :) | 09:04 |
zigo | Well, packaging OpenStack in Debian since 2011 ... :) | 09:04 |
SvenKieske | I'll just need to dig into what extrepo really does and if it's suitable for our usecase on the kolla-ansible side | 09:04 |
zigo | About extrepo: you can consider it to be the Debian counterpart of Ubuntu PPAs ... | 09:04 |
SvenKieske | a link to the source of extrepo on the debian wiki would be convenient; I don't currently have an account over there to edit it myself | 09:05 |
SvenKieske | ah okay | 09:05 |
zigo | That's where you may contribute to extrepo: | 09:05 |
zigo | https://salsa.debian.org/extrepo-team/extrepo-data/-/tree/master/repos/debian | 09:05 |
zigo | A lot of stuff is already in there. | 09:05 |
zigo | I always push the unofficial debian.net stable backports of OpenStack in there, from day one of the release. | 09:06 |
zigo | IMO, that's *way* safer than just using https. | 09:06 |
SvenKieske | https would still be a nice addition, but if the gpg keys are retrieved in a secure way that's also sufficient. but it seems there are people who don't know how to install a debian third party repo securely (me included) because the tools were not known, so thanks! | 09:13 |
zigo | extrepo has only existed for 3 or 4 years ... | 09:21 |
zigo | :) | 09:21 |
SvenKieske | I'm still left with my problem to reach the td-agent aka fluent-package repo over at herokuapp.com, it seems there might be some redirect shenanigans going on.. | 11:53 |
*** elodilles is now known as elodilles_afk | 12:17 |
frickler | to me that looks like a broken repo setup, https://td-agent-package-browser.herokuapp.com/lts/5/debian/bookworm/dists/bookworm/InRelease has "Unable to load bucket: packages.treasuredata.com" as content. not much we can do about that | 12:17 |
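For anyone wanting to reproduce frickler's check by hand, a quick sketch using the same URL from the failed build; a valid InRelease file starts with a PGP clearsigned header, while a broken repo often returns an error page instead:

```sh
# Fetch the InRelease file apt complained about and look at what actually
# comes back in the first few lines.
curl -sL https://td-agent-package-browser.herokuapp.com/lts/5/debian/bookworm/dists/bookworm/InRelease | head -n 5
```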
fungi | SvenKieske: catching up, seems like a lot of this ground has been trodden in our discussion in #openstack-kolla already (apologies, i read and responded there first because there was a nick highlight) | 12:29 |
SvenKieske | fungi: no problem | 12:31 |
SvenKieske | I would still be glad if someone could check out https://zuul.opendev.org/t/openstack/build/3fefdffe6d7641a581f862350945e796/log/kolla/build/000_FAILED_fluentd.log because I have never seen this apt-get error and all the internet tells me is that it's a proxy issue, but I somehow doubt that. | 12:32 |
fungi | so for starters, no "we" don't restrict what networks our test nodes are allowed to reach. we purposefully install empty/allow-all security groups in the providers who donate resources for job nodes and then restrict traffic locally on each node with an iptables/nftables ruleset which allows all egress connections (statefully), but jobs can and do adjust the firewall rules on the nodes so may | 12:32 |
fungi | end up restricting their own ability to reach things. also the internet is an unstable place, and sometimes "you can't get there from here" as folks in maine like to say | 12:32 |
SvenKieske | yeah sure, I wasn't certain about the sec groups, so thanks for confirming this is not a problem. it's still weird. I still need to check the dns, might be related (because the original mirror url does not point to heroku) | 12:33 |
fungi | also yes we could probably mirror the osbpo repository similarly to how we mirror uca (ubuntu cloud archive), the amount of data is in the same order of magnitude and we've got some free space on our fileservers at the moment: https://grafana.opendev.org/d/9871b26303/afs (i'll put this topic on the agenda for tomorrow's sysadmins meeting) | 12:33 |
fungi | also, both wget and extrepo are more fragile than just serializing the key and embedding it in the job. why have every single job waste time and resources re-verifying a key if you can verify it once yourself and stick it in an ansible role? then it doesn't need to be retrieved at all, eliminating one more network interaction that could randomly break and cause a job to fail or get rerun | 12:36 |
fungi | (wasting even more resources). for opendev's own jobs we embed public keys as a matter of course | 12:36 |
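A hedged sketch of the "verify once, embed forever" approach fungi describes; the URL and filenames are purely illustrative, not the actual values used by the kolla change:

```sh
# One time, on a trusted workstation: fetch the vendor's signing key and
# check its fingerprint against a value obtained out of band.
wget -O vendor-key.asc https://example.org/repo-signing-key.asc   # hypothetical URL
gpg --show-keys vendor-key.asc    # prints the fingerprint; compare it manually

# Serialize the verified key to a binary keyring and commit it into the
# role/repository, so jobs never need to fetch it over the network again.
gpg --dearmor < vendor-key.asc > vendor-archive-keyring.gpg
```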
SvenKieske | yeah, that would also be my preferred solution, and I _think_ kevko will go this route, I did just review the code. | 12:36 |
SvenKieske | I'll make sure to leave a comment regarding this on the changeset. | 12:37 |
SvenKieske | so I'm not sure if these requests to heroku are directly related to the fluent-package installation. the mirror fqdn is packages.treasuredata.com, which resolves to many ips on cloudfront, which forward the requests who-knows-where.. | 12:42 |
fungi | also, if that's a cdn, it likely resolves to different ip addresses depending on who's asking and where they are on the internet | 12:44 |
SvenKieske | yeah, the joy of anycast :) | 12:44 |
fungi | well, also just using dns techniques, like | 12:45 |
fungi | nameservers returning different results based on things like geo-ip lookup for the client address | 12:45 |
SvenKieske | yeah, that's also possible | 12:45 |
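A small sketch of how one might compare answers from different resolvers to see whether the CDN hands out location-dependent addresses; the resolver IPs are just well-known public resolvers used as examples:

```sh
# Ask two different public resolvers for the mirror's A records; differing
# answer sets are a hint that geo/latency-based DNS steering is in play.
dig +short packages.treasuredata.com @1.1.1.1
dig +short packages.treasuredata.com @8.8.8.8

# Show the full answer section to follow any CNAME chain toward cloudfront.
dig +noall +answer packages.treasuredata.com
```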
SvenKieske | soo..should I just recheck the job? I hate to do that. | 12:46 |
SvenKieske | that build _did_ work, so maybe it was really just a spurious error: https://zuul.opendev.org/t/openstack/build/f18392d9f660473ea3b25d0be070b054/log/kolla/build/fluentd.log | 12:47 |
fungi | when it looks like a network failure, i try to see if the site that the job failed to reach has anything like a service status page where they might list outages/incidents that could explain it and maybe even indicate whether it's still happening. beyond that, i recheck and include a message explaining that the previous build log indicated a network connection issue | 12:47 |
SvenKieske | seems reasonable: is there some list or something somewhere where I can correlate build machines with locations? I guess you just know them? :D | 12:48 |
fungi | basically, my goal is to not waste resources rechecking something that's likely to hit the same exact network issue again, but our reality is that the internet is not reliable | 12:48 |
SvenKieske | yeah sure :) I was in an ops-centric role for a data center/webhoster for almost a decade, tell me more about internet reliability ;) | 12:49 |
fungi | SvenKieske: if you mean where the job nodes are located in the world, the zuul inventory.yaml archived with the build results includes the donor provider's region name, and those are almost always based on icao or iata airport codes | 12:49 |
SvenKieske | ah that's a good pointer! | 12:50 |
SvenKieske | thank you | 12:50 |
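A rough sketch of looking that up from a build's archived logs; the zuul-info/inventory.yaml path and the nodepool variable names are assumptions based on how such builds commonly publish their inventory, and the build UUID is a placeholder:

```sh
# Pull the archived inventory for a build and grep out the donor provider,
# region and cloud details recorded for the job node.
curl -s "https://zuul.opendev.org/t/openstack/build/<BUILD_UUID>/log/zuul-info/inventory.yaml" \
  | grep -E 'cloud|provider|region'
```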
fungi | there's a telecommunications semi-standard about location-based device naming that some providers follow too, but i've been out of that industry for long enough i don't remember off the top of my head what it's called (also not everyone follows it, more popular for backbone providers than for isps) | 12:51 |
SvenKieske | rax-dfw might be something rackspace like? | 12:52 |
SvenKieske | I just read about that some months ago, I guess also in the context of the openstack region names.. but I also don't remember the standard's name | 12:53 |
fungi | yes, rackspace is the provider there, and dfw is the dallas/fort worth texas airport code so it's somewhere in that area | 12:53 |
fungi | it's not really anything openstack-oriented. all sorts of utility and service providers name their facilities based on similar patterns | 12:54 |
fungi | aha, clli was the semi-standard i was trying to recall (but again, not everyone uses that, it's just an example): https://en.wikipedia.org/wiki/CLLI_code | 12:55 |
SvenKieske | yeah, I never saw that in europe; reminds me of airport names in aviation language.. | 12:56 |
fungi | reusing nearby airport codes is more common among our donors, clli has its own set of location abbreviations | 12:57 |
SvenKieske | silly question, but are all our buildlogs in UTC? I think they are? | 12:58 |
fungi | so for example, the ovh regions we boot job nodes in are bhs1 (beauharnois, canada) and gra1 (gravelines, france) | 12:59 |
fungi | SvenKieske: yes, build logs are in utc | 12:59 |
SvenKieske | rackspace seems to be fine, the status page is a maze though..thanks again | 13:00 |
fungi | SvenKieske: well, i was talking more about status pages for the services the job was trying (and failing) to reach | 13:01 |
fungi | you mentioned packages.treasuredata.com and cloudfront, so i would generally start by trying to see if they have status pages | 13:01 |
fungi | for example, when i see issues getting packages from pypi (even through our proxies), i take a quick look at https://status.python.org/ | 13:02 |
fungi | dockerhub and quay have status pages too where they post incidents, same for github, gitlab, et cetera | 13:03 |
SvenKieske | yeah, I'm a frequent visitor of githubstatus.com ;) | 13:04 |
fungi | anyway, i need to knock out some other morning tasks, i'll be back in a bit | 13:04 |
SvenKieske | thanks for all the hints provided so far :) | 13:07 |
fungi | yw | 14:03 |
fungi | config-core: infra-root: anyone else available to review https://review.opendev.org/896943 (the dependency has merged now)? | 15:06 |
fungi | once it's in, i'll let the openstack release managers to run a release test so we can confirm the signatures look right | 15:06 |
fungi | er, i'll let the openstack release managers know to run release test | 15:06 |
frickler | https://review.opendev.org/c/openstack/project-config/+/896944 is the one to review ;) but I'm also fine to self-approve given that it will be tested later anyway | 15:07 |
fungi | gah, right thanks. i keep copying/pasting the depends-on field instead of the correct url | 15:08 |
clarkb | fungi: I'm just sitting down can take a look | 15:11 |
clarkb | ah looks like it is done | 15:12 |
fungi | clarkb: as frickler correctly noted, i pasted the wrong change. i meant https://review.opendev.org/896944 | 15:13 |
clarkb | oh ack | 15:13 |
clarkb | done | 15:13 |
clarkb | infra-root https://review.opendev.org/c/opendev/system-config/+/892699 has been on the todo list as an item for after the openstack release. That has happened now and I can make time today to help restart gerrit and ensure it is happy with its new runtime if we want to land it | 15:14 |
fungi | sounds great, i'll be around | 15:16 |
clarkb | ok I can approve it in a few | 15:17 |
fungi | awesome | 15:17 |
fungi | or i can approve it if you're ready | 15:18 |
clarkb | yup that works. I'm just settling in first (loading ssh keys, opening office window, finding something to drink) | 15:18 |
fungi | cool, i approved it just now | 15:20 |
clarkb | thanks. I now have a cup of chai and ssh keys are loaded | 15:21 |
opendevreview | Merged openstack/project-config master: Replace 2023.2/Bobcat key with 2024.1/Caracal https://review.opendev.org/c/openstack/project-config/+/896944 | 15:25 |
clarkb | fungi: I see the well behaved bots doing their thing against gitea, but no flood currently at least on gitea09 | 15:28 |
clarkb | which is good, it means we didn't accidentally block the well-behaved bots and maybe we made an impact on the bad ones | 15:29 |
*** elodilles_afk is now known as elodilles | 15:37 |
clarkb | I'm going ahead and starting on a gitea 1.21 change. There are no release notes/changelog updates yet but the template changes are annoying enough to get out of the way on their own | 15:40 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Update gitea to 1.21 https://review.opendev.org/c/opendev/system-config/+/897679 | 16:02 |
clarkb | I'm trying to keep myself distracted so I don't look at java today :) | 16:07 |
fungi | i must say, that's a compulsion i don't believe i've ever experienced | 16:08 |
fungi | i got mosh set up on my linux phone and confirmed i can get to the tmux session on my shell server where my irc client, mua, calendar, et cetera run, so i'll be able to get by without my netbook while i'm travelling next week | 16:09 |
clarkb | that sounds "fun" | 16:09 |
clarkb | fungi: It is mostly a compulsion to fix the problem more than a desire to java. The wall I'm up against now is I really need access to the logs but they seem to be bitbucketed :/ | 16:10 |
fungi | it's working really well, just thankful i've got good enough eyesight to see an 80x25 terminal on there | 16:10 |
clarkb | I've been debating a cheap chromebook running the debian overlay or whatever it is called these days as a backup device | 16:11 |
fungi | well, i now have two identical dead netbooks i need to send back to hk for repairs, because i procrastinated getting my backup one fixed | 16:12 |
clarkb | the gerrit update should be landing soon | 16:13 |
clarkb | there is a config update that I think is a noop to set java home (noop because we don't use the init script for gerrit in our containers and that value is used by the init script to know which java to run) | 16:14 |
clarkb | once that is in place we should be good to stop gerrit, move the replication waiting queue aside, then start gerrit | 16:14 |
fungi | i also just +1'd a change in gertty over mosh from my phone to my home workstation | 16:16 |
clarkb | I was a big fan of my OG Droid phone with the slide out keyboard | 16:18 |
clarkb | they don't make them like that anymore | 16:18 |
clarkb | most devices do usb on the go now though so you can hook up other keyboard devices at least | 16:19 |
opendevreview | Merged opendev/system-config master: Update gerrit image to bookworm https://review.opendev.org/c/opendev/system-config/+/892699 | 16:19 |
fungi | i bought a lapdock for this, which is basically a usb type-c connected streaming kvm in the form of a slim laptop with a holder for the phone next to the screen, making it essentially a dual-screen system | 16:22 |
clarkb | oh neat | 16:22 |
fungi | can drag apps from the phone screen onto the laptop-sized screen with the pointer (the screen is also a touchscreen though) | 16:23 |
clarkb | gerrit's config should be getting updated as we speak | 16:23 |
clarkb | java home is updated on disk | 16:24 |
clarkb | and the job completed successfully. I think everything is ready if we are | 16:25 |
clarkb | I think the process we want to use is stop gerrit (docker-compose down), mv the replication waiting/ tasks dir aside, docker-compose pull, then docker-compose up -d | 16:26 |
clarkb | maybe make note of the current image in use so that we can return to it if necessary | 16:26 |
clarkb | should we notify the openstack release team then go for it? | 16:27 |
clarkb | fungi: ^ | 16:27 |
clarkb | opendevorg/gerrit 3.7 3a2e576abc94 | 16:28 |
clarkb | that is the image we are currently running | 16:28 |
clarkb | /home/gerrit2/review_site/data/replication/ref-updates/waiting is the dir to move aside | 16:30 |
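Pulling those steps together, a rough sketch of the restart sequence; the compose directory and the destination name for the set-aside queue are assumptions, not values taken from the actual playbooks:

```sh
cd /etc/gerrit-compose            # assumed location of the docker-compose.yaml

# Record the image currently in use so we can roll back to it if needed.
docker image ls opendevorg/gerrit

# Pre-pull the new image while gerrit is still running to minimize downtime.
docker-compose pull

# Stop gerrit, set aside the queued replication tasks, then start it again.
docker-compose down
mv /home/gerrit2/review_site/data/replication/ref-updates/waiting \
   /home/gerrit2/review_site/data/replication/ref-updates/waiting.old
docker-compose up -d
```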
fungi | status notice The Gerrit service on review.opendev.org will be offline momentarily while we restart it for a patch upgrade | 16:30 |
fungi | that look right? | 16:30 |
clarkb | fungi: maybe s/patch/runtime and platform/ | 16:30 |
clarkb | though we'll also pick up some gerrit patch updates due to the nature of our image rebuild process | 16:31 |
fungi | status notice The Gerrit service on review.opendev.org will be offline momentarily while we restart it for a combined runtime and platform upgrade | 16:31 |
clarkb | yup lgtm | 16:31 |
fungi | should i send that now? or are we not ready to start yet? | 16:31 |
clarkb | I'm ready if you are. I can do the steps I outlined above | 16:32 |
fungi | #status notice The Gerrit service on review.opendev.org will be offline momentarily while we restart it for a combined runtime and platform upgrade | 16:32 |
opendevstatus | fungi: sending notice | 16:32 |
fungi | yep, steps lgtm. thanks! | 16:32 |
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily while we restart it for a combined runtime and platform upgrade | 16:32 |
clarkb | ok starting momentarily | 16:32 |
clarkb | I'm actually going to docker-compose pull first as that should minimize the outage time | 16:33 |
fungi | good idea | 16:33 |
clarkb | ok image pulled. Stopping, moving tasks, and starting now | 16:34 |
opendevstatus | fungi: finished sending notice | 16:35 |
clarkb | INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 3.7.5-45-g90bfc86419-dirty ready | 16:36 |
clarkb | web ui loads for me | 16:36 |
clarkb | the reported jvm looks correct too | 16:36 |
fungi | lgtm as well | 16:38 |
fungi | gertty has come back to online mode as well | 16:38 |
clarkb | another minor prep step towards 3.8 | 16:39 |
fungi | for those not following #openstack-release, the updated key checks out: https://paste.opendev.org/show/bNky4Hv8qcBKcmexjB7p/ | 17:29 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Update gitea to 1.21 https://review.opendev.org/c/opendev/system-config/+/897679 | 17:31 |
clarkb | if ^ gets gitea 1.21 further along then that will be an annoying thing to change, but probably worthwhile in the long run | 17:33 |
clarkb | fungi: anything we should be doing to help prep for the mm3 migration Thursday? | 17:43 |
fungi | not yet, i'll put together the system-config changes in a bit for reviewing | 17:45 |
opendevreview | Clark Boylan proposed openstack/project-config master: Update the jeepyb gerrit build jobs to match current base image https://review.opendev.org/c/openstack/project-config/+/897710 | 17:47 |
clarkb | This change is a follow-up to the gerrit change we just put into production. Keeps everything aligned | 17:47 |
clarkb | and https://review.opendev.org/c/opendev/gear/+/895968 is the last update for container images in opendev I think. The remainder are in zuul | 17:48 |
clarkb | I think I'm going to pivot the zuul-registry update to just be a python update and stay on bullseye for now. Then we can clean up the other bullseye images at least | 17:48 |
clarkb | and once that is done we can add python3.12 images :) | 17:49 |
clarkb | oh storyboard is also lagging behind, but I don't think the containerization went anywhere, so it would need to be updated when that work is picked up? fungi does that make sense or should we update those images already? | 18:18 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Update lodgeit container image to Python3.11 on Bookworm https://review.opendev.org/c/opendev/lodgeit/+/897711 | 18:21 |
clarkb | and I missed lodgeit | 18:21 |
fungi | clarkb: yeah, i don't think there was any further work on the sb container configs | 18:34 |
clarkb | arg, changing the key type in the gitea change got the job to pass. I think I saw that gitea is requiring a minimum number of rsa bits, and that key presumably doesn't meet the criteria | 18:46 |
clarkb | we'll probably want to replace the key. In my test change I went to ed25519 to avoid needing to bump the bit count in the future again, but open to feedback on how we want to approach that | 18:46 |
clarkb | this is the key that gerrit uses to replicate to gerrit | 18:46 |
clarkb | *gerrit uses to replicate to gitea | 18:47 |
fungi | how long is the current key? | 18:47 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Update lodgeit container image to Python3.11 on Bookworm https://review.opendev.org/c/opendev/lodgeit/+/897711 | 18:51 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Put regex multiline specifier at beginning of regex string https://review.opendev.org/c/opendev/lodgeit/+/897713 | 18:51 |
clarkb | fungi: I'm not sure. Can you determine that easily from the pubkey? gitea must be doing it somehow | 18:53 |
frickler | ssh-keygen with the proper option can tell you that. but I'm already too offline to check the details | 19:24 |
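The option frickler is most likely thinking of is `-l`, which prints the fingerprint along with the bit length; a quick sketch, with an illustrative filename for the replication pubkey:

```sh
# Prints length, fingerprint, comment and type, e.g.
# "2048 SHA256:abcd... gerrit-replication (RSA)"
ssh-keygen -l -f gitea-replication-key.pub
```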
clarkb | the minimum keylength is 3072 according to the git commit I found | 19:40 |
clarkb | looks like we might be able to disable the min key length checks | 19:43 |
clarkb | I'll update the change to do that instead of replacing the key and we can discuss if we are happy with that | 19:43 |
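For context, a hedged sketch of the Gitea knob involved, assuming the `[ssh.minimum_key_sizes]` section behaves as in earlier Gitea releases (per-type thresholds that can be lowered, or disabled with a negative value); the app.ini path is an assumption about a typical container deployment, and the exact 1.21 defaults should be checked against the release notes:

```sh
# Lower the RSA minimum so the existing 2048-bit replication key keeps working.
cat >> /data/gitea/conf/app.ini <<'EOF'

[ssh.minimum_key_sizes]
RSA = 2048
EOF
```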
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Update gitea to 1.21 https://review.opendev.org/c/opendev/system-config/+/897679 | 19:45 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Update lodgeit container image to Python3.11 on Bookworm https://review.opendev.org/c/opendev/lodgeit/+/897711 | 19:52 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Replace inspect.getargspec with inspect.getfullargspec https://review.opendev.org/c/opendev/lodgeit/+/897714 | 19:52 |
clarkb | I'm impressed that lodgeit manages to hit so many deprecations | 19:52 |
fungi | the current default rsa keylength generated by ssh-keygen is 3072 bits. a recent pubkey i have is 1.5x as long as the one in inventory/service/group_vars/gitea.yaml | 20:05 |
fungi | so probably 2048 | 20:05 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Replace inspect.getargspec with inspect.signature https://review.opendev.org/c/opendev/lodgeit/+/897714 | 20:40 |
opendevreview | Clark Boylan proposed opendev/lodgeit master: Update lodgeit container image to Python3.11 on Bookworm https://review.opendev.org/c/opendev/lodgeit/+/897711 | 20:40 |
clarkb | ok that config option does fix things for gitea 1.21. We should be able to use that until/when we decide to replace the key with something newer | 20:48 |
clarkb | ok I think that makes lodgeit py311 compatible | 20:52 |
clarkb | I've updated the meeting agenda. Please add any other items or let me know if there are things that need to go in there | 21:24 |
clarkb | arg I missed that https://review.opendev.org/c/opendev/system-config/+/895522 was still outstanding. The only thing it will do is update tags for plugins which probably in all cases didn't actually change between 3.7.4 and 3.7.5 | 21:38 |
clarkb | let me double check that. But I suspect we can land that and just not restart gerrit since it is equivalent to what we are running | 21:39 |
clarkb | plugin-manager and webhooks do have differences. We don't really use either right now and the single commit updates to both of them seem minimal | 21:42 |
clarkb | but I'm happy to get that in and restart again in order to be properly in sync. I don't think it is urgent | 21:43 |
clarkb | infra-root ^ fyi a small annoying but not super urgent thing | 21:44 |
clarkb | I need to pop out soon for some family errand stuff. Last call for meeting agenda updates | 21:59 |
clarkb | agenda sent | 22:24 |
fungi | ah, i meant to add debian osbpo mirroring that got brought up in channel earlier today, but can cover it during open discussion, no worries | 22:37 |
opendevreview | Samuel Walladge proposed zuul/zuul-jobs master: Ensure the log dir exists before writing in it https://review.opendev.org/c/zuul/zuul-jobs/+/897743 | 23:23 |