opendevreview | Merged opendev/system-config master: Revert "Switch from legacy to new style keycloak container" https://review.opendev.org/c/opendev/system-config/+/907119 | 00:06 |
---|---|---|
clarkb | last call for meeting content | 00:07 |
JayF | Weird question: do we generally do anything to set /bin/sh on our testing images. | 00:31 |
JayF | or do we leave it at the default (dash on ubuntu) | 00:31 |
clarkb | should be the default | 00:34 |
* elibrokeit waves to JayF | 00:34 | |
JayF | elibrokeit: found https://review.opendev.org/c/openstack/liberasurecode/+/907156 and we were just conjecturing how it passed CI | 00:34 |
JayF | given afaict the defaults on ubuntu should make dash the /bin/sh provider | 00:35 |
clarkb | our ubuntu and debian images are built with debootstrap and that should pull in default shell stuff. On CentOS things are built with dnf/yum and should produce similar results | 00:35 |
elibrokeit | this is actually hilariously complex | 00:35 |
elibrokeit | so, debian and ubuntu as a result default to dash | 00:35 |
JayF | So basically I'm trying to ensure we haven't shipped bash-centric code elsewhere it doesn't belong | 00:35 |
elibrokeit | but dash once did not support the basics to be a valid autoconf shell | 00:36 |
elibrokeit | they added support a decade plus ago, then a bunch of software predictably failed in debian buildbots to build from source: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=582952 | 00:36 |
elibrokeit | so, they passed the configure option to dash to disable valuable POSIX features | 00:36 |
elibrokeit | this restored the pre-2010 fact that a configure script will check to see if the current shell is basically capable of POSIX, and if not -> it checks for bash and tries to re-exec as bash | 00:37 |
elibrokeit | you cannot really control this in configure.ac but you can control it by defacing /bin/sh itself | 00:37 |
clarkb | JayF: https://paste.opendev.org/show/bjLLwdrP6DWxiHkaQRws/ jammy node on a running test I ssh'd into to check | 00:37 |
clarkb | if I had to guess whatever ran the shell was selecting bash | 00:38 |
elibrokeit | debian has flipflopped several times on whether /bin/sh has $LINENO enabled, and the ultimate debian goal is to fix all packages shipped by debian's own archives and then enable $LINENO | 00:38 |
JayF | clarkb: that's basically what I was concerned about, if there's something weird that's configure-specific I think that is a small enough set of things that we don't worry about it :) | 00:39 |
JayF | I'm a bash lover, but I #!/bin/bash all the things, and I may have never in my life written a configure script :) | 00:39 |
elibrokeit | GNU created configure.ac technology in order to be maximally portable, so they kinda don't want configure.ac authors to be able to influence which shell it runs with :) | 00:40 |
clarkb | elibrokeit: you said just above it will use bash though? | 00:41 |
JayF | well, the build is either broken or requires bash | 00:41 |
elibrokeit | it will only use bash until debian's own archives are fixed, then they will re-enable $LINENO support in their dash package and thirdparty projects will find their builds failing | 00:41 |
clarkb | my point is the test environment we provide isn't divergent from the ubuntu default of dash as sh | 00:41 |
clarkb | so if dash is not compatible then something must be choosing bash | 00:42 |
JayF | clarkb++ that was the only question I was asking, yeah | 00:42 |
JayF | and I don't wanna get into the weeds of CI for a library I don't know much about, I just wanted to make sure we hadn't slipped a "and replace /bin/sh with real bash" into our CI jobs at some point in the last decade :D | 00:42 |
JayF | it sounds like the answer is soundly no, something else weird was going on with that change, and I'm not interested as much in that change as I am the general case :D | 00:43 |
elibrokeit | another debian bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=842242 | 00:43 |
clarkb | JayF: some of us run systems that use bash :) | 00:43 |
elibrokeit | someone wanted to stop crippling /bin/sh on debian, and for a while it was un-crippled again | 00:43 |
JayF | clarkb: I <3 bash, but I also like knowing our stuff works where advertised. Especially in a world where people ship/build stuff in slimmed containers | 00:44 |
elibrokeit | then they re-crippled it *again* after some more reports of debian packages failing to build from source came in | 00:44 |
JayF | elibrokeit: I just hear you say we're not weird for being temporarily broken, too :) | 00:44 |
clarkb | JayF: you should not build in your final slim container | 00:44 |
elibrokeit | basically, the core issue here is: that configure.ac is not "properly good" but debian is not a good test environment for this | 00:44 |
JayF | clarkb: I agree, but I'm not 100% of everyone on the internet :D | 00:44 |
clarkb | JayF: I understand the motivation there but I think it is misplaced | 00:44 |
clarkb | elibrokeit: right and we intentionally do our best to provide an ubuntu system that mimics actual ubuntu installations | 00:45 |
clarkb | same for debian and centos and so on | 00:45 |
JayF | Maybe, but I think about ops in terms of apis oftentimes, and if an api says "this is posix shell" we should make it posix shell, and that's what gives us the opportunity to work on platforms we don't always target | 00:45 |
clarkb | replacing dash or changing how dash is configured would be a problem for us because now our test environments don't mimic actual installations | 00:45 |
JayF | that's the primary reason I care about that :D | 00:46 |
JayF | clarkb+++++++++++++ that is 100% the heart and soul of why I asked the question | 00:46 |
elibrokeit | and to be clear: this *will* break one day, once debian re-enables a feature-complete dash shell | 00:46 |
clarkb | right but the fix isn't to change the shell | 00:46 |
clarkb | its to change the autoconf | 00:46 |
JayF | That's what eli pushed | 00:46 |
JayF | the original linked patch | 00:46 |
elibrokeit | however, for openstack you don't really need to worry about that for now, just merge a patch if you think it is correct -- and rest easy knowing that this won't be an issue in 2 years | 00:47 |
elibrokeit | and that's what my patch does | 00:47 |
JayF | elibrokeit: nobody in this conversation can merge that patch fwiw :D | 00:47 |
JayF | well, I mean, clark can do whatever he wants in gerrit, he's got a big hammer on the systems, but can and should are not the same ;) | 00:47 |
elibrokeit | no problem :) | 00:48 |
elibrokeit | just trying to clarify the risks involved | 00:48 |
JayF | yeah, I'm going to ask a question but taking it to another channel, we've bugged these folks enough I think :D | 00:48 |
JayF | thanks o/ | 00:48 |
clarkb | also it isn't clear to me how LINENO support affects [/test functionality? Maybe they are all bundled up into the same set of functionality that is enabled and disabled. | 00:48 |
elibrokeit | GNU autoconf has a weird suite of tests that it runs at the beginning of a configure script, to check how good your /bin/sh is, and if it detects something that isn't a very good POSIX sh (for example, the solaris 10 /bin/sh) it will also search for some other shells such as bash and zsh and try to re-exec as that | 00:49 |
elibrokeit | one of its internal tests is for $LINENO support | 00:49 |
JayF | so LINENO is just the canary that makes configure say "I'm going to run as bash instead, because this sh is so bad" | 00:50 |
clarkb | got it | 00:50 |
JayF | I think I finally understood that for the first time across like, 2 channels and DMs of this conversation LOL | 00:50 |
elibrokeit | yup | 00:50 |
clarkb | and [ will never work under dash so when LINENO is present it runs dash then fails | 00:50 |
JayF | Which is why the issue is exposed on Gentoo, but not on the versions of Debian we support today | 00:51 |
elibrokeit | no, [ works fine, but [ foo == bar ] is not valid because /bin/sh doesn't know what "==" is | 00:51 |
JayF | that's the root I've been looking for the whole time: why eli saw it on gentoo and we didn't see it on debian/ubuntu | 00:51 |
elibrokeit | well, that does mean that the [ program emits an error :) | 00:52 |
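The distinction between `[` itself and the `==` operator can be probed with a small sketch. This assumes a POSIX system with `/bin/sh` present; `sh_status` is a hypothetical helper name, not something from the change under discussion. Only the `=` spelling is specified by POSIX, so the result of the `==` probe depends on which shell provides `/bin/sh` (a dash-backed one typically reports an "unexpected operator" error, while bash accepts it).

```python
import subprocess


def sh_status(script, shell="/bin/sh"):
    """Run a one-line script under the given shell and return its exit status."""
    result = subprocess.run([shell, "-c", script], stderr=subprocess.DEVNULL)
    return result.returncode


# POSIX spelling: portable across sh implementations.
print("[ foo = foo ]  ->", sh_status("[ foo = foo ]"))
print("[ foo = bar ]  ->", sh_status("[ foo = bar ]"))

# Bash-style spelling: the status here varies with the /bin/sh provider.
print("[ foo == foo ] ->", sh_status("[ foo == foo ]"))
```

Running the last probe against both `/bin/sh` and `/bin/bash` (by passing `shell=`) is one way to see whether a given image would hide this class of bug.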
JayF | but the build would succeed in that case anyway lol | 00:52 |
clarkb | ah ok I saw the original change which switched to test implying [ was the problem | 00:52 |
JayF | https://review.opendev.org/c/openstack/liberasurecode/+/907156/1/configure.ac#167 just with all specific CPU support would be disabled | 00:53 |
clarkb | fwiw you should squash those two changes and abandon the child | 00:53 |
JayF | ++ | 00:53 |
elibrokeit | it would emit scary QA warnings if your package manager catches those, then disable tons of useful features, yeah | 00:53 |
clarkb | oh wait it is two changes | 00:53 |
elibrokeit | clarkb: they are two distinct problems, although they do both appear on the same lines | 00:53 |
clarkb | I have no say but I think I'm -1 on the second | 00:53 |
clarkb | [ is the same as test they aren't different not sure why it matters | 00:53 |
clarkb | I guess this is specific to autoconf | 00:54 |
clarkb | if it were a shell script I'd care | 00:54 |
JayF | per the commit, when you run configure through M4 scripting, it changes things, and test just lets you avoid escaping | 00:54 |
elibrokeit | also yes, the first one is an actual bug today but the second one is a style issue, so I figured I would allow the maintainers to decide if they care | 00:54 |
JayF | if it were a shell script, I'd -1 both changes and say "just shebang it to bash, openstack requires bash" :D | 00:54 |
* elibrokeit knows entirely too much about autoconf, all of it bad | 00:55 | |
* JayF & | 00:56 | |
clarkb | I can't remember who I first heard say it but its funny how `sh` is the standard but all the `sh`s are different these days so you are more portable if you actually write against bash | 00:58 |
opendevreview | Merged opendev/git-review master: Add --wip as an alias to --work-in-progress https://review.opendev.org/c/opendev/git-review/+/906508 | 01:04 |
* tkajinam didn't know the --work-in-progress option. that's nice | 01:54 | |
fungi | tkajinam: -w and -W can also be used to set/unset the wip toggle at upload | 02:50 |
tkajinam | fungi, yeah I noticed there are several options I was not aware of. it's good chance to learn these (because my previous workflow can be hugely optimized by some of these) | 02:55 |
timburke | i'm glad i noticed this convo about liberasurecode! fwiw, i doubt any of the current maintainers know as much about autoconf as elibrokeit, and in general we've been pretty lax wrt style even for the C code (much less the ac stuff) | 04:47 |
elibrokeit | timburke: for my sins, I'm a core developer for the meson build system :) | 04:47 |
timburke | god help me, i had commit rights to eventlet before it was cool ;-) | 04:48 |
elibrokeit | that would be the project where someone squashed all my fine-grained commits using the github merge button | 04:49 |
* elibrokeit reaches desperately for gerrit | 04:50 | |
timburke | elibrokeit, as long as i've got you -- got any ideas about how i could force a more-strict build env in the libec gate? i'd love to have another job that actually wouldn't have passed without your fix in https://review.opendev.org/c/openstack/liberasurecode/+/907156 | 04:54 |
tonyb | frickler: Yes, that's the current situation. There are ~28 nodes that are visible in `openstack server list` but `openstack server delete $UUID` says no such record. | 04:57 |
elibrokeit | timburke: if you set `CONFIG_SHELL=/bin/sh ./configure` then the configure script will understand that you "darned well want to use this specific shell" and will not try to run bash instead | 04:57 |
elibrokeit | or well, I suppose you can add Gentoo CI :) | 04:58 |
elibrokeit | I think setting CONFIG_SHELL is easier though | 04:58 |
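For a CI job driven from Python, the `CONFIG_SHELL` suggestion might be wired up roughly as follows. This is a sketch under assumptions: `run_configure` is a hypothetical helper, the source tree is assumed to contain an executable `configure` script, and nothing here reflects how the liberasurecode jobs are actually defined.

```python
import os
import subprocess


def run_configure(srcdir, shell="/bin/sh", extra_args=()):
    """Run ./configure from srcdir with CONFIG_SHELL pinned to one shell.

    Equivalent in spirit to: CONFIG_SHELL=/bin/sh ./configure
    """
    # Copy the current environment and add CONFIG_SHELL on top of it.
    env = dict(os.environ, CONFIG_SHELL=shell)
    configure = os.path.join(os.path.abspath(srcdir), "configure")
    return subprocess.run(
        [configure, *extra_args],
        cwd=srcdir,
        env=env,
        capture_output=True,
        text=True,
    )


# Example call (hypothetical path):
#   result = run_configure("/path/to/liberasurecode")
#   print(result.returncode, result.stderr)
```

Capturing stdout and stderr keeps the output available for a later scan of the build log.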
timburke | i like both options :-) i've mainly messed around with ubuntu or centos for my CI jobs, but especially for a non-python project, more targets seems like a good idea | 04:59 |
elibrokeit | portage in particular uses this: https://github.com/gentoo/portage/blob/c82dc8c8fe92979b82df4c71bb9961973367e8f9/lib/portage/package/ebuild/doebuild.py#L2247 | 05:01 |
elibrokeit | it will grep build logs and scan for things that are usually a suspicious sign of something wonky | 05:02 |
elibrokeit | in particular: r"(.*): line (\d*): (.*): command not found$" | 05:02 |
elibrokeit | r"(.*): (\d+): (.*): not found$" | 05:02 |
opendevreview | Merged openstack/project-config master: Fix wheel_volume values for centos stream wheel mirrors https://review.opendev.org/c/openstack/project-config/+/907150 | 05:05 |
JayF | Isn't a base level answer just to see if you can get your thing to build an environment with no bash whatsoever? | 05:15 |
JayF | If I were approaching a CI job with this in mind, I would probably try to build inside a container that explicitly had no bash installed. | 05:16 |
timburke | 🎉 it's a start! https://zuul.opendev.org/t/openstack/build/978a763d413944bf9efd949165515e5d/log/job-output.txt#1942-1949 | 05:39 |
opendevreview | Dr. Jens Harbott proposed opendev/git-review master: Add CC similarly to reviewers https://review.opendev.org/c/opendev/git-review/+/849219 | 07:58 |
jrosser | could i get a held node on job openstack-ansible-deploy-aio_magnum_octavia_capi-ubuntu-jammy / 905199 | 08:56 |
frickler | jrosser: sorry, but that job name is too long, can't get a grip on that. just kidding, on it ;) | 09:06 |
jrosser | thankyou :) | 09:06 |
jrosser | fwiw the name string is parsed and determines what runs in the job | 09:07 |
opendevreview | Jan Marchel proposed openstack/project-config master: Add new components to NebulOuS project: prediction-orchiestrator, exn-middleware, overlay-network-agent https://review.opendev.org/c/openstack/project-config/+/907060 | 09:54 |
opendevreview | Jan Marchel proposed openstack/project-config master: Add new components to NebulOuS project: prediction-orchiestrator, exn-middleware, overlay-network-agent https://review.opendev.org/c/openstack/project-config/+/907060 | 10:12 |
ildikov | Hi All, I have a quick question. If there's someone who has two OpenInfra IDs, is there a way to merge them or remove the one they don't use anymore? | 13:15 |
fungi | ildikov: we don't manage openinfraid, nor is it currently used to log into any of the systems we manage or host really (aside from zanata and refstack). is it maybe launchpad/ubuntuone sso openids they have two of? | 13:35 |
*** ykarel__ is now known as ykarel | 14:04 | |
ildikov | fungi: hmm, I might've mixed it up with the OpenInfra Foundation profile | 14:08 |
ildikov | fungi: I meant the data that Bitergia is using as well | 14:09 |
fungi | ildikov: i see. it's probably a question for the openinfraid maintainers as to how to close down one of their ids there (if they do have multiple foundation profiles). if they contributed to a project with multiple gerrit accounts though, they ought to be able to just list the e-mail addresses for both on their foundation profile (i think it lets you include up to 3 addresses) | 14:11 |
fungi | my understanding is that bitergia gets the preferred e-mail address associated with the gerrit account that owns each merged change, and then asks openinfraid (really the "summit api" at openstackid-resources.openstack.org) for the foundation profile associated with each address | 14:13 |
fungi | ildikov: so it's possible to have a many-to-one (well, up to three-to-one) relationship between multiple gerrit accounts and a single foundation profile | 14:23 |
fungi | clarkb: i think i see where testinfra's is_listening is getting tripped up. the address getting returned from the listening sockets list is ::ffff:127.0.0.1 rather than 127.0.0.1 | 14:40 |
fungi | there's a (very old) open issue about it: https://github.com/pytest-dev/pytest-testinfra/issues/286 | 14:42 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Upgrade to Keycloak 23.0 https://review.opendev.org/c/opendev/system-config/+/907141 | 14:46 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: DNM: Fail keycloak testing for an autohold https://review.opendev.org/c/opendev/system-config/+/906600 | 14:46 |
fungi | restored the is_listening test with ::ffff: prefixed on the loopback address | 14:46 |
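The mismatch between `::ffff:127.0.0.1` and `127.0.0.1` can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch, not part of testinfra; `normalize_listen_address` is a hypothetical helper name.

```python
import ipaddress


def normalize_listen_address(addr):
    """Collapse an IPv4-mapped IPv6 address to its plain IPv4 form."""
    ip = ipaddress.ip_address(addr)
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    return str(ip)


for candidate in ("::ffff:127.0.0.1", "127.0.0.1", "::1"):
    print(candidate, "->", normalize_listen_address(candidate))
```

Comparing normalized addresses on both sides avoids hard-coding the `::ffff:` prefix, at the cost of no longer distinguishing a mapped socket from a native IPv4 one.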
fungi | another option might be to ask it to listen on ::1 instead? but i seem to remember that docker doesn't like ipv6 | 14:48 |
Clark[m] | I don't think docker cares here because we are using host networking. Keycloak is what would matter for ipv6 support | 14:49 |
fungi | ah okay, so maybe we could use v6 loopback in that case | 14:49 |
fungi | mariadb is listening on both 0.0.0.0 and :: according to ss, so should be reachable by either, though maybe we also want to limit it to the loopback for added safety | 14:51 |
fungi | https://mariadb.com/kb/en/server-system-variables/#bind_address is what we want to tweak for that, i suppose | 14:54 |
fungi | also, is there a reason we should be sticking with old mariadb versions? | 14:54 |
clarkb | no reason | 14:55 |
apevec | openstack-stable-maint list owners are getting: Attempting to deliver following mail to recipient(s): <mriedem@linux.ibm.com> | 14:56 |
apevec | VMSDVMA.POK.IBM.COM unable to connect for 1 days to recipient host. | 14:56 |
fungi | clarkb: 10.6 is the newest we're using anywhere, but looks like they have 11.2 tagged as "latest" at the moment | 14:56 |
fungi | apevec: you could set that subscription to "no delivery" or unsubscribe them | 14:56 |
clarkb | fungi: ya I think whatever is the current "stable" release is fine. There may be an env var to set for the container image to set the listen addr but I'm not seeing at first glance | 14:56 |
fungi | https://hub.docker.com/_/mariadb doesn't mention a "stable" tag | 14:57 |
apevec | ack, I wonder why listman is not doing that after bouncing continues, it is now every day afaict ... | 14:57 |
clarkb | fungi: https://mariadb.com/kb/en/mariadb-server-release-dates/ | 14:58 |
fungi | apevec: mailman 3 uses separate address probes rather than relying on bounces, but also i think it doesn't disable them right away in case there's a temporary problem at the recipient's mailserver. how long has it been happening for that address? | 14:58 |
clarkb | 10.11 is the newest long term stable release series | 14:58 |
fungi | clarkb: oh! i missed it, there's an "lts" tag | 14:59 |
fungi | so should we use the 10.11 tag or the lts tag? | 15:00 |
apevec | fungi: in my email, I see it starting Jan 19 but maybe I deleted older | 15:00 |
clarkb | fungi: I would use the 10.11 series | 15:00 |
clarkb | upgrades between mariadb releases are typically easy but not free aiui | 15:01 |
apevec | https://github.com/mriedem says still at IBM | 15:01 |
apevec | but maybe IBM killed @linux.ibm.com ? | 15:01 |
apevec | MX is still there | 15:02 |
opendevreview | Merged openstack/diskimage-builder master: gentoo: don't uninstall packages that aren't installed https://review.opendev.org/c/openstack/diskimage-builder/+/904236 | 15:03 |
JayF | I'll note the linked github profile has an alternate contact email address and a linkedin. Might be worth pausing the sub to the list and reaching out via a different method? | 15:03 |
fungi | though also, i suspect matt r. has no interest in receiving a slew of periodic stable job failure e-mails daily to that or any address | 15:04 |
clarkb | ya I'm not sure we have to over think this | 15:05 |
clarkb | matt knows where to find us if necessary | 15:05 |
fungi | the openstack-stable-maint is only used for job failure notifications | 15:05 |
fungi | er, the openstack-stable-maint mailing list i mean | 15:05 |
apevec | yeah | 15:09 |
fungi | clarkb: looking at https://hub.docker.com/_/mariadb i wonder if we should be setting MARIADB_AUTO_UPGRADE for future-proofing | 15:14 |
clarkb | fungi: we can cross check against what ianw did to upgrade mariadb during one of our gerrit upgrades (it's documented in the etherpad for that gerrit upgrade 3.5 or 3.6 maybe?). I think in that case what happened was a manual run of the upgrade command then a start of the container | 15:16 |
clarkb | but ya maybe we set that flag then we can simply bump the version and monitor in the future | 15:16 |
fungi | if we want to limit it to listening on ::1 we'll need to install a custom my.cnf by mounting it into the container (as you observed, there's no listed envvars i can find for setting bind-address) | 15:32 |
clarkb | if we do that do we override all of the other settings in the process? That may be more trouble than it is worth. If we can "mix in" a my.cnf that would be better | 15:34 |
fungi | maybe it supports run-parts type inclusion dirs or something | 15:35 |
clarkb | looks like ubuntu one support is interpreting fungi's message on the lp issue for openid logins as implying the bug is in the library gerrit uses for openid | 15:35 |
clarkb | I'm not sure of that. I think it more likely that either bits flipped somehow or there is a record keeping error either in ubuntu one or gerrit | 15:35 |
fungi | i guess i don't know enough about openid protocol to figure out if gerrit or ubuntuone sso chose the association handle | 15:36 |
clarkb | fungi: ubuntu one chooses it | 15:36 |
clarkb | consumers make a post to the server saying "give us an association" the server responds with the hash material to use for verification | 15:36 |
fungi | so i suppose it could still be that something internal to gerrit's openid plugin didn't switch to checking things with the new handle (though from the logs we see it's at least trying to use the new handle) | 15:37 |
clarkb | it is possible that either side breaks things by improperly or inadvertently recording the information. My point is that i don't want us thinking the issue is in java or the java lib. We have insufficient data to indicate who may have had the problem | 15:37 |
clarkb | if anything failures with wiki might point to the common problem existing on the ubuntu one side. However, we could have independent issues in the different implementations on the consumer side as well. tl;dr who knows where the problem lies :) | 15:46 |
fungi | so inside the mariadb container there's a /etc/mysql/my.cnf which ends in "!includedir /etc/mysql/conf.d/" | 15:47 |
fungi | we should be able to mount a custom config stub into it | 15:47 |
clarkb | ++ | 15:48 |
fungi | ah, yeah, we already add a custom mount like this in the mm3 compose file: | 15:52 |
fungi | - /var/lib/mailman/99-max_allowed_packet.cnf:/etc/mysql/conf.d/99-max_allowed_packet.cnf:ro | 15:52 |
clarkb | oh ya for the large emails :) | 15:55 |
fungi | mmm, we restart the mailman containers with a notify/trigger when changing that config. should we do the same with keycloak's containers? | 16:04 |
clarkb | it seems unlikely to change often so that should be safe and good belts and suspenders | 16:04 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Upgrade to Keycloak 23.0 https://review.opendev.org/c/opendev/system-config/+/907141 | 16:22 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: DNM: Fail keycloak testing for an autohold https://review.opendev.org/c/opendev/system-config/+/906600 | 16:22 |
clarkb | tree style tabs randomly stopped working in firefox | 16:55 |
clarkb | this might legitimately make firefox unusable | 16:55 |
fungi | the rust-vmm community decided to move their online meetings from a proprietary/commercial videoconferencing platform to meetpad: https://lists.opendev.org/archives/list/rust-vmm@lists.opendev.org/message/IV3BDGEYUILRMWZPRYHDEA777IKSC3U7/ | 16:59 |
clarkb | uninstalling, restarting firefox, then starting firefox again doesn't fix it. It was working with the same version of firefox previously so it must be breaking on some sort of local state. I wonder if clearing out tabs would help | 17:00 |
clarkb | looks like they updated the plugin today | 17:07 |
clarkb | and it doesn't work | 17:07 |
clarkb | https://github.com/piroor/treestyletab/issues/3440 for anyone else currently suffering | 17:09 |
jrosser | is there caching of container images that i can use in my jobs, for example from registry.k8s.io and docker.io? | 17:13 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Upgrade to Keycloak 23.0 https://review.opendev.org/c/opendev/system-config/+/907141 | 17:14 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: DNM: Fail keycloak testing for an autohold https://review.opendev.org/c/opendev/system-config/+/906600 | 17:14 |
clarkb | yes for docker and quay. I think there is a role that will autoconfigure them for you | 17:14 |
jrosser | i would like to be able to point this to something https://docs.openstack.org/magnum/latest/user/index.html#container-infra-prefix | 17:15 |
jrosser | but a bit of codesearch suggests that no-one does that in magnum jobs currently | 17:15 |
clarkb | jrosser: zuul/zuul-jobs/roles/use-docker-mirror | 17:16 |
clarkb | that integrates with the zuul registry as well for buildset caching of images | 17:17 |
jrosser | hmm | 17:17 |
jrosser | that is a little difficult for magnum, `container_infra_prefix` gets passed into the VM it creates | 17:18 |
clarkb | ya its the difference between test harness and workload, but it should point you at what needs to be configured | 17:18 |
clarkb | you can also look at system-config/playbooks/roles/mirror/templates/mirror.vhost.j2 if you want to see the proxy cache config directly | 17:19 |
jrosser | cool, thanks | 17:19 |
clarkb | ok I fixed tree style tabs but it lost all my config | 17:19 |
clarkb | which is almost as bad as not having them in the first place (why adding new tabs to the top of the list is the default over appending to the end I'll never know) | 17:20 |
jrosser | clarkb: ah i see why this does not get used with magnum, when you tell it where a local registry is it assumes that all the images you need are in that one place, losing all understanding of their upstream source | 17:28 |
clarkb | that seems to be a common problem with the container world | 17:29 |
clarkb | docker for example can only configure mirrors for docker hub | 17:29 |
clarkb | otherwise you have to point it at specific locations | 17:29 |
jrosser | i believe there is more flexibility with containerd | 17:30 |
clarkb | yes podman and libcontainer and so on are better about it | 17:30 |
jrosser | but regardless, magnum does not expose that in its api | 17:30 |
clarkb | however skopeo can't talk to docker right now because of protocol version negotiation so you win some and lose some | 17:30 |
clarkb | fungi: looks like you have two sets of held keycloak nodes. Should we cleanup the older one? | 18:03 |
fungi | i thought i had, but i'm about to blow them away again for another revision anyway | 18:04 |
fungi | clarkb: oh! i see what happened. i accidentally blew away tonyb's meetpad autohold. sorry tonyb! would you like me to reset it and recheck the change? | 18:06 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Upgrade to Keycloak 23.0 https://review.opendev.org/c/opendev/system-config/+/907141 | 18:11 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: DNM: Fail keycloak testing for an autohold https://review.opendev.org/c/opendev/system-config/+/906600 | 18:11 |
clarkb | looks like you found the two things I just posted comments about. | 18:13 |
clarkb | fungi: if ::1 works in the wait_for I think that is a problem since it implies that java is only listening on ipv6 which I guess is ok for local proxying but you may also need to update apache? | 18:13 |
fungi | yeah, i'm curious to see | 18:14 |
fungi | we could switch them to all 127.0.0.1 we just need to make sure testinfra's is_listening knows the socket is tcp://::ffff:127.0.0.1:8080 in that case | 18:15 |
fungi | also mariadb's bind-address can take a list of addresses as of 10.11 | 18:16 |
clarkb | right. I guess :: is the one that also accepts ipv4 connections but ::1 is a specific singular address? | 18:16 |
JayF | :: is the v6 equivalent to 0.0.0.0 | 18:16 |
JayF | listen on all | 18:16 |
fungi | well, for mariadb 10.11 :: no longer listens on all addresses, only ipv6 addresses, and you'd need to make it ::,0.0.0.0 instead | 18:16 |
JayF | ::1 is loopback, which is like 127.0.0.1 | 18:16 |
clarkb | JayF: yes but it also includes all ipv6 addrs | 18:17 |
clarkb | er ipv4 addrs | 18:17 |
clarkb | fungi: wow ok | 18:17 |
fungi | JayF: depends on the socket implementation as to whether it binds to multiple address families | 18:17 |
JayF | That is standard for dual-stacked apps, they likely represent the IPs as ::1.2.3.4 in the logs, right? | 18:17 |
JayF | yes, ::ffff:1.2.3.4 | 18:17 |
fungi | ::ffff:1.2.3.4 is a v6-mapped v4 address | 18:18 |
JayF | Yeah, and I don't think it's wrong to say that most apps just expose the single socket now, yeah? | 18:18 |
JayF | Or is my ops knowledge crusty and more new things do v4/v6 listens separately? | 18:18 |
fungi | most apps do, but it's a sockopt | 18:18 |
clarkb | I think using ::1 or 127.0.0.1 is fine we just need to be consistent across the board and testing should mostly cover that (except we don't have tests for the ssl termination proxy?) | 18:18 |
fungi | also, "most apps do" *on linux* (bsd's default is the opposite) | 18:19 |
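The socket option being referred to is `IPV6_V6ONLY`. The sketch below clears it explicitly on an IPv6 listener, connects over IPv4, and reports the peer address the listener sees. It assumes a host with IPv6 support and returns `None` otherwise; `peer_seen_by_dual_stack_listener` is a hypothetical name.

```python
import socket


def peer_seen_by_dual_stack_listener():
    """Connect over IPv4 to a dual-stack listener; return the peer it reports."""
    if not socket.has_ipv6:
        return None
    try:
        server = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        return None
    with server:
        try:
            # 0 asks for a dual-stack socket; 1 would make it IPv6-only.
            server.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
            server.bind(("::", 0))  # port 0: let the kernel pick a free port
        except OSError:
            return None
        server.listen(1)
        server.settimeout(5)
        port = server.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port), timeout=5):
            conn, peer = server.accept()
            conn.close()
        return peer[0]


print(peer_seen_by_dual_stack_listener())
```

On a dual-stack capable host the printed peer is typically the IPv4-mapped form of the loopback address, which is the same shape of address that tripped up the `is_listening` check earlier in the log.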
clarkb | no wonder ipv6 adoption is terrible | 18:19 |
JayF | fungi: ah, a useful distinction | 18:19 |
clarkb | if you have to explicitly opt into ipv6 when configured to listen on all addresses thats a whole set of things that will never update | 18:19 |
JayF | And in many networks, v6 terminates at the edge | 18:20 |
fungi | but yeah, for mariadb 10.11 and later if you want it listening on both v4 and v6 loopback you need to tell it ::1,127.0.0.1 so it opens both | 18:20 |
JayF | which is like giving yourself all the pain of the new with none of the benefits :( whereby most v4 networks have roughly a similar shape | 18:20 |
clarkb | fungi: I think as long as the local my.cnf is set properly then it won't matter too much. so docker exec -it mariadb mysql -p$PASSWORD | 18:21 |
fungi | yeah | 18:21 |
fungi | anyway, we do have testing of this so it should tell us. and yes we do test the ssl termination, the testinfra-based api test is connecting to apache over https, not directly to the keycloak container | 18:22 |
clarkb | fungi: 8080 is not apache though? | 18:22 |
fungi | but 443 is | 18:22 |
clarkb | oh the api tests hit apache cool | 18:22 |
fungi | and the api tests connect to 443 | 18:23 |
fungi | right | 18:23 |
fungi | over https | 18:23 |
fungi | i did at least test that they work with the [::1]:443 mapping syntax | 18:23 |
clarkb | cool then ya I think it doesn't matter which we choose as testing will force us to be consistent | 18:25 |
fungi | that's my hope, yes | 18:25 |
fungi | looks like the keycloak api is returning a "503 Service Unavailable" in the latest iteration. i bet that's your apache canary clarkb! | 18:53 |
clarkb | progress | 18:53 |
fungi | "localhost" in the vhost config likely means 127.0.0.1 | 18:53 |
fungi | yep, /etc/hosts has "127.0.0.1 localhost" and nothing for ::1 | 18:54 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Upgrade to Keycloak 23.0 https://review.opendev.org/c/opendev/system-config/+/907141 | 18:57 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: DNM: Fail keycloak testing for an autohold https://review.opendev.org/c/opendev/system-config/+/906600 | 18:57 |
fungi | that revision ^ worked on the held node | 18:57 |
tonyb | clarkb: No it's fine. | 19:32 |
clarkb | I think that was for fungi | 19:32 |
fungi | looks like that last one succeeded, but now i'm going to do some more cleanup on it while i'm thinking about it, to make implementing the replacement server easier | 19:32 |
fungi | tonyb: yeah, i figured the held nodes were what led to the latest series of meetpad changes | 19:33 |
fungi | still, sorry about that | 19:33 |
fungi | i'll be more careful in the future | 19:33 |
fungi | i should make sure to always grep the autohold list for my own nick to avoid similar accidents | 19:34 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Upgrade to Keycloak 23.0 https://review.opendev.org/c/opendev/system-config/+/907141 | 19:35 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: DNM: Fail keycloak testing for an autohold https://review.opendev.org/c/opendev/system-config/+/906600 | 19:35 |
tonyb | clarkb: Yeah you're right that was meant for fungi. Either way the hold is no longer needed | 19:39 |
fungi | well, cool, i cleaned it up for you in that case. yeah, that's my story ;) | 19:40 |
fungi | given the keycloak server "upgrade" idea is wholesale replacement and reconfiguration, what do folks think about this quick-but-ugly process: 1. put keycloak01 in the emergency disable list, 2. merge 907141, 3. boot a keycloak02, 4. add to inventory and dns switching the keycloak cname at the same time, 5. log in as generic service admin and create the zuul realm and accounts, 6. let our | 19:43 |
fungi | sysadmins set their credentials and associate any identities again | 19:43 |
fungi | i can shoot for doing all that within a day, so that anyone who does use it with some regularity doesn't have to go without for too long | 19:44 |
tonyb | I'm fine with that, I'm a very light user. | 19:46 |
fungi | it would of course be possible to have a more atomic cut-over, or even to try to export/import our configuration, but all of those also imply more research, testing and hand-holding which add far more effort to the task, so i'm trying to be pragmatic | 19:48 |
fungi | if we were already using it for something like gerrit logins, the cost/benefit analysis might tip the other direction | 19:49 |
* tonyb steps away for a couple of hours | 20:00 | |
fungi | 23.253.164.201 is the new held keycloak99 test node if anyone wants to poke at it | 20:12 |
fungi | the only other thing i'm tempted to work in is that keycloak 24.0.0 is due out in a week or so. we might want to consider targeting that version instead so we don't immediately start out a major version behind | 20:15 |
fungi | there's a "nightly" container tag which probably comes close to what will end up in the official 24.0 images | 20:17 |
clarkb | that seems reasonable but also I think upgrades are supposed to be straightforward | 20:23 |
clarkb | it is just the runtime framework swap that really broke things there | 20:23 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: DNM: Try future Keycloak 24.0 https://review.opendev.org/c/opendev/system-config/+/907253 | 20:24 |
fungi | https://github.com/keycloak/keycloak/milestone/35 technically says "Due by February 29, 2024 ... 62% complete" so maybe we just plan to upgrade, but still it'll be good to see what things are looking like there | 20:26 |
fungi | looks like fido passkeys are supposed to work for v24 too | 20:28 |
fungi | in case anybody cares about that | 20:29 |
fungi | or maybe it'll get bumped to keycloak 25, it's still waiting for https://github.com/keycloak/keycloak/pull/24305 to be revised | 20:33 |
fungi | ah, there's basic support for them already in v23, just not for that bit | 20:34 |
fungi | they also added a high availability guide to their documentation recently, though i expect that would need an external db clustering approach similar to what we discussed for zuul | 20:38 |
fungi | swapping 23.0 for nightly also passed our tests | 20:49 |
fungi | so if nothing else, it probably means the upgrade won't be too onerous (basic service configuration remains functional) | 20:50 |
fungi | infra-root: rough plan outline for keycloak upgrade/replacement is https://etherpad.opendev.org/p/keycloak-refresh-2024 | 21:06 |
fungi | anything obvious missing? | 21:07 |
fungi | if we do steps 8 and 9 close together we can minimize the downtime as much as possible without creating extra work | 21:09 |
fungi | though now that i'm looking back over it, 6 and 7 might need local name resolution overrides on people's machines | 21:10 |
fungi | but not a huge complication | 21:10 |
fungi | also, anyone who doesn't care about it can personally choose to defer step 7 until after step 11 | 21:11 |
fungi | er, after step 9 i mean | 21:11 |
corvus | having multiple systems that would benefit from an ha dbms makes the idea of running a single large pxc cluster to serve zuul + gerrit + keycloak more attractive | 22:35 |
clarkb | note there isn't much benefit to gerrit for that after the notedb transition | 23:09 |
clarkb | the sql db in gerrit only tracks the "reviewed" flag next to files | 23:09 |
clarkb | and I think gerrit even functions if the sql db does not. You just don't get that data and might get some errors you can ignore | 23:09 |
clarkb | fungi: the keycloak etherpad lgtm | 23:22 |
fungi | thanks. if there's interest from enough folks in reviewing 907141 soon, then i can possibly get us up to step 6 there this week or early next | 23:27 |
fungi | depending on how the review shakes out | 23:27 |
fungi | and also prep/stage the remaining changes for the cut-over window | 23:28 |
clarkb | I'll rereview it when I'm done looking at this nodepool change | 23:28 |
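
Note on the 18:53 "503 Service Unavailable" discussion: the log suggests, but does not state outright, that the proxy target "localhost" resolved only to 127.0.0.1 while the container port had been tested with the [::1]:443 mapping syntax. The following is a minimal sketch, not taken from the actual system-config change, of how one might check which address families a name maps to in /etc/hosts-style text. The helper names `addresses_for` and `families` are hypothetical, as is the sample input, which simply mirrors the "127.0.0.1 localhost" line fungi quoted.

```python
import ipaddress


def addresses_for(hostname, hosts_text):
    """Return the IP addresses that hosts-file-style text maps to hostname."""
    found = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if hostname in names:
            found.append(ipaddress.ip_address(address))
    return found


def families(addresses):
    """Return the set of address families ("IPv4"/"IPv6") present."""
    return {f"IPv{address.version}" for address in addresses}


# Hypothetical sample mirroring the hosts entry quoted in the log.
sample = "127.0.0.1 localhost\n"
print(sorted(families(addresses_for("localhost", sample))))
```

If the result lacks "IPv6", a service published only on [::1] would presumably not be reachable through a proxy configured with "localhost"; using an explicit address on both sides avoids depending on the hosts file.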