Thursday, 2023-11-02

fazledynHi everyone. Glad to be here. 11:29
fazledynI had a query regarding a commit made on 11:30
fazledynCan anyone kindly point me to whom I should contact?11:31
jrosserfazledyn: you can try in #openstack-ironic11:39
*** eandersson0 is now known as eandersson11:50
opendevreviewMerged opendev/irc-meetings master: Edit QA team meeting info
fazledynI'm curious. Where does the meeting take place?13:15
*** dhill is now known as Guest554713:16
fazledynAre there any open source alternatives to Zoom ?13:16
fricklerfazledyn: if you refer to the QA team meeting, it is taking place in IRC, in the #openstack-qa channel, like most team meetings do13:26
fricklerfazledyn: regarding audio/video meetings we have our own server,, running jitsi with etherpad associated for collaborative editing of a document13:26
fazledynthanks for the info! @frickler13:40
kevkoHi, any advice please ?  unable to download image from mirror ...but it is normally working locally :/ 14:41
SvenKieskeyeah I also just checked it, works fine for me locally (using the opendev mirror), the culprit seems to be:14:45
SvenKieskedocker pull
SvenKieskeor in that case podman pull I guess14:46
fungii can look whether there's any issues with the proxy in inmotion14:46
fungithis is the same thing you pinged about in #openstack-kolla a few minutes ago?14:47
SvenKieskeyeah, sorry :) I'll also try to replicate this locally for podman14:47
SvenKieskemaybe it _is_ podman related14:47
SvenKieskethinking about it: kevko: do we setup any proxy env var or something to reach that mirror?14:48
fungidoesn't look like there are any local filesystem issues on mirror.iad3.inmotion at least (plenty of space, no errors in dmesg)14:48
SvenKieskemhm, having trouble setting up podman in a local docker container..I hate nested container stuff with different engines :D14:50
fungifwiw, browsing to does show master-debian-bookworm available, but as i said it's just proxying all that from dockerhub anyway14:51
fungiinfra-root: i've un-wip'd for the mailman upgrade, and am getting ready to approve it at the top of the hour (best case it'll still be after the announced 15:30 utc when the upgrade actually occurs)14:54
SvenKieskeoh another mailman upgrade?14:54
SvenKieskekevko: did you test locally with podman or docker? maybe it was just a random network error?14:55
SvenKieskefungi: ah thanks for the pointer, I really should subscribe to service-announce I guess14:56
fungiit's very low-traffic, one or two messages a month14:56
fungii've approved 899300 now and will keep tabs on the eventual deployment jobs15:00
fungii'll #status log once the upgrade is completed15:01
clarkbfungi: I'm here waiting for the gerrit community meeting to happen but maybe it will occur at 9am and respect the EU DST time change15:03
clarkbor maybe it won't happen at all. Either way I'm around15:04
clarkbSvenKieske: Gerrit will be upgraded November 17. Another announcement you would've received :)15:05
SvenKieskeclarkb: thx, upgrades on a friday, you are certainly fearless ;)15:06
clarkbits well tested and we try to do gerrit upgrades during quiet times. I expect that time to be quiet in particular as a large US holiday week starts that weekend15:07
fungizuul seems to think 899300 is going to merge around 15:47-ish15:25
fungihopefully the deploy builds will beat the top of the hour when our periodic deploys start15:26
fungidepends on how long promote takes, i guess15:26
clarkbfungi: we do promote in deploy so should be good15:36
fungioh, right15:37
fungiis all this traceback spam expected?
fungiModuleNotFoundError: No module named 'tzdata'15:41
fungiseems to be maybe something with ara, i guess?15:42
clarkbyes its an ara thing15:43
clarkbit was there when I debugged the ruamel.yaml thing but didn't appear to be fatal so I focused on the problematic issue15:44
opendevreviewMerged opendev/system-config master: Upgrade to latest Mailman 3 releases
fungiand it's in deploy now15:51
clarkbhasn't touched the running containers yet15:53
fungiit's pulling images now15:53
fungiand now it's restarting the containers onto the new images15:55
clarkbI'm still getting 503s from apache but docker does report the containers are up15:57
clarkbI seem to recall running into this last time as startup is not quick?15:57
fungiyeah, it takes some time15:58
fungithere it goes16:01
fungi reports "Postorius Version 1.3.10" at the bottom now16:01
fungi reports "Powered by HyperKitty version 1.3.8." at the bottom now16:01
fungithose match the updated state in
clarkbfungi: some minor things in the web logs:
clarkbfungi: also do we write logs to logfiles or is it all through the docker logging system? I don't remember and can't find any specific log files after a quick look16:04
fungiclarkb: in /var/lib/mailman/core/var/logs/16:05
clarkbalso looks like /var/lib/mailman/web-data/logs for the web server16:06
fungialso /var/lib/mailman/web-data/logs/16:06
fungii notice some old elements of the webui are cached, force refreshing pages gets everything loaded though16:07
fungistatus log Completed upgrade of mailing list sites to Mailman 3.3.9 as announced in
fungithat look good?16:09
fungi#status log Completed upgrade of mailing list sites to Mailman 3.3.9 as announced in
opendevstatusfungi: finished logging16:09
fungiSvenKieske: is probably the other stream worth watching, in addition to the announce ml, if you aren't already16:10
fungipcheli: i checked back in on your ci system's gerrit account, and it has 11 ssh connections open at the moment. looks like they're accumulating slowly, so if you can't figure out the cause you're probably going to end up blocked again at some point weeks down the road16:12
fungiclarkb: any feel for when we should merge the etherpad changes? would you want to do the config refresh and the version upgrade on separate days?16:14
pchelifungi: thanks for this. We'll check our infra again. Still I see only 1 connection from ci server.16:15
JayFThis would strongly imply a badly behaving intermediate firewall or proxy pcheli 16:16
clarkbfungi: I don't think they need to be separate days. Just separated by enough time to have a good idea if one created issues vs another16:16
clarkbfungi: maybe we go ahead with the config update now. Restart on that and an hour later deploy the upgrade?16:17
clarkbI'm in the gerrit meeting now though so a bit distracted16:17
SvenKieskefungi: ah will use my private mastodon account for that I guess16:19
fungiclarkb: no problem. i can go ahead and approve the config change and keep an eye on it. probably won't deploy until the call ends anyway16:19
SvenKieskecouldn't those two information channels be merged?16:20
clarkbfungi: ya and I can't remember if we auto restart etherpad. We may not so may require us to manually restart anyway16:20
fungiJayF: agreed, that was my initial suspicion as well. i've seen it a lot with firewalls that assume idle ssh connections can just be silently dropped from the state table without doing a courtesy rst or fin to close them. sometimes turning on ssh keepalives helps16:21
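The keepalive suggestion above can be sketched as follows. This assumes the CI system connects with the OpenSSH client and reads an ssh_config file, which the log does not confirm. ServerAliveInterval and ServerAliveCountMax are standard OpenSSH client options; the helper name, the host gerrit.example.org, and the chosen values are hypothetical.

```python
def ssh_keepalive_stanza(host: str, interval: int = 30, count_max: int = 3) -> str:
    """Render an ssh_config Host block that enables client-side keepalives.

    With these options the client sends a keepalive through the encrypted
    channel every `interval` seconds, so a stateful firewall keeps seeing
    traffic, and the client gives up after `count_max` unanswered probes
    instead of leaving a dead connection open.
    """
    if interval <= 0 or count_max <= 0:
        raise ValueError("interval and count_max must be positive")
    return (
        f"Host {host}\n"
        f"    ServerAliveInterval {interval}\n"
        f"    ServerAliveCountMax {count_max}\n"
    )


# Hypothetical host name, for illustration only.
print(ssh_keepalive_stanza("gerrit.example.org"))
```

Whether this helps depends on where the connections are being dropped; it only addresses the idle-timeout case described above.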
fungiSvenKieske: maybe, though we use the announcements list as a very low-volume way of broadcasting what's coming up, while we use the statusbot to do more immediate notifications about things that are occurring or have occurred16:22
fungistatusbot has multiple urgency levels (log, notice, alert) we use to decide how widely to broadcast things (log only goes to mastodon/wiki page, notice also notifies irc channels/matrix rooms, alert temporarily changes irc channel topics)16:24
fungiand now that i think about it, i don't believe we got matrix support added to statusbot yet, so just irc not matrix at the moment16:24
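The urgency levels described above could be modelled roughly like this. It is a sketch of the described behaviour only, not statusbot's actual code: the destination names are made up, and it assumes each level adds to the one below it. Matrix is left out because, per the correction above, it is not supported yet.

```python
# Hypothetical fan-out table; names are illustrative, not statusbot's own.
DESTINATIONS = {
    "log": ["mastodon", "wiki"],
    "notice": ["mastodon", "wiki", "irc_channels"],
    "alert": ["mastodon", "wiki", "irc_channels", "irc_topics"],
}


def destinations_for(level: str) -> list[str]:
    """Return where a status message of the given urgency would be sent."""
    try:
        # Copy so callers cannot mutate the shared table.
        return list(DESTINATIONS[level])
    except KeyError:
        raise ValueError(f"unknown urgency level: {level!r}") from None


for level in ("log", "notice", "alert"):
    print(level, "->", ", ".join(destinations_for(level)))
```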
kevkoSvenKieske sorry, I fell asleep 💤 😂16:36
clarkbI wouldn't want to emit statusbot notices to service-announce but we could possibly go in the other direction16:36
clarkbbut also I'm unlikely to implement that as I am not a mastodon user (and never used twitter either)16:37
opendevreviewMerged opendev/system-config master: Update Etherpad settings from upstream
fungideploy job is pulling new etherpad image now16:53
fungijob claims it downed/upped the container16:54
fungiyeah, nodejs process has a start time of about a minute ago16:55
clarkb appears to still be there for me16:55
clarkbnothing looks obviously different16:56
fungisomehow got reverted to a new pad16:56
fricklershowing proper content for me16:57
* fungi is an idiot16:57
clarkbI see content there too16:57
clarkbyou still have your /etc/hosts override in place?16:57
fungii did. now it's fine16:57
fungipanic subsiding16:57
fungilgtm now that i'm connecting to the actual server16:58
clarkbcool. Honestly seeing that existing pads are still there (db connection is good) and web ui looks the same (explicit defaults didn't change something somehow) makes me happy with the config update16:59
clarkbhappy to proceed to the version update whenever others are happy too16:59
fungisure, i can approve that one next. there's still time for other infra-root to object with a -2 before it merges17:03
fungibut we did test the held node with the new version fairly extensively yesterday17:03
clarkbI also confirmed the config file updated on disk on the server17:04
clarkband appears to have done so before the services were restarted based on timestamps so ya all good from my end17:04
clarkbunrelated to all this Gerrit 3.9 should release day after thanksgiving so November 2417:05
fungiinfra-root: i've approved to upgrade from etherpad 1.9.2 to 1.9.4, please -2 to prevent it merging if you want more time to test it out17:05
kevkofungi: regarding where can I check how is configured ? 17:05
clarkbkevko: opendev/system-config/playbooks/roles/mirror/templates/mirror.vhost.j217:06
kevkothanks 17:06
clarkbkevko: you can drop the -int from the fqdn to access it externally too17:06
clarkbwe found that network performance was significantly better within the cloud over the internal networks so point ci nodes at the internal network interfaces but expose it publicly too17:06
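The "drop the -int" rule above is a simple string transformation. A minimal sketch, using mirror-int.dfw.rax.opendev.org (a hostname mentioned later in this log) as the example; the function name is hypothetical.

```python
def external_mirror_fqdn(fqdn: str) -> str:
    """Turn an internal mirror hostname into its externally reachable one.

    Only the first DNS label is touched: a trailing "-int" is removed from
    it and the rest of the name is kept unchanged.
    """
    host, sep, domain = fqdn.partition(".")
    if host.endswith("-int"):
        host = host[: -len("-int")]
    return host + sep + domain


print(external_mirror_fqdn("mirror-int.dfw.rax.opendev.org"))
```

This is only useful for testing from outside the cloud; job nodes are still pointed at the internal name.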
fungikevko: specifically,
kevkocan i check apache log somewhere ? 17:12
fungikevko: no, but i can check them for you. what's the client ip address of the test node and the approximate time of the request?17:13
fungilooks like the "primary" node was probably connecting with its private address17:16
kevkoit is this zuul run
kevkothis one ?
fungithe only connections i see logged from that address were at 09:07:28 utc, so probably from another build17:19
kevkothat's weird 17:20
fungii do see some connections to the 443 mirror port17:22
fungilooks like it was successfully downloading rockylinux packages from the mirror17:22
kevkoi also download successfully what is failing in a log :D 17:24
kevkowhy 443 and not 444717:25
clarkb<> Failed to connect to the host via ssh <- is interesting17:25
clarkbbut maybe that is a quirk of ansible logging.17:25
fungiyeah, i checked the logs on a whim but there's never any connection from
clarkbkevko: 443 hosts things that allow us to change the root path. Docker in its infinite wisdom decided to not let that be possible so it gets a dedicated port17:26
fungi10.209.33.135 does connect to 443 to fetch distro packages17:27
kevkoyou probably can't give an access for a while to check what is happening, can you ? 17:28
clarkbkevko: looks like you are using some ansible module to do the pulls?17:28
clarkbkevko: I would look at the source for that module and determine if that error message can be triggered by other failures17:29
clarkbfor example the one above17:29
kevkoclarkb: yes, and it is not a problem for docker pull, it's a problem for podman pull17:29
clarkbI think this is failing at an ansible level and never talking to the mirror17:29
fungii've got about 400 lines of access logs from that build hitting the mirror's 443/tcp address. i can probably break it up into 4 pastes17:30
clarkbI don't know that that will help17:31
clarkba random ansible module with who knows what implementation is returning an error and it never talks to the web server17:31
clarkbthe logs we need are on the ansible side17:32
clarkb(this is why I'm so often in favor of running commands directly rather than using modules, they are so obtuse when they break)17:32
fungiactually, looks like it was fetching python packages through the proxied connection to pypi17:32
clarkband github won't let me search their repos without signing in17:33
kevkoclarkb: fungi: let me try it to pull with bash 17:33
fungiso we have proof that the test node did connect to the mirror and fetch things through a caching proxy there, just not the caching proxy for quay17:33
opendevreviewMerged opendev/system-config master: Upgrade Etherpad to 1.9.4
fungideploy job is pulling the etherpad 1.9.4 image now17:36
clarkb this is where the error comes from17:36
funginodejs process start time now 17:3617:37
fungii'm able to pull up ptg etherpads still17:37
clarkbI can pull up other etherpads I've been using like the gerrit 3.8 etherpad17:37
fungi#status log The Etherpad service on has been upgraded to the 1.9.4 release17:38
opendevstatusfungi: finished logging17:38
fungikevko: looking for some of those container identifiers, i do see other test nodes fetching things with similar names through that proxy17:41
clarkb seems to imply it only returns {} if it got a 404. But our server logs indicate it never even made a request so how can it be a 404?17:41
fungi10.209.129.178 - - [2023-11-02 17:02:20.638] "HEAD /v2/openstack.kolla/neutron-l3-agent/manifests/master-rocky-9 HTTP/1.1" 200 - - "-" "docker/24.0.7 go/go1.20.10 git-commit/311b9ff kernel/5.14.0-284.30.1.el9_2.x86_64 os/linux arch/amd64 UpstreamClient(docker-sdk-python/6.1.3)"17:42
clarkbunless there is no exception and returns {}17:42
fungidifferent test node of course17:42
kevkoas i said ..this is not docker pull but podman pull17:42
kevkocan you check agent = .*podman.*17:42
kevkoinstead of docker 17:42
fungino references to "podman" at all in the log17:43
kevkoand that's weird :P 17:43
fungiwell, not if it's never actually connecting17:43
fungii basically don't see any indication from the mirror server/proxy side of things that podman ever attempted to request anything through 4447/tcp17:44
kevkonah, ubuntu don't have mod_macro packaged ? :( 17:44
kevkofungi: cool, thanks, it's also helpful information 17:44
fungihowever i do see evidence that the same build was successfully reaching the mirror/proxy to pull things from pypi, so connectivity itself seems to have been fine17:45
kevkofungi: we will see in few minutes 17:45
fungialso looks like docker clients are pulling through that proxy successfully in other jobs, so the proxy seems okay17:46
kevkonow job is running ..i've bypassed ansible pull with ssh primary "sudo podman pull same_address_as_from_log"17:46
kevkopodman is specific somehow :P 17:46
clarkbkevko: returns a List[Image] type here List doesn't have .attrs17:47
clarkbI don't think that is the issue, but potentially is another issue?17:47
clarkbwould occur if you request all tags17:48
kevkomaybe ..thanks .i will check 17:52
clarkbidea time: maybe it is requesting these resources over http and not https. TLS setup fails and we never get far enough to record the action in the vhost. There may be logs in the general server error log for this?17:53
clarkbkevko: fungi  ^17:53
clarkbif so we don't seem to record it in the logs I can see17:55
fungiyeah, that would never reach apache's logs17:55
fungiwe could set up special iptables connection logging17:56
fungibut that's probably overkill17:56
kevkohmm, http vs https can be issue17:57
clarkbits also making the request to a podman socket which presumably has some running daemon like docker does which makes the actual request18:01
clarkbAt least I think that is how this is setup. If so I would look for logs from whatever has opened that socket file18:01
SvenKieskeclarkb: fungi: could you maybe assist in debugging a login issue in gerrit? mjk: is new to the openstack contributor party :)18:05
SvenKieskehe's getting a 404 redirect error upon login, which I can't replicate, just logged in again and it works18:06
SvenKieskethe gerrit upgrade has not begun earlier, has it? :D18:06
mjkhey team! let me know what info you need from me 18:06
SvenKieskeat least I hope someone in here can help from the gerrit side, the launchpad stuff is somewhere in canonicals hands18:07
clarkbkevko: I would try setting to true. We have valid certs on our registries so that should rule out https vs http as the issue (or fix it)18:07
SvenKieskemjk: can you link me to your launchpad account? maybe something is off with that?18:07
SvenKieskekevko: rereading the thread tomorrow I guess18:08
clarkbmjk: can you recount the steps you've taken and where it breaks?18:08
mjkyup, I hit sign in on review.opendev, it punts me to login.ubuntu, click yes login me in, get a 404 and this is the landing URL,SIGN_IN,Contact+site+administrator 18:09
fungithat can indicate the account was previously retired, or is a duplicate of another existing account, i think18:10
SvenKieskeweird, have you per chance any browser extensions that do interfere with redirects? I personally use ublock origin and firefox containers but have no issue with them18:10
clarkbthanks. I've found the error in the logs. You are logging in with a new launchpad account that has the same email address as an existing launchpad and gerrit account. Gerrit won't let multiple accounts have the same email address so it errors18:10
clarkbfungi: yup its exactly that18:10
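The failure mode described above (a new OpenID arriving with an email address that already belongs to another account) can be illustrated with a toy model. This is not Gerrit's implementation; the data structure, function, and exception names are hypothetical and only mirror the rule as stated in the log.

```python
class EmailConflict(Exception):
    """Raised when a new identity reuses an email owned by another account."""


def login(accounts: dict[str, str], openid: str, email: str) -> str:
    """Toy login flow; `accounts` maps an OpenID to that account's email.

    A known OpenID signs in to its existing account. An unknown OpenID
    creates a new account, unless its email is already in use, in which
    case the login is rejected.
    """
    if openid in accounts:
        return "existing"
    if email in accounts.values():
        raise EmailConflict(f"{email} already belongs to another account")
    accounts[openid] = email
    return "created"


accounts = {"old-openid": "dev@example.org"}
try:
    login(accounts, "new-openid", "dev@example.org")
except EmailConflict as exc:
    print("login rejected:", exc)
```

In this model the three options discussed below map to: log in with the old OpenID, remove the old entry first, or log in with a different email.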
kevkoclarkb: fungi: do you know what ? :D i've commented the pull command with ssh primary "sudo podman pull something " ...and now it runs suspiciously long (which means that actually ONLY pull probably not working ..but your proxy is OK ) :D :D :D 18:10
SvenKieskeyou guys are just awesome :) I would never have thought about that :D mjk: did you maybe register already in the past?18:11
clarkbmjk: the options here are for you to login using your old launchpad account and use the existing gerrit account.18:11
clarkbor we disable the old account and make it inactive and then let gerrit create the new account for you on login with the new launchpad credentials18:11
fungii'm stepping away for a moment to pick up dinner, but shouldn't be gone long (probably back in 15 minutes tops)18:11
mjkhrm, I will have to figure out the old account 18:12
mjkI will pipe up if I can't figure it out 18:12
clarkbI guess technically there is a third option. Use a different email address and let it create a new account for you18:13
mjkwill use that as plan B if needed :) 18:13
clarkbgive me a few and I'll page back in the admin steps to inspect the old account in gerrit18:16
clarkbI think I have to promote my user then use the rest api because ssh doesn't cut it for this18:16
clarkbthat has given me a little bit of extra info. First this email address was the only one ever used. It wasn't a secondary email. Next the account is from 2016. And finally the openid associated with the old account gives me a 404 when I fetch it indicating the openid isn't part of ubuntu one anymore. I don't know why this is the case18:21
SvenKieskeokay guys, thanks for helping out :) I guess I'm out for today. o/18:21
clarkbwhat we can do is remove the primary email address, and launchpad ids with that email address from the old account then disable the account entirely. Then you can login with the email address and get a new account. This assumes you cannot recover the old ubuntu/launchpad openid18:22
clarkbor the third option I mentioned: use another email address18:22
clarkbfwiw openstack-discuss has had a couple of messages since we upgraded mm318:36
clarkbmaybe even 4?18:37
fungiokay, back18:38
fungiand yeah, foundation@openinfra had a message too18:39
*** travisholton2 is now known as travisholton19:36
opendevreviewJulia Kreger proposed opendev/glean master: Fix glean docs build
opendevreviewJulia Kreger proposed opendev/glean master: Add support for tinycore linux
opendevreviewJulia Kreger proposed opendev/glean master: Fix glean docs build
clarkbstill no failures from the letsencrypt job. Maybe the issue we had is similar to the kolla one on the openstack list. Some bug hidden inside of ansible and upgrading got us beyond it21:15
kevkoclarkb fungi hmm, it looks like podman pull really not working (second thing is that python module is hiding real error  of course) , but podman pull failing
clarkbkevko: works from here21:43
opendevreviewJulia Kreger proposed opendev/glean master: Add support for tinycore linux
clarkbso the service is generally up. Could be a networking problem I suppose. Is podman not willing to route over the internal network maybe?21:43
kevkoclarkb: hmm, i've commented tests and write to file podman config and also raw podman pull
fungikevko: mirror-int won't be reachable externally if that's how you tested21:44
fungibut mirror (without the -int) should still be testable over the internet21:44
kevkofungi no, it's the same ENV I would say21:44
fungiGet \"\": dial tcp i/o timeout"21:45
clarkbI am able to get that url from nb01 which is in the same cloud as the mirror21:46
fungiwhere were you testing it from? did that run in one of our ci nodes in rax-dfw or somewhere else?21:46
kevkofungi: i was wondering if it can be a proxy problem ?
clarkbits from this job run
fungiah, okay21:46
clarkbkevko: it would be nice if you linked to the zuul pages. makes linking to log lines possible21:46
clarkbalso gives a lot more information about the change in question21:47
kevkoclarkb: fungi: but check patch1 for a real change ... latest patchset is just my debug for now :) 21:47
kevkoit's nothing with a change I would say 21:47
fungiif it was a proxy problem, i wonder why pip installing things wasn't affected21:48
fungianyway, if we have the client address for this one, i can check the apache mod_proxy logs on the mirror server there again21:48
clarkboh wait21:49
clarkbyou hardcoded the mirror address you cannot do that21:49
kevkoI don't know, sometimes I have a feeling that podman is a little different from docker for example 21:49
clarkbthat job ran in ovh so ya it cannot reach mirror-int.dfw.rax21:49
clarkbso your test using podman is invalid21:49
fungithose mirror-int addresses will only be available from job nodes inside the same cloud providers 21:49
fungiso only nodes booted in rax-dfw will be able to connect to mirror-int.dfw.rax.opendev.org21:50
fungieach cloud provider has its own mirror hostnames/urls that job nodes in that provider use21:51
kevkofungi: well, i copied that address from python error log ...21:51
fungiright, but then when you ran the job again it ran in a different cloud provider/region21:51
kevkoah okay, understand 21:51
kevkocool, let me fix my test then21:51
clarkbkevko: zuul_site_mirror_fqdn this ansible var sets the fqdn. So you should be able to do something like https://{{ zuul_site_mirror_fqdn }}:4447/ as the mirror location21:55
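In a job this would normally be a Jinja template as shown above; the same construction in Python looks like the sketch below. The function name is hypothetical, and the port default of 4447 is taken from the discussion earlier in this log. The point is that the hostname comes from the per-provider variable rather than being hardcoded.

```python
def registry_proxy_url(mirror_fqdn: str, port: int = 4447) -> str:
    """Build the registry proxy URL for whichever mirror the job was given.

    `mirror_fqdn` stands in for the value of zuul_site_mirror_fqdn, which
    differs per cloud provider/region, so it must not be hardcoded.
    """
    if not mirror_fqdn or "/" in mirror_fqdn:
        raise ValueError("mirror_fqdn must be a bare hostname")
    return f"https://{mirror_fqdn}:{port}/"


# Example value only; a real job receives its own provider's hostname.
print(registry_proxy_url("mirror-int.dfw.rax.opendev.org"))
```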
kevkoclarkb: thank you 21:56
clarkbkevko: did you try setting verifyTls to true though?21:58
clarkbverify tls is the default when running the podman command so it used https (which we saw in that last log file). But the way the ansible module is written it defaults to false there and it was false when it failed so maybe it tried to do http instead?21:59
kevkoclarkb: no, didn't try yet ..22:02
opendevreviewMerged opendev/glean master: Fix glean docs build
