Tuesday, 2024-01-02

00:38 <tonyb> 904151 Looks good to me. I have +2'd it and will +A tomorrow unless someone says stop
07:35 <frickler> oh, I somehow had in my mind we'd emergencied the container downgrade already, but indeed we're running 10.2.3. so let's see if that "patch version" downgrade is enough
08:02 <opendevreview> Merged opendev/system-config master: Temporarily pin Grafana to 10.2.2  https://review.opendev.org/c/opendev/system-config/+/904151
08:39 <frickler> seems the downgrade has fixed the nodepool dashboards, yay
08:51 <tonyb> okay awesome. I can play with an autohold to see if any of the identified issues are ours
08:54 *** mrunge_ is now known as mrunge
15:16 *** dhill is now known as Guest12635
16:34 <opendevreview> Clark Boylan proposed opendev/system-config master: Test gitea haproxy maxconn limits  https://review.opendev.org/c/opendev/system-config/+/904500
16:35 <clarkb> reminder we said no meeting today. Have a happy new year!
16:35 <fungi> same to you!
16:42 <opendevreview> Clark Boylan proposed opendev/system-config master: Upgrade to etherpad 1.9.5  https://review.opendev.org/c/opendev/system-config/+/904231
16:42 <opendevreview> Clark Boylan proposed opendev/system-config master: DNM force etherpad failure to hold node  https://review.opendev.org/c/opendev/system-config/+/840972
16:43 <clarkb> autohold is in place for ^
16:51 <clarkb> re the long queued job/ref in the periodic queue: if I had to guess one of our nodepool providers isn't functioning properly and we're not able to reject a node request from that job
16:54 <fungi> speaking of which, the linaro cloud api cert is expiring in a couple of weeks
16:56 <frickler> I think kevinz was taking care of those? doesn't seem to be around though
16:56 <frickler> also www.openstack.org getting old
16:57 <frickler> regarding periodic queue, the stuck job is a plain openstack-tox-py39
16:57 <clarkb> I did the last refresh of the linaro cloud cert
16:57 <clarkb> we basically have to run the acme refresh script and then the kolla redeploy. There was email about it
16:57 <frickler> I thought linode is what we did ourselves? or am I mixing this up?
16:58 <clarkb> there is no linode
16:59 <frickler> ah, inmotion is the other one
16:59 <clarkb> looks like the certcheck for linaro is working now. That is good (one of the issues before was we were pointing to an old url and it wasn't checking the correct item)
16:59 <fungi> linode has never been one of our cloud donors (i don't think they even use openstack, they used to be uml-based)
17:12 <clarkb> the cert I get for www.openstack.org expires in may
17:12 <clarkb> they must be serving different certs depending on region?
17:13 <clarkb> since it is being served by cloudflare
17:20 <fungi> is that the www.openstack.org cert or the openstack.org cert?
17:21 <clarkb> at least that is what certcheck is reporting expires in 23 days, but using openssl s_client from here I get a valid cert until may
17:23 <fungi> the www.o.o cert is handled by cloudflare, the o.o cert is letsencrypt
17:24 <clarkb> hrm that implies certcheck is hitting openstack.org but thinking it is looking at www.openstack.org?
17:24 <fungi> that's my suspicion, without checking the config
17:46 <fungi> looks like service-discuss just got spammed. investigating now
17:47 <frickler> "openssl s_client -connect openstack.org:443" gives a cert with "subject=CN = www.openstack.org", that's exactly what certcheck is reporting I'd say. and that has "NotAfter: Jan 25 02:17:49 2024 GMT"
17:49 <fungi> it also includes san for openstack.org, web1.openstack.org and web2.openstack.org which is why it's valid for openstack.org requests
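
Editor's sketch: the certcheck confusion above rests on two values read from the served certificate, its NotAfter date and its subjectAltName list. The Python below works through both using only the values quoted in the log. The function names, the reference date, and the assumption that www.openstack.org is itself listed as a SAN are illustrative; this is not how certcheck is implemented.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and an OpenSSL-style NotAfter string."""
    # ssl.cert_time_to_seconds parses "Mon DD HH:MM:SS YYYY GMT" into epoch seconds.
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def covers(hostname: str, san_dns_names) -> bool:
    """Exact-match lookup of a hostname in the SAN DNS names (no wildcard handling)."""
    return hostname.lower() in {name.lower() for name in san_dns_names}


# NotAfter value quoted in the log for the cert served at openstack.org:443.
NOT_AFTER = "Jan 25 02:17:49 2024 GMT"

# SAN entries named in the log; including www.openstack.org here is an assumption.
SAN = (
    "www.openstack.org",
    "openstack.org",
    "web1.openstack.org",
    "web2.openstack.org",
)

# Assumed reference time: start of the day of this log, in UTC.
start_of_day = datetime(2024, 1, 2, tzinfo=timezone.utc)

print(days_until_expiry(NOT_AFTER, start_of_day))
print(covers("openstack.org", SAN))
```

Because the certificate's subject is www.openstack.org while its SAN list also covers openstack.org, a check that connects to openstack.org can report the subject name of a different certificate than the one Cloudflare serves for www.openstack.org.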
17:51 <fungi> i've deleted the spam post from the service-discuss archive and switched that user's queue processing to automatically discard future posts
17:52 <fungi> note that this spam was sent through the hyperkitty webui, rather than via smtp
17:53 <clarkb> is it possible to force web post user through the initial moderation but not email posters?
17:53 <clarkb> seems like the bulk of the trouble is via web and not smtp so if we can apply the small barrier there that might be a good compromise
17:56 <fungi> there's no separation between smtp and web injection, so it has to be set for everyone
17:59 <fungi> the main reason i think we get most of it coming via web is that the hyperkitty workflow is very convenient for new users: click "reply" and you'll be automatically subscribed to the list with delivery initially disabled (until you validate your e-mail address). posting via smtp requires explicit subscription and validation steps first before your post will be accepted
18:00 <fungi> one thing which might help (coming in the next mailman version i think?) is captcha support
18:09 <clarkb> 104.130.253.220 is the held etherpad. I'm using the clarkb-test etherpad. It seems to work for me in ff just fine. Chrome is having connection resets. I want to say we saw this before and the assumption was it was related to cookie mismatches between the actual server and the dev one
18:10 <clarkb> oh actually pulling up chrome dev tools it is because the socket.io connection is failing to validate the cert even though I told it to accept the cert. I guess in chrome they don't carry that over to websocket connections
18:12 <opendevreview> Clark Boylan proposed opendev/system-config master: Test gitea haproxy maxconn limits  https://review.opendev.org/c/opendev/system-config/+/904500
18:12 <clarkb> the first ps of ^ used the latest haproxy image and the test failed which is what I expected. I've updated to undo the image update and go back to lts which should pass and make the change mergeable
18:12 <clarkb> re etherpad I think the update lgtm. I'll let others check but we can probably merge that soon
18:13 <fungi> working for me from firefox too
18:16 <clarkb> the gitea upgrade is one that we should look at soon too, but I'm thinking today may not be best for that from me. Just because I may not fully be back yet
19:34 <clarkb> https://review.opendev.org/c/opendev/system-config/+/904500 passes against the lts haproxy image. I think that means this is a good test for this problem
19:34 <clarkb> and since it is a test only update it should be safe to land whenever reviewers are happy with it
19:52 <frickler> https://github.com/haproxy/haproxy/issues/2395 is closed now, but it seems we need a 2.9.2 tag for the fix to show up in the docker image
19:52 <clarkb> yes they seem to be backporting fixes to the 2.9 branch but no new release has been made yet
19:53 <clarkb> once new images are ready we can confirm it fixes our problem using this test and an image update change
19:55 <clarkb> it isn't clear to me how their branching and tagging work though. They say they backported to 2.9 but then there is no 2.9 branch. There is also no 2.9.1 tag
19:55 <frickler> there is, but not on github
19:56 <frickler> they only mirror the master branch from https://git.haproxy.org/
19:56 <clarkb> oh I see

