opendevreview | Merged opendev/zone-opendev.org master: Add openmetal.us-east record https://review.opendev.org/c/opendev/zone-opendev.org/+/921480 | 05:03 |
opendevreview | Jens Harbott proposed opendev/zone-opendev.org master: Remove old meetpad and jvb servers https://review.opendev.org/c/opendev/zone-opendev.org/+/920761 | 05:27 |
opendevreview | Jens Harbott proposed opendev/zone-opendev.org master: Drop records for inmotion mirror https://review.opendev.org/c/opendev/zone-opendev.org/+/921489 | 05:27 |
opendevreview | Merged opendev/zone-opendev.org master: Remove old meetpad and jvb servers https://review.opendev.org/c/opendev/zone-opendev.org/+/920761 | 05:35 |
opendevreview | Jens Harbott proposed zuul/zuul-jobs master: Drop outdated testing platforms https://review.opendev.org/c/zuul/zuul-jobs/+/921501 | 09:45 |
frickler | ^^ should hopefully fix the gate failures on zuul-jobs. I opted to drop bionic jobs at the same time, but feel free to amend if you'd rather split that. I also noticed that https://review.opendev.org/c/zuul/zuul-jobs/+/891221 is still open, had to delete the podman jobs by hand | 09:50 |
opendevreview | Jens Harbott proposed zuul/zuul-jobs master: Auto-generate ensure-podman jobs https://review.opendev.org/c/zuul/zuul-jobs/+/891221 | 10:36 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Update ansible versions used in unittesting https://review.opendev.org/c/zuul/zuul-jobs/+/920857 | 10:40 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Add nox and tox py312 jobs https://review.opendev.org/c/zuul/zuul-jobs/+/920841 | 10:40 |
opendevreview | Tony Breeds proposed opendev/system-config master: Add an opendev specific build of mediawiki https://review.opendev.org/c/opendev/system-config/+/921321 | 13:17 |
opendevreview | Tony Breeds proposed opendev/system-config master: DNM: Initial dump or mediawiki role and config https://review.opendev.org/c/opendev/system-config/+/921322 | 13:17 |
opendevreview | Merged zuul/zuul-jobs master: Drop outdated testing platforms https://review.opendev.org/c/zuul/zuul-jobs/+/921501 | 14:36 |
opendevreview | Merged zuul/zuul-jobs master: Update ansible versions used in unittesting https://review.opendev.org/c/zuul/zuul-jobs/+/920857 | 14:43 |
opendevreview | Merged zuul/zuul-jobs master: Add nox and tox py312 jobs https://review.opendev.org/c/zuul/zuul-jobs/+/920841 | 14:43 |
clarkb | today should still be good for me to do a quick gerrit restart if anyone else wants to review https://review.opendev.org/c/opendev/system-config/+/920939 and child | 15:06 |
fungi | yep, looking now | 15:08 |
fungi | clarkb: all those (including 920938, which you approved yesterday) look fine to me, but don't appear to alter what we're putting in the image. is the restart just to be hygienic and make sure that the refreshed image that gets built still works normally? i guess some plugins we're pulling in from branches might get new commits etc? | 15:20 |
clarkb | fungi: yes exactly. We will build and promote a new 3.9 image alongside the new 3.10 stuff. That new 3.9 image will include whatever commits have landed in things we pull from stable branches since the last image build | 15:21 |
clarkb | fungi: rather than be surprised at some point in the future when we are forced to do a restart and anything goes sideways, I prefer to plan gerrit restarts for when we know the image has updated, so that we can keep on top of unexpected behaviors | 15:21 |
fungi | cool, makes sense. i'll single-core approve 921060 too since the prior +2 was lost in a rebase | 15:22 |
clarkb | we could change over to using tags only, but then we will end up doing more churn for every tag update. It's a tradeoff I guess. Also I think this gets us bugfixes from stable more quickly, which can be nice if upstream is slow making a release | 15:23 |
fungi | yeah, seems fine | 15:24 |
clarkb | frickler: re no AAAA record for the openmetal cloud they did confirm they don't have ipv6 support yet | 15:27 |
frickler | sad. I could possibly sell them consulting to implement that, though ;-) | 15:30 |
opendevreview | Merged openstack/project-config master: Add wandertracks repos https://review.opendev.org/c/openstack/project-config/+/920616 | 15:31 |
fungi | i approved a post through moderation for openstack-discuss just now and the message seems to have been submitted to exim almost immediately, so i think the extra threads have helped | 15:52 |
frickler | from what I read in the thread that was cited in the review, the main effect should be seen when multiple messages are received in parallel? | 16:03 |
fungi | yeah, i guess the next time i simultaneously approve several messages i'll check that too | 16:04 |
clarkb | yes, and I think that these runners are shared by all the vhosts. So previously we had one doing it for all lists and now we have 4, which means there should be less contention, assuming that was part of the problem before | 16:06 |
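(For context on the runner change described above, a minimal sketch of what that kind of mailman.cfg tuning could look like, assuming the outgoing runner count is controlled by an `instances` key under a `[runner.out]` section; the actual opendev template may differ.)

```ini
# Sketch only, not the deployed config: run four parallel outgoing runners
# instead of the default single instance, so delivery for one list/vhost
# doesn't serialize behind another. Runner instance counts are expected to
# be a power of two because the queue is hash-sliced across instances.
[runner.out]
instances: 4
```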
fungi | random data point: current subscriber count for openstack-discuss is 1991 addresses, 535 of which end in gmail.com, and 207 end in redhat.com | 16:10 |
opendevreview | Merged opendev/system-config master: Add Gerrit 3.10 image builds and testing https://review.opendev.org/c/opendev/system-config/+/920939 | 16:14 |
opendevreview | Merged opendev/system-config master: Add Gerrit 3.10 upgrade testing https://review.opendev.org/c/opendev/system-config/+/921060 | 16:14 |
clarkb | once the deploy jobs for ^ are done (they will promote images) we should check that the wandertracks repos were created properly, then we can proceed to restart | 16:15 |
clarkb | fungi: one thing that came up last friday during the upgrade was that we should give people some warning so they can save review comments and finish up thoughts. So maybe we do a #status notice with a 15 minute warning | 16:16 |
clarkb | the deploy jobs that ran were all successful. | 16:29 |
clarkb | I see wandertracks repos in gitea with .gitreview files implying the gerrit project creation was successful: https://opendev.org/wandertracks/wandertracks/src/branch/master/.gitreview | 16:29 |
clarkb | should we do a #status notice now and then restart in about 15 minutes? | 16:30 |
fungi | clarkb: i'm about to step out to run some quick errands. maybe we give a one-hour warning? | 16:40 |
fungi | i'll be back in plenty of time for that | 16:40 |
clarkb | sure something like #status notice Gerrit will be restarted at aroung 17:45 UTC to pick up some small image updates | 16:43 |
clarkb | fungi: ^ that sound good to you? | 16:43 |
clarkb | I'll fix the "aroung" typo | 16:43 |
fungi | perfect! | 16:43 |
* fungi will bbiab | 16:43 | |
clarkb | #status notice Gerrit will be restarted at around 17:45 UTC to pick up some small image updates | 16:44 |
opendevstatus | clarkb: sending notice | 16:44 |
-opendevstatus- NOTICE: Gerrit will be restarted at around 17:45 UTC to pick up some small image updates | 16:44 | |
opendevstatus | clarkb: finished sending notice | 16:46 |
Guest8493 | clarkb: if you get bored (haha) feel like adding me to wandertracks-core and waterwanders-core? | 16:59 |
clarkb | ya give me a minute to load keys | 17:02 |
clarkb | Guest8493: should be done now | 17:05 |
Guest8493 | thank you sir!!! | 17:05 |
clarkb | opendevorg/gerrit 3.9 afd543ba48a1 <- this is the image we are currently running | 17:23 |
clarkb | I've started a root screen on review02 | 17:24 |
clarkb | https://hub.docker.com/layers/opendevorg/gerrit/3.9/images/sha256-f3744db8d83d775063cfb83ae9d2ac1148f10afc968d0e4469ab6c5f770c56e3?context=explore is the image we expect to update to, which we'll check after doing the pull | 17:25 |
clarkb | new image is pulled and matches ^ | 17:38 |
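(A hedged sketch of the verification step above; the image name and expected digest come from the links in the discussion, everything else is an assumption.)

```shell
# Show the locally pulled image ID and its repo digest(s); the sha256 should
# match the Docker Hub layer page linked above.
docker image inspect opendevorg/gerrit:3.9 --format '{{.Id}} {{.RepoDigests}}'
# expected digest (from the hub.docker.com link):
# sha256:f3744db8d83d775063cfb83ae9d2ac1148f10afc968d0e4469ab6c5f770c56e3
```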
fungi | okay, i'm back | 17:38 |
clarkb | the process at this point should be a docker-compose down, mv the waiting queue dir aside, docker-compose up -d | 17:39 |
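(Roughly, the sequence described above looks like the sketch below. The compose directory, gerrit site path, and replication queue location are assumptions based on typical opendev layout, not copied from the server.)

```shell
cd /etc/gerrit-compose                      # assumed compose directory
docker-compose down                         # stop gerrit
# set the replication plugin's persisted "waiting" task queue aside so stale
# tasks aren't replayed on startup (path is an assumption)
mv /home/gerrit2/review_site/data/replication/ref-updates/waiting \
   /home/gerrit2/review_site/data/replication/ref-updates/waiting.old
docker-compose up -d                        # start gerrit on the new image
docker-compose logs -f                      # watch startup until the web UI answers
```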
clarkb | I'm ready to run through that in ~5 minutes | 17:40 |
clarkb | it is 1745 now. I'm proceeding as planned | 17:45 |
fungi | thanks! | 17:45 |
clarkb | web ui is up, but slow (that's normal) | 17:46 |
clarkb | still waiting on diffs to be available (also normal) | 17:46 |
clarkb | the reported version in the error log and in the web ui is what I expected based on the test job logs | 17:46 |
fungi | Powered by Gerrit Code Review (3.9.5-13-gb4c6cb91d9-dirty) | 17:47 |
clarkb | diffs are working on at least one change for me now too. All of my initial checks are good. I think the only other major thing would be to check that changes can be pushed up | 17:47 |
clarkb | gmann just pushed a governance change so that checks out | 17:48 |
fungi | yeah, so far everything looks fine to me | 17:48 |
clarkb | https://review.opendev.org/c/openstack/governance/+/920848 | 17:48 |
fungi | gerritbot also reported it | 17:48 |
clarkb | the other thing on my backlog is https://review.opendev.org/c/opendev/system-config/+/921061 frickler how strongly did you feel about the suggested different approach? I'm mostly wanting to do the least-friction fix possible, but your suggestion might make a good followup? | 17:49 |
gmann | did I help something in testing :P | 17:51 |
clarkb | gmann: yes you helped confirm that we can still push files to gerrit after doing a container image update | 17:52 |
gmann | +1 | 17:53 |
*** Guest8598 is now known as dhill | 18:07 | |
*** dhill is now known as Guest8909 | 18:07 | |
clarkb | my recent email to openstack-discuss still took 8 minutes, so I don't think we have fixed that issue. Time to look at the verp stuff | 18:09 |
clarkb | I'm going to shutdown the screen on review02 now | 18:11 |
fungi | clarkb: looking at headers, exim got it from mailman within 10 seconds of mailman receiving your post, but exim didn't hand it off to my mta for a further 7m50s, so any tuning we need to do to speed things up is presumably on the exim side | 18:13 |
clarkb | ack | 18:14 |
fungi | looking at the exim mainlog for that message id: | 18:16 |
fungi | 2024-06-07 18:01:37 1sFduD-00FttR-Vv no immediate delivery: more than 10 messages received in one connection | 18:16 |
fungi | the next mention of that message id in /var/log/exim4/mainlog is at 18:09:24 | 18:16 |
fungi | er, queue id not message id | 18:17 |
fungi | but yeah, grepping the log for the queue id, it first arrives to the mta for handoff to mailman at 18:01:27 | 18:18 |
clarkb | which goes to what corvus said about increasing the number of deliveries per connection | 18:18 |
fungi | though as far as mailman splitting up what it hands back to exim, looks like message id 4c4746ad-012a-4c99-9ba4-3bc40119575c@app.fastmail.com ended up with 164 different corresponding outbound queue ids | 18:28 |
fungi | implying mailman may have split it up into that many copies with different recipient sets, i guess? | 18:29 |
clarkb | rather than one connection to send 10 then another connection to do 10 and so on? | 18:30 |
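(The log digging above can be reproduced with something like the sketch below, assuming Debian's default exim4 mainlog layout of date, time, queue id as the first three fields; the ids are the ones quoted in the discussion.)

```shell
# Trace one message by exim queue id: <= arrival, => first recipient of a
# delivery, -> additional recipients in the same delivery, Completed at the end.
grep 1sFduD-00FttR-Vv /var/log/exim4/mainlog

# Count recipients actually delivered for that queue id (=> plus -> lines).
grep 1sFduD-00FttR-Vv /var/log/exim4/mainlog | grep -cE ' (=>|->) '

# Count how many outbound queue ids correspond to one list post, by matching
# its Message-ID on the arrival lines and extracting the queue id field.
grep 'id=4c4746ad-012a-4c99-9ba4-3bc40119575c@app.fastmail.com' \
  /var/log/exim4/mainlog | awk '{print $3}' | sort -u | wc -l
```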
fungi | https://docs.mailman3.org/projects/mailman/en/latest/src/mailman/config/docs/config.html#max-recipients | 18:44 |
fungi | i guess the exim log message is misleading and triggers at >=10 rather than strictly >10 | 18:45 |
fungi | https://www.exim.org/exim-html-4.66/doc/html/FAQ-html/FAQ_0.html#TOC49 | 18:48 |
fungi | https://www.exim.org/howto/mailman21.html#problem | 18:50 |
fungi | Mailman will send as many MAIL FROM/RCPT TO as it needs. It may result in more than 10 or 100 messages sent in one connection, which will exceed the default value of Exim's smtp_accept_queue_per_connection This is bad because it will cause Exim to switch into queue mode and severely delay delivery of your list messages. The way to fix this is to set mailman's SMTP_MAX_SESSIONS_PER_CONNECTION | 18:52 |
fungi | (in ~mailman/Mailman/mm_cfg.py) to a smaller value than Exim's smtp_accept_queue_per_connection | 18:52 |
fungi | looks like something is causing mailman's max-recipients default of 10 to trigger exim's SMTP_MAX_SESSIONS_PER_CONNECTION limit of 10, perhaps by accident (seems like the value in mailman was chosen intentionally to avoid this problem in exim) | 18:53 |
fungi | so we could increase one or decrease the other, or increase them both but one more than the other... | 18:54 |
opendevreview | Merged opendev/system-config master: Mark source repos as safe in install-ansible-role https://review.opendev.org/c/opendev/system-config/+/921061 | 19:00 |
Clark[m] | We could set exim to say 150 and mailman to 125 and satisfy the rules? Or maybe just bump exim to 11 and see if that fixes things? | 19:03 |
fungi | "these go to eleven" | 19:09 |
fungi | i'm open to opinions on what a good batch size should be, but also finding it odd that when mailman is configured to send in batches of 10 exim complains that it got "more than 10" as opposed to "10 or more" | 19:10 |
fungi | grepping queue id 1sFduD-00FttR-Vv from /var/log/exim4/mainlog seems to indicate there were only 10 recipients, which implies the counting error is on exim's end | 19:13 |
Clark[m] | Could just be an off-by-one problem. But maybe we set exim to something clearly bigger than mailman and avoid it by some margin | 19:17 |
fungi | looks like overriding smtp_accept_queue_per_connection to 0 disables that behavior too | 19:22 |
fungi | but yeah, seems like there may be an undocumented (unknown? bug?) discrepancy, since the config documentation at https://www.exim.org/exim-html-current/doc/html/spec_html/ch-main_configuration.html claims: | 19:23 |
fungi | This option limits the number of delivery processes that Exim starts automatically when receiving messages via SMTP, whether via the daemon or by the use of -bs or -bS. If the value of the option is greater than zero, and the number of messages received in a single SMTP session exceeds this number, subsequent messages are placed in the queue, but no delivery processes are started. This helps to | 19:23 |
fungi | limit the number of Exim processes when a server restarts after downtime and there is a lot of mail waiting for it on other systems. On large systems, the default should probably be increased, and on dial-in client systems it should probably be set to zero (that is, disabled). | 19:23 |
fungi | the documentation does indicate it has a default value of 10 | 19:24 |
clarkb | the main reason not to just let it grow unbounded is the potential for being flagged as a spammer? | 19:46 |
clarkb | if so then ya maybe we bump it up to some reasonable number like 50 and see if mailman is happier then bump mailman up to 45? | 19:46 |
fungi | yeah, seems like an okay value to me | 20:20 |
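(Putting the two knobs side by side, a sketch of the change being agreed on here, using the 50/45 values from the discussion. The real settings live in ansible-managed templates, and the mailman section name is an assumption based on the documentation linked earlier.)

```
# exim main configuration: accept more messages per inbound SMTP connection
# before switching to queue-only mode (default 10; 0 disables the limit).
smtp_accept_queue_per_connection = 50
```

```ini
# mailman.cfg: keep the per-transaction recipient batch below exim's threshold
# ([mta] section name is an assumption based on the docs linked above).
[mta]
max_recipients: 45
```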
clarkb | Just got email (looks like only I was included) indicating the openmetal cloud is ready for us | 21:00 |
clarkb | they gave a link to the location with some credentials but that seems to be behind an auth wall (good!) and i'm not sure how to log in properly there. I'll email and ask | 21:01 |
clarkb | looks like they sent invites out to the email addrs we provided (mine ended up in the spam folder so check yours). We should be able to get started via that process and then that will give us access to the cloud via the management system | 21:10 |
fungi | there was a completion e-mail sent to the root alias yesterday, i found it in the spam folder a few hours ago and moved it to the normal inbox | 21:58 |
clarkb | that was separate and I think part of the deployment process | 22:15 |