opendevreview | Merged zuul/zuul-jobs master: mirror-container-images: use skopeo to mirror multiarch images https://review.opendev.org/c/zuul/zuul-jobs/+/944878 | 00:00 |
clarkb | I'm going to look for dinner now but would be good if we can keep an eye on ^ during opendev's mirror jobs that trigger in ~2 hours | 00:02 |
clarkb | image mirroring looks ok to me https://quay.io/repository/opendevmirror/registry?tab=tags there are manifests for unknown arches and platforms in addition to the linux on amd64 linux on arm64 etc manifests | 02:24 |
clarkb | not sure what is with those unknown ones. I feel like we've looked into this with nodepool images before and decided it wasn't a problem but I don't recall specifics | 02:25 |
clarkb | corvus would be good for you to double check tomorrow but my first glance seems fine | 02:25 |
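The mirroring approach merged above can be sketched as a single skopeo invocation (a rough approximation, not the actual role code; image names are illustrative):

```shell
# Hedged sketch of multiarch mirroring with skopeo. --all copies every
# entry in the manifest list rather than just the one matching the local
# platform. Manifest lists built with attestations/provenance often show
# extra "unknown" arch/platform entries in registry UIs, which may be
# what the quay.io tags page is displaying.
skopeo copy --all \
  docker://docker.io/library/registry:latest \
  docker://quay.io/opendevmirror/registry:latest
```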
clarkb | also periodic jobs are a great way to exercise the new nodepool launchers | 02:25 |
corvus | clarkb: i agree that looks good. we could probably recheck a zuul change and that might exercise the images | 02:27 |
corvus | rechecked https://review.opendev.org/944303 | 02:27 |
clarkb | cool | 02:28 |
frickler | is this job supposed to do anything useful? https://zuul.opendev.org/t/zuul/builds?job_name=zuul-nox-py311-multi-scheduler&project=zuul%2Fzuul&result=SUCCESS&skip=0 it times out for me, and without the success filter I only see failures+timeouts | 07:44 |
*** dmellado0755393737 is now known as dmellado075539373 | 09:09 | |
*** ykarel_ is now known as ykarel | 11:12 | |
frickler | #status log paused ubuntu-noble image builds and deleted the most recent one to mitigate https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2104134 | 12:17 |
opendevstatus | frickler: finished logging | 12:17 |
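For reference, pausing an image build like this is normally a builder-config edit plus a CLI delete (a sketch following the usual nodepool config layout; the build id is illustrative):

```shell
# Hedged sketch of the mitigation: mark the diskimage paused in the
# nodepool builder config, then delete the bad build by id.
cat <<'EOF'
diskimages:
  - name: ubuntu-noble
    pause: true   # stop producing new images while the kernel bug stands
EOF
nodepool dib-image-list                            # find the latest build
nodepool dib-image-delete ubuntu-noble-0000000001  # id is illustrative
```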
frickler | jamespage: seems haleyb is out, can you take a look at this bug ^^ and make sure it gets proper attention? | 12:22 |
Clark[m] | frickler: I proposed a change yesterday to move that zuul job to the experimental pipeline. The job's purpose is to run tests with multiple coordinating schedulers, which has value, but getting it stable has been difficult. Maybe easier with larger test nodes, I don't know | 13:04 |
Clark[m] | Re the kernel bug this seems like deja vu I swear we had the same problem not too long ago | 13:06 |
Clark[m] | Oh jammy broke in December and now noble is broken on the same bug | 13:07 |
ykarel | Clark[m], yes same issue was with Jammy in the second-to-last week of December | 13:07 |
Clark[m] | https://wiki.ubuntu.com/KernelTeam says the kernel team is on matrix now | 13:15 |
Clark[m] | Bugs that break firewalls on lts kernels are probably worth being up there? | 13:15 |
Clark[m] | If no one beats me to it I can send a message once I'm actually fed and awake | 13:16 |
frickler | Clark[m]: iiuc ykarel did so already | 13:32 |
fungi | all's the better | 13:33 |
ykarel | Clark[m], frickler yes i already sent a message there | 13:59 |
opendevreview | Jeremy Stanley proposed opendev/bindep master: Comment reminding to replace extras with depgroups https://review.opendev.org/c/opendev/bindep/+/945402 | 14:14 |
fungi | latest test results on ^ indicate centos 9 mirrors are back to working again | 14:45 |
opendevreview | Jeremy Stanley proposed opendev/engagement master: Update project boilerplate https://review.opendev.org/c/opendev/engagement/+/945151 | 14:46 |
opendevreview | Jeremy Stanley proposed opendev/engagement master: Import old who-approves.py script https://review.opendev.org/c/opendev/engagement/+/945152 | 14:46 |
opendevreview | Jeremy Stanley proposed opendev/engagement master: Ratchet down and simplify linting rules https://review.opendev.org/c/opendev/engagement/+/945212 | 14:46 |
opendevreview | Jeremy Stanley proposed opendev/engagement master: Rename who-approves.py to maintainers.py https://review.opendev.org/c/opendev/engagement/+/945224 | 14:46 |
opendevreview | Jeremy Stanley proposed opendev/engagement master: Add a convenience entrypoint for maintainers.py https://review.opendev.org/c/opendev/engagement/+/945225 | 14:46 |
opendevreview | Jeremy Stanley proposed opendev/engagement master: Rewrite maintainers.py functionality https://review.opendev.org/c/opendev/engagement/+/945262 | 14:46 |
clarkb | infra-root https://review.opendev.org/c/openstack/project-config/+/945398 and https://review.opendev.org/c/opendev/zone-opendev.org/+/945399 are the last two changes for cleaning up the old nodepool launchers if the new launchers look good to you | 14:56 |
corvus | i don't think the mariadb statement timeouts are working in opendev. i ran through everything manually and they seem to work. so i'm going to restart the web servers again just to make sure i didn't get wires crossed and they somehow started using the mysql dialect dburi. if that doesn't work, then i'll have to dig deeper. | 15:48 |
clarkb | ack | 15:48 |
clarkb | fwiw the search-builds-by-project performance did seem a lot better | 15:48 |
corvus | yep, that much is working (which does make me suspect that the configuration is correct). but still, gotta cross this off the list. | 15:49 |
corvus | oh actually, that would hit with mysql dialect too | 15:49 |
corvus | so, yeah. restarting now. | 15:49 |
clarkb | ah | 15:50 |
corvus | i'll restart the schedulers too, just because there's a small version bump. that way they match. | 15:51 |
clarkb | last call for objections on 945398 and 945399 otherwise I'll approve them and then work on cleaning up nl01, nl02, nl03, and nl04 on the cloud side | 15:51 |
fungi | another fairly active thread has started up on the python community discourse in relation to yesterday's setuptools regression: https://discuss.python.org/t/how-can-build-backends-avoid-breaking-users-when-they-make-backwards-incompatible-changes/85847 | 15:55 |
fungi | clarkb: i've approved them both | 15:58 |
clarkb | fungi: thanks | 15:59 |
clarkb | I was just about to do so myself saved me a few clicks | 15:59 |
clarkb | re that thread it seems to be saying what I was trying to get at yesterday which is nice to see | 15:59 |
corvus | okay restart didn't fix it. off to the repl. | 16:01 |
opendevreview | Merged opendev/zone-opendev.org master: Cleanup nl01, nl02, nl03, and nl04 DNS records https://review.opendev.org/c/opendev/zone-opendev.org/+/945399 | 16:02 |
opendevreview | Merged openstack/project-config master: Cleanup configs for nl01, nl02, nl03, and nl04 https://review.opendev.org/c/openstack/project-config/+/945398 | 16:08 |
clarkb | once those have deployed I'll proceed with server deletion and emergency file cleanup. Should be able to get that done well before the next round of tuesday meetings | 16:09 |
clarkb | deployment succeeded for both changes. I'm proceeding with server deletions now | 16:23 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Add upload-image-s3 role https://review.opendev.org/c/zuul/zuul-jobs/+/944813 | 16:28 |
clarkb | #status log Deleted nl01.opendev.org (7bf432b1-392f-4c34-adc3-f11f8181a187), nl02.opendev.org (553767f5-b6af-4684-b716-3ad2e16e18e2), nl03.opendev.org (a53d3af1-dfc0-4cb0-9cd4-d57e43355230), and nl04.opendev.org (c8206f41-eded-44be-ae3f-a18f4788fd39). They have been replaced by nl05-08. | 16:30 |
clarkb | hrm status bot is here just being slow I guess | 16:31 |
opendevstatus | clarkb: finished logging | 16:31 |
fungi | clarkb: remember it does a synchronous write to the wiki | 16:42 |
clarkb | oh right | 16:43 |
fungi | so if the mediawiki api is dead slow responding (which it often is these days, especially for database writes), it can take an age | 16:43 |
jamespage | frickler: I need to find someone at canonical to point you at | 16:48 |
frickler | jamespage: seems haleyb was back today so best check with him I'd think | 16:50 |
jamespage | frickler: ack - ftr I'm no longer at Canonical so on the outside as well now :) | 16:52 |
jamespage | I've asked fnordahl to join this channel as he should be aware of this | 16:53 |
clarkb | other than the bug itself I think the main feedback may be that it would be good if ubuntu could track buggy kernel patches to avoid repeating the same bugs release by release months apart | 16:54 |
clarkb | bugs happen and the response in December was much appreciated. Ideally we'd avoid repeating the same issue in noble now | 16:54 |
frickler | jamespage: oh, I wasn't aware of that, I'll try to avoid annoying you with Canonical things in the future, then :-) | 16:56 |
jamespage | frickler: new news - only 2 weeks | 16:56 |
clarkb | oh congrats! | 16:56 |
jamespage | thanks | 16:57 |
frickler | jamespage: nice, so it looks like you're doing containers now. you may want to update your oif page anyway ;) | 17:00 |
jamespage | yep on the TODO list | 17:00 |
clarkb | LE will stop sending expiration email reminders. We're fine as we have our own monitoring and update 30 days in advance but mentioning it here in case anyone was relying on those emails | 17:34 |
fungi | also i don't think we ever received expiration reminders from them? or maybe we just renewed too soon to trigger any | 17:37 |
clarkb | I think we renew too soon to trigger them | 17:39 |
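Self-monitoring of the kind clarkb mentions can be as simple as querying the served cert's expiry (hostname is illustrative; what OpenDev actually runs isn't shown here):

```shell
# Ask a server for its certificate's notAfter date, the way an external
# expiry monitor might (host is illustrative):
openssl s_client -connect opendev.org:443 -servername opendev.org \
  </dev/null 2>/dev/null | openssl x509 -noout -enddate
# -checkend exits nonzero if the cert expires within N seconds,
# e.g. 30 days, which matches the renewal lead time mentioned above:
openssl s_client -connect opendev.org:443 -servername opendev.org \
  </dev/null 2>/dev/null | openssl x509 -noout -checkend $((30*24*3600))
```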
clarkb | fungi: I wouldn't say supporting old pythons is a lot of work. It's only extra work when devs choose to start making superficial changes that impact compatibility | 17:45 |
clarkb | at least for a tool like pbr | 17:45 |
clarkb | with minimal dependencies (setuptools only) and a narrow focus/scope | 17:46 |
clarkb | I think the recent breaking change is a good example of this. Setuptools can accept both variations of the names using - or _ indefinitely using a small compatibility shim. That is easy to maintain and understand basically forever. But the instant you decide to no longer be backward compatible you have to consider the impacts, and that is not easy and requires effort | 17:47 |
fungi | yeah, i tried to point out that cpython is surprisingly backward compatible, and it's setuptools deciding to drop support for old things that otherwise would still work which is causing headaches | 17:48 |
fungi | the effort in maintaining backward compatibility for pbr isn't nearly as much as for larger projects in openstack, but it's still more work than i'm sure some build backend maintainers want to sign up for | 17:50 |
slittle | Does opendev have any automated tools for keeping a feature branch up to date relative to the main branch. i.e. an automated daily merge from 'main' to 'my_branch'. I expect not, as the merge always risks failing on a conflict and manual intervention would be required at that point. | 17:51 |
clarkb | slittle: not for branches that both move independently. jeepyb does have the ability to update a local tracking branch to follow an upstream, but they can't diverge; it's a copy, not a merge | 17:52 |
clarkb | in general I suspect we'd largely recommend feature branches and similar types of work be as short lived as possible | 17:52 |
clarkb | you can maintain stacks of proposed changes on top of branches with fairly minimal effort which means unless there is a really good reason to fork temporarily you're probably better off doing that | 17:53 |
clarkb | fungi: one thing I find odd is that I think pbr is already doing the - to _ mapping for us. Are people then tripping because setuptools is also reading the file? | 17:53 |
clarkb | fungi: I wonder if we can make pbr/setuptools avoid that extra read and allow pbr to be a compatibility layer. That might work as a workaround for users of pbr | 17:54 |
clarkb | fungi: look at cfg_to_args() and setup_cfg_to_setup_kwargs() to see what I'm talking about | 17:54 |
fungi | clarkb: correct, setuptools has added setup.cfg file validation, based on (incorrect) assumptions that it's the only thing using that file | 17:55 |
fungi | and yeah, that's what i meant in my post about transparently transforming metadata options | 17:56 |
clarkb | slittle: if you can provide more info about your higher level use case that would help us provide advice that works with the existing tooling | 17:56 |
slittle | Is there any tooling to aid in maintaining such a stack of proposed changes? And sharing that stack? I know the pain of trying to keep just a few updates current in gerrit. In high traffic areas it usually throws a merge conflict pretty quick. | 17:58 |
clarkb | there is git restack: https://opendev.org/opendev/git-restack https://pypi.org/project/git-restack/ I personally just use git rebase -i HEAD~N where N is the number of commits back that I need to edit. I also do what I like to call "squash back" where I edit on the tip with new commits that I know will be squashed back into existing commits that already have changes | 18:00 |
clarkb | it's the sort of thing that becomes a lot easier with a little practice | 18:00 |
clarkb | newer/latest git has gotten a lot better about not conflicting on repeated work too, which helps when you rearrange the order of stuff | 18:01 |
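The "squash back" workflow described above can be demonstrated end to end in a throwaway repo (non-interactive here via GIT_SEQUENCE_EDITOR; in real use it's just `git rebase -i`, and all paths/messages are illustrative):

```shell
# Demonstrating "squash back": commit a late fix on the tip, marked as a
# fixup of an earlier commit, then let autosquash fold it back in place.
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name dev
git config commit.gpgsign false
echo a > f.txt; git add f.txt; git commit -qm 'first: add f'
echo b > g.txt; git add g.txt; git commit -qm 'second: add g'
# A late fix that logically belongs in "first"; it touches a separate
# file so the replay stays conflict-free in this demo.
echo a2 > f2.txt; git add f2.txt
git commit -q --fixup "$(git rev-list --max-parents=0 HEAD)"
# Non-interactive autosquash; interactively this is `git rebase -i`:
GIT_SEQUENCE_EDITOR=: git rebase -q -i --autosquash --root
git log --format=%s    # back to two commits; the fixup was absorbed
```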
fungi | i use git-restack all the time. just used it today for this series of changes, for example: https://review.opendev.org/c/opendev/engagement/+/945225 | 18:01 |
slittle | Basically I have a starlingx feature about to launch that will run for 6 months minimum and hit a dozen gits. Right now my best recommendation for them is to branch all gits and DO NOT try to keep up with the 'main' branch on a continuing basis. Instead I'm suggesting they do just a few manual merges at well chosen times. i.e. when both main and feature are otherwise healthy. | 18:03 |
clarkb | ya so that's a pretty classic feature branch setup, and in general I think we expect those to merge manually (because you may need to merge in either direction, and it changes over time, and conflicts tend to be common with feature work) | 18:04 |
clarkb | the downside to working that way is merging can become a lot more difficult: you aren't doing it a piece at a time, it's everything all at once every time | 18:04 |
clarkb | the upside is you can ignore all the other work happening while you work on your feature branch until you go to merge | 18:05 |
fungi | right, usually whoever's maintaining that feature branch (e.g. release team members) will have the necessary permissions to merge from master into the feature branch at their discretion, whenever they feel it's needed, and then to merge the feature branch into master when they're ready to wrap it up | 18:05 |
clarkb | most openstack projects develop new features directly against master all the time and don't use feature branches. There are rare exceptions and they tend to be for specific features (though swift has used them more than others iirc) | 18:06 |
fungi | if instead you want to continually work in sync with master, rebasing a change series targeting master will be less work | 18:06 |
clarkb | which is to say both approaches are valid and do work. You just need to pick which poison is better for you | 18:06 |
slittle | What work is required to set up the feature owner with permissions to merge freely into their branch? | 18:07 |
fungi | also if you're doing this across multiple git repositories, you may need depends-on footers in the commit messages of some changes where they rely on series in a sister repository | 18:07 |
fungi | slittle: https://docs.opendev.org/opendev/infra-manual/latest/drivers.html#feature-branches | 18:08 |
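A Depends-On footer like fungi mentions is just a trailer line in the commit message that Zuul parses to sequence changes across repositories (the review URL below is made up):

```shell
# Demonstrating a Depends-On commit-message footer in a scratch repo.
# The change URL is illustrative, not a real review.
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name dev
echo x > feature.txt; git add feature.txt
git commit -q -m 'Add client support for feature X

Depends-On: https://review.opendev.org/c/example/sister-repo/+/123456'
git log -1 --format=%B
```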
slittle | I guess the other aspect is that this is a multi-developer feature. I've only ever seen rebase used successfully for single-developer features. | 18:09 |
clarkb | there are two approaches to handling multiple devs working on the same stack that I've seen work well. The first is to always git review -d the stack before you edit it to ensure you have the latest copy, and do some lightweight comms: "I'm working on that now" | 18:10 |
clarkb | the other is to decouple it a bit and rely on depends-on rather than the git tree to enforce order | 18:11 |
clarkb | re automating merges, one thing to keep in mind is if you can git merge things trivially then that is trivial for anyone to do at any point and there is less value to doing it daily or on a schedule. If there are conflicts they need to be resolved and that requires a human anyway | 18:11 |
fungi | but generally the main reason to use a feature branch is if you want to make breaking changes that don't impact master until later, and are willing to incur the associated pain of dealing with that at merge points | 18:13 |
fungi | usually projects either develop in master and then create stable branches at some cadence to provide a lower-churn option, or they develop on feature branches so that master will be lower-churn. doing both at the same time is a lot less common | 18:14 |
clarkb | it also helps a lot to make code review and landing code an active part of the dev loop | 18:14 |
clarkb | that minimizes the critical sections and reduces the depths of stacks/context you have to deal with | 18:15 |
clarkb | consistent incremental progress essentially | 18:15 |
opendevreview | Stephen Finucane proposed openstack/project-config master: gerritbot: Log changes to stable branches on #openstack-keystone https://review.opendev.org/c/openstack/project-config/+/945512 | 18:35 |
opendevreview | Merged openstack/project-config master: gerritbot: Log changes to stable branches on #openstack-keystone https://review.opendev.org/c/openstack/project-config/+/945512 | 18:59 |
frickler | I'm seeing concerning job timeouts on rax-dfw. for the last two weeks or so that was mostly kolla jobs, now two for keystone, in particular a simple docs job that really really shouldn't time out https://zuul.opendev.org/t/openstack/build/b02c3d859f6e4084ac2447a0b353b8e2 https://zuul.opendev.org/t/openstack/build/d99690acde8e4745bf3c1d3aa832f974 | 20:39 |
frickler | these seem mostly to be happening when the cloud is running at capacity, so I'm thinking maybe to limit max-servers there for a while. like go to 100 from 140? https://grafana.opendev.org/d/a8667d6647/nodepool3a-rackspace?orgId=1&from=now-6h&to=now&timezone=utc&var-region=$__all | 20:41 |
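The throttle being proposed would be a small launcher-config edit (structure below follows typical nodepool provider configs; only the value comes from the discussion):

```shell
# Hedged sketch of lowering the rax-dfw quota; pool/provider layout is
# illustrative, the 140 -> 100 change is from the discussion above.
cat <<'EOF' > /tmp/rax-dfw-snippet.yaml
providers:
  - name: rax-dfw
    pools:
      - name: main
        max-servers: 100   # was 140; reduce while timeouts are investigated
EOF
```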
fungi | looks like it has the expected processor count and ram, at least | 20:43 |
fungi | so not a scheduling mix-up | 20:43 |
fungi | maybe this is a good incentive to pick jamesdenton's brain about shifting more of our quota from rackspace classic to flex? | 20:44 |
fungi | since the network and mirror rebuilds, i haven't observed any issues with the test nodes we've been booting in either dfw3 or sjc3 | 20:46 |
opendevreview | Aurelio Jargas proposed zuul/zuul-jobs master: Add role: `ensure-python-command`, refactor similar roles https://review.opendev.org/c/zuul/zuul-jobs/+/941490 | 21:06 |
gouthamr | has anyone run into an issue where devstack bails out quite early in CI jobs with apache2 restarts failing? my specific issue seems to occur after setting up "keystone-tls-proxy", and bouncing the apache2 service for that to take effect | 21:41 |
gouthamr | The error i see in the journal is "apache2.service: Failed with result 'start-limit-hit'." | 21:41 |
gouthamr | apache2.service: Start request repeated too quickly. | 21:41 |
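When systemd reports start-limit-hit, the standard places to look on the affected node are (diagnostic commands only, not a fix):

```shell
# Why did systemd refuse the start, and what are the current limits?
systemctl status apache2.service --no-pager
journalctl -u apache2.service --no-pager -n 100
# Show the rate-limit settings in effect for the unit:
systemctl show apache2.service -p StartLimitIntervalUSec -p StartLimitBurst
```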
tonyb | gouthamr: I haven't seen it. So you have logs from the failed job? | 21:43 |
gouthamr | tonyb: yes, https://zuul.opendev.org/t/openstack/build/e2fbf3148ba449c6ae5e0ec3f45c3318/log/controller/logs/devstacklog.txt#6218-6227 | 21:43 |
tonyb | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e2f/openstack/e2fbf3148ba449c6ae5e0ec3f45c3318/controller/logs/apache/tls-proxy_error_log.txt doesn't have any errors and I doubt that warning is the cause | 21:49 |
gouthamr | yeah :/ this is happening every time on a single change, but not always on the same devstack job, which gives me the feels that the particular change is cursed :D | 21:50 |
gouthamr | https://review.opendev.org/c/openstack/manila-tempest-plugin/+/942862 | 21:50 |
tonyb | I'll keep looking, but it's slow going because I'm on my phone | 21:50 |
gouthamr | ty for taking a look, tonyb | 21:50 |
gouthamr | ++ | 21:50 |
JayF | This sounds vaguely like an issue we had in ironic, I don't remember how we fixed it | 21:58 |
* JayF can't find it in gerrit | 22:00 | |
tonyb | gouthamr: I think I need my laptop to do more digging. Does a no op change on the same SHA with the (merged) depends-on fail the same way? | 22:03 |
Clark[m] | gouthamr tonyb https://serverfault.com/questions/845471/service-start-request-repeated-too-quickly-refusing-to-start-limit | 23:02 |
Clark[m] | Probably just need to update the unit file to allow more restarts. That will be simpler than changing how devstack updates apache. My guess is those jobs ran on faster rax flex nodes and that allows them to restart too quickly | 23:02 |
clarkb | I don't think you need to fully replace the /usr/lib/systemd/system unit you can just append to it via /etc/systemd/ or whatever the path is | 23:24 |
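Concretely, the drop-in approach clarkb describes might look like this (burst/interval values are illustrative and untested in devstack):

```shell
# Sketch of raising apache2's restart rate limit via a systemd drop-in,
# leaving the packaged unit file in /usr/lib untouched.
sudo mkdir -p /etc/systemd/system/apache2.service.d
sudo tee /etc/systemd/system/apache2.service.d/start-limit.conf <<'EOF'
[Unit]
StartLimitIntervalSec=60
StartLimitBurst=10
EOF
sudo systemctl daemon-reload
sudo systemctl restart apache2
```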
clarkb | whoever decided that pyenv installing python 3.13 should install to /usr/local/bin/python3.13.2t is crazy | 23:27 |
clarkb | ok now I shall go back to enjoying the nice weather. Tomorrow I'll try to land things that have been reviewed | 23:33 |
corvus | \o/ mariadb query timeouts look good now: Query | 1 | Sending data | SET STATEMENT max_statement_time=30.0 for ... | 23:48 |
corvus | i restarted the schedulers and web servers to pick up the fix | 23:48 |
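Spelled out, the per-statement timeout corvus verified looks like this (the SELECT is illustrative, since the actual zuul-web query is elided in the log line above):

```shell
# MariaDB per-statement timeout: the statement is killed if it runs
# longer than 30 seconds, without changing the session-wide setting.
# Table and query are illustrative, not zuul's real schema usage.
mysql zuul -e "
  SET STATEMENT max_statement_time=30.0 FOR
  SELECT * FROM example_builds ORDER BY id DESC LIMIT 10;"
```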
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!