Tuesday, 2025-03-25

opendevreviewMerged zuul/zuul-jobs master: mirror-container-images: use skopeo to mirror multiarch images  https://review.opendev.org/c/zuul/zuul-jobs/+/94487800:00
clarkbI'm going to look for dinner now but would be good if we can keep an eye on ^ during opendev's mirror jobs that trigger in ~2 hours00:02
clarkbimage mirroring looks ok to me https://quay.io/repository/opendevmirror/registry?tab=tags there are manifests for unknown arches and platforms in addition to the linux on amd64 linux on arm64 etc manifests02:24
clarkbnot sure what is with those unknown ones. I feel like we've looked into this with nodepool images before and decided it wasn't a problem but I don't recall specifics02:25
clarkbcorvus would be good for you to double check tomorrow but my first glance seems fine02:25
clarkbalso periodic jobs are a great way to exercise the new nodepool launchers02:25
corvusclarkb: i agree that looks good.  we could probably recheck a zuul change and that might exercise the images02:27
corvusrechecked https://review.opendev.org/94430302:27
clarkbcool02:28
frickleris this job supposed to do anything useful? https://zuul.opendev.org/t/zuul/builds?job_name=zuul-nox-py311-multi-scheduler&project=zuul%2Fzuul&result=SUCCESS&skip=0 does timeout for me, without the success filter I only see failures+timeouts07:44
*** dmellado0755393737 is now known as dmellado07553937309:09
*** ykarel_ is now known as ykarel11:12
frickler#status log paused ubuntu-noble image builds and deleted the most recent one to mitigate https://bugs.launchpad.net/ubuntu/+source/linux/+bug/210413412:17
opendevstatusfrickler: finished logging12:17
fricklerjamespage: seems haleyb is out, can you take a look at this bug ^^ and make sure it gets proper attention? 12:22
Clark[m]frickler: I proposed a change yesterday to move that zuul job to the experimental pipeline. The job's purpose is to run tests with multiple coordinating schedulers, which has value, but getting it stable has been difficult. Maybe easier with larger test nodes, I don't know13:04
Clark[m]Re the kernel bug this seems like deja vu I swear we had the same problem not too long ago13:06
Clark[m]Oh jammy broke in December and now noble is broken on the same bug13:07
ykarelClark[m], yes same issue was with Jammy in second last week of December13:07
Clark[m]https://wiki.ubuntu.com/KernelTeam says the kernel team is on matrix now13:15
Clark[m]Bugs that break firewalls on lts kernels are probably worth being up there?13:15
Clark[m]If no one beats me to it I can send a message once I'm actually fed and awake13:16
fricklerClark[m]: iiuc ykarel did so already13:32
fungiall's the better13:33
ykarelClark[m], frickler yes i already sent a message there13:59
opendevreviewJeremy Stanley proposed opendev/bindep master: Comment reminding to replace extras with depgroups  https://review.opendev.org/c/opendev/bindep/+/94540214:14
fungilatest test results on ^ indicate centos 9 mirrors are back to working again14:45
opendevreviewJeremy Stanley proposed opendev/engagement master: Update project boilerplate  https://review.opendev.org/c/opendev/engagement/+/94515114:46
opendevreviewJeremy Stanley proposed opendev/engagement master: Import old who-approves.py script  https://review.opendev.org/c/opendev/engagement/+/94515214:46
opendevreviewJeremy Stanley proposed opendev/engagement master: Ratchet down and simplify linting rules  https://review.opendev.org/c/opendev/engagement/+/94521214:46
opendevreviewJeremy Stanley proposed opendev/engagement master: Rename who-approves.py to maintainers.py  https://review.opendev.org/c/opendev/engagement/+/94522414:46
opendevreviewJeremy Stanley proposed opendev/engagement master: Add a convenience entrypoint for maintainers.py  https://review.opendev.org/c/opendev/engagement/+/94522514:46
opendevreviewJeremy Stanley proposed opendev/engagement master: Rewrite maintainers.py functionality  https://review.opendev.org/c/opendev/engagement/+/94526214:46
clarkbinfra-root https://review.opendev.org/c/openstack/project-config/+/945398 and https://review.opendev.org/c/opendev/zone-opendev.org/+/945399 are the last two changes for cleaning up the old nodepool launchers if the new launchers look good to you14:56
corvusi don't think the mariadb statement timeouts are working in opendev.  i ran through everything manually and they seem to work.  so i'm going to restart the web servers again just to make sure i didn't get wires crossed and they somehow started using the mysql dialect dburi.  if that doesn't work, then i'll have to dig deeper.15:48
clarkback15:48
clarkbfwiw the search builds by project performance did seem a lot better15:48
corvusyep, that much is working (which does make me suspect that the configuration is correct).  but still, gotta cross this off the list.15:49
corvusoh actually, that would hit with mysql dialect too15:49
corvusso, yeah.  restarting now.15:49
clarkbah15:50
corvusi'll restart the schedulers too, just because there's a small version bump.  that way they match.15:51
clarkblast call for objections on 945398 and 945399 otherwise I'll approve them and then work on cleaning up nl01, nl02, nl03, and nl04 on the cloud side15:51
fungianother fairly active thread has started up on the python community discourse in relation to yesterday's setuptools regression: https://discuss.python.org/t/how-can-build-backends-avoid-breaking-users-when-they-make-backwards-incompatible-changes/8584715:55
fungiclarkb: i've approved them both15:58
clarkbfungi: thanks15:59
clarkbI was just about to do so myself saved me a few clicks15:59
clarkbre that thread it seems to be saying what I was trying to get at yesterday which is nice to see15:59
corvusokay restart didn't fix it.  off to the repl.16:01
opendevreviewMerged opendev/zone-opendev.org master: Cleanup nl01, nl02, nl03, and nl04 DNS records  https://review.opendev.org/c/opendev/zone-opendev.org/+/94539916:02
opendevreviewMerged openstack/project-config master: Cleanup configs for nl01, nl02, nl03, and nl04  https://review.opendev.org/c/openstack/project-config/+/94539816:08
clarkbonce those have deployed I'll proceed with server deletion and emergency file cleanup. Should be able to get that done well before the next round of tuesday meetings16:09
clarkbdeployment succeeded for both changes. I'm proceeding with server deletions now16:23
opendevreviewJames E. Blair proposed zuul/zuul-jobs master: Add upload-image-s3 role  https://review.opendev.org/c/zuul/zuul-jobs/+/94481316:28
clarkb#status log Deleted nl01.opendev.org (7bf432b1-392f-4c34-adc3-f11f8181a187), nl02.opendev.org (553767f5-b6af-4684-b716-3ad2e16e18e2), nl03.opendev.org (a53d3af1-dfc0-4cb0-9cd4-d57e43355230), and nl04.opendev.org (c8206f41-eded-44be-ae3f-a18f4788fd39). They have been replaced by nl05-08.16:30
clarkbhrm status bot is here just being slow I guess16:31
opendevstatusclarkb: finished logging16:31
fungiclarkb: remember it does a synchronous write to the wiki16:42
clarkboh right16:43
fungiso if the mediawiki api is dead slow responding (which it often is these days, especially for database writes), it can take an age16:43
jamespagefrickler: I need to find someone at canonical to point you at16:48
fricklerjamespage: seems haleyb was back today so best check with him I'd think16:50
jamespagefrickler: ack - ftr I'm no longer at Canonical so on the outside as well now :)16:52
jamespageI've asked fnordahl to join this channel as he should be aware of this16:53
clarkbother than the bug itself I think the main feedback may be that it would be good if ubuntu could track buggy kernel patches to avoid repeating the same bugs release by release months apart16:54
clarkbbugs happen and the response in December was much appreciated. Ideally we'd avoid repeating the same issue in noble now16:54
fricklerjamespage: oh, I wasn't aware of that, I'll try to avoid annoying you with Canonical things in the future, then :-)16:56
jamespagefrickler: new news - only 2 weeks16:56
clarkboh congrats!16:56
jamespagethanks16:57
fricklerjamespage: nice, so it looks like you're doing containers now. you may want to update your oif page anyway ;)17:00
jamespageyep on the TODO list17:00
clarkbLE will stop sending expiration email reminders. We're fine as we have our own monitoring and update 30 days in advance but mentioning it here in case anyone was relying on those emails17:34
fungialso i don't think we ever received expiration reminders from them? or maybe we just renewed too soon to trigger any17:37
clarkbI think we renew too soon to trigger them17:39
clarkbfungi: I wouldn't say supporting old pythons is a lot of work. It's only extra work when devs choose to start making superficial changes that impact compatibility17:45
clarkbat least for a tool like pbr17:45
clarkbwith minimal dependencies (setuptools only) and a narrow focus/scope17:46
clarkbI think the recent breaking change is a good example of this. Setuptools can accept both variations of the names using - or _ indefinitely using a small compatibility shim. That is easy to maintain and understand basically forever. But the instant you decide to no longer be backward compatible you have to consider the impacts and that is not easy and requires effort17:47
fungiyeah, i tried to point out that cpython is surprisingly backward compatible, and it's setuptools deciding to drop support for old things that otherwise would still work which is causing headaches17:48
fungithe effort in maintaining backward compatibility for pbr isn't nearly as much as for larger projects in openstack, but it's still more work than i'm sure some build backend maintainers want to sign up for17:50
slittleDoes opendev have any automated tools for keeping a feature branch up to date relative to the main branch. i.e. an automated daily merge from 'main' to 'my_branch'.  I expect not, as the merge always risks failing on a conflict and manual intervention would be required at that point.    17:51
clarkbslittle: not for branches that both move independently. jeepyb does have the ability to update a local tracking branch to follow an upstream but they can't diverge; it's a copy, not a merge17:52
clarkbin general I suspect we'd largely recommend feature branches and similar types of work be as short lived as possible17:52
clarkbyou can maintain stacks of proposed changes on top of branches with fairly minimal effort which means unless there is a really good reason to fork temporarily you're probably better off doing that17:53
clarkbfungi: one thing I find odd is that I think pbr is already doing the - to _ mapping for us. Are people then tripping because setuptools is also reading the file too?17:53
clarkbfungi: I wonder if we can make pbr/setuptools avoid that extra read and allow pbr to be a compatibility layer. That might work as a workaround for users of pbr17:54
clarkbfungi: look at cfg_to_args() and setup_cfg_to_setup_kwargs() to see what I'm talking about17:54
fungiclarkb: correct, setuptools has added setup.cfg file validation, based on (incorrect) assumptions that it's the only thing using that file17:55
fungiand yeah, that's what i meant in my post about transparently transforming metadata options17:56
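Note: the compatibility shim discussed above (accepting both `-` and `_` spellings of setup.cfg option names) can be sketched roughly as follows. This is a minimal illustration only, not the actual code in pbr's `cfg_to_args()` / `setup_cfg_to_setup_kwargs()` or in setuptools; the helper names `normalize_keys` and `read_metadata` and the sample file contents are hypothetical.

```python
import configparser


def normalize_keys(options):
    """Return a copy of options with '-' in each key replaced by '_'.

    Hypothetical helper: both spellings map to one canonical key, so
    callers never need to care which one the file used.
    """
    normalized = {}
    for key, value in options.items():
        canonical = key.replace("-", "_")
        # Refuse to guess if both spellings appear with different values.
        if canonical in normalized and normalized[canonical] != value:
            raise ValueError(f"conflicting values for {canonical!r}")
        normalized[canonical] = value
    return normalized


def read_metadata(text):
    """Parse setup.cfg-style text and return the normalized [metadata] options."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return normalize_keys(dict(parser["metadata"]))


# Sample input mixing both spellings (made up for illustration).
SAMPLE = """
[metadata]
name = example
author-email = dev@example.org
home_page = https://example.org
"""

metadata = read_metadata(SAMPLE)
```

The point of the sketch is that the mapping is a few lines and has no moving parts, which is why keeping it around indefinitely costs little compared with removing it.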
clarkbslittle: if you can provide more info about your higher level use case that would help us provide advice that works with the existing tooling17:56
slittleAre there any tools to aid in maintaining such a stack of proposed changes?  And sharing that stack?  I know the pain of trying to keep just a few updates current in gerrit.  In high traffic areas it usually throws a merge conflict pretty quick.17:58
clarkbthere is git restack: https://opendev.org/opendev/git-restack https://pypi.org/project/git-restack/ I personally just use git rebase -i HEAD~N where N is the number of commits back that I need to edit. I also do what I like to call "squash back" where I edit on the tip with new commits that I know will be squashed back into existing commits that already have changes18:00
clarkbit's the sort of thing that becomes a lot easier with a little practice18:00
clarkbnewer/latest git has gotten a lot better about not conflicting on repeated work too which helps when you rearrange the order of stuff18:01
fungii use gits- restack all the time. just used it today for this series of changes, for example: https://review.opendev.org/c/opendev/engagement/+/94522518:01
fungier, git-restack18:01
slittleBasically I have a starlingx feature about to launch that will run for 6 months minimum and hit a dozen gits.  Right now my best recommendation for them is to branch all gits and DO NOT try to keep up with the 'main' branch on a continuing basis. Instead I'm suggesting they do just a few manual merges at well chosen times.  i.e.  when both main and feature are otherwise healthy.18:03
clarkbya so that's a pretty classic feature branch setup and in general I think we expect those to merge manually (because you may need to merge in either direction and it changes over time and conflicts tend to be common with feature work)18:04
clarkbthe downside to working that way is merging can become a lot more difficult as you aren't doing it a piece at a time; it's everything all at once every time18:04
clarkbthe upside is you can ignore all the other work happening while you work on your feature branch until you go to merge18:05
fungiright, usually whoever's maintaining that feature branch (e.g. release team members) will have the necessary permissions to merge from master into the feature branch at their discretion, whenever they feel it's needed, and then to merge the feature branch into master when they're ready to wrap it up18:05
clarkbmost openstack projects develop new features directly against master all the time and don't use feature branches. There are rare exceptions and they tend to be for specific features (though swift has used them more than others iirc)18:06
fungiif instead you want to continually work in sync with master, rebasing a change series targeting master will be less work18:06
clarkbwhich is to say both approaches are valid and do work. You just need to pick which poison is better for you18:06
slittleWhat work is required to set up the feature owner with permissions to merge freely into their branch?18:07
fungialso if you're doing this across multiple git repositories, you may need depends-on footers in the commit messages of some changes where they rely on series in a sister repository18:07
fungislittle: https://docs.opendev.org/opendev/infra-manual/latest/drivers.html#feature-branches18:08
slittleI guess the other aspect is that this is a multi-developer feature.  I've only ever seem rebase used sucessfully for single developer features.18:09
slittleseem -> seen18:09
clarkbthere are two approaches to handle multi devs working on the same stack that I've seen work well. The first is to always git review -d the stack before you edit it to ensure you have the latest copy and do some lightweight comms "I'm working on that now"18:10
clarkbthe other is to decouple it a bit and rely on depends-on rather than the git tree to enforce order18:11
clarkbre automating merges one thing to keep in mind is if you can git merge things trivially then that is trivial for anyone to do at any point and there is less value to doing it daily or on a schedule. If there are conflicts they need to be resolved and that requires a human anyway18:11
fungibut generally the main reason to use a feature branch is if you want to make breaking changes that don't impact master until later, and are willing to incur the associated pain of dealing with that at merge points18:13
fungiusually projects either develop in master and then create stable branches at some cadence to provide a lower-churn option, or they develop on feature branches so that master will be lower-churn. doing both at the same time is a lot less common18:14
clarkbit also helps a lot to make code review and landing code an active part of the dev loop18:14
clarkbthat minimizes the critical sections and reduces the depths of stacks/context you have to deal with18:15
clarkbconsistent incremental progress essentially18:15
opendevreviewStephen Finucane proposed openstack/project-config master: gerritbot: Log changes to stable branches on #openstack-keystone  https://review.opendev.org/c/openstack/project-config/+/94551218:35
opendevreviewMerged openstack/project-config master: gerritbot: Log changes to stable branches on #openstack-keystone  https://review.opendev.org/c/openstack/project-config/+/94551218:59
fricklerI'm seeing concerning job timeouts on rax-dfw. for the last two weeks or so that was mostly kolla jobs, now two for keystone, in particular a simple docs job that really really shouldn't timeout https://zuul.opendev.org/t/openstack/build/b02c3d859f6e4084ac2447a0b353b8e2 https://zuul.opendev.org/t/openstack/build/d99690acde8e4745bf3c1d3aa832f97420:39
fricklerthese seem mostly to be happening when the cloud is running at capacity, so I'm thinking maybe to limit max-servers there for a while. like go to 100 from 140? https://grafana.opendev.org/d/a8667d6647/nodepool3a-rackspace?orgId=1&from=now-6h&to=now&timezone=utc&var-region=$__all20:41
fungilooks like it has the expected processor count and ram, at least20:43
fungiso not a scheduling mix-up20:43
fungimaybe this is a good incentive to pick jamesdenton's brain about shifting more of our quota from rackspace classic to flex?20:44
fungisince the network and mirror rebuilds, i haven't observed any issues with the test nodes we've been booting in either dfw3 or sjc320:46
opendevreviewAurelio Jargas proposed zuul/zuul-jobs master: Add role: `ensure-python-command`, refactor similar roles  https://review.opendev.org/c/zuul/zuul-jobs/+/94149021:06
gouthamrhas anyone run into an issue where devstack bails out quite early in CI jobs with apache2 restarts failing? my specific issue seems to occur after setting up "keystone-tls-proxy", and bouncing the apache2 service for that to take effect21:41
gouthamrThe error i see in the journal is "apache2.service: Failed with result 'start-limit-hit'."21:41
gouthamrapache2.service: Start request repeated too quickly.21:41
tonybgouthamr: I haven't seen it. Do you have logs from the failed job?21:43
gouthamrtonyb: yes, https://zuul.opendev.org/t/openstack/build/e2fbf3148ba449c6ae5e0ec3f45c3318/log/controller/logs/devstacklog.txt#6218-6227 21:43
tonybhttps://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e2f/openstack/e2fbf3148ba449c6ae5e0ec3f45c3318/controller/logs/apache/tls-proxy_error_log.txt doesn't have any errors and I doubt that warning is the cause 21:49
gouthamryeah :/ this is happening way every time on a single change, but not always on the same devstack job, which, gives me the feels that the particular change is cursed :D 21:50
gouthamrhttps://review.opendev.org/c/openstack/manila-tempest-plugin/+/942862 21:50
tonybI'll keep looking, but it's slow going because I'm on my phone 21:50
gouthamrty for taking a look, tonyb 21:50
gouthamr++21:50
JayFThis sounds vaguely like an issue we had in ironic, I don't remember how we fixed it21:58
* JayF can't find it in gerrit22:00
tonybgouthamr: I think I need my laptop to do more digging.   Does a no op change on the same SHA with the (merged) depends-on fail the same way?22:03
Clark[m]gouthamr tonyb https://serverfault.com/questions/845471/service-start-request-repeated-too-quickly-refusing-to-start-limit23:02
Clark[m]Probably just need to update the unit file to allow more restarts. That will be simpler than changing how devstack updates apache. My guess is those jobs ran on faster rax flex nodes and that allows them to restart too quickly 23:02
clarkbI don't think you need to fully replace the /usr/lib/systemd/system unit you can just append to it via /etc/systemd/ or whatever the path is23:24
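Note: the "append to the unit" approach mentioned above corresponds to systemd's drop-in mechanism, where files under `/etc/systemd/system/<unit>.d/*.conf` extend a packaged unit without replacing it. The sketch below only renders such a fragment and computes where it would go. The interval and burst values and the file name `start-limit.conf` are assumptions chosen for illustration, not values from devstack, and it assumes a systemd version that reads `StartLimitIntervalSec=` and `StartLimitBurst=` from the `[Unit]` section.

```python
from pathlib import Path


def render_start_limit_dropin(interval_sec, burst):
    """Return drop-in text that relaxes a unit's start rate limit.

    Allows up to `burst` starts within `interval_sec` seconds before
    systemd reports 'start-limit-hit'.
    """
    return (
        "[Unit]\n"
        f"StartLimitIntervalSec={interval_sec}\n"
        f"StartLimitBurst={burst}\n"
    )


def dropin_path(unit, name="start-limit.conf", root="/etc/systemd/system"):
    """Return the path of a drop-in file for the given unit."""
    return Path(root) / f"{unit}.d" / name


# Illustrative values only: permit 20 starts per 10 seconds.
content = render_start_limit_dropin(interval_sec=10, burst=20)
target = dropin_path("apache2.service")
```

After writing `content` to `target`, a `systemctl daemon-reload` is needed before the new limits apply.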
clarkbwhoever decided that pyenv installing python 3.13 should install to /usr/local/bin/python3.13.2t is crazy23:27
clarkbok now I shall go back to enjoying the nice weather. Tomorrow I'll try to land things that have been reviewed23:33
corvus\o/ mariadb query timeouts look good now:  Query   |    1 | Sending data | SET STATEMENT max_statement_time=30.0 for ...23:48
corvusi restarted the schedulers and web servers to pick up the fix23:48
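Note: the process-list entry quoted above shows MariaDB's per-statement timeout syntax, `SET STATEMENT max_statement_time=<seconds> FOR <statement>`. A minimal sketch of wrapping a query that way is shown below. It is not Zuul's implementation, and the helper name `with_statement_timeout` is hypothetical; the syntax is MariaDB-specific.

```python
def with_statement_timeout(query, seconds):
    """Prefix a SQL statement with a MariaDB per-statement time limit.

    Hypothetical helper: returns the query wrapped in
    'SET STATEMENT max_statement_time=<seconds> FOR ...'.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Drop surrounding whitespace and a trailing semicolon so the wrapped
    # statement stays a single statement.
    statement = query.strip().rstrip(";").rstrip()
    return f"SET STATEMENT max_statement_time={float(seconds)} FOR {statement}"


wrapped = with_statement_timeout("SELECT 1;", 30)
```

Applying the limit per statement keeps it scoped to the slow query rather than changing a session or server-wide setting.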
