Thursday, 2023-03-30

opendevreviewIan Wienand proposed zuul/zuul-jobs master: container-build : add container_promote_method flag
opendevreviewIan Wienand proposed zuul/zuul-jobs master: remove-registry-tag: role to delete tags from registry
opendevreviewIan Wienand proposed zuul/zuul-jobs master: promote-container-image: use generic tag removal role
opendevreviewIan Wienand proposed zuul/zuul-jobs master: remove-registry-tag: update docker age match
ianw2023-03-29 02:21:38.331815 | TASK [build-wheel-cache : Prevent using existing wheel mirror]02:05
ianw2023-03-29 02:21:38.346571 | wheel-cache-ubuntu-jammy-python3 | changed: 1 line(s) removed02:05
ianw2023-03-29 02:21:38.388891 | 02:05
ianw2023-03-29 04:40:45.052347 | RUN END RESULT_TIMED_OUT: [untrusted :]02:05
ianwthat's annoying02:05
ianwit really must be the next step02:06
ianw was failing, but it's gone green.  and i'd say that relates to the mirror resyncing itself02:10
ianwlooks like it's probably right on the edge02:11
ianwthere's successful builds at 2 hrs 29 mins 20 secs02:11
ianwi think we can re-evaluate our branch list02:13
ianwthe short story is i can't see anything systematically wrong with the wheel builds right now.  i think we need to monitor for the next few days02:13
fungiwith openstack's constraints freezing for stable branches, i wonder if building new wheels for them is necessary02:19
ianwi guess maybe zed branching has added another step02:27
opendevreviewsean mooney proposed openstack/diskimage-builder master: [WIP] add apk element
opendevreviewsean mooney proposed openstack/diskimage-builder master: [WIP] bootloader-alpine-support
opendevreviewsean mooney proposed openstack/diskimage-builder master: [WIP] openssh-server-alpine-support
opendevreviewsean mooney proposed openstack/diskimage-builder master: [WIP] simple-init-alpine-support
opendevreviewsean mooney proposed openstack/diskimage-builder master: alpine support Change-Id: I7db02c1bdc8c6e466eee30a72ee8c70cf8ee0bf5
ianw[iwienand@fedora19 requirements (master)]$ cat /tmp/u-c.txt  | wc -l02:37
ianw[iwienand@fedora19 requirements (master)]$ cat /tmp/u-c.txt | sort | uniq  | wc -l02:37
ianwi'm really not sure why we don't de-dup the u-c across all branches02:37
ianwi have dejavu.  i think we've discussed it all before.  i think because each wheel builds with constraints of its own branch02:55
*** dasm|off is now known as Guest935304:02
fricklermnasiadka: mirror for ubuntu-ports should be in sync again07:18
*** jpena|off is now known as jpena07:18
*** amoralej is now known as amoralej|off07:26
opendevreviewSylvain Bauza proposed openstack/project-config master: Stop using Storyboard for Placement projects
*** amoralej|off is now known as amoralej07:44
fricklerthis is really weird, why does wheel building take 16 mins on bullseye but 2h on jammy?08:00
fricklermaybe we should add some more detailed logging to the build step in order to find out08:01
opendevreviewsean mooney proposed openstack/diskimage-builder master: add apk element
opendevreviewsean mooney proposed openstack/diskimage-builder master: bootloader-alpine-support
opendevreviewsean mooney proposed openstack/diskimage-builder master: alpine support
opendevreviewsean mooney proposed openstack/diskimage-builder master: simple-init-alpine-support
opendevreviewsean mooney proposed openstack/diskimage-builder master: [WIP] openssh-server-alpine-support
mnasiadkafrickler: fantastic10:20
*** amoralej is now known as amoralej|lunch11:13
fungifrickler: the time to build will depend on how many things have prebuilt wheels on pypi for that platform, so if we're building old versions of things that already have wheels for the python version on focal but never got wheels on pypi for the version of python on jammy, that would explain it11:44
*** amoralej|lunch is now known as amoralej12:47
fricklerthere is a config error in the openstack tenant for any idea how to fix that?13:42
fricklersame for pytest-dev/pytest-testinfra13:43
fricklercorvus: maybe zuul should ignore merge-mode for projects that don't import any config?13:44
Clark[m]frickler: I suspect zuul is detecting those projects do something like squash or cherry pick and complaining we do normal merges locally which is out of sync with upstream. We may be able to set that on a per project basis13:44
Clark[m]Merge mode matters just for testing even if you don't do configs because you can build invalid speculative state13:45
fricklerwhat do you mean by invalid state? even if merging is enabled in github, the person applying the PR can still decide to squash or rebase, zuul can neither predict nor avoid that13:50
Clark[m]frickler: merge mode determines how zuul locally merges changes. This should match the merge mode for the project in the code review system to ensure the speculative state matches the actual state as much as possible 14:02
Clark[m]Is there not a GitHub merge mode default for projects?14:02
Clark[m]Yes looks like you can enforce a specific merge strategy in GitHub. I suspect zuul is detecting it is out of sync with projects that have done this14:04
fricklerthere is a default, but you can also allow multiple strategies at the same time, so it doesn't go 100%. the question now is how to find out what is the correct mode for those projects14:09
*** Guest9353 is now known as dasm14:48
corvusthe error is stating that the merge mode in zuul is not one of the possible merge modes selected by the project owners in github14:59
corvusso yeah, we can set that on the project in zuul to make the error go away15:00
fungiis it necessary to check that at all if we don't load any configuration from the project?15:03
fungicurious if it would make sense to have a short-circuit around the check in such cases15:03
opendevreviewBartosz Bezak proposed openstack/project-config master: Stop using Storyboard for Kayobe
fricklerfungi: clarkb mentioned earlier it will still affect speculative merging15:05
fricklerso zuul should at least try to do the most likely variant15:05
clarkbright we want the actual commits under test to look similar to what they will end up looking like upstream when we test with them.15:06
fungioh, got it. i missed that15:07
clarkbsounds like we may not be able to 100% solve that with github's allowance for variance but it knows the current method will never be correct so complains15:07
corvusyeah, user intervention may be required due to the ambiguity15:08
fricklerbut then, whether a PR is merged, rebased or squashed, the resulting code should be the same. so unless a check actually looks at git commits, it shouldn't matter15:15
*** jpena is now known as jpena|off16:35
*** amoralej is now known as amoralej|off16:36
opendevreviewMerged opendev/system-config master: Redirect openstack-specs to git repo on opendev
opendevreviewThales Elero Cervi proposed openstack/project-config master: Add virtualization (virt) repo to StarlingX
fungithat ^ has made me wonder if we've updated the acl examples in infra-manual yet...18:52
funginever mind, our examples in there are simple enough they don't require updating. they just inherit from stuff we already updated19:00
Clark[m]The error is due to pushSignedTag. Did we break that in the checker script?19:02
fungishould be createSignedTag instead19:08
fungiwe updated them all in
fungialso we corrected it in the infra-manual examples19:10
fungiso probably was a copy from an old change19:10
fungifrom more than a year ago19:10
fungii left a comment on the change giving them a heads up19:12
opendevreviewThales Elero Cervi proposed openstack/project-config master: Add virtualization (virt) repo to StarlingX
ianwfungi: do you have some time to discuss the wheel jobs?19:31
ianwi started about 4 things yesterday and just confused myself19:31
fungiianw: sure20:07
opendevreviewThales Elero Cervi proposed openstack/project-config master: Add virtualization (virt) repo to StarlingX
ianwfungi: my first thought is, what are we actually expecting to work20:40
ianwspecifically at
ianwthere seems little point running stein -> ??? on jammy with python3.10, because a bunch of stuff doesn't build20:41
fungii think the intent was that we would install the constraints lists for each maintained stable branch on the platforms where we run jobs for those branches20:42
fungithough given how we freeze constraints for stable branches, i'm not sure even that is necessary unless we're worried about the wheel cache getting wiped and having to replace all the content from scratch20:43
fungiobviously installing constraints for stein on jammy is absurd20:43
fungii'm just increasingly uncertain we should bother to build any besides master20:44
fungiif we do intend to continue building wheels for maintained stable branches, we should probably do each branch's constraints file in a separate job anyway, so that we can select what platform(s) to do that on20:45
ianwyeah, things like centos-8 seem only to be used by swift20:46
fungiit's probably also worth bringing this topic up with prometheanfire and tonyb to get their perspectives20:47
tonybhow far back do I need to read?20:47
ianwabout 10 lines :)20:48
fungitonyb: the topic is periodically building and caching wheels for stable branches20:48
ianwi think we should be considering the case of "afs exploded" and we want to get back20:48
ianwi.e. probably should say "these platforms have previously built the wheels, which are now pushed so forget about it"20:49
tonybthat's something frickler has been looking at off and on.20:49
fungiif we want to cover that case reliably, we should probably have jobs for each branch+platform+arch we expect to continue running jobs on20:49
ianwright, i guess i got to making that list, then just left more confused :)20:50
tonybyeah I'd like to see that.20:51
fungibut also it seems slightly wasteful to keep rerunning these stable branch jobs which are going to at best just keep redundantly rebuilding the same frozen set of packages every day20:51
fungion the off-chance that the content they're building might disappear20:51
tonybweekly? or monthly would probably suffice for stable branches20:51
ianwlike for example, we've got swift running master jobs on centos-8-stream20:51
ianwbut building master global requirements on 8-stream with python3.6 doesn't seem like it can work20:52
fungianother aspect of this is, like with our decisions about what to mirror, these are a convenience to help speed up jobs. if there are very few jobs using wheels for master branch constraints on centos-8-stream nodes, then maybe we just don't bother prebuilding and caching those20:53
ianwand i wonder the opposite, perhaps we have stable branch jobs running on jammy.  again most of the stable branches g-r have stuff that isn't py3.10 happy20:53
tonybI really don't think it can unless we add more annotations to the constraints files and are prepared to cap setuptools etc for the purposes of building 20:53
ianwi think the problem with branch / platform jobs is the multiplication20:55
tonybthere is another aspect, iirc some system libraries in focal/jammy have moved forward to the point that you can't build the wheel and the only reason we have a working CI is because of that cache20:55
ianwwe have in the requirements repo rocky -> 2023.1 branches == 1020:56
fungii would be fine only building master branch constraints for the platforms listed in the openstack pti, or perhaps even simply a subset that most of the jobs run on, and have some notes on how to similarly regenerate wheels for stable branches if the cache suffers a catastrophic loss20:56
tonybI'd like to see us rebuild the cache for stable branches on some cadence20:57
fungistable branches run far fewer jobs, and if there are no prebuilt wheels for those jobs to use then yes they'll take longer or maybe highlight that a project has failed to list a binary dependency required for some of their deps which are only available as sdists20:57
tonybthat way if there is a catastrophic loss we have some confidence we can rebuild 20:57
ianw... and 16 platforms with x86+arm64 == 160 potential jobs20:57
tonybfungi at the moment if there wasn't a cache most of the stable branches wouldn't even run20:58
fungithey'd hit timeouts? or just that they're missing headers for some libs that extensions need to build against?20:58
tonybI can publish a patch to show that tomorrow 20:58
tonybit's that the system libraries have moved forward and the python bindings haven't 20:59
tonybfor example if you have time build python-nss on jammy20:59
fungiwhy would stable branch jobs be running on jammy?21:00
tonybthe wheel fails and the only way to fix it is use code from hg21:00
fungioh, stable/2023.1 runs on jammy i guess21:00
tonybfungi I'd need to look21:00
jrosserlooks like both kolla and osa deploy Z on jammy21:01
tonybfungi you can just grab a jammy system/container and try pip install --user python-nss21:01
tonybjrosser: do they currently benefit from a wheel cache?21:02
jrosserosa certainly uses the built wheel cache21:03
tonybWRT to python-nss I'm working with $people to get it fixed and published but it's going21:03
jrosserand arm jobs are completely impossible without21:03
fungibut also the version of python-nss isn't changing in stable branch constraints21:04
tonybjrosser: interesting.  I'll have to poke at the logs.21:04
jrosserand by impossible i mean the wheel build time lands in the job instead and it's an inevitable timeout21:04
ianwhrm, i actually think the arm64 build isn't handling 2023.1 branch right21:05
tonybfungi: true. but 1.0.1 isn't buildable today.  so if we didn't have that in a wheel cache somewhere I suspect anything that installs it would fail21:05
ianwthat lists the branches and does tail -2.  but 2023.1 is going to be at the top21:05
fungiso the question is do we want automation rebuilding python-nss for stable/zed periodically (daily, weekly) or just a procedure for rebuilding it if something catastrophic happens to our cache? if the latter, is some days or weeks without a wheel cache for stable/zed disastrous? are there alternatives, like backing up the cache content?21:06
fungido we really need to be able to rebuild the cache from sdists, or is restoring from a file backup sufficient?21:07
ianw... yes it's only iterating yoga/zed
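A minimal sketch of the ordering problem ianw describes, assuming the job sorts branch names lexically before taking the last two. The job's actual command is not shown in this log, and `release_sort_key` is a hypothetical helper, not something from the wheel-build role.

```python
def release_sort_key(branch):
    """Order OpenStack-style branches: named releases first, then dated ones."""
    release = branch.rsplit("/", 1)[-1]
    # Assumption: named releases (yoga, zed) all predate numbered ones (2023.1).
    return (release[0].isdigit(), release)


branches = ["stable/yoga", "stable/zed", "stable/2023.1"]

# Plain lexical order, as a sort followed by `tail -2` would see it:
# digits sort before letters, so 2023.1 lands at the top and is dropped.
lexical_latest = sorted(branches)[-2:]

# Release-aware order keeps 2023.1 at the end of the list.
aware_latest = sorted(branches, key=release_sort_key)[-2:]

print(lexical_latest)
print(aware_latest)
```

With a lexical sort the two "latest" branches come out as yoga and zed, which matches the behaviour reported above; the release-aware key selects zed and 2023.1 instead.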
tonybpragmatically the latter would be fine.21:07
tonybin an ideal world I'd like the former21:08
ianwfungi: yeah, the gitops-y in me thinks that it's best if we have it codified somewhere, at least21:08
fungiif backups are a possibility, then we could really just run master branch wheel build jobs with the expectation that stable branch constraints don't change in ways that pull in new versions of things not available already as wheels21:09
ianwi guess arm64 has built everything in master, which eventually ended up in 2023.1 anyway, which is why this all hangs together, and what fungi is saying21:09
ianwas long as master is building every day, and we keep appending to the cache, then we don't have to worry about branches21:10
fungiand one daily backup is probably less costly resource-wise than maintaining 160 different wheel building jobs21:10
ianwi guess we would not maintain 160, but we'd have a matrix and have to at least decide which of the 160 to maintain21:11
fungiand that matrix would, itself, become a maintenance burden too21:11
tonybianw: does the current builder code start with an empty cache 21:11
tonybsorry I'm on my phone and can't get to it quickly 21:11
fungiyeah, the builder runs on a fresh vm and doesn't use our central cache21:12
fungiit's just a "normal" zuul job these days21:13
ianwsorry have to school run, back in 2021:15
ianw has some discussion to21:16
fungithe main optimization we added is that after it installs the constraints list, the job scans the pip install log to build a list of anything which was downloaded from pypi and cleans it out of the cache before we copy the remainder for publishing21:16
fungiso that we don't redundantly publish wheels which are already available on pypi21:16
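The filtering step fungi describes could look roughly like the sketch below. It assumes the pip log contains lines of the form "Downloading <url-or-filename>.whl"; the real job's log format and implementation may differ, and the function names and sample filenames are made up for illustration.

```python
import re

# Assumption: pip logs each fetched wheel on a line containing "Downloading",
# followed by either a full URL or a bare filename.
DOWNLOAD_RE = re.compile(r"Downloading\s+(?:\S*/)?([^/\s]+\.whl)")


def downloaded_wheels(log_text):
    """Return the set of wheel filenames the log says came from the index."""
    return {match.group(1) for match in DOWNLOAD_RE.finditer(log_text)}


def wheels_to_publish(cache_files, log_text):
    """Keep only wheels that were built locally, not downloaded."""
    from_index = downloaded_wheels(log_text)
    return sorted(name for name in cache_files if name not in from_index)


sample_log = """\
Collecting foo===1.0
  Downloading https://files.pythonhosted.org/packages/ab/cd/foo-1.0-py3-none-any.whl (12 kB)
Collecting bar===2.0
  Downloading bar-2.0-cp310-cp310-manylinux1_x86_64.whl (34 kB)
Building wheels for collected packages: baz
"""
sample_cache = [
    "foo-1.0-py3-none-any.whl",
    "bar-2.0-cp310-cp310-manylinux1_x86_64.whl",
    "baz-3.0-cp310-cp310-linux_x86_64.whl",
]
print(wheels_to_publish(sample_cache, sample_log))
```

Only the locally built `baz` wheel survives the filter in this sample, so nothing already hosted on the index would be republished.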
fungiwe had a few incidents in the past where a wheel was yanked from pypi and our jobs didn't see that because we had a copy of it in our central cache21:17
tonybfungi: thanks.  tomorrow I'll try to make time to check the last logs 21:25
fungii think one of the goals of rebuilding for all branches was that we could as a next step start removing anything from the cache that we hadn't built in the latest run, but that was before we had 10 open branches for openstack, and a second processor architecture, and as many distros in the pti21:27
fungiso i no longer think that goal is especially reasonable21:28
fungii suppose one thing we *can* probably do safely is a cleanup pass of the current cache to 1. remove any wheels which are also available on pypi that we cached before the recent optimizations, and 2. remove any wheels which are for cpython versions or platforms we no longer run jobs on21:45
fungithat might reduce what needs backing up, if nothing else21:45
fungiand *maybe* we can check the versions in the cache against version numbers in constraints files, removing any that ended up cached because they once appeared as a constraint but then the constraint was changed while still in master21:51
fungiall of that seems probably scriptable21:51
fungiand could still get us close to the goal of having a cache that contains only packages we actively need for jobs21:52
fungiand those sorts of cleanup operations can happen asynchronously and are comparatively cheap compared to repeatedly rebuilding every last thing21:57
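One of the cleanup rules above (dropping wheels for cpython versions no longer used) might be scripted along these lines. This is only a sketch: it relies on the standard wheel filename layout ending in python tag, abi tag and platform tag, and the set of active tags and the filenames are hypothetical examples rather than the real platform list.

```python
def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def is_stale(filename, active_cpython_tags):
    """True if the wheel only targets cpython versions that are not active."""
    python_tag, _abi, _platform = wheel_tags(filename)
    tags = python_tag.split(".")
    cpython_tags = [tag for tag in tags if tag.startswith("cp")]
    if len(cpython_tags) != len(tags):
        # A generic tag such as py3 is present, so keep the wheel.
        return False
    return not any(tag in active_cpython_tags for tag in cpython_tags)


# Hypothetical set of interpreters still in use on test nodes.
active = {"cp38", "cp310"}
for name in (
    "foo-1.0-cp36-cp36m-linux_x86_64.whl",
    "foo-1.0-cp310-cp310-linux_x86_64.whl",
    "bar-2.0-py3-none-any.whl",
):
    print(name, is_stale(name, active))
```

Platform tags could be filtered the same way; abi3 wheels would need extra care, since they can work on newer interpreters than the one named in the python tag.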
ianwas step 1 can go22:03
ianwroot@mirror01:/afs/ du -hs .22:04
ianwtonyb: i must be missing something about python-nss, it doesn't seem to be in upper-constraints?22:12
tonybhmm maybe it's only on older branches.22:15
tonybthat'd explain some of my confusion 22:15
tonybyup it looks like it isn't needed in > zed22:16
ianwwhat's weird is i don't see it in
ianwit is in the log @
ianw[iwienand@fedora19 python-nss===1.0.1]$ cat stderr 22:19
ianw  error: subprocess-exited-with-error22:19
ianw  22:19
ianwit suggests to me it's *never* built in these jobs :/22:19
ianw is the full output22:20
ianw(that just came from the complete logs kept at
ianw is the job22:21
clarkbfungi: ya I think having 10 branches instead of 3 makes a huge difference here22:21
ianwit *has* built on focal ->
ianwso this comes down to the matrix thing again.  yoga upper-constraints.txt can't install on jammy essentially22:24
ianwfungi: is it perhaps fair to say that we don't care about any .whl in our mirror that has -none-any.whl?22:26
clarkbianw: I think those may still slightly speed up the install process but probably not to the extent that we'd need them22:27
clarkb(since pip will make a wheel of those packages first then install them when looking at an sdist)22:27
ianwis anything on pypi only an sdist though?22:28
fungisome things22:28
fungiyeah, i don't see significant harm in removing all pure python wheels, but would probably instead just check to see if what we have is already on pypi and remove it if so22:29
clarkbI ran into something the other day redoing my zuul env22:29
clarkbthey definitely exist22:29
ianwyeah i guess pypi does no building at all for you22:29
ianwit looks like might answer that for us22:29
fungibasically things which seem pretty safe to clean up are the platforms we no longer use, the wheels which are identical to ones on pypi, and wheels for versions of things which appear in no current constraints lists22:30
ianwif we fed it the package, then looked at the version from upper-constraints.txt, and checked if there is a .whl available22:30
ianwfungi: and then, essentially only build master and only on the platforms we expect to be able to run master?22:31
fungiand start backing up the cache22:31
fungiso that we can restore it rather than having to find a way to rebuild it22:32
fungiand figure out some periodic (semi-automated or automated) cleanup process to prune unneeded things based on the above rules22:32
ianwok, rough theory is to cat all the upper-constraints.txt together and sort | uniq it, then run it through something using pypi-simple to see if there's a wheel available upstream22:33
ianwif yes; then candidate for removal22:33
ianw(cat all together from all branches in requirements repo)22:33
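The "cat, sort, uniq" step across branches can be sketched in Python as below, assuming each branch's file uses `name===version` pins with optional `;` environment markers. The sample contents are invented, and the follow-up check against the upstream index is deliberately left out.

```python
def parse_constraints(text):
    """Return the set of (name, version) pins found in one constraints file."""
    pins = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        requirement = line.split(";", 1)[0].strip()  # drop environment markers
        name, sep, version = requirement.partition("===")
        if sep:
            pins.add((name.strip().lower(), version.strip()))
    return pins


def merge_branches(branch_texts):
    """Union the pins from every branch, de-duplicated and sorted."""
    merged = set()
    for text in branch_texts:
        merged |= parse_constraints(text)
    return sorted(merged)


# Invented sample data standing in for two branches of the requirements repo.
yoga = "python-nss===1.0.1\ntaskflow===4.6.4\n"
zed = "# comment\ntaskflow===4.6.4\ntaskflow===5.0.0;python_version=='3.10'\n"
print(merge_branches([yoga, zed]))
```

Each resulting (name, version) pair is then a candidate to look up upstream; a pin whose wheel exists there would be a candidate for removal from the cache.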
clarkbwhat does removing wheels we already have get us? Smaller archive that is easier to backup?22:34
fungiwe need to compare to what we've got cached already though, right? how do we determine if there's a wheel upstream which satisfies the python version on focal but not jammy?22:34
fungiclarkb: yeah, reducing the footprint for backup purposes. i'm guessing (without actually having checked) that we likely have more cruft than actually used files22:35
ianwwell if it's a py3-none presumably it's fine everywhere?22:35
fungifor those specifically, yes22:35
fungiif we're trying to figure out manylinux1 and abi3 stuff it gets more complicated22:36
fungii was assuming we'd just make a list of all the files in our cache and then check pypi to see whether each one exists there or not22:37
ianwhrm, a manylinux1 would essentially work on all our platforms22:37
fungia manylinux1 for the right python version would22:38
fungialso manylinux1 for abi3 would work on most of our platforms, but those with old pip may not support checking abi322:38
ianwfungi: how do we query for a specific .whl though?  it seems like it's done by hash22:38
fungithose hashes are checksums, if we want to go about it that way. though i was thinking query pypi's (simple or json) api22:39
fungithe pypi simple api gives you a list of filenames22:39
ianwright, ours don't match22:39
fungiit's long enough to be a sha512sum, but doesn't actually match the sha512sum of what i download22:42
ianwok, i see22:43
fungiapparently BLAKE2b-25622:43
ianwcompare to, say
fungialso i meant sha256sum (which it also wasn't)22:44
ianwso i think all of those .whls for our taskflow are upstream; that whole thing could go22:45
ianwone interesting thing might be how much this hides problems because we don't specify the data-requires-python22:47
fungifor some reason i thought we were excluding those from publication by scanning the pip log and identifying what was downloaded from pypi, but maybe that check has quietly broken22:47
ianwfungi: yeah, we *should* be22:47
ianwsome of them have date 2023-01-07 which is definitely after that check22:48
fungibut anyway, we should be able to either check the pypi simple api for a file list and compare filenames to decide what to remove, or generate blake2b-256 checksums of what we have and look for those on pypi though that sounds like more work when we ought to be able to assume same filename means same file22:48
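Both options fungi mentions are cheap to sketch with the standard library. The snippet assumes the upstream file list has already been fetched from the index by some other means (not shown), and the helper names and filenames are illustrative only.

```python
import hashlib


def blake2b_256_hex(data):
    """BLAKE2b with a 32-byte (256-bit) digest, as a hex string."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def removal_candidates(cached_files, upstream_files):
    """Filenames present both in the local cache and in the upstream listing."""
    return sorted(set(cached_files) & set(upstream_files))


# Invented filenames for illustration.
cached = ["taskflow-5.0.0-py3-none-any.whl", "baz-3.0-cp310-cp310-linux_x86_64.whl"]
upstream = ["taskflow-5.0.0-py3-none-any.whl", "taskflow-5.0.0.tar.gz"]
print(removal_candidates(cached, upstream))
print(len(blake2b_256_hex(b"example wheel bytes")))
```

The filename comparison rests on the assumption stated above, that the same filename means the same file; the digest helper is there for cases where that needs verifying.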
ianwi think this is probably worth doing, as what's left over will be an interesting list of what we *actually* need to build for speed22:50
fungii do wonder why warehouse settled on blake2b instead of sha2 for its primary checksumming22:50
fungiclarkb: a related reason to clean as much cruft as possible out of the cache is not just because it will take up space for backups but also because the backup process would be reading this from afs which means it could be pretty slow for a large amount of data22:52
fungithe smaller we can keep that cache, the faster backups will complete, the less load we'll put on afs servers, et cetera22:52
Clark[m]Hem when they broke the pip cache it was sha256 that must be a new change?23:36
Clark[m]Silly phone auto complete... 'hrm'23:39
fungimaybe pip uses sha2-256 and warehouse uses blake2b-256? or yeah, maybe that was the breakage23:52
