kozhukalov | Hi infra team. Can you please help me understand the following? I am using the same docker daemon configuration on all nodes for all jobs, which includes "http://mirror.bhs1.ovh.opendev.org:8082" as a registry mirror, but some jobs fail with toomanyrequests errors. They fail to pull an image that all the other jobs use (openstackhelm/openstack-client:2024.2). | 19:56 |
kozhukalov | What is the retention policy on this proxy? With a mirror proxy configured, all images must be pulled through the proxy, and it fails on pulling images that were used only a couple of hours ago and should still be available on the mirror. I am kind of forced to recheck, because we run about 15 jobs for different test cases and some of them sporadically fail due to this issue. | 19:56 |
kozhukalov | here is the example https://zuul.opendev.org/t/openstack/build/92aa8921e80840998461c58b3802b89d | 19:57 |
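The daemon configuration kozhukalov describes would look roughly like this (a sketch; the mirror URL is taken from the log above, everything else is standard `/etc/docker/daemon.json` syntax):

```json
{
  "registry-mirrors": ["http://mirror.bhs1.ovh.opendev.org:8082"]
}
```

With this in place, `docker pull` tries the mirror first for Docker Hub images and falls back to Docker Hub directly if the mirror fails, which is why a cache miss on the proxy can still surface as a rate-limit error.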
clarkb | kozhukalov: it is a caching proxy, not a true mirror. Requests are cached for a maximum of 24 hours but may be evicted earlier by the pruning system in FIFO order | 19:59 |
clarkb | we disabled use of the proxy cache for docker hub by default in opendev because, with Docker Hub's more aggressive rate limits, funneling all requests through the proxy cache made us more likely to trigger the rate limits than having each node do its own requests | 19:59 |
clarkb | one thing to note though is that a year or two ago docker hub added ipv6 support and they treat an entire /64 as a single IP for the purposes of rate limiting | 20:00 |
clarkb | unfortunately our cloud providers do not give each VM a separate /64 | 20:00 |
clarkb | our current suggestion is largely that people stop using docker hub and/or authenticate with docker hub. If you go the authentication route, your rate limit is applied to that account, so it isn't necessarily a solution if you run a bunch of jobs all at once. You also need to be careful that the credentials aren't exposed (say, via a check job) and then used to | 20:01 |
clarkb | publish images nefariously (so read-only tokens, if possible, are ideal) | 20:01 |
clarkb | other ideas that have come up include: using a buildset registry to cache all of the images that all jobs in the buildset need. Then your 15 jobs don't pull the images 15 times; instead they are pulled once into the buildset registry and fetched from there 15 times | 20:03 |
clarkb | zuul/zuul-jobs has a new role that can be used to mirror images from one registry to another, which may be useful for this. It may also be useful for mirroring images from docker hub elsewhere | 20:03 |
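Registry-to-registry mirroring of the kind clarkb mentions can be done by hand with skopeo (assuming skopeo is installed and you are logged in to the destination registry; the image names below are placeholders, not taken from the log):

```shell
# Copy a tag, including all architectures, from Docker Hub to quay.io.
# Source and destination names here are illustrative examples only.
skopeo copy --all \
    docker://docker.io/openstackhelm/openstack-client:2024.2 \
    docker://quay.io/yourorg/openstack-client:2024.2
```

Jobs can then pull from the quay.io copy, avoiding Docker Hub's rate limits entirely for that image.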
kozhukalov | For OpenStack-Helm, disabling the proxy is a bad option, because we deploy k8s with all its auxiliary pieces like CNI plugins and then deploy OpenStack on top. So every node has to download up to 30 images during deployment. | 20:04 |
clarkb | right, from a practical perspective what docker hub has done is going to lead to us doing many more requests overall | 20:05 |
clarkb | but also using a buildset registry as I described above can mitigate a large portion of that problem | 20:05 |
clarkb | unfortunately the docker image protocol was never really built to make caching straightforward, and the registry tools are generally not good at being caches (because they don't offer pruning) | 20:06 |
clarkb | the whole system was designed to talk to a central repository, which worked great for years until they decided it was no longer economical, and now here we are | 20:06 |
clarkb | poettering even built an entirely different system because of this problem; I'm trying to remember its name now | 20:06 |
kozhukalov | > If you go the authentication | 20:08 |
kozhukalov | If I decide to go this way, how can I store the credentials? As far as I understand, zuul secrets are only allowed in the gate pipeline and not in the check (unreviewed) pipeline | 20:08 |
clarkb | right, if they are in the check pipeline you have to assume they can be disclosed, so there isn't much purpose in trying to keep them secret. However, this isn't an entirely accurate characterization of the behavior. They have to be stored in post-review contexts or in trusted repos (there is no speculative execution of changes to trusted repos, so everything is post-review in | 20:10 |
clarkb | them, essentially) | 20:10 |
clarkb | oh, another thing worth noting about the caching behavior is that it does respect the cache-control headers | 20:22 |
clarkb | one thing that may be happening is that docker hub is setting a low cache-control value on manifest files (despite them still being sha256sum addressed, iirc), so we end up refetching them more often than necessary | 20:22 |
clarkb | and now that rate limits are based on manifest fetches and not object file pulls this creates problems for people trying to cache | 20:23 |
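The cache math behind this is simple: a cached response is reusable only while its age is under the Cache-Control max-age, so a low max-age on manifests forces frequent refetches, and each refetch now counts against the manifest-based rate limit. A minimal sketch of that freshness check (an illustration, not Apache's actual cache logic):

```python
import re


def max_age(cache_control):
    """Extract the max-age= value (seconds) from a Cache-Control header, if any."""
    m = re.search(r"max-age=(\d+)", cache_control or "")
    return int(m.group(1)) if m else None


def is_fresh(fetched_at, cache_control, now):
    """True if a response fetched at `fetched_at` may still be served from cache."""
    age = max_age(cache_control)
    return age is not None and (now - fetched_at) < age


# A manifest served with max-age=60 must be refetched after a minute,
# even though sha256-addressed content can never change.
print(is_fresh(0, "max-age=60", 30))   # → True  (still fresh)
print(is_fresh(0, "max-age=60", 90))   # → False (stale: refetch costs a rate-limited request)
```

This is why a short max-age on an immutable, content-addressed manifest is pure waste for downstream caches.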
kozhukalov | Thanks a lot for this detailed explanation. I think the best option is to try to get as many images as possible from other registries | 20:33 |
fungi | yeah, some projects have started to mirror their dependencies from Docker Hub to quay and then point their jobs at the quay copies | 20:40 |