*** michael-beaver has quit IRC | 00:49 | |
*** swest has quit IRC | 01:08 | |
*** swest has joined #zuul | 01:22 | |
*** Goneri has quit IRC | 01:27 | |
*** cdearborn has quit IRC | 01:58 | |
*** saneax has quit IRC | 02:21 | |
*** bhavikdbavishi has joined #zuul | 03:17 | |
*** bhavikdbavishi1 has joined #zuul | 03:20 | |
*** bhavikdbavishi has quit IRC | 03:22 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 03:22 | |
*** bhavikdbavishi has quit IRC | 04:00 | |
*** bhavikdbavishi has joined #zuul | 04:13 | |
*** evrardjp has quit IRC | 04:35 | |
*** evrardjp has joined #zuul | 04:35 | |
*** bhavikdbavishi has quit IRC | 04:37 | |
*** zxiiro has quit IRC | 05:29 | |
*** toabctl has joined #zuul | 06:13 | |
*** sgw has quit IRC | 06:44 | |
openstackgerrit | Albin Vass proposed zuul/zuul master: WIP: enable installing ansible collections https://review.opendev.org/723071 | 09:06 |
---|---|---|
avass | ^ I'm thinking something like that | 09:06 |
openstackgerrit | Albin Vass proposed zuul/zuul master: WIP: enable installing ansible collections https://review.opendev.org/723071 | 09:17 |
*** tosky has joined #zuul | 09:36 | |
*** dpawlik has joined #zuul | 09:59 | |
*** avass has quit IRC | 10:16 | |
*** avass has joined #zuul | 10:31 | |
*** dpawlik has quit IRC | 10:36 | |
*** avass has quit IRC | 11:13 | |
*** bhavikdbavishi has joined #zuul | 11:19 | |
*** bhavikdbavishi1 has joined #zuul | 11:22 | |
*** bhavikdbavishi has quit IRC | 11:24 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 11:24 | |
*** avass has joined #zuul | 11:36 | |
*** avass has quit IRC | 12:27 | |
*** bhavikdbavishi has quit IRC | 13:13 | |
*** bhavikdbavishi has joined #zuul | 13:14 | |
*** bhavikdbavishi has quit IRC | 13:27 | |
fungi | zuul-maint: per discussion in #opendev, we may have a recent regression in git repository handling on mergers/executors causing zuul to be unable to find some shas | 13:34 |
fungi | an example traceback: http://paste.openstack.org/show/792699/ | 13:34 |
fungi | it could just be related to maintenance we performed yesterday, but we were seeing it from an executor which was rebooted prior to that, and mnaser reported independently running into a similar issue in his deployment too | 13:35 |
mnaser | fungi: i wonder if this has to do with the recent stuff merged with git gc | 14:02 |
mnaser | https://opendev.org/zuul/zuul/commit/79e50c04103098632de2ca47b18e08931925963b that is | 14:02 |
mnaser | so maybe its just gc-ing too aggressively and killing stuff we need | 14:03 |
mnaser | im just speculating though, i don't really understand all these | 14:03 |
mnaser | so it could very well be race-y with the gc stuff | 14:05 |
-openstackstatus- NOTICE: Zuul is currently failing some jobs with MERGER_FAILURE, this needs investigation by OpenDev team. Please refrain from rechecking until we give the all-clear. | 14:19 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Revert "Ensure correct cleanup on repo update and reset" https://review.opendev.org/723098 | 14:25 |
mnaser | tristanC: are you also running into the same issue in your environments? | 14:26 |
tristanC | mnaser: nop, just noticed frickler identified that change on the path of the exception, and looking at the code it seems like it is likely able to cause such issue | 14:27 |
mnaser | tristanC: ah ok, i thought if you were runinng into it too, it would be an extra signal | 14:27 |
mnaser | but maybe not | 14:27 |
corvus | http://paste.openstack.org/show/792699/ http://paste.openstack.org/show/792708/ are relevant | 14:33 |
corvus | i'm unfamiliar with "cleaning empty ref dir" | 14:34 |
corvus | that's the other change isn't it? | 14:35 |
corvus | the one that merged on april 15 | 14:35 |
corvus | mnaser: given that you say you've been seeing this for a while, and i don't think we had been running either change, maybe that's the one that's more suspect? | 14:36 |
corvus | tristanC: ^ yeah that looks suspicios | 14:36 |
mnaser | yeah i do feel like it is more likely that might be an suspect | 14:36 |
mnaser | corvus: yeah and i wonder if the container images have the issue because the git version is different | 14:37 |
tristanC | when testing the merger cleanup change, i verified git continued to work when either .refs/heads or .refs/tags where missing, but when both are missing, then .refs is removed and that makes git unhappy | 14:37 |
corvus | tristanC: ah, so we might be able to roll forward with a patch for this | 14:38 |
mnaser | so what we need is to make sure we delete refs/heads and refs/tags but NOT refs/ ? | 14:39 |
corvus | maybe just patch it to, you know, not delete '.git/refs' and see if that fixes it | 14:39 |
tristanC | corvus: yes, that seems to work, let me propose something | 14:39 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: merger: ensure .git/refs is not removed as a leaked dirs https://review.opendev.org/723104 | 14:42 |
tristanC | the fix may be ^, i'll try to confirm with a unittest | 14:43 |
corvus | tristanC: that looks great, thanks! | 14:45 |
corvus | let's go aheand and merge that, and if we add a unittest as followup if we're able | 14:46 |
mnaser | ya, lgtm basic eye inspection | 14:46 |
corvus | mnaser: thanks! | 14:46 |
corvus | mordred: ^ | 14:46 |
mordred | ++ | 14:49 |
tristanC | corvus: though i'm not able to reproduce this easily... how can a merger repo doesn't have heads or branches? | 14:55 |
tristanC | heads or tags* | 14:55 |
corvus | tristanC: they may all be packed refs? | 14:56 |
corvus | and packed refs are kept in the .git/packed-refs file | 14:56 |
mordred | corvus: could that maybe be the state if you clone a repo immediately after a gc? | 14:58 |
tristanC | corvus: is there a git command or merger function i can use to reproduce that state? | 14:58 |
corvus | tristanC: so in a unit test, you may want to run "git pack-refs" after the repo is setup and before starting a merge | 14:58 |
mordred | ah - pack-refs | 14:58 |
corvus | tristanC: i think maybe "git.pack_refs()" would do it -- using the gitpython magic git commands syntax | 14:59 |
tristanC | no luck, AttributeError: 'Repo' object has no attribute 'pack_refs' | 15:00 |
*** jlk has quit IRC | 15:00 | |
tristanC | or repo.git maybe? | 15:01 |
corvus | tristanC: yeah try repo.git.pack_refs | 15:02 |
*** pots has quit IRC | 15:06 | |
*** pots has joined #zuul | 15:06 | |
tristanC | corvus: here is the test i'm trying to use to reproduce the original issue: http://paste.openstack.org/show/792710/ and this still works without https://review.opendev.org/723104 | 15:07 |
corvus | tristanC: i don't see anything obvious, but you can run with "KEEP_TEMPDIRS=1" and inspect the repos on disk | 15:15 |
mnaser | btw fwiw there is another suggestion for the socat workaround when pushing to ipv6 and its to add things to /etc/hosts | 15:19 |
mnaser | maybe we can throw something into the hosts file instead of running socat and all the stuff that entails, unless that's been explored | 15:19 |
mordred | mnaser: we don't always have appropriate access to add things to /etc/hosts | 15:22 |
mnaser | mordred: ah true, i forgot about that | 15:22 |
mordred | mnaser: you know what would be great? if docker would land the patch fixing the bug | 15:22 |
mordred | :) | 15:22 |
mnaser | mordred: dont we have some mirantis friends we can ping now? :P | 15:23 |
mordred | or if skopeo would be willing to fix it rather than insisting on keeping bug compat | 15:23 |
mordred | mnaser: that's a great point | 15:23 |
mordred | clarkb: ^^ maybe we should ping sergey and see if he knows how to push on that issue for us? | 15:24 |
tristanC | corvus: oh, it seems like https://review.opendev.org/723104 only affects git version < 2.13. The docker.io/zuul/zuul-executor has git 2.20 | 15:28 |
mordred | tristanC: ah - yes. this is a test that will really only properly trigger on xenial | 15:29 |
tristanC | is ze01 using the docker image? | 15:29 |
corvus | no, opendev executors are still pip | 15:29 |
tristanC | mnaser: iiuc you also had http://paste.openstack.org/show/792699/ , was this with docker.io/zuul/zuul-executor? | 15:30 |
mnaser | tristanC: i couldnt find in my logs a reference but i had similar behaviour, as i restarted things lately to try and see what the issue was, but i do use the z-e images | 15:31 |
mnaser | so i guess i never really ran into that, it must be something else | 15:31 |
tristanC | corvus: so, forcing an old git version, without https://review.opendev.org/723104 , i got the merger to remove `.git/refs` , but that didn't cause the ValueError of http://paste.openstack.org/show/792699/ | 15:35 |
corvus | tristanC: well, that's disappointing | 15:37 |
tristanC | here is the test and logs: http://paste.openstack.org/show/792712/ | 15:37 |
*** bhavikdbavishi has joined #zuul | 15:37 | |
corvus | tristanC: what happens if you call pack_refs on the work_repo? | 15:40 |
tristanC | corvus: no difference. So perhaps it is because the test doesn't simulate the origin.refs of a real change, e.g. https://opendev.org/zuul/zuul/src/branch/master/zuul/merger/merger.py#L359 | 15:43 |
tristanC | anyway, i think https://review.opendev.org/723104 is still relevant, because without it, the workspace clone are unusable, further git command fails with `fatal: not a git repository (or any of the parent directories): .git` | 15:44 |
corvus | tristanC: yeah, we should probably merge that, deploy it on opendev, then if it doesn't fix things, start reverting those 2 changes and figure out what we need to really debug them | 15:47 |
*** irclogbot_3 has quit IRC | 15:55 | |
mordred | ++ | 15:59 |
*** irclogbot_1 has joined #zuul | 16:00 | |
mnaser | corvus: im rechecking the patch to avoid deleting .git/refs to get it queued up to merge again | 16:02 |
fungi | the zuul tenant's gate pipeline doesn't require a clean check result to enqueue | 16:07 |
mnaser | yep so that should send it straight to gate, so that should be nice | 16:08 |
corvus | mnaser: looks good, thanks | 16:08 |
AJaeger | Is that a Zuul error? " Pipelines may not be defined in untrusted repos.. " See https://zuul.opendev.org/t/zuul/config-errors - and https://opendev.org/openstack/project-config/src/branch/master/zuul/main.yaml#L1619 ; we include without config... | 16:10 |
mordred | AJaeger: that seems weird | 16:12 |
mnaser | and a little concerning | 16:13 |
mnaser | could this be bad zuul merger behaviour again popping up? | 16:14 |
mnaser | https://opendev.org/zuul/zuul/src/branch/master/zuul/configloader.py#L1846-L1855 | 16:19 |
fungi | that gets raised in zuul.configloader.TenantParser.filterUntrustedProjectYAML() called from _loadTenantYAML() method of the same class which is in turn called by the fromYaml() method of that class | 16:19 |
mnaser | interesting i'm not seeing any logic that might actually filter out the fact pipelines are not being loaded | 16:20 |
fungi | which is called from the parseConfig() method | 16:20 |
fungi | but i think those errors are going to be coming from the scheduler not the merger, right? | 16:21 |
fungi | errors from mergers don't bubble up into the status api like that | 16:22 |
mnaser | correct, i think ti is from the scheduler when its loading the config and noticing that an untrusted repo contains pipelines | 16:22 |
mnaser | but it also seems like it doesn't take into factor the fact that it may not be included in the config | 16:22 |
fungi | yeah, looks like _loadDynamicProjectData() jumps straight to filterUntrustedProjectYAML() if the project is untrusted | 16:25 |
*** bhavikdbavishi1 has joined #zuul | 16:25 | |
*** bhavikdbavishi has quit IRC | 16:25 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 16:25 | |
*** irclogbot_1 has quit IRC | 16:26 | |
fungi | so unless the with configuration_exceptions() check in filterUntrustedProjectYAML() is somehow skipping projects included without config, i agree it looks like it just raises that exception for any untrusted project data with a pipelines attribute | 16:27 |
fungi | and yeah, configuration_exceptions() seems to just be for making the exception classes available | 16:29 |
*** irclogbot_1 has joined #zuul | 16:30 | |
*** evrardjp has quit IRC | 16:35 | |
*** evrardjp has joined #zuul | 16:35 | |
AJaeger | do we need to revert https://review.opendev.org/723082 ? | 16:36 |
fungi | AJaeger: not sure, it looks like it may have simply exposed a latent bug in zuul | 16:38 |
frickler | tristanC: regarding 723104 I think we should even go one step further. "git init" creates .git/refs/{heads,tags}, so we likely shouldn't be removing these, either | 16:42 |
tristanC | frickler: it seems like git works fine without heads or tags subdir | 16:47 |
tristanC | corvus: so i can't tell if https://review.opendev.org/723104 will fix http://paste.openstack.org/show/792699/ . If it does not, it would be good to know the context when the valueerror exception happen (is it random, is it affecting project without tags or stable branches, ...). | 16:50 |
tristanC | then i agree reverting https://review.opendev.org/723104 and https://review.opendev.org/701531 would be the next step to prevent further issue | 16:51 |
*** avass has joined #zuul | 16:53 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx https://review.opendev.org/722339 | 16:54 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Use failed_when: false instead of ignore_errors: true https://review.opendev.org/723109 | 16:54 |
tristanC | frickler: i confirm running `rm -R .git/refs/*` with git version 2.7.4 from xenial doesn't cause git error on further command | 16:54 |
openstackgerrit | Merged zuul/zuul master: merger: ensure .git/refs is not removed as a leaked dirs https://review.opendev.org/723104 | 17:00 |
*** bhavikdbavishi has quit IRC | 17:11 | |
corvus | i restarted opendev executors, and i see this error some times: http://paste.openstack.org/show/792715/ | 17:43 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Revert "Tune automatic garbage collection of git repos" https://review.opendev.org/723110 | 18:29 |
corvus | i'm going to self-approve that on the basis that a couple hours ago when we had more maintainers around we all agreed that was the next step | 18:30 |
mnaser | ++ | 18:30 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul master: Revert "Tune automatic garbage collection of git repos" https://review.opendev.org/723111 | 18:33 |
*** avass has quit IRC | 18:34 | |
*** Goneri has joined #zuul | 18:47 | |
*** avass has joined #zuul | 19:11 | |
*** avass has quit IRC | 19:25 | |
openstackgerrit | Merged zuul/zuul master: Revert "Tune automatic garbage collection of git repos" https://review.opendev.org/723110 | 19:38 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Use failed_when: false and ignore_errors: true https://review.opendev.org/723109 | 22:29 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx https://review.opendev.org/722339 | 23:02 |
*** avass has joined #zuul | 23:14 | |
*** tosky has quit IRC | 23:32 | |
avass | tristanC: re: https://review.opendev.org/#/c/722309/6 I think it would more sense to use a "block: ... always: ..." instead of "failed_when: false" and throwing a failure message | 23:37 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!