-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/834857 | 06:18 | |
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/834857 | 06:29 | |
@dong.zhang:matrix.org | Is opendev zuul currently working? The check job is not starting in https://zuul.opendev.org/t/zuul/status | 06:41 |
---|---|---|
@q:fricklercloud.de | zuul-maint: ^^ see #opendev, help wanted | 07:32 |
@bookwar:fedora.im | Hi, folks. I am looking into the topic of zuul deployment on Openshift. I want to reuse the code from https://opendev.org/zuul/zuul-helm, but it is missing licensing information. Is there a default license for OpenDev projects? | 09:55 |
@apevec:matrix.org | Aleksandra Fedorova: that would be a question for the main author there, mnaser | 10:32 |
@apevec:matrix.org | afaik each project must declare own license | 10:33 |
@apevec:matrix.org | mnaser: is zuul-helm still used for Vexxhost Zuul ? | 10:34 |
@apevec:matrix.org | Aleksandra Fedorova: VH is doing Zuul hosting e.g. https://telecominfraproject.zuul.vexxhost.dev/builds | 10:36 |
@bookwar:fedora.im | I doubt that specific repo is used for anything in its current state, it has several outdated links to deprecated locations, but that's not a problem for me. I am planning to adjust it to my needs anyway. But for that I need the license to permit it. | 10:36 |
@apevec:matrix.org | https://vexxhost.com/solutions/managed-zuul/ if product placement is allowed here :) | 10:37 |
@apevec:matrix.org | Aleksandra Fedorova: I'd email mnaser and corvus - it's all theirs | 10:38 |
@bookwar:fedora.im | thanks for the tip, will do | 10:39 |
-@gerrit:opendev.org- Andy Ladjadj proposed: [zuul/zuul-jobs] 834043: [upload-logs-base] add public url attribute https://review.opendev.org/c/zuul/zuul-jobs/+/834043 | 11:35 | |
-@gerrit:opendev.org- Andy Ladjadj proposed: [zuul/zuul-jobs] 834043: [upload-logs-base] add public url attribute https://review.opendev.org/c/zuul/zuul-jobs/+/834043 | 11:35 | |
-@gerrit:opendev.org- Andy Ladjadj proposed: [zuul/zuul-jobs] 834043: [upload-logs-base] add public url attribute https://review.opendev.org/c/zuul/zuul-jobs/+/834043 | 11:36 | |
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/834857 | 13:11 | |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 835452: Test zuul-client dequeue-all https://review.opendev.org/c/zuul/zuul/+/835452 | 13:42 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835464: Add a blob store and store large secrets in it https://review.opendev.org/c/zuul/zuul/+/835464 | 14:24 | |
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/834857 | 14:26 | |
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/834857 | 14:46 | |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 832363: Add queue.dependencies-by-topic https://review.opendev.org/c/zuul/zuul/+/832363 | 15:22 | |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 835349: Revert "Remove Worker class" https://review.opendev.org/c/zuul/zuul/+/835349 | 15:30 | |
@jim:acmegating.com | mhu: to be clear -- when i suggested we can add the information back, i meant in a new way -- the old implementation is not a good starting point. | 15:37 |
@mhuin:matrix.org | even for the worker info? I understood that the nodes info needed rework | 15:38 |
@jim:acmegating.com | yeah, see the original commit message -- we're sending that info back via a different method now | 15:39 |
@mhuin:matrix.org | The revert doesn't rebase on master trivially anyway | 15:39 |
@mhuin:matrix.org | Ah ok, I missed that | 15:40 |
@mnaser:matrix.org | Could we get some quick eyes on these, to fix the k8s jobs and do a cleanup: | 15:49 |
- https://review.opendev.org/c/zuul/zuul-jobs/+/835162 | ||
- https://review.opendev.org/c/zuul/zuul-jobs/+/835156 | ||
@y2kenny:matrix.org | I have been seeing Zuul getting into weird states due to Gerrit connection issues. This is Zuul 5.2.0 with 6x scheduler, 10x merger, 10x executor. I don't know the internals of Zuul too well so I thought I should share my observations here: | 16:15 |
Observation 1: scheduler repeatedly updating Gerrit changes (seems to be stuck in a loop?): | ||
https://paste.opendev.org/show/b9zioMLgrMCNl0yK2868/ | ||
It's just the same few changes repeating (it's more than 3 lines; I just pasted 3 as an example). | ||
Observation 2: jobs get stuck with the log streamer saying "BuildID not found" | ||
I suspect a Gerrit connection error is causing Zuul's comment/score posting to fail and leaving Zuul stuck in a weird state. This is just a guess though. The recovery method is restarting the scheduler and executors. | ||
@y2kenny:matrix.org | I see GerritConnection timeout can be adjusted for ssh but not sure if it's possible for http | 16:24 |
@clarkb:matrix.org | Kenny Ho: you do have a gerrit connection error then? | 16:27 |
@y2kenny:matrix.org | yes, definitely having unstable connection to Gerrit. | 16:28 |
@y2kenny:matrix.org | but only intermittently | 16:28 |
@y2kenny:matrix.org | Zuul seems to be having trouble recovering... I think? | 16:30 |
@y2kenny:matrix.org | something is happening but there are also a bunch of events not being scheduled | 16:30 |
@clarkb:matrix.org | Well it should just make a new connection. The http stuff is pretty stateless and for ssh it retries until reconnected | 16:31 |
@clarkb:matrix.org | I think it might be helpful to pick a specific instance of what looks weird and track that down using the event ids in the logs | 16:31 |
@y2kenny:matrix.org | so I am looking at one scheduler instance | 16:32 |
@y2kenny:matrix.org | same event but the log is filled with "Updating <change>" | 16:33 |
@y2kenny:matrix.org | for a bunch of different changes | 16:33 |
@y2kenny:matrix.org | at a rate of about 1 a sec | 16:34 |
@y2kenny:matrix.org | oh... I just see some pipeline adding change | 16:34 |
@clarkb:matrix.org | Ya it does that to ensure it has a current view of the repos for config accuracy | 16:34 |
@clarkb:matrix.org | Is the updating <change> happening over and over again for the same event? | 16:34 |
@y2kenny:matrix.org | over and over again for the same event yes | 16:34 |
@y2kenny:matrix.org | (I am assuming e: <hash> is event) | 16:35 |
@y2kenny:matrix.org | same event and same change | 16:35 |
@clarkb:matrix.org | yes e: <hash> identifies the triggering event | 16:35 |
@y2kenny:matrix.org | same event and same bunch of changes repeatedly at a rate of 1 per second | 16:37 |
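One way to confirm the pattern described above is to count how often each event id and change pair shows up in "Updating" log lines. The following is only a minimal sketch: the log line shape (an `e: <hash>` marker followed later by `Updating <change>`) is assumed from the messages in this channel, the sample lines are made up for illustration, and `count_updates` and `repeated_updates` are hypothetical helper names, not part of Zuul.

```python
import re
from collections import Counter

# Assumed log shape: "... e: <hex event id> ... Updating <change ...>".
# Real scheduler log lines may differ; adjust the pattern to your deployment.
UPDATE_RE = re.compile(
    r"\be: (?P<event>[0-9a-f]+)\b.*?Updating (?P<change><[^>]+>)"
)


def count_updates(lines):
    """Count how often each (event id, change) pair is logged as updating."""
    counts = Counter()
    for line in lines:
        match = UPDATE_RE.search(line)
        if match:
            counts[(match.group("event"), match.group("change"))] += 1
    return counts


def repeated_updates(lines, threshold=2):
    """Return pairs seen at least `threshold` times, most frequent first."""
    return [
        (pair, n)
        for pair, n in count_updates(lines).most_common()
        if n >= threshold
    ]


# Made-up sample lines, only to exercise the helpers.
sample = [
    "[e: 1a2b3c] Updating <change A>",
    "[e: 1a2b3c] Updating <change B>",
    "[e: 1a2b3c] Updating <change A>",
    "[e: 9f8e7d] Updating <change C>",
    "[e: 1a2b3c] Adding change to pipeline",
    "[e: 1a2b3c] Updating <change A>",
]

for (event, change), n in repeated_updates(sample):
    print(f"event {event} updated {change} {n} times")
```

Running the same helpers over a real scheduler log (for example `repeated_updates(open(path))`, with `path` being wherever the log lives) would show whether a single event keeps updating the same changes, which is the question asked above.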
@clarkb:matrix.org | is the processing of that event hitting an error causing another scheduler to process it then repeating in a loop? | 16:38 |
@y2kenny:matrix.org | um... let me check the other scheduler's log... | 16:38 |
@y2kenny:matrix.org | I just sampled two other schedulers and they don't seem to have repeating events. What I see is something else that I also see occasionally and don't understand: (Exception loading ZKObject... kazoo.exceptions.NoNodeError...). It shouldn't be related though, since it's on a pipeline that is not used. | 16:41 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835478: Add a note about bwrap and setsid https://review.opendev.org/c/zuul/zuul/+/835478 | 16:41 | |
@y2kenny:matrix.org | This (repeating events) seems to only happen on the oldest scheduler (the scheduler I have been using before going to multi-scheduler) | 16:47 |
@y2kenny:matrix.org | not sure if that's relevant either | 16:47 |
@jim:acmegating.com | it's only happening on one scheduler because only one scheduler processes the gerrit event stream | 16:48 |
@y2kenny:matrix.org | ah ok | 16:48 |
@fungicide:matrix.org | Florian Haas: https://twitter.com/xahteiwi/status/1508470104970903561 got mentioned to me and since i don't have a twitter account i'll follow up here. i agree we don't seem to have any documentation explaining how git submodules are handled by zuul (the only reference i can find to the string "submodule" anywhere in the repo is in a couple of tests of the merger service), but there may be zuul users in here who use git submodules in their projects and can explain their workflow or related pitfalls sufficient for us to add something to the docs | 16:50 |
@fungicide:matrix.org | expanding my search to include "gitmodule" i see that the merger resets the .gitmodules file if an exception is raised while fetching refs which could indicate a faulty configuration introduced in .gitmodules | 16:55 |
@q:fricklercloud.de | fwiw what Kenny Ho describes sounds pretty similar to what I saw on zuul01.opendev.org. | 16:56 |
@clarkb:matrix.org | Kenny Ho: I think q ended up just restarting the scheduler to get it moving again. Of course that doesn't help root cause and debug it. | 17:19 |
@clarkb:matrix.org | Are there no errors between the restarts of event processing? | 17:20 |
@clarkb:matrix.org | Seems like it is not removing the events because it must be failing somewhere along the way? | 17:20 |
@mnaser:matrix.org | > <@bookwar:fedora.im> I doubt that specific repo is used for anything in its current state, it has several outdated links to deprecated locations, but that's not a problem for me. I am planning to adjust it to my needs anyway. But for that I need the license to permit it. | 17:21 |
Sorry, there are a few outstanding patches which haven't landed because I haven't had time to fix the tests for it. However, it is functional with those patches (that are not passing but that's because the tests are borked) | ||
@mnaser:matrix.org | We use it in production and it works just fine :) | 17:21 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-helm] 835480: Add license file https://review.opendev.org/c/zuul/zuul-helm/+/835480 | 17:23 | |
@jim:acmegating.com | mnaser: ^ can you ack that? that was my understanding at least | 17:23 |
@mnaser:matrix.org | > <@jim:acmegating.com> mnaser: ^ can you ack that? that was my understanding at least | 17:24 |
+1'd, yeah, apache2 makes sense and that's the intention to follow the rest of the Zuul licensing. | ||
@jim:acmegating.com | apevec: i think it would be good for you to ack that as a redhat representative. | 17:24 |
@jim:acmegating.com | mnaser: (well, most of the rest; there is some gpl3 there) | 17:24 |
@y2kenny:matrix.org | Clark: There are errors (Gerrit connection timeout, max retries, etc.). Restarting kind of helps, but not really? I can't quite put my finger on it yet | 17:26 |
@y2kenny:matrix.org | like... there are events/jobs that got through | 17:26 |
@y2kenny:matrix.org | but feels like something major is blocking. | 17:27 |
@y2kenny:matrix.org | I restarted a few times but things didn't move until the last restart and everything seems to be unblocked | 17:28 |
@y2kenny:matrix.org | it's possible the problem is really just the network issue on the Gerrit side but then I am not sure how some of the jobs get through and succeed. | 17:28 |
@clarkb:matrix.org | I thought that we would eventually fail (we retry gerrit queries but after 3? retries we emit a failure) | 17:29 |
@y2kenny:matrix.org | I do get a Max retries exceeded | 17:29 |
@clarkb:matrix.org | I would expect it to move on at that point | 17:36 |
@clarkb:matrix.org | maybe we aren't handling that case properly which puts us in a loop because the event isn't removed? | 17:36 |
@y2kenny:matrix.org | Clark: Is the event stored in zk? Because they seem to survive a scheduler restart (I could be mistaken though.) | 17:47 |
@clarkb:matrix.org | Kenny Ho: yes, it is received by a scheduler and then stored in zookeeper in a queue where it is meant to be processed | 17:51 |
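The hypothesis in this exchange (bounded retries, then a failure, and a loop if the failed event is never removed from the queue) can be illustrated with a toy model. This is a sketch only and not Zuul's implementation: an in-memory `deque` stands in for the ZooKeeper queue, `QueryError` stands in for a failed Gerrit query, the retry count of 3 is the guess from the chat, and all names are hypothetical.

```python
from collections import deque

MAX_RETRIES = 3  # assumed value; the chat only guesses "3?"


class QueryError(Exception):
    """Stand-in for a failed Gerrit query (hypothetical)."""


def query_with_retries(query, max_retries=MAX_RETRIES):
    """Call query() up to max_retries times, re-raising the last failure."""
    for attempt in range(1, max_retries + 1):
        try:
            return query()
        except QueryError:
            if attempt == max_retries:
                raise


def drain(queue, query, ack_on_failure, max_passes=10):
    """Process events in order and return the number of passes used.

    If ack_on_failure is False, a failed event stays at the head of the
    queue and is picked up again on the next pass, which models the
    suspected loop. max_passes only exists to stop the toy model.
    """
    passes = 0
    while queue and passes < max_passes:
        passes += 1
        event = queue[0]
        try:
            query_with_retries(lambda: query(event))
        except QueryError:
            if not ack_on_failure:
                continue  # event is never removed, so it is retried again
        queue.popleft()
    return passes


calls = []


def always_fail(event):
    """Simulated query that records the call and always fails."""
    calls.append(event)
    raise QueryError(event)


acked = deque(["e1", "e2"])
acked_passes = drain(acked, always_fail, ack_on_failure=True)

stuck = deque(["e1", "e2"])
stuck_passes = drain(stuck, always_fail, ack_on_failure=False)
```

In this model, removing the event after the final failure lets the queue drain in one pass per event, while leaving it in place keeps the first event at the head indefinitely, so later events never get processed. Because the queue lives outside the scheduler process, a restart alone would not clear it, which matches the observation that the events seem to survive restarts.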
@apevec:matrix.org | > <@jim:acmegating.com> apevec: i think it would be good for you to ack that as a redhat representative. | 17:55 |
you mean for your contributions while employed by Red Hat? | ||
@apevec:matrix.org | I can't ack myself, I can file a ticket for legal | 17:56 |
@jim:acmegating.com | apevec: yep, just to avoid any doubt, since red hat is the copyright holder. | 17:56 |
@bookwar:fedora.im | > <@mnaser:matrix.org> Sorry, there are a few outstanding patches which haven't landed because I haven't had time to fix the tests for it. However, it is functional with those patches (that are not passing but that's because the tests are borked) | 19:11 |
Good to know, thank you. I'll see if i can help and contribute fixes back to the main repo then. | ||
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835100: Rely on the unparsed config cache in reconfigurations https://review.opendev.org/c/zuul/zuul/+/835100 | 19:31 | |
@jim:acmegating.com | Aleksandra Fedorova: https://review.opendev.org/835480 should address your concern about the license | 19:33 |
@clarkb:matrix.org | corvus: I think if you want to +A 835100 now that is probably fine since the edits were important but outside of the code and you had +2s before | 19:33 |
@vlotorev:matrix.org | Hi, on https://zuul.opendev.org/t/opendev/projects there are projects from multiple connections: github, googlesource. | 19:53 |
On the other hand none of these projects from github/googlesource are mentioned in pipelines configuration https://opendev.org/opendev/project-config/src/branch/master/zuul.d/pipelines.yaml. | ||
What's the benefit of adding these projects if they are not enqueued to pipelines? Only to test Depends-On? | ||
@jim:acmegating.com | vlotorev: that and required-projects | 19:54 |
@vlotorev:matrix.org | Thanks. | 19:55 |
@clarkb:matrix.org | Ya with github repos zuul will cache them for us and put them in place on jobs which is a common use case. We also do Depends-On with upstream gerrit for our deployment testing to ensure our bug fixes (and others if necessary) work for us | 20:19 |
@fungicide:matrix.org | vlotorev: we do also report to some projects on other connections in different tenants, but we try to isolate them so that we don't wind up with broken zuul configuration from projects outside our sphere of control impacting other tenants | 20:35 |
@fungicide:matrix.org | specifically, limiting which projects we will read configuration from | 20:41 |
@fungicide:matrix.org | but also in some cases limiting what specific kinds of configuration we'll allow to be used in them | 20:42 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835518: Fix recursive Gerrit change query https://review.opendev.org/c/zuul/zuul/+/835518 | 21:11 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835522: Add more submitted-together tests https://review.opendev.org/c/zuul/zuul/+/835522 | 21:57 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 22:10 | |
- [zuul/zuul] 835518: Fix recursive Gerrit change query https://review.opendev.org/c/zuul/zuul/+/835518 | ||
- [zuul/zuul] 835522: Add more submitted-together tests https://review.opendev.org/c/zuul/zuul/+/835522 | ||
@jim:acmegating.com | Clark: ^ can you re-review those? thx | 22:10 |
@clarkb:matrix.org | yes | 22:10 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835478: Add a note about bwrap and setsid https://review.opendev.org/c/zuul/zuul/+/835478 | 22:44 | |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 832870: Make promote work for any pipeline manager https://review.opendev.org/c/zuul/zuul/+/832870 | 23:24 | |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 835100: Rely on the unparsed config cache in reconfigurations https://review.opendev.org/c/zuul/zuul/+/835100 | 23:31 |