openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox use venv https://review.opendev.org/725737 | 00:19 |
---|---|---|
*** rlandy has quit IRC | 00:20 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] ensure-tox use venv https://review.opendev.org/725737 | 00:26 |
*** rfolco has quit IRC | 00:35 | |
*** jamesmcarthur has joined #zuul | 00:37 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: ensure-pip : fix Xenial exported virtualenv command https://review.opendev.org/725743 | 00:57 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: ensure-pip : fix Xenial exported virtualenv command https://review.opendev.org/725743 | 01:06 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: ensure-pip : fix Xenial exported virtualenv command https://review.opendev.org/725743 | 01:09 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: ensure-pip : fix Xenial exported virtualenv command https://review.opendev.org/725743 | 01:15 |
*** jamesmcarthur has quit IRC | 01:17 | |
*** swest has quit IRC | 01:53 | |
*** wxy has joined #zuul | 01:59 | |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-pip : fix Xenial exported virtualenv command https://review.opendev.org/725743 | 02:06 |
*** swest has joined #zuul | 02:09 | |
*** jamesmcarthur has joined #zuul | 02:33 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Remove opensuse-15-plain testing https://review.opendev.org/725750 | 02:42 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: ensure-tox: use venv to install https://review.opendev.org/725737 | 03:01 |
*** jamesmcarthur has quit IRC | 03:06 | |
*** jamesmcarthur has joined #zuul | 03:06 | |
*** bhavikdbavishi has joined #zuul | 03:06 | |
*** jamesmcarthur has quit IRC | 03:11 | |
*** bhavikdbavishi1 has joined #zuul | 03:16 | |
*** bhavikdbavishi has quit IRC | 03:17 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 03:17 | |
*** jamesmcarthur has joined #zuul | 03:41 | |
*** openstack has joined #zuul | 04:24 | |
*** ChanServ sets mode: +o openstack | 04:24 | |
*** evrardjp has quit IRC | 04:35 | |
*** evrardjp has joined #zuul | 04:36 | |
*** jamesmcarthur has quit IRC | 05:04 | |
*** jamesmcarthur has joined #zuul | 05:05 | |
*** jamesmcarthur has quit IRC | 05:11 | |
*** jamesmcarthur has joined #zuul | 05:11 | |
*** sanjayu__ has joined #zuul | 05:14 | |
*** sanjayu_ has quit IRC | 05:17 | |
*** sanjayu__ has quit IRC | 05:48 | |
*** ysandeep|away is now known as ysandeep | 05:51 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Remove some temporary files https://review.opendev.org/725650 | 05:54 |
*** saneax has joined #zuul | 05:54 | |
*** jamesmcarthur has quit IRC | 06:00 | |
*** jamesmcarthur has joined #zuul | 06:01 | |
*** dpawlik has joined #zuul | 06:03 | |
*** jamesmcarthur has quit IRC | 06:06 | |
*** ysandeep is now known as ysandeep|away | 06:12 | |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: WIP: Move queue from pipeline to project https://review.opendev.org/720182 | 06:14 |
*** ysandeep|away has quit IRC | 06:29 | |
*** jamesmcarthur has joined #zuul | 06:39 | |
*** bhavikdbavishi has quit IRC | 06:56 | |
AJaeger | tobiash: could you check https://review.opendev.org/#/c/723654/ , please? | 07:03 |
tobiash | AJaeger, avass: why do you prefer failed_when? | 07:05 |
*** saneax has quit IRC | 07:06 | |
tobiash | initially I used ignore_errors there intentionally to not mask errors in the log | 07:06 |
*** bhavikdbavishi has joined #zuul | 07:10 | |
*** saneax has joined #zuul | 07:11 | |
*** yolanda has joined #zuul | 07:14 | |
*** sanjayu_ has joined #zuul | 07:17 | |
*** jcapitao has joined #zuul | 07:18 | |
*** saneax has quit IRC | 07:19 | |
avass | tobiash: I think your change just got caught up when I changed a lot of places that used ignore_errors where failed_when was actually wanted | 07:20 |
tobiash | ah ok | 07:21 |
avass | since ignore_errors still shows it as an error, whereas failed_when: false should be used when the task can't really error, like checking if something is installed to know if it should be installed or not | 07:22 |
tobiash | avass: I also looked at the rest of your topic and for the other changes failed_when really makes more sense | 07:22 |
AJaeger | thanks, tobiash | 07:23 |
avass | how ready is the zuul-operator by the way? :) | 07:24 |
tobiash | I think tristanC is using it | 07:24 |
*** bhavikdbavishi has quit IRC | 07:25 | |
avass | nice | 07:25 |
avass | whenever I have more time I think I'll see if we can start using that then | 07:25 |
AJaeger | zuul-jobs maintainers, could you review https://review.opendev.org/725650 again, please? | 07:30 |
*** tosky has joined #zuul | 07:30 | |
*** rpittau|afk is now known as rpittau | 07:31 | |
openstackgerrit | Albin Vass proposed zuul/zuul-operator master: Doc fix: s/developper/developer https://review.opendev.org/725772 | 07:32 |
avass | zuul-maint: anyone wanna take a look at enabling callbacks? https://review.opendev.org/#/c/717260/ | 07:39 |
tobiash | avass: I'll have a look later today | 07:41 |
avass | tobiash: thanks! | 07:44 |
*** openstackstatus has quit IRC | 07:46 | |
*** openstack has joined #zuul | 07:50 | |
*** ChanServ sets mode: +o openstack | 07:50 | |
tobiash | avass: were you the one who had a wip version of windows log streaming (I remember someone mentioned it)? | 07:56 |
*** jpena|off is now known as jpena | 07:58 | |
*** dmellado has quit IRC | 07:58 | |
*** bhavikdbavishi has joined #zuul | 07:58 | |
*** mhu has joined #zuul | 08:00 | |
*** dmellado has joined #zuul | 08:04 | |
*** tumble has joined #zuul | 08:06 | |
avass | tobiash: yeah, but it's very coupled to our specific usecase at the moment | 08:08 |
tobiash | ok, so I guess it's not ready to push up a wip version so we can work together on finishing it? | 08:08 |
*** dpawlik has quit IRC | 08:09 | |
avass | tobiash: but the idea is almost the same as the linux one | 08:09 |
*** dpawlik has joined #zuul | 08:09 | |
avass | tobiash: not really, but I can see if I can make it ready enough and push it | 08:09 |
tobiash | cool | 08:09 |
avass | but it's using windows services instead | 08:11 |
tobiash | ah, so you're using a windows service for the streaming component and probably a forked win_shell plugin as in linux? | 08:12 |
avass | yeah :) | 08:13 |
avass | tobiash: actually, we didn't fork win_shell since we're having win_shell call psexec to get an interactive session on the remote, so we're piping the output from that to a file that's read by the log streamer | 08:22 |
avass | that's the part that is a bit too coupled to our usecase :) | 08:23 |
*** jamesmcarthur has quit IRC | 08:41 | |
*** sgw has quit IRC | 08:48 | |
openstackgerrit | Merged zuul/zuul-jobs master: Remove some temporary files https://review.opendev.org/725650 | 09:00 |
*** jamesmcarthur has joined #zuul | 09:16 | |
*** sugaar has quit IRC | 09:16 | |
*** sugaar has joined #zuul | 09:17 | |
*** bhavikdbavishi has quit IRC | 09:25 | |
*** jamesmcarthur has quit IRC | 09:26 | |
*** bhavikdbavishi has joined #zuul | 09:26 | |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Expose image build requests in web interface https://review.opendev.org/725810 | 09:42 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Expose image build requests in web interface https://review.opendev.org/725810 | 09:43 |
*** rpittau is now known as rpittau|bbl | 09:44 | |
*** brendangalloway has joined #zuul | 09:49 | |
brendangalloway | Is there a method by which the zuul file filters could be used to set up arguments to a job, rather than triggering different jobs? | 09:53 |
*** jamesmcarthur has joined #zuul | 09:57 | |
*** jamesmcarthur has quit IRC | 10:07 | |
avass | is it possible to put allowed-labels on project level instead of tenant level? | 10:36 |
avass | I'm guessing it's not | 10:36 |
*** jcapitao is now known as jcapitao_lunch | 10:37 | |
*** jamesmcarthur has joined #zuul | 10:43 | |
*** jamesmcarthur has quit IRC | 10:43 | |
*** jamesmcarthur has joined #zuul | 10:43 | |
*** jamesmcarthur has quit IRC | 10:44 | |
*** jamesmcarthur has joined #zuul | 10:44 | |
*** jamesmcarthur has quit IRC | 10:51 | |
avass | brendangalloway: don't think so | 10:53 |
tobiash | brendangalloway: what's the use case? | 10:56 |
tobiash | avass: allowed-labels is currently only possible on tenant level | 10:57 |
*** jamesmcarthur has joined #zuul | 10:58 | |
*** jamesmcarthur has quit IRC | 10:59 | |
*** jamesmcarthur has joined #zuul | 10:59 | |
*** jamesmcarthur has quit IRC | 11:01 | |
brendangalloway | tobiash: We have a build script that packages a number of tightly coupled components together (all live in the same repo but with separate build scripts) and then creates a single build artefact that can be tested at gate. We're trying to reduce build times by only building the components that have changed, and using cached versions for the | 11:01 |
brendangalloway | others. I'd like a way to inform the build script as to which components should be used from cache, and to be able to pass that info to a post-test job that will cache whichever components were freshly built if the tests passed | 11:01 |
*** jamesmcarthur has joined #zuul | 11:02 | |
*** sanjayu__ has joined #zuul | 11:03 | |
tobiash | brendangalloway: you could do this within the job by diffing against origin/<target branch> | 11:03 |
brendangalloway | ok - I was hoping to be able to do it at zuul level since the existing file filter syntax is nice and clear | 11:05 |
tobiash | brendangalloway: you could do e.g. this to get the changed files of the change: git diff --name-only origin/master HEAD | 11:05 |
*** sanjayu_ has quit IRC | 11:05 | |
*** jamesmcarthur has quit IRC | 11:07 | |
brendangalloway | tobiash: we will take that approach, thanks | 11:07 |
*** bhavikdbavishi has quit IRC | 11:25 | |
*** rfolco has joined #zuul | 11:30 | |
*** jcapitao_lunch is now known as jcapitao | 11:39 | |
*** jpena is now known as jpena|lunch | 11:39 | |
*** jamesmcarthur has joined #zuul | 11:40 | |
*** rlandy has joined #zuul | 11:58 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Scheduler's pause/resume functionality https://review.opendev.org/709735 | 12:05 |
mordred | brendangalloway: you might also be able to do what you're looking for with requires/provides and zuul artifacts. while it's not exactly the same, the docker image jobs do a similar thing to what you're talking about | 12:06 |
mordred | brendangalloway: to do that I think what you'd wind up doing is having each component have its own build job with file filters that triggers the specific script. then you can have them use zuul_return to return the build component. then have a job which soft-depends on all of them (so it's ok if some of them don't run) which pulls the artifacts returned by the jobs (it can find out which ones to pull by | 12:09 |
mordred | reading zuul_return data) and then pull things from cache if nothing has built it | 12:09 |
*** hashar has joined #zuul | 12:11 | |
brendangalloway | mordred: I was not aware of the soft dependency option, that might also be an option | 12:11 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-operator master: Document the missing feature https://review.opendev.org/718755 | 12:11 |
mordred | brendangalloway: it's pretty useful - it says to depend on a job but only if it ran, so it's not an error if a file matcher causes it to not run | 12:12 |
brendangalloway | mordred: Yes, we've had issues with hard dependencies due to exactly that issue. Is it a new feature? I don't recall seeing it before | 12:14 |
mordred | brendangalloway: it's also possible that full separate build jobs would introduce too much overhead - so another thing you might consider is to just make the per-component jobs be tiny nodeless jobs with file matchers that just do a zuul_return with the name of their component - then have your main build job read the zuul return to build the list of needed components | 12:14 |
mordred | brendangalloway: I don't think it's that new - we use it extensively in opendev | 12:14 |
brendangalloway | mordred: that is also a good suggestion, thanks for the ideas | 12:16 |
*** rpittau|bbl is now known as rpittau | 12:18 | |
mordred | sure thing! | 12:19 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Retry buildx builds https://review.opendev.org/725843 | 12:23 |
mordred | tobiash, avass: if you have a sec, super simple patch ^^ | 12:23 |
avass | mordred: done :) | 12:24 |
tristanC | avass: installing the zuul-operator results in a working zuul, we are using it to validate base jobs, though some more work is needed to make it more production ready, see: https://review.opendev.org/718755 | 12:24 |
*** jamesmcarthur has quit IRC | 12:25 | |
avass | tristanC: alright, we won't start using it anytime soon anyway | 12:26 |
*** Goneri has joined #zuul | 12:38 | |
*** jpena|lunch is now known as jpena | 12:45 | |
*** sanjayu_ has joined #zuul | 12:47 | |
openstackgerrit | Merged zuul/zuul-jobs master: Retry buildx builds https://review.opendev.org/725843 | 12:47 |
*** bhavikdbavishi has joined #zuul | 12:49 | |
*** sanjayu__ has quit IRC | 12:50 | |
*** bhavikdbavishi1 has joined #zuul | 12:56 | |
*** bhavikdbavishi has quit IRC | 12:58 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 12:58 | |
*** jamesmcarthur has joined #zuul | 13:01 | |
openstackgerrit | Monty Taylor proposed zuul/zuul master: Tie status filter text to pathname https://review.opendev.org/725850 | 13:02 |
*** sgw has joined #zuul | 13:06 | |
*** cdearborn has joined #zuul | 13:11 | |
mordred | corvus: when you're up - we're getting 500s from zuul-registry trying to buildx the nodepool-builder image - and they seem to be consistent | 13:12 |
mordred | corvus: I grabbed the logs from the current buildset registry (luckily the k8s test takes a while so the buildset registry is easy to hop on to without a hold) | 13:12 |
mordred | and they do show a traceback | 13:12 |
*** rfolco is now known as rfolco|rover | 13:13 | |
*** jamesmcarthur has quit IRC | 13:21 | |
corvus | mordred: got a link? | 13:31 |
mordred | https://bb5882b725b63c90f428-d701f6a98461967df274f184d5a7d3cd.ssl.cf5.rackcdn.com/725611/1/check/nodepool-build-image/2dc0a25/ | 13:31 |
mordred | corvus: that's the failed job logs | 13:31 |
mordred | corvus: the registry logs I grabbed are on the mttest-docker node in ~mordred/registry.logs | 13:32 |
mordred | corvus: although the node is still currently there: root@2001:470:e126:1:f816:3eff:fe41:f1fa | 13:32 |
mordred | (the buildset registry node) | 13:33 |
corvus | mordred: can you paste the logs? | 13:33 |
mordred | yeah | 13:33 |
corvus | okay, we're talking about this change: https://review.opendev.org/725611 | 13:33 |
mordred | corvus: yeah - but it failed same way with the parent | 13:34 |
corvus | and the job is nodepool-build-image | 13:34 |
mordred | yes | 13:34 |
corvus | the previous build (before the current recheck) failed; was that the same error? | 13:35 |
mordred | corvus: it was the other thing we've seen: failed to solve: rpc error: code = Unknown desc = failed commit on ref "layer-sha256:3695e93b8b00a1d17e972a09de4338742c063b0b8242acb75d4c9250df1229de": unexpected size 0, expected 2041930 | 13:35 |
mordred | so it's possible there are 2 different issues - one that happens sometimes and one that seems to happen consistently | 13:36 |
corvus | it reported this traceback: https://zuul.opendev.org/t/zuul/build/259910ba7bf644299d532ca505fe51b9/log/docker/buildset_registry.txt#592-606 | 13:36 |
mordred | corvus: http://paste.openstack.org/show/793196 | 13:37 |
corvus | looks pretty similar | 13:38 |
mordred | corvus: there's the logs I captured from the buildset registry - although I now see you don't need those :) | 13:38 |
corvus | i do need them :) | 13:38 |
mordred | oh good! | 13:38 |
corvus | mordred: i see the same two errors logged in both the previous and current runs of the job | 13:38 |
mordred | corvus: line 388 in the paste is a different traceback | 13:38 |
mordred | oh - that sure did truncate | 13:39 |
corvus | mordred: that's here: https://zuul.opendev.org/t/zuul/build/259910ba7bf644299d532ca505fe51b9/log/docker/buildset_registry.txt#498 | 13:39 |
mordred | corvus: yeah - I think now that the job has reported - that log has what I was able to grab | 13:39 |
corvus | well, i was linking to the previous one | 13:40 |
mordred | ah. nod | 13:40 |
corvus | i'm still working on "identify the bug mordred wants me to investigate" :) | 13:40 |
corvus | mordred: so far, i'm seeing the same tracebacks in both logs | 13:40 |
corvus | mordred: but i agree, buildx reported 2 different errors | 13:41 |
corvus | mordred: there seem to be multiple puts going on; perhaps it's the same underlying issue but with a race | 13:41 |
corvus | mordred: i'll try to tease out the sequence from the logs | 13:41 |
mordred | corvus: yeah. ah - yeah | 13:41 |
mordred | corvus: you saw https://zuul.opendev.org/t/zuul/build/17b1cc2f42634141b991aa8d33fa32f9 failed to solve: rpc error: code = Unknown desc = failed commit on ref "layer-sha256:c677d2dff43d9861d4bf54db3fa3a4431c1621e7f4f940e78d10ad37966b0c81": unexpected status: 500 Internal Server Error | 13:42 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add some debug lines to help with the loading_errors bug https://review.opendev.org/725732 | 13:44 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Expose image build requests in web UI and cli https://review.opendev.org/725810 | 13:56 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Expose image build requests in web UI and cli https://review.opendev.org/725810 | 13:57 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Expose image build requests in web UI and cli https://review.opendev.org/725810 | 13:58 |
*** jamesmcarthur has joined #zuul | 14:01 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: WIP configure cache-to and cache-from for buildx https://review.opendev.org/725862 | 14:02 |
corvus | mordred: wow, it really is uploading the same layer twice at the same time. the registry is not safe in that case and it errors. i can't say why buildkit reports 2 different errors here -- perhaps they represent each of the 2 different upload threads on the buildkit side. | 14:03 |
mordred | corvus: yeah - maybe so | 14:04 |
mordred | corvus: also - wow | 14:04 |
corvus | mordred: we should probably make the registry more robust in this case. but i also think it would be interesting to understand why buildkit uploads the same layer twice; maybe it's just nuts, or maybe there's some subtle thing the registry isn't doing that it expects that triggers this. | 14:04 |
mordred | corvus: agree on both points | 14:05 |
mordred | corvus: maybe ... so - the way layers work on top of multi-arch base images is sort of interesting | 14:05 |
mordred | in that, for instance, python-base and python-builder are on top of debian which _is_ multi-arch but didn't do any multi-arch things themselves so are still multi-arch | 14:06 |
mordred | so - if there are steps in the nodepool dockerfile which are the same - then we have 2 builders potentially building the "same" layer | 14:06 |
corvus | mordred: ooooh good idea | 14:07 |
mordred | and the layers really will be the same for steps that don't do anything platform specific | 14:07 |
mordred | (since these are hashed only by their content - not content+context like git) | 14:08 |
corvus | i'm not aware of a way for us to figure out what these layers are | 14:08 |
corvus | (at least from the build logs we have) | 14:08 |
corvus | the shas are different from the 2 different zuul job builds, so at least there's something that changed between the 2 runs. but within a single run, perhaps the state is consistent enough to do what you suggest | 14:09 |
mordred | me either - but I think it will actually be most of the layers in the nodepool image that aren't the apt-get install step | 14:09 |
mordred | corvus: yeah | 14:09 |
*** bhavikdbavishi has quit IRC | 14:09 | |
corvus | mordred: if that's the case, i'm not sure there's anything we can do.... unless we can stagger the multiarch builds a little? | 14:10 |
corvus | like, start the native one, wait 10 seconds, then start the next? | 14:10 |
mordred | corvus: I'm not sure that's a thing we can control | 14:10 |
corvus | yeah, i was afraid of that | 14:10 |
corvus | (bad idea: conditional sleep in dockerfile) | 14:11 |
*** y2kenny has joined #zuul | 14:11 | |
mordred | corvus: yeah - although even that I'm not sure how to pass a different value to each since it's the same build from the docker pov | 14:12 |
avass | mordred: random value | 14:14 |
avass | :) | 14:14 |
mordred | avass: "yay" | 14:14 |
y2kenny | for nodepool, what does the "building" state do? Is it possible to configure a static node that skips "building"? (I am trying to set up a bare-metal node that basically has nothing on it and allow zuul to install everything from the OS up using pxe and ipmi.) | 14:15 |
y2kenny | Or do I need to have a custom driver for this type of nodes? | 14:16 |
y2kenny | (custom nodepool driver) | 14:16 |
avass | y2kenny, mordred: I guess you would need an ironic driver for that? | 14:20 |
corvus | y2kenny: the nodepool static driver expects to be able to supply a host to zuul to put in the ansible inventory; so the first question to answer is probably how you want to use these nodes from ansible | 14:21 |
*** jamesmcarthur has quit IRC | 14:22 | |
y2kenny | corvus: I am hoping to use the executor to drive much of the setup for the nodes with ipmi first before adding the node to the inventory | 14:22 |
*** jhesketh has quit IRC | 14:22 | |
y2kenny | similar to how k8s namespace type | 14:22 |
corvus | y2kenny: makes sense. the static driver doesn't have that functionality right now, so yeah, this would require a change to incorporate some of that behavior from the k8s driver | 14:24 |
y2kenny | I mean, similar to how I use k8s/openshift namespace type. (Nodepool provision namespace, I create pod with 'stuff' in zuul and then add to inventory for others to use.) | 14:24 |
y2kenny | corvus: so to use custom driver, what do I need to do? Do I need to build nodepool from source or is there somewhere in the nodepool image where I can "drop in" a custom driver? | 14:25 |
corvus | y2kenny: no, we don't support out-of-tree drivers; but we'd welcome a change like that to the upstream static driver | 14:26 |
y2kenny | corvus: ok... I am thinking of writing a cobbler driver or something like that so I was wondering how I should get started. | 14:29 |
mordred | yeah. although you might find it easier to make an ironic driver | 14:29 |
*** sanjayu__ has joined #zuul | 14:30 | |
mordred | I'd _highly_ recommend using ironic instead of cobbler for this | 14:30 |
y2kenny | mordred: um... I am just worried about my mental capacity to take in more new things :). How much of the other pieces of OpenStack do I need to deploy in order to get ironic to work? | 14:32 |
y2kenny | last I looked into ironic (which was, admittedly, more than 3 years ago) I had a hard time getting my head around bootstrapping ironic. | 14:33 |
*** sanjayu_ has quit IRC | 14:33 | |
y2kenny | I think there was some Openstack on Openstack thing but I wasn't too clear on how to get things done. | 14:33 |
mordred | y2kenny: none - ironic works standalone. there's a project called "bifrost" which is for installing a standalone ironic via ansible | 14:34 |
mordred | https://opendev.org/openstack/bifrost | 14:34 |
y2kenny | mordred: ok. I will look | 14:34 |
mordred | kk. I mostly mention it because ironic uses cloud-style images to boot base os from - as opposed to the cobbler kickstart oriented approach | 14:35 |
mordred | which is closer conceptually to how the vm stuff works and should be much quicker to power-on and get to a minimal install that you could hand to your jobs | 14:36 |
mordred | that said - there's no reason a cobbler driver wouldn't be possible to write | 14:36 |
*** veecue1 has joined #zuul | 14:39 | |
fungi | or that other popular one which starts with a t, i forget the name | 14:40 |
y2kenny | mordred: the cloud-style images are probably more efficient but I need to see how much of cobbler's function ironic can do. Cobbler I understood because it's really just a bunch of standard linux utils like bind, dhcpd, tftp, etc. Ironic is comparatively opaque to me conceptually. | 14:44 |
fungi | ironic uses those standard things as well, though its focus is on maintaining a hardware inventory, performing image management, and providing a rest api for querying and (re)provisioning systems | 14:48 |
y2kenny | fungi: ok I will dig into it | 14:50 |
*** jhesketh has joined #zuul | 14:51 | |
fungi | it started out as a backend for openstack nova, to make it possible for users to do on-demand allocation and deallocation of bare metal servers in a "cloud" environment like they would virtual resources | 14:52 |
fungi | but as mordred says you can just use it by itself too | 14:52 |
fungi | it's also the current hardware backend for the metal3 (metal cubed) subsystem for openshift/kubernetes too | 14:53 |
y2kenny | I saw the metal3 intro back in december and it looks pretty cool. I just need to figure out how things fit together | 14:54 |
corvus | mordred: the *docker registry itself* fails at this. | 14:55 |
mordred | corvus: wow | 14:58 |
mordred | corvus: oh - I just had a weird idea ... | 14:59 |
mordred | corvus: (for how to do your earlier suggestion of staggering) | 14:59 |
mordred | corvus: what if we run the build without the push once | 14:59 |
mordred | corvus: then do an ansible loop for each of the arches running the build with --push for just that arch | 15:00 |
mordred | corvus: then run the build with push with the list of arches at the end | 15:00 |
mordred | the final push should have already pushed all of the segments so I'd expect it to just push a manifest | 15:00 |
avass | makes sense to me | 15:01 |
corvus | mordred: ++ i'll try it out in my testing | 15:02 |
mordred | corvus: cool. I've got a patch for zuul-jobs ready if it works | 15:03 |
*** jcapitao is now known as jcapitao|afk | 15:06 | |
*** hashar has quit IRC | 15:08 | |
openstackgerrit | Merged zuul/zuul master: Add some debug lines to help with the loading_errors bug https://review.opendev.org/725732 | 15:12 |
*** bhavikdbavishi has joined #zuul | 15:30 | |
*** jcapitao|afk is now known as jcapitao | 15:39 | |
corvus | mordred: that idea seems to work, and i think we should do it because even if we fix the registry to handle this race, that should mean less network traffic | 15:41 |
mordred | ++ | 15:42 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Split the arch building and pushing separately https://review.opendev.org/725905 | 15:44 |
mordred | corvus: how's that look? | 15:44 |
mordred | clarkb: ^^ second review on that would be good too | 15:45 |
corvus | mordred: looks great | 15:46 |
avass | mordred: lgtm | 15:47 |
corvus | mordred: i didn't check that the final manifest looked like a correct multi-arch manifest | 15:47 |
corvus | mordred: i only checked that the commands ran without error | 15:47 |
corvus | but i think i can do that easily by inspecting the filesystem | 15:48 |
clarkb | done | 15:50 |
mordred | corvus: cool. incidentally - it is neat that layers can be shared between different arch images | 15:50 |
corvus | mordred: yep looks good | 15:50 |
*** cdearborn has quit IRC | 15:51 | |
corvus | mordred: incidentally, the top level manifest is this: {"application/vnd.docker.distribution.manifest.v2+json": "sha256:65ee2f0113bb121e974010143fc704270391bb5728b70bf160db10a86c1ff1fb", "application/vnd.docker.distribution.manifest.list.v2+json": "sha256:71d563ecfa8c24afdb8ae5a56ca54155a475106f1214fc4272cf1b8fa796d3c7"} | 15:51 |
corvus | the second one is the multi-arch manifest list, the first one appears to just be the amd64 (and indeed is also the first entry in the multiarch list) | 15:51 |
corvus | i find it curious that's also listed there and it's not merely the list | 15:52 |
corvus | but that's the case whether i do the single push or multi push | 15:52 |
mordred | huh. that is interesting | 15:52 |
*** zxiiro has joined #zuul | 15:55 | |
*** panda|pto has quit IRC | 16:03 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 16:06 |
*** panda has joined #zuul | 16:06 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 16:10 |
openstackgerrit | Merged zuul/zuul-jobs master: Split the arch building and pushing separately https://review.opendev.org/725905 | 16:14 |
*** jamesmcarthur has joined #zuul | 16:20 | |
*** rpittau is now known as rpittau|afk | 16:24 | |
openstackgerrit | James E. Blair proposed zuul/zuul-registry master: Handle blob upload races https://review.opendev.org/725925 | 16:28 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 16:29 |
corvus | mordred, avass, clarkb: ^ i think that should also make the registry more robust. hopefully we don't hit the race because of mordred's change, but if we do hit it, it should no longer error | 16:29 |
mordred | corvus: watching one of the nodepool builds, it got past the previous place | 16:34 |
*** evrardjp has quit IRC | 16:36 | |
*** evrardjp has joined #zuul | 16:36 | |
*** sanjayu__ has quit IRC | 16:38 | |
mordred | corvus: woot! job success | 16:40 |
*** stappa has joined #zuul | 16:40 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 16:40 |
avass | mordred: nice :) | 16:40 |
mordred | corvus, clarkb: https://review.opendev.org/#/c/722483/ and https://review.opendev.org/#/c/725611/1 will come back green and are ready for review | 16:40 |
mordred | avass: \o/ | 16:40 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 16:50 |
clarkb | corvus: couple notes on the registry change. My biggest concern is swift's eventual consistency leaves us open to the race you try to protect against | 16:52 |
corvus | clarkb: yeah, i thought about that, though i think that in most cases a PUT followed by a GET should actually return the thing. but even if it does fail, that's falling back to the status quo, so probably the patch doesn't make it worse? | 16:53 |
clarkb | corvus: ya I was poking around and if we go by ancient data https://github.com/gaul/are-we-consistent-yet#observed-consistency says ~3 reads after overwrite have been observed against swift implementations in the wild to get the correct data | 16:55 |
clarkb | however read after create is always ok | 16:55 |
clarkb | I think that means we should generally have a winner (the creating writer?) | 16:55 |
*** jamesmcarthur has quit IRC | 16:55 | |
corvus | clarkb: yeah, though i guess if it's a tight race after the create, the readback of the uuid could be wrong, and they could both think they're the winner | 16:57 |
corvus | it's not a foolproof algorithm, but it's probably better than nothing | 16:57 |
clarkb | and ya I agree it's no worse than the current situation | 16:57 |
corvus | basically, we've reduced the race window from like 10 seconds to maybe <1 second? | 16:58 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 17:01 |
*** jcapitao has quit IRC | 17:04 | |
clarkb | corvus: also are the client supplied uuids expected to be random and not deterministic for the data? | 17:05 |
clarkb | (that was my other comment) | 17:05 |
mordred | corvus: hrm. I thought tailing the logs on the nodepool change had shown we had success - but I see failure :( | 17:05 |
mordred | corvus: https://zuul.opendev.org/t/zuul/build/17b1cc2f42634141b991aa8d33fa32f9 | 17:06 |
mordred | corvus: I've got to run for a little bit though | 17:06 |
mordred | I'll look further when I get back | 17:06 |
*** jpena is now known as jpena|off | 17:08 | |
corvus | clarkb: replied (uuids are unique) | 17:09 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 17:16 |
*** veecue has joined #zuul | 17:19 | |
*** irclogbot_0 has quit IRC | 17:20 | |
*** jamesmcarthur has joined #zuul | 17:21 | |
*** veecue1 has quit IRC | 17:22 | |
*** irclogbot_0 has joined #zuul | 17:24 | |
*** sshnaidm is now known as sshnaidm|afk | 17:26 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 17:27 |
stappa | Hello. Sorry if this is a stupid question. There is a discussion on the | 17:31 |
stappa | mailing list about gzip in the upload-logs role: http://lists.zuul-ci.org/pipermail/zuul-discuss/2018-May/000416.html There is a quote "I also agree, we'll likely want to improve documentation around web server | 17:31 |
stappa | configure too." I wonder if that was done and there is | 17:31 |
stappa | documentation about the relevant Apache configuration. For now we have a logs server with both compressed and uncompressed logs (compression was disabled at some point). Because of incorrect Apache mimetypes configuration we have broken links to the compressed "jobs-output.*" files. | 17:31 |
stappa | I am a novice with Apache and am looking for the right way to sort this out | 17:31 |
*** jamesmcarthur has quit IRC | 17:37 | |
*** jamesmcarthur has joined #zuul | 17:38 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 17:38 |
*** bhavikdbavishi has quit IRC | 17:45 | |
tristanC | tobiash: after running an enqueue-ref, it seems like our scheduler gearman is now stuck... how did you use the repl to unlock your scheduler? | 17:45 |
*** veecue has quit IRC | 17:46 | |
tristanC | running `zuul-scheduler repl` doesn't seem to do anything | 17:47 |
*** veecue has joined #zuul | 17:49 | |
fungi | stappa: it may not be an ideal example, but here's what we used to do on our logserver before we switched to serving build logs from swift: https://opendev.org/opendev/puppet-openstackci/src/branch/master/templates/logs.vhost.erb#L23-L64 | 17:55 |
*** hashar has joined #zuul | 17:58 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 18:05 |
veecue | Hi. I would like to contribute to a Gitea driver. (Reference: https://github.com/go-gitea/gitea/issues/8851). Are there any offerings for a good code skeleton to build from? I would start without the fancy caching etc that the GitHub driver has and work my way up from there | 18:21 |
clarkb | veecue: the pagure driver was mentioned yesterday as a good framework to start from | 18:22 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 18:23 |
veecue | clarkb: Thanks! | 18:25 |
tobiash | tristanC: enable repl, connect via nc on port 3000 | 18:37 |
tobiash | zuul-scheduler repl | 18:37 |
tobiash | you get the scheduler object via server.scheduler | 18:37 |
tristanC | tobiash: thanks, unfortunately it seems like zuul was enable to start the repl service, we had to hard reset the service | 18:38 |
tristanC | unable* | 18:38 |
tobiash | looking forward to scale out scheduler | 18:40 |
tobiash | corvus: do we have anything left for the last 3.x release we planned on that etherpad? | 18:41 |
y2kenny | Someone just pushed a burst of 45 patches to my infrastructure and it seems to have held up all the jobs (I have 5 executors). What's worse is that the patches on this branch are noops from the infrastructure perspective. So to make things better, should I 1) simply increase the number of executors? 2) add additional mergers? 3) filter out branches | 18:43 |
y2kenny | for that project (I am not sure if this is actually possible.) | 18:43 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 18:44 |
fungi | veecue: tumble also mentioned an interest in working on a gitea source connection driver for zuul (or at least in using one if someone writes it), so may be useful to coordinate with them too | 18:45 |
clarkb | y2kenny: I expect your bottleneck is merging there because zuul needs to do merges to determine what jobs to run. Noops are handled directly in the scheduler iirc and never talk to an executor | 18:47 |
fungi | y2kenny: can you elaborate on how it "held up" your jobs? was it effectively utilizing your available job nodes? if not, do you have resource monitoring on your executors so that you can see if system load, disk space or memory utilization was topping out? or was the scheduler waiting on all the merge requests to complete? | 18:47 |
fungi | oh, and yeah, you said mostly noop, so i agree with clarkb | 18:48 |
fungi | you could add some standalone mergers | 18:48 |
tumble | fungi, that's true. And hi veecue! | 18:49 |
fungi | y2kenny: if noop jobs are what you really meant by "noops from the infrastructure perspective" | 18:49 |
veecue | fungi: Yep, he is how I even got to that IRC ;). Hi tumble! | 18:50 |
tumble | the propaganda worked | 18:50 |
fungi | heh | 18:50 |
y2kenny | fungi: it's held up in the sense that other queued items are stuck waiting. So I have one project called 'linux' and the pipeline queue counter was like 45 and then I have another one queued from the project zuul-config and it was stuck in the queue until those 45 patches cleared. | 18:50 |
y2kenny | fungi: yes, no op from the infrastructure perspective. My job is configured for branch 'A' and those 45 patches are from branch 'B' that the infrastructure doesn't handle | 18:51 |
tumble | veecue, feel free to start. I have nothing done yet and wouldn't start before Friday anyway. My biggest problem is that I'm new to zuul and struggle with the basics yet, so that would significantly slow me down anyway. | 18:51 |
corvus | tobiash: i was hoping to test and restart opendev with the zk auth patch, but it's been several weeks and i don't really know if we're in a place to work on that yet | 18:52 |
y2kenny | fungi: the infra is supposed to ignore them but I believe even those changes are tried by the merger from what I can tell. | 18:52 |
avass | y2kenny: yeah, all changes need to be merged for the scheduler to load the job configuration | 18:53 |
tobiash | corvus: hrm, ok, I think I'll try that in our environment then | 18:54 |
fungi | y2kenny: how did you configure zuul to ignore those branches? | 18:54 |
veecue | tumble: I'm starting right now, but I'm also new to Zuul | 18:54 |
y2kenny | fungi: only by specifying specific branches for the jobs. Did I miss an alternate method? | 18:55 |
tobiash | hopefully I can contribute some useful insights to running with the zk auth patch :) | 18:55 |
y2kenny | fungi: I was looking into the tenant config but I didn't see anything that applies... but I may have missed something. | 18:55 |
clarkb | corvus: will the restart without jemalloc friday pull that in? | 18:55 |
clarkb | corvus: or is that still unmerged? | 18:55 |
tumble | veecue, alright, I'll be around here and interested in your progress if you get anything working :) | 18:56 |
tobiash | I thought it was merged already for quite some time and 'just' has to be configured | 18:56 |
corvus | clarkb: changes to actually switch zk to tls are lined up behind 'run everything in containers' | 18:56 |
fungi | y2kenny: no just curious if you had zuul configured to run noop jobs for changes targeting those branches, based on your earlier phrasing | 18:56 |
tobiash | makes sense to do that after containers | 18:57 |
y2kenny | fungi: oh no, they shouldn't even run noop job. I basically configured to have job.branches: 'A' (that's what I meant by ignoring the rest.) | 18:57 |
clarkb | corvus: gotcha and that last remaining piece for that is nodepool? | 18:57 |
corvus | veecue: hi! feel free to ping me for help; like tumble, i'll have more time to dive in myself after friday | 18:57 |
corvus | clarkb: yeah, then dust off https://review.opendev.org/720302 | 18:57 |
y2kenny | avass: that's still true even when the projects are listed under untrusted-projects and has an empty include right? | 18:58 |
mordred | corvus: nevermind! the nodepool patch is green - I think I maybe clicked recheck too quickly | 18:58 |
tobiash | corvus: did I understand that patch correctly that I can do a step by step transition by adding the tls port to zk first, reconfigure zuul and nodepool and then disabling the unencrypted port? | 18:58 |
mordred | corvus: https://review.opendev.org/#/c/722483 | 18:58 |
avass | y2kenny: ooh, I don't know about that | 18:59 |
avass | y2kenny: you mean zuul is tracking them but you don't load any items in those projects? | 18:59 |
AJaeger | corvus, tobiash, do you want to merge tobiash's change for git tuning in again? https://review.opendev.org/723800 | 18:59 |
y2kenny | avass: that's correct | 18:59 |
fungi | y2kenny: you ought to be able to look at the scheduler's logs to determine what actions it took for one of those changes | 18:59 |
y2kenny | fungi: oh... let me see... | 19:00 |
tobiash | AJaeger: we should change the pruneExpire from now to something slightly higher than now probably | 19:00 |
fungi | y2kenny: that could help narrow down what it was doing with them which took so much time | 19:00 |
corvus | AJaeger: has anything changed to make that less dangerous? | 19:00 |
AJaeger | tobiash: will you update the change? | 19:00 |
AJaeger | corvus: my understanding from the discussions was that the change was not the culprit | 19:01 |
*** y2kenny has quit IRC | 19:01 | |
AJaeger | corvus: but your call | 19:01 |
corvus | AJaeger: that's not what i got out of it | 19:01 |
tobiash | AJaeger: sure, I can update it if we come to a conclusion | 19:01 |
*** y2kenny has joined #zuul | 19:02 | |
y2kenny | fungi: when you said scheduler log, you meant scheduler and not executors right? | 19:02 |
AJaeger | corvus: not? Ok. let me check scrollback... | 19:02 |
corvus | AJaeger, tobiash: i started looking at it, but could not come up with a reproduction; it's not high on my list, so if someone else has time to look at it that would be great. | 19:02 |
fungi | AJaeger: last i saw, some snapshots were taken of some of the git repositories in question, but as of yet attempts to manually reproduce the problem were inconclusive, right? | 19:02 |
corvus | i put some git repos in afs and linked them here earlier | 19:03 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 19:03 |
AJaeger | seems I misunderstood the discussion | 19:03 |
avass | y2kenny: yes | 19:03 |
fungi | y2kenny: yes, scheduler. if zuul wasn't actually configured to run any jobs for those branches then in theory the executors shouldn't have been touched (but if the scheduler did ask executors to do something, it should log that in the scheduler logs) | 19:03 |
corvus | AJaeger, tobiash: i think the short version is: before we merge that, we want someone to say "i understand the problem and think either this was unrelated or it has been fixed". :) | 19:04 |
corvus | AJaeger, tobiash: i am not saying that at this moment :) | 19:04 |
avass | fungi, y2kenny: except if the executor merges the change | 19:04 |
AJaeger | corvus: I see | 19:05 |
fungi | avass: right, that's what i'm curious to find out. i suppose the change could in theory add a zuul configuration and then the scheduler needs the merger to feed that information back to find out what builds would be performed, but if the scheduler is configured not to consume configuration from that repository then it shouldn't need that information | 19:06 |
tobiash | corvus: was https://static.opendev.org/user/corvus/oo.tgz the correct link with the broken repo? | 19:07 |
y2kenny | fungi: so there's a lot of log and I am not sure which items are significant so I am going to start at the beginning of the burst and highlight the log as I go | 19:07 |
avass | fungi: yeah, if no items are included, that project should probably be treated as if it were not tracked. Except for being able to use required-project to test with that project I guess. | 19:07 |
corvus | tobiash: yep | 19:08 |
y2kenny | fungi: so at the start of the burst, I see a bunch of INFO zuul.GerritConnection... Updating <Change 0x7fdd05bbcca0 None 356552,1> | 19:09 |
y2kenny | basically one per change of those 45 patches | 19:09 |
fungi | y2kenny: if you search for xxxx,y where xxxx is the change number and y is the patch revision number then you should find the trigger event [e: zzzzzzzzzzz] where zzzzzzzzzzz is some long hex number | 19:09 |
fungi | that zzzzzzzzzzz will be included in all log entries related to the event | 19:09 |
y2kenny | Ah ok yes | 19:09 |
y2kenny | I was wondering what that e is | 19:10 |
fungi | which makes filtering a very busy log much easier | 19:10 |
avass | y2kenny, fungi: it's the buildset uuid isn't it? | 19:10 |
avass | or is that different | 19:10 |
tobiash | corvus: in that repo FETCH_HEAD seems to be valid | 19:11 |
y2kenny | fungi: um... tracking that zzzzz doesn't seem to be enough because it seems like there are other events triggered by the same patchset | 19:12 |
AJaeger | corvus: reread the summary from http://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-04-25.log.html#t2020-04-25T23:22:27 - somehow I had later conclusion in mind. I understand your concerns. | 19:12 |
y2kenny | fungi: I am seeing a lot of Adding change and Reporting enqueue | 19:12 |
avass | y2kenny: I would guess that means the scheduler is queueing up a merge job :) | 19:12 |
y2kenny | avass: looks that way. Each pipeline I have logged "adding change" | 19:13 |
tobiash | corvus: I could reproduce it :) | 19:14 |
avass | fungi: oh yes, you could still have a config project targeting a project that is not including any items | 19:14 |
fungi | avass: the [e: zzzzz] is the triggering event id, the buildset_uuid is different | 19:17 |
mordred | corvus: I believe we're far enough along with the container rollout that working on the zk tls rollout is reasonable. we've landed and have containerized (or ansiblized) all the pieces of the pie - we still need to remove the remaining puppet builders, but that's just iterating through them and I think we can get that done before we actually land the zk thing - the ground shouldn't be moving under the zk patch | 19:17 |
y2kenny | fungi: is the event certain type of event or any kind of event? | 19:18 |
tobiash | corvus, AJaeger: those steps confirm that FETCH_HEAD is ignored as a ref when pruning: https://etherpad.opendev.org/p/lrJlt1vJ93Pt7bXmVAfo | 19:18 |
y2kenny | fungi: or may be any event that the pipeline is configured? | 19:19 |
y2kenny | fungi: oh... I think I might know what's going on. So I have 45 patches... but my pipelines are configured to trigger on events like comment-added. And there is other infrastructure that was commenting on the same changes | 19:20 |
y2kenny | fungi: so each event has to be handled by the merger right? | 19:21 |
fungi | y2kenny: yes, the event id is a generic id applied to any event which is logged, so you can see all activities specific to a particular instance of an event | 19:21 |
fungi | every event gets a new event id, regardless of its event type | 19:22 |
veecue | tumble: Both of us were so optimistic about Gitea. It doesn't even support PR reviews :D | 19:22 |
fungi | so two different patch uploaded or comment added events will get distinct ids | 19:22 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 19:22 |
y2kenny | fungi: and each event would trigger an add to the pipeline and the pipeline is only emptied by merger? | 19:22 |
y2kenny | fungi: by add to the pipeline I meant add to the ChangeQueue | 19:24 |
*** jamesmcarthur has quit IRC | 19:25 | |
tumble | veecue, uhm, not sure what you mean by PR review, but I see a green "Review" button in PRs :o | 19:26 |
mordred | veecue, tumble | 19:26 |
mordred | veecue, tumble: also - the gitea humans are friendly, and we've landed several patches upstream as we've hit various things - so if the api is deficient, I'm sure it could be fixed :) | 19:27 |
corvus | tobiash: yeah, i wasn't able to make that happen without an explicit gc, but maybe it's enough to assume that it can happen, and for our testing just do explicit gc's | 19:27 |
tobiash | corvus: yes, I think we can assume that, however it looks to be harder to do a gc that doesn't break fetch_head :/ | 19:28 |
fungi | y2kenny: the scheduler maintains an events queue, which all events it receives go into. as it pulls each event off the queue it checks to see whether it has any triggers configured matching that event, and if so it enqueues it into the relevant pipelines and asks for a merger to fetch the configuration for that change so that it knows what builds are needed (if any) | 19:30 |
*** jamesmcarthur has joined #zuul | 19:30 | |
y2kenny | fungi: ok. And the pipelines are shared by all projects and first come first serve? | 19:31 |
fungi | y2kenny: what i'm not clear on is whether that last step i mention happens when the project is configured to have nothing included | 19:31 |
y2kenny | fungi: ok let me look for merger in the log | 19:31 |
*** yolanda has quit IRC | 19:33 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 19:34 |
y2kenny | fungi: so after a mass of adding change to queue, I see some log from merger. I see some merger:merge unique: 1dc1e6a2583b44849d5508dd2da11586> complete, merged: True, updated: False | 19:35 |
y2kenny | I also see some "<change> did not merge because it did not have any jobs configured" | 19:36 |
y2kenny | and some "Resetting builds for change because the item ahead failed to merge" | 19:37 |
y2kenny | and also Dequeuing change because it can no longer merge | 19:37 |
fungi | y2kenny: so yes, it looks like it at least did ask mergers for configuration from each change | 19:44 |
fungi | do you have any standalone mergers, or just the ones embedded in the executors? | 19:44 |
y2kenny | just embedded. I guess I should add some mergers | 19:44 |
fungi | yeah, we strongly recommend extra mergers so that such activity doesn't monopolize the ones the executors rely on. though it sounds like there might also be some opportunity here for improving performance if we can work out situations where the scheduler can know it doesn't need to request a merge of certain changes | 19:45 |
jamesmcarthur | clarkb: fungi: corvus: mordred: If y'all get a chance, could you check out the email entitled "Zuul Update for OSF Press Conference" | 19:46 |
mordred | jamesmcarthur: ohai - what's the timeframe on that? | 19:47 |
y2kenny | fungi: about the changequeue/pipeline, are they shared by all projects that use that pipeline? Does the merger take from that pipeline one after the other? Is it possible to have per project queue within each pipeline or does that mess things up? | 19:48 |
jamesmcarthur | mordred: :) hi! May 11 is the date of the press conference | 19:48 |
jamesmcarthur | So ideally sometime before EOW | 19:49 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 19:50 |
mordred | jamesmcarthur: cool - I keep thinking about it - and then I keep forgetting | 19:50 |
jamesmcarthur | mordred: that's pretty much just life at the moment, so I understand | 19:52 |
fungi | y2kenny: as zuul pulls events off its event queue, if it decides it needs to act on one it puts a merger:merge request into gearman and then the next available merger grabs that and processes it, returning the result. until that happens, zuul can't be sure what pipelines to enqueue into because a change could be introducing new criteria for a project-pipeline | 19:53 |
y2kenny | fungi: ok | 19:53 |
fungi | y2kenny: the pipelines themselves can be prioritized so that the resources consumed by builds running for them get filled in priority order, but that happens after zuul knows what pipelines will be involved | 19:54 |
tobiash | corvus, AJaeger: I found a mailing thread about this behavior on the git mailing list: https://public-inbox.org/git/20160708025948.GA3226@x/ | 19:57 |
tobiash | but no real solution there | 19:57 |
fungi | yeesh, almost 4 years ago | 19:59 |
corvus | tobiash: will setting the delay to >0 work for us? | 19:59 |
tobiash | corvus: nope, that removed that object as well | 20:00 |
corvus | oh drat; well, we could go back to making zuul refs, but that's a mess | 20:00 |
tobiash | corvus: I need to see if we can fetch differently so that there is a real ref | 20:00 |
corvus | tobiash: i guess we could have a single refs/zuul/fetch ref and just keep reusing that | 20:01 |
tobiash | I think that would be possible | 20:01 |
corvus | basically just needs to stay around long enough to merge it into a real ref | 20:01 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 20:02 |
*** tumble has quit IRC | 20:04 | |
y2kenny | Another observation: I am getting a build that reports both success and failure at the same time... | 20:05 |
*** tumble has joined #zuul | 20:05 | |
y2kenny | from Gerrit, it Verified +1 but immediately after it, it removed the verified +1 (the pipeline is set to remove verified if fail) | 20:06 |
y2kenny | looking at the scheduler log, I see "Build complete, result FAILURE" | 20:07 |
y2kenny | ... I am so confused... | 20:08 |
fungi | y2kenny: are you triggering two different pipelines on the same event with similar reporting settings? | 20:08 |
avass | y2kenny: are multiple pipelines being triggered? | 20:08 |
y2kenny | fungi, avass: let me check... I didn't think I configured multiple but may be I did that unintentionally | 20:10 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 20:10 |
stappa | fungi: This is useful, thanks! | 20:12 |
y2kenny | fungi, avass: ok I guess this is why... "Reported change" did not merge because it did not have any jobs configured" this is reporting from a different pipeline | 20:14 |
y2kenny | I guess I have a pipeline that has too much of a wild card in it | 20:15 |
fungi | y2kenny: that would do it | 20:15 |
fungi | y2kenny: er, actually "did not merge because it did not have any jobs configured" doesn't seem to result in a verified vote on our system | 20:18 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 20:18 |
fungi | y2kenny: so that could be benign | 20:18 |
fungi | y2kenny: look for log entries which say "Reporting item" and look at what tenant/pipeline is associated with them (it'll be in the facility name at the beginning of the log entry, like zuul.Pipeline.<tenant>.<pipeline>) | 20:21 |
y2kenny | fungi: I had a project configured to match any project but has no job associated with the pipeline. Looks like that caused the problem | 20:22 |
y2kenny | fungi, avass: thanks for the tip. | 20:22 |
avass | y2kenny: np | 20:23 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: WIP: Revert "Revert "Tune automatic garbage collection of git repos"" https://review.opendev.org/723800 | 20:23 |
y2kenny | and just another observation... if the zuul user that is responsible to submit a change on gerrit does not have permission to submit, it will keep retrying without fail. May be it will time out eventually but I didn't wait for that after it commented on a change like 10 times trying to merge :) | 20:24 |
y2kenny | zuul stopped trying after I manually submitted the change | 20:24 |
clarkb | y2kenny: what HTTP return code does gerrit return in that case? | 20:27 |
clarkb | y2kenny: I did just add handling for HTTP 409 which happens if you try to vote on a closed change | 20:27 |
clarkb | and ya it should retry 3 times I think | 20:27 |
clarkb | (but not retrying when we have no hope of success is a good thing) | 20:27 |
y2kenny | clarkb: the voting is done by the executor or the scheduler? | 20:27 |
clarkb | y2kenny: the scheduler | 20:27 |
y2kenny | (just so I know where to check the log) | 20:28 |
y2kenny | ok... let me dig into the log... | 20:28 |
clarkb | y2kenny: if you can share a traceback for those errors it would be great. I don't mind writing another fix similar to the http 409 case | 20:28 |
tobiash | y2kenny: it can be that the comment triggers another enqueue into the gate which will try to merge again and then comment again... | 20:30 |
tobiash | y2kenny: we call this behavior a gate loop | 20:30 |
y2kenny | tobiash: oh it has a name.... :D | 20:30 |
y2kenny | yea, I think that's what that was | 20:30 |
tobiash | well that's how me and my team calls it ;) | 20:30 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 20:32 |
y2kenny | clarkb: sorry... I forgot my setup is not using the http... I am using ssh | 20:33 |
y2kenny | we have some weird policy that makes me not able to use http interaction with gerrit | 20:34 |
clarkb | gotcha I don't know if we retry failed requests over ssh so probably is what tobiash describes | 20:34 |
y2kenny | for "reason" we disallow git clone over http | 20:34 |
y2kenny | and I believe, if I configure zuul to interact with gerrit via http, it will try to clone via http as well... and that causes failure | 20:35 |
tobiash | y2kenny: interesting, for 'reason' our gerrit has (officially) disabled ssh | 20:35 |
avass | tobiash: we've had that as well :) | 20:35 |
y2kenny | so I am forced to use ssh only (while missing the oh so nice inline comment feature... :'( ) | 20:36 |
y2kenny | tobiash: I would prefer to be in your shoes | 20:37 |
y2kenny | tobiash: although, what are you guys using to replace stream-events? | 20:37 |
tobiash | y2kenny: well, we had to get an exception for our gerrit back then because zuul was ssh only for gerrit for a long time | 20:37 |
tobiash | so actually our zuul uses ssh, but that's not available to most users | 20:38 |
fungi | it sounds like gerrit wants to use their checks api to replace the need to watch event streams for code review events | 20:39 |
clarkb | fungi: there was discussion on that and I think last I saw they will keep the event stream | 20:39 |
fungi | ahh | 20:39 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 20:43 |
y2kenny | I thought they added some webhook thing to replace stream event but I am not sure how it works exactly | 20:43 |
y2kenny | the nice thing about stream event is that the connection is opened from the remote side | 20:43 |
fungi | y2kenny: yeah, the checks api is supposed to incorporate some sort of web trigger, but sounds like they've decided the ssh event stream also still has uses | 20:44 |
y2kenny | so anyone can subscribe to the broadcast as they see fit | 20:44 |
clarkb | y2kenny: it's still a zuul -> gerrit connection. Basically the CI system registers things they care about then polls for events on that | 20:44 |
clarkb | there are currently a few limitations with it that means the ssh event stream still does things it can't (like non change events) | 20:44 |
fungi | but is lightweight enough i guess that rapid polling isn't a problem | 20:45 |
y2kenny | clarkb: I see. If only I can get my gerrit team to upgrade gerrit.... | 20:45 |
fungi | and has an up side that events are no longer missed if the consumer stops watching the event stream for a few minutes | 20:45 |
avass | anyone wanna take a look at: https://review.opendev.org/#/c/724967/ before I have to rebase that again? :) | 20:48 |
*** hashar has quit IRC | 20:49 | |
*** brendangalloway has quit IRC | 20:50 | |
*** jamesmcarthur has quit IRC | 20:50 | |
*** jamesmcarthur has joined #zuul | 20:51 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 20:53 |
*** Goneri has quit IRC | 21:05 | |
*** stappa has quit IRC | 21:09 | |
*** jamesmcarthur has quit IRC | 21:10 | |
*** jamesmcarthur has joined #zuul | 21:16 | |
mordred | corvus: look - it's a new issue -- https://zuul.opendev.org/t/zuul/build/622f858a89174725bb121fd098c849de | 21:40 |
mordred | corvus: I think I know what it is | 21:41 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Tag the images pulled back into docker for changes https://review.opendev.org/726007 | 21:47 |
mordred | corvus: ^^ | 21:47 |
corvus | mordred: agree with solution but typo ^ | 21:50 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Tag the images pulled back into docker for changes https://review.opendev.org/726007 | 21:54 |
mordred | corvus: good point | 21:54 |
*** jamesmcarthur has quit IRC | 21:55 | |
mordred | clarkb, fungi, tristanC: anybody want to +A that ^^ so I can re-check the nodepool images again? | 21:57 |
tristanC | mordred: done | 22:01 |
mordred | tristanC: thank you! | 22:01 |
*** threestrands has joined #zuul | 22:20 | |
openstackgerrit | Merged zuul/zuul-jobs master: Tag the images pulled back into docker for changes https://review.opendev.org/726007 | 22:25 |
*** tosky has quit IRC | 22:56 | |
*** y2kenny has quit IRC | 23:02 | |
openstackgerrit | Merged zuul/nodepool master: Build multi-arch images for x86 and arm https://review.opendev.org/722483 | 23:07 |
*** tumble has quit IRC | 23:08 | |
openstackgerrit | Merged zuul/nodepool master: Build nodepool with python3.8 https://review.opendev.org/725611 | 23:09 |
mordred | corvus, ianw, tristanC: ^^ BOOM! | 23:09 |
ianw | very cool! | 23:16 |
*** Shrews has joined #zuul | 23:22 | |
tristanC | well done! | 23:24 |
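
fungi's tip from 19:09 (search for `xxxx,y` to find the `[e: zzzz]` trigger event id, then filter on that id) can be sketched as a small script. This is only an illustration: the regular expression assumes the marker looks exactly like `[e: <hex>]` as described in the channel, the sample log lines are invented, and the function names are hypothetical rather than anything shipped with Zuul.

```python
import re

# Assumed marker shape, taken from the description "[e: zzzz]" in the log above.
EVENT_ID = re.compile(r"\[e: ([0-9a-f]+)\]")


def event_ids_for_change(lines, change, patchset):
    """Return the distinct event ids on lines mentioning 'change,patchset'."""
    # Guard against matching 356552,10 when looking for 356552,1.
    needle = re.compile(rf"(?<!\d){change},{patchset}(?!\d)")
    ids = []
    for line in lines:
        if needle.search(line):
            match = EVENT_ID.search(line)
            if match and match.group(1) not in ids:
                ids.append(match.group(1))
    return ids


def lines_for_event(lines, event_id):
    """Return every line tagged with the given event id."""
    marker = f"[e: {event_id}]"
    return [line for line in lines if marker in line]


# Invented sample lines, for illustration only (not real Zuul output).
sample = [
    "INFO zuul.Scheduler: [e: 0a1b2c] Processing event for 356552,1",
    "INFO zuul.Pipeline.t.check: [e: 0a1b2c] Adding change",
    "INFO zuul.Scheduler: [e: ffee99] Processing event for 356552,1",
    "INFO zuul.Pipeline.t.check: [e: 77aa00] Adding change",
]

ids = event_ids_for_change(sample, 356552, 1)
matching = lines_for_event(sample, ids[0])
```

Collecting every id for the patchset, rather than just the first, matters for the case y2kenny ran into: one patchset can be the subject of several events (for example, multiple comment-added events), each with its own id.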
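
The write-then-read-back idea corvus and clarkb discussed at 16:52-16:58 might look roughly like the sketch below. It is a guess at the general shape, not the actual registry patch: `InMemoryStore`, `write_claim`, `holds_claim`, and the `uploads/blob` key are all hypothetical, and the in-memory store is strongly consistent, which Swift is not.

```python
import uuid


class InMemoryStore:
    """Hypothetical stand-in for an object store (strongly consistent)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, value):
        self._objects[key] = value

    def get(self, key):
        return self._objects.get(key)


def write_claim(store, key, claim_id):
    """Step 1: each writer PUTs its own unique id under the shared key."""
    store.put(key, claim_id)


def holds_claim(store, key, claim_id):
    """Step 2: read back; the writer proceeds only if its id is still there."""
    return store.get(key) == claim_id


store = InMemoryStore()
writer_a = uuid.uuid4().hex
writer_b = uuid.uuid4().hex

# Two writers race: both write before either reads back.
write_claim(store, "uploads/blob", writer_a)
write_claim(store, "uploads/blob", writer_b)  # overwrites writer_a's claim

a_wins = holds_claim(store, "uploads/blob", writer_a)
b_wins = holds_claim(store, "uploads/blob", writer_b)
```

With this strongly consistent stand-in exactly one writer wins. Against an eventually consistent backend, a read after an overwrite may return the stale value, so both writers could see their own id and both believe they won; that is the remaining window corvus described as shrinking from about 10 seconds to maybe under 1 second.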