*** michael-beaver has quit IRC | 00:08 | |
*** mattw4 has quit IRC | 00:15 | |
*** mattw4 has joined #zuul | 00:15 | |
jkt | btw, I'm getting this error with updated zuul's zuul-jobs: http://paste.openstack.org/show/752558/ | 00:23 |
jkt | this is because zuul-jobs.git now assumes that it's called zuul/zuul-jobs everywhere, on all Zuul instances | 00:23 |
*** mattw4 has quit IRC | 00:24 | |
jkt | on my system, it's called ci/zuul-jobs though | 00:24 |
SpamapS | I've had a similar problem in the past with base jobs. | 00:26 |
SpamapS | jkt: I think you may have to call it zuul/zuul-jobs. :-/ | 00:26 |
jkt | corvus: it seems that this warning (http://paste.openstack.org/show/752558/) on installations where zuul-jobs is not named zuul/zuul-jobs has been introduced in your commit 2f2d6ce3 | 00:27 |
jkt | SpamapS: if this is intentional, then I will be able to adapt, sure, but I think that this might have been just a mistake | 00:28 |
corvus | jkt: we fixed that about 10 hours ago in https://opendev.org/zuul/zuul-jobs/commit/f1264e0f6071a11c5105fbc7edc71efc7e236fc3 | 00:29 |
corvus | jkt: if it's still a problem, maybe a full reconfigure would get the update, or if not, maybe a scheduler restart | 00:30 |
jkt | corvus: sorry for noise | 00:30 |
corvus | jkt: sorry for the error | 00:30 |
*** sanjayu__ has quit IRC | 00:42 | |
*** spsurya has joined #zuul | 01:01 | |
*** jamesmcarthur has joined #zuul | 02:41 | |
*** rlandy|ruck|bbl has quit IRC | 02:45 | |
*** jamesmcarthur has quit IRC | 03:08 | |
tristanC | fungi: mhu: there is indeed something wrong with the test-job-tox-el7, it may be related to the host upgrade we did yesterday | 04:34 |
tristanC | fungi: the funny name is generated by: https://softwarefactory-project.io/logs/56/663056/7/third-party-check/test-job-tox-f27/292d825/ara-report/file/85639583-d11d-4b3f-8cbd-77c9098e3d69/#line-14 | 04:35 |
*** pcaruana has joined #zuul | 04:50 | |
tristanC | it's now fixed, there was an issue with the test playbook. Thanks for noticing! | 04:51 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Return javascript content artifact records to Zuul https://review.opendev.org/663056 | 06:17 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Return python artifact records to Zuul https://review.opendev.org/663053 | 06:19 |
*** raukadah is now known as chandankumar | 06:24 | |
*** sanjayu__ has joined #zuul | 06:25 | |
*** sanjayu_ has joined #zuul | 06:27 | |
*** sanjayu__ has quit IRC | 06:29 | |
openstackgerrit | Merged zuul/zuul-jobs master: Allow download-artifact to download multiple files https://review.opendev.org/662876 | 06:33 |
openstackgerrit | Merged zuul/zuul-jobs master: Return javascript content artifact records to Zuul https://review.opendev.org/663056 | 06:39 |
*** themroc has joined #zuul | 06:48 | |
openstackgerrit | Merged zuul/zuul-jobs master: Return python artifact records to Zuul https://review.opendev.org/663053 | 06:57 |
*** gtema has joined #zuul | 06:58 | |
*** weshay has quit IRC | 07:22 | |
*** mhu has quit IRC | 07:22 | |
*** mhu has joined #zuul | 07:22 | |
*** weshay has joined #zuul | 07:23 | |
*** ParsectiX has joined #zuul | 07:24 | |
*** themroc has quit IRC | 07:26 | |
*** jpena|off is now known as jpena | 07:38 | |
*** hashar has joined #zuul | 07:46 | |
*** sshnaidm|afk is now known as sshnaidm | 09:05 | |
ofosos | The documentation on zuul-ci.org says that native container workflows are not yet implemented. Can I request containers with nodepool anyway and run my builds there? | 09:22 |
*** felixgcb has joined #zuul | 09:33 | |
SpamapS | ofosos: there's a kubernetes driver, but I didn't have much luck with it. | 10:03 |
SpamapS | I believe the openshift driver gets more testing. | 10:04 |
*** flepied has quit IRC | 10:22 | |
*** ParsectiX has quit IRC | 10:32 | |
*** ParsectiX has joined #zuul | 10:33 | |
*** jpena is now known as jpena|away | 10:36 | |
*** gtema_ has joined #zuul | 10:42 | |
*** gtema has quit IRC | 10:42 | |
*** ParsectiX has quit IRC | 10:48 | |
*** ParsectiX has joined #zuul | 10:59 | |
*** ParsectiX has quit IRC | 10:59 | |
*** ParsectiX has joined #zuul | 11:06 | |
*** flepied has joined #zuul | 11:18 | |
*** ParsectiX has quit IRC | 11:26 | |
*** rlandy has joined #zuul | 11:59 | |
*** rlandy is now known as rlandy|ruck | 12:00 | |
*** gtema_ has quit IRC | 12:05 | |
*** felixgcb has quit IRC | 12:05 | |
*** hashar has quit IRC | 12:13 | |
*** gtema_ has joined #zuul | 12:27 | |
*** pcaruana has quit IRC | 12:52 | |
*** gtema_ has quit IRC | 13:32 | |
*** jpena|away is now known as jpena | 13:53 | |
*** jamesmcarthur has joined #zuul | 14:05 | |
*** gtema_ has joined #zuul | 14:16 | |
*** swest has quit IRC | 14:17 | |
pabelanger | morning! Looking for reviews on https://review.opendev.org/660856/ (from tristanC) to fix timer trigger file matching, and on https://review.opendev.org/663378/ to fix a use case with a cisco network appliance. | 14:29 |
pabelanger | thanks! | 14:29 |
*** hashar has joined #zuul | 14:33 | |
*** sanjayu_ has quit IRC | 14:39 | |
Shrews | SpamapS: if the kubernetes driver doesn't actually work, we should either poke the author for fixes, or remove it | 15:09 |
Shrews | preferably the former | 15:10 |
*** chandankumar is now known as raukadah | 15:10 | |
SpamapS | Shrews: It has been on my todo to try it again for a while. | 15:11 |
SpamapS | When I tried it, the nodepool driver worked fine, it was the Zuul side of it that didn't work right. | 15:11 |
SpamapS | Something with the tokens and auth | 15:12 |
Shrews | tristanC: ^^ | 15:12 |
*** hashar has quit IRC | 15:13 | |
mordred | SpamapS: wasn't that because you were using eks which uses a different auth mechanism than normal k8s and we didn't have an answer for how to deal with that? | 15:15 |
openstackgerrit | Merged zuul/zuul-jobs master: Explicitly store date facts for promote https://review.opendev.org/662817 | 15:29 |
*** AJaeger has quit IRC | 15:38 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: A reporter for Elasticsearch https://review.opendev.org/644927 | 15:44 |
fungi | yeah, we can automatically test the kubernetes driver since it's all free software we can just install... but eks on the other hand not so easy to do in our opendev ci | 15:45 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: A reporter for Elasticsearch https://review.opendev.org/644927 | 15:46 |
*** tjgresha has joined #zuul | 16:25 | |
SpamapS | mordred: that was a hypothesis I had | 16:29 |
clarkb | with the js tooling fix https://review.opendev.org/#/c/662339/1 passes now and I think that is a worthwhile update to zuul's js testing as it removes a set of variables we were otherwise having to consider | 16:38 |
clarkb | I rechecked the child change as well, which I think needs a bit more careful review but should also be considered | 16:38 |
clarkb | fixes a js dep issue if it works | 16:38 |
*** mgoddard has quit IRC | 16:48 | |
*** mgoddard has joined #zuul | 16:50 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: WIP: Add caching of autohold requests https://review.opendev.org/663412 | 16:58 |
*** mrhillsman is now known as openlab | 17:02 | |
*** openlab is now known as codebauss | 17:05 | |
*** codebauss is now known as openlab | 17:13 | |
*** openlab is now known as codebauss | 17:14 | |
*** codebauss is now known as openlab | 17:15 | |
*** openlab is now known as codebauss | 17:16 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Store autohold requests in zookeeper https://review.opendev.org/661114 | 17:19 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: WIP: Add caching of autohold requests https://review.opendev.org/663412 | 17:19 |
*** yolanda__ has joined #zuul | 17:20 | |
*** mattw4 has joined #zuul | 17:22 | |
*** yolanda has quit IRC | 17:23 | |
*** jpena is now known as jpena|off | 17:23 | |
*** codebauss is now known as mrhillsman | 17:24 | |
*** jamesmcarthur has quit IRC | 17:33 | |
SpamapS | Has anyone looked in to "zuul pushes" of late? I'm having some awkward conversations about git hashes and it would be great if we didn't have to store Zuul build UUIDs in the binaries built in the gate. | 17:35 |
corvus | SpamapS: not that i'm aware of but i believe it would still be welcome. | 17:37 |
fungi | yeah, it's something i've been looking forward to having for opendev if someone gets time | 17:37 |
fungi | there is a class of features which would be nice to have but hinge on zuul controlling the actual repository state and not delegating that control to the hosting system | 17:38 |
*** spsurya has quit IRC | 17:40 | |
fungi | and at the moment i don't recall what they were, but i *do* remember they seemed like a good idea ;) | 17:41 |
*** flepied has quit IRC | 17:42 | |
corvus | tristanC, pabelanger, tobiash, mordred: i've left a long review on https://review.opendev.org/660856 -- i'd like us all to think about this one a bit. | 17:44 |
corvus | pabelanger: small -1 on 663378 | 17:48 |
tobiash | SpamapS: using github that's not possible (if using app auth) | 17:50 |
*** gtema_ has quit IRC | 17:50 | |
tobiash | corvus: wow | 17:53 |
corvus | tobiash: what's not possible? | 17:53 |
mordred | tobiash: github disallows app auth from pushing changes? | 17:53 |
tobiash | corvus: direct push to a protected branch using an app | 17:54 |
mordred | wow | 17:54 |
tobiash | you can restrict pushes to branches only to real users | 17:54 |
mordred | I suppose a workaround could be to create a normal user account with an ssh key that has push access, yeah? and then tell zuul to use that for pushes? | 17:54 |
tobiash | mordred: yes, but that sounds like an ugly hack and totally violates the point of using app auth | 17:55 |
mordred | yup | 17:55 |
mordred | I agree. it's really sad that github has decided to do that - I'll be disappointed if zuul-push doesn't work for github once we have it implemented :( | 17:56 |
corvus | well, it will work, it just may need that workaround | 17:56 |
mordred | yeah | 17:56 |
tobiash | github has a long standing issue for this and states that they're working on this (since a year or so) | 17:56 |
fungi | maybe they'll have a solution by the time we do | 17:57 |
tobiash | that workaround isn't really workable for large multi tenant deployments, so that at least needs to be optional and disabled by default | 17:58 |
fungi | oh, i entirely expect zuul push to be disabled by default since that would be a significant behavior change to just introduce in an upgrade | 17:59 |
*** panda has quit IRC | 17:59 | |
fungi | even for gerrit deployments it would require additional acls | 17:59 |
mordred | yeah | 18:00 |
tobiash | corvus: re trigger, I haven't thought it through fully but could there also be a possibility D where we can derive this decision from the change itself? | 18:00 |
*** lennyb has quit IRC | 18:00 | |
tobiash | like a gerrit change or github pr always need file matcher | 18:00 |
tobiash | but a branch change or ref change might not, because they probably never have file changes attached | 18:00 |
tobiash | (at least if it hasn't an oldrev) | 18:01 |
fungi | (or a tag, or...) | 18:01 |
*** panda has joined #zuul | 18:01 | |
corvus | tobiash: yes, we should consider that -- i think the main thing to think about is whether we would ever want a branch head to use a file matcher | 18:01 |
fungi | pretty sure ref-updated events have no files section | 18:01 |
fungi | but i may be remembering older gerrit | 18:02 |
corvus | fungi: actually, i think they do... | 18:02 |
fungi | oh? | 18:02 |
*** michael-beaver has joined #zuul | 18:02 | |
tobiash | ref-updated can have files | 18:02 |
fungi | neat! | 18:02 |
corvus | and in cases where they don't, we have drivers fetch them. we use that internally in zuul to update the config | 18:02 |
tobiash | they have old-rev and new-rev and thus at least files can be computed | 18:02 |
*** mgoddard has quit IRC | 18:03 | |
tobiash | but something like enqueue-ref could be enqueued in a way that files have no meaning | 18:03 |
*** mgoddard has joined #zuul | 18:03 | |
tobiash | and timer trigger also would just enqueue a branch head which should probably never have files | 18:03 |
fungi | yeah, i think the enqueue-ref rpc subcommand lacks a --old-rev option | 18:04 |
fungi | oh, nope, --oldrev is supported | 18:04 |
fungi | i'm striking out today | 18:04 |
fungi | it's optional at least (and probably assumes oldrev=0 if unspecified) | 18:05 |
*** sshnaidm is now known as sshnaidm|off | 18:06 | |
fungi | seems --newrev is optional too according to the help output | 18:06 |
corvus | newrev=0==deletion | 18:07 |
fungi | ahh, yeah, that makes sense | 18:07 |
fungi | tags seem to lack an oldrev: http://zuul.opendev.org/t/openstack/build/2b2596f543764a189238c97d895c4390 | 18:08 |
pabelanger | corvus: thanks for both, will dig in more shortly | 18:09 |
fungi | no oldrev for branch updates either http://zuul.opendev.org/t/openstack/build/8d90086e496242f29f7f51c5c98f110b | 18:09 |
corvus | fungi: https://zuul-ci.org/docs/zuul/user/jobs.html#tag-items | 18:09 |
corvus | the docs lay out exactly when each thing is defined | 18:09 |
fungi | oh neat, i hadn't noticed that in the documentation. thorough! | 18:11 |
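The oldrev/newrev convention discussed above can be sketched in a few lines. This is only an illustration, assuming git's usual all-zero object name stands for "no revision"; the function name `classify_ref_update` and the `ZERO_REV` constant are hypothetical and are not part of Zuul.

```python
# Hypothetical helper, not Zuul code: classify a ref change from its revisions.
ZERO_REV = "0" * 40  # assumption: git's all-zero SHA-1 means "no revision"


def classify_ref_update(oldrev, newrev):
    """Return 'create', 'delete', 'update', or 'unknown' for a ref change."""

    def is_null(rev):
        # A missing value, an empty string, or a string of only zeros
        # is treated as "no revision".
        return not rev or set(rev) == {"0"}

    if is_null(oldrev) and is_null(newrev):
        return "unknown"
    if is_null(newrev):
        # Matches "newrev=0==deletion" from the discussion.
        return "delete"
    if is_null(oldrev):
        # No previous revision: the ref is new (for example, a pushed tag).
        return "create"
    return "update"
```

Only the "update" case carries two real revisions, which is the case where a file list can be computed from old-rev and new-rev.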
openstackgerrit | Paul Belanger proposed zuul/nodepool master: Toggle host-key-checking for openstack provider.labels https://review.opendev.org/663378 | 18:16 |
pabelanger | corvus: mordred: ^updated, thanks again for review | 18:16 |
*** mattw4 has quit IRC | 18:26 | |
*** mattw4 has joined #zuul | 18:27 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: WIP: Add caching of autohold requests https://review.opendev.org/663412 | 18:39 |
*** hashar has joined #zuul | 18:42 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add autohold-info CLI command https://review.opendev.org/662487 | 18:48 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Record held node IDs with autohold request https://review.opendev.org/662498 | 18:48 |
Shrews | pabelanger: that host-key-checking change doesn't just optionally override the pool value, it totally eliminates checking of the pool value | 18:52 |
Shrews | pabelanger: is that intentional? the "override" comment in the commit message makes me think it isn't | 18:53 |
*** mattw4 has quit IRC | 18:53 | |
Shrews | corvus: maybe you want to remove your +3 on that until we confirm? | 18:53 |
*** mattw4 has joined #zuul | 18:53 | |
corvus | Shrews: done | 18:53 |
Shrews | sorry i didn't get a chance to look until now | 18:54 |
corvus | Shrews: does line 216-217 handle the fallback? | 18:55 |
corvus | in driver/openstack/config.py | 18:56 |
*** jamesmcarthur has joined #zuul | 18:56 | |
Shrews | corvus: that sets the pool version of that value, which is never checked | 18:57 |
Shrews | i think if he removes the change he made in provider.py, it would work | 18:57 |
*** mattw4 has quit IRC | 18:57 | |
Shrews | err, lemme check something | 18:57 |
Shrews | i think the code squashing in gerrit is throwing me off | 18:59 |
Shrews | corvus: pabelanger: ok, sorry for the noise. lgtm. i'll add the +3 back | 19:02 |
corvus | Shrews: thanks for the extra check :) | 19:02 |
corvus | Shrews: i also think the test still exercises the case you were worried about (label1_nodes should be that case i think) | 19:02 |
Shrews | i saw the +3 and tried to do a hurried review | 19:03 |
corvus | (i just went and double checked that) | 19:03 |
Shrews | cool | 19:03 |
openstackgerrit | Clark Boylan proposed zuul/zuul master: Update axios version and yarn.lock https://review.opendev.org/662316 | 19:06 |
Shrews | corvus: if we ever get a chance to redo nodepool, i want to eliminate the whole "outer label" and "inner label" config concept. It's caused me so much confusion in various scenarios. But maybe that's just me :) | 19:06 |
clarkb | I think it creates confusion but adds some useful flexibility | 19:07 |
corvus | Shrews: i hear you -- functionally it's really useful, but it's confusing. maybe we can find another way to do it, or maybe we just need new words. | 19:07 |
clarkb | it allows you to say this ubuntu-xenial-arm64 and that ubuntu-xenial-x86 image provide the ubuntu-xenial generic type to jobs | 19:07 |
corvus | (and if it's just new words, maybe we can make that change without a big redo) | 19:08 |
corvus | it's actually probably worse in the code than it is in the configfile | 19:08 |
Shrews | yes. that. | 19:09 |
corvus | (is pl a pool-label or a provider-label or...? turns out it's a provider-label in a pool. whatever that is :). | 19:10 |
* corvus -> lunch | 19:10 | |
fungi | so much p in that pool | 19:11 |
Shrews | boo fungi | 19:11 |
* Shrews hopes he's not in the lounge all week | 19:11 | |
fungi | i only do the eight o'clock and ten o'clock show | 19:11 |
Shrews | lol | 19:12 |
fungi | don't forget to tip your wait staff! | 19:12 |
*** jamesmcarthur has quit IRC | 19:12 | |
SpamapS | tobiash: well... we could always give Zuul a real user in the GitHub system for pushing purposes. | 19:12 |
SpamapS | Which would actually be pretty nice, because you could configure branch protection so only Zuul can push. | 19:13 |
tobiash | SpamapS: yes, that's a workaround but would not really fit into our operations model | 19:14 |
*** jamesmcarthur has joined #zuul | 19:14 | |
fungi | yeah, gerrit already requires an account for zuul to listen to its event stream and post label votes/comments, so it's not as big a difference there | 19:17 |
fungi | but also has a much more robust and granular rbac model | 19:17 |
clarkb | fungi: gerrit also requires it to hit the submit button | 19:17 |
clarkb | so no change in needing an account to hit submit vs force push | 19:17 |
fungi | sure, though that would in theory go away when enabling the zuul push feature | 19:18 |
clarkb | well it needs an account to push right? | 19:18 |
fungi | yep, that's what i was saying | 19:18 |
clarkb | oh right submit goes away but not the need for an account | 19:18 |
fungi | i see, you mean requiring an account to push is also no different from requiring an account to call the submit api method | 19:19 |
fungi | which i agree with | 19:19 |
clarkb | yup | 19:19 |
*** armstrongs has joined #zuul | 19:23 | |
*** zbr has quit IRC | 19:25 | |
armstrongs | Hey, has the aws nodepool driver been tested for Windows hosts? I have it working for fedora and centos ec2 instances but I have updated the connection type and port to winrm and it is falling over at key checking. Am I missing any other steps for Windows vms? | 19:25 |
tobiash | armstrongs: I think so far only SpamapS and I tried the aws driver and I think we both didn't test windows on aws | 19:29 |
tobiash | so it might or might not work | 19:29 |
tobiash | please note that the aws driver is still in an early state | 19:30 |
armstrongs | Cool, I had to update it to support private ips already. I can add the logic for windows vms and put in a PR, if that's ok. | 19:33 |
pabelanger | tristanC: could you point me in the direction on how to add pagination to the builds page on the dashboard? I've had a few users request that and struggling to figure out how to add it | 19:35 |
openstackgerrit | Merged zuul/nodepool master: Toggle host-key-checking for openstack provider.labels https://review.opendev.org/663378 | 19:36 |
tobiash | armstrongs: I'd be happy to see this :) | 19:39 |
tobiash | armstrongs: also I have two wip changes to the aws driver: https://review.opendev.org/632712 and https://review.opendev.org/632715 | 19:39 |
tobiash | if that's useful to you | 19:39 |
*** hashar has quit IRC | 19:43 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add caching of autohold requests https://review.opendev.org/663412 | 19:43 |
*** hashar has joined #zuul | 19:44 | |
*** armstrongs has quit IRC | 19:47 | |
*** jamesmcarthur has quit IRC | 19:55 | |
*** jamesmcarthur has joined #zuul | 19:56 | |
*** rlandy|ruck is now known as rlandy|ruck|brb | 20:02 | |
*** panda has quit IRC | 20:04 | |
*** panda has joined #zuul | 20:05 | |
*** mattw4 has joined #zuul | 20:07 | |
mattw4 | Does anyone know why I would see a "DISK_FULL" error reported to Gerrit? My Zuul containers and test nodes seem to have adequate disk space, but I keep seeing this error: "setup-devstack finger://b191afe1a77a/68a2e1f7034744f8afb37c3f44d20745 : DISK_FULL in 0s" | 20:08 |
mattw4 | in the above ^ context, the "setup-devstack" job inherits from devstack-minimal. | 20:09 |
clarkb | mattw4: are you sure the executors and mergers have disk available? | 20:09 |
mattw4 | clarkb: I ran a "df -h" inside all the containers and it showed 60+GB available, but is that the correct way to check? | 20:10 |
corvus | mattw4: what clarkb said and also see https://zuul-ci.org/docs/zuul/admin/components.html#attr-executor.disk_limit_per_job | 20:11 |
corvus | and https://zuul-ci.org/docs/zuul/admin/components.html#attr-executor.min_avail_hdd | 20:11 |
clarkb | assuming docker I believe containers get unlimited disk by default | 20:11 |
clarkb | so df in the container is probably correct | 20:11 |
corvus | probably the disk_limit_per_job then | 20:11 |
mattw4 | corvus: how can I inspect how much scratch space consumed per job? I ran this devstack job a few times yesterday so maybe I'm accumulating logs somewhere? | 20:12 |
mattw4 | clarkb: yeah, it's docker containers | 20:13 |
tobiash | corvus: should we rename DISK_FULL to something like DISK_QUOTA or similar? | 20:13 |
corvus | tobiash: maybe so | 20:13 |
tobiash | I also got complaints why the heck I don't care about full disks as operator... | 20:13 |
corvus | mattw4: nothing should accumulate -- each job starts with a new scratch space | 20:13 |
corvus | tobiash: YOUR_JOB_USES_TOO_MUCH_DISK_SPACE? :) | 20:14 |
tobiash | that's great ;) | 20:14 |
fungi | agreed, some opendev users have been confused by DISK_FULL results and have to get explained that their jobs are archiving more data than zuul is configured to allow | 20:14 |
fungi | MAKE_SMALLER_LOGS | 20:14 |
fungi | ;) | 20:14 |
mattw4 | so how does it work that the job fails before it begins? | 20:14 |
corvus | mattw4: i'd check the executor logs for clues | 20:15 |
mattw4 | my jobs aren't running at all, just failing with the DISK_FULL error. Shouldn't it fail AFTER it accumulates a bunch of logs? | 20:15 |
mattw4 | will do corvus | 20:15 |
fungi | mattw4: when i've seen it, yes. maybe the job is accumulating too much in its workspace at the start somehow? | 20:16 |
corvus | mattw4: yes, that's the typical scenario, since due to use of hard links, we generally avoid counting the git repos against the quota. so when a job starts, it should only be using a few MB at most. | 20:16 |
fungi | i can't say i've seen jobs insta-fail with DISK_FULL | 20:16 |
mattw4 | it's devstack so it clones a LOT of repos first :/ | 20:16 |
fungi | in the executor workspace though? | 20:16 |
fungi | maybe the delegation is wrong? | 20:16 |
corvus | if it's using zuul-provided repos via required-projects, those generally shouldn't count (much) | 20:16 |
corvus | if it does its own cloning, then, that is almost certainly the problem :) | 20:16 |
*** rlandy|ruck|brb is now known as rlandy|ruck | 20:17 | |
mattw4 | I don't think I've properly configured Zuul because I'm not using required-projects | 20:17 |
mattw4 | I'm using roles instead | 20:17 |
mattw4 | in my job definitions | 20:17 |
corvus | mattw4: oooh | 20:18 |
corvus | mattw4: you're using devstack-minimal, right? | 20:18 |
mattw4 | corvus: yeah, trying to write a job that inherits from devstack-minimal | 20:19 |
mattw4 | so I added a bunch of untrusted projects (my dependency walk from yesterday) to the tenant config | 20:19 |
corvus | ok. devstack-minimal inherits from devstack-base which *does* have ERROR_ON_CLONE set to true, so as long as you aren't overriding that, devstack should refuse to clone repos. | 20:19 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: WIP: Auto-delete expired autohold requests https://review.opendev.org/663762 | 20:20 |
mattw4 | and I added {devstack,requirements,cinder} to my required projects in my job definition | 20:20 |
*** tjgresha has quit IRC | 20:21 | |
mattw4 | so I've configured this incorrectly, haven't I? | 20:21 |
corvus | mattw4: nah, that all sounds sensible. let's see what the logs say. | 20:21 |
corvus | mattw4: the other thing that might be helpful is to watch the streaming log as it's running | 20:21 |
fungi | there's a command-line flag to the executor to instruct it not to clean up workspaces right? maybe set that and then see what's taking up so much room? --help says it's called --keep-jobdir | 20:22 |
mattw4 | corvus: I found the error you mentioned: 2019-06-06 20:06:19,868 INFO zuul.ExecutorDiskAccountant: /tmp/tmpt5q_a830/68a2e1f7034744f8afb37c3f44d20745 is using 592MB (limit=250) | 20:22 |
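A log line like the one above can be picked apart mechanically when scanning executor logs for over-quota builds. The following is a minimal sketch that assumes the message format shown in this one line and assumes the limit is in the same unit (MB) as the usage; the name `parse_accountant_line` is hypothetical.

```python
import re

# Assumption: the format is taken from the single log line quoted above.
ACCOUNTANT_RE = re.compile(
    r"(?P<path>\S+) is using (?P<used>\d+)MB \(limit=(?P<limit>\d+)\)"
)


def parse_accountant_line(line):
    """Extract job dir, build id, usage and limit from a disk accountant line.

    Returns None when the line does not match the assumed format.
    """
    match = ACCOUNTANT_RE.search(line)
    if match is None:
        return None
    used = int(match.group("used"))
    limit = int(match.group("limit"))
    path = match.group("path")
    return {
        "path": path,
        # The last path component is the build UUID in the line above.
        "build": path.rsplit("/", 1)[-1],
        "used_mb": used,
        "limit_mb": limit,
        "over_by_mb": used - limit,
    }
```

For the line quoted above this reports 592 MB used against a limit of 250, which is 342 MB over.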
corvus | mattw4: can you grab all the logs for that build and paste them? (grep for 68a2e1f7034744f8afb37c3f44d20745) | 20:23 |
mattw4 | corvus: you want to see all of the logs, right? for each container? | 20:24 |
*** jamesmcarthur has quit IRC | 20:24 | |
corvus | mattw4: sorry, just grep for "68a2e1f7034744f8afb37c3f44d20745" in the executor logs | 20:25 |
mattw4 | corvus: gotcha, just a sec... | 20:25 |
*** jamesmcarthur has joined #zuul | 20:27 | |
*** jamesmcarthur_ has joined #zuul | 20:29 | |
mattw4 | corvus: here's the executor log relating to that build: http://paste.openstack.org/show/752614/ | 20:30 |
*** jamesmcarthur has quit IRC | 20:31 | |
*** mattw4 has quit IRC | 20:31 | |
*** mattw4 has joined #zuul | 20:31 | |
fungi | mattw4: you probably want to go to http://paste.openstack.org/ and paste it there, then let us know the url | 20:32 |
corvus | fungi: i think mattw4 did that | 20:32 |
corvus | http://paste.openstack.org/show/752614/ | 20:32 |
fungi | oh, yep! | 20:32 |
mattw4 | :) | 20:32 |
fungi | i missed it in the part/join | 20:32 |
fungi | though he got kicked for flooding | 20:33 |
fungi | sorry for not reading closely! | 20:33 |
corvus | mattw4: the first time it hits the limit is during the clones, which suggests that there may be a problem with the hard-link system that's used to keep the git repos from taking up too much space | 20:34 |
fungi | yeah, telltale diskaccountant event in the middle of cloning nova | 20:34 |
corvus | exactly | 20:34 |
tobiash | mattw4: maybe the git cache and job dirs are on different filesystems | 20:34 |
corvus | tobiash: i think we default the job dirs to /tmp | 20:35 |
fungi | good point, if they share the same fs then git will use hardlinks | 20:35 |
tobiash | in many distros /tmp is a tmpfs | 20:35 |
mattw4 | tobiash: don't all of the containers just share the host fs? | 20:35 |
mattw4 | gotcha | 20:35 |
mattw4 | I will check | 20:35 |
tobiash | mattw4: containers? so you're running in docker? | 20:35 |
tobiash | in this case you need to ensure that both are within the same bind mount | 20:36 |
corvus | tobiash: yes, via docker-compose | 20:36 |
mattw4 | tobiash: yeah, I started with the quickstart/docker-compose | 20:36 |
tobiash | sorry, I didn't read all | 20:36 |
corvus | tobiash: we default git_dir to /var/lib/zuul/executor-git, but we default job_dir to /tmp | 20:36 |
tobiash | what are the volumes we mount in? | 20:37 |
corvus | considering how frequently /tmp is a different filesystem these days, perhaps we should change the default | 20:37 |
tobiash | also being on the same filesystem is not sufficient in docker, it also must be in the same bind mount | 20:37 |
corvus | tobiash: neither of those are volumes currently. | 20:37 |
mattw4 | tobiash: I thought the docker-compose.yaml would tell me, but I guess I'm not sure how to determine where they're bind-mounted | 20:38 |
tobiash | mattw4: then the output of mount inside the executor container would be helpful | 20:38 |
fungi | well, even moreso, /tmp and /var are *quite* likely to be different filesystems | 20:38 |
tobiash | mattw4: every volume is bind mounted | 20:38 |
fungi | in our (opendev's) deployment, our executors put the git repos and the jobdirs in /var | 20:39 |
mattw4 | tobiash: executor container mount: http://paste.openstack.org/show/752615/ | 20:39 |
tobiash | mattw4: that's it, you see the /var/lib/zuul (git cache is there) mountpoint which makes /tmp (jobs are by default there) a different mountpoint | 20:41 |
corvus | why does /var/lib/zuul appear there? that's not in the docker-compose file... | 20:41 |
tobiash | -> no hardlinks | 20:41 |
corvus | mattw4: did you customize your docker-compose file to add that? | 20:41 |
*** hashar_ has joined #zuul | 20:42 | |
mattw4 | corvus: I added that directory to hold my SSH keys | 20:42 |
*** hashar has quit IRC | 20:42 | |
mattw4 | executor container mounts from docker inspect: http://paste.openstack.org/show/752616/ | 20:42 |
corvus | mattw4: there's an existing volume in the docker-compose file you can use for ssh keys | 20:42 |
corvus | (called 'sshkey' and mounted at /var/ssh) | 20:44 |
mattw4 | corvus: where do I put keys in the host for it to mount there? | 20:44 |
*** armstrongs has joined #zuul | 20:45 | |
corvus | mattw4: it's not bind-mounted from the host, so you'll need to copy it into the container | 20:45 |
corvus | (or, if you want, you could alter the compose file to bind mount a dir in) | 20:45 |
corvus | mattw4: anyway, if you do that, you can drop the /var/lib/zuul mount which will put /var/lib/zuul/executor-git and /tmp on the same filesystem again and hard links will work | 20:46 |
corvus | mattw4: or you could alter the "job_dir" setting. whatever way gets you to the same filesystem. see the docs here: https://zuul-ci.org/docs/zuul/admin/components.html#attr-executor.git_dir | 20:47 |
mattw4 | corvus: thanks! I will give that a try. To summarize: the DISK_FULL error was a result of creating an additional mount that then put /var/lib/zuul/executor-git and /tmp on different filesystems? | 20:48 |
corvus | yep | 20:48 |
corvus | mattw4: (specifically it's the git_dir and job_dir directories -- those are the default values for those) | 20:49 |
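Since being on the same filesystem is not sufficient under docker (the directories must also share a bind mount), comparing device numbers can mislead. A more direct check is to attempt a hard link between the two directories. The sketch below does that with a throwaway probe file; `can_hardlink` is a hypothetical helper, not part of Zuul, and it needs write access to both directories.

```python
import errno
import os
import tempfile


def can_hardlink(src_dir, dst_dir):
    """Return True if a file in src_dir can be hard-linked into dst_dir."""
    # Create a small probe file in the source directory.
    fd, src = tempfile.mkstemp(dir=src_dir, prefix="linkprobe-")
    os.close(fd)
    dst = os.path.join(dst_dir, os.path.basename(src) + ".link")
    try:
        os.link(src, dst)
    except OSError as exc:
        # EXDEV: the link would cross a mount point.
        # EPERM: the filesystem does not support hard links.
        if exc.errno in (errno.EXDEV, errno.EPERM):
            return False
        raise
    else:
        os.unlink(dst)
        return True
    finally:
        # Always remove the probe file itself.
        os.unlink(src)
```

Run inside the executor container, `can_hardlink("/var/lib/zuul/executor-git", "/tmp")` would exercise the default git_dir and job_dir locations mentioned above; substitute whatever values the executor is actually configured with.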
*** hashar_ has quit IRC | 20:50 | |
mattw4 | corvus: just looked at git history: there's a bind mount defined for scheduler for /var/lib/zuul | 20:50 |
corvus | mattw4: well, upstream there's a *volume* defined for /var/lib/zuul on the scheduler. it's not a bind-mount; it's there to persist private key storage. | 20:51 |
corvus | those data are not re-creatable. everything on an executor is, so we didn't persist that in the docker-compose file. | 20:52 |
mattw4 | gotcha. The git log says I didn't alter the docker-compose so did I break something else or is it that the volume doesn't exist in my VM? | 20:52 |
*** armstrongs has quit IRC | 20:54 | |
*** jamesmcarthur_ has quit IRC | 21:03 | |
*** jamesmcarthur has joined #zuul | 21:05 | |
*** jamesmcarthur_ has joined #zuul | 21:16 | |
*** jamesmcarthur has quit IRC | 21:18 | |
*** jamesmcarthur_ has quit IRC | 21:36 | |
*** jamesmcarthur has joined #zuul | 21:37 | |
*** jamesmcarthur has quit IRC | 21:46 | |
*** sanjayu_ has joined #zuul | 23:03 | |
*** jamesmcarthur has joined #zuul | 23:13 | |
*** jamesmcarthur has quit IRC | 23:16 | |
*** jamesmcarthur has joined #zuul | 23:17 | |
*** jamesmcarthur has quit IRC | 23:42 | |
*** jamesmcarthur has joined #zuul | 23:45 | |
*** jamesmcarthur has quit IRC | 23:57 |