openstackgerrit | Merged openstack-infra/zuul master: Fix context directories in image builds https://review.openstack.org/634266 | 00:00 |
---|---|---|
*** shanemcd has quit IRC | 00:02 | |
*** shanemcd has joined #zuul | 00:03 | |
*** rlandy is now known as rlandy|afk | 00:09 | |
*** sdake has quit IRC | 00:32 | |
*** sdake has joined #zuul | 00:34 | |
*** sdake has quit IRC | 00:36 | |
dkehn | clarkb, are there any examples for setting up fingergw in zuul/doc/source/admin/example zuul.conf and docker-compose.yaml? | 00:39 |
tristanC | jhesketh: for zuul-runner instances lifecycle, i'm thinking we could use nodepool driver.handler.launch() with a special Node object that wouldn't require zookeeper... | 00:41 |
tristanC | jhesketh: i know it sounds crazy, but with a user-provided nodepool.yaml, then zuul-runner should be able to use any type of labels (e.g. containers, instances, static-nodes, ...) | 00:42 |
*** rfolco has joined #zuul | 00:43 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add /connections route https://review.openstack.org/631703 | 00:44 |
*** sdake has joined #zuul | 00:55 | |
jhesketh | tristanC: what are you imagining would be in the user-provided nodepool.yaml? and what (if anything) would handle the launching? | 01:03 |
tristanC | jhesketh: well i think we should retain the cli interface to provide node information, but the user could also provide a nodepool.yaml with labels associated with one or many providers | 01:04 |
tristanC | jhesketh: perhaps using cloud-image for openstack if we don't want to bother with nodepool-builder | 01:05 |
tristanC | jhesketh: then we could find a way to instantiate nodepool driver standalone and drive node creation/deletion from the zuul-runner cli | 01:06 |
openstackgerrit | Merged openstack-infra/zuul master: model: remove unused job's BranchMatcher procedures https://review.openstack.org/633643 | 01:07 |
jhesketh | right, I was kinda imagining a very similar thing, but I hadn't considered reusing nodepool to do it (basically I was thinking you'd just provide cloud creds and it'd launch nodes... I get that sounds like nodepool, but it wouldn't be keeping a pool) | 01:08 |
*** sdake has quit IRC | 01:08 | |
jhesketh | I was also wondering if we should publish infra's built images and provide a way to grab those and push them to your own cloud.. but that'd be a high overhead | 01:08 |
tristanC | jhesketh: that would work for cloud instances, but what about other types of instances like k8s pods? | 01:09 |
jhesketh | I had not considered that case tbh | 01:10 |
jhesketh | basically a file that defines for each label what to do. eg: ssh to this static node here; launch a node on this cloud; here are the k8s creds | 01:11 |
*** dkehn has quit IRC | 01:11 | |
jhesketh | which is basically what you're describing above, so I like that plan | 01:11 |
tristanC | jhesketh: ok, i could look into that next week, as it seems like the main missing part for a complete zuul-runner experience | 01:13 |
tristanC | though i'm not convinced re-using nodepool driver code is the right way, it may be worth investigating, but the downside is that zuul-runner user would have to provide a nodepool.yaml configuration | 01:14 |
jhesketh | right, I'm not sure how much of it would be reusable.. We should probably find another term for our configuration like zuul-runner.yaml if we are only needing a subset of nodepool.yaml | 01:15 |
jhesketh | I'd also like to experiment with making the runner-launcher pluggable, so we could have a local libvirt driver to run tests locally | 01:16 |
tristanC | on the other hand, perhaps it would be a nice nodepool improvement if the drivers were usable standalone... | 01:16 |
tristanC | jhesketh: well, if there was a libvirt driver in nodepool, then zuul-runner user would just map the label to it | 01:17 |
jhesketh | Yep! | 01:20 |
tristanC | actually, it would make testing/developing nodepool drivers easier if they could be used standalone... | 01:31 |
*** sdake has joined #zuul | 01:34 | |
*** dkehn has joined #zuul | 01:38 | |
*** sdake has quit IRC | 01:49 | |
*** sdake has joined #zuul | 02:23 | |
*** sdake has quit IRC | 02:33 | |
*** sdake has joined #zuul | 02:35 | |
*** sdake has quit IRC | 02:37 | |
*** sdake has joined #zuul | 02:40 | |
*** sdake has quit IRC | 02:42 | |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: [wip] add openstackci-mirrors element for centos/ubuntu testing https://review.openstack.org/634366 | 02:50 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: [wip] add openstackci-mirrors element for centos/ubuntu testing https://review.openstack.org/634366 | 02:55 |
*** sdake has joined #zuul | 02:58 | |
*** sdake has quit IRC | 03:17 | |
*** sdake has joined #zuul | 03:19 | |
*** sdake has quit IRC | 03:19 | |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: [wip] add openstackci-mirrors element for centos/ubuntu testing https://review.openstack.org/634366 | 03:19 |
*** sdake has joined #zuul | 03:21 | |
*** sdake has quit IRC | 03:22 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add /connections route https://review.openstack.org/631703 | 03:24 |
*** bhavikdbavishi has joined #zuul | 03:36 | |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: [wip] add openstackci-mirrors element for centos/ubuntu testing https://review.openstack.org/634366 | 03:41 |
*** sdake has joined #zuul | 03:44 | |
*** rlandy|afk is now known as rlandy | 03:45 | |
*** rlandy has quit IRC | 03:49 | |
*** sdake has quit IRC | 04:39 | |
*** spsurya has joined #zuul | 04:45 | |
*** invincible has quit IRC | 04:58 | |
*** chandan_kumar has joined #zuul | 04:59 | |
*** chandan_kumar is now known as chkumar|ruck | 04:59 | |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool master: [wip] add openstackci-mirrors element for centos/ubuntu testing https://review.openstack.org/634366 | 05:06 |
*** dkehn has quit IRC | 05:07 | |
*** bhavikdbavishi1 has joined #zuul | 05:37 | |
*** bhavikdbavishi has quit IRC | 05:37 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 05:37 | |
*** pvinci has quit IRC | 06:05 | |
*** bhavikdbavishi1 has joined #zuul | 06:13 | |
*** bhavikdbavishi has quit IRC | 06:14 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 06:14 | |
*** quique|rover|off is now known as quiquell|rover | 06:15 | |
*** jesusaur has quit IRC | 06:47 | |
*** jesusaur has joined #zuul | 06:52 | |
*** pcaruana has joined #zuul | 07:19 | |
*** saneax has joined #zuul | 07:46 | |
quiquell|rover | tristanC: o/ | 07:48 |
quiquell|rover | tristanC: Can I test https://review.openstack.org/#/q/topic:freeze_job ? | 07:48 |
*** panda|off is now known as panda | 07:53 | |
tristanC | quiquell|rover: yes please | 07:56 |
quiquell|rover | tristanC: How do I test it ? | 07:57 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Allow different filenames for Dockerfiles https://review.openstack.org/632979 | 07:58 |
tristanC | quiquell|rover: apply the patch, and then you can run something like: zuul-runner -a http://localhost:9000/api/ --tenant local --pipeline check --project rdo-jobs --job tripleo-ci-centos-7-standalone execute | 08:00 |
tristanC | quiquell|rover: atm node lifecycle management isn't implemented, so you have to give an ip address using --nodes ssh:zuul-worker:instance-ip:/home/zuul-worker | 08:01 |
tristanC | quiquell|rover: the nodes list is comma-separated, thus it should work for multinode jobs too | 08:01 |
*** gtema has joined #zuul | 08:01 | |
quiquell|rover | tristanC: ack | 08:05 |
*** gtema has quit IRC | 08:05 | |
*** gtema has joined #zuul | 08:06 | |
openstackgerrit | Quique Llorente proposed openstack-infra/zuul master: Escape jinja2 stuff from inventory https://review.openstack.org/633930 | 08:08 |
quiquell|rover | tobiash, tristanC: The jinja2 thing https://review.openstack.org/#/c/633930 | 08:09 |
quiquell|rover | corvus: ^ | 08:09 |
*** rfolco has quit IRC | 08:41 | |
*** rfolco has joined #zuul | 08:42 | |
*** jpena|off is now known as jpena | 08:47 | |
quiquell|rover | tristanC: I have to apply just this ? https://review.openstack.org/#/c/631703/ | 08:48 |
tristanC | quiquell|rover: this + the rest of the topic | 08:51 |
tristanC | quiquell|rover: at least 607078 and 607077 on server side | 08:52 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: Proposed spec: tenant-scoped admin web API https://review.openstack.org/562321 | 08:52 |
quiquell|rover | tristanC: Looks like zuul is using a multi-stage Dockerfile to build | 09:12 |
quiquell|rover | tristanC: Do you know if docker-ce is needed or buildah? | 09:12 |
clarkb | quiquell|rover: the zuul image builds use docker ce iirc | 09:23 |
quiquell|rover | clarkb: so distro docker is too old for that, like docker on f28 and c7 | 09:24 |
quiquell|rover | clarkb: thanks | 09:24 |
*** sdake has joined #zuul | 09:27 | |
*** pcaruana has quit IRC | 09:30 | |
*** pcaruana has joined #zuul | 09:42 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: Proposed spec: tenant-scoped admin web API https://review.openstack.org/562321 | 09:43 |
*** luizbag has joined #zuul | 09:54 | |
*** bhavikdbavishi has quit IRC | 10:08 | |
*** electrofelix has joined #zuul | 10:38 | |
*** sdake has quit IRC | 10:51 | |
*** sdake has joined #zuul | 10:51 | |
tobias-urdin | how can i troubleshoot RETRY_LIMIT? I don't have any log storage right now, the zuul logs don't show anything and i'm trying to use finger but not sure how, fingergw is running | 11:06 |
tobias-urdin | i assume i can only finger while the job is running, but its instances are killed pretty much instantly | 11:09 |
clarkb | tobias-urdin: check the executor log if the job is running something it should show there | 11:12 |
clarkb | if there isnt anything in the executor log then scheduler may say why it isnt getting that far | 11:13 |
*** bhavikdbavishi has joined #zuul | 11:13 | |
tobias-urdin | clarkb: i can see "2019-02-01 12:01:45,834 INFO zuul.AnsibleJob: [build: d73a9426ca934e38b4744481f5e17fde] Beginning job tox-py36-linters..." | 11:14 |
tobias-urdin | after that there's only "updating repo" and "checking out..." lines | 11:14 |
clarkb | if you grep on that build id you don't get other info? | 11:15 |
tobias-urdin | cat /var/log/zuul/*.log | grep d73a9426ca934e38b4744481f5e17fde | 11:15 |
tobias-urdin | 2019-02-01 12:01:49,769 INFO zuul.ExecutorClient: Build <gear.Job 0x7f2354394080 handle: b'H:::ffff:127.0.0.1:234' name: executor:execute unique: d73a9426ca934e38b4744481f5e17fde> complete, result RETRY_LIMIT, warnings [] | 11:16 |
tobias-urdin | is there anything stored on executor that could help me troubleshoot why? | 11:17 |
clarkb | I wouldve expected logs from the ansible process too. Retry limit happens after a job fails in the pre run phase up to $limit attempts | 11:17 |
clarkb | so would've expected ansible logs showing pre run fail | 11:18 |
*** bhavikdbavishi has quit IRC | 11:18 | |
tobias-urdin | hm i get zero indication on what is causing pre run to fail, no ansible output at all it seems | 11:20 |
clarkb | any logs that match a grep for ansible-playbook? iirc zuul logs those command lines too | 11:25 |
clarkb | if those are missing it may be failing before ansible runs | 11:26 |
tobias-urdin | grep ansible-playbook /var/log/zuul/*.log | 11:26 |
tobias-urdin | no match, does it log that by default or do i need to change some logging config? | 11:26 |
tobias-urdin | i'm testing a simple hello world now to see if it's something weird or my jobs/roles/playbooks causing it | 11:26 |
clarkb | I dont know off the top of my head if debug level logging is default, but that is probably a good next item to check | 11:27 |
electrofelix | Was there work done for zuulv3 for a generic mechanism to apply review comments to the change under test in gerrit/github? | 11:32 |
*** bhavikdbavishi has joined #zuul | 11:35 | |
clarkb | electrofelix: there is the top level commenting that has always existed and is configurable. But inline support for gerrit comments was added in v3 | 11:36 |
tobiash | ftr: https://zuul-ci.org/docs/zuul/user/jobs.html#leaving-file-comments | 11:37 |
tobiash | but it's currently only supported on gerrit | 11:37 |
electrofelix | was about to ask where to look, many thanks | 11:37 |
clarkb | tobiash: maybe you know answer to tobias-urdin questions earlier? | 11:41 |
tobiash | reading | 11:42 |
tobiash | clarkb: you're awake at this time? | 11:42 |
clarkb | I'm in Brussels for FOSDEM. | 11:42 |
electrofelix | tobiash: if it turns out that it's still only gerrit by the time we upgrade, that might be something I can look at | 11:42 |
tobiash | clarkb: have fun | 11:43 |
tobiash | tobias-urdin: the executor should have pretty detailed logs of ansible runs | 11:43 |
tobiash | tobias-urdin: do you have the executor logs at hand (debug level)? | 11:44 |
tobias-urdin | tried with this http://paste.openstack.org/show/744368/ still fails on RETRY_LIMIT with no logs, need to fix logging config, think i just run with default and no custom logging config right now i.e only a zuul.conf | 11:44 |
tobias-urdin | should maybe note that i was on zuul 3.3.1 and caught this http://paste.openstack.org/show/744361/ so upgraded to 3.5.0 | 11:45 |
tobiash | tobias-urdin: the reason for retry_limit can vary from unreachable node, auth problems to the node, failures in pre-playbooks | 11:45 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Fix failure to add user to docker group on centos https://review.openstack.org/633948 | 11:46 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Fix failure to add user to docker group on centos https://review.openstack.org/633948 | 11:46 |
tobiash | tobias-urdin: but when I look at your job definition you don't have any pre-playbook so I'd assume a communication problem with the node or an auth problem | 11:46 |
tobias-urdin | yeah i removed any pre/run playbooks and added that hello world to test | 11:47 |
tobias-urdin | gonna fix logging conf, reload then recheck that see if i can see anything | 11:47 |
tobiash | k | 11:48 |
tobias-urdin | just need to figure out how first | 11:48 |
tobiash | corvus, clarkb: could it be possible that we have a memleak in the executor? | 11:48 |
tobiash | I noticed that the rss of the executor process is higher every night without load | 11:49 |
tobiash | the weird thing is that during low load the baseline memory consumption looks constant but after high load during the day the low load consumption is higher by several hundred mb | 11:52 |
tobiash | than the day before | 11:52 |
*** EmilienM is now known as EvilienM | 11:58 | |
tobias-urdin | ok so starting executor in cli did enable debug output, it's kind of obvious why it failed now haha http://paste.openstack.org/show/744369/ | 12:00 |
tobias-urdin | sorry for the noise | 12:01 |
tobiash | that explains something :) | 12:01 |
tobiash | corvus, clarkb: that's the rss of the executor container so basically rss of executor+ansible processes: https://paste.pics/a5a599d18aeb3aaaa3ad1218750febc9 | 12:03 |
tobias-urdin | ugh, does ansible "shell" invoke shell using python or a pure shell over ssh? maybe i could install python instead of carrying custom images to that cloud | 12:04 |
tobiash | using python | 12:04 |
tobiash | you can use a raw task for installing python however | 12:04 |
clarkb | tobiash: ya openstack has noticed similar as well as swapping | 12:04 |
*** pcaruana has quit IRC | 12:05 | |
tobiash | clarkb: ok, thx so I'm not chasing a phantom | 12:05 |
tobias-urdin | tobiash: ack, thanks! one step closer to getting rid of the stupid jenkins stuff we have though :) | 12:05 |
tobiash | tobias-urdin: but zuul does gather facts before running the job so python is probably required for this | 12:05 |
tobiash | tobias-urdin: do you have python3 on your images? | 12:06 |
tobiash | in that case you might be able to set the ansible_python_interpreter variable on the nodeset to /usr/bin/python3 | 12:06 |
tobiash | we may want a way in nodepool to define such variables in the future | 12:07 |
tobias-urdin | thanks :) seems like the image has py3.6 by default | 12:09 |
tobiash | clarkb: that's interesting, a paused executor running no jobs anymore: http://paste.openstack.org/show/744371/ | 12:10 |
tobiash | it consumes much cpu and 3.5gb memory | 12:10 |
tobiash | hrm, stack trace contains a job in pause state that is not in the system anymore | 12:17 |
*** pcaruana has joined #zuul | 12:19 | |
*** pcaruana|afk| has joined #zuul | 12:25 | |
*** pcaruana has quit IRC | 12:26 | |
*** pcaruana|afk| is now known as pcaruana | 12:27 | |
*** jpena is now known as jpena|lunch | 12:31 | |
*** panda is now known as panda|lunch | 12:40 | |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Make install-docker compatible with centos https://review.openstack.org/633948 | 12:43 |
*** sdake has quit IRC | 12:55 | |
*** bhavikdbavishi has quit IRC | 13:08 | |
tobias-urdin | tobiash: not sure where i should add ansible_python_interpreter doesn't seem like it reads it | 13:13 |
tobiash | tobias-urdin: where did you try it? | 13:13 |
*** pcaruana has quit IRC | 13:13 | |
tobias-urdin | tried as "vars" in parent job, "host-vars" on nodeset | 13:14 |
tobiash | hrm, I could have sworn that the nodeset allows adding host or group vars, but it seems not: https://zuul-ci.org/docs/zuul/user/config.html#attr-nodeset | 13:14 |
tobiash | tobias-urdin: I think pabelanger managed to use this by setting this as a site var in the executor config | 13:16 |
tobias-urdin | maybe it's even before my playbook is executed | 13:16 |
tobias-urdin | http://paste.openstack.org/show/744390/ | 13:16 |
tobiash | tobias-urdin: https://zuul-ci.org/docs/zuul/admin/components.html#attr-executor.variables | 13:16 |
tobiash | I think this should work | 13:16 |
tobiash | everything else is probably lower in the hierarchy to overwrite the variable zuul sets here: https://git.zuul-ci.org/cgit/zuul/tree/zuul/executor/server.py#n1614 | 13:17 |
*** rlandy has joined #zuul | 13:17 | |
tobiash | tobias-urdin: yes, the first thing zuul runs is a setup playbook that gathers and caches facts | 13:18 |
tobiash | and that runs before any job defined playbooks | 13:18 |
tobias-urdin | since that is in zuul config it means i can't have per-project, nodeset, job etc specific py3 only | 13:19 |
tobias-urdin | right? | 13:19 |
tobiash | correct | 13:19 |
tobiash | so this should probably be changed ;) | 13:19 |
tobias-urdin | or maybe extra-vars overrides it, haven't tried that | 13:20 |
tobias-urdin | but maybe it still doesn't override the hardcoded one | 13:20 |
tobias-urdin | but in normal ansible operations it prob would | 13:20 |
*** pcaruana has joined #zuul | 13:20 | |
tobiash | tobias-urdin: pabelanger already succeeded to override this and I think it was by the executor variables | 13:21 |
tobias-urdin | ack | 13:21 |
*** panda|lunch is now known as panda | 13:23 | |
*** quiquell|rover is now known as quiquell|lunch | 13:23 | |
*** jpena|lunch is now known as jpena | 13:37 | |
sean-k-mooney | clarkb: corvus if ye have time later could ye take a look at https://review.openstack.org/#/c/633796/ and https://review.openstack.org/#/c/632452/ | 13:37 |
sean-k-mooney | i'm hoping to start testing triggering builds from upstream gerrit over the weekend and if i depend on both those changes my third party ci seems to work but it would be nice if i did not have to add the depends-on lines. | 13:38 |
*** sdake has joined #zuul | 13:41 | |
AJaeger | tobias-urdin: regarding https://review.openstack.org/633968 , please discuss this here. This looks like one of the changes that need some discussion - probably with corvus involved. | 13:47 |
tobias-urdin | wrong link right? you are thinking about the revoke-sudo i assume | 13:48 |
*** quiquell|lunch is now known as quiquell | 13:51 | |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Make install-docker compatible with centos https://review.openstack.org/633948 | 14:05 |
*** sdake has quit IRC | 14:06 | |
*** quiquell is now known as quiquell|rover | 14:07 | |
*** quiquell|rover is now known as quique|roverish | 14:07 | |
*** quique|roverish is now known as quiquell|rover | 14:08 | |
*** sdake has joined #zuul | 14:17 | |
AJaeger | tobias-urdin: yeah, that's it, sorry, wrong pasto. Was thinking about https://review.openstack.org/#/c/627534/ and the revoke-sudo. | 14:30 |
*** dkehn has joined #zuul | 14:39 | |
quiquell|rover | tristanC: Like the {% raw %} but I have to change all test now :-) | 14:41 |
openstackgerrit | Quique Llorente proposed openstack-infra/zuul master: Escape jinja2 stuff from inventory https://review.openstack.org/633930 | 14:42 |
*** quiquell|rover is now known as quiquell|off | 14:47 | |
*** quiquell|off is now known as quique|rover|off | 14:48 | |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Make install-docker compatible with centos https://review.openstack.org/633948 | 14:53 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Make install-docker compatible with centos https://review.openstack.org/633948 | 15:17 |
*** panda is now known as panda|braindead | 15:42 | |
dkehn | clarkb: are there any examples for setting up fingergw in zuul/doc/source/admin/example zuul.conf and docker-compose.yaml? | 15:50 |
pabelanger | Is there any way to see which semaphore zuul is currently holding? Trying to debug an issue if some semaphore jobs not running, and believe we may not have unlocked the semaphore currently | 15:55 |
tobiash | pabelanger: not at the moment | 15:55 |
tobiash | you may have to dig in the logs | 15:56 |
pabelanger | we had a nodepool-launcher failure this morning, ran out of HDDs, and starting to think we had an exception launching the semaphore job during that window | 15:56 |
pabelanger | now, jobs are stuck | 15:56 |
corvus | AJaeger, tobias-urdin: it looks like the puppetforge job uploads to puppetforge before fetching the artifact to the executor. if we were to model on the pypi job, we would fetch them to the executor, then upload to puppetforge from the executor. as it is, we're sending the credentials to the remote host, which means we're trusting any project that uses the job (which could be any project in the system). | 15:56 |
pabelanger | tobiash: k, thanks | 15:56 |
tobiash | pabelanger: you could look for the last job that locked it and see how that job finished | 15:57 |
tobiash | pabelanger: that could possibly tell you the code path it took so we can find the broken one | 15:57 |
corvus | AJaeger, tobias-urdin: so instead of changing the revoke-sudo, i'd suggest looking into what is required to upload to puppetforge from the executor. we don't like to install much software on the executor (but we could probably install ruby itself, and then have the jobs install gems in the user's directory) | 15:58 |
*** luizbag has quit IRC | 15:58 | |
*** pcaruana has quit IRC | 15:59 | |
electrofelix | hughsaunders: wonder if you've any idea what it would take for the nodepool agent plugin to support freestyle jobs? | 15:59 |
*** saneax has quit IRC | 16:00 | |
corvus | tristanC, jhesketh: let's not expand the zuul-runner work to include nodepool yet. let's just see if we can get something landed in zuul first. :) jhesketh, maybe you can review tristanC's updates to your changes first, and when they look good, i'll take a look? | 16:03 |
corvus | tristanC: it looks like https://review.openstack.org/630035 is almost ready to land, but either has a bug or needs a test update. | 16:04 |
AJaeger | thanks, corvus for looking into this | 16:07 |
pabelanger | tobiash: okay, we had to restart zuul scheduler. jobs running again. So somehow we leaked semaphores, now comes the effort of trying to find where | 16:28 |
*** sdake has quit IRC | 16:32 | |
pabelanger | tobiash: do you have syntax handy for reenqueue on github? trying to help nhicher with error we are getting, but I've never actually enqueued a github job before. | 16:38 |
*** sdake has joined #zuul | 16:39 | |
tobiash | pabelanger: same as gerrit but change as pr#,<sha of head> | 16:48 |
pabelanger | ack | 16:49 |
corvus | tobiash: oh, i didn't know you had started on the buildset registry idea months ago :) | 16:59 |
tobiash | corvus: yeah that was kind of the demo showcase for job pause ;) | 17:00 |
corvus | tobiash: should i continue with what i'm working on, then maybe later we can fold in some stuff from your role (eg, proxy)? | 17:02 |
tobiash | ye | 17:02 |
tobiash | yes | 17:02 |
tobiash | your registry seems to be more sophisticated with certs and password | 17:03 |
corvus | tobiash: i'm not done yet, but you can see more of the system by searching for "topic: docker-registry" in gerrit | 17:03 |
corvus | there's 5 changes so far. | 17:03 |
tobiash | ah ok | 17:03 |
corvus | (i haven't written the push/pull roles yet, that's the next big piece, but i stubbed out their location in 634347 so you can see the sequencing) | 17:04 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: Add a role to run a buildset registry https://review.openstack.org/634319 | 17:08 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: add role to use buildset registry https://review.openstack.org/634346 | 17:08 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 17:08 |
tobiash | corvus: looking at that topic I think you already have more than I had so I think I can just abandon my version of it | 17:08 |
corvus | tobiash: okay, don't let me forget proxies :) | 17:11 |
corvus | can anyone spot what i'm missing here? http://logs.openstack.org/06/630406/14/check/system-config-run-review/5c71e91/ara-report/result/8db4b584-1ecf-44a7-ab2f-ea70c9edf529/ | 17:13 |
corvus | i expect that command to redirect stdout+stderr to a file, but the file is empty and ansible captures both | 17:14 |
corvus | is that a bashism? and is that not running bash? | 17:15 |
tobiash | corvus: is that a command or shell task? | 17:15 |
tobiash | a command task won't interpret redirections | 17:15 |
corvus | shell: http://logs.openstack.org/06/630406/14/check/system-config-run-review/5c71e91/ara-report/file/5b878655-15bb-4234-ade6-e8f9ab6b1f8e/#line-31 | 17:16 |
corvus | that works as expected locally | 17:17 |
corvus | but my /bin/sh is bash | 17:18 |
corvus | i'll just throw "executable: /bin/bash" at it :) | 17:19 |
tobiash | corvus: yepp, bashism: http://paste.openstack.org/show/744411/ | 17:20 |
corvus | tobiash: thanks for the answer and for telling me about shellcheck :) | 17:20 |
tobiash | shellcheck is awesome :) | 17:21 |
tobiash | and it finds most of the bashisms :) | 17:21 |
tobiash | it even has an online checker: https://www.shellcheck.net/ | 17:21 |
*** jpena is now known as jpena|off | 17:37 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: Add a role to run a buildset registry https://review.openstack.org/634319 | 17:43 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: add role to use buildset registry https://review.openstack.org/634346 | 17:43 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 17:43 |
tobiash | corvus: I've just discovered a problem with job pause | 17:44 |
tobiash | in case of an executor restart with a paused job, we currently restart that job | 17:45 |
tobiash | however I think in this case we actually need to restart all child jobs of it too | 17:45 |
tobiash | otherwise very unexpected and hard to debug symptoms can happen to the child jobs | 17:46 |
*** panda|braindead is now known as panda | 17:46 | |
corvus | tobiash: yes, i think i agree | 17:47 |
*** dkehn has quit IRC | 17:49 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: Add a role to run a buildset registry https://review.openstack.org/634319 | 17:55 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: add role to use buildset registry https://review.openstack.org/634346 | 17:55 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 17:55 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Make install-docker compatible with centos https://review.openstack.org/633948 | 17:56 |
*** gtema has quit IRC | 18:07 | |
*** sdake has quit IRC | 18:08 | |
*** sdake has joined #zuul | 18:09 | |
*** electrofelix has quit IRC | 18:14 | |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Make install-docker compatible with centos https://review.openstack.org/633948 | 18:21 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: Add a role to run a buildset registry https://review.openstack.org/634319 | 18:22 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: add role to use buildset registry https://review.openstack.org/634346 | 18:22 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 18:22 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/zuul-jobs master: Make install-docker compatible with centos https://review.openstack.org/633948 | 18:33 |
*** fdegir has quit IRC | 18:38 | |
*** sdake has quit IRC | 18:39 | |
*** fdegir has joined #zuul | 18:40 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: add role to use buildset registry https://review.openstack.org/634346 | 18:40 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 18:40 |
*** sdake has joined #zuul | 18:40 | |
*** panda is now known as panda|off | 18:47 | |
*** pvinci has joined #zuul | 18:50 | |
pvinci | corvus: thank you for fixing my issue. | 18:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: add role to use buildset registry https://review.openstack.org/634346 | 19:13 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 19:13 |
SpamapS | corvus: IIRC, there was talk of having a way to expose logs via zuul-web. Is there anywhere I can go to keep tabs on that idea? | 19:13 |
SpamapS | I'm setting up an auth frontend for my logs right now and I'd really love to have a comment with "Once this lands we can remove this piece of the infrastructure" type of bread crumb. | 19:13 |
corvus | SpamapS: not afaik; the latest artifact i know of is my email laying out the idea and the thread which follows: http://lists.zuul-ci.org/pipermail/zuul-discuss/2018-July/000501.html | 19:14 |
corvus | SpamapS: i believe that tristanC has gotten some pre-requisite infrastructure in place for that (the build page fetches one file from the log server now), so ad-hoc progress is being made | 19:15 |
corvus | SpamapS: but maybe if we turn that into a story with individual tasks we can get some more momentum on it | 19:15 |
SpamapS | corvus: Indeed, I think it would be really great, then we can have a batteries-included log parser too.. I have so many ideas, but so little time. :-/ | 19:17 |
corvus | yeah. it's about third down on my list right now. but i think other folks could contribute. | 19:18 |
corvus | and i'd rather wait until that's in place before moving opendev to swift logs (because without it, it's a UX regression, with it it's an improvement) | 19:19 |
SpamapS | indeed, my s3 based logs are pretty awful right now | 19:21 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: add role to use buildset registry https://review.openstack.org/634346 | 19:23 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 19:23 |
corvus | lemme etherpad up a task list and if it looks good i'll put it in storyboard | 19:24 |
SpamapS | ty, hopefully I can contribute some | 19:24 |
*** sdake has quit IRC | 19:28 | |
*** sdake has joined #zuul | 19:29 | |
*** sdake has quit IRC | 19:31 | |
corvus | SpamapS: how's that look? https://etherpad.openstack.org/p/mbAfaIpbWN | 19:32 |
corvus | (actually, several hard things in the original email are done now -- not only fetching files, but artifact url return) | 19:32 |
corvus | SpamapS: i went ahead and dumped it in storyboard: https://storyboard.openstack.org/#!/story/2004923 | 19:38 |
corvus | tristanC, mordred: ^ | 19:39 |
*** dkehn has joined #zuul | 19:40 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: Add a role to run a buildset registry https://review.openstack.org/634319 | 19:43 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: WIP: add role to use buildset registry https://review.openstack.org/634346 | 19:43 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 19:43 |
dkehn | is there a place one can look at how the docker zuul images were built for hub.docker.com? like the fingergw user, because it's failing on the zuul user | 19:50 |
corvus | mordred, tobiash: http://logs.openstack.org/23/634323/14/check/test-buildset-registry/15d6904/job-output.txt.gz#_2019-02-01_19_46_38_350178 succeeded! i think the first 2 roles in zuul-jobs are ready to land (run and use registry). we're also ready to run the registry in opendev. topic:docker-registry | 19:50 |
corvus | mordred, tobiash: next i'll work on the pull/push roles, but we won't be able to really test them until we land provides/requires and have the opendev registry running | 19:51 |
corvus | dkehn: yes it's just the Dockerfile in the zuul repo | 19:51 |
corvus | dkehn: we should add fingergw to the zuul-quick-start | 19:51 |
dkehn | corvus: yes, trying to do that now | 19:52 |
dkehn | corvus: the only Dockerfiles I see are logs-Dockerfile and node-Dockerfile, is that correct? | 19:55 |
corvus | dkehn: this one: http://git.zuul-ci.org/cgit/zuul/tree/Dockerfile | 19:55 |
tobiash | corvus: cool :) | 19:55 |
corvus | dkehn: those are only used by the quickstart, and build images locally | 19:55 |
corvus | dkehn: er, lemme clarify -- the logs and node dockerfiles are only used by the quickstart, to build local images for the quick-start's log server and worker node. the dockerfile for all of the zuul service images is the one i linked. | 19:56 |
dkehn | corvus: so what user is in the zuul/zuul-fingergw build, because it's failing on: http://git.zuul-ci.org/cgit/zuul/tree/zuul/lib/streamer_utils.py#n105 using the zuul user | 19:59 |
corvus | dkehn: oh, i think all the container images may run as root right now | 20:00 |
corvus | dkehn: and yeah, looks like the default is zuul -- http://git.zuul-ci.org/cgit/zuul/tree/zuul/cmd/fingergw.py#n66 so we'll need to set the user in the config file to root | 20:01 |
corvus | sigh. i think we should not have a default there | 20:02 |
corvus | (because if there's no user set, we don't drop privs) | 20:03 |
corvus | but that's difficult to do if there's a default user | 20:03 |
corvus | dkehn: try just setting "user=" | 20:05 |
dkehn | corvus: ack | 20:06 |
pabelanger | is there a reason not to create zuul user inside dockerfile? | 20:06 |
pvinci | Is it possible to run jobs on repos without 'files' (.zuul.yaml)? | 20:08 |
dkehn | corvus: that seemed to work thanks, now dying on the socket file. | 20:08 |
pabelanger | pvinci: yup, you can keep their job configuration in your config-project | 20:09 |
dkehn | corvus: going to add the - /var/lib/zuul to the docker-compose.yaml | 20:10 |
pvinci | pabelanger: That's what I thought, but the scheduler wouldn't trigger until I added an empty .zuul.yaml to the repo. I'll keep looking. thanks! | 20:11 |
tobiash | corvus: commented on 634319 | 20:13 |
corvus | tobiash: oh thanks. yeah, i'll update the docs, and... um, remove "WIP" from the commit message :) | 20:13 |
tobiash | corvus: same doc issue applies to the 'use' role | 20:14 |
pabelanger | pvinci: the projects tab on the UI should give you an idea of which jobs are attached to the projects. I am unsure why adding a .zuul.yaml is required, we have untrusted projects in ansible-network without any zuul configuration in them | 20:14 |
mordred | corvus: requires/provides lgtm - do you want me to hold off on +3 for anyone else to look at it? | 20:18 |
SpamapS | corvus: that's great! | 20:23 |
pvinci | pabelanger: ok. That helps. Thanks! | 20:23 |
corvus | mordred: i reckon it's gtg; it's been out there a few days, plus had an email thread. | 20:23 |
tobiash | corvus: I've also commented on 634346 | 20:28 |
tobiash | corvus: regarding proxy support I think we'll have to mess with systemd config. But that'll be more complicated and is definitely something for later when someone needs it. | 20:29 |
*** sdake has joined #zuul | 20:40 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add a role to run a buildset registry https://review.openstack.org/634319 | 20:59 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add role to use buildset registry https://review.openstack.org/634346 | 20:59 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 20:59 |
corvus | tobiash: thanks, that should address all the comments | 20:59 |
tobiash | lgtm | 21:02 |
*** sdake has quit IRC | 21:03 | |
mordred | corvus: the test job has sad | 21:08 |
mordred | http://logs.openstack.org/23/634323/15/check/test-buildset-registry/2e8efb6/job-output.txt.gz#_2019-02-01_21_03_16_501911 | 21:09 |
corvus | hrm. so either there's a syntax error in the config, or we need to wait longer to log in? | 21:10 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 21:14 |
mordred | corvus: I'll be curious to see if that fixes it | 21:17 |
corvus | i'm watching the stream, and i feel like it's not looping as much as i'd expect. | 21:19 |
corvus | the contents of that file look correct | 21:21 |
corvus | oh! | 21:23 |
corvus | i think i need to set restart true on the registry container | 21:23 |
*** sdake has joined #zuul | 21:23 | |
corvus | i'm going to leave that retry loop in there, just to avoid race conditions though | 21:25 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add a role to run a buildset registry https://review.openstack.org/634319 | 21:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add role to use buildset registry https://review.openstack.org/634346 | 21:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: test buildset registry https://review.openstack.org/634323 | 21:26 |
mordred | corvus: seems reasonable | 21:27 |
mordred | corvus: left a nit on 634346 - only matters if you wind up respinning again | 21:28 |
corvus | mordred, tobiash: looks like the test is good now (and the retries weren't needed, but i still think it's good to leave there) | 21:36 |
*** dkehn has quit IRC | 21:42 | |
mordred | corvus: woot! | 21:48 |
*** rlandy has quit IRC | 21:50 |
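
Note on the retry loop corvus kept in the registry test (21:10 to 21:36): the idea is to attempt the login several times so that a registry that is still starting up does not fail the job. A minimal Python sketch of that pattern follows. The names `retry` and `make_flaky_login` are hypothetical and are not taken from zuul-jobs; the real role is an Ansible task whose contents are not shown in this log.

```python
import time
from typing import Callable


def retry(action: Callable[[], bool], attempts: int = 5,
          delay: float = 1.0, sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `action` until it returns True.

    Returns the number of attempts that were needed, or raises
    RuntimeError once `attempts` tries have all failed.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            # Give the service (for example a registry container) time to start.
            sleep(delay)
    raise RuntimeError(f"action still failing after {attempts} attempts")


def make_flaky_login(fail_times: int) -> Callable[[], bool]:
    """Hypothetical stand-in for a registry login that fails a few times first."""
    state = {"calls": 0}

    def login() -> bool:
        state["calls"] += 1
        return state["calls"] > fail_times

    return login


if __name__ == "__main__":
    used = retry(make_flaky_login(2), attempts=5, delay=0.1)
    print(f"login succeeded on attempt {used}")
```

As in the log, a loop like this only covers a startup race; it would not help if the underlying cause were a container that never restarts or a configuration error.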
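
Note on the fingergw user discussion (20:00 to 20:05): corvus points out that the service falls back to a default user of `zuul`, that privileges are only dropped when a user is set, and that setting `user=` to an empty value works around the missing `zuul` user in the container image. The sketch below illustrates that behaviour with Python's `configparser`. It is an illustration only, not Zuul's actual code; the function names and the `[fingergw]` section name are assumptions.

```python
import configparser
from typing import Optional

# Default user mentioned in the log; everything else here is hypothetical.
DEFAULT_USER = "zuul"


def resolve_user(config_text: str) -> Optional[str]:
    """Return the user to switch to, or None if no user is configured.

    An explicit empty value ("user=") overrides the default and yields None.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if parser.has_option("fingergw", "user"):
        user = parser.get("fingergw", "user")
    else:
        user = DEFAULT_USER
    return user or None


def should_drop_privileges(user: Optional[str]) -> bool:
    """Privileges are only dropped when a user name is actually set."""
    return bool(user)


if __name__ == "__main__":
    for text in ("[fingergw]\n", "[fingergw]\nuser=\n", "[fingergw]\nuser=root\n"):
        user = resolve_user(text)
        print(repr(text), "->", user, "drop privileges:", should_drop_privileges(user))
```

This also shows why corvus would prefer no default: with a default in place, the only way to say "do not drop privileges" is to set the option to an empty string explicitly.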