jlk | Does perforce have a code review type system? Proposed changes and reviews and such? | 00:00 |
---|---|---|
clarkb | https://www.perforce.com/products/helix-swarm | 00:01 |
jlk | it has CI hooks, that's good | 00:04 |
mordred | I think it'll be interesting to figure that out at some point | 00:15 |
openstackgerrit | Jamie Lennox proposed openstack-infra/zuul feature/zuulv3: Convert jwt encode to string for github in python3 https://review.openstack.org/479100 | 00:23 |
jamielennox | jlk: ^ | 00:23 |
jlk | so... when reading the docs changes, it seemed to allude to the gerrit code doing some statsd stuffs, but I'm looking in the tree and not seeing it. Am I not looking in the right place? Are the docs wrong? | 00:24 |
jlk | woo! | 00:25 |
jamielennox | obvious once you find it, took too long to find | 00:25 |
mordred | jamielennox: woot! | 00:36 |
mordred | jlk: it actually all seems to be in the scheduler, connection and manager | 00:37 |
jlk | that's what I found | 00:37 |
mordred | jlk: so - I think you already have statsd reporting for github :) | 00:37 |
jlk | heh | 00:38 |
mordred | jlk: that was easy, right? | 00:39 |
jamielennox | i figured we'd build out the github statsd when we found things we actually cared about in github stats | 00:39 |
jlk | mordred: hah, yeah. I was looking for parity with gerrit, but yeah. | 00:40 |
mordred | yah - it's all pretty much reported at the scheduler event and connection event level currently - but yeah, if you find that's not reporting things that are useful for github, that'll be good learning | 00:40 |
mordred | things like: 'zuul.event.{driver}.{event}' and 'zuul.event.{driver}.{connection}.{event}' cover a lot of ground :) | 00:40 |
*** xinliang has quit IRC | 01:07 | |
*** jkilpatr has quit IRC | 01:11 | |
*** xinliang has joined #zuul | 01:21 | |
*** xinliang has quit IRC | 01:21 | |
*** xinliang has joined #zuul | 01:21 | |
*** jkilpatr has joined #zuul | 01:24 | |
jamielennox | SpamapS: not sure yet what caused this but thought you might be interested: http://paste.openstack.org/show/614137/ | 01:45 |
*** jkilpatr has quit IRC | 02:12 | |
tobiash | getting repeated exception in nodepool http://paste.openstack.org/show/614141/ | 05:13 |
tobiash | hrm, possibly unclean source when building the docker image, will rebuild and check if this happens again :-/ | 05:26 |
tobiash | jeblair, Shrews: I added a second cloud provider to nodepool and am observing a strange behavior now | 05:41 |
tobiash | jeblair, Shrews: when at quota (cores exceeded, but instances not) it fills up one provider and declines/fails the node requests constantly without trying the second provider (which still has capacity) | 05:42 |
clarkb | tobiash: you have to size based on instances with the current scheduler | 05:55 |
clarkb | if not at max instances then nodepool assumes it has room | 05:56 |
tobiash | clarkb: I have different sized instances so I cannot calculate a proper instance cound to fill up the cpu quota | 05:56 |
tobiash | s/cound/count | 05:57 |
clarkb | and this is with zuulv3 right? with 2.5 I think the nodepool scheduler would round robin the requests | 06:00 |
tobiash | clarkb: yes | 06:01 |
clarkb | so we may need to add in some sort of round robin to mitigate against this | 06:01 |
clarkb | pretty sure old scheduler did this | 06:01 |
clarkb | actually, current one may attempt it too by only failing the request if all possible providers fail? | 06:02 |
tobiash | clarkb: I'm trying to craft a test case for this (didn't find one) | 06:21 |
jamielennox | did we change the playbook path? | 06:44 |
jamielennox | seeing Unable to find playbook /tmp/df47f0b6d0234431a0c60b94e103a9c3/ansible/pre_playbook_0/github.com/BonnyCI/project-config/base-pre | 06:45 |
jamielennox | where the file is at project-config/playbooks/base-pre.yaml | 06:46 |
tobiash | jamielennox: playbooks/ is not defaulted anymore | 07:07 |
tobiash | (except for main playbook if it's not defined) | 07:07 |
jamielennox | tobiash: yea, guessed as much, just hadn't seen it | 07:07 |
tobiash | jamielennox: https://review.openstack.org/#/c/477672/ | 07:08 |
*** hashar has joined #zuul | 07:15 | |
tobiash | clarkb: never mind, I fucked up my config... max_servers of the second providers was 0 :-/ | 07:34 |
* tobiash needs to grab some coffee | 07:34 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Ensure build.start_time is defined https://review.openstack.org/466732 | 08:11 |
*** bhavik1 has joined #zuul | 08:34 | |
mordred | tobiash: that would do it! :) | 09:35 |
*** pbelamge has joined #zuul | 10:13 | |
pbelamge | Hello All | 10:14 |
pbelamge | has anybody come across this error when running zuul-server in daemon mode? | 10:14 |
pbelamge | https://thepasteb.in/p/Anhrq5mogngcv | 10:15 |
pbelamge | installed zuul version is: | 10:16 |
pbelamge | $ zuul-server --version | 10:16 |
pbelamge | Zuul version: 2.5.2 | 10:16 |
pbelamge | can anybody throw some light on this? | 10:18 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 10:22 |
toabctl | pbelamge, does https://github.com/openstack/ansible-role-zuul support zuulv3? | 10:31 |
toabctl | eh, I meant pabelanger ^^ | 10:32 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Use versions of jobs from zuul-jobs https://review.openstack.org/479253 | 10:37 |
mordred | jeblair: see syntax error on first comment of https://review.openstack.org/#/c/479253/ | 10:39 |
mordred | jeblair: three feedbacks | 10:39 |
mordred | jeblair: a) WOOT - that was a syntax error in openstack-zuul-jobs run speculatively on a change to zuul | 10:40 |
mordred | s/openstack-zuul-jobs/zuul-jobs/ | 10:40 |
*** bhavik1 has quit IRC | 10:41 | |
mordred | jeblair: b) it didn't trigger the syntax error on zuul-jobs - before that change there was no job defined for zuul-jobs and the syntax error was actually in defining the job. that's an edge case, but probably one that'll be confusing to debug (and it was caused by a yaml indentation goof, so probably a _likely_ error) | 10:43 |
toabctl | is there somewhere a zuul.conf example for the v3 version? | 10:45 |
mordred | jeblair: c) when it was reported, it was reported as being in "<unicode string>" - I know that's a passthrough error - and the message in the commit actually did give me the information I needed - but we may want to ponder if it's feasible to be able to replace "<unicode string>" with .zuul.yaml | 10:46 |
mordred | toabctl: we're using: http://git.openstack.org/cgit/openstack-infra/puppet-zuul/tree/templates/zuulv3.conf.erb | 10:47 |
mordred | toabctl: I do not know if https://github.com/openstack/ansible-role-zuul has been updated for v3 yet - and pabelanger is on vacation this week | 10:47 |
mordred | toabctl: the bonnyci ansible may be helpful: https://github.com/BonnyCI/hoist/blob/master/roles/zuul/templates/etc/zuul/zuul_v3.conf | 10:48 |
*** jkilpatr has joined #zuul | 10:48 | |
toabctl | mordred, ok. I'll have a look at these (trying to do a github connected setup currently) | 10:49 |
mordred | toabctl: neat - I need to do one of those soon myself :) | 10:51 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 10:51 |
mordred | jeblair: also - I believe you were talking about this yesterday - but the "run post playbooks if pre-playbooks fail" I think will be super helpful | 11:04 |
pbelamge | anybody know why we get this error when running zuul-server in daemon mode? | 11:06 |
pbelamge | IOError: [Errno 9] Bad file descriptor | 11:06 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 11:09 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 11:14 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 11:14 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 11:20 |
toabctl | mordred, does py34 work? or is really 3.5 required? | 11:46 |
toabctl | or is it just not tested.. | 11:46 |
mordred | toabctl: it's definitely not tested - no clue if it works or not | 11:47 |
mordred | ah - yeah - 3.5 is going to be required for at least some portions to work (the websocket log streaming, for instance) | 11:48 |
mordred | jeblair: vars in job definitions do not seem to be making it into the inventory | 11:49 |
*** hashar is now known as hasharAway | 11:49 | |
mordred | jeblair: nevermind. I don't know how to read logs | 11:52 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 11:56 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 11:59 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 12:06 |
openstackgerrit | Thomas Bechtold proposed openstack-infra/zuul feature/zuulv3: Fix exception handling in scheduler https://review.openstack.org/479283 | 12:27 |
toabctl | mordred, is the tenant_config somewhere documented? | 12:28 |
*** hasharAway is now known as hashar | 12:29 | |
pabelanger | toabctl: mordred: ohai, just on train heading back down to London from Scotland. ansible-role-zuul is tracking against feature/zuulv3 branch, but hasn't been updated in the last few weeks. So, there may be some changes that need to be made. | 12:47 |
toabctl | pabelanger, ok. thanks for the info! | 12:48 |
pabelanger | toabctl: plan to work on updating all the ansible bits when back from PTO on july 10th. | 12:49 |
mordred | ZOMG I JUST SPENT FOREVER TRACKING DOWN AN ANSIBLE SYNTAX ERROR WHICH TURNED OUT TO BE A SINGLE QUOTE IN A COMMENT!!!!! | 12:59 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 13:00 |
openstackgerrit | Thomas Bechtold proposed openstack-infra/zuul feature/zuulv3: Fix non-default username for zuul-executor https://review.openstack.org/479299 | 13:14 |
mordred | jeblair: role path lookup is wrong - if the base job adds roles: openstack-zuul-roles and then the unittest job adds roles: zuul-jobs and both openstack-zuul-roles and zuul-jobs have a role named the same, then the base job roles override | 13:15 |
mordred | jeblair: /tmp/64686e6e77c54b67aa38bad1db8dfc92/ansible/role_0/trusted/git.openstack.org/openstack-zuul-roles/roles:/tmp/64686e6e77c54b67aa38bad1db8dfc92/ansible/role_1/untrusted/zuul-jobs/roles | 13:15 |
*** pbelamge has quit IRC | 13:16 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Use versions of jobs from zuul-jobs https://review.openstack.org/479253 | 13:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Reverse role list before writing it out https://review.openstack.org/479301 | 13:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 13:21 |
*** jkilpatr has quit IRC | 13:22 | |
mordred | jeblair: also - if there is a playbook parse error, we're not logging that to the user currently - but we probably should | 13:23 |
mordred | although looking at the code, that's not going to be easy :( | 13:27 |
*** jkilpatr has joined #zuul | 13:35 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Use versions of jobs from zuul-jobs https://review.openstack.org/479253 | 13:44 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Append ansible yaml parse errors to job log file https://review.openstack.org/479312 | 13:44 |
mordred | actually - not that bad | 13:44 |
tobiash | mordred: looking at ^^, doesn't self.proc.stdout already have the stdout of the first ansible run? | 13:48 |
mordred | tobiash: oh - I guess I could just seek(0) it couldn't I? | 13:49 |
mordred | tobiash: thank you - that's much smarter :) | 13:50 |
tobiash | mordred: in line 1320 self.proc is reset, this would need to be moved below probably | 13:50 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Append ansible yaml parse errors to job log file https://review.openstack.org/479312 | 13:53 |
tobiash | mordred: if seek doesn't work you also could append the lines to a list during logging | 13:53 |
mordred | tobiash: something more like that perhaps? | 13:53 |
tobiash | looking | 13:53 |
mordred | tobiash: yah - I avoided that originally because I was mistakenly thinking I needed to worry about large amounts of output buffering - but we deal with that elsewhere | 13:53 |
mordred | tobiash: so yes, I think we could also append lines to a list | 13:53 |
tobiash | mordred: maybe return failed? | 13:55 |
tobiash | mordred: I think ps2 would trick the result into reporting a successful job | 13:56 |
jlk | o/ | 13:58 |
mordred | tobiash: it actually gets processed by the thing that calls the runAnsible method - if you look around 889 | 13:59 |
mordred | tobiash: in runPlaybooks - when job_status is RESULT_NORMAL, the logic there determines success or failure based on return code - and since it's 4 it should show failed | 14:00 |
tobiash | mordred: ah, you're right | 14:00 |
tobiash | missed that one | 14:00 |
tobiash | mordred: +1 | 14:01 |
mordred | tobiash: but good eye! | 14:01 |
tobiash | thanks | 14:01 |
mordred | jeblair: woot! do you remember when I made that patch for unicode encoding that log line but didn't have an example of when it broke? | 14:02 |
mordred | 2017-06-30 13:54:50,524 DEBUG zuul.AnsibleJob: [build: d9bb1e7de80f44a5adf3e390bedb694a] Ansible output: b"UnicodeEncodeError: 'ascii' codec can't encode character u'\\u011f' in position 64: ordinal not in range(128)" | 14:03 |
*** jkilpatr has quit IRC | 14:07 | |
*** jkilpatr has joined #zuul | 14:07 | |
jlk | toabctl: if it helps, there's some sample stuff up at https://github.com/j2sol/z8s and at https://github.com/j2sol/project-config | 14:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 14:12 |
toabctl | jlk, for the github connection - is the api_token enough? I'm still not sure if I have a working connection... | 14:12 |
jlk | toabctl: yeah you can do it with an api token | 14:13 |
toabctl | jlk, then the zuul-scheduler should write something to the log if I add a comment to a github PR done in a project mentioned in the tenant.yaml, right? | 14:13 |
jlk | toabctl: we are running into a problem right now where the webhook ingestion is triggering a byte decode bug somewhere in webobject, so your hooks won't necessarily get processed right | 14:14 |
jlk | toabctl: no, on the github side you have to configure a webhook to send events to your scheduler | 14:14 |
toabctl | jlk, I have no webhook configured. I thought it works also without webhooks. no? | 14:14 |
jlk | zuul doesn't poll github, it waits for github to send events. | 14:14 |
toabctl | oh. then that's the problem. I thought it polls. | 14:14 |
toabctl | jlk, so zuul listens on 8001 for hooks? | 14:15 |
jlk | yeah | 14:15 |
*** jkilpatr has quit IRC | 14:15 | |
toabctl | jlk, what about the webhook secret? can I share that somehow with zuul? | 14:17 |
jlk | I believe so, I'm just not using that with my small sample setup I use to test zuul code | 14:17 |
jlk | the secret is defined as "webhook_token" in the config | 14:18 |
jlk | oh and the websecret has to be of type json | 14:20 |
jlk | er webhook | 14:20 |
toabctl | jlk, hm. accessing the webhook url:8001 gives a 404 | 14:22 |
jlk | accessing it how? | 14:22 |
jlk | it only accepts certain things in certain paths | 14:23 |
toabctl | jlk, with a browser - http get | 14:23 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 14:23 |
jlk | We send to /connection/github/payload where "github" is the name of the connection in zuul config | 14:23 |
jlk | (that's in the github side webhook url) | 14:24 |
jlk | you should also be able to get status, curl http://localhost:8001/z8s/status works for me because "z8s" is the name of my tenant in etc/zuul/tenant.yaml | 14:27 |
toabctl | jlk, hm. the status curl call returns something | 14:29 |
*** jkilpatr has joined #zuul | 14:29 | |
jlk | a GET on the webhook URL won't work. It takes POST | 14:29 |
toabctl | jlk, the webhook log says that the last delivery was successful. | 14:30 |
jlk | although what I get back is a 405, not 404 | 14:30 |
jlk | toabctl: I think what it delivers is a ping? | 14:30 |
jlk | curl http://localhost:8001/connection/github/payload just returns back a 405 | 14:30 |
jlk | curl -H "Content-Type: application/json" -H "X-Github-Event: issue_comment" -X POST -d @file.json would be how to do it manually | 14:30 |
* jlk has to go afk for breakfast | 14:31 | |
toabctl | jlk, works now! thanks a lot!!! | 14:31 |
*** jkilpatr has quit IRC | 14:42 | |
*** hashar is now known as hasharAway | 14:42 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 14:44 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Handle unicode (or high bytes) coming from output https://review.openstack.org/474827 | 14:44 |
openstackgerrit | Thomas Bechtold proposed openstack-infra/zuul feature/zuulv3: doc: Mention tenant configuration parameter https://review.openstack.org/479338 | 14:51 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add web-based console log streaming https://review.openstack.org/463353 | 14:52 |
mordred | Shrews: morning sir! | 14:53 |
Shrews | g'morn | 14:54 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add web-based console log streaming https://review.openstack.org/463353 | 14:55 |
toabctl | a general question - does anybody care about zuulv3 documentation updates? | 14:56 |
mordred | toabctl: yes! there is a big docs stack - (jeblair just did the first chunk of making it good) - ending here: https://review.openstack.org/#/c/479020/ | 15:05 |
toabctl | mordred, ahh. good to know! | 15:06 |
mordred | toabctl: (we need to land that) - so if you want to update the docs based on stuff you've found or experienced, it should stack on top of that - but we very much value such things! | 15:06 |
mordred | Shrews: this is a very half-baked patch: https://review.openstack.org/#/c/474827/ here - I don't know 100% of what the deal is - but I *think* that perhaps we've got a mismatch somewhere in the log streaming pipeline WRT utf-8/ascii | 15:07 |
mordred | Shrews: not so much that things don't work normally - but in some cases we seem to be getting the occasional high-byte thing and can't log it | 15:08 |
mordred | Shrews: I'm mentioning it because it might be worthwhile for us to make sure we know, since we're sending stuff over the wire, what we're expecting the encoding to be on both sides of that (like, it's likely I'm not decoding where I should or am assuming one encoding but it's the other but it works because shell output is mostly in the ascii/utf-8 overlap) | 15:09 |
Shrews | mordred: ack. will take a peek | 15:21 |
tobiash | hrm, my nodepool sometimes seems to try to create new nodes and waits for them before handing them back to zuul even if it lists several nodes as ready and unlocked | 15:50 |
*** jkilpatr has joined #zuul | 15:53 | |
tobiash | will have to debug next week | 15:57 |
jeblair | 10:43 < mordred> jeblair: b) it didn't trigger the syntax error on zuul-jobs - before that change there was no job defined for zuul-jobs and the syntax error was actually in defining the job. that's an edge case, but probably one that'll be confusing to debug (and it was caused by a yaml indentation goof, so probably a _likely_ error) | 16:01 |
jeblair | mordred: yes, that's similar (or maybe the same?) as this story about adding a project to a pipeline: https://storyboard.openstack.org/#!/story/2000898 | 16:01 |
jeblair | 10:45 < mordred> jeblair: c) when it was reported, it was reported as being in "<unicode string>" - I know that's a passthrough error - and the message in the commit actually did give me the information I needed - but we may want to ponder if it's feasible to be able to replace "<unicode string>" with .zuul.yaml | 16:01 |
jeblair | mordred: the intent is to have actual files there, and i thought we did. if we lost that we may have introduced a bug | 16:01 |
jeblair | toabctl: yes, the tenant configuration is here: http://docs-draft.openstack.org/94/477594/4/check/gate-zuul-docs-ubuntu-xenial/c9e1704//doc/build/html/admin/tenants.html (that's a pre-rendering based on that big stack of docs patches mordred mentioned) | 16:02 |
jeblair | mordred: what's going on in this log? http://logs.openstack.org/65/478265/17/check/7a44730/job-output.txt (that's the port in tox jobs change, and it's a failing run of zuul-tox-linters) | 16:09 |
*** jkilpatr has quit IRC | 16:11 | |
*** jkilpatr has joined #zuul | 16:11 | |
mordred | jeblair: ah! neat - I got further that time | 16:13 |
mordred | jeblair: I believe that's recursing on a thing I don't need but thought I did because of earlier bugs | 16:13 |
mordred | jeblair: (I was hitting what wound up being yaml syntax errors before - but was misdiagnosing them) | 16:14 |
jeblair | oh that's the ' thing | 16:14 |
mordred | jeblair: well, that and the role path sequencing issue | 16:14 |
mordred | yah | 16:14 |
mordred | ' in comments == sad yaml | 16:14 |
jeblair | i guess you have to write ansible-bash like Data -- no contractions. | 16:15 |
Shrews | it's sad i know the episode of the show that references | 16:21 |
jeblair | mordred: i'm not sure about the roles order thing. if we allow child roles to override base/parent roles, then someone could put a malicious 'upload-logs' role in their repo and our base job would run it (they would have to *land* the change in their repo since we won't run it speculatively, but they could do that) | 16:22 |
jeblair | mordred: so i think we either need to decide that base roles have precedence, or we're going to need to make a unique ansible config file for every playbook invocation, and track the roles list through inheritance (so each playbook invocation only runs with the roles defined up to that point in the inheritance hierarchy) | 16:24 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Convert jwt encode to string for github in python3 https://review.openstack.org/479100 | 16:24 |
mordred | jeblair: hrm. I was thinking that the malicious role wouldn't be a problem because the base job would not run with the child jobs roles in place- but that's the second thing you said, so yeah | 16:25 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Use only project name in github repo creation https://review.openstack.org/478734 | 16:26 |
mordred | jeblair: it feels like the ubiquitous namespace problem once again - a job defined in a repo that has a role in that repo doesn't want to be magically broken by a role defined in a base repo that it doesn't know about | 16:27 |
mordred | jeblair: (like if ursula had a local 'run-bindep' role, for instance - it would expect to be using its role, not the CI system's) | 16:27 |
mordred | jeblair: at least at the moment of invocation of the actual content that is defined in that repo ... although I suppose in that case it's likely the case that they'd have their roles adjacent to their playbooks, in which case the local roles _would_ take precedence | 16:28 |
mordred | however, if it's like the tobiash case where there is a repo with roles and then some playbooks to test those roles, I would imagine such a job would definitely want to ensure it got the local copies | 16:29 |
mordred | jeblair: I'm starting to think we need to do the second thing you said and write an ansible.cfg for at least each level of the inheritance stack - with the roles_path being additive each layer down the onion | 16:31 |
Shrews | mordred: that zuul_stream error is weird. looks like we decode the things properly on that code path | 16:34 |
mordred | Shrews: AWESOME | 16:36 |
* Shrews rereads the py3 unicode howto | 16:36 | |
jeblair | mordred: okay. i think we can do that -- i think we create a playbook context object for each playbook, so it'd probably be a matter of freezing the role list and sticking it in there each time we make one. then, obviously, writing an ansible.cfg file for each invocation. | 16:39 |
jeblair | mordred: it's worth considering which is the least surprising to users -- should a job which inherits from other jobs have a consistent set of roles for all stages, or should each stage only run with the roles that the author knew about at that stage. | 16:40 |
jeblair | mordred: we have a N=1 experiment which shows that you expected the latter and were surprised by the former. :) | 16:41 |
jlk | toabctl: you aren't seeing any decoding errors when you get webhooks sent to you? | 16:46 |
mordred | jeblair: yah. I think my argument for the latter is mainly one of encapsulation - as a person building a job using a system provided 'base' job, I would be surprised if I needed to know things about the actual ansible roles the base job used to implement its functionality | 16:52 |
mordred | jeblair: knowing "the base job does setup including ensuring you have git repos in X and ensures logs you put into logs/ are published" is fair - but knowing it uses the prepare-workspace and upload-logs roles seems weird | 16:53 |
jeblair | mordred: yeah, i'm starting to think the onion roles approach strikes the right balance between being able to be ignorant ("i don't care if it uses run-bindep") but also being able to build on it ("i know the parent role list includes run-bindep so it will be available for me to use in my child job") | 16:54 |
mordred | ++ | 16:54 |
tobiash | ++ | 16:55 |
jeblair | 17:03 < openstackgerrit> James E. Blair proposed openstack-infra/project-config master: Zuulv3: Add base-test job https://review.openstack.org/479098 | 17:03 |
jeblair | mordred: ^ | 17:03 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 17:07 |
mordred | jeblair: cool! | 17:07 |
tobiash | jeblair: how does that work? Isn't the base job in a trusted repo without speculative execution? | 17:08 |
jeblair | tobiash: yes -- so with base-test i'm testing out a change i want to make to base. since i have to actually land it to change it, this way i can break/fix base-test until i get it right, then update base. | 17:10 |
tobiash | jeblair: so in base-test you're using the same roles/playbooks there like the base job, but with speculative execution? | 17:11 |
tobiash | ah no, I get it | 17:11 |
tobiash | you land the changes to base-test, try it out,... | 17:12 |
tobiash | like a staging concept | 17:12 |
*** hasharAway has quit IRC | 17:15 | |
*** hashar has joined #zuul | 17:15 | |
tobiash | jeblair: does it make sense being able to mark a job as private (so it cannot be inherited from a different repo)? | 17:16 |
tobiash | jeblair: use case would be again jobs for export and in the same zuul.yaml jobs for testing the exported jobs (which are not for public use and therefore should not be inherited by anyone except jobs in the same repo) | 17:17 |
tobiash | jeblair: currently I'm solving this by documenting them as private | 17:18 |
jeblair | tobiash: yep, like staging. | 17:19 |
tobiash | jeblair: this just came to my mind reading your comment of the base-test job | 17:19 |
tobiash | 'not for public use' ;) | 17:19 |
jeblair | tobiash: oh, i think we've got an idea of a 'final' job, i think we just haven't gotten around to exposing that as a job attribute. | 17:19 |
tobiash | jeblair: I read that in the documentation but the semantics are described differently | 17:20 |
tobiash | jeblair: ... "A job may inherit from any other job in any project (however, if the other job is marked as final, some attributes may not be overidden)." | 17:20 |
jeblair | (it's used internally for some stuff already, but there's no reason we can't say "final: True" on a job definition to indicate that it may not be inherited from) | 17:21 |
jeblair | tobiash: yeah, the things you can override on a final job are just the selectors and failure/success message, etc. not anything that can actually alter the job behavior. | 17:22 |
jeblair | tobiash: so you can still say "run this final job on the stable branch of this repo" | 17:22 |
tobiash | jeblair: ok, that's fine for me | 17:24 |
tobiash | jeblair: will zuul throw an error if you try to override stuff from a final job or just silently ignore the overrides? | 17:25 |
jeblair | tobiash: error, i believe | 17:25 |
tobiash | :) | 17:25 |
jeblair | "Unable to modify final job %s attribute %s=%s with variant%s" | 17:26 |
SpamapS | jamielennox: Indeed, the admin protocol functions aren't covered by tests, and that is just a py3 fail. | 17:26 |
SpamapS | jeblair: http://paste.openstack.org/show/614137/ | 17:26 |
SpamapS | I'll work up some tests and a fix | 17:26 |
jeblair | SpamapS: thanks! that will be nice to have working. :) | 17:28 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Append ansible yaml parse errors to job log file https://review.openstack.org/479312 | 17:36 |
mordred | jeblair: ok - there's that ^^ | 17:36 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Document final jobs https://review.openstack.org/479379 | 17:38 |
tobiash | jeblair: noticed that final was not part of the job documentation | 17:39 |
jeblair | tobiash: yes, i don't think it's implemented in the job definition yet either :) (it just needs to be exposed in configloader.py) | 17:40 |
tobiash | jeblair: ah ok, thought there was some yaml magic in there | 17:40 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add web-based console log streaming https://review.openstack.org/463353 | 17:42 |
Shrews | cleaned up the rpclistener code a bit ^^^. the nested loop thing was itching my brain | 17:42 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Expose final job attribute https://review.openstack.org/479382 | 17:46 |
tobiash | jeblair: like this? ^^ | 17:47 |
tobiash | is it really that easy with this simple_attributes list? | 17:47 |
jeblair | tobiash: should be :) | 17:48 |
jeblair | we'll probably want a test for that too; we really don't want to accidentally break that since it has security implications. | 17:48 |
tobiash | jeblair: of course | 17:49 |
mordred | jeblair, SpamapS: did we never land "always use bubblewrap"? | 18:00 |
SpamapS | yes that landed | 18:00 |
SpamapS | I think? | 18:00 |
mordred | piddle. my local checkout is old | 18:01 |
jeblair | yep it landed | 18:01 |
mordred | yes. now I agree with you :) | 18:02 |
jeblair | i'm afk for a bit | 18:02 |
jlk | oh that's interesting. Didn't realize I could mouse click in gertty... | 18:09 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Write secrets into their own file, not into inventory https://review.openstack.org/479390 | 18:14 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 18:16 |
mordred | SpamapS, jlk, jeblair: ^^ that idea look sane? | 18:17 |
jlk | looking | 18:17 |
*** bhavik1 has joined #zuul | 18:17 | |
jlk | which one? | 18:18 |
mordred | oh - hah - the secrets one | 18:18 |
jlk | okay that's what I have open | 18:18 |
mordred | good - the jobs one is still heavy-iterate | 18:18 |
jlk | when vars live in their own file, and are grabbed by -e @foo.yaml or whatever, it changes the inheritance order, vs them being in the inventory. It probably doesn't matter, but something to be aware of. | 18:19 |
SpamapS | You know.. I got to thinking about that lwn article.. and I wonder if you could make zuul work with the pure email/mailing list workflow just by using procmail. | 18:20 |
jlk | http://docs.ansible.com/ansible/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable | 18:20 |
jlk | SpamapS: OUT | 18:20 |
jlk | mordred: specifically, one cannot override a variable using set_fact or whatnot if that variable is defined in an "extra vars" file. It's the ULTIMATE IMMUTABLE truth. | 18:21 |
* SpamapS hangs head | 18:26 | |
mordred | jlk: that's good to know - do you think that's good or bad in this case? | 18:27 |
jlk | neutral. | 18:27 |
mordred | SpamapS: I'm 100% certain we'll be able to implement zuul drivers for the pure email/mailing list workflow - I'm less certain that doing so purely in procmail would cause joy (although it's procmail, you can implement anything in it) | 18:28 |
jlk | We just have to document the fact so that if somebody wants to be able to change a secret partway through, they use a placeholder variable rather than the secret variable directly | 18:28 |
mordred | jlk: ++ | 18:28 |
jlk | I have at one time seen a CI like system implemented via cups | 18:29 |
jlk | (since it has a scheduler...) | 18:29 |
*** Shuo has joined #zuul | 18:29 | |
clarkb | jlk: was it to test printers? | 18:30 |
jlk | no | 18:30 |
jlk | software | 18:31 |
clarkb | I wonder if my old printer tester script is still around | 18:31 |
SpamapS | nice | 18:31 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 18:31 |
jlk | or maybe it was the build farm, can't remember | 18:31 |
tobiash | jeblair: hrm, as far as I understand from the code the final attribute is only taken into account during evaluation of the job variants | 18:31 |
SpamapS | and I didn't really mean procmail.. but.. I mean just by feeding emails into Zuul programmatically (and spitting them back out) | 18:31 |
tobiash | jeblair: when trying that with inheritance there don't seem to be any checks against that | 18:31 |
mordred | jlk: there was a time in my distant past where I implemented a data-feed processing system primarily in procmail | 18:31 |
mordred | SpamapS: yes. I'm 100% certain we will implement that | 18:32 |
jlk | that kind of work lives on | 18:32 |
mordred | jlk: if the company hadn't gone bankrupt, I'm sure that script would still be running today | 18:32 |
jlk | there's things like mailgun | 18:32 |
jeblair | SpamapS, mordred: for a treatment on the subject.... :) http://lists.openstack.org/pipermail/openstack-infra/2017-May/005361.html | 18:32 |
mordred | yup | 18:33 |
mordred | making zuul receive and send email is the easy part - doing the mailing list depends-on syntax there ^^ is where the real fun comes in :) | 18:33 |
jeblair | and mapping source repos | 18:34 |
mordred | yup | 18:34 |
* mordred is looking forward to making that one of these days- stewart keeps asking for it | 18:34 | |
SpamapS | Actually I found out Monday a pair of Xen VMs I set up 7 years ago running tokyotyrant (a memcache protocol backed by tokyocabinet k/v store) just got replaced by MySQL 5.7. They hadn't been rebooted, in fact they were forgotten, for about 7 years. | 18:36 |
mordred | SpamapS: that's awesome | 18:36 |
jeblair | mordred: if i'm following correctly, your change incidentally makes secrets trusted-only. that's a significant change i believe? secrets are *most* useful in untrusted repos (since those are the repos where people don't have a backchannel way of getting secret info onto the executor). | 18:36 |
jeblair | mordred: in other words -- shade would no longer be able to have your cloud credentials and go out and perform real-world tests | 18:37 |
SpamapS | They served every session request for every logged in user for all the clients of my old employer during that time. :P | 18:37 |
mordred | jeblair: oh - yah - I'm dumb | 18:37 |
mordred | jeblair: thank you | 18:37 |
mordred | SpamapS: heh. with that being the timeframe - that server COULD have been running drizzle | 18:37 |
SpamapS | so true | 18:38 |
SpamapS | Drizzle's what got me out of that place ;-) | 18:38 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Write secrets into their own file, not into inventory https://review.openstack.org/479390 | 18:38 |
SpamapS | and to be forthcoming and honest.. their traffic has been in a slow, steady decline, since I left ;-) | 18:38 |
mordred | jeblair: that should be better :) | 18:38 |
jeblair | mordred: can i trouble you for a commit message fix? | 18:39 |
SpamapS | But still kinda proud of the fact that they basically were able to forget those things existed (*mumble mumble security*) | 18:39 |
jeblair | mordred: i would normally not ask, but this is one where we want the change history to be clear if we have to dig it up. :) | 18:39 |
SpamapS | I would gladly merge your patch Tuesday for a commit message edit today, sir. | 18:39 |
jeblair | i ain't mergin' notin' on tuesday | 18:39 |
*** bhavik1 has quit IRC | 18:40 | |
jeblair | 'cept maybe some beers | 18:40 |
jlk | I'll be merging some sunscreen on my skin | 18:41 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Write secrets into their own file, not into inventory https://review.openstack.org/479390 | 18:44 |
jeblair | made my own commit message update. :) | 18:44 |
jeblair | mordred: it's still missing the '-' from '-e'. that's yours to fix. :) | 18:47 |
mordred | jeblair: we're frequently getting UnicodeEncodeError: 'ascii' codec can't encode character u'\\u011f' in position 64: ordinal not in range(128) | 18:47 |
jeblair | mordred: does that mean we can reproduce it? | 18:48 |
mordred | maybe - I've got a finger log open right now - waiting to get the corresponding text log to see if we can triangulate | 18:50 |
jeblair | mordred: looking at 5df57c6ae1a5428299781c9fb3f78675 ? | 18:50 |
mordred | jeblair: 4f4ec3c72b374278be7506b0ac8fe322 | 18:51 |
jeblair | you did say often :) | 18:51 |
mordred | yah | 18:51 |
mordred | pretty much every job best I can tell | 18:51 |
jeblair | b"UnicodeEncodeError: 'ascii' codec can't encode character u'\\xe7' in position 3726: ordinal not in range(128)" shows up as well (different character) | 18:52 |
mordred | yah | 18:52 |
jeblair | mordred: oh wait | 18:53 |
jeblair | that's running in python2.7 | 18:53 |
jeblair | i thought our ansible was running in python3 | 18:53 |
mordred | oh. of course it is | 18:53 |
jeblair | (did i munge up the pip install? i had to do some pip things a few days ago) | 18:53 |
mordred | jeblair: yes | 18:54 |
jeblair | i will fix my mess | 18:54 |
mordred | kk. well - good to know that it's a python2 issue - which is why the 'fix' patch doesn't make sense :) | 18:54 |
jeblair | yes! | 18:54 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Write secrets into their own file, not into inventory https://review.openstack.org/479390 | 18:56 |
jeblair | okay, should be fixed. i think we should see the new behavior without a restart. | 18:56 |
mordred | jeblair: cool. also - maybe we can land https://review.openstack.org/#/c/479312/ and do a restart ? | 18:56 |
jeblair | Shrews: ^ the encoding thing may be due to a sysadmin screwup on my part | 18:56 |
mordred | SpamapS: you have a sec to look at 479390 ? | 18:57 |
jeblair | i'm going to switch on keep to debug the base-test thing i'm working on (since we don't have run-post-if-pre-fails implemented yet) | 19:00 |
mordred | jeblair: yes please | 19:02 |
mordred | jeblair: I've had that on this morning already, fwiw | 19:03 |
jeblair | mordred: it has a bug and doesn't work :( | 19:03 |
mordred | jeblair: wait - what? | 19:03 |
mordred | keep? I've been totally looking at kept dirs | 19:04 |
mordred | we may be using different pronouns | 19:04 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Append ansible yaml parse errors to job log file https://review.openstack.org/479312 | 19:04 |
jeblair | mordred: i'm talking about 'zuul-executor keep' which turns on the '--keep-jobdir' option which causes it not to delete the jobdirs at the end of the job, eg, /tmp/af2f6d96f8d2408d9a13913a644774d5 | 19:05 |
mordred | yah. I thought I used that earlier - maybe I just got good at getting into the dir before it vanished | 19:05 |
jeblair | mordred: that might be it :) | 19:05 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix typo in keep/unkeep commands https://review.openstack.org/479403 | 19:06 |
jeblair | mordred: maybe we can merge that too before we restart? :) | 19:06 |
mordred | jeblair: ++ | 19:06 |
mordred | jeblair: our jobs that run shell tasks on localhost (adding the ssh key) spin up and tear down a zuul console for it - but obviously localhost is - well - shared | 19:08 |
mordred | jeblair: the spin up isn't a problem- any zuul_console can serve the stream for any file | 19:08 |
*** Shuo has quit IRC | 19:08 | |
mordred | jeblair: but one job can tear down the console while another is using it I believe | 19:09 |
mordred | (noticed a bunch of console-{uuid}.log files in /tmp) | 19:09 |
jeblair | mordred: do we need to tear down at all? | 19:10 |
mordred | we added the kill function so that we wouldn't leave tons of stuff laying around on re-used static nodes | 19:11 |
jeblair | mordred: we're just talking about the process, right? | 19:11 |
jeblair | we'd only be leaving one process laying around (we could still delete the log file) | 19:11 |
mordred | yes - but also the log files in /tmp for each of the shell commands | 19:11 |
mordred | yah - we could just not kill - the startup seems to handle an existing console existing | 19:12 |
mordred | we'll need to come up with another log file delete strategy though - the current one won't work for a shared zuul_console | 19:12 |
jeblair | mordred: what's the role that does that? | 19:13 |
mordred | (it does a delete on /tmp/zuul_console*log) | 19:13 |
mordred | jeblair: I believe it's the last thing in the post playbook of the base job | 19:13 |
mordred | oh - hrm. no it's not | 19:14 |
mordred | jeblair: ok. we seem to have lost that | 19:15 |
mordred | jeblair: I'll put it on my list to sort out | 19:15 |
jeblair | mordred: also https://review.openstack.org/474216 is relevant | 19:16 |
mordred | jeblair: OOHHHHHHH - yup | 19:17 |
mordred | so I'm just wrong all the way around :) | 19:17 |
jeblair | so do we have something creating stray console log files? | 19:17 |
mordred | the shell module. so it's just a cleanup thing | 19:17 |
jeblair | cool | 19:17 |
mordred | *phew* | 19:17 |
jeblair | with that... lunchtime. :) | 19:18 |
mordred | jeblair: when your patch lands, I'm going to restart the executors | 19:18 |
mordred | s/s$// | 19:18 |
*** jlk` has joined #zuul | 19:20 | |
*** jlk has quit IRC | 19:22 | |
*** timrc has quit IRC | 19:24 | |
*** timrc has joined #zuul | 19:25 | |
*** jlk` is now known as jlk | 19:25 | |
*** jlk has quit IRC | 19:26 | |
*** jlk has joined #zuul | 19:26 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix typo in keep/unkeep commands https://review.openstack.org/479403 | 19:28 |
jeblair | mordred: cool; i think there's still something finicky with the pidfile, so i think 'service zuul-executor stop', followed by 'rm /var/run/zuul-executor/zuul-executor.pid' may be necessary. | 19:28 |
mordred | jeblair: kk | 19:31 |
mordred | jeblair: and we're bumping the installed version with pip3 install . yeah? | 19:32 |
jeblair | mordred: ya | 19:32 |
jeblair | that 3 was the source of my recent troubles. | 19:32 |
jeblair | (i mean puppet will do it, eventually, but we're not patient) | 19:33 |
mordred | jeblair: I just did the same thing you did | 19:33 |
jeblair | mordred: you just did 'pip install .'? yay! | 19:34 |
jeblair | mordred: i think 'pip uninstall zuul ansible'; 'pip3 uninstall zuul ansible'; 'pip3 install /opt/zuul' will repair. | 19:35 |
mordred | jeblair: sigh. it is not starting, nor is it giving any error | 19:38 |
mordred | jeblair: if I start it with -d - it happily sits there | 19:39 |
jeblair | mordred: lemme take a crack | 19:40 |
jeblair | mordred: systemd thought it was running. i did 'service zuul-executor stop' and 'service zuul-executor start' | 19:41 |
clarkb | you might need systemctl ? | 19:41 |
mordred | jeblair: it's all yours | 19:41 |
clarkb | not sure how service works | 19:41 |
* mordred is SO HAPPY that in 2017 we've managed to make starting daemons hard. so much progress. | 19:42 | |
jeblair | mordred: ^ is running | 19:42 |
mordred | jeblair: I agree | 19:42 |
mordred | jeblair: what did you do? | 19:42 |
jeblair | mordred: systemd thought it was running. i did 'service zuul-executor stop' and 'service zuul-executor start' | 19:42 |
mordred | ah. nod | 19:42 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 19:51 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Expose final job attribute https://review.openstack.org/479382 | 19:52 |
jeblair | okay, i have stuffed my face | 19:52 |
mordred | jeblair: I believe I have witnessed a race condition you might want to know about | 19:53 |
jeblair | i turned on keep | 19:53 |
mordred | jeblair: /tmp/b86f1bb44c32430a8c7752945fb9f7d9 is the job dir for it - I pushed a new commit to 478265 | 19:53 |
mordred | jeblair: and it seems to have been re-enqueued - except it seems to have re-enqueued v20 of the change, not v21 | 19:54 |
tobiash | mordred: during run of the old change? | 19:54 |
tobiash | mordred: I also observed that and this was on my list to figure out | 19:54 |
tobiash | mordred: but didn't yet have time to dig deeper | 19:55 |
jeblair | mordred, tobiash: noted | 19:55 |
mordred | tobiash: yes | 19:56 |
jeblair | mordred: i got this exception when trying to run the ssh keys role: http://paste.openstack.org/show/614218/ | 19:58 |
jeblair | mordred: is that because we're running modules under py3 on the executor (and even though we'd rather not, we haven't found a way to avoid it yet?) | 20:00 |
mordred | jeblair: hrm. that's a fascinating new error | 20:01 |
mordred | jeblair: were you running that one by hand? | 20:01 |
jeblair | mordred: nope | 20:01 |
jeblair | mordred: /tmp/eff4efb6e2b2483ab09b8bcaf843efd0/work/logs/job-output.txt | 20:02 |
jeblair | mordred: (i did reformat the error by hand) | 20:02 |
jeblair | s/hand/emacs/ | 20:02 |
mordred | ah - nod | 20:04 |
jeblair | mordred: i would have expected the 'ssh-add' that we do in log uploads to hit this too... | 20:05 |
jeblair | mordred: unless..... it *is* also broken, but was masked by running ansible under py27 until recently | 20:06 |
mordred | jeblair: yah. I betcha that's true | 20:06 |
mordred | cause we're not getting logs now | 20:06 |
jeblair | reading http://bugs.python.org/issue17404 | 20:06 |
tobiash | jeblair: seems like inheriting from a final job and overriding vars is not inhibited | 20:07 |
tobiash | jeblair: https://review.openstack.org/#/c/479382/ | 20:07 |
tobiash | http://logs.openstack.org/82/479382/2/check/gate-zuul-python35/d482d9b/testr_results.html.gz | 20:08 |
jeblair | tobiash: yay tests! :) | 20:08 |
tobiash | jeblair: but breaking tests :( | 20:08 |
mordred | jeblair: https://stackoverflow.com/questions/37462011/write-unbuffered-on-python-3 | 20:08 |
mordred | jeblair: that does not seem to be complete | 20:09 |
jeblair | mordred: i'm leaning toward treating it as a utf-8 encoded binary file... whatcha think? | 20:12 |
mordred | jeblair: I was leaning towards the same thing | 20:12 |
jeblair | mordred: i'll make a patch | 20:12 |
mordred | jeblair: I already made one ... | 20:13 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Write logfile as binary encoded utf-8 https://review.openstack.org/479413 | 20:13 |
mordred | jeblair: I believe that should do it, yeah? | 20:13 |
jeblair | mordred: almost | 20:14 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix exception handler in command module https://review.openstack.org/479414 | 20:16 |
jeblair | mordred: ^ that's the other thing from that error | 20:16 |
mordred | ah - good | 20:16 |
mordred | jeblair: should we also update the streamer code and tell it to open binary and send? | 20:16 |
jeblair | mordred: see my inline comment on 479413 | 20:16 |
mordred | it would save a decode/encode step | 20:16 |
mordred | jeblair: gah | 20:17 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Write logfile as binary encoded utf-8 https://review.openstack.org/479413 | 20:17 |
jeblair | mordred: i think we do string manipulation in the streamer? | 20:18 |
mordred | oh - we scan for '\n' | 20:18 |
mordred | wait -that's get_command | 20:19 |
mordred | no - we just read 4096 bytes at a time and send them over the wire | 20:19 |
mordred | I'll do that as a followup | 20:19 |
tobiash | jeblair: I think JobParser.fromYaml should honor the final attribute and raise some exception if there is a violation right? | 20:19 |
jeblair | mordred: in zuul_stream.py? | 20:20 |
mordred | jeblair: oh - no - in log_streamer | 20:20 |
jeblair | mordred: yep, that wfm | 20:22 |
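The encoding discussion above can be illustrated with a minimal sketch. It is not the code from 479413 or 479417; the file path, the sample line, and the `read_chunks` helper are assumptions made up for illustration. It shows why encoding with the ASCII codec fails on the characters quoted in the errors, how writing the log as explicitly encoded UTF-8 bytes avoids depending on a default codec, and how a streamer could forward 4096-byte chunks without a decode/encode round trip.

```python
import os
import tempfile

# A log line containing the two characters from the errors quoted above.
line = "tox output: \u011f and \u00e7"

# Encoding with the ASCII codec fails, as in the tracebacks from the chat.
ascii_error = None
try:
    line.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_error = str(exc)

# Writer side: open the log in binary mode and encode explicitly as UTF-8,
# so the result no longer depends on the interpreter's default codec.
log_path = os.path.join(tempfile.mkdtemp(), "console.log")  # hypothetical path
with open(log_path, "ab") as log:
    log.write(line.encode("utf-8") + b"\n")


def read_chunks(path, size=4096):
    """Yield raw byte chunks from path without decoding them."""
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(size)
            if not chunk:
                return
            yield chunk


# Reader side: forward the bytes as they are; decode only where text is needed.
payload = b"".join(read_chunks(log_path))
```

One caveat with fixed-size chunks: a multi-byte UTF-8 character can straddle a chunk boundary, so any consumer that decodes should do so on the reassembled stream (or with an incremental decoder) rather than chunk by chunk.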
jeblair | tobiash: probably some of the work we're doing in Job.applyVariant needs to be copied to Job.inheritFrom | 20:24 |
jeblair | tobiash: it looks like we only implemented final for variants, and need to add it to inheritance. | 20:24 |
jeblair | tobiash: (in model.py) | 20:24 |
tobiash | jeblair: I'll check if I can move that into a common function | 20:25 |
tobiash | jeblair: that should not be implemented twice if not necessary | 20:25 |
jeblair | tobiash: it's worth thinking about whether we need that though... | 20:25 |
tobiash | jeblair: but will have to do that next week | 20:26 |
tobiash | jeblair: because secrets are not moved to different projects anyway? | 20:26 |
jeblair | tobiash: i'm not sure we gain much by disallowing inheritance, since we can already make secrets not inheritable. all we really do is make it so that people have to copy a job to achieve the same thing. (i think) | 20:27 |
tobiash | jeblair: right, I thought that would have had security implications, but when thinking deeper it's not that clear anymore | 20:28 |
jeblair | tobiash: let's think about it some more over the weekend. we may still want to do it just for least surprise (after all, 'final' has a meaning in programming... :) | 20:29 |
tobiash | jeblair: agreed | 20:30 |
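A toy sketch of the idea discussed above, sharing one "final" check between variant application and inheritance. This is not Zuul's model.py: `ToyJob`, `FinalJobError`, and the method names are hypothetical, and whether inheritance should be blocked at all is exactly the question left open in the conversation.

```python
class FinalJobError(Exception):
    """Raised when something tries to modify or extend a final job (hypothetical)."""


class ToyJob:
    def __init__(self, name, final=False, variables=None):
        self.name = name
        self.final = final
        self.variables = dict(variables or {})

    @staticmethod
    def _reject_if_final(base):
        # The single shared check, used by both code paths below.
        if base.final:
            raise FinalJobError("job %s is final and may not be modified" % base.name)

    def apply_variant(self, variant):
        # A variant modifies this job, so this job must not be final.
        self._reject_if_final(self)
        self.variables.update(variant.variables)

    def inherit_from(self, parent):
        # A child extends its parent, so the parent must not be final.
        self._reject_if_final(parent)
        merged = dict(parent.variables)
        merged.update(self.variables)
        self.variables = merged


base = ToyJob("base", final=True, variables={"tox_env": "pep8"})
child = ToyJob("child", variables={"tox_env": "py35"})
try:
    child.inherit_from(base)
    blocked = False
except FinalJobError:
    blocked = True
```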
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix exception handler in command module https://review.openstack.org/479414 | 20:31 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Read the log file as binary in zuul_console https://review.openstack.org/479416 | 20:32 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Avoid decode/encode in the finger log stream server https://review.openstack.org/479417 | 20:32 |
mordred | jeblair: ^^ I think we can consider both of those- but I think we should think it through before we dive | 20:33 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Write logfile as binary encoded utf-8 https://review.openstack.org/479413 | 20:33 |
mordred | jeblair: ok - I'm going to pull those and restart | 20:33 |
mordred | jeblair: restarted | 20:37 |
jeblair | mordred: both lgtm. noted a fun fact on the first one. | 20:37 |
mordred | jeblair: hah | 20:38 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add TenantProjectConfig object https://review.openstack.org/479073 | 20:44 |
mordred | jeblair: ok - the tox-linters job ran successfully again -so I think we're at least back up and running normally | 20:49 |
jeblair | yay! | 20:49 |
jeblair | 2017-06-30 20:49:53,076 DEBUG zuul.AnsibleJob: [build: 4173cc29a9da42a2b69c033f781ab247] Ansible output: b' [WARNING]: Failure using method (v2_runner_on_ok) in callback plugin' | 20:52 |
jeblair | 2017-06-30 20:49:53,076 DEBUG zuul.AnsibleJob: [build: 4173cc29a9da42a2b69c033f781ab247] Ansible output: b'(<ansible.plugins.callback.zuul_stream.CallbackModule object at' | 20:52 |
jeblair | 2017-06-30 20:49:53,076 DEBUG zuul.AnsibleJob: [build: 4173cc29a9da42a2b69c033f781ab247] Ansible output: b'0x7f15bd67ee10>): not enough values to unpack (expected 2, got 1)' | 20:52 |
jeblair | i wish i had a line number :( | 20:53 |
*** hashar has quit IRC | 20:53 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 20:55 |
jeblair | mordred: any chance that error is line 204 of zuul_stream? | 20:55 |
mordred | lemme look | 20:55 |
jeblair | ts, ln = (x.strip() for x in line.split(' | ', 1)) | 20:55 |
mordred | jeblair: yup. I betcha it is | 20:55 |
jeblair | i'd love to see that log file | 20:56 |
mordred | me too | 20:56 |
mordred | jeblair: OH - you know what... | 20:57 |
mordred | jeblair: that's reading directly from stdout_lines on the task since it's localhost - which does not have the zuul_console streamer adding prefixes | 20:57 |
mordred | so that split is never going to be the right thing | 20:58 |
jeblair | mordred: makes sense | 20:58 |
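The failure mode discussed above can be reproduced in a few lines. The sketch below uses the split expression quoted in the chat and shows why a line without the ' | ' prefix raises the unpack error. The `split_log_line` helper is a hypothetical tolerant alternative for illustration only; the change actually proposed (479425) instead skips the split for localhost lines.

```python
def split_log_line(line):
    """Return (timestamp, text); timestamp is None when the line has no prefix."""
    parts = [x.strip() for x in line.split(" | ", 1)]
    if len(parts) == 2:
        return parts[0], parts[1]
    return None, parts[0]


# The original expression assumes every line carries a "timestamp | text" prefix.
# A bare localhost line splits into one element, so the unpack fails.
unpack_error = None
try:
    ts, ln = (x.strip() for x in "plain localhost output".split(" | ", 1))
except ValueError as exc:
    unpack_error = str(exc)

prefixed = split_log_line("2017-06-30 22:00:13.831965 | TASK [Collect tox logs.]")
bare = split_log_line("plain localhost output")
```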
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Don't try to split localhost log lines https://review.openstack.org/479425 | 20:59 |
mordred | jeblair: ^^ | 20:59 |
mordred | WOOT! I got back to having failures on zuul-tox-linters that make it to the log | 21:00 |
jeblair | logs are cool | 21:00 |
mordred | yah | 21:01 |
mordred | jeblair: I added this: http://logs.openstack.org/65/478265/22/check/41b90a8/ansible-hostvars.ubuntu-xenial.yaml | 21:01 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Permit config shadowing https://review.openstack.org/479084 | 21:03 |
jeblair | mordred: i like that, and moving other similar info out of the log might be nice | 21:04 |
mordred | jeblair: yah | 21:04 |
mordred | once I've got a clean run I'm going to start refactoring :) | 21:05 |
mordred | but yah - there's a bunch of stuff in the log that's very noisy | 21:05 |
jeblair | mordred: for the 'run post even if pre fails' -- should we run all posts, or only the ones that are siblings with pres that we have attempted to run? | 21:07 |
mordred | jeblair: I think the second | 21:08 |
mordred | jeblair: sort of like addCleanup | 21:08 |
jeblair | yeah. it's harder. :) | 21:09 |
mordred | :) | 21:09 |
mordred | jeblair: we could also run all of them and just make sure we run ALL of them even if one of them fails | 21:09 |
mordred | so that if one of the inner post playbooks borks because it doesn't have its input - shrug | 21:09 |
jeblair | (also, if a job adds 3 pre-playbooks, and 1 post playbook -- do we run that one post as long as we got as far as trying to run the first pre?) | 21:09 |
mordred | yah - maybe just running the post playbooks | 21:09 |
mordred | all of them - for now | 21:10 |
mordred | and see how that goes | 21:10 |
jeblair | yeah, let's give it a shot :) | 21:10 |
jeblair | the other is *doable*, just want to make sure we need the complexity it's going to add | 21:10 |
mordred | agree | 21:10 |
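The simpler option agreed on above (run every post playbook, and keep going even if one of them fails) might look roughly like this. It is a sketch under assumptions, not the executor code: `run_playbooks`, `fake_execute`, and the playbook names are hypothetical, and `execute` stands in for whatever actually runs a playbook and reports success.

```python
def run_playbooks(pre, run, post, execute):
    """Run pre, run, and post playbooks; execute(name) returns True on success."""
    results = {"pre": True, "run": None, "post": True}

    # Pre playbooks stop at the first failure.
    for name in pre:
        if not execute(name):
            results["pre"] = False
            break

    # The main playbooks only run if every pre playbook succeeded.
    if results["pre"]:
        results["run"] = all(execute(name) for name in run)

    # Post playbooks always run, and one failing does not stop the rest.
    for name in post:
        if not execute(name):
            results["post"] = False

    return results


calls = []


def fake_execute(name):
    # Record the call order; pretend that pre-2 and post-1 fail.
    calls.append(name)
    return name not in ("pre-2", "post-1")


outcome = run_playbooks(
    ["pre-1", "pre-2", "pre-3"], ["run"], ["post-1", "post-2"], fake_execute
)
```

The alternative raised in the chat, running only the posts that are siblings of pres that were attempted (in the spirit of addCleanup), would need per-job bookkeeping of which pre playbooks were reached, which is the extra complexity being deferred.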
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 21:13 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Don't try to split localhost log lines https://review.openstack.org/479425 | 21:13 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Always run post playbooks https://review.openstack.org/479427 | 21:15 |
jlk | wow that was too much lunch | 21:26 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix exception handling in scheduler https://review.openstack.org/479283 | 21:27 |
mordred | jlk: there's never too much lunch | 21:29 |
jlk | my gastrointestinal tract would disagree with you | 21:29 |
SpamapS | give it time | 21:30 |
SpamapS | it will come around | 21:30 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Carve out for stat https://review.openstack.org/479096 | 21:31 |
mordred | jeblair: when your always-run-post patch lands I'd like to do another restart | 21:33 |
jeblair | mordred: yeah, i bumped that to the top of the list for obvious raisins | 21:34 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 21:42 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Fix reenqueue wrong item on new patchset https://review.openstack.org/479432 | 21:44 |
tobiash | jeblair, mordred: this fixes the reenqueue of the wrong ps ^^ | 21:45 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Always run post playbooks https://review.openstack.org/479427 | 21:45 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 21:45 |
mordred | tobiash: oh neat! | 21:45 |
tobiash | so now really eod for me (almost midnight)... | 21:47 |
tobiash | cya | 21:47 |
mordred | that is, in fact, going to bite me again right now :) | 21:47 |
mordred | tobiash: have a good weekend! | 21:47 |
tobiash | mordred: you too | 21:47 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add test for running post playbooks after pre-playbooks fail https://review.openstack.org/479436 | 21:49 |
jeblair | SpamapS: ^ good call on the tests :) | 21:50 |
jeblair | mordred: you're going to want to land that ^ before the restart :) | 21:51 |
jeblair | (meanwhile, i'm reviewing tobiash's change) | 21:52 |
mordred | jeblair: yah | 21:53 |
mordred | SpamapS: feel like an easy +A? | 21:53 |
jeblair | mordred: tobiash's change lgtm. the test failure is a known race; i rechecked. | 21:57 |
jeblair | mordred: oh i think i forgot some git adds | 21:57 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 21:57 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add test for running post playbooks after pre-playbooks fail https://review.openstack.org/479436 | 21:58 |
mordred | jeblair: I'm having intermittent failures with log uploading that I don't have an answer for yet | 22:03 |
mordred | jeblair: the log exists properly in the right place - and the debug log shows the upload-logs playbook having been successful | 22:05 |
mordred | jeblair: so - there's a heisenbug somewhere we likely want to keep our eyes out for | 22:05 |
jeblair | mordred: anything get uploaded? | 22:06 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Read the log file as binary in zuul_console https://review.openstack.org/479416 | 22:06 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Avoid decode/encode in the finger log stream server https://review.openstack.org/479417 | 22:06 |
mordred | jeblair: yup! http://logs.openstack.org/65/478265/26/check/6a4179a/job-output.txt - just the report back to gerrit for that job was zuul-tox-linters finger://ze01.openstack.org/6a4179a7e8ef41439a61cceb2d565600 : POST_FAILURE in 2m 08s | 22:07 |
mordred | jeblair: (also - the job passed) | 22:07 |
SpamapS | mordred: sure | 22:08 |
mordred | SpamapS: ( https://review.openstack.org/#/c/479436/ ) | 22:09 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 22:10 |
mordred | ok. that ^^ is going to pass the test | 22:11 |
jeblair | mordred: it looks like this task failed: | 22:11 |
jeblair | 2017-06-30 22:00:13.831965 | TASK [Collect tox logs. mode=pull, dest=logs/tox, src={{ item.path }}/log/] | 22:11 |
jeblair | mordred: part of the failure message made it into the executor log, but none of it ended up in the job log | 22:12 |
mordred | oh. weird | 22:12 |
jeblair | mordred: grep "failed:" executor-debug.log | 22:13 |
jeblair | mordred: that's the bit that's in the executor log but not the job output | 22:13 |
jeblair | (and of course it's truncated) | 22:13 |
mordred | jeblair: aha | 22:14 |
jeblair | but that definitely seems like something that would be good to have in the job output | 22:14 |
jlk | oh you've got unnamed tasks, so it's trying to put the task arguments into the log name, but it won't expand the jinja | 22:14 |
jlk | yeah, don't use unnamed tasks | 22:14 |
mordred | ah - good catch | 22:15 |
mordred | well - we can do a better job of not completely failing there in the zuul_stream processing I hope | 22:15 |
jeblair | jlk: where's the unnamed task? | 22:15 |
jlk | TASK [Collect tox logs. mode=pull, dest=logs/tox, src={{ item.path }}/log/] | 22:16 |
mordred | yah - that's got a name yah? | 22:16 |
mordred | OH | 22:16 |
jlk | oh? are we trying to slurp the arguments up somewhere? | 22:16 |
mordred | ok. this is a with_items | 22:16 |
mordred | I think we're missing something in our callback | 22:16 |
jeblair | "Collect tox logs." is the name? | 22:16 |
jlk | oooh maybe it adds the item into the task name output? | 22:17 |
mordred | the actual message is coming in as an sub-item to the task and we're clearly not handling it right | 22:17 |
mordred | well - items fire a callback too | 22:17 |
mordred | I _thought_ they would fire the same one - but I'll now look in to that | 22:17 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix reenqueue wrong item on new patchset https://review.openstack.org/479432 | 22:18 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add test for running post playbooks after pre-playbooks fail https://review.openstack.org/479436 | 22:19 |
SpamapS | I seem to recall some good reasons not to reference {{ item }} in task names. | 22:20 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 22:20 |
mordred | we're missing v2_runner_item_on_ok v2_runner_item_on_failed and v2_runner_item_on_skipped | 22:22 |
mordred | I'll work up a patch | 22:22 |
mordred | SpamapS, jeblair, jlk: WOOT! https://review.openstack.org/#/c/478265/ passed (thanks for finding the issue on the tox dir thing) | 22:23 |
mordred | http://logs.openstack.org/65/478265/28/check/2102666/ has the job output, the ansible variables and the tox dir, as expected | 22:24 |
* jlk prepares for a long read. | 22:24 | |
jlk | mordred: what's the point of gather_facts true and gather_subset of !all? Isn't that saying to not gather anything? | 22:29 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Don't log starting to log messages to build log https://review.openstack.org/479439 | 22:29 |
mordred | jlk: !all means don't gather any of the extra facts | 22:29 |
mordred | jlk: so it won't try facter or ec2 or any of that stuff | 22:30 |
jlk | do you need any gathered facts? | 22:30 |
mordred | jlk: why that's "!all" is a mystery to me | 22:30 |
jeblair | maybe we only need that for the playbook where we print a bunch of facts? | 22:30 |
mordred | jlk: that's a good point - I should probably go through and mark as "no" on the ones that don't need them | 22:31 |
mordred | yah - it's useful there - I don't think we need it in most of the rest of the ones we're doing now | 22:31 |
jeblair | mordred: maybe put a comment in the one(?) playbook that needs it saying why it's there and not to copy pasta it :) | 22:31 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Don't gather facts when we don't need them https://review.openstack.org/479441 | 22:35 |
mordred | jlk: ^^ | 22:35 |
jeblair | mordred: can you squash those pls? | 22:36 |
jlk | reading these, kind of wish the "script" task was used. Easier to have the source of the script in its own file than inserted in-line in the YAML | 22:36 |
mordred | jlk: oh - also, on the long one - the intent is to start with content of the scripts we're running in the current infra jobs, but then to refactor that into real ansible | 22:36 |
jlk | nod | 22:36 |
jlk | O | 22:36 |
jlk | I'm trying to ignore the shell :) | 22:36 |
mordred | yah. it's the best bet | 22:37 |
mordred | it's a straight copy from the jenkins scripts | 22:37 |
* SpamapS really wants highlighting or pretty printing or something in those logs | 22:37 | |
mordred | jeblair: sure! | 22:37 |
mordred | SpamapS: yup. me too - definitely on the list | 22:37 |
jeblair | also, removing some things from them. :) | 22:37 |
mordred | yah | 22:37 |
SpamapS | 2017-06-30 22:22:46.611111 | ubuntu-xenial | changed: Results: => {"changed": true, "gid": 1001, "group": "zuul", "mode": "0775", "owner": "zuul", "path": "/tmp/21026668ac2a4e7c98cceb9859ea4786/work/logs/tox", "size": 4096, "state": "directory", "uid": 1001} | 22:37 |
jlk | left a -1, you're calling a post-tasks twice and once pointing at a pre task | 22:37 |
SpamapS | quite the jumble of "is this useful?" | 22:37 |
mordred | jlk: sweet - thanks for poking your eyes out looking at that | 22:38 |
mordred | but yes - getting those logs being pleasing is important | 22:38 |
jlk | it's a slow friday :) | 22:38 |
jeblair | SpamapS: i don't really want to see that, however, that particular bit is what ansible outputs so i think we have to keep it there. but yeah, highlighting can make that better. | 22:39 |
jeblair | things we can remove include boilerplate facts, etc, that can go into other files | 22:39 |
SpamapS | jeblair: I wonder if we could fold those in the HTML-izing of the logs. | 22:42 |
SpamapS | like just make the json a little [+] | 22:42 |
* SpamapS will noodle on it | 22:43 | |
SpamapS | just hard to really see what happened. | 22:43 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Port in tox jobs from openstack-zuul-jobs https://review.openstack.org/478265 | 22:43 |
SpamapS | but the part that shows the output of the running things is quite nice compared to usual ansible :) | 22:44 |
SpamapS | so.. progress. | 22:44 |
mordred | yup! | 22:44 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: pass result data back from ansible https://review.openstack.org/479442 | 22:45 |
jeblair | SpamapS, mordred: also, probably worth thinking about this after ARA is in the mix too | 22:46 |
mordred | yah | 22:46 |
mordred | and where/how os-loganalyze may fit into the picture | 22:46 |
mordred | tons of options/possibilities - but I'm just happy we're collecting the logs and publishing them!!! :) | 22:47 |
jeblair | SpamapS, mordred, jlk: ^ that's a start on work to pass data back from ansible to zuul; feedback on the approach while i work on some tests is welcome. | 22:47 |
jlk | Seems useful | 22:50 |
jeblair | (i think i'd like to get rid of success-url and failure-url, and just have that be automatic based on something returned from this. but we can have a transition period where we just use success-url to interpolate a value in result_data) | 22:50 |
jlk | I have to flee to commute home. | 22:50 |
jeblair | jlk: have a good commute and holiday! | 22:50 |
mordred | jeblair: I like it | 22:50 |
jlk | yah, cheers all! | 22:50 |
mordred | jlk: have a great weekend! | 22:51 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: DNM This has a yaml syntax error in a job playbook https://review.openstack.org/479446 | 22:52 |
mordred | jeblair: ok - the "print things on parse error" worked - but is totes missing newlines | 22:57 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add newlines to parse error output https://review.openstack.org/479448 | 22:59 |
jeblair | mordred: ++ | 23:01 |
mordred | jeblair: I feel like we've got a few patches queued up it would be good to restart with | 23:07 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add newlines to parse error output https://review.openstack.org/479448 | 23:18 |
jeblair | mordred: i'll restart now | 23:25 |
jeblair | mordred: done | 23:26 |
mordred | \o/ | 23:33 |
mordred | jeblair: I'm rechecking the syntax error | 23:33 |
jeblair | mordred: i've got a command which is exiting with rc=1. i'm pretty sure it should output a line to the log, but it isn't. | 23:36 |
jeblair | mordred: http://paste.openstack.org/show/614223/ | 23:36 |
jeblair | mordred: the first 3 lines are from a similar command which succeeds, and i see output there | 23:37 |
jeblair | mordred: but the last 2 are from a failure. | 23:37 |
jeblair | mordred: i'm about 95% sure i know what the error is, and that it should emit one line of output | 23:37 |
jeblair | mordred: oh, but i don't know if it's stdout or stderr, lemme check | 23:37 |
mordred | jeblair: also - is it using with_items? | 23:38 |
jeblair | mordred: no with_items -- it's here: http://git.openstack.org/cgit/openstack-infra/openstack-zuul-roles/tree/roles/add-build-sshkey/tasks/create-key-and-replace.yaml#n11 | 23:39 |
jeblair | mordred: the line i'm missing *is* going to stderr | 23:39 |
mordred | oh. are we maybe not combining stdout and stderr when we do the localhost override? | 23:40 |
jeblair | mordred: starting to look that way | 23:40 |
mordred | jeblair: hrm I would expect given the code that we'd be sending stderr to stdout in both cases | 23:41 |
mordred | jeblair: we don't special case that in the module | 23:41 |
jeblair | mordred: ok. i'll try running manually and see if i can find out where its going | 23:42 |
mordred | jeblair: kk. I'm eod'ing - I'll poke in in the morning and see what you've found | 23:48 |
jeblair | mordred: okay have a nice weekend! | 23:48 |
mordred | jeblair: you too! | 23:49 |