openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: static: enable using a single host with different user or port https://review.opendev.org/659209 | 00:01 |
openstackgerrit | Merged zuul/zuul master: Update quickstart nodepool node to python3 https://review.opendev.org/658486 | 02:52 |
openstackgerrit | Paul Belanger proposed zuul/zuul master: Add more test coverage on using python-path https://review.opendev.org/659812 | 03:31 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 07:00 |
ofosos | Is there any way I can tell the executor what SSH key to use? In theory Bitbucket has an API for uploading SSH keys and I would like to use that to upload the Zuul key to Bitbucket. | 07:47 |
tristanC | ofosos: for gerrit, there is a sshkey option that can be set per connection in the zuul.conf | 08:09 |
ofosos | tristanC: i'll have a look | 08:15 |
ofosos | Thanks | 08:15 |
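[editor's note] For reference, the per-connection `sshkey` option mentioned above goes in the connection section of zuul.conf; the server name and key path below are placeholders, not values from this conversation:

```ini
[connection gerrit]
driver=gerrit
server=review.example.com
user=zuul
sshkey=/var/lib/zuul/ssh/id_rsa
```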
openstackgerrit | Matthieu Huin proposed zuul/zuul master: web: add tenant and project scoped, JWT-protected actions https://review.opendev.org/576907 | 08:38 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Allow operator to generate auth tokens through the CLI https://review.opendev.org/636197 | 09:20 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Zuul CLI: allow access via REST https://review.opendev.org/636315 | 09:31 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 10:49 |
badboy | any ideas what's causing this: | 10:52 |
badboy | AttributeError: type object 'EllipticCurvePublicKey' has no attribute 'from_encoded_point' | 10:52 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 10:58 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 11:20 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 13:27 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Zuul CLI: allow access via REST https://review.opendev.org/636315 | 13:34 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Add Authorization Rules configuration https://review.opendev.org/639855 | 13:34 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Web: plug the authorization engine https://review.opendev.org/640884 | 13:35 |
fungi | badboy: are you seeing that with a gerrit connection? i want to say we've seen broken ecc implementation in some gerrit versions | 13:55 |
fungi | badboy: oh, looks like that could be a mismatch with the installed version of pyca/cryptography | 13:57 |
fungi | you may be running too old of a version? | 13:57 |
fungi | what version of cryptography does pip say is installed? | 13:58 |
fungi | also, sticking the full traceback on http://paste.openstack.org/ would help provide some context | 13:58 |
fungi | https://github.com/pyca/cryptography/blob/master/CHANGELOG.rst#25---2019-01-22 suggests you need at least 2.5 (from january of this year) for that method | 14:02 |
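[editor's note] A quick way to verify the installed cryptography version is new enough, sketched with only the standard library; the `cryptography` package itself may or may not be installed in a given environment:

```python
# A sketch: check whether pyca/cryptography >= 2.5 is installed, the release
# that added EllipticCurvePublicKey.from_encoded_point per its changelog.
from importlib.metadata import PackageNotFoundError, version

def version_at_least(version_string, required=(2, 5)):
    """Compare the major.minor prefix of a version string against a tuple."""
    prefix = tuple(int(part) for part in version_string.split(".")[:2])
    return prefix >= required

def has_from_encoded_point():
    """True if the installed cryptography should provide from_encoded_point."""
    try:
        return version_at_least(version("cryptography"))
    except PackageNotFoundError:
        return False  # package not installed at all
```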
pabelanger | dmsimard: have you seen this ARA failure before? Looks to be an encoding issue when generating html: https://logs.zuul.ansible.com/89/57789/8d9f8e0547417362c0241ab039e360035b778478/third-party-check-silent/ansible-test-network-integration-ios-python27/bc7e0b5/job-output.html#l8652 | 14:27 |
dmsimard | pabelanger: I have not, I thought all those encoding issues had been ironed out :D | 14:30 |
pabelanger | dmsimard: yah, this is the first time I've seen it happen, we've been using ARA for ansible-test for some time. Will dig more into it | 14:31 |
pabelanger | I know it does some odd things with directory names for testing | 14:31 |
dmsimard | pabelanger: oh, it might be https://github.com/ansible-community/ara/issues/48 then -- that's >1.0 though but it's possible 0.x is also impacted | 14:32 |
dmsimard | it was also for a filesystem path with non-ascii characters | 14:32 |
dmsimard | (who does that?) | 14:32 |
dmsimard | I ran the ansible integration test suite against 1.x but not 0.x -- I should be able to reproduce | 14:33 |
pabelanger | dmsimard: yup, that likely is it | 14:35 |
pabelanger | let me confirm we have that non-ascii chars disabled for ansible integration testing | 14:35 |
pabelanger | I also don't know why they do it | 14:35 |
dmsimard | #ansible-devel said it's because it bubbles up bugs like this one :p | 14:36 |
pabelanger | that is true | 14:36 |
smcginnis | Daily third party CI question... :) | 14:43 |
smcginnis | If I want to use the devstack job in my local zuul instance from https://opendev.org/openstack/devstack/src/branch/master/.zuul.yaml#L343 | 14:44 |
smcginnis | I've added it to my untrusted-projects, but trying to define a job locally that inherits from it results in "Job devstack not defined". | 14:44 |
smcginnis | Is there something else I need to do in order to be able to use that? | 14:45 |
pabelanger | smcginnis: in your tenant config for devstack, did you allow loading of jobs? | 14:45 |
smcginnis | pabelanger: Ah, just noticing... is that "include: - job"? | 14:46 |
pabelanger | yah | 14:46 |
smcginnis | OK, nope, I did not do that part. Trying now. | 14:46 |
smcginnis | pabelanger: That job has a long list of required-projects. Will I also need to add those in my zuul config as untrusted-projects? | 14:47 |
pabelanger | yes | 14:47 |
smcginnis | OK, thanks. That saves me some digging then. I'll get all of that added and try things out. Thanks! | 14:47 |
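[editor's note] A tenant-config sketch of what pabelanger describes; the tenant name, connection name, and extra project are hypothetical, and the `include` list is what restricts which configuration item types Zuul loads from the project:

```yaml
- tenant:
    name: local
    source:
      opendev:
        untrusted-projects:
          - openstack/devstack:
              include:
                - job
                - nodeset
          # ...plus one entry per project in the job's required-projects, e.g.:
          - openstack/tempest
```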
pabelanger | I started testing devstack on rdoproject zuul a while back, but don't think I got it working 100%. Let me look to see if I can find the code | 14:48 |
smcginnis | Oh, great. Or if there's some other way - I ultimately need to be get things running and run tempest against it for third party CI testing. | 14:50 |
pabelanger | smcginnis: looks like I just added the tenant config, which you already know. Seems I didn't import all the required-projects, and code seems to have been removed now from rdoproject | 14:52 |
smcginnis | pabelanger: OK, no worries. This is a great start, so I'll keep fiddling. Thanks for looking. | 14:52 |
pabelanger | smcginnis: the job _should_ load properly, that was some of the work that needed to be done. However, you also might run into issues with missing nodesets, which you'll need to also define locally | 14:53 |
smcginnis | Ah, I didn't think about that. | 14:53 |
clarkb | devstack has its own nodesets | 14:53 |
pabelanger | yah, zuul does a good job at saying what doesn't work :) | 14:53 |
smcginnis | I'll see if it makes sense to get all that matched up, or just define a local job and hope it doesn't diverge too much over time. | 14:54 |
fungi | as long as the nodeset is defined zuul will be happy. you don't actually need nodes matching those provided by nodepool if you're not actually going to run the jobs declaring they use them | 14:57 |
corvus | smcginnis: it's going to be some upfront investment to get the list of all the things you need to add, but i think it's going to be worth it. | 14:58 |
corvus | yeah, nodesets are something that may make sense to override locally | 14:58 |
fungi | someone else was working on putting together exactly the same list for the base devstack job... who was it? | 14:58 |
fungi | maybe they've already done that legwork now | 14:58 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 14:59 |
corvus | fungi: several people have done it, i have not seen it shared yet. | 15:09 |
smcginnis | OK, I'll keep working towards that then and hopefully get it all documented well. | 15:11 |
Shrews | corvus: i have no idea what's going on with the plugin tests, but https://review.opendev.org/663762 has seen so many random failures with it. Earlier failures were timeout related. The new one post-fungi's timeout fix is now: http://logs.openstack.org/62/663762/10/check/tox-py35/ec8a51c/job-output.txt.gz#_2019-06-13_15_08_38_574336 | 15:24 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 15:24 |
corvus | Shrews: that means an executor is still running a job | 15:25 |
Shrews | corvus: seems unrelated to my change. and the following change passed all tests | 15:27 |
corvus | Shrews: could it be related to http://logs.openstack.org/62/663762/10/check/tox-py35/ec8a51c/job-output.txt.gz#_2019-06-13_15_19_28_066472 ? | 15:28 |
corvus | Shrews: and also http://logs.openstack.org/62/663762/10/check/tox-py35/ec8a51c/job-output.txt.gz#_2019-06-13_15_08_58_190771 | 15:29 |
Shrews | subunit exception. neat | 15:29 |
Shrews | i wonder if there was a recent release of subunit | 15:30 |
corvus | Shrews: i think the test needs to be split. let's just cut it in half. | 15:30 |
corvus | Shrews: that will address both timeouts and subunit report lengths. | 15:31 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 15:31 |
corvus | fungi: ^ | 15:31 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 15:32 |
fungi | will take a look after the meetings i'm in. we could just revert for now | 15:33 |
Shrews | corvus: by "cut in half", you mean a new test_plugins() test with half of the entries in the plugin_tests array? | 15:34 |
corvus | Shrews: yep | 15:35 |
corvus | fungi: i think it'll be just as easy to split as it would be to revert | 15:36 |
Shrews | i'll toss a change up | 15:36 |
fungi | thanks Shrews! | 15:36 |
fungi | i hadn't yet looked closely enough at how the framework for that test was done to work out how much duplication/abstraction would be needed to split it | 15:37 |
corvus | clarkb, pabelanger, fungi, Shrews, tobiash: did we decide yesterday that we should release zuul now? how about nodepool? | 15:38 |
fungi | i think there was a suggestion to restart the opendev deployment on the current state. i expect we might as well lump nodepool in | 15:38 |
fungi | though i haven't looked to see what's landed in nodepool since the last tag | 15:38 |
corvus | enough for a release i think https://zuul-ci.org/docs/nodepool/releasenotes.html | 15:39 |
pabelanger | corvus: +1 for zuul release | 15:39 |
corvus | okay, so the plan is: restart all of opendev today, release both later today or tomorrow? | 15:39 |
pabelanger | wfm | 15:40 |
pabelanger | also, we've been using nodepool 3.6.1.dev16 without any issues, so +1 for tagging that too | 15:40 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Split plugin tests into two https://review.opendev.org/665161 | 15:41 |
fungi | yeah, the conclusion with the zuul memory leak we're seeing in opendev is that it took a couple weeks to manifest after scheduler restart the last couple times it bit us, so it'll probably be a while before it crops up again and we shouldn't delay the release waiting for that | 15:41 |
Shrews | corvus: i'm not sure which version of nodepool we are running in production and if we're in sync with master | 15:42 |
Shrews | lemme check | 15:42 |
Shrews | looks like status logs says we last restarted launchers on 58a2a2b68c58f9626170795dba10b75e96cd551 to pick up memory leak fix | 15:43 |
Shrews | err f58a2a2b68c58f9626170795dba10b75e96cd551 | 15:44 |
Shrews | that's the 3.6.0 tag | 15:44 |
corvus | i think that's old enough to warrant a restart | 15:44 |
Shrews | corvus: agreed. which means "no" on nodepool release just yet | 15:45 |
corvus | Shrews: we need to restart zuul too, so i was thinking we'd restart both today and release tomorrow; how's that sound? | 15:45 |
Shrews | corvus: fine by me. if there's a nodepool problem, it should be spotted rather quickly | 15:46 |
ofosos | I get 'something went wrong' from zuul.openstack.org | 15:56 |
corvus | ofosos: please see #openstack-infra for information related to that service | 15:57 |
corvus | pabelanger, tobiash: http://paste.openstack.org/show/752891/ | 15:57 |
clarkb | catching up after getting kids out the door for school and ya that plan sounds good to me | 15:57 |
pabelanger | corvus: oh, was that a retry? | 15:58 |
corvus | pabelanger: no idea | 15:58 |
clarkb | corvus: I feel like that should go under "achievement unlocked" re abuse against github by zuul | 15:59 |
clarkb | did that happen after the restart? If so the multiprocessing change may send too many requests at once to github? | 15:59 |
corvus | clarkb: yes | 15:59 |
pabelanger | I wonder if ad668d74-8df3-11e9-93ab-4ff1818b4f8e got 502 Server error, then we sleep(1) and tried again | 15:59 |
smcginnis | OK, I'm feeling dumb now. Where do I need to define things to get rid of 'Unable to freeze job graph: The nodeset "openstack-single-node-bionic" was not found.' | 16:00 |
corvus | pabelanger: did you log retries? | 16:00 |
clarkb | corvus: yes they shouldbe logged | 16:00 |
pabelanger | corvus: yah, you should see it as exception | 16:00 |
corvus | smcginnis: define a nodeset called "openstack-single-node-bionic" | 16:00 |
corvus | pabelanger: feel free to dig, i have too many windows open at the moment running the restart | 16:00 |
smcginnis | corvus: Where is that actually done. I thought I did, but I still get the error. | 16:00 |
pabelanger | https://review.opendev.org/664843/ might be why | 16:00 |
corvus | smcginnis: it can be in any repo in the tenant | 16:01 |
pabelanger | corvus: sure, looking now | 16:01 |
clarkb | pabelanger: I think it more likely the multiprocessing change is to blame | 16:01 |
clarkb | pabelanger: I would think a second delay between requests is plenty. But sending ~20 (or however many threads are in the pool) requests at once may make it unhappy | 16:01 |
tobiash | corvus: you hit the rate limit? | 16:01 |
pabelanger | clarkb: oh, maybe | 16:02 |
tobiash | Oh and the retry succeeded :) | 16:02 |
tobiash | corvus: re release, did the python interpreter work land? | 16:03 |
pabelanger | tobiash: where do you see the succeed? | 16:03 |
tobiash | pabelanger: maybe i misinterpreted the log | 16:03 |
pabelanger | tobiash: Oh, I think you are right | 16:04 |
corvus | tobiash: i believe so: Merge "executor: use node python path" | 16:04 |
pabelanger | tobiash: let me look at code again | 16:05 |
corvus | tobiash: i'm under the impression that both sides of that will be present in the nodepool and zuul releases, but if i'm wrong, let me know :) | 16:05 |
tobiash | Ah yes, so then ++ for release after burn in | 16:05 |
pabelanger | tobiash: actually, I don't think we retried. Looking at 664843 we'd have to add github3.exceptions.ForbiddenError too. But right now we don't trap generic github exceptions. cc clarkb | 16:06 |
tobiash | corvus: correct, just wanted to make sure that the zuul part is not missing :) | 16:06 |
corvus | ++ | 16:06 |
clarkb | pabelanger: ya I think that behavior is correct there | 16:06 |
clarkb | pabelanger: retrying would only make the abuse perception worse | 16:06 |
pabelanger | yah | 16:06 |
tobiash | Github timeout and retry is also in so ++ for burn in and release | 16:07 |
pabelanger | clarkb: I think you might be right, I don't see a previous failure on ad668d74-8df3-11e9-93ab-4ff1818b4f8e in zuul logs. So maybe your comment about the multiprocessing change is to blame. | 16:09 |
pabelanger | tobiash:^ | 16:09 |
clarkb | pabelanger: for retries and abuse detection I think we may want a backoff that is more sophisticated than a sleep(1) | 16:09 |
clarkb | like sleep with increasing backoff if we detect that case or something | 16:09 |
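[editor's note] A sketch of the kind of backoff clarkb suggests (this is not Zuul's actual retry code): exponential delays with a cap, optionally seeded from a server-provided Retry-After value:

```python
# Exponential backoff sketch: double the delay each attempt, up to a cap.
# If the server sent a Retry-After hint, start from that instead of `base`.
def backoff_delays(attempts, base=1.0, cap=60.0, retry_after=None):
    """Return the delay (seconds) to sleep before each retry attempt."""
    start = retry_after if retry_after is not None else base
    return [min(cap, start * (2 ** n)) for n in range(attempts)]
```

Production retry loops usually also add random jitter on top of this so many clients don't retry in lockstep.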
pabelanger | clarkb: yah, I'm kinda curious why ratelimit didn't help here | 16:10 |
tobiash | And I think there is still some potential to optimize away a few requests | 16:10 |
tobiash | We don't do rate limiting afaik | 16:11 |
tobiash | We only log it | 16:11 |
pabelanger | okay, I see a retry attempt: dc71183c-8df3-11e9-97c9-52bfbc81ffb5 | 16:11 |
pabelanger | looking at logs | 16:11 |
pabelanger | tobiash: clarkb: :( http://paste.openstack.org/show/752892/ | 16:15 |
pabelanger | that is a retry attempt | 16:15 |
pabelanger | from a 502 Server Error | 16:15 |
pabelanger | so, sleep(1) doesn't seem to be enough time | 16:15 |
pabelanger | time to read docs on why that is | 16:16 |
corvus | tobiash: re log annotations -- did you happen to see a way to get the tracebacks formatted like the other lines? (so they show up in a grep?) | 16:19 |
pabelanger | jlk: maybe you also have suggestion about 502 Server Error we get back from github api. We created, https://review.opendev.org/664843/ but now look to be tripping the abuse detection mechanism. | 16:21 |
fungi | pabelanger: what sort of api query is causing that? | 16:24 |
corvus | pabelanger, tobiash, jlk: here's the expanded log entries for that event: http://paste.openstack.org/show/752893/ | 16:24 |
corvus | fungi: our old friend "getPullBySha" -- the info that everyone (github internal devs included) really wants included in the event | 16:25 |
fungi | got it | 16:25 |
pabelanger | yah | 16:25 |
pabelanger | a quick google says we _should_ get Retry-After header back | 16:25 |
fungi | so it's a read operation | 16:25 |
pabelanger | but need to confirm that | 16:25 |
jlk | is it timing out or is the 502 immediate? | 16:26 |
jlk | My team was CCd on an issue that looks like there is a recent spike in somewhat immediate 502s | 16:26 |
corvus | jlk: i think it took a little over a minute to get the 502 back, if i'm reading the logs right; i'll double check that | 16:27 |
pabelanger | right, we now see it more because we've also bumped up the default_read_timeout to 300: https://review.opendev.org/664667/ | 16:27 |
corvus | yeah, in http://paste.openstack.org/show/752893/ "Handling status event" is right before the call, and "Failed handling" is right after | 16:27 |
corvus | pabelanger: right, if someone was using github3.py with the defaults, they would have hit the 10 second read timeout before getting the 502 | 16:28 |
pabelanger | talking to some ansibullbot folks, they say there is an undocumented ~20 POST per minute, before hitting abuse things. Maybe we are also hitting that now | 16:32 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 16:36 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: new devstack-based functional job https://review.opendev.org/665023 | 16:37 |
smcginnis | Is there a config option somewhere that controls allowed disk space? I see I'm getting aborts now from ExecutorDiskAccountant because the limit is set to 250mb. | 16:43 |
clarkb | smcginnis: https://zuul-ci.org/docs/zuul/admin/components.html#attr-executor.disk_limit_per_job | 16:45 |
smcginnis | Perfect, thanks clarkb | 16:46 |
clarkb | that should be plenty for devstack jobs last I checked | 16:46 |
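[editor's note] For anyone finding this later, the knob clarkb links lives in the `[executor]` section of zuul.conf; the value below is just an illustrative size in MiB, and -1 disables the check entirely:

```ini
[executor]
disk_limit_per_job=5120
```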
clarkb | pabelanger: POST would be to leave comments on PRs? | 16:47 |
smcginnis | It looks like it's happening while checking out the other repos. The one I have on screen right now was Checking out openstack/cinder, Checking out master, ExecutorDiskAccountant warning using 544MB (limit=250) | 16:47 |
clarkb | pabelanger: the searching should all be GETs right? | 16:47 |
pabelanger | clarkb: yah, that is right. So, I'm now looking into search api | 16:48 |
pabelanger | because i do see a bit of google hits around search and abuse | 16:48 |
clarkb | smcginnis: hrm I didn't think the git repos counted against that | 16:48 |
pabelanger | maybe we are missing something with rate-limit | 16:48 |
pabelanger | with new multiprocessing change | 16:48 |
corvus | smcginnis: that can happen if the executor mounts are misconfigured -- https://zuul-ci.org/docs/zuul/admin/components.html#attr-executor.git_dir and https://zuul-ci.org/docs/zuul/admin/components.html#attr-executor.job_dir need to be on the same filesystem | 16:49 |
smcginnis | clarkb: I was a little surprised to see them all checked out there. | 16:49 |
pabelanger | I have to relocate, but plan to keep looking. We do seem to be hitting abuse message on zuul.o.o often | 16:49 |
pabelanger | back shortly | 16:49 |
fungi | smcginnis: normally you would deploy it so the workspace and the git cache are on the same fs, and then git will just make hardlinks when cloning | 16:54 |
fungi | if they are not on the same fs, git will copy all the data | 16:54 |
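[editor's note] That layout might look like this in zuul.conf (paths are illustrative); keeping `git_dir` and `job_dir` on one filesystem lets git hardlink objects into the workspace instead of copying them:

```ini
[executor]
git_dir=/var/lib/zuul/executor-git
job_dir=/var/lib/zuul/builds
```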
smcginnis | fungi: I'm just running the containers from the doc/source/admin/examples docker-compose setup, so I would have thought it would all be on the same fs. | 16:55 |
corvus | smcginnis: can you share your docker-compose.yaml file and your zuul.conf? | 16:56 |
corvus | smcginnis: also "docker exec -it mount examples_executor_1" may be helpful | 16:58 |
smcginnis | "Error: No such container: mount" | 16:58 |
smcginnis | Getting the rest... | 16:59 |
corvus | er, other way around then | 16:59 |
corvus | "docker exec -it examples_executor_1 mount" | 16:59 |
smcginnis | Heh, yep. Sorry, didn't really look at the command when I ran it. That makes more sense. | 16:59 |
fungi | mattw4 ran into this exactly a week ago too: http://eavesdrop.openstack.org/irclogs/%23zuul/%23zuul.2019-06-06.log.html#t2019-06-06T20:22:08 | 16:59 |
smcginnis | corvus: Adding to https://etherpad.openstack.org/p/yvyRWS72JG | 17:00 |
fungi | though in that case it seems to have been caused by a spurious /var/lib/zuul bindmount | 17:01 |
fungi | so i guess same symptom, different underlying misconfiguration | 17:01 |
corvus | smcginnis: try running "df" in the container | 17:03 |
smcginnis | In the executor? | 17:03 |
corvus | yeah | 17:03 |
smcginnis | corvus: Added to the bottom. | 17:03 |
smcginnis | No I just get NODE_FAILURE | 17:06 |
corvus | this is really weird... | 17:08 |
smcginnis | I noticed that I had one DISK_FULL failure, but now the last couple attempts were NODE_FAILUREs. | 17:09 |
corvus | i'm focused on the disk issue | 17:09 |
corvus | it's going to be really easy to sweep that under the rug; it needs to be fixed | 17:10 |
corvus | when i run docker-compose locally, i'm seeing mounts in containers which i don't expect | 17:10 |
tobiash | corvus: re log annotations, I think I saw a change in nodepool that does something like this | 17:10 |
smcginnis | Just want to warn that its root cause may have gone away, since the configs I pasted appear to have gotten past the DISK_FULL error and are hitting NODE_FAILURE instead. | 17:10 |
corvus | smcginnis: but you disabled the disk limit? | 17:11 |
corvus | anyway, give me a minute, i'm trying to put together a demonstration of what i'm seeing that is weird | 17:12 |
smcginnis | corvus: I did now. I can remove disk_limit_per_job and restart to see if the DISK_FULL error comes back, but I just am not sure right now if that went away before or after that change. | 17:12 |
corvus | smcginnis, fungi, tobiash: this doesn't make sense to me: https://etherpad.openstack.org/p/yvyRWS72JG lines 225-237 | 17:15 |
corvus | that's in my executor container; there should be no /var/lib/zuul mount there | 17:15 |
tobiash | that's weird | 17:17 |
corvus | this seems to match the behavior that smcginnis is seeing too -- smcginnis, if you run "df /var/lib/zuul" does it also show you that the fs is mounted on /var/lib/zuul ? | 17:17 |
smcginnis | corvus: Is it from here: https://opendev.org/zuul/zuul/src/branch/master/Dockerfile#L43 | 17:18 |
tobiash | corvus: does the dockerfile specify /var/lib/zuul as volume? | 17:18 |
fungi | indeed, that's the presumed spurious /var/lib/zuul mount mattw4 had | 17:18 |
corvus | oh, it does... | 17:18 |
fungi | he said he thought he'd mounted it to give access to ssh keys | 17:18 |
fungi | but maybe not? | 17:18 |
smcginnis | corvus: /dev/vda1 40470732 6877524 33576824 18% /var/lib/zuul | 17:18 |
corvus | that would do it | 17:18 |
tobiash | because the scheduler container specifies it (line 61) and there is probably some automagic connection | 17:18 |
corvus | okay, given that, i think i understand the patch that's needed. i'll push it up and we can see if we agree | 17:19 |
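[editor's note] A compose-file sketch of the shape of the problem and fix (service and volume names are made up, and this is not necessarily what the actual patch did): since the Dockerfile declares `/var/lib/zuul` as a VOLUME, mounting one named volume over that whole path keeps the git cache and job workspaces on the same filesystem:

```yaml
services:
  executor:
    volumes:
      - zuul-var:/var/lib/zuul
volumes:
  zuul-var:
```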
smcginnis | So, should the reason for my current NODE_FAILURE show up in the executor log, or should I be looking somewhere else to figure out why that's happening? | 17:22 |
tobiash | corvus: re log annotations, this is the nodepool change I meant: https://review.opendev.org/613196 | 17:22 |
tobiash | but it is using a custom formatter | 17:23 |
corvus | smcginnis: probably nodepool launcher, or if not, possibly scheduler | 17:23 |
smcginnis | k, thanks. I'll look | 17:23 |
corvus | smcginnis: (the executor doesn't go into action until the scheduler hands it a node which it gets from the nodepool launcher) | 17:24 |
smcginnis | OK, that makes sense. So if the node has a failure along the way, it never gets sent over. | 17:25 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Correct quick-start executor volume error https://review.opendev.org/665186 | 17:25 |
corvus | smcginnis: yep. which is why the most likely place to find the error is the launcher, but if that's inconclusive, the scheduler should know why it declared it a node failure | 17:26 |
corvus | smcginnis, fungi, tobiash, mattw4: see https://review.opendev.org/665186 | 17:26 |
fungi | yup, saw you push and just finished reviewing | 17:27 |
fungi | thanks!!! | 17:27 |
corvus | (i still think we should change the executor default, but i think it's safer to make this change quicker) | 17:27 |
tobiash | ++ for changing the default | 17:27 |
corvus | smcginnis, mattw4: thanks for helping us find that; that was rather subtle, and i'm sorry you had to run into it for us to see it | 17:28 |
smcginnis | Glad some of this has been useful! | 17:29 |
mattw4 | me too! You all have helped a tremendous help too!! | 17:30 |
mattw4 | I am a native English speaker so I have no excuse for the grammar ^ :) | 17:31 |
smcginnis | :D | 17:31 |
fungi | i make excuses for my grammar all the time | 17:32 |
corvus | it's okay, that sentence made me feel very helpful :) | 17:35 |
fungi | extra helpful even | 17:37 |
smcginnis | Just FYI, the NODE_FAILURE I was hitting was found in the executor logs and it was due to not being able to deploy the openstack-single-node-bionic nodeset defined by the devstack job. So makes sense, just needed to figure out where to look for the error. Thanks again. | 17:38 |
mattw4 | Tremendously! :) | 17:39 |
mattw4 | smcginnis, I kinda faked it by defining a nodeset with that name and supplying my own node label in the definition. | 17:40 |
mattw4 | Scheduler complains that some nodes are undefined, but I don't need those nodes for my jobs. I'm not sure if that is a problem, but it doesn't seem to impact the tests that I'm running ATM | 17:41 |
smcginnis | mattw4: Good call, I think that was my mistake of not setting the label right. | 17:41 |
fungi | i think that highlights a rough patch in the job sharing model, not sure if anyone's yet thought through how to deal with reusing jobs that specify node labels which may not be relevant in the consumer's context | 17:42 |
smcginnis | Seems like you need to be able to separately share the jobs and their resource requirements with the nodes and what resources they can provide. | 17:43 |
fungi | smcginnis: oh, so i guess we're missing something to make /var/lib/zuul/builds a valid job_dir? | 17:43 |
smcginnis | But that's a drastic oversimplification. | 17:43 |
smcginnis | fungi: Yeah, looks like it. | 17:44 |
fungi | does it need to be created first? | 17:44 |
smcginnis | So it would appear. | 17:44 |
*** jamesmcarthur has quit IRC | 17:58 | |
tobiash | does anybody know the book Powerful Python by Aaron Maxwell? | 18:02 |
tobiash | I just stumbled across it and I'm wondering if anybody would recommend reading it | 18:03 |
clarkb | corvus: re the volume thing, do we think it would be better to have flexibility in the deployment and have docker-compose or similar do the specification rather than the image? | 18:04 |
clarkb | I guess the problem with that is then people have to know to add it to compose or whatever | 18:04 |
clarkb | so better off in the image | 18:04 |
smcginnis | clarkb: Umm, you just approved the patch that has an error. Maybe I should have left -1 instead of just commenting. Might want to remove approval on that one. | 18:07 |
clarkb | smcginnis: done | 18:07 |
clarkb | smcginnis: and ya that is what -1 is for :P | 18:07 |
smcginnis | :) | 18:08 |
clarkb | fungi's comment is probably on the money for why it isn't working | 18:08 |
smcginnis | It was a "I'm getting an error but could be convinced it's just me" 0. | 18:08 |
clarkb | Because that is a volume mount we can't mkdir it during the build so I think we have to add that to the init script thing | 18:09 |
*** ianychoi has quit IRC | 18:13 | |
corvus | or have the executor create it | 18:14 |
tobiash | I'd vote for the executor | 18:14 |
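The executor-creates-it option corvus and tobiash favor could be as small as ensuring the directory exists at startup. A minimal sketch (the function name and default path are illustrative, not Zuul's actual code):

```python
import os

def ensure_job_dir(job_dir="/var/lib/zuul/builds"):
    # Creating the directory at executor startup works even when the path
    # is a container volume mount, which a docker image build cannot
    # populate -- the mount shadows whatever the image put there.
    os.makedirs(job_dir, exist_ok=True)
    return job_dir
```

This is also idempotent, so it is safe to run on every restart.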
corvus | mattw4, smcginnis, fungi: defining a local nodeset that satisfies what upstream jobs like devstack needs is exactly what i would expect. and if you don't actually need to use it, you could define it with "nodes: []". | 18:15 |
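For reference, a local stand-in nodeset like the one corvus describes might look roughly like this (the label name is a placeholder for whatever your Nodepool provides; the `nodes: []` form is the empty variant he mentions for when the nodes aren't actually needed):

```yaml
# Local stand-in so upstream jobs referencing this nodeset can load.
- nodeset:
    name: openstack-single-node-bionic
    nodes:
      - name: primary
        label: my-local-bionic-label  # hypothetical local label

# Or, if the jobs using it never run here:
# - nodeset:
#     name: openstack-single-node-bionic
#     nodes: []
```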
corvus | i'll work on an update to the patch; i'll probably just go ahead and switch the default, since it's going to involve executor code changes | 18:20 |
*** bhavikdbavishi has quit IRC | 18:23 | |
*** michael-beaver has joined #zuul | 18:23 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Change default job_dir location https://review.opendev.org/665186 | 18:28 |
pabelanger | clarkb: so, looking at github3.py, we should be able to inspect the exception for response headers on 502, I'm trying to see if there is 'Retry-After', if so we can use that value for our sleep. | 18:28 |
clarkb | pabelanger: ++ | 18:29 |
pabelanger | clarkb: otherwise, maybe we need a better backoff process as you mentioned before | 19:29 |
corvus | tobiash, smcginnis, fungi, mattw4: okay, there's a slightly more substantial change ^ since that will need new images, etc, i'd suggest smcginnis and mattw4 just manually "mkdir /var/lib/zuul/builds" on the executor (since it's a volume, that will persist) and set the job_dir value in zuul.conf as in the previous patch. then after that change merges, you should be able to undo that. | 18:30 |
corvus | pabelanger: sounds good -- also, be thinking about whether we should hold the release for this (i'm inclined to -- this is the sort of thing we hope to catch by burning in on opendev). | 18:31 |
mattw4 | sounds good corvus, I will do that | 18:32 |
pabelanger | corvus: +1, I think we'll need to fix this before releasing | 18:32 |
Shrews | i guess our zuul timeouts are still not long enough? http://logs.openstack.org/61/665161/1/gate/tox-py36/c0ebbc7/job-output.txt.gz#_2019-06-13_18_15_58_351905 | 18:32 |
corvus | Shrews: wow, that was a job timeout | 18:33 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 18:39 |
*** jamesmcarthur has joined #zuul | 18:56 | |
*** hashar has joined #zuul | 19:02 | |
openstackgerrit | Paul Belanger proposed zuul/zuul master: Improve retry handling for github driver https://review.opendev.org/665220 | 19:21 |
clarkb | pabelanger: were you able to check if retry after is ever present? | 19:22 |
pabelanger | corvus: clarkb: tobiash: jlk: ^ is my first attempt to deal with 502 / 403 github errors. Based on things I am reading on the web, and some manual testing 'retry-after' was there | 19:22 |
pabelanger | clarkb: yah, let me get paste | 19:22 |
pabelanger | clarkb: but I am not sure if it is on 502 error | 19:22 |
clarkb | cool that explains the fallback | 19:23 |
clarkb | note that will cause a 5 minute backup if it never recovers from the 502 and there isn't shorter retry after values | 19:23 |
pabelanger | http://paste.openstack.org/show/752900/ | 19:23 |
clarkb | (I think we can probably test with this and see if that causes problems) | 19:23 |
*** jamesmcarthur has quit IRC | 19:24 | |
pabelanger | clarkb: yah, I didn't actually wait 60 seconds, so maybe we should add a little buffer? | 19:24 |
pabelanger | I just did testing using curl | 19:24 |
corvus | is that inside our parallelized workers or outside? | 19:24 |
clarkb | corvus: I believe it is inside | 19:25 |
clarkb | so once we get past that 5 minute zone we should catch up quick | 19:25 |
pabelanger | or maybe we don't retry 5 times? | 19:25 |
corvus | clarkb: but other queries will still be happening in parallel, so we're only waiting for the sequencing | 19:25 |
corvus | ? | 19:25 |
clarkb | corvus: correct | 19:26 |
corvus | cool, i think (based on what i know atm) that's the way to go. at least, until we discover more about github rate limiting :) | 19:26 |
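The retry strategy being discussed — honor a `Retry-After` header when the server sends one, fall back to a fixed sleep otherwise, and give up after a few attempts — could be sketched like this (names and the response shape are illustrative; this is not the actual patch under review):

```python
import time

def request_with_retry(send, max_attempts=5, default_sleep=60, sleep=time.sleep):
    """Retry a request on 502/403, honoring a Retry-After header when the
    server provides one and falling back to a fixed delay otherwise.

    ``send`` is a callable returning (status_code, headers, body);
    ``sleep`` is injectable so tests don't actually wait.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status not in (502, 403):
            return status, body
        if attempt == max_attempts:
            break  # out of attempts; surface the error to the caller
        # Prefer the server's hint; otherwise use the fallback delay.
        delay = int(headers.get("Retry-After", default_sleep))
        sleep(delay)
    return status, body
```

With five attempts and a 60-second fallback, this is where clarkb's "about 5 minutes if it never recovers" figure comes from: four sleeps of up to 60 seconds each.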
pabelanger | yah, I didn't find https://developer.github.com/v3/#rate-limiting too helpful, with examples | 19:28 |
corvus | pabelanger: i like the patch, but i left a suggestion about improving the debug info for us | 19:29 |
pabelanger | ack, give me a few mins to look | 19:30 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: new devstack-based functional job https://review.opendev.org/665023 | 19:31 |
*** rf0lc0 has quit IRC | 19:44 | |
*** rfolco has joined #zuul | 19:44 | |
openstackgerrit | Paul Belanger proposed zuul/zuul master: Improve retry handling for github driver https://review.opendev.org/665220 | 19:48 |
pabelanger | corvus: clarkb: ^updated | 19:48 |
hogepodge | clarkb: right now locistack is broken during a refactor to use stock images, chasing down the issue right now. should have numbers sometime next week. | 19:50 |
corvus | hogepodge: cool, i'm going to proceed with the devstack approach, and we can look at swapping it in later. | 19:52 |
corvus | should be fairly isolated | 19:52 |
hogepodge | that sounds best, will give me a chance to do a last bit of housekeeping and setting up a tempest job against it so I can create an opendev repository | 19:54 |
pabelanger | jlk: maybe you could confirm if 'Retry-After' would be present on a 502 Server Error response, I haven't been able to find much info on the web. If you have the ability | 19:57 |
fungi | as a first step we could start logging more details from the 502 responses | 19:59 |
corvus | pabelanger: not quite what i had in mind, may i push up a revision? | 20:00 |
pabelanger | corvus: please do so | 20:01 |
corvus | pabelanger: also, are you sure you want to retry forbidden errors? | 20:02 |
pabelanger | corvus: that was mostly based on the pastebin from today, so we could open it to more | 20:04 |
pabelanger | from my readings on the web, 403 did return 'retry-after' header | 20:04 |
pabelanger | but it was difficult to see what else did | 20:04 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Improve retry handling for github driver https://review.opendev.org/665220 | 20:05 |
corvus | pabelanger: this update should supply the information we need to answer that question ^ | 20:06 |
pabelanger | ah, much better | 20:06 |
corvus | oh 1 thing | 20:06 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Improve retry handling for github driver https://review.opendev.org/665220 | 20:07 |
corvus | missed type conversion | 20:07 |
pabelanger | corvus: thanks, I see what you were asking now. +1 | 20:11 |
corvus | does anyone know how this $REGION_NAME variable gets set? https://opendev.org/zuul/nodepool/src/branch/master/devstack/plugin.sh#L303 | 20:28 |
corvus | oh, that must come from devstack | 20:29 |
openstackgerrit | Merged zuul/zuul master: Split plugin tests into two https://review.opendev.org/665161 | 20:31 |
pabelanger | tobiash: clarkb: if you don't mind adding https://review.opendev.org/665220/ to your review pipeline, I think we should try to restart zuul.o.o with that to help avoid the 'abuse' errors we are now getting | 20:36 |
*** pcaruana has quit IRC | 20:36 | |
tobiash | Lgtm | 20:41 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Change default job_dir location https://review.opendev.org/665186 | 20:43 |
corvus | just a minor pep8 fix on that ^, otherwise it passed all the tests, so should be gtg | 20:43 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Web: plug the authorization engine https://review.opendev.org/640884 | 20:44 |
corvus | clarkb, fungi, Shrews: running devstack without the benefit of local git clones took 25 minutes 11.9 seconds (which ara rounds up to 13? cc:dmsimard): http://logs.openstack.org/23/665023/8/check/nodepool-functional-openstack/194fed6/ara-report/ | 20:47 |
*** panda has quit IRC | 20:49 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Zuul Web: add /api/user/authorizations endpoint https://review.opendev.org/641099 | 20:49 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: new devstack-based functional job https://review.opendev.org/665023 | 20:50 |
dmsimard | the duration in ara is calculated based on the time the task started and when it ended and then it's rounded in the webapp | 20:50 |
fungi | heh, i suppose you can make the argument that 11.9 is roughly 13 when rounded to the nearest odd number? ;) | 20:51 |
fungi | well, rounded up to the next odd number anyway | 20:51 |
dmsimard | there is some latency | 20:51 |
dmsimard | because task ends -> tells ara task ended -> ara marks end timestamp | 20:51 |
*** panda has joined #zuul | 20:51 | |
fungi | got it. so this is time it took for ara to become aware it was done | 20:51 |
dmsimard | yes | 20:52 |
fungi | it just gets a notification, not a timestamp passed to it | 20:52 |
*** hashar has quit IRC | 20:55 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Web: plug the authorization engine https://review.opendev.org/640884 | 20:57 |
dmsimard | fungi: right -- this is more or less also how the upstream profile_tasks callback plugin calculates the duration but there is less overhead since it's just printing to stdout | 20:57 |
dmsimard | https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/callback/profile_tasks.py | 20:57 |
corvus | dmsimard: ah ok, i assumed it was working from the same data that shows up here: http://logs.openstack.org/23/665023/8/check/nodepool-functional-openstack/194fed6/ara-report/result/6cad0ed8-1cee-47d3-b1a3-58426aef0e37/ | 21:20 |
corvus | (start/end/delta) | 21:20 |
dmsimard | corvus: the problem is that (unless mistaken), those fields are not always returned | 21:20 |
corvus | yeah, i guess those are "command module" specific fields? | 21:21 |
dmsimard | like, depending on which module was used | 21:21 |
dmsimard | yeah | 21:21 |
corvus | got it, til, thx :) | 21:22 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: new devstack-based functional job https://review.opendev.org/665023 | 21:24 |
fungi | also, i suppose they could "lie" under some circumstances, so having an external timer helps keep them honest even if it does only provide loose bounds on the runtime | 21:25 |
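The "external timer" fungi describes — timestamping around the task from the observer's side instead of trusting module-reported fields — amounts to something like this toy sketch (not ARA's actual implementation):

```python
import time

def timed(task):
    """Run a task and report wall-clock duration as observed from the
    outside, independent of anything the task itself reports."""
    start = time.monotonic()
    result = task()
    duration = time.monotonic() - start
    # The observed duration includes notification/callback latency, so it
    # can only overestimate; a self-reported delta could claim anything.
    return result, duration
```

That latency is exactly why ARA can show 13 seconds for a step a module says took 11.9: the end timestamp is taken when ARA hears the task finished, not when it finished.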
pabelanger | darn, we had py36 job timeout | 21:38 |
pabelanger | looks like it was limestone | 21:38 |
dmsimard | fungi: if Ansible would reliably return timestamps for every module/action/etc, we'd probably use it | 21:40 |
smcginnis | Maybe a little more relevant in -infra than #zuul, but any idea why the devstack job would have ansible_interfaces undefined errors? Didn't collect facts, but where? | 21:55 |
mattw4 | smcginnis, I know this one!! | 21:55 |
smcginnis | :) | 21:55 |
mattw4 | I created a new role in base:pre.yaml to collect all facts with the setup module | 21:55 |
smcginnis | Oh good, I'm not the only one hitting some weird thing. | 21:55 |
mattw4 | I couldn't figure out how to make Zuul collect all facts by default so I just added a small role with the setup module | 21:56 |
smcginnis | Hmm, I tried something similar adding "gather_facts: True" to my task in pre.yaml, but same error. | 21:56 |
smcginnis | mattw4: Do you have that up somewhere I could take a peek? | 21:57 |
mattw4 | I think it gathers facts by default, but the fact set is limited to the minimum (!all) | 21:58 |
mattw4 | smcginnis: it' | 21:58 |
mattw4 | smcginnis: it's in an internal repo ATM, but I can share the role, just a sec | 21:58 |
smcginnis | Thanks mattw4! | 21:58 |
mattw4 | smcginnis: I named it "gather-all-ansible-facts" and it's super-small: http://paste.openstack.org/show/752907/ | 22:00 |
mattw4 | smcginnis: that added the 'ansible_interfaces' list to the fact set | 22:01 |
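The role mattw4 describes — a pre-run task that re-runs the setup module with the full fact set, rather than Zuul's minimal subset — would look roughly like this (a sketch; the actual role is in the linked paste):

```yaml
# roles/gather-all-ansible-facts/tasks/main.yaml
- name: Gather all facts (including network facts like ansible_interfaces)
  setup:
    gather_subset: all
```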
smcginnis | Awesome, I'll give that a shot. Thanks! | 22:01 |
mattw4 | np :) | 22:01 |
mattw4 | I already posted this in #openstack-qa, but I think this may be the right audience: Does anyone know why devstack would fail to install an SSL certificate for Apache2, causing a failure when apache2.service is restarted after installing uwsgi? | 22:03 |
mattw4 | the job is a child of devstack-minimal with a few additional services enabled. | 22:04 |
corvus | mattw4: that's probably a better #openstack-qa question unless zuul itself is somehow involved (but it doesn't sound like it) | 22:05 |
mattw4 | corvus: ok. True, it's probably not Zuul. Thanks tho. | 22:06 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: new devstack-based functional job https://review.opendev.org/665023 | 22:17 |
*** jamesmcarthur has joined #zuul | 22:18 | |
pabelanger | corvus: clarkb: https://review.opendev.org/665220/ should land in the next 90mins, do we want to look to restart zuul again today or hold off until another time? I'll be able to assist either way | 22:19 |
clarkb | I'm in the middle of copying lots of data around so that we can do ssl cert updates on infra services | 22:22 |
clarkb | so I'll defer to others | 22:22 |
pabelanger | ack | 22:25 |
pabelanger | also, I should have used #openstack-infra for that | 22:25 |
*** tobiash has quit IRC | 22:40 | |
ofosos | corvus, SpamapS: any love for https://review.opendev.org/#/c/662134/54 ? All the basic tests are good now... I'd like some guidance on how to proceed | 23:14 |
ofosos | Linter will be fixed tomorrow | 23:18 |
clarkb | ofosos: you may want to send an email to the zuul-discuss list soliciting reviews? Sounds like its to a point where it is generally working and now its double checking (and potential refinement)? | 23:19 |
ofosos | clarkb: +1 | 23:20 |
*** jamesmcarthur has quit IRC | 23:24 | |
corvus | ofosos: looks like there's a pep8 error at http://logs.openstack.org/34/662134/54/check/tox-pep8/cc56a61/job-output.txt.gz#_2019-06-13_20_51_57_086167 but that's the only failing test | 23:25 |
corvus | ofosos: might want to go ahead and push up a fix for that; i can give the whole stack a closer look tomorrow. i'm looking forward to it! :) | 23:26 |
ofosos | Yup, it's a single line. I was already at the pub when that popped up. My IDE was happy with the code. I'll fix it tomorrow | 23:27 |
ofosos | Had to celebrate a birthday yesterday (already Friday in my tz). | 23:28 |
ofosos | corvus: very good :) | 23:29 |
openstackgerrit | Merged zuul/zuul master: Improve retry handling for github driver https://review.opendev.org/665220 | 23:31 |
ofosos | I'd still like to refactor some things. Sometimes it's unclear where you have to pass a project or a project name. That's the biggest problem I saw. | 23:31 |
ofosos | The testing process was really nice though. The fixture tests took 10 hours today to get right, but in the end I think it's for the better. | 23:32 |
ofosos | I also need to incorporate API paging, but that can be done on the client level. | 23:33 |
jlk | pabelanger: I don't know if a Retry-After is going to be present. It looks like our system can throw a 502 if a query has gone longer than 10 seconds, and there was a recent change that caused that to happen a lot more often. This change was reverted a few hours ago, so I'm curious if Zuul is still seeing a slew of 502s. | 23:34 |
mattw4 | Where can I set zuul_log_verbose: true to produce more verbose logs? | 23:51 |
pabelanger | jlk: great, thanks for the information, we've landed https://review.opendev.org/665220/ to help deal with it, and give additional info | 23:51 |
pabelanger | mattw4: you can set it in your playbook where you call the upload-logs role | 23:52 |
mattw4 | pabelanger: gotcha, Thanks! | 23:52 |
pabelanger | Hmm, for some reason, we have 3 entries in our third-party-check pipeline, all from the same PR: https://dashboard.zuul.ansible.com/t/ansible/status | 23:59 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!