*** jkilpatr has quit IRC | 01:06 | |
*** deep-book-gk has joined #zuul | 01:20 | |
*** deep-book-gk has left #zuul | 01:21 | |
*** xinliang has quit IRC | 01:34 | |
*** openstackgerrit has joined #zuul | 05:04 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool feature/zuulv3: Implement a static driver for Nodepool https://review.openstack.org/468624 | 05:04 |
---|---|---|
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool feature/zuulv3: Implement an OpenContainer driver https://review.openstack.org/468753 | 05:04 |
*** GK1wmSU has joined #zuul | 05:43 | |
*** GK1wmSU has left #zuul | 05:45 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool feature/zuulv3: Implement a static driver for Nodepool https://review.openstack.org/468624 | 05:46 |
tobiash | tristanC: I have a comment and a question on https://review.openstack.org/#/c/488384 | 05:52 |
*** _GK1wmSU has joined #zuul | 05:56 | |
*** _GK1wmSU has left #zuul | 05:57 | |
tristanC | tobiash: thanks for the review | 06:07 |
tobiash | tristanC: is the loadConfig executed on every 10s tick? | 06:09 |
tobiash | tristanC: I think especially for openstack with many clouds they probably want to have a single config object | 06:10 |
tobiash | tristanC: maybe there is a way to make this more generic (probably in a follow up)? | 06:10 |
tristanC | tobiash: it seems like it yes, the Nodepool run loop does it in the while True | 06:10 |
tobiash | maybe have a list of drivers (could also simplify get_provider_config) and add a pre-config class method to the driver interface | 06:11 |
tristanC | I've been testing multinode jobs, and I was wondering if there is a way to limit the nodes requested by a job? Would it make sense to add a max-node setting per tenant, or to restrict a tenant to a specific nodepool pool? | 06:11 |
tobiash | tristanC: I'm not sure if this should be done on nodepool side, I think this also might make sense to solve this within zuul (e.g. tenant config) | 06:12 |
tristanC | tobiash: the Config interface could have a reset class method to be called before the reconfigure | 06:13 |
tobiash | tristanC: sounds good | 06:13 |
tobiash | tristanC: I've discussed a similar topic (limit labels to tenants) with jeblair and for this we agreed this should be handled in zuul so I think it might make sense to handle the max-nodes per node request also in zuul (even per tenant) | 06:15 |
tristanC | tobiash: that would work, but how about a max number of in-use nodes (not per request)? | 06:16 |
tobiash | tristanC: can you explain this more? | 06:17 |
tristanC | tobiash: e.g. a tenant would not consume more than x nodes in parallel | 06:17 |
tobiash | tristanC: you mention tenant -> zuul | 06:18 |
tristanC | tobiash: same per job, a tenant job would not be able to request more than y nodes | 06:18 |
tobiash | tristanC: that would need to add such accounting into zuul | 06:18 |
tristanC | yes, zuul needs to keep track, or somehow query nodepool for tenant usage | 06:19 |
tobiash | tristanC: nodepool has no tenant knowledge and I learned that it shouldn't get this (as it is also used for non-zuul use cases outside) | 06:20 |
tobiash | tristanC: so zuul needs to keep track (which should not be that hard as it creates/deletes the node-requests) | 06:20 |
tristanC | well my primary worry is that a new patchset could add a job with an unlimited nodeset | 06:22 |
tristanC | on another topic, I wonder if it would make sense to allow nodeset to be created across pools, for example to mix different driver labels | 06:23 |
tobiash | tristanC: that is probably a question for jeblair and Shrews | 06:24 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: WIP: Add jobs dashboard https://review.openstack.org/466561 | 06:43 |
*** amoralej|off is now known as amoralej | 06:58 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: Add configure-logserver role https://review.openstack.org/489113 | 07:26 |
*** xinliang has joined #zuul | 08:15 | |
*** rcarrill1 is now known as rcarrillocruz | 08:17 | |
*** hashar has joined #zuul | 08:34 | |
*** smyers has quit IRC | 09:02 | |
*** smyers has joined #zuul | 09:14 | |
tobiash | tristanC: was your zuul.d config support just for project configuration or also for zuul.conf? | 09:29 |
tristanC | tobiash: just for the in-repo configuration | 09:30 |
tobiash | tristanC: ah, ok | 09:31 |
*** bhavik1 has joined #zuul | 10:53 | |
*** jkilpatr has joined #zuul | 10:59 | |
*** bhavik1 has quit IRC | 11:00 | |
*** xinliang has quit IRC | 11:10 | |
*** xinliang has joined #zuul | 11:23 | |
*** xinliang has quit IRC | 11:23 | |
*** xinliang has joined #zuul | 11:23 | |
*** hashar is now known as hasharLunchAmper | 12:03 | |
*** hasharLunchAmper is now known as hashar | 12:16 | |
*** dkranz_ has joined #zuul | 12:21 | |
*** amoralej is now known as amoralej|lunch | 13:00 | |
*** xinliang has quit IRC | 13:05 | |
*** amoralej|lunch is now known as amoralej | 13:39 | |
*** openstackgerrit has quit IRC | 14:33 | |
*** openstackgerrit has joined #zuul | 14:33 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Implement autohold https://review.openstack.org/486692 | 14:33 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Implement autohold https://review.openstack.org/486692 | 14:43 |
clarkb | tristanC: I want to say max node settings came up at the last ptg. With one idea being capping it at your smallest cloud region (so that any job could be scheduled in any region) | 14:52 |
clarkb | but then further allow it to be restricted | 14:52 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul-jobs master: Simplify run tox task https://review.openstack.org/487551 | 15:19 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Remove duplicated states from zk.py https://review.openstack.org/489256 | 15:23 |
jeblair | tobiash, tristanC, clarkb: yes, we definitely need a max-nodes-per-job setting, ideally before our ptg cutover, just so that someone doesn't see how many nodes they can request at once. :) that should be straightforward. | 15:33 |
jeblair | tobiash, tristanC, clarkb: i don't think we've discussed a concurrent node limit per tenant, but that sounds like an additional good idea (though not as urgent or simple to implement) | 15:33 |
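
The two limits discussed above (a cap on nodes per job, and a cap on concurrently used nodes per tenant) could be sketched roughly as follows. This is only an illustration: `check_nodeset`, `TenantNodeAccounting`, and the `max_nodes_per_job` / `max_in_use` parameters are hypothetical names, not Zuul's actual configuration or API.

```python
# Hypothetical sketch; none of these names come from Zuul itself.
class NodeLimitError(Exception):
    """Raised when a job asks for more nodes than the tenant allows."""


def check_nodeset(job_name, nodeset, max_nodes_per_job):
    """Return the number of requested nodes, or raise if over the limit."""
    requested = len(nodeset)
    if requested > max_nodes_per_job:
        raise NodeLimitError(
            "job %s requests %d nodes, limit is %d"
            % (job_name, requested, max_nodes_per_job))
    return requested


class TenantNodeAccounting:
    """Track nodes in use per tenant on the scheduler side.

    Assumes the scheduler calls acquire() when it creates a node request
    and release() when the nodes are returned.
    """

    def __init__(self, max_in_use):
        self.max_in_use = max_in_use
        self.in_use = {}

    def acquire(self, tenant, count):
        """Reserve count nodes for tenant; return False if over the cap."""
        current = self.in_use.get(tenant, 0)
        if current + count > self.max_in_use:
            return False
        self.in_use[tenant] = current + count
        return True

    def release(self, tenant, count):
        """Give back count nodes previously reserved for tenant."""
        self.in_use[tenant] = max(0, self.in_use.get(tenant, 0) - count)
```

Keeping the accounting in the scheduler matches the point made earlier in the channel that nodepool has no tenant knowledge.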
jeblair | tobiash: regarding nonexistent jobs -- can we perform a check in ProjectTemplateParser._parseJobList()? | 15:39 |
tobiash | jeblair: tried it there, but that doesn't work because there we don't have a complete job list | 15:50 |
tobiash | jeblair: but for eager validation I found a place in addProjectConfig and addProjectTemplate | 15:50 |
tobiash | jeblair: https://review.openstack.org/#/c/488758/1/zuul/model.py | 15:50 |
tobiash | jeblair: that works | 15:51 |
tobiash | jeblair: but I think both eager and lazy validation have their own pros and cons | 15:51 |
tobiash | +eager: consistent config enforced | 15:52 |
tobiash | +lazy: lower runtime cost on every config evaluation | 15:53 |
tobiash | +lazy: no globally broken config (push through or removing a repo can lead to a globally broken config with eager validation) | 15:53 |
jeblair | tobiash: why don't we have a complete job list? we should parse all jobs before we parse all projects | 15:58 |
tobiash | jeblair: during my tests it had just the jobs of the same repo (or possibly the job it parsed so far from repos scanned before) | 16:03 |
jeblair | tobiash: that sounds like a bug -- the order is pipelines, nodesets, secrets, jobs, semaphores (huh that sounds backwards), templates, projects | 16:06 |
jeblair | specifically so that we can do this kind of validation | 16:07 |
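
Because jobs are parsed before templates and projects, an eager check can compare every job a project references against the full set of defined jobs. A minimal sketch of that idea, using plain dicts and lists as simplified stand-ins for Zuul's model classes (the function name and data layout are assumptions for illustration):

```python
# Illustrative only: the data structures are simplified stand-ins,
# not the objects Zuul's config loader actually uses.
def find_unknown_jobs(defined_jobs, project_pipelines):
    """Return (project, pipeline, job) tuples for undefined job references.

    defined_jobs: iterable of job names parsed from all repos.
    project_pipelines: {project: {pipeline: [job names]}}.
    """
    known = set(defined_jobs)
    missing = []
    for project, pipelines in project_pipelines.items():
        for pipeline, jobs in pipelines.items():
            for job in jobs:
                if job not in known:
                    missing.append((project, pipeline, job))
    return missing
```

An eager validator would raise a configuration error when the returned list is non-empty; a lazy one would defer the lookup until the job is actually needed.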
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix GithubConnection logging name https://review.openstack.org/488537 | 16:09 |
tobiash | jeblair: ah, t | 16:14 |
tobiash | jeblair: possibly config,untrusted split | 16:15 |
jeblair | tobiash: hrm, we should perform a complete config load/validation even on changes to config repos (though we don't run with that config -- incidentally, that's another point in favor of config-time checking) | 16:17 |
tobiash | jeblair: addProjectConfig is called in a later stage where this works fine | 16:18 |
jeblair | tobiash: yes, but the confusing thing is that it's called right after the place where i'm suggesting we add the check | 16:19 |
jeblair | tobiash: can you push up the version of your change which failed there so i can take a look at it? | 16:19 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Docs: add a glossary https://review.openstack.org/489270 | 16:20 |
tobiash | jeblair: don't have it anymore, but you can at least take the tests from the change linked above | 16:21 |
jeblair | tobiash: is the WIP and non-WIP test the same? | 16:22 |
* tobiash is checking | 16:22 | |
tobiash | jeblair: the test from wip tests more stuff | 16:23 |
jeblair | ok i'll use that | 16:24 |
tobiash | jeblair: I think I was totally blind... :/ | 16:37 |
tobiash | jeblair: now it looks totally fine in ProjectTemplateParser._parseJobList() | 16:40 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: WIP Don't ignore inexistent jobs in config https://review.openstack.org/488758 | 16:42 |
tobiash | jeblair: hm, does work with the new test cases, but not with dynamic job addition :( | 16:43 |
tobiash | jeblair: ^ | 16:43 |
jeblair | tobiash: i think there are "errors" in some old tests | 16:44 |
jeblair | tobiash: they accidentally remove a job definition which isn't actually used in the test (since, of course, the tests are designed to fail reconfiguration) | 16:44 |
jeblair | tobiash: so they hit the non-existent job error earlier than the error they are designed to test | 16:44 |
tobiash | jeblair: test_dynamic_config fails now (which is not supposed to use non-existent jobs) | 16:45 |
* tobiash checking | 16:45 | |
jeblair | tobiash: do you want me to fix up the other tests, or do you want to? | 16:45 |
tobiash | jeblair: I'll do (still hoping the tests are broken) | 16:46 |
jeblair | tobiash: org/project defines "project-test1" which is used by org/project1 | 16:46 |
jeblair | tobiash: so any tests which replace org/project/.zuul.yaml need to keep the "project-test1" job definition, otherwise they break org/project1's config | 16:47 |
jeblair | tobiash: also left a suggestion on 488758 for how to actually raise the error | 16:48 |
tobiash | jeblair: hrm, thought I have fixed an error like this last week | 16:48 |
tobiash | jeblair: ah, ok, that was the other glitch | 16:48 |
* jlk dusts off his IRC client | 16:49 | |
jlk | o/ | 16:49 |
*** hashar has quit IRC | 16:50 | |
jeblair | jlk: welcome back! | 16:50 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: WIP Don't ignore inexistent jobs in config https://review.openstack.org/488758 | 16:57 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Docs: add a glossary https://review.openstack.org/489270 | 16:57 |
jlk | So y'all got v3 done while I was gone, right? It's all done now? We're on to v4? | 16:57 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: WIP Don't ignore inexistent jobs in config https://review.openstack.org/488758 | 16:58 |
jeblair | jlk: *much* has happened, though we missed you greatly! we're working down the punchlist of things we need to have in place before we cutover openstack (which we're hoping to do at the openstack PTG in denver) | 16:59 |
pabelanger | indeed! | 17:00 |
jeblair | jlk: mordred and pabelanger are focusing on creating job content (and generating items for our punchlist :) | 17:00 |
jlk | oh good, I'd love to see that punch list. Also, sounds like maybe some justification for me going to PTG | 17:00 |
jeblair | jlk: and docs are moving along | 17:00 |
jeblair | jlk: cool, most of it's in storyboard, a few things are in my local buffer because i still need to type them into storyboard; i'll try to refresh that today. | 17:01 |
jeblair | jlk: tobiash is continuing to find and fix bugs about 3 hours before we would have run into them for openstack, which is awesome :) | 17:02 |
jlk | neat! | 17:02 |
jeblair | jlk: our plan at the moment is to perform some trial cutovers saturday/sunday evenings before the ptg, then make the switch monday morning | 17:02 |
pabelanger | Ya, that reminds me. I need to check flights again for Saturday travel | 17:03 |
*** yolanda has quit IRC | 17:03 | |
jeblair | jlk: we'll have one or two cross-project sessions where we talk about v3, and then sort of have open office hours throughout the week to help people with job creation | 17:04 |
jeblair | jlk: and, of course, fix issues as they come up | 17:04 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: docs: reformat job section with zuul domain https://review.openstack.org/488955 | 17:04 |
jeblair | jlk: so yes, your presence would be helpful if you are able | 17:05 |
jeblair | jlk: also, we have openstack's zuulv3 talking to a test repository on github | 17:05 |
jlk | woot | 17:05 |
jeblair | which i would show you, but github is just giving me pink unicorns | 17:06 |
jeblair | ah, here it is: https://github.com/gtest-org/ansible | 17:06 |
pabelanger | Ya, getting 502 Server Error back from them ATM | 17:06 |
jeblair | right now may not be the time to do github work. nice to have other options. :) | 17:07 |
jlk | yeah github has been shaky this morning | 17:07 |
pabelanger | mordred: https://review.openstack.org/#/q/topic:zuulv3-tox-jobs is ready for comments. Removes last bits of shell script from tox role. I still need to add 'zero jobs' run check into a playbook | 17:09 |
tobiash | jeblair: :) | 17:09 |
tobiash | jeblair: patch is almost running, there was slightly more work to get the error message right | 17:10 |
Shrews | jlk: also, log streaming! | 17:10 |
jlk | wooo! | 17:11 |
Shrews | zuulv3.o.o has proper links for that now | 17:11 |
jeblair | Shrews: oh yes! lemme recheck a change | 17:11 |
jeblair | jlk: http://zuulv3.openstack.org/ | 17:12 |
jeblair | jlk: click on one of the jobs running there, and you'll get a streaming console log | 17:13 |
jeblair | eg http://zuulv3.openstack.org/static/stream.html?uuid=2b15d099cfd14d0bb59fb5e06cbc3af5&logfile=console.log | 17:13 |
jlk | that's pretty badass! | 17:13 |
jlk | heh, and then you realize how fun it is to have single ansible tasks that take many many minutes to complete | 17:14 |
jlk | all your output all at once! | 17:14 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Implement autohold https://review.openstack.org/486692 | 17:18 |
Shrews | ^^^ added link to SB | 17:18 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Don't ignore inexistent jobs in config https://review.openstack.org/488758 | 17:31 |
tobiash | jeblair: ^ now without wip and locally working pep8 and py35 \o/ | 17:33 |
tobiash | doesn't zuulv3.o.o vote anymore? | 17:42 |
tobiash | it ran jobs on ^ which were all green, but there is only the jenkins vote | 17:43 |
*** amoralej is now known as amoralej|off | 17:44 | |
Shrews | hrm yeah. something seems off | 17:48 |
pabelanger | https://zuulv3.openstack.org/ online now | 17:54 |
*** yolanda has joined #zuul | 17:55 | |
*** yolanda has quit IRC | 18:00 | |
jeblair | pabelanger, Shrews, tobiash: it looks like we're attempting to report gerrit changes to github, that fails, and so items are being removed from queues. | 18:04 |
jeblair | pabelanger: want to work on updating our pipeline definitions to correct that? | 18:04 |
jlk | erp! | 18:04 |
pabelanger | jeblair: Sure, I noticed that on friday | 18:04 |
pabelanger | jeblair: that means we need a dedicated pipeline for github-check? | 18:05 |
jeblair | pabelanger, jlk: actually -- that's not supposed to be a config setting, is it. that's supposed to be handled automatically by the driver? | 18:06 |
jlk | thinking | 18:07 |
tobiash | jeblair: for gerrit it was fixed here: https://review.openstack.org/#/c/461981/7/zuul/driver/gerrit/gerritreporter.py | 18:07 |
tobiash | jeblair: we possibly need that for github too | 18:07 |
tobiash | jeblair: that way we can share one pipeline for multi-gerrit-multi-github | 18:08 |
jeblair | tobiash: that sounds likely | 18:08 |
jlk | b09f421fd023412500f0eb50adbfa6ea80060268 | 18:08 |
jlk | was the commit from jamielennox | 18:09 |
jlk | let me see what github does | 18:09 |
jeblair | i'm guessing we only test in one direction, not the other :) | 18:09 |
pabelanger | jlk: jeblair: ya, I cannot think of a reason for a gerrit patch to report to github. But it would report to mysql regardless of the trigger. | 18:09 |
jlk | perhaps :( | 18:09 |
jlk | yeah it's missing for github | 18:10 |
jeblair | pabelanger: yeah, i think that's part of the reason this should be handled in the driver; the driver does know whether it can report an item or not. | 18:10 |
jlk | I can bang out a change here. | 18:10 |
jeblair | pabelanger: and mysql can report * | 18:10 |
jeblair | jlk: cool, thanks! | 18:10 |
jlk | is there an open issue on this? | 18:10 |
jeblair | jlk: not afaik -- i think we just diagnosed it | 18:11 |
pabelanger | jeblair: Ya, in driver seems to make sense | 18:11 |
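
Handling this in the driver means each reporter decides for itself whether an item belongs to its connection and silently skips it otherwise. A rough sketch of that check, with `Change` and `Reporter` as hypothetical stand-ins rather than Zuul's real classes:

```python
# Hypothetical stand-ins for Zuul's change and reporter objects.
class Change:
    def __init__(self, number, connection_name):
        self.number = number
        # Name of the connection (e.g. "gerrit" or "github") the change
        # came from.
        self.connection_name = connection_name


class Reporter:
    def __init__(self, connection_name):
        self.connection_name = connection_name
        self.reported = []

    def report(self, change):
        """Report the change, unless it came from another connection."""
        if change.connection_name != self.connection_name:
            # Not ours: skip quietly so a shared pipeline keeps working.
            return False
        self.reported.append(change.number)
        return True
```

With a check like this, one pipeline can list both a gerrit and a github reporter, and each change is reported only to the system it came from.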
*** harlowja has joined #zuul | 18:11 | |
jlk | the code is done, just looking for the right place to test this. b09f421fd023412500f0eb50adbfa6ea80060268 is oddly about multiple gerrits, not github+gerrit | 18:14 |
clarkb | jlk: you might also want to look over spamaps change to make the shared github secret mandatory for zuul | 18:15 |
jeblair | clarkb: pending or merged? | 18:16 |
jeblair | jlk: wow, i was *sure* we had some kind of a test with github and gerrit in the same pipeline, but i can not find one | 18:17 |
jlk | I'm almost positive we do | 18:17 |
jlk | one sec | 18:17 |
jlk | jeblair: tests/unit/test_multi_driver.py | 18:18 |
jeblair | jlk: thank you! | 18:18 |
jlk | I'll just add some more tests here regarding reporting. | 18:18 |
jeblair | jlk: in fact, that test does not assert anything about reporting | 18:19 |
jlk | yeah :/ | 18:20 |
jeblair | jlk: ah, but it also doesn't share a pipeline | 18:20 |
jeblair | so we should: a) add some report assertions to that test (it is, in fact, erroring on a different issue right now) | 18:21 |
jlk | easy enough to add a new pipeline. | 18:21 |
jeblair | and b) either combine those pipelines into one, or make a new test with a shared pipeline | 18:21 |
jeblair | jlk: the error it's hitting is that we need to capitalize "Verify" in gerrit reporters now | 18:21 |
jeblair | we probably missed updating that test since it wasn't failing | 18:22 |
clarkb | jeblair: I'm not sure | 18:22 |
jlk | which test are we talking about? multi driver? | 18:22 |
jeblair | jlk: yes, i ran it in the foreground and saw the error | 18:22 |
jlk | gotcha | 18:22 |
jlk | yeah looking here, I think I can just combine them into a single pipeline. | 18:23 |
jeblair | clarkb, jlk: on the other topic of conversation, i think clarkb is referring to this: https://review.openstack.org/488240 | 18:23 |
clarkb | https://review.openstack.org/#/c/488240/ looks like merged | 18:23 |
jlk | I'll go that route, and add some asserts on reporting. | 18:23 |
jeblair | jlk: kk | 18:23 |
jlk | oh hrm | 18:24 |
jlk | that change might make it more difficult to synthesize webhook events running locally | 18:24 |
jlk | (with curl) | 18:24 |
jeblair | jlk: when we move the webhook listener into zuul-web and have it submit events to the scheduler over gearman, that will provide a convenient injection point for synthetic events. | 18:25 |
jlk | I have a tiny python app that sends the event, I can add some signature stuff to it as well. | 18:25 |
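
For a small script that synthesizes webhook events, the signature part could look something like the sketch below. It assumes the receiver checks the `X-Hub-Signature` header in GitHub's documented form (`sha1=` followed by the hex HMAC-SHA1 of the raw request body, keyed with the shared secret); the helper names and the sample secret are made up for illustration.

```python
import hashlib
import hmac
import json


def sign_payload(secret, body):
    """Return the X-Hub-Signature header value for raw body bytes."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha1).hexdigest()
    return "sha1=" + digest


def verify_signature(secret, body, header_value):
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign_payload(secret, body), header_value)


def build_event(secret, event_type, payload):
    """Build the headers and body for a synthetic webhook POST."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-GitHub-Event": event_type,
        "X-Hub-Signature": sign_payload(secret, body),
    }
    return headers, body
```

The returned headers and body can be handed to any HTTP client. The signature must be computed over exactly the bytes that are sent, so the body should not be re-serialized after signing.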
*** yolanda has joined #zuul | 18:34 | |
*** yolanda has quit IRC | 18:34 | |
*** yolanda has joined #zuul | 18:34 | |
tobiash | jeblair: will the mergers/executors in future serve git repos via http (again)? | 18:35 |
jeblair | tobiash: not publicly, but we have talked about doing so internally so that executors can fetch merged content from mergers without having to repeat the action. | 18:36 |
tobiash | jeblair: ok, so I shall not rip out zuul_url... | 18:37 |
jeblair | tobiash: didn't i do that already? :) | 18:37 |
tobiash | jeblair: I meant the zuul_url from zuul.conf | 18:37 |
tobiash | jeblair: it's still required in the zuul.conf | 18:38 |
jlk | isn't the zuul_url used in part by the webapp? | 18:38 |
jlk | and other things for status url? | 18:38 |
jeblair | tobiash: ah. hrm. we probably should remove it for now; it's meaningless now | 18:38 |
*** yolanda has quit IRC | 18:38 | |
jlk | maybe that usage got removed in the last two weeks :D | 18:38 |
jeblair | jlk: i think that's something different? | 18:39 |
jlk | hrm. | 18:40 |
jlk | must be. | 18:40 |
tobiash | jlk: did you mean status_url? | 18:43 |
jlk | jeblair: you were seeing "KeyError: 'verify'" right? | 18:43 |
jlk | (because changing the pipeline to be 'Verify:' just changes the traceback to KeyError: 'Verify' | 18:43 |
jlk | ) | 18:43 |
jeblair | jlk: i think so. i guess there could be something further wrong. | 18:44 |
jlk | tobiash: that sounds / looks right | 18:44 |
jlk | oh hah | 18:45 |
jlk | nope, hrm. | 18:46 |
jlk | lolol. | 18:48 |
jlk | s/Verify/Verified/ | 18:50 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Implement autohold https://review.openstack.org/486692 | 18:51 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Implement autohold https://review.openstack.org/486692 | 18:53 |
jlk | alright, tests pass with combined pipeline and fixed label. I'll expand the tests to assert reporting after I lunch and migrate to a coffee shop | 19:01 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Docs: add a :default: argument to zuul:attr https://review.openstack.org/489303 | 19:07 |
tobiash | did something change in the test framework? | 19:11 |
tobiash | just updated to latest branch tip and my local tests break | 19:11 |
tobiash | http://paste.openstack.org/show/617056/ | 19:12 |
tobiash | that happens for about 30% of the test cases | 19:13 |
tobiash | rerunning with --failed reduces the failed tests | 19:13 |
jeblair | tobiash: SpamapS was attempting a new way of stopping helper threads. we wait 5 seconds to join the diskaccountant thread on shutdown (including test shutdown). that error suggests it didn't shutdown within that time. | 19:14 |
jeblair | tobiash: can you try increasing the join timeout and see if it helps? | 19:14 |
* tobiash checking | 19:14 | |
jeblair | tobiash: executor.server.DiskAccountant.stop() | 19:15 |
jeblair | tobiash: (perhaps with a bunch of tests running, it's being starved a little and takes longer) | 19:15 |
tobiash | jeblair: rerunning with 15s... | 19:16 |
tobiash | jeblair: doesn't seem to be this value | 19:17 |
tobiash | jeblair: tried 50, but first fail was after about 15s | 19:18 |
pabelanger | Shrews: don't hate me, but I think we want a reason for autohold (like hold in nodepool). So we can ping operators to clean up autohold things | 19:18 |
pabelanger | having it a required field would be nice too. Forces people to add their name | 19:18 |
Shrews | pabelanger: that's fine by me. i was just implementing based on jeblair's story | 19:19 |
pabelanger | we could do it in follow up patch too | 19:19 |
jeblair | ya should be easy to add | 19:19 |
jeblair | and ++ to requiring it when we add it :) | 19:19 |
*** bhavik1 has joined #zuul | 19:20 | |
jeblair | tobiash: another possibility is that a bunch of tests are failing, but hitting that error during an unclean shutdown | 19:20 |
jeblair | tobiash: you can whitelist the thread in shutdown() in tests/base.py to check that | 19:21 |
tobiash | jeblair: ok, at least verified that the job root monitor patch is introducing this | 19:24 |
tobiash | jeblair: whitelisting fixes all tests except test_cache_hard_links which fails deterministically on my system | 19:32 |
*** bhavik1 has quit IRC | 19:34 | |
tobiash | jeblair: that could be the root cause as it throws an assertion but doesn't stop the thread and all later tests in that batch fail because the thread is still there | 19:37 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add required reason for hold https://review.openstack.org/489366 | 19:40 |
Shrews | pabelanger: jeblair: ^^^ | 19:40 |
Shrews | also fixes the ---count in the example (one too many dashes) | 19:41 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Remove duplicated states from zk.py https://review.openstack.org/489256 | 19:43 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Ensure stop of disk accountant on assertion https://review.openstack.org/489368 | 19:47 |
tobiash | jeblair: missing stop when assertion hits caused this | 19:48 |
tobiash | jeblair: still have to check why the hard links check fails for me, possibly due to having /tmp on tmpfs | 19:49 |
tobiash | jeblair: will check that | 19:49 |
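
The failure mode described here (an assertion fires, the helper thread is never stopped, and later tests trip over the leftover thread) is commonly avoided with `try`/`finally`. A generic sketch, not the actual DiskAccountant code; `HelperThread` and `run_check` are hypothetical names:

```python
import threading


class HelperThread(threading.Thread):
    """A periodic background thread that can be asked to stop."""

    def __init__(self, interval=0.05):
        super().__init__(daemon=True)
        self._stop_event = threading.Event()
        self.interval = interval

    def run(self):
        # wait() returns True once stop() sets the event, ending the loop.
        while not self._stop_event.wait(self.interval):
            pass  # periodic work would go here

    def stop(self, timeout=5):
        """Signal the thread to exit and wait up to timeout seconds."""
        self._stop_event.set()
        self.join(timeout)
        if self.is_alive():
            raise RuntimeError("helper thread did not stop in time")


def run_check(check):
    """Run check() with a helper thread that is always stopped afterwards."""
    helper = HelperThread()
    helper.start()
    try:
        check()
    finally:
        # Runs even if check() raises AssertionError, so no thread leaks
        # into the next test.
        helper.stop()
```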
pabelanger | Shrews: awesome, thanks! | 19:52 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Docs: add a :default: argument to zuul:attr https://review.openstack.org/489303 | 20:04 |
Shrews | pabelanger: i think we can remove py2 jobs for nodepool feature/zuulv3 branch now, yeah? | 20:07 |
Shrews | or is there something else we should do first? | 20:08 |
pabelanger | Shrews: Ya, I haven't seen any issues with nl01 | 20:08 |
jeblair | Shrews: i'm catching up on the autohold change -- why did you end up changing the parameters in zuul/rpcclient.py from 'tenant' to 'tenant_name' ? | 20:08 |
Shrews | jeblair: at the request of tobiash | 20:08 |
pabelanger | Shrews: I'd +2 a removal :) | 20:09 |
jeblair | Shrews: was that in a review comment or irc? | 20:09 |
Shrews | jeblair: review comment | 20:09 |
jeblair | i can't find it :( | 20:10 |
Shrews | jeblair: ps5... also a bit in irc too | 20:10 |
Shrews | https://review.openstack.org/#/c/486692/5/zuul/zk.py | 20:11 |
Shrews | oh, maybe not in irc. his last comment is what i was thinking of as an irc discussion | 20:11 |
jlk | jeblair: can you give me some more details on the error you're seeing in prod with the failure to report? Without my change I have a test case that results in a traceback attempting to report a gerrit change to github, but it doesn't seem to stop that change from being reported to gerrit as well. | 20:12 |
jlk | ooooh! | 20:12 |
jlk | this could be because my setup is only reporting on success, rather than on start. | 20:12 |
jeblair | Shrews: okay, that's for arguments, but i don't think the same applies for on-the-wire json. changing it there puts it out of sync with the rest of the rpc functions. | 20:13 |
pabelanger | jlk: sure, I can get you a traceback | 20:13 |
*** persia has quit IRC | 20:13 | |
pabelanger | 1 sec | 20:13 |
pabelanger | jlk: an example of the error http://paste.openstack.org/show/617060/ | 20:14 |
jlk | um | 20:15 |
jlk | well that's slightly different | 20:16 |
*** persia has joined #zuul | 20:16 | |
pabelanger | let me check other tracebacks | 20:16 |
jlk | nah, I think I know what's going on | 20:16 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Fix autohold RPC protocol https://review.openstack.org/489375 | 20:17 |
Shrews | jeblair: how's that? ^^^^ | 20:17 |
pabelanger | jlk: ya, others seem to be the same. That code you linked from jamielennox looked like the right approach | 20:17 |
jeblair | Shrews: woot, thx. will discount that from my review :) | 20:17 |
jlk | yeah I'm trying to replicate the later part of the issue, where it drops the change out of the queue and doesn't do the next bit | 20:18 |
jlk | like, what was the symptom problem that sent y'all hunting? | 20:18 |
jlk | that's the test case I want to try to make sure we have | 20:18 |
jeblair | jlk: iirc, the reporter failing bumped the item out of the queue | 20:19 |
Shrews | jeblair: no worries. just increasing my zuul commit count :) | 20:19 |
pabelanger | jlk: Ah, I might be confusing the issue jeblair mentioned earlier with another | 20:19 |
jlk | jeblair: yeah, was it a start report that did that? Because when I don't have a start report things seem to "work". But it seems like a start event is what's causing the change to go missing | 20:19 |
jeblair | pabelanger: i think it's the same issue | 20:20 |
jeblair | jlk: that failure at least was a success report | 20:20 |
jlk | that's weird, I wonder if it's an ordering issue | 20:20 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Fix test_cache_hard_links when on tmpfs https://review.openstack.org/489377 | 20:21 |
jlk | in my test, it attempts to report success for both drivers, fails on one driver, succeeds on the other | 20:21 |
jlk | so the test case which looks to make sure each change has the appropriate number of reports still passes | 20:21 |
jeblair | jlk: it def could be ordering. | 20:21 |
jeblair | i'm not sure how we arrive at the reporter order; it might be dict key order which is random | 20:22 |
jeblair | jlk: zuul logs the order, so you'll note in pabelanger's traceback, github comes first | 20:23 |
jeblair | 2017-07-31 19:56:40,809 DEBUG zuul.IndependentPipelineManager: success [<zuul.driver.github.githubreporter.GithubReporter object at 0x7f4449594128>, <zuul.driver.sql.sqlreporter.SQLReporter object at 0x7f4449594588>, <zuul.driver.gerrit.gerritreporter.GerritReporter object at 0x7f4449594898>] | 20:23 |
jlk | let me re-run | 20:23 |
jlk | ah yeah in my test, gerrit reporter fires off first, then github | 20:24 |
jlk | What's the names of your connections in main.yaml ? Presumably not "github" and "gerrit" ? | 20:25 |
jeblair | we may want to put these in separate try/except handlers | 20:25 |
jeblair | jlk: they are that actually | 20:25 |
jlk | okay, so ordering is not guaranteed | 20:25 |
jlk | I can easily catch this if I turn on start reporting though | 20:25 |
jlk | because it'll toss out the change before it runs the job and it'll never report success | 20:26 |
jeblair | seems like a fair test | 20:26 |
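
The "separate try/except handlers" idea can be sketched as a loop that isolates each reporter, so that a failure in the first one no longer prevents the rest from running regardless of ordering. This is an illustration only; `report_all` is a hypothetical helper and reporters are modelled as plain callables:

```python
import logging

log = logging.getLogger("sketch.reporting")


def report_all(reporters, item):
    """Call every reporter for item; return a list of (reporter, error)."""
    failures = []
    for reporter in reporters:
        try:
            reporter(item)
        except Exception as e:
            # Record the failure and keep going instead of aborting the loop.
            log.exception("reporter %r failed for %r", reporter, item)
            failures.append((reporter, e))
    return failures
```

The caller can then decide what a non-empty failure list means for the item, rather than having the first exception remove it from the queue.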
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Implement autohold https://review.openstack.org/486692 | 20:27 |
*** persia has quit IRC | 20:27 | |
*** dkranz_ has quit IRC | 20:27 | |
jeblair | Shrews: all +3 | 20:27 |
*** jkilpatr has quit IRC | 20:28 | |
*** persia has joined #zuul | 20:28 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Remove zuul_url from merger config https://review.openstack.org/489378 | 20:29 |
Shrews | jeblair: w00t. another zuulv3 checkbox ticked | 20:29 |
Shrews | jeblair: i'll modify the nodepool client tomorrow for the new fields. Will be a good time to add the --detail flag that i've wanted to the 'list' command, too. | 20:32 |
jeblair | Shrews: sounds good! | 20:32 |
* tobiash hits eod | 20:35 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add required reason for hold https://review.openstack.org/489366 | 20:36 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix autohold RPC protocol https://review.openstack.org/489375 | 20:36 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Remove duplicated states from zk.py https://review.openstack.org/489256 | 20:36 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-sphinx master: Fix package setup https://review.openstack.org/489395 | 20:57 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-sphinx master: Fix package setup https://review.openstack.org/489395 | 21:00 |
jeblair | pabelanger, clarkb: can you +3 that ^? i'll follow that with a release, then hopefully get zuul-jobs depending on that. | 21:00 |
pabelanger | +2, give clarkb a moment to look | 21:01 |
clarkb | ya looking | 21:01 |
clarkb | done | 21:02 |
openstackgerrit | Merged openstack-infra/zuul-sphinx master: Fix package setup https://review.openstack.org/489395 | 21:06 |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Limit github reporting to github sources https://review.openstack.org/489399 | 21:11 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Switch to using zuul-sphinx https://review.openstack.org/489409 | 21:24 |
jeblair | pabelanger: a +3 on that ^ should also net us a published zuul-jobs doc | 21:24 |
pabelanger | jeblair: done | 21:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Switch to using zuul-sphinx https://review.openstack.org/489409 | 21:30 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Switch to using zuul-sphinx https://review.openstack.org/489409 | 21:32 |
*** jkilpatr has joined #zuul | 21:34 | |
*** yolanda has joined #zuul | 21:34 | |
*** yolanda has quit IRC | 21:41 | |
pabelanger | jeblair: jlk +3 on 489399. I'll be sure to restart zuulv3.o.o once it lands on disk | 21:42 |
jeblair | pabelanger: thanks! | 21:42 |
jlk | woo | 21:43 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Limit github reporting to github sources https://review.openstack.org/489399 | 21:49 |
jlk | authoring code going into production feels like a nice way to return from vacation! | 21:51 |
Shrews | jlk: the way i see it, you owe us 2 weeks of work. best get movin'! | 21:56 |
jlk | LOL | 21:56 |
pabelanger | zuulv3.o.o restarted | 22:01 |
jeblair | zuul meeting time in #openstack-meeting-alt | 22:01 |
SpamapS | tobiash: in 489377 ..I am a bit confused by the commit message. ??? | 22:18 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Ensure stop of disk accountant on assertion https://review.openstack.org/489368 | 22:24 |
pabelanger | 2.5.2.dev1500 | 22:47 |
pabelanger | impressive | 22:47 |
pabelanger | master only has 1242 commits | 22:47 |
jeblair | pabelanger: wait, are you saying v2 to v3 has more commits than all of zuul v1 and v2 combined? | 22:49 |
pabelanger | jeblair: according to github, ya | 22:50 |
Shrews | neat | 22:52 |
jeblair | init to end of zuul v1 was 205 commits, end of v1 to (presumed) end of v2 will be about 1040. | 22:56 |
jeblair | so comparable to, but larger than the entire v2 effort, so far. however, *much* larger than the 141 commits that took us from v1 to v2. :) | 22:57 |
jlk | oh, so 141 to go from v1 to first of v2, but then another 900 some odd commits after that for continued v2 dev | 22:59 |
jeblair | jlk: yep | 22:59 |
jeblair | v2 had major changes all made continuously | 22:59 |
jlk | is https://storyboard.openstack.org/?#!/board/41 still the right board to look at for work? Is there a different board for the jobs development? | 23:02 |
jeblair | jlk: yes, and i don't think there's a board for jobs development. | 23:02 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-sphinx master: Remove cap on sphinx https://review.openstack.org/489431 | 23:13 |
jeblair | pabelanger: mind one more +3 on that ^? i think we'll want that before we push to add it to global-requirements. | 23:14 |
pabelanger | done | 23:15 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Switch to using zuul-sphinx https://review.openstack.org/489409 | 23:20 |
openstackgerrit | Merged openstack-infra/zuul-sphinx master: Remove cap on sphinx https://review.openstack.org/489431 | 23:21 |
pabelanger | cool, release pipeline now live | 23:26 |
pabelanger | http://zuulv3.openstack.org/ | 23:26 |
pabelanger | I'll start work in the morning for testing | 23:26 |
jeblair | pabelanger: \o/ | 23:27 |
pabelanger | jeblair: any objections on using openstack-dev/sandbox or openstack-dev/ci-sandbox for tagging? | 23:27 |
jeblair | pabelanger: sandbox sounds okay to me. would avoid ci-sandbox. but maybe ask others in #-infra | 23:29 |
pabelanger | agree | 23:29 |
pabelanger | don't think we have apache setup properly | 23:32 |
pabelanger | http://zuulv3.openstack.org/keys/gerrit/openstack-infra/zuul.pem 404s | 23:32 |
pabelanger | we'll have to look at rewrite rules in a bit | 23:32 |
jeblair | pabelanger: ++ | 23:32 |
jeblair | pabelanger: you can try wgetting that from zuul directly on the server to make sure it works | 23:33 |
pabelanger | jeblair: will have to dig more into it tomorrow, getting 500 internal error back from: http://localhost:8001/openstack/keys/gerrit/openstack-infra/zuul.pub | 23:42 |
*** https_GK1wmSU has joined #zuul | 23:42 | |
*** https_GK1wmSU has left #zuul | 23:43 | |
jlk | I think I'm done for the day. Cheers! | 23:51 |