tristanC | Shrews: btw, i think last PS of https://review.openstack.org/535537 is much better because it mitigates the SessionExpired issue which seems to be happening when the zk connection dropped after the "while self._zk.suspended or self._zk.lost:" waiting loop | 01:00 |
---|---|---|
mrhillsman | not sure what i am missing, i have jobs queued, instances available, executor connected to gearman, gearman stats says it is there, gearman workers says it is there, worker sends GRAB_JOB_UNIQ but goes right into GRAB_WAIT state, followed by NO_JOB packet then going into SLEEP state | 01:06 |
mrhillsman | do i need to increase a timer/timeout on the GRAB_JOB_UNIQ request? | 01:07 |
tristanC | mrhillsman: what does "nodepool request-list" display? | 01:09 |
mrhillsman | checking | 01:10 |
mrhillsman | so this is the second env with executor behind vpn | 01:10 |
mrhillsman | that returns empty table | 01:10 |
mrhillsman | rather, executor in environment that does not allow ingress but egress is ok | 01:11 |
mrhillsman | nodepool is within the environment as well using shared zookeeper with chroot | 01:13 |
*** yolanda has quit IRC | 01:44 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: builder: do not cleanup image for driver not managing image https://review.openstack.org/535552 | 02:06 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement a static driver for Nodepool https://review.openstack.org/535553 | 02:09 |
*** weshay has quit IRC | 02:14 | |
*** weshay has joined #zuul | 02:15 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Refactor run_handler to be generic https://review.openstack.org/535554 | 02:15 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Refactor NodeLauncher to be generic https://review.openstack.org/535555 | 02:20 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenContainer driver https://review.openstack.org/535556 | 02:39 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement a Kubernetes driver https://review.openstack.org/535557 | 02:39 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an Amazon EC2 driver https://review.openstack.org/535558 | 02:39 |
*** bhavik1 has joined #zuul | 04:37 | |
*** bhavik1 has quit IRC | 05:07 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: zk: use kazoo retry facilities https://review.openstack.org/536209 | 05:38 |
*** threestrands has quit IRC | 05:40 | |
openstackgerrit | Merged openstack-infra/nodepool master: Drop python2 virtualenv for devstack https://review.openstack.org/536168 | 06:22 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Do not call merger:cat when all config items are excluded https://review.openstack.org/536134 | 06:31 |
Wei_Liu | tristanC, How to config the workflow +/-1 in gerrit to trigger the gate? | 06:53 |
tristanC | Wei_Liu: you mean in zuul pipeline like this: https://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul.d/pipelines.yaml#n82 ? | 06:58 |
Wei_Liu | tristanC: no, I mean the config of All-Projects in gerrit, like adding label "verified". How to add the label "workflow"? | 07:03 |
tristanC | Wei_Liu: iirc, this happens in the project's meta/config; labels are defined there | 07:07 |
tristanC | e.g. https://git.openstack.org/cgit/openstack-infra/project-config/tree/gerrit/acls/openstack/nova.config#n5 | 07:08 |
Wei_Liu | tristanC: ok, thanks | 07:11 |
tristanC | Wei_Liu: you're welcome! | 07:14 |
*** dkranz has quit IRC | 08:40 | |
*** jpena|off is now known as jpena | 08:46 | |
*** sshnaidm|off has quit IRC | 09:33 | |
*** hashar has joined #zuul | 09:53 | |
tobiash | corvus: oops, forgot about that and will be more careful next time on that topic | 10:07 |
tobiash | corvus: this probably also affects merger and executor | 10:07 |
tobiash | corvus: I will test and fix them as well | 10:07 |
*** yolanda has joined #zuul | 10:17 | |
*** electrofelix has joined #zuul | 10:23 | |
*** maxamillion has quit IRC | 10:25 | |
*** maxamillion has joined #zuul | 10:28 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Clean held nodes automatically after configurable timeout https://review.openstack.org/536295 | 10:28 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Refactor status functions, add web endpoints, allow params This patch refactors status functions so that instead of having one function per output format, the output format is simply a parameter. https://review.openstack.org/536301 | 10:44 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Add a separate module for node management commands https://review.openstack.org/536303 | 10:47 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: webapp: add optional admin endpoint https://review.openstack.org/536319 | 10:59 |
*** yolanda has quit IRC | 11:17 | |
*** hashar has quit IRC | 11:27 | |
*** openstackgerrit has quit IRC | 11:33 | |
*** jkilpatr has joined #zuul | 11:47 | |
*** jkilpatr has quit IRC | 11:55 | |
*** sshnaidm has joined #zuul | 11:58 | |
*** jkilpatr has joined #zuul | 12:09 | |
*** jpena is now known as jpena|lunch | 12:33 | |
tristanC | fwiw I wrote a blog post about a CI/CD workflow with Zuul, it's proposed here: https://github.com/redhat-openstack/website/pull/1157 | 12:53 |
*** dkranz has joined #zuul | 13:07 | |
*** openstackgerrit has joined #zuul | 13:21 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Clean held nodes automatically after configurable timeout https://review.openstack.org/536295 | 13:21 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Tenant config dynamic loading of project from external script https://review.openstack.org/535878 | 13:21 |
*** jpena|lunch is now known as jpena | 13:26 | |
Shrews | tristanC: re: 535537, where did you and corvus land on that? I think I recall him asking for some more investigation on your side as to why that was necessary, or similar. I could be wrong though | 13:28 |
Shrews | tristanC: any way we can test that in our unit tests? | 13:30 |
tristanC | i think the issue was that the "while zk.suspended" loop doesn't prevent connection issues from happening after this protection | 13:31 |
*** rlandy has joined #zuul | 13:31 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Refactor status functions, add web endpoints, allow params https://review.openstack.org/536301 | 13:32 |
tristanC | Shrews: we could probably mock kazoo transport or even the socket, though i'm not sure how reliable such test would be | 13:33 |
tristanC | i made the zookeeper client wait infinitely for the service to be available, which sounds like the best strategy | 13:37 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Clean held nodes automatically after configurable timeout https://review.openstack.org/536295 | 13:39 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Refactor status functions, add web endpoints, allow params https://review.openstack.org/536301 | 13:53 |
Shrews | tristanC: i left a comment there for your consideration | 13:59 |
*** weshay is now known as weshay|rover | 14:00 | |
*** maxamillion has quit IRC | 14:02 | |
*** maxamillion has joined #zuul | 14:02 | |
Shrews | tristanC: i took a very "openstack-ci-breaks-everything centric" stance there in the comment :) | 14:03 |
Shrews | openstack tends to find all of the edge cases before anyone else | 14:03 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Clean held nodes automatically after configurable timeout https://review.openstack.org/536295 | 14:08 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Revert "Register term_handler for all zuul apps" https://review.openstack.org/536376 | 14:20 |
tobiash | corvus: this change also broke signal handling in our dockerized deployments in ways I don't understand yet ^^ | 14:21 |
tobiash | thus it should be reverted for now | 14:21 |
*** dkranz has quit IRC | 14:22 | |
Shrews | tobiash: oh. hrm, didn't that also break running with the nodaemon arg? | 14:22 |
Shrews | thinking ctrl-C might no longer work with that change and nodaemon | 14:23 |
tobiash | Shrews: the intent of that was to make nodaemon work generically | 14:24 |
tobiash | however the signal handling seems to be broken by that and I have no clue yet why | 14:24 |
tobiash | I run it with nodaemon inside docker | 14:25 |
tobiash | the revert fixes this for me again | 14:25 |
Shrews | tobiash: but if you did, say, 'zuul-fingergw -d' from the command line, did ctrl-C still work? | 14:25 |
tobiash | just tried to do that with the scheduler and it worked | 14:26 |
tobiash | the same in docker didn't work | 14:26 |
Shrews | interesting | 14:26 |
tobiash | I spent two hours debugging this now and then decided to do a revert instead of a fix :/ | 14:28 |
tobiash | maybe this has something to do with registering the handlers in the base class and calling methods in the child class | 14:29 |
Shrews | tobiash: ctrl-c does not seem to work with zuul-fingergw, so there are more problems :) | 14:33 |
Shrews | i think revert is the right thing | 14:34 |
tobiash | Shrews: yah, something is completely broken | 14:34 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Add a separate module for node management commands https://review.openstack.org/536303 | 14:34 |
SpamapS | I feel like there's some missing precedence information in the job config sections. Like, if I specify a 'files' matcher and an 'irrelevant-files' matcher, which one is evaluated first? | 14:37 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: webapp: add optional admin endpoint https://review.openstack.org/536319 | 14:43 |
corvus | SpamapS: the order doesn't matter -- if they're both supplied, they both need to match | 14:49 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul master: Use re2 for change_matcher https://review.openstack.org/536389 | 14:53 |
SpamapS | corvus: ACK. Branches are a bit different, which is what made me think that. | 14:54 |
SpamapS | corvus: also.. LCA'ing? | 14:54 |
SpamapS | so early :p | 14:54 |
SpamapS | corvus: anyway, ^^ is a first attempt at use of re2 with negative branch matching. I'm not sure I got it right. | 14:55 |
SpamapS | corvus: I also think we may want to add a specific section on regex flavors and links to the re2 docs if that's what we're going to use. | 14:55 |
SpamapS | Positive feedback from the fb-re2 maintainers though. | 14:55 |
SpamapS | working through a few review comments, I think we might actually get a py3 re2 soon. | 14:56 |
corvus | neat! | 14:56 |
SpamapS | once we got the right email address | 14:57 |
SpamapS | hoping to convince them to open up the issue tracker too.. no idea why it is turned off | 14:57 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul master: Slack driver https://review.openstack.org/536391 | 15:00 |
SpamapS | corvus: ^ re-instated the -2 so we don't whoops | 15:03 |
corvus | SpamapS: thx | 15:03 |
*** sshnaidm is now known as sshnaidm|bbl | 15:30 | |
*** dkranz has joined #zuul | 15:38 | |
gundalow | Regarding the Zuul integration for Ansible, need to be careful that openstack-zuul (bot) doesn't end up adding the same comments as Ansibullbot | 15:39 |
gundalow | For example, Ansibullbot knows to ignore WIP PRs | 15:39 |
*** sshnaidm|bbl has quit IRC | 15:39 | |
*** sshnaidm|bbl has joined #zuul | 15:40 | |
*** zaro has quit IRC | 15:40 | |
*** mrhillsman has quit IRC | 15:40 | |
*** zhuli has quit IRC | 15:40 | |
*** zhuli has joined #zuul | 15:41 | |
*** zaro has joined #zuul | 15:41 | |
*** mrhillsman has joined #zuul | 15:41 | |
*** sshnaidm|bbl has quit IRC | 15:42 | |
*** abadger1999 has joined #zuul | 15:43 | |
pabelanger | how does github flag a PR WIP? Is there an event or just regex match for commit message? | 16:00 |
gundalow | pabelanger: Ansibullbot will add the WIP label based on some regex | 16:06 |
gundalow | current Ansibullbot will notify people for reviews, if rebases are needed, etc, etc | 16:06 |
gundalow | mattclay: FYI, discussing Zuul & notifications | 16:06 |
*** gregdek has joined #zuul | 16:07 | |
rbergeron | look, a wild gregdek appears | 16:07 |
pabelanger | okay, that doesn't exist in zuul today. So something that would need to be discussed. Once zuul sees a new patchset in gerrit or github, it will get added to pipelines for tests to run. | 16:08 |
gundalow | pabelanger: All looks good and exciting. Just all the devil-in-the-detail fun and games now :) | 16:08 |
gundalow | Sooooooooooooooo looking forward to Zuul for Ansible Network stuff, that's going to make our lives so much easier :) | 16:08 |
pabelanger | Yay | 16:09 |
*** dkranz has quit IRC | 16:13 | |
*** myoung is now known as myoung|biab | 16:15 | |
*** sshnaidm|bbl has joined #zuul | 16:23 | |
*** sshnaidm|bbl is now known as sshnaidm | 16:23 | |
* mordred waves to gregdek | 16:30 | |
* gregdek waves back to mordred | 16:30 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul master: Use re2 for change_matcher https://review.openstack.org/536389 | 16:31 |
mordred | gundalow: sweet - and yah - I think the very next step past "get the job running" is to dig into figuring out the interaction between ansibullbot, tags and zuul - which I think will be fun and exciting | 16:31 |
corvus | mordred, SpamapS, tobiash: can you look at https://review.openstack.org/535501 and children? i'd like to get those in asap so we don't run into any problems when openstack starts branching | 16:31 |
mordred | corvus: will look in just a sec | 16:34 |
mnaser | can nodepool spin up instances on demand rather than maintaining $min-ready instances? | 16:35 |
mordred | corvus: in the meantime - re: your mailing list message, for now does that mean if I just make a job: name: devstack branch: devel override-branch: master in the shade repo - since that would create a devstack variant that the shade job would match on ansible PRs? | 16:35 |
corvus | mnaser: absolutely. you can just set min-ready to 0 | 16:35 |
mordred | (while we poke more systemically) | 16:35 |
mnaser | ok, i see | 16:35 |
corvus | mnaser: it always expands in response to demand, min-ready just makes things faster | 16:36 |
*** bhavik1 has joined #zuul | 16:37 | |
corvus | mordred: nope, you can't make a devstack job outside of the devstack repo. i don't think there's a quick fix for this (though, the proposed solution isn't very difficult; we could probably land it in a day or so) | 16:37 |
pabelanger | corvus: Shrews: did you by chance see my question about min-ready and multiple launchers? | 16:38 |
pabelanger | http://eavesdrop.openstack.org/irclogs/%23zuul/%23zuul.2018-01-20.log.html#t2018-01-20T20:26:02 | 16:38 |
mordred | corvus: ok. cool | 16:39 |
* SpamapS tries to have two-jobs worth of min-ready defined. Works nicely for that first-thing-in-the-morning recheck storm. | 16:39 | |
pabelanger | I may be wrong, but because each launcher sends min-ready requests, it is possible for 4 launchers (min-ready: 1) to bring 4 nodes online in ready state | 16:39 |
SpamapS | Of course, I have a private cloud with no charge-back billing so... #luxury | 16:39 |
corvus | pabelanger: ah, missed that. i suspect it's a race. they don't blindly request min-ready, but if they all think there's a min-ready deficit at the same time, they may all emit the request. | 16:41 |
pabelanger | corvus: okay, yes. That sounds like what might be happening | 16:41 |
mordred | mnaser, SpamapS: I wonder if it would be worthwhile to some users if we added a "pause ready servers" config flag that would tell nodepool to pause (or shelve or whatever) servers it has booted and that are ready but waiting to be used, and if it's set as soon as it decides to return a server to zuul it runs an unpause/unshelve on the server in question. I don't think it's a thing we'd use in openstack, but | 16:41 |
mordred | perhaps for folks who are in billing/charge-back scenarios it might make an impact? | 16:41 |
SpamapS | Yeah I could see that being useful on AWS. | 16:42 |
corvus | is that any faster than booting? | 16:42 |
Shrews | pabelanger: right, min-ready is just that... a minimum. we could totally get more | 16:42 |
SpamapS | More reliable. | 16:42 |
mordred | similarly, we've got a very-long-lived 'feature request' (originally requested by the HP Public Cloud ops team) for nodepool to use rebuild instead of delete/create when possible | 16:42 |
SpamapS | And if you're using volume-backed instance types, def faster than booting. | 16:42 |
*** bhavik1 has quit IRC | 16:42 | |
SpamapS | rebuild is super problematic. | 16:43 |
SpamapS | We have about a 5% fail rate on rebuild here. | 16:43 |
SpamapS | Though that may just be ELIBERTY | 16:43 |
mordred | we've been deferring rebuild for a while because from our end our concept of needing to do a rebuild hasn't been great | 16:43 |
mordred | SpamapS: good to know | 16:43 |
SpamapS | IMO rebuild is too taxing on hypervisors. | 16:43 |
mordred | I think in any case it's probably also a config setting | 16:43 |
SpamapS | Let the scheduler find a place for that new node. | 16:43 |
pabelanger | Shrews: yah, that is a little different from v2 nodepool. In the case of images we don't use often, eg: debian-jessie, we can have more than min-ready nodes that idle for days, when we could be using that quota on more active labels | 16:43 |
mordred | for hp public - the reason they wanted us to use rebuild had to do with delete/create being more taxing on the network provisioning | 16:44 |
SpamapS | Def meh here. :) | 16:44 |
Shrews | pabelanger: max-ready-age can help alleviate some of that | 16:44 |
mordred | and doing a rebuild on an existing vm would have resulted in re-use of the provisioned network details | 16:44 |
SpamapS | since we just have provider networks. | 16:44 |
mordred | so I imagine the cost/benefit is different per cloud | 16:44 |
mordred | SpamapS: ++ provider networks :) | 16:44 |
Shrews | pabelanger: that's per-label, too | 16:45 |
SpamapS | yeah.. though our L3 solution is....... | 16:45 |
* SpamapS is actually getting ready to deploy a new cloud with routed networks which will make that not so .... and more :-D | 16:45 | |
*** jpena is now known as jpena|brb | 16:46 | |
pabelanger | SpamapS: okay, I do think we might want to revisit the min-ready logic to see if possible to remove > min-ready nodes. But I can also propose a patch to try max-ready-age in openstack-infra | 16:46 |
mordred | SpamapS, mnaser: in any case, I think there are a few additional knobs it might be useful to add to further allow folks to match their usage pattern to the qualities of their cloud | 16:46 |
corvus | pabelanger: well, let's not remove > min-ready, let's address the race so we don't create > min-ready in the first place | 16:46 |
corvus | pabelanger: that way we don't run the risk of creating then deleting unused servers | 16:47 |
SpamapS | ++ addressing race | 16:47 |
SpamapS | That would likely also help with the "one cloud is smaller than the other" | 16:47 |
mordred | (mostly trying to think about folks who don't just have free accounts on public clouds like we do and might have different tradeoffs they care about WRT resource use) | 16:47 |
mordred | ++ addressing race | 16:47 |
Shrews | corvus: pabelanger: the min-ready logic is rather... complex, using the normal noderequest mechanism. not sure how to properly address that | 16:47 |
SpamapS | Really starting to wonder if a scheduling process is the answer there. | 16:47 |
SpamapS | Instead of "first provider that gets the req" , we have schedulers that pick a provider based on its current state. | 16:49 |
corvus | SpamapS: that's an option, but we'd need cooperative HA schedulers. i figure we may as well put them in the providers themselves. | 16:49 |
SpamapS | because it's kind of a bummer to only use my super-empty cloud when I could get a few more nodes out of the "kinda busy but still ok to use" cloud. | 16:49 |
corvus | SpamapS: at any rate, that's something we put in the spec as a future enhancement after we got some experience with the current simple algorithm. i think cooperative scheduling is something we can visit post 3.0 | 16:50 |
Shrews | did i hear SpamapS just volunteer for that spec??? ;) | 16:51 |
mordred | Shrews: that's what I heard | 16:52 |
SpamapS | So wish I could focus on Zuul'ing. ;) | 16:52 |
mnaser | ^ these are some good ideas | 16:54 |
mnaser | but yeah, keeping in mind some people have chargeback/public cloud usage is important | 16:54 |
SpamapS | Yeah for metered clouds, min-ready should be 0 or near-0. | 16:56 |
tobiash | We're setting min-ready depending on office hours | 16:57 |
corvus | tobiash: you should set max-servers based on office hours too :) | 16:58 |
tobiash | In combination with max-ready-age that scales down to 0 pretty well at night :) | 16:59 |
tobiash | We're trying to have the full quota available at any time | 16:59 |
tobiash | If someone likes to work at night, that's fine. He just has a bit more latency. | 17:00 |
tobiash | But my biggest deployment (still v2) has max 1300 vcpus | 17:02 |
tobiash | That's not even near the openstack deployment ;) | 17:03 |
*** jpena|brb is now known as jpena | 17:11 | |
*** rlandy is now known as rlandy|brb | 17:20 | |
*** rlandy|brb is now known as rlandy | 17:33 | |
*** hashar has joined #zuul | 17:35 | |
*** elyezer has joined #zuul | 17:42 | |
*** hashar is now known as hasharAway | 17:42 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: webapp: add optional admin endpoint https://review.openstack.org/536319 | 18:03 |
*** jpena is now known as jpena|off | 18:18 | |
*** Cibo has joined #zuul | 18:28 | |
mordred | corvus: how hard would it be to add a log entry in the scheduler log that indicates which executor ran a given change/job? my process for that right now is to grep the scheduler log for change,revision.*job-name - then to find the build id then use ansible to grep the executor logs on all the executors to see which one had it | 18:34 |
mordred | corvus: or, is there a better way to do that? | 18:35 |
corvus | mordred: after the job completes (one way or another), it should be easy. doing it before or at start may be slightly more complex but possible | 18:37 |
corvus | 1 sec | 18:37 |
mordred | corvus: I think after complete is fine | 18:37 |
mordred | corvus: https://etherpad.openstack.org/p/zuulv3-retry-limit-merger-failures here's a collection of retry-limit failures I've gotten in the last day or two | 18:38 |
corvus | mordred: i think manager/__init__.py onBuildCompleted would be the place | 18:39 |
mordred | corvus: cool. I'll look in to that | 18:42 |
mordred | corvus: I'm currently curious to see if all of the failures happened on the same executor | 18:42 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: WIP: Convert from legacy to native devstack job https://review.openstack.org/535899 | 18:46 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: WIP: Convert from legacy to native devstack job https://review.openstack.org/535899 | 18:49 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: WIP: Convert from legacy to native devstack job https://review.openstack.org/535899 | 18:55 |
*** myoung|biab is now known as myoung | 18:57 | |
*** jkilpatr has quit IRC | 19:02 | |
*** rlandy is now known as rlandy|biab | 19:03 | |
*** hasharAway has quit IRC | 19:04 | |
*** electrofelix has quit IRC | 19:10 | |
Shrews | clarkb: i just noticed a spurious failure for test_failed_provider in http://logs.openstack.org/60/535560/1/check/tox-cover/53f1ec0/testr_results.html.gz | 19:14 |
Shrews | clarkb: might be something we should look into | 19:15 |
*** hashar has joined #zuul | 19:15 | |
*** jkilpatr has joined #zuul | 19:16 | |
*** hashar has quit IRC | 19:25 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Fix race in test_provider_removal https://review.openstack.org/536540 | 19:26 |
Shrews | tristanC: i found an issue with the test_provider_removal test that you saw fail in https://review.openstack.org/535531. 536540 ^^^ fixes that | 19:26 |
*** hashar has joined #zuul | 19:28 | |
*** hashar has quit IRC | 19:32 | |
*** rlandy|biab is now known as rlandy | 19:37 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Clean held nodes automatically after configurable timeout https://review.openstack.org/536295 | 19:40 |
mordred | corvus: ok. I believe I have disproven the theory that the issues are due to a locally corrupt repo on one of the executors | 19:45 |
mordred | corvus: I grabbed uuids for some of the builds and then looked for executor hits on the executors and it seems to be fairly well spread across them | 19:46 |
*** sshnaidm is now known as sshnaidm|off | 19:48 | |
mordred | corvus: oh! wait- I may have done that very wrong | 19:50 |
mordred | corvus: ze09 and ze10 are the only ones with the ValueError in question in the logs | 19:51 |
mordred | corvus: http://paste.openstack.org/show/650392/ | 20:02 |
*** harlowja has joined #zuul | 20:04 | |
mordred | corvus: that's the ValueError instances from the past 2 days of logs - it seems to MOSTLY be issues with 303dad9aa842b3d745d0f8b1afae86bc32acf6c | 20:05 |
mordred | corvus: which is a ref on a branch that no longer exists in gerrit | 20:06 |
mordred | corvus: and does not exist on master | 20:06 |
mordred | it's this change: https://review.openstack.org/#/c/519949/ | 20:06 |
corvus | mordred: interesting... do you have a full tb handy? | 20:07 |
mordred | corvus: SO - I think we're back to your original theory - that there is something weird in some repo states | 20:08 |
mordred | corvus: yup - one sec | 20:08 |
corvus | mordred: i got a tb | 20:08 |
mordred | corvus: cool. http://paste.openstack.org/show/650406/ for anyone else interested | 20:09 |
corvus | mordred: the line above indicates that it's trying to recreate the ref refs/tags/0.7.2 | 20:11 |
mordred | corvus: I've verified, the 303dad9aa842b3d745d0f8b1afae86bc32acf6c sha does exist on all of the git servers | 20:12 |
mordred | corvus: that's weird - cause that sha is not in the 0.7.2 tag | 20:12 |
mordred | it should not be in *any* tag or *any* currently existing branch - I would expect a git gc to make it go away | 20:15 |
corvus | mordred: i have to grab food; i'll rejoin when i get back | 20:18 |
*** sshnaidm|off has joined #zuul | 20:19 | |
*** jkilpatr has quit IRC | 20:25 | |
*** jkilpatr has joined #zuul | 20:53 | |
pabelanger | I'm guessing we'll also want to delete https://docs.openstack.org/infra/zuul/feature/zuulv3/ | 20:57 |
*** elyezer has quit IRC | 20:57 | |
pabelanger | and nodepool too | 20:57 |
*** elyezer has joined #zuul | 20:59 | |
*** elyezer has quit IRC | 21:10 | |
*** elyezer has joined #zuul | 21:12 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: WIP: Convert from legacy to native devstack job https://review.openstack.org/535899 | 21:20 |
SpamapS | pabelanger: can that link redirect to master? | 21:20 |
SpamapS | that would be ideal. | 21:21 |
SpamapS | shows up in a *lot* of google searches. | 21:21 |
pabelanger | I think so, I need to check | 21:21 |
pabelanger | I want to say we support per directory htaccess files | 21:22 |
Shrews | corvus: the directories referenced in https://docs.openstack.org/infra/zuul/user/jobs.html#working-directory ... where are those rooted? | 21:22 |
Shrews | or if anyone else knows | 21:25 |
openstackgerrit | Merged openstack-infra/nodepool master: license: remove dos line break https://review.openstack.org/535531 | 21:28 |
mordred | Shrews: all the ansible tasks start with $PWD in $HOME | 21:33 |
Shrews | pabelanger: those links are generated automatically by walking docs.o.o and rebuilding sitemap.xml. that's about all I learned about it when I looked last week. not sure what we want to do with that | 21:33 |
mordred | Shrews: so generally speaking, paths are relative to /home/zuul (or {{ ansible_user_dir }} | 21:33 |
Shrews | mordred: ok. was wondering if i place new files under work/logs if they'll automatically be archived for us, or if I need to do that manually | 21:35 |
* Shrews gives the try-it-and-see method a go | 21:36 | |
mordred | Shrews: gimme one sec, sorry - on the phone - logs are all sorts of fun | 21:37 |
mordred | Shrews: there is a mechanism in the devstack base job for collecting logs | 21:38 |
Shrews | mordred: yeah, saw the fetch-devstack-log-dir role. i suppose i could usurp that by writing to their log dir | 21:40 |
Shrews | that just seemed like not the proper thing to do though | 21:40 |
corvus | Shrews: andreaf is working on a more generic log mechanism, may be worth asking him | 21:58 |
corvus | mordred: i'm back and digging into that ref issue now | 21:59 |
corvus | mordred: i think i was wrong about which object was involved, it's not 0.7.2 | 22:00 |
corvus | mordred: we have the log line at exactly the wrong place. it's before it actually sets the ref, but it's *after* it looks up the object corresponding to the hexsha. | 22:00 |
corvus | mordred: which means it fits very well your theory that it's related to the deleted branch -- if the executor was told to create a ref for that branch, and it doesn't exist upstream, then this is the behavior i'd expect | 22:01 |
corvus | mordred: so it seems like the first thing to fix (after moving the log line :) is to make sure mergers prune branches | 22:02 |
corvus | mordred: i'll work on fixes to this. i want to try to add a test for it, and i don't think we have any tests that delete branches | 22:04 |
Shrews | zuul meeting now??? | 22:04 |
*** elyezer has quit IRC | 22:09 | |
*** elyezer has joined #zuul | 22:10 | |
*** myoung is now known as myoung|bbl | 22:12 | |
*** myoung|bbl is now known as myoung | 22:12 | |
*** openstackgerrit has quit IRC | 22:18 | |
mordred | Shrews: actually, I think for now just putting the files in {{ devstack_base_dir }}/logs is fine ... | 22:21 |
mordred | Shrews: BUT ... | 22:21 |
mordred | with this: https://review.openstack.org/536611 Move zuul_copy_output to be a job variable | 22:21 |
mordred | Shrews: you should just be able to add a zuul_copy_output var to your job definition containing just the paths to the log files you want to copy | 22:22 |
mordred | Shrews: so like zuul_copy_output:\n '/var/log/nodepool/builder.log': 'logs' | 22:23 |
corvus | oh yes that's what i was thinking of, sorry | 22:27 |
*** openstackgerrit has joined #zuul | 22:38 | |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Revert "Revert "Add zuul.{pipeline,nodepool.provider,executor.hostname} to job header"" https://review.openstack.org/514489 | 22:38 |
*** Cibo has quit IRC | 22:40 | |
*** Cibo has joined #zuul | 22:41 | |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Add Ansible version to job header https://review.openstack.org/532304 | 22:42 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Revert "Revert "Add zuul.{pipeline,nodepool.provider,executor.hostname} to job header"" https://review.openstack.org/514489 | 22:43 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Add Ansible version to job header https://review.openstack.org/532304 | 22:43 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Fix a copy-pasta in a comment https://review.openstack.org/536620 | 23:05 |
mordred | SpamapS: ^^ that fixes a review you left on https://review.openstack.org/#/c/535502 | 23:06 |
mordred | corvus: the secrets/semaphores/nodesets stack ending in https://review.openstack.org/#/c/535503/ has enough +2s, but I did not +A in case you wanted to solicit more feedback | 23:08 |
corvus | mordred: we also had a +2 from fungi on the feature/zuulv3 version of the secrets patch, so we're probably gtg | 23:09 |
corvus | mordred: w00t, i believe i have reproduced the issue in a test case. | 23:12 |
mordred | corvus: woot! | 23:14 |
*** mrhillsman has quit IRC | 23:26 | |
*** kmalloc has quit IRC | 23:26 | |
*** mrhillsman has joined #zuul | 23:26 | |
*** mattclay has quit IRC | 23:26 | |
*** kmalloc has joined #zuul | 23:27 | |
*** mattclay has joined #zuul | 23:27 | |
*** lennyb has quit IRC | 23:27 | |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul master: Add support for dumping queues from a status.json file https://review.openstack.org/536622 | 23:28 |
*** lennyb has joined #zuul | 23:28 | |
*** Shrews has quit IRC | 23:28 | |
dmsimard | https://review.openstack.org/#/q/topic:zuul-changes adds a feature to dump queues from a status.json file (from our periodic backups) | 23:29 |
*** Shrews has joined #zuul | 23:35 | |
SpamapS | mordred: sweet | 23:48 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Add change information to Build Completed log message https://review.openstack.org/536629 | 23:49 |
mordred | SpamapS, corvus: ^^ ok - I *think* that wins the prize for longest commit message compared to size and impact of commit ... although corvus has some really good candidates for that prize too | 23:50 |
SpamapS | :) | 23:51 |
*** openstack has quit IRC | 23:55 | |
*** openstack has joined #zuul | 23:59 | |
*** ChanServ sets mode: +o openstack | 23:59 |
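
Note on the 14:37 to 14:49 exchange about 'files' and 'irrelevant-files': corvus's answer is that evaluation order does not matter because both matchers have to pass. The sketch below models that as a plain logical AND. It is only an illustration, not Zuul's change_matcher code: the function names are hypothetical, and it assumes that 'files' passes when at least one changed file matches, while 'irrelevant-files' passes unless every changed file matches.

```python
import re


def any_file_matches(changed_files, patterns):
    """Return True if at least one changed file matches one of the regexes."""
    return any(re.search(p, f) for f in changed_files for p in patterns)


def all_files_match(changed_files, patterns):
    """Return True if every changed file matches at least one of the regexes."""
    return all(any(re.search(p, f) for p in patterns) for f in changed_files)


def job_should_run(changed_files, files=None, irrelevant_files=None):
    """Combine both matchers with a logical AND, so their order is irrelevant."""
    # An absent matcher always passes.
    files_ok = files is None or any_file_matches(changed_files, files)
    # The irrelevant-files matcher passes unless *all* changed files are irrelevant.
    relevant_ok = irrelevant_files is None or not all_files_match(
        changed_files, irrelevant_files
    )
    return files_ok and relevant_ok
```

With both matchers supplied, a change touching only doc/index.rst against files=["^doc/"] and irrelevant_files=["^doc/"] would not run the job in this model, since the second matcher fails even though the first passes.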
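
Note on the 16:39 to 16:47 min-ready discussion: pabelanger saw 4 launchers with min-ready: 1 bring 4 nodes online, and corvus suspected a race in which every launcher sees the same deficit at the same time. The toy model below reproduces that arithmetic. It is not nodepool code: the function names are hypothetical, and it assumes each request immediately turns into one ready node.

```python
def nodes_to_request(ready_count, min_ready):
    """Deficit a launcher would try to fill, given the ready count it observed."""
    return max(0, min_ready - ready_count)


def simulate(launchers, min_ready, racy):
    """Return the number of ready nodes after every launcher has acted once."""
    ready = 0
    if racy:
        # Every launcher reads the ready count before any request is fulfilled,
        # so they all observe the same deficit.
        observed = [ready for _ in range(launchers)]
        for seen in observed:
            ready += nodes_to_request(seen, min_ready)
    else:
        # Launchers act one after another and each sees the previous result.
        for _ in range(launchers):
            ready += nodes_to_request(ready, min_ready)
    return ready


if __name__ == "__main__":
    print("racy:", simulate(launchers=4, min_ready=1, racy=True))
    print("serialized:", simulate(launchers=4, min_ready=1, racy=False))
```

In this model the racy case ends with 4 ready nodes and the serialized case with 1, which is why the channel preferred fixing the race over deleting the surplus nodes afterwards.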