*** rlandy has quit IRC | 00:00 | |
*** yolanda_ has joined #zuul | 00:06 | |
*** yolanda__ has quit IRC | 00:09 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add min_avail_hdd governor for zuul-executor https://review.openstack.org/577948 | 00:09 |
---|---|---|
pabelanger | tristanC: clarkb: fungi: corvus: tristanC: first attempt at HDD governor for executor^ | 00:10 |
*** harlowja has quit IRC | 01:46 | |
*** mordred has quit IRC | 01:49 | |
*** mordred has joined #zuul | 02:07 | |
*** tovin07 has joined #zuul | 02:08 | |
*** tovin07 has quit IRC | 02:10 | |
*** yolanda__ has joined #zuul | 03:15 | |
*** yolanda_ has quit IRC | 03:18 | |
*** yolanda__ has quit IRC | 03:23 | |
*** yolanda__ has joined #zuul | 03:29 | |
*** yolanda_ has joined #zuul | 03:34 | |
*** yolanda__ has quit IRC | 03:36 | |
*** yolanda__ has joined #zuul | 03:44 | |
*** yolanda_ has quit IRC | 03:45 | |
*** yolanda_ has joined #zuul | 04:03 | |
*** yolanda__ has quit IRC | 04:06 | |
*** yolanda has joined #zuul | 04:08 | |
*** yolanda_ has quit IRC | 04:09 | |
*** yolanda_ has joined #zuul | 04:10 | |
*** yolanda has quit IRC | 04:13 | |
*** nchakrab has joined #zuul | 04:30 | |
*** nchakrab_ has joined #zuul | 04:48 | |
*** nchakrab has quit IRC | 04:48 | |
*** CrayZee has joined #zuul | 05:07 | |
*** nchakrab has joined #zuul | 05:35 | |
*** nchakra__ has joined #zuul | 05:37 | |
*** nchakrab_ has quit IRC | 05:38 | |
*** nchakrab has quit IRC | 05:40 | |
*** Rohaan has joined #zuul | 05:50 | |
*** snapiri- has joined #zuul | 05:53 | |
*** CrayZee has quit IRC | 05:56 | |
*** openstackgerrit has quit IRC | 06:04 | |
*** nchakrab has joined #zuul | 06:09 | |
*** nchakrab_ has joined #zuul | 06:12 | |
*** nchakra__ has quit IRC | 06:12 | |
*** nchakrab has quit IRC | 06:16 | |
tristanC | pabelanger: since we added more review.openstack.org projects to rdoproject zuul, it seems like the pipeline manager is queueing merge job, even when we don't include any configuration from those projects | 06:20 |
*** yolanda__ has joined #zuul | 06:21 | |
tristanC | there is currently a 20 PS stack for nova that is starving our resources | 06:22 |
*** yolanda_ has quit IRC | 06:24 | |
tristanC | shouldn't the manager prepareItem return early if the project is configured with include: [] ? | 06:24 |
*** nchakrab_ has quit IRC | 06:29 | |
*** gtema has joined #zuul | 06:49 | |
*** nchakrab has joined #zuul | 06:56 | |
*** hashar has joined #zuul | 07:06 | |
*** pcaruana has joined #zuul | 07:20 | |
*** jpena|off is now known as jpena | 07:44 | |
*** nchakrab has quit IRC | 07:56 | |
*** nchakrab has joined #zuul | 07:58 | |
*** nchakrab has quit IRC | 08:02 | |
*** nchakrab has joined #zuul | 08:06 | |
tobiash | tristanC: I noticed that we occasionally have zk connection losses (like http://paste.openstack.org/show/724289/) | 08:12 |
tobiash | tristanC: I think you also had trouble with zk. Did an increased session timeout help? | 08:12 |
tristanC | tobiash: iiuc, merger:cat job doesn't rely on zookeeper | 08:15 |
*** nchakrab has quit IRC | 08:21 | |
tobiash | tristanC: oh that was unrelated to your merger question ;) | 08:22 |
tobiash | tristanC: I was asking because you started that zookeeper discussion some time ago | 08:23 |
*** nchakrab has joined #zuul | 08:23 | |
tobiash | tristanC: that zookeeper connection loss triggers mass deletion of our nodes every now and then | 08:23 |
tristanC | tobiash: it used to do that for us too... | 08:25 |
*** nchakrab has quit IRC | 08:25 | |
tristanC | tobiash: that's why i'm still using https://review.openstack.org/#/q/status:open+project:openstack-infra/nodepool+branch:master+topic:zookeeper-retry | 08:25 |
*** nchakrab_ has joined #zuul | 08:25 | |
tobiash | tristanC: at least the session timeout config has been merged in zuul | 08:28 |
tobiash | tristanC: what value do you use for the session timeout? | 08:28 |
tobiash | tristanC: to your problem I'm not sure if we can delay that merger job because it might have some depends-on which depend on zuul.yaml changes | 08:29 |
tobiash | tristanC: so I see two possibilities there, take out nova from the config or add more merger capacity | 08:30 |
*** electrofelix has joined #zuul | 08:35 | |
tobiash | tristanC: oh nevermind, I failed reading diffs... you removed the session timeout in favor of the retry mechanism? | 08:35 |
tristanC | tobiash: i'm adding more mergers, but it seems like the pipeline manager specifically looks for .zuul.yaml, so maybe there is another merge job running for depends-on | 08:41 |
tristanC | tobiash: about zk-retry, the idea is to completely remove the manual connection loop state check and rather rely on kazoo to retry the action until they succeed | 08:42 |
tobiash | tristanC: but that doesn't solve the lost session right? | 08:42 |
tristanC | tobiash: right, if the retry takes too long, then the session still gets lost (it's a server side thing iiuc) | 08:43 |
tobiash | tristanC: so how did you solve your expired sessions? Did you increase the session timeout? | 08:43 |
tristanC | tobiash: hum, if i read kazoo doc correctly, the timeout is "The longest to wait for a Zookeeper connection.", (e.g. https://kazoo.readthedocs.io/en/latest/api/client.html ) | 08:49 |
tobiash | tristanC: internally that is mapped to a _session_timeout variable | 08:50 |
tristanC | i think the only way to increase the timeout is to increase the server tickTime | 08:51 |
tristanC | tobiash: our issue was that the tcp connection between nodepool and zookeeper got stalled, the get_children call raised a "ConnectionLost", and by the time kazoo reconnected, we lost the session | 08:52 |
tristanC | and since we switched to using kazoo retry feature, then we no longer have connection loss or session loss issue | 08:53 |
tristanC | and i didn't see any duplicate node request or zknode leak, but we only run one launcher and one builder | 08:54 |
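The retry-until-success idea tristanC describes can be sketched as below. This is a stdlib-only stand-in, not kazoo's own retry implementation and not the code in the nodepool change; `ConnectionLost`, `retry_call`, and all parameter names are hypothetical. As noted in the discussion, retrying does not protect against the server expiring the session if the retries take too long.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for a transient ZooKeeper connection error."""


def retry_call(func, *args, max_tries=5, delay=0.1, backoff=2, sleep=time.sleep):
    """Call func(*args), retrying on ConnectionLost with exponential backoff.

    Re-raises the last error once max_tries attempts have failed.
    """
    attempt = 0
    while True:
        try:
            return func(*args)
        except ConnectionLost:
            attempt += 1
            if attempt >= max_tries:
                raise
            sleep(delay)
            delay *= backoff
```

A caller would wrap each ZooKeeper operation (for example a `get_children`-style call) in `retry_call` instead of checking connection state by hand before every action.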
*** openstackgerrit has joined #zuul | 09:04 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: zk: use kazoo retry facilities https://review.openstack.org/535537 | 09:04 |
*** yolanda_ has joined #zuul | 09:50 | |
*** yolanda__ has quit IRC | 09:53 | |
*** Rohaan has quit IRC | 10:28 | |
*** sshnaidm|afk is now known as sshnaidm | 10:30 | |
*** jpena is now known as jpena|lunch | 11:03 | |
*** nchakrab_ has quit IRC | 11:36 | |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Add /etc/localtime to bubblewrap default ro bind https://review.openstack.org/578072 | 11:44 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Support paging when listing github installations https://review.openstack.org/578077 | 11:53 |
tobiash | corvus: looks like we're the first running into paging when enumerating github installations... ^ | 11:54 |
tobiash | clarkb: ^ | 11:54 |
*** hashar has quit IRC | 12:13 | |
*** hashar has joined #zuul | 12:14 | |
*** hashar has joined #zuul | 12:14 | |
*** jpena|lunch is now known as jpena | 12:14 | |
*** rlandy has joined #zuul | 12:30 | |
*** nchakrab has joined #zuul | 13:02 | |
*** nchakrab_ has joined #zuul | 13:05 | |
*** nchakrab has quit IRC | 13:08 | |
*** Rohaan has joined #zuul | 13:18 | |
*** nchakrab has joined #zuul | 13:22 | |
openstackgerrit | Artem Goncharov proposed openstack-infra/nodepool master: retire shade in favor of openstacksdk https://review.openstack.org/572829 | 13:25 |
*** nchakrab_ has quit IRC | 13:25 | |
*** frickler has quit IRC | 13:26 | |
*** frickler has joined #zuul | 13:26 | |
*** nchakrab_ has joined #zuul | 13:29 | |
pabelanger | There are a few governor patches from tobiash and myself starting at: https://review.openstack.org/549275/ if people have some time today to review. I know the HDD sensor could be helpful to RDO to stop launching jobs when we are close to HDD limits | 13:30 |
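A minimal sketch of what an HDD governor check might look like, assuming `min_avail_hdd` is a percentage of free disk space below which the executor stops accepting jobs. That interpretation, the function names, and the default value are assumptions for illustration; the actual change under review may differ.

```python
import shutil


def avail_percent(free, total):
    """Return free space as a percentage of total (0.0 if total is 0)."""
    if total <= 0:
        return 0.0
    return 100.0 * free / total


def hdd_ok(path, min_avail_hdd=5.0):
    """Return True if the filesystem holding path has enough free space.

    min_avail_hdd is assumed to be a percentage threshold.
    """
    usage = shutil.disk_usage(path)
    return avail_percent(usage.free, usage.total) >= min_avail_hdd
```

An executor could call `hdd_ok()` on its working directory before accepting new work and pause job acceptance while it returns False.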
*** nchakra__ has joined #zuul | 13:31 | |
*** nchakrab has quit IRC | 13:32 | |
mordred | gtema: heya- I keep meaning to comment on that ^^ and I keep getting distracted. I think we need to keep the nodepool-side flavor caching because we haven't started using the shade/openstacksdk caching layer yet | 13:34 |
*** nchakrab_ has quit IRC | 13:34 | |
mordred | pabelanger: all lgtm - I left +A off of the first two just in case corvus wants to look now that he's back from china, but he's +2'd the first one before, so I don't imagine issues | 13:34 |
pabelanger | wfm | 13:35 |
fbo | hi, is there a strong requirement about the authentication to use the depends-on feature ? I mean I want to use depends-on on a gerrit instance but don't have any account. | 13:35 |
gtema | mordred: but with a simple `cache: expiration_time:3600` in clouds.yaml it works perfectly (at least I checked manually listing images) | 13:37 |
fbo | I think that's the same if I need to depends-on on a pull request on github, zuul needs to have a github connection that require authentication. | 13:39 |
fbo | Could it be possible to use the git driver with a specific depends-on format ? | 13:40 |
*** snapiri- has quit IRC | 13:41 | |
mordred | gtema: yes, this is true. the issue is that today nodepool does the right thing out of the box, while someone would need to configure clouds.yaml. there is a long-languishing task in shade/sdk land to turn caching on by default- at least for some of the things - and it's that that we've been waiting for to remove the nodepool-side caching | 13:42 |
gtema | mordred: sad, but ok - I will turn caching back | 13:43 |
mordred | fbo: I don't think the git-driver has cross-source depends-on support yet- the biggest issue there is figuring out what url to give to it - for gerrit you need: git fetch https://git.openstack.org/openstack-infra/nodepool refs/changes/29/572829/3 | 13:43 |
mordred | gtema: yeah - I agree. hopefully we'll finish the caching work in sdk and can do it | 13:43 |
mordred | fbo: and then refs/changes/29/572829/3 is more specific than 572829 ... it's possible though that perhaps some of the things needed are available via unauthenticated REST | 13:44 |
gtema | mordred: ok | 13:45 |
mordred | fbo: I think corvus has had some thoughts around this use case before - probably good to chat with him when he gets up | 13:45 |
mordred | fbo: but for now at least the answer is to add gerrit sources with accounts ... we recently added an account for OpenStack Zuul on the Open Daylight gerrit for this reason | 13:49 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: log-inventory: add missing zuul_info_dir prep https://review.openstack.org/577674 | 13:59 |
*** elyezer has quit IRC | 14:00 | |
fbo | mordred: ok, thanks for the clarification. I think using something like depends-on: git.openstack.org/openstack/nova:changes/1 could be enough and the git driver should have all it needs to fetch the right git ref. | 14:03 |
*** nchakra__ has quit IRC | 14:10 | |
*** nchakrab has joined #zuul | 14:10 | |
*** elyezer has joined #zuul | 14:12 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add min_avail_hdd governor for zuul-executor https://review.openstack.org/577948 | 14:21 |
pabelanger | mordred: sorry, pushed up a docs change^ | 14:21 |
*** elyezer has quit IRC | 14:21 | |
*** elyezer has joined #zuul | 14:23 | |
*** nchakrab has quit IRC | 14:41 | |
tobiash | corvus, clarkb: 575014 is in front of the multi-enqueue fix which would need a review too | 14:46 |
tobiash | but If you want I can unstack that | 14:46 |
clarkb | tobiash: I will take a look | 14:47 |
fungi | fbo: except that git.openstack.org/openstack/nova:changes/1 isn't something we can generalize to git remotes without some inside knowledge | 15:29 |
fungi | fbo: depends-on: git.openstack.org/openstack/nova:changes/43/76543/21 is at least | 15:31 |
fungi | gerrit doesn't provide a changes/76543 ref, it only provides a sharded and patchset-specific changes/43/76543/21 ref | 15:32 |
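The sharded ref layout fungi describes can be computed from a change number and patchset, as in this small sketch. The function name is hypothetical; the layout (last two digits of the change number, zero-padded, then change number, then patchset) matches the refs quoted in the discussion.

```python
def gerrit_change_ref(change, patchset):
    """Build the sharded, patchset-specific Gerrit ref for a change."""
    shard = "%02d" % (change % 100)
    return "refs/changes/%s/%d/%d" % (shard, change, patchset)
```

Note that the patchset number is still required, which is why a bare change number is not enough for a plain git remote to resolve.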
pabelanger | so, I am not sure if I have a bug in zuul or new feature, I am leaning towards new feature. Use case is, if you have a non-voting job and parent that fails, the child will be skipped. But zuul will report a -1 back to gerrit, because the skipped job didn't run. Which, in this case is fine, because we are trying to create some dynamic trigger jobs, without the node request to nodepool. The failing (non-voting) | 15:42 |
pabelanger | job has nodes: [], so only zuul-executor uses it. | 15:42 |
pabelanger | https://review.rdoproject.org/r/14462/ is the use case in question | 15:42 |
*** pcaruana has quit IRC | 15:42 | |
pabelanger | basically, propose a new job setting to allow for skipped jobs to not be considered a failure | 15:43 |
pabelanger | dmsimard: fyi^ | 15:43 |
dmsimard | pabelanger: is that in reference to the rdoinfo dynamic triggers ? | 15:45 |
pabelanger | dmsimard: yes | 15:45 |
dmsimard | ok so for people interested, we had discussed this a while back: http://eavesdrop.openstack.org/irclogs/%23zuul/%23zuul.2017-05-08.log.html#t2017-05-08T15:52:55 | 15:45 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add min_avail_hdd governor for zuul-executor https://review.openstack.org/577948 | 15:45 |
dmsimard | (more than a year ago now, heh) | 15:45 |
pabelanger | tobiash: now with statsd testing for hdd sensor^ | 15:46 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add min_avail_hdd governor for zuul-executor https://review.openstack.org/577948 | 15:47 |
*** Rohaan has quit IRC | 15:48 | |
corvus | fbo: the fundamental issue is more or less what fungi said -- we need to translate a url like "http://review.example.com/1234" to a gerrit git ref. that's gerrit specific enough that you need the zuul gerrit driver to do it. right now, that requires authentication. i have a WIP patch to support the http API for the reporter interface. if we also added HTTP support for the source interface, and then also | 15:53 |
corvus | added support for *unauthenticated* http, i think that would do what you want. | 15:53 |
corvus | pabelanger, dmsimard: the option sounds sensible to me -- however, i wonder if it would be okay to say that if a non-voting parent fails, treat all the children as non-voting. i think that's what you meant when you said it might be a bug. if folks think that's an intuitive behavior, we could probably do that. | 15:57 |
pabelanger | corvus: correct, that is what I was thinking could be a zuul bug. Assuming all were also okay with that statement | 15:58 |
corvus | at the moment, i can't think of when you wouldn't want that. but, i just woke up. | 15:59 |
dmsimard | corvus: the use case can be summarized as a parent job dynamically deciding which child jobs should be triggered (based on the results of the parent job) -- pabelanger's approach is something that gets us close to that without too much work I think | 15:59 |
corvus | dmsimard: i understand | 16:00 |
dmsimard | corvus: ack, making sure we're on the same wavelength | 16:00 |
pabelanger | okay, I'll see if I can figure out how to fix that today | 16:00 |
pabelanger | that would allow rdo to drop more dependency on jenkins :) | 16:01 |
corvus | if we explore the "non-voting parent causes skipped to be non-voting" approach, i think it does cover this use case, but we need to think about other use cases -- does it inhibit something else? | 16:01 |
*** elyezer has quit IRC | 16:02 | |
*** elyezer has joined #zuul | 16:02 | |
pabelanger | +1 | 16:04 |
*** harlowja has joined #zuul | 16:12 | |
*** elyezer has quit IRC | 16:15 | |
*** elyezer has joined #zuul | 16:19 | |
clarkb | corvus: pabelanger i think you should only do that if the children are all non voting too | 16:26 |
*** hashar is now known as hasharAway | 16:26 | |
*** elyezer has quit IRC | 16:30 | |
pabelanger | yah, in this case, the children are voting; the trick is that the pass / fail result only matters once the job is running. If it is skipped, the result doesn't matter in that case. I understand it is a different model from today | 16:32 |
clarkb | the parent value implicitly matters if the child values do | 16:33 |
clarkb | that is the least surprising behavior to me | 16:33 |
pabelanger | the only reason the parent is non-voting in this case, is if the condition check to not run the child job is false, then we fail the job. Making it non-voting doesn't cause zuul to leave -1 on patchset | 16:34 |
pabelanger | (condition check to run the child job is false) | 16:34 |
pabelanger | that is the only way I know today to have a child job skip | 16:35 |
clarkb | maybe that determination could be modeled in zuul directly and we can solve the actual usecase? | 16:35 |
clarkb | what determines if the child job should run? | 16:35 |
pabelanger | dmsimard: ^ | 16:35 |
mordred | clarkb: code/logic | 16:35 |
pabelanger | I want to say some external check | 16:35 |
pabelanger | curl URL | 16:36 |
mordred | it's a use case that's come up from a few places - people want to run code more complex than a regex to determine which jobs should run on a patch | 16:36 |
dmsimard | clarkb: "if grep foo; run job foo; elif grep bar; run job bar; fi" | 16:36 |
mordred | ansible does this inside of ansible-test - it examines the patch and applies some logic to determine which jobs should run | 16:36 |
*** elyezer has joined #zuul | 16:37 | |
clarkb | my immediate reaction to that is run all the jobs and have any that shouldn't run exit 0 and short circuit | 16:38 |
mordred | I don't think anyone has yet come up with a full suggestion on how zuul could provide direct support for that use case in a way that doesn't get pathological quickly | 16:38 |
clarkb | rather than have a complex tree of voting and non voting things | 16:38 |
mordred | clarkb: right - the issue there is in node allocation | 16:38 |
mordred | clarkb: if you have some node types that are expensive | 16:38 |
mordred | and you don't want to allocate them for a noop job if you don't have to | 16:38 |
pabelanger | clarkb: right, we are also trying not to use nodepool resources, since there is a $$$ cost to that | 16:39 |
clarkb | thinking out loud more, maybe we need a new job type that is a branch selector explicitly | 16:39 |
dmsimard | clarkb: what mordred said, we don't have the luxury of spinning nodes up just to exit 0 and return | 16:39 |
clarkb | rather than hacking voting vs non voting | 16:39 |
clarkb | the reason I worry about using non voting in this way is it could very easily lead to +1 results on changes that should have been -1'd | 16:40 |
clarkb | because as a user if a voting job cannot run that is a failing condition not a successful one | 16:41 |
pabelanger | agree, this was part of why I figured it was a new feature. Some sort of job attribute you express to say child result doesn't matter if parent failed | 16:42 |
clarkb | pabelanger: I would avoid failed vs success entirely in this case | 16:43 |
pabelanger | but, also not tied to this approach, just something I tested on Friday | 16:43 |
clarkb | something like set enqueue-children in zuul return | 16:43 |
pabelanger | clarkb: yah, really just need a way from parent not to run a child job. success / fail is only way today | 16:44 |
corvus | clarkb: er, back up a sec -- if the parent is non-voting, why would you care about the children? | 16:44 |
corvus | (sorry, i was away for a second, and i'm still mentally back at 16:26) | 16:44 |
clarkb | corvus: if the children are voting is the case I worry about (which seems to be pabelanger's case) that implicitly means a failure to run the parent is a failure to run the children which should be a -1 | 16:45 |
corvus | clarkb: but why would you set the parent to non-voting in that case? | 16:45 |
clarkb | because voting jobs failed (by not being able to run) | 16:45 |
corvus | clarkb: oh wait -- you're saying there may be a difference between "parent says children should not run" and "parent barfed" ? | 16:45 |
clarkb | corvus: yes | 16:45 |
corvus | that is a good point :) | 16:45 |
corvus | and that case isn't handled by the other suggestion of adding a new flag either. it really is a different return value. right? | 16:46 |
mordred | yeah - it seems like there is a potentially new interface that could be designed here | 16:46 |
clarkb | yes, the parent would have to set an attribute that zuul evaluates for the child determination | 16:47 |
mordred | to allow a job to communicate to zuul something about what should happen with children | 16:47 |
mordred | clarkb: jinx | 16:47 |
pabelanger | yah, that would work great | 16:47 |
corvus | this lines up with the idea of continuing jobs too (keep the parent running while running children) mentioned in the container spec. i anticipated that as being implemented via zuul_return. so we could do that here too. | 16:48 |
mordred | ++ | 16:48 |
mordred | seems like between the RDO case and the ansible-test case, we might have 2 good and complex potential users to vet a design against | 16:48 |
*** CrayZee has joined #zuul | 16:49 | |
pabelanger | cool | 16:50 |
corvus | it's pretty easy to work with zuul_return data; the plumbing is already there | 16:51 |
*** CrayZee has quit IRC | 16:51 | |
pabelanger | any suggestion on attribute name zuul looks at for the child to run? | 16:54 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul-jobs master: Dynamically determine overlay network mtu https://review.openstack.org/578153 | 16:55 |
clarkb | dmsimard: ^ that may interest you as you worked on that role a lot | 16:56 |
mordred | thinking out loud- if we added a variable to the zuul. ansible variables that is the list of child jobs zuul expects to run after the current job based on its understanding of the world - then we could have the capability for someone to return a list in zuul_return that would be either the same list or a smaller list | 16:57 |
*** elyezer has quit IRC | 16:57 | |
corvus | mordred: i was just thinking along similar lines :) | 16:57 |
dmsimard | clarkb: hm I thought we already did something like that | 16:57 |
mordred | then there could be a single rdo-selector or ansible-test-selector job with the full pile of dependencies - and it could do $logic and trim the list down | 16:57 |
clarkb | dmsimard: we make it configurable but don't configure it as far as I can tell | 16:57 |
mordred | pabelanger: does that seem like a potentially workable thing? | 16:57 |
pabelanger | mordred: yes, I believe so. And child jobs not listed show up as SKIPPED for results? | 16:58 |
corvus | mordred: we may also want to handle the simple case of "run no child jobs"... though, i guess you could do that by just saying: "zuul.child_jobs = []" ? | 16:58 |
mordred | corvus: yah | 16:59 |
clarkb | ya devstack calculates a similar value but only for neutron, not the overlay itself | 16:59 |
clarkb | er not the infra overlay | 16:59 |
mordred | corvus: and if you don't return child_jobs at all, zuul will not modify its job graph | 16:59 |
corvus | ++ | 16:59 |
dmsimard | clarkb: I guess it's a regression because we used to do it in devstack-gate https://github.com/openstack-infra/devstack-gate/blob/b3da8a393c68cc62924ee2f752f442d5c85ea8ed/devstack-vm-gate.sh#L69-L73 | 16:59 |
mordred | corvus, pabelanger: I think we'd just not return them - kind of like we do when a job doesn't get run due to a path exclusion | 16:59 |
corvus | pabelanger: i think with mordred's approach, child jobs won't show up at all? | 16:59 |
corvus | right | 17:00 |
mordred | yah | 17:00 |
dmsimard | clarkb: I'm curious why this is coming up just now (packethost has been enabled for a while) though, I haven't kept up with the backlog | 17:00 |
clarkb | dmsimard: we don't do it in d-g either, that is only for neutron | 17:00 |
dmsimard | clarkb: oh, so not for the overlay bridge, got it | 17:00 |
clarkb | dmsimard: I think its not really popped up much because we only do this on multinode tests and very few of those vote | 17:00 |
dmsimard | clarkb: added a comment | 17:02 |
*** gtema has quit IRC | 17:02 | |
pabelanger | mordred: corvus: ack, that would work | 17:03 |
clarkb | dmsimard: I'm trying to find where we actually use that role outside of ozj tests (but I can depends on ozj and get those tests I think) | 17:04 |
clarkb | oh its in zuul-jobs itself | 17:05 |
dmsimard | clarkb: the ozj tests would not exercise devstack on top of it, though, but perhaps you could add functional tests to make sure the MTU is as expected | 17:05 |
dmsimard | clarkb: i.e, http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/tests/multi-node-bridge.yaml | 17:05 |
clarkb | dmsimard: testing both is probably a good idea. I'll work on some depends on | 17:05 |
dmsimard | ++ | 17:05 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add min_avail_hdd governor for zuul-executor https://review.openstack.org/577948 | 17:07 |
pabelanger | clarkb: mordred: tobiash: corvus: ^now with working statsd testing | 17:07 |
clarkb | dmsimard: the ozj tests are run against zuul jobs too, so just need a devstack depends on | 17:09 |
*** elyezer has joined #zuul | 17:09 | |
*** harlowja has quit IRC | 17:10 | |
clarkb | tobiash: https://review.openstack.org/#/c/578077/1 lgtm but I didn't approve it as there isn't a test and I'm not super familiar with the github api details | 17:23 |
tobiash | clarkb: the paging of the next call isn't tested yet as well. I hope I have time to work on testing these two. | 17:25 |
tobiash | But I'm completely in fire fighting mode this week so far :/ | 17:25 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul-jobs master: Dynamically determine overlay network mtu https://review.openstack.org/578153 | 17:43 |
*** jpena is now known as jpena|off | 17:49 | |
*** electrofelix has quit IRC | 17:52 | |
mordred | gundalow: conversation starting 1.25 hours ago about filtering of child/dependent jobs that RDO needs that I think likely applies to ansible-test too | 18:01 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul-jobs master: Dynamically determine overlay network mtu https://review.openstack.org/578153 | 18:11 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul-jobs master: Dynamically determine overlay network mtu https://review.openstack.org/578153 | 18:13 |
clarkb | this whole ip isn't in your path thing | 18:15 |
clarkb | seems like a bug that ansible/centos should fix | 18:15 |
mordred | clarkb: ++ | 18:24 |
gundalow | mordred: Thanks | 18:27 |
*** dmellado has quit IRC | 18:32 | |
*** gouthamr has quit IRC | 18:32 | |
*** bhavik1 has joined #zuul | 18:34 | |
*** bhavik1 has quit IRC | 18:38 | |
*** yolanda__ has joined #zuul | 19:04 | |
*** yolanda_ has quit IRC | 19:07 | |
*** yolanda_ has joined #zuul | 19:09 | |
pabelanger | mordred: corvus: clarkb: do we pre-populate the inventory with zuul.child_jobs for the parent? Or just look for zuul.child_jobs in zuul_return? If I parse before correctly, it was always add zuul.child_jobs to inventory file | 19:09 |
*** sshnaidm has quit IRC | 19:09 | |
corvus | pabelanger: yeah, i think that was the suggestion | 19:10 |
mordred | pabelanger: yah - the idea is to pre-populate the inventory with the child_jobs | 19:10 |
corvus | always populate the inventory | 19:11 |
mordred | because zuul will have already applied various rules to determine what should run - and we want to honor that too | 19:11 |
pabelanger | okay, great. I think I have that step done, just writing unit test. | 19:12 |
*** yolanda__ has quit IRC | 19:12 | |
*** yolanda__ has joined #zuul | 19:15 | |
*** yolanda_ has quit IRC | 19:18 | |
*** acozine1 has joined #zuul | 19:23 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add zuul.child_jobs in ansible inventory file https://review.openstack.org/578181 | 19:38 |
pabelanger | corvus: mordred: ^is what I came up with for zuul.child_jobs. Now to work on zuul_return bits | 19:39 |
corvus | pabelanger, mordred: nice. do you think child_jobs should be just the first level of children, or all children? | 19:40 |
pabelanger | first level would be fine for rdo, but believe getDependentJobsRecursively returns all? I am unsure about ansible, would defer to mordred | 19:42 |
mordred | corvus: I think either could be fine - but I was thinking just first level originally | 19:46 |
pabelanger | okay, let me change | 19:48 |
*** hasharAway is now known as hashar | 19:50 | |
*** sshnaidm has joined #zuul | 19:50 | |
*** gouthamr has joined #zuul | 19:54 | |
mordred | pabelanger: and I was thinking that basically zuul would take the intersection of the lists - so it's not possible for a job to add child jobs - only to remove them (that's a follow up step - but just while I'm talking out loud) | 20:02 |
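The intersection rule mordred outlines can be sketched as follows. This is an illustration only, not Zuul's implementation: `filter_child_jobs` and its arguments are hypothetical names. It follows the behaviour discussed earlier in the channel: returning nothing leaves the job list untouched, returning an empty list runs no children, and a job can only remove children, never add them.

```python
def filter_child_jobs(expected, returned=None):
    """Return the child jobs that should still run.

    expected: child jobs Zuul planned to run (order preserved).
    returned: list handed back by the parent job, or None if the
              parent did not return a child_jobs value at all.
    """
    if returned is None:
        # Parent did not express an opinion; keep Zuul's own plan.
        return list(expected)
    allowed = set(returned)
    # Intersection: names not in the expected list are ignored.
    return [job for job in expected if job in allowed]
```

Any job in `expected` that is absent from the result would be the candidate for a skipped result.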
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add zuul.child_jobs in ansible inventory file https://review.openstack.org/578181 | 20:04 |
pabelanger | mordred: yes, that's what I'm going to try and figure out now | 20:05 |
*** hashar has quit IRC | 20:10 | |
*** hashar has joined #zuul | 20:11 | |
corvus | i think the github paging change could benefit from something like a betamax test if we ever get those going, but with what we have now, i think the code is covered about as well as it can be so i'll approve | 20:19 |
corvus | tobiash, clarkb, mordred: ^ | 20:19 |
*** gouthamr has quit IRC | 20:19 | |
tobiash | corvus: thanks | 20:20 |
corvus | is /etc/localtime ever not present? | 20:20 |
corvus | (on linux) | 20:20 |
corvus | fungi, clarkb: ^ q re localtime | 20:22 |
clarkb | I want to say systemd has a default behavior if it isnt there or maybe it is glibc | 20:22 |
*** gouthamr has joined #zuul | 20:22 | |
fungi | yeah, not sure honestly | 20:23 |
fungi | i can't say i've ever noticed it missing on a system outside bootstrapping/installation | 20:24 |
corvus | okay, maybe we assume yes until someone complains :) | 20:24 |
corvus | er, assume it's always there | 20:24 |
*** gouthamr_ has joined #zuul | 20:29 | |
*** gouthamr has quit IRC | 20:29 | |
*** gouthamr_ is now known as gouthamr | 20:29 | |
*** acozine1 has quit IRC | 20:33 | |
openstackgerrit | Merged openstack-infra/zuul master: Support paging when listing github installations https://review.openstack.org/578077 | 20:39 |
openstackgerrit | Merged openstack-infra/zuul master: Add /etc/localtime to bubblewrap default ro bind https://review.openstack.org/578072 | 20:48 |
openstackgerrit | Merged openstack-infra/zuul master: Improve test case test_unprotected_branches https://review.openstack.org/576536 | 20:51 |
openstackgerrit | Merged openstack-infra/zuul master: Move exclude unprotected branches check into tenant https://review.openstack.org/576537 | 20:54 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Consume Task and TaskManager from openstacksdk https://review.openstack.org/414759 | 20:56 |
corvus | mnaser: in https://review.openstack.org/572588 why does enter cause the page refresh, but clicking the (X) in the box to clear it does not? | 20:56 |
corvus | ohhhh | 20:59 |
corvus | if you type in the box and don't hit enter, it filters on the next auto refresh. clicking the (X) must just be the opposite of that. it clears the field, but still awaits the next refresh for the change to be visible | 20:59 |
corvus | (i just happened to have clicked (X) right before a refresh and thought it was instantaneous) | 21:00 |
pabelanger | corvus: mordred: I'm trying to see where we check zuul.child_jobs has been filtered to remove a job, but believe I could do it in QueueItem.findJobsToRun(), does that make sense? Otherwise, happy to take a pointer on where | 21:00 |
corvus | pabelanger: i think that's the right place | 21:00 |
pabelanger | Yay | 21:01 |
corvus | pabelanger, mordred: i think we may have more trouble implementing mordred's vision of jobs not appearing than pabelanger's vision of jobs showing up as something like 'SKIPPED' | 21:02 |
clarkb | https://review.openstack.org/#/c/578157/ shows my zuul-jobs change should be working (re mtu changes) | 21:02 |
corvus | pabelanger, mordred: because the job graph is pretty static once it's created; nothing removes branches from it right now | 21:03 |
mordred | corvus: ah ... nod | 21:03 |
mordred | well - given the construct (job manipulating graph) I think it's not terrible if we show those as skipped | 21:04 |
mordred | corvus: re: 572588 - I believe we can make a bunch of things better and more immediately responsive | 21:04 |
corvus | pabelanger, mordred: yeah, i think maybe the skipped approach is best. surgery on the graph would be possible, but it'll be complex (especially since it needs to survive reconfigurations which can also alter the job graph) | 21:06 |
corvus | i feel like skipped is pretty intuitive | 21:06 |
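The "skipped" approach discussed above can be sketched roughly as follows. This is a minimal illustration, not Zuul's implementation: the `Job` class, the `find_jobs_to_run` function, the `allowed_children` mapping, and the result strings are all hypothetical names chosen for the example. The point it shows is that the graph itself is never modified; filtered-out jobs simply receive a `SKIPPED` result, and that result propagates to their descendants.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Job:
    # Hypothetical stand-in for a node in a frozen job graph.
    name: str
    parents: List[str] = field(default_factory=list)


def find_jobs_to_run(
    jobs: Dict[str, Job],
    results: Dict[str, str],
    allowed_children: Dict[str, Set[str]],
) -> List[str]:
    """Return runnable job names, marking filtered jobs as SKIPPED.

    `allowed_children` maps a parent job name to the set of child names
    that the parent asked to keep (an assumed representation of a
    returned child-job filter). The `jobs` mapping is left untouched.
    """
    changed = True
    while changed:
        changed = False
        for job in jobs.values():
            if job.name in results:
                continue
            if any(p not in results for p in job.parents):
                continue  # still waiting on at least one parent
            skip = any(
                results[p] == "SKIPPED"
                or (p in allowed_children
                    and job.name not in allowed_children[p])
                for p in job.parents
            )
            if skip:
                results[job.name] = "SKIPPED"
                changed = True  # descendants may now be decidable

    # A job is runnable once every parent has succeeded.
    return [
        job.name
        for job in jobs.values()
        if job.name not in results
        and all(results.get(p) == "SUCCESS" for p in job.parents)
    ]


jobs = {
    "build": Job("build"),
    "test-a": Job("test-a", ["build"]),
    "test-b": Job("test-b", ["build"]),
    "deploy": Job("deploy", ["test-b"]),
}
results = {"build": "SUCCESS"}
runnable = find_jobs_to_run(jobs, results, {"build": {"test-a"}})
```

Because nothing is removed from `jobs`, a sketch like this would not need to reconcile graph surgery with a later reconfiguration, which is the complexity corvus mentions.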
corvus | mordred: re 588, yeah, i figured (mnaser said things would get better when we angularify that) i just wondered why adding was different from removing, but i see now it isn't, so it makes much more sense :) | 21:07 |
corvus | mordred: i'm +2 on that and the main angular patch | 21:08 |
pabelanger | mordred: corvus: okay, I'll work up the skipped approach patch and see how that looks | 21:08 |
mordred | corvus: yay! | 21:08 |
corvus | we just need someone else to +3 those now :) how hard could that be? | 21:08 |
mordred | corvus: tobiash said he could upgrade his vote to a +2 if needed | 21:08 |
corvus | we may need to take him up on that :) | 21:08 |
mordred | corvus: but yeah - the number of folks who are excited about reviewing a large javascript patch turns out to be ... small | 21:08 |
mordred | clarkb: feel like +3ing two javascript patches? | 21:09 |
clarkb | mordred: I can take a look as soon as I get these emails sent | 21:09 |
clarkb | links? | 21:09 |
mordred | clarkb: https://review.openstack.org/#/c/551989/ and https://review.openstack.org/#/c/572588/ | 21:09 |
mordred | corvus: I've lost the status etherpad ... | 21:09 |
corvus | i think the intent is met though -- people who care and can poke at it have looked it over. perhaps no one (or only one) understands it fully, but it seems like an improvement, doesn't seem to break things, and people seem okay iterating on smaller changes after it lands | 21:10 |
corvus | mordred: /topic | 21:10 |
mordred | hrm. now how do I see the entire topic ... | 21:10 |
corvus | mordred: in your client, if you type '/topic' it doesn't show you the whole thing? | 21:11 |
mordred | oh - I don't believe i've ever tried that | 21:11 |
mordred | yes! | 21:11 |
mordred | neat. I learned something today | 21:11 |
corvus | w00t! | 21:11 |
corvus | though, also, you could buy more monitors until it fits. displaying the entire topic is a legit business expense. | 21:12 |
*** yolanda_ has joined #zuul | 21:12 | |
mordred | corvus: do I just gaff the extra monitors to the side of my laptop? that seems heavy | 21:13 |
corvus | mordred: you have to balance them on both sides | 21:13 |
corvus | gaff will hold | 21:14 |
*** yolanda__ has quit IRC | 21:14 | |
clarkb | lenovo made a Wsomething thinkpad that had a pull out extra display iirc | 21:15 |
clarkb | mordred: on that first one I am largely going to be reviewing for things not being obviously broken | 21:22 |
mordred | clarkb: yes. this is all i think it is reasonable to do | 21:22 |
*** cmurphy_vacation is now known as cmurphy | 21:22 | |
clarkb | mordred: we added tslint but didn't remove .eslintrc? | 21:24 |
clarkb | is that something we can cleanup later? or are we using both? | 21:25 |
*** gouthamr has quit IRC | 21:25 | |
*** gouthamr has joined #zuul | 21:26 | |
mordred | clarkb: I *think* we might be using both - but I can't remember - I'll poke at cleaning that up - but we can definitely do that as a followup | 21:27 |
*** hashar has quit IRC | 21:45 | |
clarkb | mordred: so if I read this right, angular is basically just mvc and then we farm out to jquery js for the actual business logic | 21:47 |
mordred | clarkb: yes. at least for now- the longer-term intent is to make the jquery in the status page go away | 21:48 |
mordred | but that's too much for one patch | 21:48 |
clarkb | also is there something that ties AppRoutingModule in https://review.openstack.org/#/c/551989/43/web/app-routing.module.ts to the NgModule above the export? | 21:48 |
clarkb | is that just magic ts inference (or angular) | 21:48 |
mordred | clarkb: https://review.openstack.org/#/c/551989/43/web/app.module.ts | 21:49 |
clarkb | mordred: that adds approutingmodule to the NgModule there | 21:49 |
clarkb | it just seems like the NgModule in the code I linked is dangling unused code? | 21:49 |
mordred | and that's called from main.ts | 21:50 |
clarkb | I'm sure it isn't because it is where the routes are defined so there must be some implied connection | 21:50 |
mordred | that's a decorator | 21:50 |
clarkb | oh! | 21:50 |
mordred | decorating the declaration of AppRoutingModule | 21:50 |
mordred | so _actually_ if you start at main.ts - you should be able to step through all the things one way or another from imports and direct references | 21:51 |
mordred | this makes it nicer I think for us - where we'd like to be able to trace through things - but it removed a bunch of magic that previous fans of angularjs liked | 21:52 |
clarkb | ya I mean it mostly made sense, it was just the specifics in how are these things tied together I was missing, I figured they were tied together | 21:52 |
clarkb | and decorator would explain it | 21:52 |
clarkb | mordred: and you want me to +3? we are cool with tristanC's concerns (we can clean that up later I guess) | 21:53 |
*** myoung is now known as myoung|off | 21:56 | |
mordred | clarkb: yah - I think it'll be better to get the big thing in and get back to being able to make small patches | 21:56 |
mordred | clarkb: tristanC is running with some local patches applied already anyway - so we will not be breaking him | 21:56 |
clarkb | mordred: done | 22:02 |
mordred | clarkb: thank you! | 22:07 |
*** yolanda__ has joined #zuul | 22:07 | |
mordred | it feels like that patch has been in work longer than just since march | 22:08 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Translate zuulStartStream into typescript https://review.openstack.org/558618 | 22:10 |
*** yolanda_ has quit IRC | 22:10 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Shift log streaming code into StreamComponent https://review.openstack.org/558619 | 22:10 |
clarkb | corvus: tristanC I'm a little confused by https://review.openstack.org/#/c/576016/2 shouldn't the old code have returned [] if include is not in conf which would have set project_include to current_include under the old code (same as new code) | 22:11 |
clarkb | oh is the problem that frozenset([]) evaluates to a truthy value? | 22:12 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: WIP Make websocket streaming more event-driven https://review.openstack.org/558646 | 22:14 |
corvus | clarkb: it's false. i think the problem is that it needs to be distinguished from 'not set' | 22:15 |
corvus | clarkb: old code was 'if user sets project include:[], ignore that and use project group include value' | 22:16 |
corvus | clarkb: new code is 'use project include: if set (to any value including the empty list), otherwise use group include' | 22:17 |
clarkb | gotcha | 22:17 |
clarkb | though it is set to current_include in either case? | 22:17 |
clarkb | that is what I'm tripping over if that evaluates to false it should be the same value in the end for both? | 22:18 |
clarkb | oh wait I get it now | 22:18 |
openstackgerrit | Merged openstack-infra/zuul master: configloader: skip merger:cat when no items are included https://review.openstack.org/576016 | 22:18 |
openstackgerrit | Merged openstack-infra/zuul master: Add a config_errors info to {tenant}/config-errors endpoint https://review.openstack.org/553873 | 22:18 |
openstackgerrit | Merged openstack-infra/zuul master: Refactor load sensors into drivers https://review.openstack.org/549275 | 22:18 |
clarkb | the difference is == [] vs is None | 22:18 |
clarkb | because None was being mapped to [] which got current include, but now None is current include and [] is still [] | 22:19 |
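The distinction clarkb arrives at (an empty list versus "not set") can be illustrated with a short sketch. The function names and the shape of `conf` are assumptions made for the example, not the actual configloader code; only the behavior described in the conversation is modeled.

```python
from typing import Any, Dict, FrozenSet


def old_project_include(
    conf: Dict[str, Any], current_include: FrozenSet[str]
) -> FrozenSet[str]:
    # Old behavior as described: a missing key is mapped to [], so
    # "not set" and an explicit empty list both become an empty (falsy)
    # frozenset and the group-level value always wins.
    project_include = frozenset(conf.get("include", []))
    if not project_include:
        project_include = current_include
    return project_include


def new_project_include(
    conf: Dict[str, Any], current_include: FrozenSet[str]
) -> FrozenSet[str]:
    # New behavior as described: only a missing key (None) falls back
    # to the group-level value; an explicit [] stays empty.
    include = conf.get("include")
    if include is None:
        return current_include
    return frozenset(include)


group = frozenset({"job", "project"})
old_explicit_empty = old_project_include({"include": []}, group)
new_explicit_empty = new_project_include({"include": []}, group)
new_unset = new_project_include({}, group)
```

Under these assumptions, an explicit `include: []` is ignored by the old logic but honored by the new one, while an unset key falls back to the group value in both.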
*** yolanda_ has joined #zuul | 22:21 | |
*** yolanda__ has quit IRC | 22:24 | |
*** threestrands has joined #zuul | 22:38 | |
tristanC | clarkb: mordred: i'll update the angular6 fixes that are needed for multi-tenant deployment. | 22:42 |
tristanC | besides the console log of routing events, it shouldn't affect zuul.openstack.org much | 22:42 |
*** rlandy has quit IRC | 22:49 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: fix status page flickering https://review.openstack.org/578226 | 22:50 |
tristanC | oh actually this is affecting single tenant too ^ You can reproduce by clicking status, then jobs, then status, then jobs repeatedly. Then unfocus and focus again, and you'll see multiple status calls happening at once. | 22:52 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: angular6 fix suggestion https://review.openstack.org/573494 | 22:55 |
openstackgerrit | Merged openstack-infra/zuul master: Upgrade from angularjs (v1) to angular (v6) https://review.openstack.org/551989 | 22:59 |
openstackgerrit | Merged openstack-infra/zuul master: Hide queue headers for empty queues when filtering https://review.openstack.org/572588 | 22:59 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: WIP: Support skip_child_jobs via zuul_return https://review.openstack.org/578230 | 23:35 |
pabelanger | clarkb: corvus: mordred: ^first attempt to skip child jobs using zuul_return, there is still work to do to write a proper unit test, but I am not sure if I am on the right path. using zuul_return won't override the zuul.child_jobs ansible variable, if I understand properly. So that is why I created skip_child_jobs. I'll continue more on it tomorrow but want to get some code up tonight | 23:37 |
corvus | pabelanger: the variables in zuul_return are independent of what's put in the inventory. so it's fine to use the same name. | 23:39 |
corvus | (i mean, if you zuul_return something, it gets passed to the child job, but *not* if it's something under the zuul hierarchy, so that's fine for this case) | 23:40 |
corvus | (iow, should be fine to use zuul.child_jobs) | 23:40 |
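A rough sketch of the rule corvus describes: returned data is forwarded to child jobs except for anything under the `zuul` key, so a control value placed there does not leak into the child's variables. The function name `vars_for_child` and the sample data are hypothetical; this models only the description above, not Zuul's actual code.

```python
from typing import Any, Dict


def vars_for_child(returned: Dict[str, Any]) -> Dict[str, Any]:
    """Return the part of a parent's returned data a child would receive.

    Assumption for this sketch: everything is forwarded except the
    top-level "zuul" hierarchy, which is reserved for control data.
    """
    return {key: value for key, value in returned.items() if key != "zuul"}


returned = {
    "zuul": {"child_jobs": ["test-a"]},          # control data, not forwarded
    "artifact_url": "https://example.org/a",     # ordinary data, forwarded
}
child_vars = vars_for_child(returned)
```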
pabelanger | okay, thanks | 23:41 |
ianw | tristanC: oh, i see what's going on, inventory_file isn't defined when ansible doesn't think you have an inventory | 23:58 |
ianw | i wonder if it's more correct to ignore it, or copy in zuul's version. i'll throw up a patch and the peanut gallery can decide :) | 23:59 |
tristanC | ianw: thanks! | 23:59 |