openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Replace --keep-jobdirs with an IPC https://review.openstack.org/478682 | 00:05 |
---|---|---|
jeblair | okay, i think someone... was it pabelanger? had a change related to setting the homedir to jobdir/work | 00:07 |
jeblair | and i -1d it because it would be different due to bubblewrap only being on untrusted playbooks | 00:08 |
jeblair | but that's fixed now... | 00:08 |
*** Shuo has quit IRC | 00:08 | |
jeblair | but i can't find that change anymore, and i think it would fix this problem... | 00:08 |
jeblair | aha! | 00:11 |
jeblair | it was an earlier patchset of 473099 | 00:11 |
* jeblair types some things | 00:11 | |
jamielennox | IPC is better because i don't have to restart the service, but a 'keep if file exists' is way easier to coordinate from an ansible perspective | 00:21 |
jamielennox | (or something) | 00:21 |
jlk | I'm lost, I can't figure out where encoding is breaking. | 00:21 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Set HOME to work root https://review.openstack.org/478687 | 00:22 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Don't automatically mount user home in executor https://review.openstack.org/478688 | 00:22 |
jeblair | jamielennox: maybe it should also be a config file option? | 00:22 |
jeblair | mordred, SpamapS: 478687 and 478688 complete a thought from pabelanger which was before its time, but now that we bwrap everywhere, i think those make sense. i believe they will make the known_hosts module work as expected by default. | 00:23 |
jlk | I have to walk away, I'm really dejected about this :( | 00:30 |
jamielennox | jeblair: does that part of config get re-read? the goal would hopefully be to not restart the process, just something that is checked right before the delete occurs | 00:30 |
jeblair | jamielennox: oh, i thought you wanted something to persist across restarts... if you're in the same boat and you just want to switch on the debug flag, can't ansible run 'zuul-executor keep' just as easily? | 00:31 |
jamielennox | jeblair: i was thinking both, persist across restarts, but that i can enable in the middle of a job to keep the current thing | 00:32 |
jeblair | jamielennox: oh, are you thinking *within the job ansible*? | 00:32 |
jamielennox | the IPC is fine as well, just thinking that something file based was easier to automate | 00:32 |
jeblair | jamielennox: or do you mean from bonnyci admin zuul modules? | 00:33 |
jamielennox | jeblair: it wasn't thought out enough to have a complete idea, but so far we are not doing anything via zuul IPC and the goal was to have everything triggered by ansible runs | 00:35 |
jamielennox | i just know i've been caught out before by a job that's currently running and i'd like it to not be deleted, but it's too late to add the flag | 00:35 |
jamielennox | config file works great, but if the value is only read once at startup then it's not better than the --arg | 00:36 |
jamielennox | IPC is good, but it doesn't persist across restarts and targets only one executor server (pros and cons) | 00:38 |
jamielennox | i'd still be happy with IPC, was just wondering about other signals | 00:38 |
jeblair | jamielennox: gotcha. of course you can apply it to all executors with ansible. | 00:39 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Don't automatically mount user home in executor https://review.openstack.org/478688 | 00:39 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Set HOME to work root https://review.openstack.org/478687 | 00:39 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Augment --keep-jobdirs with an IPC https://review.openstack.org/478682 | 00:39 |
jeblair | jamielennox: i'm not super keen on adding another mechanism like out-of-band file checks. if the ipc doesn't work for some cases, it's probably better to spruce up config reloading. | 00:41 |
jeblair | jamielennox: but i think the ipc thing will make us both a lot happier in the meantime :) | 00:41 |
jamielennox | jeblair: config reloading is something i'd be interested in seeing anyway so can definitely wait until then | 00:42 |
jeblair | i have to eod now | 00:42 |
*** dkranz has joined #zuul | 02:52 | |
*** xinliang_ has quit IRC | 02:59 | |
*** xinliang has joined #zuul | 03:01 | |
openstackgerrit | Jamie Lennox proposed openstack-infra/zuul feature/zuulv3: Use only project name in github repo creation https://review.openstack.org/478734 | 05:08 |
*** isaacb has joined #zuul | 06:46 | |
*** jkilpatr has quit IRC | 07:33 | |
*** hashar has joined #zuul | 07:36 | |
*** openstackgerrit has quit IRC | 07:47 | |
*** smyers has quit IRC | 09:04 | |
*** smyers has joined #zuul | 09:14 | |
*** jkilpatr has joined #zuul | 10:56 | |
*** jkilpatr has quit IRC | 10:56 | |
*** jkilpatr has joined #zuul | 10:57 | |
dmsimard | I just noticed that doing a 'recheck' to re-trigger the check pipeline on a patch that already has a +verified vote doesn't remove the verified vote when the patch is enqueued. Should it? | 11:19 |
dmsimard | Example here when I did a recheck: https://review.openstack.org/#/c/478624/ | 11:19 |
mordred | jlk: I've got one thing on my plate this morning first, then i'm going to poke at your encoding issue | 11:55 |
*** dmellado has quit IRC | 12:39 | |
*** dmellado has joined #zuul | 12:42 | |
*** jkilpatr has quit IRC | 12:49 | |
*** jkilpatr has joined #zuul | 12:49 | |
*** jkilpatr has quit IRC | 13:26 | |
*** jkilpatr has joined #zuul | 13:28 | |
*** jkilpatr_ has joined #zuul | 13:34 | |
*** jkilpatr_ has quit IRC | 13:36 | |
*** jkilpatr_ has joined #zuul | 13:36 | |
*** jkilpatr has quit IRC | 13:37 | |
*** dmsimard is now known as dmsimard|afk | 13:50 | |
*** jkilpatr_ has quit IRC | 14:26 | |
*** jkilpatr has joined #zuul | 14:27 | |
jlk | mordred: yay! that'd be fantastic. | 14:35 |
*** jkilpatr has quit IRC | 15:16 | |
Shrews | you know you're having fun when you have to write to a file to debug something | 15:25 |
Shrews | \o/ | 15:25 |
jlk | heh, yeah, been there. | 15:26 |
*** jkilpatr has joined #zuul | 15:29 | |
*** isaacb has quit IRC | 15:54 | |
jeblair | clarkb: what do you think of this change? https://review.openstack.org/478682 | 15:56 |
jeblair | clarkb: will let us say 'zuul-executor keep' to turn on the keep jobdirs feature | 15:56 |
jeblair | SpamapS: can you take a look at https://review.openstack.org/478688 ? | 15:57 |
clarkb | jeblair: seems straightforward, I have approved it | 16:00 |
*** openstackgerrit has joined #zuul | 16:13 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Augment --keep-jobdirs with an IPC https://review.openstack.org/478682 | 16:13 |
Shrews | jeblair: what am i missing here during test cleanup? http://paste.openstack.org/show/614094/ | 16:14 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Set HOME to work root https://review.openstack.org/478687 | 16:14 |
jeblair | Shrews: oh, probably the gearman client that gets set up to do the call to find out the server needs to be shutdown | 16:16 |
jeblair | Shrews: client.shutdown() should do it | 16:17 |
Shrews | jeblair: yup, that was it. thx | 16:18 |
jeblair | Shrews: there's a check on test teardown that we don't leak any threads -- it helpfully lists what threads are running, and in that paste i saw "MainThread" (obviously okay) and some gearman client threads. | 16:19 |
Shrews | jeblair: i didn't realize the client started a thread | 16:20 |
Shrews | i figured it was something with gearman though | 16:20 |
jeblair | Shrews: two threads, actually -- one to reconnect in the background, and one to receive data (it then updates previously existing Job objects with data it receives and invokes callback handlers) | 16:22 |
jeblair | they idle very well | 16:23 |
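
The teardown check jeblair describes can be sketched roughly as follows. This is a minimal illustration, not Zuul's actual test code: the helper names `leaked_threads` and `run_background_worker` and the thread name `client-poll` are hypothetical, and the only library call relied on is `threading.enumerate()`.

```python
import threading


def leaked_threads(allowed=("MainThread",)):
    """Return the names of live threads that are not in the allowed set."""
    return sorted(t.name for t in threading.enumerate()
                  if t.name not in allowed)


def run_background_worker(stop_event):
    """Start a named thread that idles until stop_event is set.

    Stands in for a client that starts helper threads and must be
    shut down explicitly before the test finishes.
    """
    worker = threading.Thread(target=stop_event.wait, name="client-poll")
    worker.start()
    return worker
```

A test would call something like `leaked_threads()` during teardown and fail if the result is non-empty, which is why a client that was never shut down shows up by thread name.
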
*** Shuo has joined #zuul | 16:32 | |
Shuo | Is there an openstack release management workflow team channel? | 16:35 |
jeblair | mordred: woo! log publishing worked, except that the finger url is what got reported | 16:35 |
mordred | jeblair: woot! that's just a config issue right? we don't have a success url configured? | 16:36 |
jeblair | Shuo: #openstack-release | 16:36 |
jeblair | mordred: i think we do, digging in now. | 16:36 |
mordred | ah - cool | 16:36 |
mordred | jeblair: I'm excited! | 16:36 |
Shuo | jeblair: thanks. | 16:36 |
Shuo | jeblair: I would guess topics such as https://about.gitlab.com/2014/09/29/gitlab-flow/, related to developers' workflow definition, are also discussed/advised there | 16:38 |
mordred | Shuo: we mostly avoid long-lived topic or feature branches in openstack | 16:40 |
mordred | Shuo: zuul's feature/zuulv3 is an aberration and not our preferred mode of operation | 16:40 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: WIP: Add web-based console log streaming https://review.openstack.org/463353 | 16:42 |
mordred | Shuo: but yes - they are usually good people to discuss developer workflow with - as is #openstack-infra | 16:43 |
Shrews | mordred: jeblair: so in theory, that websocket test in ^^^ should pass now. Only problem is we need a way to get the finger port for a particular executor. Should that maybe go in the Worker class along with the hostname? | 16:43 |
jeblair | mordred, clarkb: i need to land https://review.openstack.org/478566 or will go insane :) | 16:43 |
Shuo | mordred: I don't understand release management much (my view point before was mainly from a developer's perspective), but right now I am trying to understand what the high-level workflow looks like in OpenStack community's practice (is there a gerrit workflow, like the github workflow or gitlab workflow diagram shown in the above link)? | 16:43 |
Shrews | i've hardcoded the test port for now, which sucks | 16:43 |
jeblair | Shuo, mordred: this conversation may be more appropriate for #openstack-infra | 16:44 |
mordred | Shuo: https://docs.openstack.org/infra/manual/developers.html is our guide for that - but yes, it's a good conversation for #openstack-infra | 16:44 |
Shuo | jeblair: will get discussed there, thanks. | 16:44 |
jeblair | Shrews, mordred: https://review.openstack.org/473103 is relevant | 16:44 |
*** Shuo has quit IRC | 16:45 | |
Shrews | jeblair: ooh. cool. also, i noticed that the test is getting None for worker_hostname. Is that expected? | 16:45 |
mordred | jeblair: landed | 16:46 |
mordred | Shrews: feel free to take that patch over - | 16:46 |
* SpamapS finally sitting down | 16:46 | |
jeblair | i pointed an error out on its parent which is causing the test failures | 16:47 |
jeblair | Shrews: hrm, i don't know? i didn't think we were overriding the hostname in the tests | 16:49 |
Shrews | jeblair: it turns out it works since passing None for hostname to asyncio.open_connection() appears to equate to localhost, but just thought that was weird that it was None | 16:50 |
Shrews | mordred: that patch is apparently part of a bunch of related changes. I don't want to take those over, too. Can it be safely extracted? | 16:51 |
Shrews | child doesn't seem to depend on it, so maybe so | 16:52 |
mordred | Shrews: you need the one parent | 16:52 |
mordred | Shrews: but not both parents | 16:53 |
mordred | I believe | 16:53 |
mordred | so if you grab the worker_name patch - and the finger_port patch | 16:53 |
jlk | I'm really never going to stop laughing at "finger_port". I should grow up some day | 16:55 |
* Shrews is going to do gym things for a while | 16:56 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul feature/zuulv3: Re-enable test_merge_conflict_reports https://review.openstack.org/468670 | 17:02 |
SpamapS | jlk: the whole protocol is just a 30+ year joke at this point. | 17:03 |
jlk | that's cool, so am I | 17:03 |
jeblair | remote: https://review.openstack.org/479006 Zuulv3: Fix log urls | 17:04 |
jeblair | clarkb, mordred: ^ that fixes a formatting error that was causing us to report the finger url instead of the published url | 17:05 |
jeblair | (you can't use slices in the str.format() mini-language) | 17:05 |
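
A small sketch of the `str.format()` limitation mentioned above. The `logs/...` path layout and both function names are hypothetical and are not taken from the actual change; the point is only that text inside `[]` in a replacement field is treated as a literal key or integer index, never as a slice.

```python
def format_with_slice(build_uuid):
    """Attempt to slice inside the replacement field; this raises."""
    # ':2' is looked up as the literal string key ':2', not parsed
    # as a slice, so indexing a str with it fails.
    return "logs/{uuid[:2]}/{uuid}".format(uuid=build_uuid)


def format_preslice(build_uuid):
    """Slice in ordinary Python first, then hand the result to format()."""
    return "logs/{prefix}/{uuid}".format(prefix=build_uuid[:2],
                                         uuid=build_uuid)
```

Computing the slice before calling `format()` is one way around the restriction; an f-string would be another on Python 3.6 or later.
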
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Create a new logger for gerrit IO https://review.openstack.org/478566 | 17:08 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Don't automatically mount user home in executor https://review.openstack.org/478688 | 17:11 |
SpamapS | jeblair: ^ | 17:12 |
SpamapS | restart should allow us to go forward with known_hosts now or is there another piece? | 17:13 |
jeblair | SpamapS: the parent patch was enough to fix known_hosts -- i restarted it and it looks like it works. i'm just fixing up the link formatting now. when that lands, we should be all set on log publishing and reporting | 17:14 |
SpamapS | AHH ok | 17:15 |
* SpamapS lost the thread a little on that | 17:15 | |
SpamapS | mordred: I'm doing a little project board cleanup. I noticed you submitted https://review.openstack.org/#/c/473811/ a bit ago, but it has stagnated. 1) Is it definitely needed to go live with Zuul, and 2) are you blocked? Do you want somebody else to grab it? etc. | 17:20 |
SpamapS | (It's attached to https://storyboard.openstack.org/#!/story/2000782) | 17:20 |
mordred | SpamapS: it is not needed to go live | 17:21 |
mordred | SpamapS: but also - if someone else wants to grab it that would not bother me at all | 17:22 |
SpamapS | cool, I'm going to start tagging stuff like that with zuulv3.0 and we can talk in a meeting about removing the zuulv3 tag from any of the things that are definitely "before we release 3.0" and that aren't critical. | 17:22 |
SpamapS | So it'll stay on the board and everything, just trying to whittle it down so we can narrow focus. | 17:23 |
mordred | SpamapS: ++ | 17:23 |
SpamapS | adam_g: do you think you'll have time to work on https://storyboard.openstack.org/#!/story/2000879 ? (Disk space monitoring thing) I notice you grabbed it on May 5. Stuff may have changed since then. :-P | 17:25 |
* SpamapS gives up on backporting bubblewrap to 16.04 | 17:30 | |
SpamapS | looks like ubuntu backports is dead | 17:30 |
mordred | SpamapS: :( | 17:31 |
SpamapS | Yeah, and I honestly don't know how much longer PPA's will be a thing. | 17:31 |
jeblair | SpamapS: thanks for trying; i'm sorry to hear that, though i don't think that will be much of a hardship for zuul users at this point. | 17:31 |
SpamapS | jeblair: agreed.. the PPA bridges us to Ubuntu 18.04. Fedora has us covered in the most recent releases. | 17:32 |
SpamapS | And Debian released with it. | 17:32 |
mordred | SpamapS: oh - it's in the latest debian? | 17:33 |
SpamapS | mordred: aye | 17:33 |
SpamapS | 0.1.7, which is sufficient for us | 17:33 |
SpamapS | we didn't really \o/ for Debian on 6/17 but they totally released | 17:34 |
SpamapS | which I believe is 4 in a row on the 2 year schedule | 17:35 |
SpamapS | Pretty awesome to see Debian take the cadence seriously. | 17:35 |
*** maxamillion has joined #zuul | 17:36 | |
jeblair | mordred: remote: https://review.openstack.org/479013 Zuulv3: Fix log urls more | 17:37 |
jeblair | mordred: pretty sure that's the last thing. | 17:37 |
SpamapS | We've had a few months of ongoing observation of zuulv3.. anybody know if https://storyboard.openstack.org/#!/story/2000899 still happens? (Jobs not aborting on executor failure) | 17:37 |
jeblair | SpamapS: i haven't been paying a lot of attention while stopping/starting the executor. let me do that right now while some jobs are running. | 17:38 |
mordred | jeblair: +3 | 17:38 |
jeblair | SpamapS: i kill -9'd the executor and do in fact see that the nodes are still locked and in use. so let's leave it. | 17:41 |
SpamapS | jeblair: ACK, adding that as a data point. | 17:46 |
mordred | jeblair: do the locks need to timeout before they fall away due to missed zk heartbeats from the locker? | 17:50 |
jeblair | mordred: no -- the scheduler holds the locks, so this is likely a case of the scheduler not doing the right thing (or maybe even not noticing?) when the executor dies. | 17:52 |
jeblair | mordred: i didn't want to dig further into it right now because the scheduler logs are swamped in gerrit io; hopefully that will get better in a few minutes. :) | 17:53 |
tobiash | hi, am I remembering right that zuul@openstack talks to gerrit for queries and a mirror for git operations? | 17:55 |
tobiash | zuulv3 seems to do much more git operations and I'm thinking about creating a mirror near zuul for hitting gerrit less | 17:56 |
jeblair | tobiash: zuul actually only talks to gerrit -- in v2, some of the git operations that our jobs do, using zuul-cloner, use the git mirrors to clone and update repos (before fetching changes from the mergers) | 17:56 |
jeblair | tobiash: in v3, the idea is that the mergers and executors talk to gerrit to keep their local repos up to date, and then push those out to jobs. so there are a lot of git operations, however, since the mergers usually have nearly up-to-date repos, they should not have much work to do. | 17:57 |
jeblair | tobiash: we do have an apache mirror in front of our gerrit server; i'm actually not sure if zuulv3 is hitting that or gerrit itself. but that might be an option if we need to protect gerrit more. | 17:59 |
mordred | yah - we keep a local git mirror on the same host as gerrit and do some apache rewrite to serve that ... | 17:59 |
mordred | tobiash: http://git.openstack.org/cgit/openstack-infra/puppet-gerrit/tree/templates/gerrit.vhost.erb#n78 | 18:00 |
mordred | tobiash: is the section where we tell it to serve git refs from the mirror instead of directly out of gerrit | 18:00 |
tobiash | jeblair, mordred: ah, looks like I misunderstood the canonical hostname setting of the gerrit driver | 18:01 |
mordred | oh - sorry - it actually starts earlier at 61 | 18:01 |
tobiash | (unfortunately I have no influence on our gerrit) | 18:01 |
mordred | tobiash: yah - that, for us, is because our developers clone (or should clone) repos from git.openstack.org ... | 18:01 |
mordred | so for things like go, where things need to go into filesystem paths that match the git url - it's important that we clone to git.openstack.org/foo/bar on the filesystem instead of review.openstack.org/foo/bar | 18:02 |
jeblair | http://docs-draft.openstack.org/94/477594/4/check/gate-zuul-docs-ubuntu-xenial/c9e1704//doc/build/html/admin/drivers/gerrit.html#connection-configuration | 18:02 |
jeblair | does that documentation look right? | 18:02 |
jeblair | i guess it could use more words :) | 18:02 |
tobiash | and our ops can be pretty fast in banning us from the gerrit ;) | 18:03 |
tobiash | jeblair: "This is used to identify repos from this connection by name and in preparing repos on the filesystem for use by jobs. " | 18:03 |
tobiash | the second part of the sentence made me believe that | 18:03 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Clarify canonical_hostname documentation https://review.openstack.org/479020 | 18:09 |
jeblair | tobiash, mordred: ^ | 18:10 |
* SpamapS should dive down the docs patch stack soon | 18:11 | |
tobiash | jeblair: found gerrit in github section | 18:12 |
tobiash | apart from that: Thanks, now it's totally clear :) | 18:13 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Clarify canonical_hostname documentation https://review.openstack.org/479020 | 18:13 |
jeblair | tobiash: fixed :) | 18:13 |
tobiash | jeblair: +1 :) | 18:13 |
*** dmsimard|afk is now known as dmsimard | 18:13 | |
jeblair | mordred: remote: https://review.openstack.org/479024 Zuulv3: Fix log urls even more | 18:18 |
jeblair | mordred: i think i probably could have written the "pass the build url back through a json file" patch faster than this "quick and dirty just copy it in the job def" approach. | 18:19 |
jeblair | win some lose some | 18:19 |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Implement Depends-On for github https://review.openstack.org/474401 | 18:30 |
jlk | jeblair: mordred: SpamapS: jamielennox: ^^ that's a re-do of Depends-On, this time using the PR body as discussed. I was able to implement the reverse search as well. I haven't been able to fully test it with live github stuffs, because of the broken webhook handler at the moment. but, unit tests pass! | 18:31 |
SpamapS | jlk: kachow | 18:32 |
jeblair | mordred, SpamapS: check out the last report on https://review.openstack.org/478576 | 18:45 |
jeblair | we have published logs to the production log server, and linked there | 18:46 |
jeblair | i think we should go ahead and land that change, and after lunch, i'll look into why they're showing up inside of an extra 'logs/' directory. | 18:46 |
mordred | jeblair: wow. I never thought I'd be so excited to see a build log!!! | 18:47 |
clarkb | I'm going to guess rsync trailing / behavior? | 18:48 |
clarkb | I always get ^ wrong and have to read the manpage | 18:48 |
jeblair | clarkb: probably, though i thought i had addressed that. | 18:48 |
clarkb | that and ln target dest being the wrong way around always trip me up | 18:48 |
jlk | I have to man ln, every time. | 18:50 |
mordred | jeblair: I believe it's because zuul.executor.log_root is workspace/logs | 18:50 |
mordred | oh - I think I just said clarkb's thing but with more words | 18:51 |
jeblair | mordred: yeah, i thought i checked that in an earlier test, but i must have missed something along the way | 18:51 |
jeblair | 18:51 < openstackgerrit> James E. Blair proposed openstack-infra/openstack-zuul-roles master: Fix synchronize paths in upload logs https://review.openstack.org/479037 | 18:52 |
jeblair | mordred, clarkb: ^ i think that will do it | 18:52 |
mordred | jeblair: wfm | 18:52 |
jeblair | yep that fixed it | 18:59 |
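
The trailing-slash behavior clarkb refers to can be modeled in a few lines. This is only a sketch of rsync's documented rule, not the contents of change 479037; the function name `rsync_destination` and the example paths are hypothetical.

```python
import posixpath


def rsync_destination(source, dest, relative_file):
    """Model where a recursive rsync places source/relative_file under dest.

    Without a trailing slash the source directory itself is copied, so
    its name becomes an extra path component under dest. With a trailing
    slash only the directory's contents are copied.
    """
    if source.endswith("/"):
        return posixpath.join(dest, relative_file)
    return posixpath.join(dest, posixpath.basename(source), relative_file)
```

With a source of `work/logs` the file lands under an extra `logs/` directory at the destination, while `work/logs/` puts it directly in the destination, which matches the symptom described above.
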
jeblair | mordred, clarkb, fungi, SpamapS: https://review.openstack.org/479038 our v3 has working log publishing! | 18:59 |
* mordred hands jeblair a chicken | 18:59 | |
* jeblair eats a chicken | 19:00 | |
fungi | woah! | 19:00 |
mordred | jeblair: and the output of the docs job shows that we don't die when the job sends weird binary too. so yay! | 19:00 |
fungi | i take it the build id there won't conflict with the zuul 2.x build id and so similarly-named jobs running in v3 won't suffer log collisions while we have this going in parallel? | 19:02 |
mordred | fungi: they shouldn't - unless uuid fails us | 19:04 |
fungi | righty-o | 19:04 |
clarkb | if that happens buy a lottery ticket | 19:05 |
mordred | it makes me giggle that this zuul v3 log output contains: | 19:06 |
mordred | export HUDSON_PUBLISH_DOCS=1 | 19:06 |
clarkb | wow hufson | 19:06 |
mordred | yah man | 19:09 |
mordred | we're old school around here | 19:09 |
jeblair | clarkb: i totally think we should spell hudson with a long-s | 19:09 |
jeblair | hudfon (because i can't unicode and eat a chicken at the same time) | 19:09 |
fungi | hudſon? | 19:09 |
jeblair | that's the one | 19:10 |
* fungi recommends hudßon | 19:10 | |
fungi | owing to english's germanic roots | 19:10 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Use worker_name for job cancellation and remove manager https://review.openstack.org/474288 | 19:13 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Support finger ports in finger URL https://review.openstack.org/473103 | 19:13 |
Shrews | mordred: I rebased those ^^^ and fixed a thing jeblair spotted in the 2nd. Not sure why the first one failed so many jobs before. | 19:15 |
mordred | Shrews: i'm going to blame the chicken I just gave jeblair to eat | 19:16 |
Shrews | mmmm, chicken | 19:16 |
Shrews | though I've been working on my pan seared/oven finished steak technique lately | 19:19 |
Shrews | oh, jeblair explained why | 19:26 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Use worker_name for job cancellation and remove manager https://review.openstack.org/474288 | 19:31 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Support finger ports in finger URL https://review.openstack.org/473103 | 19:43 |
Shrews | should pass now | 19:43 |
Shrews | gah. different fails on the 2nd | 19:53 |
jeblair | yeah just looking at that | 19:53 |
jeblair | that is really weird. i don't see what would cause that. | 19:56 |
Shrews | jeblair: i'm confused as to why the finger comparison string changed in the failed tests (at least in test_disabled_at) | 19:57 |
Shrews | yah | 19:57 |
Shrews | is something with the urls now a list, causing the change in output? | 19:57 |
Shrews | project-test1 ['finger://zl.example.com/4de09d998fe745be9302810853d2f6d6'] | 19:58 |
Shrews | looks like a list of urls | 19:58 |
* Shrews stabs in dark | 19:58 | |
jeblair | right, but i don't see anything in 473103 that would change it to a list | 19:58 |
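
The bracketed output Shrews pasted is what Python produces when a list is formatted where a string was expected. A minimal illustration, with a hypothetical `report_line` helper and a shortened placeholder build id:

```python
def report_line(job_name, url):
    """Format a job name and its URL the way a report line might."""
    # If url is accidentally a list, str.format() renders the list's
    # repr, including the brackets and quotes.
    return "{} {}".format(job_name, url)


as_string = report_line("project-test1", "finger://zl.example.com/abc")
as_list = report_line("project-test1", ["finger://zl.example.com/abc"])
```

Here `as_list` carries the `['...']` wrapper, so a string comparison in a test fails even though the URL itself is unchanged.
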
jeblair | i see it | 20:00 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Use worker_name for job cancellation and remove manager https://review.openstack.org/474288 | 20:00 |
jeblair | Shrews: left comment | 20:01 |
Shrews | haha. thx jeblair | 20:01 |
jeblair | Shrews: also, worker_log_port doesn't actually have to be in the conditional, right? | 20:02 |
jeblair | we can just always set it to log_streaming_port in the data={} block above it i think | 20:02 |
jeblair | mordred: with the log stuff out of the way, are you blocked on anything from me? | 20:03 |
jeblair | mordred: if not, i'll start on my punch list | 20:03 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Support finger ports in finger URL https://review.openstack.org/473103 | 20:03 |
Shrews | jeblair: oh, hrm, lemme look | 20:03 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Support finger ports in finger URL https://review.openstack.org/473103 | 20:04 |
Shrews | errr, inconsistency with the hostname too | 20:05 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Support finger ports in finger URL https://review.openstack.org/473103 | 20:06 |
Shrews | boy, who wrote this code? ;) | 20:06 |
jeblair | Shrews: sorry, i may not have been clear about the worker port thing; left a comment | 20:15 |
jlk | mordred: did you get anywhere with the webob encoding issue? | 20:15 |
Shrews | jlk: oh, you were clear. i just read too quickly | 20:19 |
Shrews | fixing | 20:19 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Support finger ports in finger URL https://review.openstack.org/473103 | 20:20 |
Shrews | fixed the pep8 thing too | 20:20 |
jeblair | +3 (carrying over mordred's +2) | 20:21 |
Shrews | oh geez. all of that work, and i get back zl.example.com:79 for the finger address in tests | 20:28 |
Shrews | which is ever so helpful | 20:28 |
jeblair | Shrews: yeah, that's overridden in the base test class. we can stop doing that, we'll just need to update a handful of assertions to use the actual hostname | 20:30 |
Shrews | jeblair: not sure what to do about the log streaming port since I start that by hand. can i just set self.executor_server.log_streaming_port in the test? | 20:31 |
Shrews | hrm, no. that doesn't seem to work | 20:33 |
mordred | jlk: sorry- my morning thing took longer than I was expecting - I'm only just now getting to helping with your thing | 20:35 |
jlk | alrighty | 20:36 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Support finger ports in finger URL https://review.openstack.org/473103 | 20:37 |
mordred | jeblair: no- I am not blocked on anything from you - thank you for making logs happen!!! | 20:40 |
jeblair | Shrews: what do you mean start by hand? | 20:41 |
jeblair | Shrews: oh, i see, yeah i'd just do that for now. | 20:42 |
jeblair | Shrews: (later, we can make a fixture for the log streamer that uses a random port, and then start that fixture before the executor server, and actually pass it in with the constructor as intended) | 20:43 |
Shrews | jeblair: except it doesn't work for some reason | 20:46 |
Shrews | maybe setting it in the test is too late | 20:47 |
jeblair | Shrews: did you set it at the beginning of the test case before you start the job? | 20:49 |
Shrews | i did not. trying | 20:49 |
Shrews | hazzah! | 20:50 |
* Shrews sees mordred's gift of chicken and raises him a fatted calf for jeblair | 20:50 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add web-based console log streaming https://review.openstack.org/463353 | 20:51 |
mordred | zomg. that doesn't have WIP in front of it! | 20:51 |
Shrews | Finally removed the "WIP"! | 20:51 |
Shrews | \o/ | 20:51 |
Shrews | *dance* *dance* *dance* | 20:51 |
Shrews | we'll have to see if changing the default hostname from zl.example.com breaks anything | 20:52 |
Shrews | changed all of the tests using it to reference self.executor_server.hostname | 20:53 |
* Shrews wishes good luck to whomever decides to review that | 20:54 | |
Shrews | mordred: btw, i'm not _entirely_ sure why you have that etc/console.html file there. an example, maybe? | 20:58 |
Shrews | oh neat. that passed all tests except for test_bubblewrap_leak | 21:03 |
Shrews | SpamapS: ^^^ ?? | 21:03 |
* Shrews will now EOD. ciao | 21:09 | |
mordred | Shrews: left some comments in the patch - one of them was, in fact, about etc/console.html | 21:10 |
SpamapS | Shrews: I'll look at the fail | 21:15 |
SpamapS | I wonder if there's a race with --die-with-parent | 21:20 |
SpamapS | you know.. I'm not sure we actually want --die-with-parent.. and I wonder if we should use runc to ensure all processes from the bubblewrap are dead instead. | 21:23 |
SpamapS | (runc, or something else that will wrap the bwrap and its children in a cgroup) | 21:23 |
jlk | haha, doesn't bubble wrap them in a cgroup? | 21:25 |
jlk | how many layers can we put around this? :D | 21:25 |
SpamapS | I know | 21:26 |
SpamapS | well here's the thing. --die-with-parent actually makes the kernel send the forked pid1 bwrap a SIGKILL when the outside-the-container bwrap dies. | 21:26 |
SpamapS | SIGKILL seems like the wrong thing | 21:27 |
SpamapS | but I dunno if a TERM would actually make it clean up its own processes. | 21:27 |
jeblair | SpamapS: by race, do you mean that the pid1 hasn't gotten around to killing its group children yet, even though bwrap has died? | 21:29 |
SpamapS | jeblair: right, I am worried that's what's happening. I'm tracing through how the bwraps are coordinated now. | 21:29 |
jeblair | SpamapS: maybe we should increase leak_time in that test, and then have a grace period for the kill to happen, and assume that the test is valid as long as it completed in less time than leak_time | 21:30 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add TenantProjectConfig object https://review.openstack.org/479073 | 21:31 |
SpamapS | jeblair: I'm not finding where bubblewrap even kills the sleeper .. did we verify that this ever completed in under 7 seconds? | 21:35 |
SpamapS | I mean, reading all the cmdlines in /proc shouldn't take 7 seconds | 21:35 |
SpamapS | but... | 21:35 |
SpamapS | stranger things | 21:35 |
jeblair | SpamapS: i don't think we did | 21:41 |
SpamapS | hrm.. now how do I inspect a subunit file to find out | 21:57 |
SpamapS | ah, subunit2csv | 22:01 |
SpamapS | tests.unit.test_bubblewrap.TestBubblewrap.test_bubblewrap_leak,success,2017-06-29 20:32:04.448628+00:00,2017-06-29 20:32:04.483717+00:00 | 22:02 |
SpamapS | very fast as expected, so it's not that | 22:03 |
jeblair | SpamapS: that's from the run that failed? | 22:08 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Permit config shadowing https://review.openstack.org/479084 | 22:09 |
jeblair | mordred: fun fact: you can't inherit from a job that's defined after you in the config. which means i can only implement about half of what i wanted to do with shadowing (you can't override a remote job locally but still inherit from it). however, i think that's enough to support the base job in zuul-jobs, and give us the escape valve we need to make shared jobs safe. we can look into extending job inheritance to be forward-looking later. | 22:11 |
SpamapS | jeblair: no that's from a successful run | 22:15 |
SpamapS | jeblair: I checked 5 more. They're all about that fast. And the test runs fast locally. | 22:15 |
SpamapS | so yeah, whatever is killing the sleep probably just raced with the SIGKILL | 22:15 |
jeblair | SpamapS: ok cool. | 22:15 |
SpamapS | we could spin a couple of seconds making sure the inner bwrap is dead before we look for leaks. | 22:16 |
SpamapS | or even actually just loop until the inner bwrap is gone | 22:17 |
*** hashar has quit IRC | 22:23 | |
*** jamielennox has quit IRC | 22:57 | |
*** jamielennox has joined #zuul | 23:03 | |
jlk | so I have a question on cross-repo dependencies, and how to actually _use_ them at test time. If I understand it right, setting up a crd will cause both repos to get cloned into the executor (and presumably pushed to test node). | 23:12 |
jlk | I assume that there is a test written that will combine both sources into an "install" for the sake of testing, but how would that test normally work if there wasn't a crd established? How does one write a test that works both ways? | 23:13 |
SpamapS | jlk: that's a great point. :) | 23:15 |
mordred | jlk: if you make a test that operates on more than one repo - zuul will clone both repos regardless of whether a depends-on has been added | 23:15 |
SpamapS | they have to share a queue too right? | 23:16 |
SpamapS | otherwise they're not "the same job" | 23:16 |
SpamapS | (I assume you meant 'make a job') | 23:16 |
mordred | yah | 23:16 |
* SpamapS looks at the crd tests in test_scheduler | 23:17 | |
mordred | so, considering shade+ansible ... | 23:17 |
jlk | how do you make a test that operates on more than one repo? | 23:18 |
jlk | rather, how do you convince zuul to clone them both? | 23:18 |
mordred | the install step would do pip install src/github.com/ansible/ansible and pip install src/git.openstack.org/openstack-infra/shade | 23:18 |
mordred | jlk: you put things into required_projects on the job | 23:18 |
SpamapS | hrm non looks at the stuff under | 23:18 |
jlk | ah, required_projects, that's the ticket that I didn't know about | 23:19 |
mordred | check out tests/fixtures/layouts/repo-checkout-four-project.yaml: | 23:19 |
SpamapS | fyi, 'required_projects' does not appear in any tests | 23:19 |
* SpamapS tries not to stare at the red flag | 23:19 | |
mordred | SpamapS: sorry - it's 'required-projects' in the yaml | 23:19 |
SpamapS | phew | 23:19 |
* SpamapS should have thought of that | 23:19 | |
jlk | right, okay so nominally zuul will clone... master? | 23:20 |
mordred | yah - so we'll have job: shade-ansible-src\n required-projects: - ansible/ansible\n - openstack-infra/shade | 23:20 |
mordred | yes | 23:20 |
jlk | but in the case of a Depends-On, it'll present the change | 23:20 |
mordred | yup | 23:20 |
jlk | Can you define what branch it'll nominally take? | 23:20 |
jlk | (thinking of Ansible and its use of "devel") | 23:21 |
mordred | override-branch is, I believe, the param | 23:21 |
jlk | coolio | 23:21 |
mordred | see tests/fixtures/layouts/repo-checkout-six-project.yaml for example | 23:21 |
SpamapS | should not be 'master' but the default branch that the repo uses | 23:21 |
mordred | I do not know - I remember some discussions around that at some point, but do not know where that came down | 23:22 |
SpamapS | Looking at ExecutorClient.execute ... I believe it maintains a set() for no good reason | 23:22 |
mordred | I know override-branch is useful for things like "I want to test master of shade against stable/newton of openstack services" | 23:22 |
SpamapS | ah n/m | 23:22 |
SpamapS | it uses it to add the actual project from the triggered item | 23:22 |
SpamapS | hm | 23:23 |
mordred | lucky for us ansible/ansible will give us some early testing of "we don't have a master branch" | 23:23 |
SpamapS | I think it might actually use it to add projects from any depended on change.. even if it's not in required-projects | 23:23 |
*** Shuo has joined #zuul | 23:23 | |
SpamapS | yep... | 23:23 |
SpamapS | Looks to me that as long as the project exists in the config you can depend on it and expect the repo on your test node | 23:24 |
SpamapS | not sure that's important | 23:24 |
SpamapS | but I'm not sure why we do that either. | 23:24 |
Shuo | jeblair and zuul team, can zuul work with repo/review frontend like github and gitlab? was there any discussion/thoughts around the wider applicability? | 23:26 |
jlk | Shuo: funny you should ask that. | 23:26 |
jlk | Shuo: part of the v3 effort is to make the interfaces a driver construct, to make way for more drivers than gerrit. There was enough time that we were able to implement a driver for github. | 23:27 |
jlk | Shuo: supporting public github (via user API key or an "App") and github enterprise. | 23:27 |
Shuo | jlk: great to hear that! | 23:28 |
SpamapS | gitlab should be far simpler now that github has been shimmed in | 23:28 |
SpamapS | And I think adam_g is working out how to make Github Enterprise work. | 23:28 |
Shuo | jlk: but for gitlab, they seem to be building a Gitlab Runner, which seems to be serving this CI/CD farm role https://about.gitlab.com/features/gitlab-ci-cd/ how do you view that? | 23:29 |
jlk | SpamapS: IIRC that already works, we did that in the v2.5 patch set for the short time we were on GHE ourselves. | 23:29 |
SpamapS | jlk: I am sure there are gotchas in there. :) | 23:29 |
SpamapS | But yeah, I'd expect they're minor. | 23:29 |
jlk | Shuo: Gitlab does have a CI/CD thing. I haven't looked too deeply, but when I last looked it lacked many of the key features that led to Zuul's creation. | 23:29 |
jlk | so it's neat that it exists, but I don't see its existence as a reason to not implement a gitlab driver for zuul (provided there is some interest out there) | 23:30 |
jlk | gitlab could look at the zuul feature set and choose to implement them in their tool, the joys of open source. | 23:30 |
SpamapS | Github has Travis | 23:32 |
SpamapS | But we persist anyway ;) | 23:32 |
jlk | yeah, except Travis is its own company, much like Shippable, CircleCI, Codeship, etc. | 23:33 |
Shuo | jlk and SpamapS: I am asking to try to answer the questions that I may be facing internally :-) | 23:35 |
jlk | oh what.... | 23:35 |
jlk | Shuo: I might be wrong here, but it appears that Gitlab's CI service is triggered by /commit/, and is all post-commit actions | 23:35 |
jlk | rather than running tests for a change request and finding errors _before_ code is committed. | 23:35 |
SpamapS | gitlab's runner looks pretty nice, basic, mostly like Travis. | 23:37 |
SpamapS | https://docs.gitlab.com/runner/#using-gitlab-runner | 23:37 |
SpamapS | good starting point | 23:37 |
jeblair | mordred, SpamapS: i'm picking up SpamapS's ssh key change -- 465107 | 23:38 |
jeblair | first thing that occurs to me, is that i think we may not be running post playbooks when pre-playbooks fail. we should probably run the posts for all of the pre's which have succeeded. | 23:39 |
jeblair | ie, if we run base pre, but fail tox pre, we should not run tox post, but we should run base post. | 23:39 |
mordred | jeblair: agree | 23:40 |
jeblair | 2017-06-29 23:35:05.135180 | TASK [add-build-sshkey : Check to see if ssh key was already created for this build _variable_params={{ zuul_temp_ssh_key }}] | 23:40 |
jeblair | 2017-06-29 23:35:05.157073 | ubuntu-xenial -> localhost | ok: Results: => {"changed": false, "failed": false, "failed_when_result": false, "msg": "Executing local code is prohibited"} | 23:40 |
jeblair | mordred, SpamapS: ^ the stat call fails, but i wouldn't expect it to -- it's stating a file that should be in the workspace | 23:41 |
mordred | jeblair: stat is just a normal ansible command | 23:42 |
mordred | jeblair: which means it's going to hit the normal action plugin restriction | 23:42 |
mordred | which is "no code execution on localhost" ... we could put in a carve out for stat since we know it's safe | 23:43 |
jeblair | gotcha | 23:43 |
mordred | or - a carve out for stat with a file exclusion, of course | 23:43 |
Shuo | jlk: "...triggered by /commit/", but that can be worked around by having some kind of additional branch (say, branch of 'speculative-merge' or something like that) | 23:43 |
jlk | Shuo: maybe, but it runs into one of the problems that zuul solves | 23:44 |
jlk | Shuo: let's say you create that additional branch, and you test that commit. But time passes before you merge, and your target has moved on | 23:44 |
Shuo | jlk: agree. | 23:45 |
jlk | if you merge, you're now merging untested code. It's untested in the way it'll be merged. Zuul tests things _as_ they would merge, and since it's the one merging, it can ensure that what's tested is what's merged. | 23:45 |
Shuo | we have review tools like gitlab and perforce -- if zuul could be a multi-faceted speculative CI engine, that would be great. Not sure how hard the perforce integration might be, or whether it is possible at all | 23:49 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Carve out for stat https://review.openstack.org/479096 | 23:49 |
mordred | jeblair: ^^ totally untested - but something like that might work | 23:49 |
jlk | well, the goal is to be multi-faceted in that way, supporting gerrit, github, gitlab, etc. | 23:50 |
clarkb | perforce may be significantly more difficult though | 23:50 |
jeblair | mordred, SpamapS: can we go ahead and merge 465107? i'll need to invoke it from a trusted repo to get it to work (since it also runs ssh-keygen on the executor) | 23:50 |
clarkb | it's not open source right? so how do you test it etc | 23:51 |
clarkb | also data models differ I'm sure | 23:51 |
jeblair | perforce has a git gateway | 23:51 |
jeblair | perhaps zuul could work with that. i dunno. | 23:51 |
mordred | jeblair: +2 from me | 23:52 |
*** Shuo has quit IRC | 23:53 | |
jeblair | oh wow the linters job failed it :) | 23:54 |
jeblair | first time i've seen that :) | 23:54 |
mordred | ooh | 23:54 |
mordred | jeblair: that's neat! | 23:55 |
jeblair | okay, updated and +3d | 23:56 |
mordred | cool | 23:56 |
jlk | a git gateway is neat, but more important is a way to get events, and report results | 23:59 |
jlk | (and deal with cross-repo deps discovery) | 23:59 |
clarkb | looks like you can import perforce repos into git too and operate on them that way | 23:59 |
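
SpamapS's suggestion at 22:16-22:17 (loop until the inner bwrap is gone before looking for leaks) could look roughly like the sketch below. It is not code from the zuul test suite: `pid_alive`, `wait_until_gone`, and the default timeout and interval are hypothetical names and values chosen for illustration.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this pid currently exists.

    Note: an unreaped (zombie) child still counts as existing here.
    """
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_until_gone(pid, timeout=5.0, interval=0.05, is_alive=pid_alive):
    """Poll until `pid` has exited or `timeout` seconds have elapsed.

    Returns True if the process went away in time, False otherwise.
    `is_alive` is injectable so the loop can be exercised without real processes.
    """
    deadline = time.monotonic() + timeout
    while is_alive(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

A test would call `wait_until_gone(inner_bwrap_pid)` after sending the SIGKILL and only then scan for leaked processes, which bounds the wait instead of relying on a fixed sleep.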
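
The rule jeblair describes at 23:39 (run the post playbooks only for the pre-playbooks that succeeded) can be sketched as a small selection function. This is an illustration, not the executor's implementation: `posts_to_run` is a hypothetical name, and running the posts in reverse order of the pres is an assumption.

```python
def posts_to_run(pre_results):
    """Pick which jobs' post playbooks should run.

    `pre_results` is a list of (job_name, succeeded) tuples in the order the
    pre-playbooks ran, base job first. Only jobs whose pre-playbook succeeded
    get a post run; everything from the first failure onward is skipped.
    The result is ordered most-derived first (an assumption for this sketch).
    """
    succeeded = []
    for name, ok in pre_results:
        if not ok:
            break  # later pre-playbooks never ran, so they get no post either
        succeeded.append(name)
    return list(reversed(succeeded))
```

With jeblair's example, `posts_to_run([("base", True), ("tox", False)])` selects only the base post, so cleanup for the base job still happens when the tox pre-playbook fails.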