jeblair | jamielennox: i proposed a followup to your ssh fix which merged: https://review.openstack.org/482735 | 00:11 |
---|---|---|
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add docs for web server component https://review.openstack.org/482630 | 00:13 |
jamielennox | jeblair: cool, yea that makes sense just starting it where everything else is started | 00:14 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add link to Zuul v3 docs to the README https://review.openstack.org/482349 | 00:14 |
jamielennox | i wasn't sure why ssh was started outside of the main job | 00:14 |
jeblair | jamielennox: yeah, i don't know of a reason. but if there is one, we should totally put it back with a note. :) | 00:15 |
jeblair | SpamapS: ^ | 00:15 |
jamielennox | adding that finish_job seems strange, i would have expected that to be run there somewhere already | 00:15 |
jeblair | jamielennox: oh, we should probably add a note to the run method saying "hey, put stuff in execute because of the handler + finishjob" | 00:16 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix ZuulWeb() invocation https://review.openstack.org/482321 | 00:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Use package module to install bindep packages https://review.openstack.org/482584 | 00:29 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Simplify bindep logic removing fallback support https://review.openstack.org/482650 | 00:29 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Move ssh-agent cleanup into existing exception handler https://review.openstack.org/482735 | 00:30 |
* SpamapS looks | 00:50 | |
SpamapS | jeblair: not sure I understand the ^ :-P | 00:51 |
*** dkranz has quit IRC | 00:51 | |
jamielennox | SpamapS: i think because you added the ssh-agent stuff, is there a reason the setup exists outside the thread | 00:54 |
SpamapS | jamielennox: so that you can get the pid of the ssh-agent in the main thread and kill it whether the thread returns or not? I'm kind of guessing. | 00:56 |
jamielennox | SpamapS: yea, i think it's just something that got put there rather than a specific reason for it | 00:57 |
jamielennox | there's a finally handler on the thread loop that should be able to do the cleanup | 00:57 |
openstackgerrit | Jamie Lennox proposed openstack-infra/zuul feature/zuulv3: Remove isJobRegistered https://review.openstack.org/482755 | 01:00 |
SpamapS | jamielennox: ehhh.. finally doesn't happen if the thread is blocking ;) | 01:09 |
jamielennox | in this case it's a failure cleanup thing so i don't know if we'd want to do anything if the thread is still active | 01:10 |
jamielennox | but anyway have a look at https://review.openstack.org/482735 and see if it's an issue | 01:10 |
SpamapS | Threads will still be active if they get stuck on a blocking syscall. It's part of the hazard of threads. | 01:17 |
SpamapS | But we don't want to just sit and wait for SIGKILL if we can avoid that. | 01:17 |
SpamapS | 01:36 | |
jamielennox | where is the base job for openstack coming from atm? | 01:51 |
jamielennox | did you ever figure out how to rsync logs out of the executor? this seems to be harder if there is going to be run inside bwrap with only the job ssh key | 01:52 |
jamielennox | oh, nvm its still in project-config | 01:55 |
mordred | jamielennox: yah - it's in project-config - althought the plan is to move it to zuul-jobs | 02:14 |
mordred | jamielennox: and yah - we've got our rsync-ing logs currently - working great! | 02:14 |
mordred | jamielennox: the "add_host" is the thing in there that makes it need to be defined in a trusted repo | 02:15 |
jamielennox | mordred: it seems like it just copies them into /srv/static/logs ? | 03:43 |
*** isaacb has joined #zuul | 06:25 | |
*** isaacb_ has joined #zuul | 06:32 | |
*** isaacb has quit IRC | 06:33 | |
*** openstackgerrit has quit IRC | 06:48 | |
*** isaacb_ has quit IRC | 07:31 | |
*** MaciekW has joined #zuul | 08:11 | |
MaciekW | Hello, how can I abort/stop already running zuul jobs? I am using zuul 2.5.2. I found a script at https://github.com/zaro0508/gearman-plugin-client/blob/master/gear_client.py but it doesn't stop the jobs | 08:14 |
tobiash | MaciekW: I know of the following possibilities: 1. abort in jenkins, or 2. Abandon/rebase the change triggers abort/abort+reenqueue | 08:31 |
MaciekW | tobiash: but unfortunately the problem refers to post jobs - we have many hanging post jobs at zuul queue - on zuul status page they are present but jenkins jobs have already finished | 08:39 |
MaciekW | tobiash: we don't know why this happens | 08:39 |
tobiash | MaciekW: I think in this case you would need to restart gearman or zuul (at least I did in such cases) | 08:40 |
*** openstackgerrit has joined #zuul | 08:53 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Case sensitive label matching https://review.openstack.org/482856 | 08:53 |
MaciekW | tobiash: If I will restart gearman on particular jenkins host all jobs will be restarted right? | 08:53 |
MaciekW | tobiash: I mean all jobs handled by this particular jenkins host | 08:54 |
tobiash | MaciekW: I don't think so | 08:54 |
tobiash | MaciekW: you should do this in a time slot where no otehr jobs are running (or retrigger them manually afterwards) | 08:55 |
tobiash | MaciekW: (when restarting gearman server and/or zuul) | 08:56 |
tobiash | MaciekW: when restarting jenkins only the behavior depends on how you did restart it | 08:56 |
MaciekW | tobiash: I meant to restart only gearman plugin on jenkins, maybe it will send some signal to gearman daemon/zuul | 08:58 |
tobiash | MaciekW: you also should do this only if no important jobs are running on this jenkins | 08:58 |
tobiash | MaciekW: I don't know how zuul handles this case but it can be that these jobs are marked as failed then | 08:59 |
tobiash | jeblair, mordred, clarkb: backported to master: https://review.openstack.org/#/c/482856/ | 09:01 |
MaciekW | tobiash: ok, I will try it, however it would be great to have a scirpt that can abort signle zuul/gearman job without affecting others | 09:02 |
tobiash | MaciekW: zuul 2 will be superseeded by zuulv3 later this year which works pretty different then (not sure if that already exists for v3) | 09:04 |
MaciekW | tobiash: ok, thanks for info | 09:05 |
mordred | jamielennox: +3'd https://review.openstack.org/#/c/482755 but with note | 09:14 |
*** isaacb has joined #zuul | 09:17 | |
openstackgerrit | Merged openstack-infra/nodepool feature/zuulv3: Add support for custom ssh port https://review.openstack.org/468752 | 09:22 |
openstackgerrit | Merged openstack-infra/nodepool feature/zuulv3: EOL ubuntu-precise for dsvm job https://review.openstack.org/482613 | 09:24 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Remove isJobRegistered https://review.openstack.org/482755 | 09:25 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add job dependencies to status.json https://review.openstack.org/482061 | 09:37 |
*** isaacb has quit IRC | 10:06 | |
*** isaacb has joined #zuul | 10:09 | |
*** isaacb has quit IRC | 10:10 | |
*** isaacb has joined #zuul | 10:12 | |
*** isaacb has quit IRC | 10:18 | |
openstackgerrit | Jamie Lennox proposed openstack-infra/zuul feature/zuulv3: Remove function_cache variables from executor client https://review.openstack.org/482890 | 10:22 |
mordred | jamielennox: thanks! | 10:40 |
pabelanger | morning | 10:43 |
*** xinliang has quit IRC | 10:48 | |
*** xinliang has joined #zuul | 11:01 | |
mordred | morning pabelanger ! | 11:03 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Avoid using apt-add-repository https://review.openstack.org/482683 | 11:03 |
*** jkilpatr has quit IRC | 11:06 | |
openstackgerrit | Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Add support to test fedora-26 https://review.openstack.org/482568 | 11:07 |
pabelanger | mordred: I seen you abandoned the configure-mirror in zuul-jobs, does that mean we are just using openstack-zuul-jobs now for it? | 11:07 |
mordred | pabelanger: yah - you already added it there, right? | 11:10 |
mordred | pabelanger: oh - crappit | 11:11 |
pabelanger | mordred: https://review.openstack.org/#/c/473845/ | 11:11 |
mordred | pabelanger: sorry - I was too tired :) | 11:11 |
pabelanger | still need to land it in openstack-zuul-jobs if we want it there | 11:11 |
mordred | no - my brain thought we'd already landed it | 11:11 |
mordred | and I thought that was the leftover patch :) | 11:11 |
pabelanger | Ah | 11:11 |
mordred | pabelanger: just restored | 11:11 |
pabelanger | ack | 11:12 |
mordred | thanks for the catch | 11:12 |
mordred | let's get that landed :) | 11:12 |
pabelanger | ++ | 11:12 |
pabelanger | mordred: https://review.openstack.org/#/c/482655/ should be an easy review for zuul-jobs | 11:20 |
pabelanger | add verify_host for synchronize commands, since we have known_hosts setup | 11:21 |
*** isaacb has joined #zuul | 11:25 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Remove default value for mirror_host https://review.openstack.org/482915 | 11:26 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Replace hand-written slug with ansible_managed https://review.openstack.org/482916 | 11:26 |
*** isaacb has quit IRC | 11:28 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Remove openstack-doc-build from zuul-jobs https://review.openstack.org/482921 | 11:29 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Enable verify_host for synchronize https://review.openstack.org/482655 | 11:32 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Remove openstack-doc-build from zuul-jobs https://review.openstack.org/482921 | 11:36 |
*** jkilpatr has joined #zuul | 11:42 | |
*** isaacb has joined #zuul | 11:51 | |
*** isaacb has quit IRC | 12:00 | |
*** isaacb has joined #zuul | 12:01 | |
tobiash | mordred: is there a common expectation here about the naming of the ansible template files (regarding the .j2 postfix) | 12:09 |
tobiash | mordred: I personally don't use the .j2 postfix in my ansible projects as this makes syntax highlighting in editors much better (and all files in <role>/templates would anyway have this suffix) | 12:10 |
openstackgerrit | Paul Belanger proposed openstack-infra/nodepool master: Support fedora-26 for nodepool dsvm job https://review.openstack.org/482942 | 12:14 |
pabelanger | tobiash: Ya, I tend to use .j2 for templates myself. Only to help identify them more | 12:17 |
pabelanger | I know some other roles I use tend to do that too | 12:18 |
pabelanger | but .j2 is optional | 12:18 |
tobiash | pabelanger: ok, I'm flexible in this regard (personally I don't use .j2 because it breaks syntax highlighting and has no real information benefit for me) | 12:20 |
pabelanger | tobiash: also note, .j2 is listed in most ansible example for documentation: https://docs.ansible.com/ansible/playbooks_lookups.html I say most, because there are times where they do omit it | 12:22 |
pabelanger | I also find it a little eaiser when looking at playbooks, anything that ends with .j2 tends to live in template folder when searching for things | 12:23 |
tobiash | yeah, that's true | 12:27 |
mordred | tobiash, pabelanger: I don't have an opinion one way or the other, actually - whatever everyone else thinks is a good idea | 12:28 |
tobiash | mordred, pabelanger: we just should do it consistent so best to follow pabelanger in this regard | 12:30 |
mordred | kk | 12:30 |
*** olaph has quit IRC | 12:40 | |
*** persia__ has joined #zuul | 12:40 | |
*** rfolco has quit IRC | 12:41 | |
*** persia has quit IRC | 12:41 | |
*** fbouliane has quit IRC | 12:41 | |
*** fbouliane has joined #zuul | 12:42 | |
*** rfolco has joined #zuul | 12:42 | |
*** MaciekW has quit IRC | 12:50 | |
tobiash | mordred, jeblair: what do you think about the ability to restrict some connections to certain tenants? | 12:51 |
tobiash | mordred, jeblair: I'd have a use case for separating more confidential repos from the rest with different connections, users, etc. | 12:52 |
tobiash | mordred, jeblair: ah, I think that would also work today already as jobs only can request repos from its own tenants right? | 12:54 |
mordred | tobiash: yes - that is correct | 12:55 |
mordred | tobiash: now - it's worth pointing out that there may still be some holes in tenant confidentiality ... | 12:56 |
*** xinliang has quit IRC | 12:56 | |
mordred | for instance, we do not have any way to restrict a user from streaming the build log for a job in another tenant currently | 12:56 |
*** xinliang has joined #zuul | 12:57 | |
mordred | tobiash: so depending on HOW confidential the repos are you may want to be careful - but also figuring out what use cases you have for the restrictions would be good | 12:57 |
tobiash | mordred: yeah as the console streaming endpoint is not tenant separated | 12:57 |
mordred | tobiash: although it's an obscurity thing - you can restrict status.json output by tenant (or, you can filter it by tenant and you can put web-server protections on the restricted tenant's status) | 12:58 |
tobiash | mordred: yeah I was told that it is that super confidential that they even want volume encryption in the inhouse hosted scm | 12:58 |
mordred | tobiash: so a user would have to guess the uuid of a build for the restricted jobs | 12:58 |
*** dkranz has joined #zuul | 12:59 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Cleanup components doc https://review.openstack.org/482636 | 12:59 |
tobiash | mordred: thinking about that it would be probably a good idea to just have a separate deployment for that | 12:59 |
mordred | tobiash: I would love for our isolation to be able to handle that - I believe the in-zuul restrictions (other than console-log) are pretty solid - and the other places that are per-tenant are actually in job config - things like log publishing | 13:00 |
mordred | tobiash: so doing an audit of places where info might leak would be worthwhile- perhaps we should tenant-isolate console-stream so that you could similarly lock it down in apache | 13:00 |
tobiash | mordred: so maybe it makes sense to supply a streaming endpoint per tenant? | 13:00 |
mordred | yah | 13:00 |
mordred | that would be consistent with the other endpoints too | 13:00 |
mordred | Shrews: ^^ making more work for you | 13:01 |
Shrews | joy | 13:02 |
mordred | Shrews: so - there's a thing with the endpoints in the old webap | 13:05 |
mordred | Shrews: that they support being prefixed with tenant name | 13:05 |
mordred | Shrews: I think the suggestoin would be to add the same thing to console-stream | 13:06 |
mordred | Shrews: so that you'd have /openstack/console-stream - and the 'openstack' part would be passed in as a parameter to the handler | 13:07 |
mordred | Shrews: (you can put variables in aiohttp routes for this kind of thing) | 13:07 |
mordred | Shrews: then, in the apache proxy impl we do, we'll point zuulv3.openstack.org/console-stream to 127.0.0.1/openstack/console-stream | 13:08 |
mordred | (that's actually what we're doing for the status page today) | 13:08 |
mordred | Shrews: the follow up question would be - when you do the gearman call to get the log information for the given buildid - can you verify that the build matches the tenant from the url | 13:08 |
Shrews | mordred: yup | 13:09 |
Shrews | 1 sec | 13:09 |
Shrews | mordred: https://github.com/openstack-infra/zuul/blob/feature/zuulv3/zuul/rpclistener.py#L185 | 13:10 |
Shrews | so we'd have to pass the tenant in along with the uuid | 13:10 |
Shrews | (assuming that sched.abide.tenants.values() contains the tenant name) | 13:11 |
mordred | Shrews: yes | 13:12 |
Shrews | mordred: but rather than backtrack and change all the things, i'd rather finish the static page serving first. Then go back and add tenant support across the board. | 13:13 |
mordred | Shrews: ++ | 13:13 |
mordred | Shrews: fwiw, app.add_route('GET', '/{tenant}/console-stream') and then request.match_info.get('tenant') in _handleWebsocket | 13:16 |
mordred | is the magic cantrip | 13:16 |
Shrews | mordred: you know, we don't have any sort of protection at the finger protocol level | 13:18 |
Shrews | mordred: if they know the uuid and executor, anyone can stream any log | 13:19 |
tobiash | Shrews: that doesn't need to be published to the outside world | 13:19 |
Shrews | tobiash: mordred wants the finger uri displayed | 13:19 |
*** isaacb has quit IRC | 13:19 | |
Shrews | which i think is valid. we have the finger protocol in place for people to use, if they wish | 13:20 |
tobiash | Shrews: we maybe should make it configurable which urls we want to display | 13:20 |
Shrews | security through obscurity is not really secure | 13:22 |
tobiash | Shrews: well in my deployment the executors are not directly accessible from outside so the finger urls wouldn't work anyway for me | 13:23 |
tobiash | but sure, hiding the finger url doesn't make sense if the executors are public | 13:24 |
*** isaacb has joined #zuul | 13:28 | |
Shrews | oh neat, my docs change caused tox-py35 to fail many tests | 13:28 |
Shrews | So, this whole tenant url thing has to be only about filtering, not security, because we just don't have the mechanisms in place to do that (and I think would be a much bigger change). If that's ok, then I can proceed with mordred's suggestions. If it is about security, we need a lot more thought put into it. | 13:32 |
SpamapS | Shrews: I think finger should probably be firewalled if we're saying "if you want security on logs, -> webserver security" | 13:34 |
SpamapS | It would make a lot of sense to make the log streaming URL have a tenant component so that webserver security fronting makes sense. | 13:34 |
SpamapS | so much sense :-P | 13:34 |
tobiash | Shrews: I'm not sure if I miss something but I think a combination of tenant filtering and handling the authorization/authentication path based outside would be a good combination | 13:35 |
* tobiash is too slow on typing ;) | 13:35 | |
fungi | right, multi-tenant finger just isn't going to be secure on its own | 13:36 |
fungi | as there's no authentication in the protocol | 13:36 |
fungi | at least not secure from a privacy perspective | 13:36 |
fungi | but also, it's not encrypted, so any privacy layers there would be illusory anyway | 13:37 |
Shrews | SpamapS: right, you need to do the webserver security stuff. but we also have to point out to users that steps need to be taken by the admin to effectively disable the finger protocol access | 13:37 |
mordred | SpamapS, Shrews, fungi: yes - it's just about filtering - security at this point is up to a deployer at the apache/firewall level | 13:38 |
mordred | which is one of the reasons to root the url endpointswith /{tenant} - to make it easy for a deployer to proxy or not proxy to a given sub-url range | 13:38 |
pabelanger | wss for websockets could be a thing in apache / nginx | 13:39 |
Shrews | mordred: cool, filtering is easy. | 13:39 |
fungi | also, probably best not to refer to that aspect as "security" but rather "privacy" | 13:39 |
mordred | for instance, we do this: https://github.com/openstack-infra/system-config/blob/master/manifests/site.pp#L1226 | 13:40 |
Shrews | fungi: yes, privacy is mo betta | 13:40 |
mordred | in infra - which makes zuulv3.openstack.org/status return 127.0.0.1/openstack/status - so there's no path for a user to override that | 13:40 |
fungi | public finger should be plenty "secure" as long as it does what it's designed for and only what it's designed for. it may not be "private" but that's a separate issue | 13:40 |
mordred | fungi: ++ | 13:40 |
pabelanger | fungi: ++ | 13:40 |
tobiash | ++ | 13:41 |
mordred | fungi, Shrews: when we get the finger multiplexor - maybe we can add a deployment option to restrict it by tenant - so similar to the apache config a deployer could run a finger gateway for a specific tenant on a given host | 13:41 |
mordred | so that we can empower deployers who wan to expose finger but also want private tenants to run a finger daemon on like "finger.openstack.org" that would only proxy finger requests for the openstack tenant | 13:42 |
mordred | but that's future thinking - for now blocking finger and doing webserver url limiting is the correct choice for folks who need privacy | 13:42 |
mordred | we may want to add a note about that to the docs too :) | 13:43 |
fungi | yes, a proxy could implement that | 13:43 |
Shrews | mordred: i was thinking we could have finger uuid:tenant:logfile@zuul-web.o.o since the logic is pretty much already in place in that thing | 13:43 |
Shrews | erm, port would be an issue. nm | 13:44 |
fungi | or the interface itself could do complex things like only allow queries for specific tenants on specific sockets/interfaces and then actual privacy could be added at the network layer | 13:44 |
fungi | if we wanted to have that possible via raw finger protocol | 13:44 |
fungi | but it would be complicated to implement | 13:44 |
mordred | fungi: right- that's why I was thinking just allow peope to run more than one daemon if we get to the point where we want to do htat | 13:47 |
mordred | I think getting the tenant stuff in the urls for now is the more pressing and more likely to be used thing | 13:48 |
fungi | full agreement | 13:48 |
mordred | I suppose we could also just make a tenant white/black list for the finger daemon and have it refuse to stream logs for builds that are associated with tenants on the blacklist - or only to serve them for tenants on the whitelist | 13:48 |
mordred | that shouldn't be terrible - the finger daemon already has to look up information about the build- verifying the build's tenant is 'ok' to stream should be not-terribly hard | 13:49 |
fungi | that would probably be reasonably trivial for someone to implement if they need it, yes | 13:49 |
mordred | yah | 13:49 |
mordred | fungi: well - thining about us - I know we want to provide finger - but also wold love to use tenant isolation perhaps to deal with security patches better - so it might be a simple way to allow us to configure our finger service to only serve logs for the openstack tenant | 13:50 |
fungi | mordred: i can see how it might fit into an overall design for testing embargoed vulnerability fixes, but that bit of the solution is probably one of the easier parts when looking at the whole of what we would need for private review and testing | 14:06 |
fungi | granted, some of the distributed/synchronized design for upcoming gerrit releases may solve the harder parts | 14:07 |
mordred | fungi: ++ | 14:07 |
fungi | (of running a separate private gerrit which has up to date copies of things in your public gerrit) | 14:07 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Correct run job parameter documentation https://review.openstack.org/482977 | 14:15 |
*** mnaser has joined #zuul | 14:35 | |
mnaser | would zuulv3 be okay for use for simple periodic jobs (no gerrit use or anything of that type) | 14:36 |
fungi | or rather, what significant bits are missing to support that use case? | 14:41 |
pabelanger | today we only have a single pipeline (check) on zuulv3.o.o. So atm, it is untested | 14:43 |
jeblair | mnaser: yes, it should be fine. i believe those unit tests are re-enabled at this point. though we may still have some things to change about it before we release. i need to give it a once over. | 14:46 |
*** isaacb has quit IRC | 14:54 | |
*** isaacb has joined #zuul | 15:16 | |
pabelanger | mordred: left comment on 482915 | 15:31 |
*** gundalow has left #zuul | 15:34 | |
pabelanger | mordred: jeblair: So, I am a little lost on what I should be working on for zuulv3 playbooks. I'm looking at the migration plan step 1 right now and trying to determine if tox jobs are done now or if we should increase coverage for other tox things out side of zuul tox.ini | 15:38 |
jeblair | pabelanger: i think mordred's got a better handle on that, so i'll defer to him when he gets back. but i do see some changes you can maybe review/merge: 482593 and 2 children, and 482584 and one child | 15:40 |
pabelanger | jeblair: ya, 482593 is the latest patchset I've uploaded, didn't want to self-approve on that. will dig into 482584 | 15:43 |
mordred | pabelanger: sorry - was looking away for a second | 15:44 |
mordred | pabelanger: there are still 3 of the tox roles thathave the jenkins script text copied in (run-cover, run-tarball and tox itself) | 15:45 |
mordred | pabelanger: I think we should iterate through making those not be giant shell blobs - which may take a little care and attention to figure out the right decomposition | 15:46 |
mordred | pabelanger: but we should be able to do it small bit at a time so we don't have to rewrite the whole thing | 15:46 |
pabelanger | mordred: sure | 15:48 |
pabelanger | mordred: also, it looks like validate-host is now how we collect host information for jobs, but today it nolonger is outputted to console stream, was that intentional? | 15:49 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix race in test_shadow https://review.openstack.org/483016 | 15:51 |
jeblair | easy +3 ^ | 15:51 |
*** isaacb has quit IRC | 16:07 | |
SpamapS | jeblair: trivial +3'd :) | 16:09 |
SpamapS | now if we can just keep it from exploding | 16:10 |
mordred | jeblair, pabelanger: http://logs.openstack.org/15/482915/1/check/23951a5/job-output.txt is recent and os_loganalyze did not html it - http://logs.openstack.org/14/481014/5/check/9ce8405/job-output.txt.gz is from last week and is from a random example I tried - and it did | 16:15 |
mordred | what's the difference / why is os_loganalyze not html-ifying http://logs.openstack.org/15/482915/1/check/23951a5/job-output.txt ? | 16:16 |
mordred | pabelanger: yes! it now goes to a nice file so it's readable | 16:17 |
pabelanger | mordred: ack, will take some adjustments | 16:17 |
pabelanger | mordred: jeblair: also, I am finding our current job-output.txt format very hard to read. Basically a large blob of text for me, what are your thoughts on making it more inline with: http://logs.openstack.org/89/481589/1/check/gate-windmill-deploy-fedora-25-nv/493e687/console.html.gz. Include line breaks between tasks, json pretty printing | 16:19 |
pabelanger | mordred: not sure about os_loganalyze, but can look into it. Does it know about job-output.txt? | 16:19 |
clarkb | pabelanger: mordred it has a whitelist of files that it can htmlify | 16:20 |
mordred | oh - thefile needs to be gzipped | 16:20 |
pabelanger | ya, that is what I am looking for right now | 16:20 |
mordred | RewriteRule ^/(.*\.txt\.gz)$ /htmlify/$1 [QSA,L,PT,NS] | 16:20 |
pabelanger | ah | 16:20 |
mordred | pabelanger: I agree that other thing is more readable for sure | 16:21 |
mordred | pabelanger: I was just about to do a couple of patches in zuul_stream - why don't I take a stab at that | 16:21 |
clarkb | you will alao need to whitelist it to get it htmlified | 16:21 |
pabelanger | mordred: what is the best way to test the callback_plugin? Do you just run it locally against a simple playbook? | 16:22 |
mordred | pabelanger: yah | 16:22 |
mordred | clarkb: nah - it seemed to to job-output.txt.gz from last week just fine - I tihnk the main key is missing .gz ending | 16:23 |
clarkb | whitelist may just be for log severity mqgic | 16:24 |
mordred | yah | 16:24 |
pabelanger | mordred: ack | 16:27 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add job's project as implicit role project https://review.openstack.org/482726 | 17:07 |
Shrews | jlk: attempting to get a zuul-web container running in z8s. it seems to be running, but can't seem to get traffic through to it. changes are http://paste.openstack.org/show/615190/ | 17:12 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Pass result data back from ansible https://review.openstack.org/479442 | 17:13 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Support relative urls in success-url https://review.openstack.org/481342 | 17:13 |
jeblair | those were already approved; i'm self re-approving because that's just a merge conflict fix | 17:15 |
Shrews | jlk: oh, i think it is getting through. zuul-web just seems to be immediately disconnecting for some reason | 17:16 |
Shrews | hrm | 17:16 |
pabelanger | mordred: was there a specific reason we didn't use ANSIBLE_LOG_PATH to setup job-output.txt location, over ZUUL_JOB_OUTPUT_FILE logic? | 17:18 |
pabelanger | mordred: the reason I ask, it is looks like utils.display in ansible will setup a logger for use by default, if we use ANSIBLE_LOG_PATH | 17:19 |
Shrews | jlk: aha. have to use 0.0.0.0, not 127.0.0.1, in the zuul.conf | 17:25 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add spacing lines and formatting to result dicts https://review.openstack.org/483042 | 17:27 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add play recap back in to the end of plays https://review.openstack.org/483043 | 17:27 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Log execution phase and include information https://review.openstack.org/483044 | 17:27 |
jeblair | pabelanger: i believe the logger ansible sets up is unsuitable for us | 17:27 |
mordred | pabelanger: yes ^^ that | 17:27 |
mordred | I have talked with toshio about doing work upstream to fix the use of logger there | 17:28 |
jeblair | the formatting and information it provides is different than what we want, and we can't change it | 17:28 |
mordred | and he's open to fixes - so at some point in the future we can overhaul how that works which will be nice | 17:28 |
mordred | pabelanger, jeblair: ^^ that stack includes a thing pabelanger asked for , a thing jeblair asked for and a thing I wanted to do | 17:29 |
jeblair | (i mean, it's python logging, so maybe we could change it if we put on our crazypants and really dug into the api) | 17:29 |
*** amoralej has joined #zuul | 17:29 | |
mordred | yah. that seemed super unfun | 17:29 |
jeblair | mordred: you're pleasing everyone! | 17:29 |
dmsimard | Hello Zuulers, we were wondering if it'd make sense to pull zuul-cloner out of zuul ? We end up needing to install Zuul (and all of it's dependencies) on nodepool VMs that only require zuul-cloner. | 17:30 |
mordred | http://paste.openstack.org/show/615191/ is example output - oh - you know what - I need to make a change in that real quick | 17:30 |
*** dpawar has joined #zuul | 17:30 | |
mordred | dmsimard: zuul-cloner goes away in v3 - so that problem will resolve itself in the not-too-distant future | 17:30 |
dmsimard | mordred: oh yeah ? I didn't know. | 17:31 |
amoralej | mordred, is it replaced by something different? | 17:31 |
mordred | dmsimard: in v3 we push the repos in a properly prepared state onto the build nodes | 17:31 |
dmsimard | mordred: executor node will send the patchset to the nodepool VM or saomething ?% | 17:31 |
dmsimard | ahhhh | 17:31 |
jeblair | dmsimard: zuul magically puts all of your repos on disk for you with all the changes applied to the correct branches before starting the job | 17:31 |
mordred | dmsimard: so in v3 everything just works magically exactly as you'd want | 17:31 |
dmsimard | I can see the word "magically" being used often here :) | 17:31 |
mordred | :) | 17:31 |
jeblair | mordred: magic jinx | 17:31 |
amoralej | magically means we need some software installed on it? | 17:32 |
mordred | amoralej: ssh | 17:32 |
amoralej | nice | 17:32 |
mordred | amoralej: which you have to have for zuul to run jobs on it anyway :) | 17:32 |
dmsimard | amoralej: no -- in zuul v3 the zuul executor node (kinda like jenkins master) sends the data to the build node (nodepool vm) | 17:32 |
jeblair | zuul will push the repos onto the test nodes at the start of the job | 17:32 |
amoralej | yeah, that's not a problem for sure | 17:32 |
pabelanger | http://git.openstack.org/cgit/openstack-infra/openstack-zuul-roles/tree/roles/prepare-workspace/tasks/main.yaml FYI | 17:32 |
pabelanger | oh, that should be using verify_hosts too | 17:33 |
pabelanger | will patch :) | 17:33 |
amoralej | pabelanger, when using zuulv2, any trick to install it in nodepool nodes in venv to not interfere with other requirements? | 17:33 |
jeblair | dmsimard, amoralej: minimizing the requirements on the test nodes is a goal for us for v3. and so far, we've been able to do that. | 17:34 |
mordred | amoralej: in v2 I believe that's what we do - when we buld out images we make a zuul-cloner venv and then we call that from our jobs | 17:34 |
dmsimard | amoralej: upstream nodepool elements already do that | 17:34 |
jeblair | amoralej: yes, in openstack we have a dib element that installs it in a virtualenv. | 17:34 |
pabelanger | amoralej: nope, we do that today in our jobs. virtualenv zuul-env; zuul-env/bin/pip install zuul | 17:34 |
dmsimard | amoralej: it's our fault due to how we set up the image | 17:34 |
amoralej | ok, cool | 17:35 |
pabelanger | dmsimard: amoralej: centos-7 DIB will have /usr/zuul-env by default | 17:35 |
jeblair | mordred: you sure you want yaml vs pretty-printed json? i know this is for humans, though pretty-printed json is maybe closer to the minimal change to ansible output, but still readable by humans... | 17:36 |
amoralej | nice, we'll investigate to use it for our images | 17:36 |
amoralej | thanks | 17:36 |
jeblair | mordred: (i'm okay trying it -- i just want to make sure when we diverge from the native ansible experience (TM) we do so for good reason :) | 17:37 |
*** adam_g has quit IRC | 17:39 | |
*** adam_g has joined #zuul | 17:41 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Add new line between zuul_stream messages https://review.openstack.org/483048 | 17:41 |
pabelanger | mordred: ^ trival change, adds some new lines to our output between tasks / plays | 17:42 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add play recap back in to the end of plays https://review.openstack.org/483043 | 17:42 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add spacing lines and formatting to result dicts https://review.openstack.org/483042 | 17:42 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Log execution phase and include information https://review.openstack.org/483044 | 17:42 |
pabelanger | will results in http://logs.openstack.org/89/481589/1/check/gate-windmill-deploy-fedora-25-nv/493e687/console.html.gz for white spacing | 17:42 |
mordred | pabelanger: sorry man - that was the stack I was just writing | 17:42 |
mordred | oh - you did zul_stream | 17:43 |
pabelanger | mordred: :) | 17:43 |
mordred | nod - yah - I'd rather not do the \n there - because that'll wind up putting a \n between every line of stdout output we get from shell tasks | 17:43 |
pabelanger | ack | 17:43 |
pabelanger | mordred: lets go with your approach then | 17:44 |
mordred | jeblair: I could go either way re: yaml v json - easy to swap it in either direction | 17:45 |
jeblair | mordred: i'm +2 on the stack. let's see what it looks like, and we can compare it to the windmill output after it runs. | 17:47 |
pabelanger | same, also +2 looking forward to output | 17:47 |
mordred | \o/ | 17:49 |
pabelanger | Hmm, just testing locally, getting a warning | 17:49 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix race in test_shadow https://review.openstack.org/483016 | 17:49 |
mordred | (at some point that file needs a good refactoring - but I've been holding off on doing that for now because we have otherthings on our plate) | 17:50 |
jeblair | this is interesting: all of the zuulv3 jobs for that change hit retry_limit | 17:51 |
mordred | oh - interesting | 17:51 |
jeblair | http://logs.openstack.org/16/483016/1/check/8f1a044/job-output.txt | 17:51 |
jeblair | sample | 17:51 |
mordred | jeblair: I pulled hte +A real quick | 17:51 |
jeblair | mordred: sorry, i meant the merged one -- 17:49 < openstackgerrit> Merged openstack-infra/zuul feature/zuulv3: Fix race in test_shadow https://review.openstack.org/483016 | 17:52 |
jeblair | mordred: is it happening on others too? | 17:52 |
jeblair | (it wouldn't surprise me) | 17:52 |
mordred | jeblair: we went staight from pre to post in the ones I just pushed up | 17:52 |
jeblair | mordred: yeah, that's sort of what that log looks like too | 17:53 |
pabelanger | mordred: change 483042 to -1, with comment | 17:53 |
jeblair | 2017-07-12 17:21:54.162530 | ubuntu-xenial | Get:35 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 openjdk-8-jre-headless amd64 8u131-b11-0ubuntu1.16.04.2 [27.0 MB] | 17:53 |
jeblair | 2017-07-12 17:31:21.579914 | PLAY [/tmp/8f1a04439a58497c800260b9b6740581/ansible/post_playbook_0/git.openstack.org/openstack-infra/zuul-jobs/playbooks/tox/post : all] | 17:53 |
jeblair | note the timestamps there | 17:53 |
mordred | well - timestamp skew can be a thing | 17:53 |
jeblair | mordred: ? | 17:53 |
mordred | we report timestamps of the remote host for stdout outout and timestamp fromthe executor for play banners | 17:54 |
jeblair | mordred: ah ok | 17:54 |
jeblair | 2017-07-12 17:31:24.926275 | ubuntu-xenial | ok: Results: => {"changed": false, "examined": 0, "files": [], "matched": 0, "msg": "src/git.openstack.org/openstack-infra/zuul/.tox was skipped as it does not seem to be a valid directory or it cannot be accessed\n"} | 17:54 |
mordred | yah - that's in the "collect logs" postplaybook | 17:54 |
jeblair | mordred: ^ that's the next line. that's from the host, so it certainly is 10 minutes after the last line from the previous play | 17:54 |
mordred | nod | 17:54 |
mordred | so we're losing something somwhere | 17:55 |
jeblair | digging in executor log | 17:55 |
mordred | kk. I have to run to an appointment - back in a bit ... sad I don't get to dig with you | 17:55 |
pabelanger | http://paste.openstack.org/show/615196/ as the warnings I can getting with the stack from mordred | 17:55 |
pabelanger | maybe related | 17:55 |
pabelanger | left comment about unexpected keyword | 17:56 |
mordred | pabelanger: ou have to pass in a few thigns | 17:56 |
jeblair | timed out during bindep: http://paste.openstack.org/show/615197/ | 17:56 |
mordred | pabelanger: rm foo.out && ZUUL_JOB_OUTPUT_FILE=foo.out ansible-playbook test.yaml -e zuul_execution_phase=pre -e zuul_execution_phase_count=1 && cat foo.out | 17:56 |
mordred | pabelanger: is how I tested that | 17:56 |
pabelanger | mordred: k, let me try again | 17:57 |
pabelanger | mordred: same warnings :( | 17:58 |
mordred | pabelanger: it's on my todo list to make a "run this playbook like zuul would" tool | 17:58 |
mordred | jeblair: we have not yet landed the bindep rework patches, so nothing new realy should have changed there | 17:58 |
jeblair | mordred: i'm thinking it's just a really slow node? | 17:58 |
pabelanger | jeblair: mordred: we should land configure-mirror patch, so we are using AFS | 17:58 |
pabelanger | right now we are hitting upstream ubuntu | 17:59 |
jeblair | pabelanger: please do. :) | 17:59 |
pabelanger | k, I'll +3 https://review.openstack.org/#/c/473845/ unless somebody else wants to review | 17:59 |
pabelanger | sorry | 18:00 |
pabelanger | https://review.openstack.org/#/c/482593/ | 18:00 |
mordred | ah. yes. mirror | 18:00 |
jeblair | pabelanger: yes, that and its child | 18:00 |
pabelanger | k, approving now | 18:01 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Create configure-mirrors role https://review.openstack.org/482593 | 18:01 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Remove default value for mirror_host https://review.openstack.org/482915 | 18:01 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Replace hand-written slug with ansible_managed https://review.openstack.org/482916 | 18:01 |
*** dpawar has quit IRC | 18:15 | |
pabelanger | mordred: it looks like yaml.safe_dump is causing some problems with my local test | 18:28 |
pabelanger | mordred: I am using the debug task | 18:29 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Pass result data back from ansible https://review.openstack.org/479442 | 18:40 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add some indexes to allow efficient dashboard sql https://review.openstack.org/481614 | 19:14 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Remove function_cache variables from executor client https://review.openstack.org/482890 | 19:14 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Cleanup components doc https://review.openstack.org/482636 | 19:15 |
*** harlowja has quit IRC | 19:17 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Correct run job parameter documentation https://review.openstack.org/482977 | 19:34 |
*** amoralej is now known as amoralej|off | 19:35 | |
*** harlowja has joined #zuul | 20:02 | |
mordred | jeblair, pabelanger: ya'll learn anything about v3 while I was away? seems it was mostly looking at why the actual gate was slow | 20:02 |
pabelanger | mordred: just safe_dump comment from above on your zuul_stream changes. But ya, working on bring online 2 more zuul-launcher nodes to help with some scale issues | 20:03 |
jeblair | mordred: i finished https://review.openstack.org/482726 and the unit tests have shown a problem i don't have a good solution to | 20:04 |
mordred | jeblair: awesome | 20:04 |
mordred | pabelanger: neat - so - safe_dump + debug == sad? | 20:04 |
jeblair | mordred: basically -- if we have an implicit role for a job, how does that interact with inheritance and variance? | 20:05 |
mordred | pabelanger: aha! I have reproduced - neat | 20:05 |
mordred | jeblair: oh ... my | 20:05 |
pabelanger | mordred: ya, looks that way | 20:06 |
mordred | pabelanger: cool. looking now | 20:06 |
jeblair | mordred: i'm thinking that just shelving the idea may be a good option. the burden to explicitly add ones own repo to supply roles may not be worth the extra complexity after all. | 20:06 |
jeblair | mordred: i'll continue to mull it over in the background, but i'm going to move onto another foreground task right now. | 20:10 |
jeblair | probably the 'run playbooks with only roles defined up to that point' idea. | 20:12 |
*** jkilpatr has quit IRC | 20:25 | |
jeblair | mordred: a possible resolution may be to mirror the behavior of the implied run attribute (which does not get set in a project-pipeline variant). there's some... machinery... around that though. | 20:25 |
mordred | jeblair: it seems like a good idea to put it on the backburner and see if the subconscious comes up with anything | 20:29 |
mordred | pabelanger, jeblair: ok - SO - I think we're going to want to go json instead of yaml for the printing :) | 20:29 |
jeblair | mordred: ok. what's the problem? | 20:30 |
mordred | debug, which is one of the more important things to be able to print, has AnsibleBaseYAMLObject's in it | 20:30 |
jeblair | mordred: that can be marshalled to json? | 20:31 |
mordred | now - i'm sure there's a way to completely and accurately translate them such that yaml views them as a dict and not an object (I tried copying stuff - but various keys are AnsibleUnicode which are AnsibleBaseYAMLObject | 20:31 |
mordred | jeblair: it can - they ARE a dict | 20:31 |
mordred | jeblair: but yaml is too smart | 20:31 |
jeblair | mordred: gotcha | 20:31 |
mordred | so since it was a question originally anyway, and pabelanger's original version was json | 20:32 |
jeblair | mordred: yeah, it's probably a matter of setting custom representers or something, which seems like too much effort for this. | 20:32 |
mordred | why fight it | 20:32 |
mordred | jeblair: ++ exactly | 20:32 |
pabelanger | ++ | 20:32 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add play recap back in to the end of plays https://review.openstack.org/483043 | 20:36 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add spacing lines and formatting to result dicts https://review.openstack.org/483042 | 20:36 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Log execution phase and include information https://review.openstack.org/483044 | 20:36 |
jeblair | mordred, pabelanger, jlk: currently, we accumulate roles through the whole inheritance hierarchy, and then run all playbooks (pre, run, post) with all of those roles added. mordred requested that we do so only with the roles defined up to the point of playbook definition. iow, the pre-playbooks that a job inherits from the base job will only run with the roles that were defined on the base job. but the main playbook would run with a superset ... | 20:39 |
jeblair | ... of those since it was defined further down the inheritance hierarchy. | 20:39 |
jeblair | mordred, pabelanger, jlk: my question is this: | 20:39 |
jeblair | mordred, pabelanger, jlk: should the main playbook run with the roles defined at the point that the main playbook was defined, or should you be able to inherit from a job which defines a playbook, and have the child add roles without defining its own playbook. essentially, this would only be useful to override a previously defined role in the main job playbook. | 20:41 |
jeblair | er, i forgot the question mark, but you get the idea. | 20:41 |
*** jkilpatr has joined #zuul | 20:41 | |
jeblair | i lean toward saying that we should use the role list from the final job, because otherwise, what's the point of adding to the roles there? it'd be like we're ignoring configuration from the user. | 20:42 |
jeblair | it just makes the rules for the main playbook a little bit more different than the rules for pre and post playbooks. | 20:42 |
mordred | jeblair: yah | 20:43 |
jeblair | (i can mock up an example if any of that isn't clear) | 20:43 |
mordred | jeblair: so ... I buy your argument that we should execute the playbook for a job with the roles defined up through the job, not just up through where the playbook is defined | 20:44 |
pabelanger | jeblair: yes, I think a mock would help me | 20:44 |
jeblair | pabelanger: https://etherpad.openstack.org/p/b7TQRK74ka | 20:44 |
mordred | jeblair: I think my only reluctance would be that it's maybe complex in service of a thing that may not be super frequent? | 20:45 |
jlk | holy backscroll | 20:46 |
jeblair | mordred: as long as we're doing playbook context roles at all, the difference in implementation between option A (main playbook has final role list) and option B (main playbook has role list from when it was defined) is small i think. | 20:50 |
jeblair | pabelanger: mackup completed. does that make sense? | 20:50 |
pabelanger | jeblair: okay, now I see what you are saying about option b. my-unittests-roles, when would they actually be used then? | 20:50 |
jeblair | pabelanger: never | 20:50 |
jeblair | pabelanger: it sounds pretty ridiculous when put like that doesn't it? | 20:50 |
pabelanger | ya | 20:50 |
pabelanger | :) | 20:50 |
jeblair | pabelanger: the only reason i can see to do option B is so that we can write "playbooks always run with the roles path that the job had at the point in the inheritance hierarchy where the playbook was specified" in the docs. | 20:51 |
pabelanger | so, with that in mind, option A does make the most sense. But B would be nice if we wanted to block jobs from having roles for some reason | 20:52 |
jeblair | pabelanger: as opposed to option A, which would be "pre and post playbooks run with the roles path that the job had at the point where they were specified; the main playbook will run with the final roles path of the job." | 20:52 |
pabelanger | right | 20:53 |
mordred | jeblair: yah - but pre and post playbooks are already different, becaue they onion rather than override | 20:53 |
mordred | so I think that's fine | 20:53 |
jeblair | i think one of the main concerns we want to address is that we don't allow people to hijack jobs by substituting malicious roles. | 20:57 |
jeblair | because most of this is in service of reversing the role order so that you (a) can override roles, but (b) it's safe | 20:57 |
jeblair | so option A lets people override roles for the main playbook of a job. | 20:58 |
jeblair | if that job was defined in a trusted repo, then it could alter the behavior of the trusted job. | 20:58 |
jeblair | we have some protections against that -- if the job has secrets, it's automatically marked final, which means that adding new roles is disallowed. | 20:59 |
pabelanger | another thing I keep thinking about to is how something like requirements.yaml file for ansible-galaxy might complicate some of this. Assuming an existing deployment has already built up their requirements file for roles. Did we ever decide to have zuulv3 support installing roles from it? | 20:59 |
jeblair | pabelanger: we decided not to do anything with that for now. | 21:00 |
pabelanger | jeblair: ack. As for the current topic, I'll defer to both you and mordred on the best approach, but agree that option B doesn't seem to make sense from a configuration POV and potentially confusing to explain to end users why their role wasn't picked up | 21:03 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Log execution phase and include information https://review.openstack.org/483044 | 21:03 |
jeblair | mordred, pabelanger, jlk: so is the 'final' attribute sufficient protection here? should we expect people to know that a playbook defined in a trusted repo could have its behavior altered by a child job unless it's marked as final? this is a little hard to reason about because i don't have any examples at hand of a job we would want to have that would be affected by this (in a trusted repo without secrets) | 21:04 |
jeblair | that's starting to make me look at option B somewhat more favorably. | 21:04 |
jlk | The behavior would be altered because a role would be replaced further down the stack? | 21:05 |
Shrews | jlk: ohai! i haz question | 21:05 |
jeblair | jlk: yes, the proposal is to have child jobs add new roles to the front of the role path so they'd have precedence. | 21:05 |
jlk | hrm. Is there a pattern you're trying to replicate? | 21:06 |
jeblair | jlk: (that's mostly desired so that as we build up inheritance chains, roles from parent jobs don't end up messing up child jobs) | 21:06 |
jeblair | jlk: not that i'm aware of. :) | 21:07 |
Shrews | jlk: Dockerfile-zuul-fedora installs zuul from github, but we mount ZUUL_SRC in the container and the container somehow uses that instead. Where is the bit of magic that makes that happen? It eludes me... | 21:08 |
jlk | Shrews: pip install -e | 21:09 |
mordred | jlk: yah - it's mostly that when writing a job in a repo and making a role in that repo - it was then weird to have a role defined in a base job be the one that ansible found | 21:09 |
jlk | ( I recently learned of that magic myself. It creates an egg link back to the source dir, so you can modify the source dir and restart the app without having to re-install to get your changes ) | 21:09 |
Shrews | jlk: where are you modifying the link? | 21:10 |
jlk | pip install -e MAKEs the link | 21:10 |
Shrews | jlk: right, and that *should* point to the github checkout, yes? | 21:10 |
jlk | Shrews: it points to the github checkout, which is in /zuul/ | 21:11 |
jlk | the devel.yaml file will mount your local filesystem checkout of zuul on top of that inside the container so that it masks the content put there during the build phase | 21:12 |
pabelanger | mordred: does that exist today some place in our setup? Curious to look at it more | 21:12 |
jlk | mordred: so the base problem here is that there is role name collision. | 21:13 |
jeblair | pabelanger: well, it did at least temporarily when we were moving things around | 21:13 |
jlk | and "first found" was winning | 21:13 |
jeblair | jlk: yes | 21:13 |
jeblair | i'm going to tell a little white lie now: | 21:14 |
Shrews | jlk: oh, so the 'git clone' from the Dockerfile is happening at the / level. | 21:14 |
mordred | jlk: yes - and first found was the opposite of my expectations | 21:14 |
Shrews | that may have been the piece i was missing | 21:14 |
jeblair | the current behavior is that the jobs run all of their playbooks with all of the roles accumulated through the entire hierarchy, and the roles are in the order the appear in that hierarchy, meaning that the base roles came before the later job roles. | 21:15 |
mordred | jlk: since I was hacking on a playbook and a role in my local repo and had no idea there was a rle named something else used for a different playbook defined in a base job | 21:15 |
*** lennyb has quit IRC | 21:15 | |
jeblair | *(the white lie is that *actually* our roles come from an unordered set so it's random. but it mostly ends up that way, and if we decide to stick with the status quo, we would at least fix it so it behaved deterministically like that. so let's call that the current state) | 21:15 |
mordred | fair :) | 21:16 |
pabelanger | ah :) | 21:16 |
jeblair | so yeah, morded was correctly surprised that he wasn't able to use the role his job was expecting. i think pushing roles onto the stack so that child job roles are found first will be least surprising to users. | 21:17 |
jlk | that does seem least surprising | 21:17 |
mordred | ++ | 21:17 |
jeblair | however, in order to make that change, we also need to make it so that users can't hijack playbooks by supplying malicious roles. | 21:17 |
* SpamapS grumbles about namespaces | 21:17 | |
SpamapS | Feels like we're repeating PHP history here | 21:17 |
jlk | I'm unclear on how surprising it would be to have a playbook in my repo call a role (that I forgot to add) and it picks up a role from some other (random) repository | 21:18 |
jlk | feels, icky to me | 21:18 |
SpamapS | Yeah | 21:18 |
SpamapS | I kind of want explicit composition somehow | 21:19 |
SpamapS | import zuul_roles | 21:19 |
jlk | Do we have any immediate use cases for inheriting roles themselves? | 21:20 |
jlk | the role, not a job | 21:20 |
jeblair | jlk: well, it's a repository in the inheritance chain for your job; so in practice, either a role repository *related* to your job (eg, devstack-gate roles), or something the site admin thought should be available to all jobs. of course, that probably includes zuul-jobs, so there will be quite a few roles there. | 21:20 |
jlk | like, defining a new playbook but expecting to use roles from somewhere else, not in the way that Ansible already does it? | 21:20 |
mordred | jlk: ifyou want to make a job that runs tox on a specific environment | 21:22 |
mordred | jlk: you can inherit from the tox job | 21:22 |
mordred | jlk: and that'll get everything se tup for you in pre-playbooks | 21:23 |
jeblair | jlk: yes -- if i were defining a new simple job, i would expect to simply inherit from base and have all of the standard site roles available. similarly, a variant on the devstack job. or mordred's example. | 21:23 |
mordred | but then you might want to make a new run playbook - and expect to be able to use the tox role that the tox job used same as the tox job does since you inherited from the tox job | 21:23 |
pabelanger | base/pre job also works like that today. add-build-sshkey is located in openstack-zuul-roles | 21:23 |
jlk | but why invent a new way to get at these roles? | 21:23 |
mordred | jlk: what do you mean? | 21:24 |
jlk | Ansible already has a way to share roles | 21:24 |
mordred | jlk: wrong context | 21:24 |
mordred | jlk: we need zuul to be able to put speculative versions of roles in place on the executor | 21:24 |
jeblair | jlk: that, and this is the way we set roles_path in ansible.cfg. | 21:24 |
mordred | so that one can propose a change to a role that is part of a job definition and have the job execute with it correctly in place | 21:24 |
mordred | so - if you just want a galaxy role and you don't need speculative execution - you just add it to the job's roles list and you're good to go | 21:25 |
jlk | this is... complicated. | 21:27 |
*** lennyb has joined #zuul | 21:27 | |
pabelanger | Ya, it is a tricky question. If you were to run ansible-playbook mutliple time on your local system against multiple playbooks, would you dump all the role for all playbooks it a directory, or make updates to the role path between runs | 21:27 |
jeblair | for more background, skip to "Because the content of roles" in http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html#ansible | 21:27 |
mordred | so - to be clear - I belive this is mostly going to come up for us and not for 'normal' users | 21:28 |
mordred | I expect 'normal' users to have playbooks and rles in their one repo and possibly some galaxy roles if they need those - and I _don't_ expect it to be frequent that a normal dev is going to be trying to do complex things with ansible AND the zuul job inheritance structure | 21:29 |
mordred | but for those of us working on base jobs and re-sharable jobs and running zuuls, this construct is going to be much more frequent because we're down in the muck of designing jobs for composition and reuse | 21:30 |
jeblair | yes, this should be invisible. the reason we're talking about it is because it momentarily wasn't. :) | 21:30 |
mordred | yup. we _want_ theuser to be explicit about the roles they want to use - we just want them to actually get those roles when they are :) | 21:31 |
pabelanger | we could also better namespace our roles, such as repo.rolename in our playbooks. eg: zuul-jobs.bindep to help people looking at jobs better understand where a role is coming from. | 21:32 |
*** dkranz has quit IRC | 21:32 | |
mordred | pabelanger: yah - I think we could certainly actually have our use in our base jobs adopt a stricter and cleaner approach lke that | 21:34 |
mordred | pabelanger: and that's likely not a terrible idea | 21:34 |
jeblair | yeah, that seems like a good idea. i'm not sure we should have zuul require or enforce that, since ansible doesn't. but we can certainly set up our roles that way. | 21:35 |
mordred | jeblair: ++ | 21:36 |
mordred | jeblair: I do not think we sould have zuul require or enforce that - but I think we should choose to do that ourselves | 21:36 |
pabelanger | agree | 21:37 |
jeblair | back to the question i posed -- i'm now leaning toward option B because i think it's safer. while it's true that the final attribute gets us a lot of safety, i'm starting to view things as basically roles go hand-in-hand with the playbook content. so it would be surprising for someone who authored a job in a trusted repo to have that job run with different roles than they specified (even if the job had no secrets) | 21:39 |
jeblair | i think i'm going to proceed with that for now. it should be easy to change if we don't like it. | 21:40 |
jlk | is option B defined somewhere? I haven't read all 400 lines of scrollback :D | 21:41 |
jeblair | jlk: yeah, i made a mockup here: https://etherpad.openstack.org/p/b7TQRK74ka | 21:41 |
pabelanger | ya, I am also a little confused on 'final' attribute myself | 21:41 |
mordred | I think option b is more in line with an expectation that things be explicit | 21:43 |
jeblair | pabelanger: 'final' means nothing which can affect what actually gets executed by a job can be changed by a variant of that job. (we might also apply it to inheritance, but we haven't yet). | 21:43 |
mordred | doing a feels like we're inventing a whole new method of executing ansible, which isn't our intent | 21:43 |
jeblair | pabelanger: for example the pypi-upload job will be a final job. anyone can run it and add it to their own project and specify under what circumstances it should run, but they can't change what it does. | 21:44 |
mordred | whereas b is basically "I have a playbook and some roles, please to run the playbook with the roles" | 21:44 |
mordred | long story short- I agree, I think B is the better choice | 21:44 |
pabelanger | jeblair: understood, by looking the etherpad, is my-unittests the final job in this context? | 21:45 |
jeblair | pabelanger: none of these jobs are final | 21:46 |
pabelanger | okay, I am still slightly confused, but that is okay. I'll hold out for the patchset and look again | 21:47 |
mordred | I think B basicaly says that doing the my-unittests job like that is essentially a silly thing | 21:47 |
jeblair | yeah, the only reason to write that job would be to further inherit from it. | 21:48 |
mordred | that unittests is a playbook defined in the unittests job and where it was defined it said "use unittests-roles" | 21:48 |
mordred | and if someone wants to make a my-unittests that uses their own roles, they can do that - they just need to also define a playbook that does so | 21:49 |
jeblair | mordred: yep | 21:49 |
mordred | and then they'll be in a position of "I defined a playbook and some roles and used them" and nothing is extra magic | 21:49 |
mordred | actually - lemme add that real quick ... | 21:49 |
pabelanger | okay, I think I am understand more now. my-unittests is missing the run section | 21:50 |
* jlk watches the pad get updated | 21:51 | |
pabelanger | mordred: line 18 would be valid right? | 21:51 |
mordred | pabelanger: sorry - which line? | 21:52 |
pabelanger | mordred: job on line 18, that to me is valid today | 21:52 |
* mordred thinks this is fun - and have become convinced that my-unittests is just a bad idea and not supporting it is correct | 21:53 | |
pabelanger | line 13, I haven't see before | 21:53 |
pabelanger | and not sure we are using it | 21:53 |
mordred | pabelanger: yes - job on line 18 is supported and what we want | 21:53 |
mordred | exactly | 21:53 |
pabelanger | ya, okay now I understand | 21:53 |
mordred | it's a theoretical possibility- and I think we've discovered that it's not a good theoretical possibility | 21:53 |
jeblair | job on line 13 is silly, but it's the crux of the question -- what do we do for that case. :) | 21:53 |
mordred | yup | 21:53 |
pabelanger | jeblair: ya, I am having a hard time thinking of why we'd want to do that | 21:54 |
pabelanger | over what mordred just did on line 18 | 21:54 |
jeblair | this will cause us to have per-playbook ansible config files, rather than just two (untrusted/trusted). | 21:55 |
jeblair | that's actually going to be the bulk of the change. the model/config stuff is easy. :) | 21:55 |
mordred | jeblair: ++ | 21:55 |
pabelanger | jeblair: ya, I actually think I like that better | 21:56 |
pabelanger | ++ | 21:56 |
jlk | yeah I like the chunk on 18 better. Less magic | 21:58 |
mordred | ++ | 21:58 |
jlk | You get the roles from the parent, and you can use your localized playbook to alter how those roles behave. | 21:59 |
pabelanger | Ya, I am kinda surprised that run isn't a required attribute for our jobs | 21:59 |
jlk | which you should be able to do speculatively, right? | 21:59 |
mordred | jlk: yup! | 21:59 |
jlk | hyperagreement! | 21:59 |
mordred | pabelanger: well - it can't be - because of base jobs - but I hear you | 21:59 |
jeblair | pabelanger: the 'my-unittests' job without the roles attribute but with, say, a vars attribute makes perfect sense. in fact, you've written such a job. but it still doesn't have 'run' because it doesn't need it. | 22:02 |
pabelanger | Ya, that is true | 22:03 |
jlk | FYI I'm out for two weeks on vaca starting Monday | 22:26 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Correct run job parameter documentation https://review.openstack.org/482977 | 22:28 |
pabelanger | jlk: enjoy! | 22:32 |
jlk | I plan to. 9 days of beach fun on Kauai | 22:33 |
mordred | jlk: sounds lovely! I was there in December and had a great time | 22:36 |
jlk | cool! My mother lived there during the 90s and I spent 4 summers there with her. Was good times. None of my family has been. | 22:37 |
jamielennox | Shrews, others: i had this patch for nodepool to remove the secure file default cause if the file doesn't exist it crashes: https://review.openstack.org/#/c/480310/ | 23:01 |
jamielennox | i was going to just remove the whole concept of a secure file because it's unused, but there's a few lines in the doc that imply it's coming back | 23:01 |
jamielennox | thoughts? | 23:01 |
pabelanger | Ya, I think we want it for zookeeper auth | 23:01 |
pabelanger | which we haven't added it | 23:02 |
jeblair | i'm thinking we should move all the config which isn't providers or images into the other file (and probably rename it) | 23:03 |
jeblair | that gives it the same sort of site-administration vs content split we have with zuul | 23:04 |
jamielennox | so i saw a proposal for a conf.d, wouldn't supporting that allow you to split secure/config without specifying exactly the two files | 23:04 |
SpamapS | jlk: zomg I love Kauai | 23:04 |
jlk | We'll be staying on Poipu beach. Kiahuna Plantation Resort. | 23:05 |
SpamapS | Tahiti Nui in Hanalei for karaoke.. it's mostly local people. :-D | 23:05 |
jeblair | jamielennox: maybe so? | 23:05 |
SpamapS | jlk: that's where I stayed, but in a rented house. | 23:06 |
jlk | oh, I don't do karaoke :( | 23:06 |
SpamapS | saright bra | 23:06 |
SpamapS | Also I can't remember the name of the cheesy * 10 luau I went to.. but it was ermahgerd goodbad. | 23:07 |
jlk | that sounds great! | 23:09 |
jamielennox | i am a fan though of splitting the provider/images type config away from the zookeeper and dir related stuff | 23:09 |
SpamapS | Smith Family Garden Luau | 23:10 |
SpamapS | that one | 23:10 |
jamielennox | as i can see wanting that info in a more publc project-config style repo than just the straight up config options | 23:11 |
jlk | SpamapS: you had me at "open bar" | 23:12 |
jamielennox | (can always make that work now, but as a recommended deployment style) | 23:12 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!