*** elyezer_ has quit IRC | 00:35 | |
*** elyezer_ has joined #zuul | 00:53 | |
*** openstackgerrit has quit IRC | 01:49 | |
*** Diabelko has quit IRC | 02:16 | |
*** Diabelko has joined #zuul | 02:16 | |
*** openstackgerrit has joined #zuul | 02:49 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: dashboard: fix multi-tenant navigation bar https://review.openstack.org/579418 | 02:49 |
---|---|---|
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: angular: call enableProdMode https://review.openstack.org/573494 | 03:30 |
tristanC | mordred: the angular routing fixes you merged are working well with our multi-tenant setup and '/zuul/' path prefix, though I still need 579418 and 573494 | 03:32 |
ianw | corvus: re prior message on zuul-announce deprecation mail -- I've sent v2 of that mail now (cc'd to discuss list) http://lists.zuul-ci.org/pipermail/zuul-discuss/2018-July/000460.html | 04:20 |
ianw | i think that covers the (what turned out to be rather complex, at least a lot of small, interconnected parts) situation now | 04:20 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: model: fix AttributeError exception in freezeJobGraph https://review.openstack.org/579428 | 04:23 |
tristanC | corvus: did zuul.openstack.org scheduler got restarted with the "Don't add implied branch matchers to project-pipeline variants" change? | 04:24 |
tristanC | corvus: it seems like it introduced a new exception for projects without a pipeline configuration, see 579428 | 04:28 |
tristanC | well it doesn't seems like a behavior change since projects without configuration aren't running jobs anyway... | 04:39 |
*** Rohaan has joined #zuul | 04:39 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: executor: change execution log to INFO https://review.openstack.org/578704 | 04:56 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: Use readthedocs webhook to trigger build https://review.openstack.org/579434 | 04:56 |
*** swest has joined #zuul | 04:59 | |
*** nchakrab has joined #zuul | 05:17 | |
*** nchakrab_ has joined #zuul | 05:18 | |
*** nchakrab has quit IRC | 05:22 | |
*** nchakrab_ has quit IRC | 05:28 | |
*** Rohaan has quit IRC | 05:36 | |
*** Rohaan has joined #zuul | 05:40 | |
*** nchakrab has joined #zuul | 06:00 | |
*** nchakrab has quit IRC | 06:01 | |
*** nchakrab has joined #zuul | 06:01 | |
*** dmellado has joined #zuul | 06:23 | |
*** threestrands has joined #zuul | 06:33 | |
*** threestrands has quit IRC | 06:33 | |
*** threestrands has joined #zuul | 06:33 | |
*** threestrands has quit IRC | 06:37 | |
*** gtema has joined #zuul | 07:08 | |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-sphinx master: Open role readme files in utf-8 mode https://review.openstack.org/579474 | 07:30 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: Use readthedocs webhook to trigger build https://review.openstack.org/579434 | 07:32 |
*** hashar has joined #zuul | 07:39 | |
*** jpena|off is now known as jpena | 07:44 | |
tobiash | tristanC: do you use html rewriting for running zuul on /zuul/ or is that not necessary anymore? | 07:57 |
tobiash | I just tried the current master in my test environment and it breaks my setup | 07:59 |
tristanC | tobiash: not sure what you mean by html rewriting, but our httpd configuration is from https://softwarefactory-project.io/cgit/software-factory/sf-config/tree/ansible/roles/sf-gateway/templates/gateway.common.j2#n147 to L183 | 08:05 |
tobiash | tristanC: our proxy in front of zuul rewrites html and adds /zuul/ to the links. This was a workaround in the early days and I guess that's broken now. | 08:06 |
tristanC | tobiash: oh no need to do that, the current code in master already handle sub-path | 08:07 |
tobiash | tristanC: ah thanks, do I need to condigure that somewhere? I didn't find anything in the docs/ | 08:08 |
tristanC | tobiash: i don't think so, just serve the html files from any path, and the client will figure out the api/ location | 08:09 |
tobiash | tristanC: ah thanks, will try | 08:10 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: Use readthedocs webhook to trigger build https://review.openstack.org/579434 | 08:23 |
tobiash | tristanC: does zuul_web_url contain /zuul/ in your case? | 08:40 |
tobiash | https://softwarefactory-project.io/cgit/software-factory/sf-config/tree/ansible/roles/sf-gateway/templates/gateway.common.j2#n181 | 08:40 |
tobiash | tristanC: I've added 579418 and 573494 and proxy /zuul/ to / and cannot get it working | 08:40 |
tobiash | it seems to default to the single tenant 'mode' in my case | 08:41 |
tobiash | I don't see any call to info | 08:45 |
tristanC | tobiash: zuul_web_url points directly at the zuul-web endpoint | 08:46 |
tobiash | tristanC: so you also proxy /zuu/ to / | 08:46 |
tristanC | tobiash: oh yes, we need to load "/zuul/t/$tenant_name/status.html" | 08:47 |
tobiash | I get 'https://cc-dev1-ci.bmwgroup.net/zuul/api/status: Not Found' when loading /zuul/ | 08:47 |
tristanC | tobiash: here is the updated httpd configuration so that GET on /zuul/ directly load the status page: https://softwarefactory-project.io/r/#/c/12828/3/ansible/roles/sf-gateway/templates/gateway.common.j2 | 08:47 |
tristanC | tobiash: what is the referer of that /zuul/api/status call? | 08:48 |
tobiash | referrer is https://cc-dev1-ci.bmwgroup.net/zuul/ | 08:48 |
tristanC | and can you try to load "/zuul/t/$tenant_name/status.html" ? | 08:49 |
tristanC | (and replace $tenant_name witha valid tenant...) | 08:49 |
tobiash | at that point I don't have a tenant as this is a multi tenant deployment | 08:49 |
tristanC | with 579418, you could also try to load /zuul/t/tenants.html | 08:50 |
tobiash | so https://cc-dev1-ci.bmwgroup.net/zuul/ should get me the tenant list | 08:50 |
tristanC | yes, tenant list isn't working without 579418 | 08:50 |
tristanC | and i couldn't figure out how to make the routing load the tenants component when accessing "/zuul/" directly | 08:50 |
tobiash | I already included 579418 in my deployment | 08:51 |
tobiash | ah so that's an unresolved problem | 08:51 |
tristanC | then redirect ^/zuul/$ to /zuul/t/tenants.html | 08:51 |
tobiash | https://cc-dev1-ci.bmwgroup.net/zuul/t/tenants.html shows me the same error message | 08:52 |
tobiash | and also the layout of that site tells me that this is not the tenant overview but a white labeled status | 08:52 |
tobiash | ok https://cc-dev1-ci.bmwgroup.net/zuul/tenants.html seems to work for the tenant overview | 08:54 |
tristanC | hum, weird, you sure you got https://review.openstack.org/#/c/579418/1/web/app-routing.module.ts ? | 08:55 |
tobiash | double checking | 08:55 |
tobiash | hm, weird, that's missing have to rebuild | 08:57 |
*** electrofelix has joined #zuul | 08:57 | |
tobiash | ups, built the wrong branch | 08:57 |
tobiash | ah ok, I think I understand now so the routing seems to be static and at / we cannot judge if we need the StatusComponent or the TenantsComponent | 09:04 |
tristanC | hum, can someone confirm if it's the correct behavior that a parent git repo change update doesn't dequeue the child git repo change? | 09:12 |
tobiash | tristanC: do they share the same queue? | 09:13 |
tristanC | tobiash: yes, but no depends-on in commit message | 09:13 |
tobiash | tristanC: gerrit or github? | 09:14 |
tobiash | and check or gate? | 09:14 |
tristanC | just did a test in openstack-dev/sandbox, and 579497,1 depends on 579496,1, and 579496,2 didn't dequeue 497 | 09:14 |
tristanC | tobiash: check with gerrit | 09:14 |
tobiash | in gate the update definitely needs to dequeue | 09:14 |
tobiash | in check that's normal behavior afaik | 09:15 |
tristanC | tobiash: arg, I just spent a few hours trying to understand why it didn't happen, for some reason i thought it should :-) | 09:16 |
tristanC | what's suprising is that if the child change has an explicit depends-on in the commit message, then the independent manager does the dequeue | 09:16 |
tristanC | and it seems like we rely on the "neededBy" gerrit schema field, but this doesn't get returned for non current patchset | 09:17 |
*** spsurya_ has joined #zuul | 09:40 | |
*** sshnaidm|off is now known as sshnaidm | 09:40 | |
*** nchakrab_ has joined #zuul | 10:08 | |
*** nchakrab has quit IRC | 10:09 | |
*** nchakrab has joined #zuul | 10:10 | |
*** nchakrab_ has quit IRC | 10:13 | |
tobiash | tristanC: ok, after rebuild it seems to work with zuul/t/tenants.html | 10:19 |
tobiash | tristanC: what's 573494 needed for? | 10:20 |
tristanC | tobiash: i think it's rather cosmetic, as far as i can tell it removes console.log, but the enableProdMode may have other benefit? | 10:21 |
tristanC | tobiash: it would be nice to merge 579418 and tag zuul-3.1.1 | 10:22 |
tobiash | tristanC: we still need an upgrade notice as /zuul/ to get the tenant list is not working anymore | 10:24 |
tobiash | or we need to fix that (no idea how) and then tag 3.1.1 | 10:24 |
tobiash | but at least right now a multi tenant setup on /zuul/ is *not* possible without a rewrite rule in a proxy to change /zuul/ to /zuul/t/tenants.html | 10:26 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Dynamically determine overlay network mtu https://review.openstack.org/578153 | 10:28 |
tobiash | so tbh I would even treat this as a blocker for 3.1.1 (given that 3.1.0 doesn't contain the angular refactor) | 10:28 |
tobiash | tristanC, mordred, corvus: what do you think? ^ | 10:28 |
*** Rohaan___ has joined #zuul | 10:34 | |
*** Rohaan has quit IRC | 10:34 | |
*** Rohaan___ is now known as Rohaan | 10:34 | |
*** jpena is now known as jpena|lunch | 11:30 | |
*** Rohaan has quit IRC | 11:52 | |
*** Rohaan has joined #zuul | 12:02 | |
*** rlandy has joined #zuul | 12:25 | |
*** jpena|lunch is now known as jpena | 12:28 | |
*** Rohaan has quit IRC | 12:45 | |
*** jpena is now known as jpena|off | 12:52 | |
*** nchakrab_ has joined #zuul | 13:02 | |
*** nchakrab has quit IRC | 13:06 | |
*** elyezer_ has quit IRC | 13:10 | |
*** dkranz has quit IRC | 13:14 | |
*** elyezer_ has joined #zuul | 13:29 | |
*** gtema has quit IRC | 13:35 | |
rcarrillocruz | mrhillsman: heya, around? looking at how you handle secrets on https://github.com/theopenlab/openlab-zuul-jobs . I take openlab-zuul-jobs is a config repo in your zuul tenant installation ? | 13:47 |
rcarrillocruz | my understanding is that secrets cannot be used in untrusted projects unless they are run on post-review queues | 13:48 |
rcarrillocruz | so here's the use case. I have a role that interacts with 3rd party clouds (be aws, or openstack, azure, etc) to create vpn tunnels. I'm thinking on putting the cloud provider creds as zuul secrets. Now, what I would like to avoid is to by mistake having someone a debug statement or something showing up the creds in the job logs. I'm thinking maybe it makes sense to create check/post with post-review just for | 13:53 |
rcarrillocruz | these kind of jobs, instead of using the general check/gate wich are pre-review by default | 13:53 |
rcarrillocruz | that at least allows us to run the jobs POST review , so we can see if someone is putting an unadverted debug statement or the likes | 13:53 |
mrhillsman | o/ | 13:57 |
mrhillsman | openlab-zuul-jobs is a trusted project/repo | 13:57 |
rcarrillocruz | what i thought, so that repo in your case is a mix of project-config/openstack-zuul-jobs right? | 13:58 |
mrhillsman | i am not sure we are using any project-config jobs but yes | 13:59 |
rcarrillocruz | oki | 13:59 |
rcarrillocruz | so you don't use any in-repo jobs that handle secrets, as that's a config repo that's the one containing jobs and secrets | 13:59 |
rcarrillocruz | gotcha | 14:00 |
rcarrillocruz | i'm wondering about pros-cons for my scenario: all jobs secrets put on project-config vs in-repo jobs containing secrets run on post-review queues | 14:00 |
mrhillsman | yes, we handle the secrets via roles in that repo | 14:00 |
mrhillsman | not 100% sure we are handling them the right way but we have not heard of any issues just yet with a few exceptions along the way | 14:01 |
pabelanger | rcarrillocruz: i am not sure I would enable post-review on check pipelines, I think it would be too easy to leak credentials. You are right, you can create a new parent job, keep it in project-config repo, then have jobs parent to it to setup vpn. Secrets needs to be next to playbooks | 14:02 |
rcarrillocruz | otoh... if i have a job in config project that is 'run role test.yaml' i'm always worried the test may contain something bad against localhost, that could harm executor | 14:03 |
rcarrillocruz | dunno | 14:03 |
mrhillsman | but you control the jobs there | 14:03 |
pabelanger | will be run in bwrap on executor, but yes. | 14:03 |
mrhillsman | so you should be reviewing the code | 14:03 |
mrhillsman | s/code/jobs | 14:03 |
*** dkranz has joined #zuul | 14:04 | |
mrhillsman | not sure if you know this, but unlike non-config projects, PRs to config projects are not run as part of jobs until they merge | 14:04 |
rcarrillocruz | gah, call... will get back to you in a few folks, thx | 14:08 |
*** nchakrab_ has quit IRC | 14:13 | |
Shrews | mordred: seems the changes to replace shade/occ in nodepool are ready? probably should pick a time/day to merge those then restart builders/launchers when we can be around to watch that. | 14:17 |
Shrews | that's not something i want to just merge then give infra-root a surprise when they need to restart for some reason :) | 14:18 |
Shrews | not that i expect issues :) | 14:18 |
openstackgerrit | Merged openstack-infra/nodepool master: Publish docs on release https://review.openstack.org/579300 | 14:19 |
openstackgerrit | Merged openstack-infra/nodepool master: Handle node no longer in pool error https://review.openstack.org/579309 | 14:19 |
*** jpena|off is now known as jpena | 14:31 | |
*** dkranz has quit IRC | 15:02 | |
clarkb | Shrews: the changes just change the import location to start right? we are still using the shade bits of the sdk repo? | 15:24 |
clarkb | (fwiw I think that is probably the right "level" of api consumption for nodepool considering that shade grew out of nodepool) | 15:25 |
Shrews | clarkb: yeah, that's right. same shade-style calls | 15:27 |
corvus | tobiash: test cases which just check that an exception handler did or didn't log something are difficult to write and fragile -- i'd prefer to focus on testing behavior. would you revisit your -1 on 579428 in that light? | 15:37 |
*** nchakrab has joined #zuul | 15:40 | |
*** dkranz has joined #zuul | 15:45 | |
*** nchakrab has quit IRC | 15:47 | |
tobiash | corvus: it looked to me like fixing an error (aka behavior change) | 15:48 |
tobiash | If it's just silencing logging I agree that there is no need for a test | 15:49 |
tobiash | corvus: Revisited my vote after a second look | 15:50 |
corvus | tobiash: yeah, it's at least not a testable behavior change (either way, nothing gets enqueued) | 15:51 |
corvus | it just skips a bunch of noop actions by going to the exception handler | 15:52 |
*** weshay is now known as weshay|ruck | 15:56 | |
*** gtema has joined #zuul | 16:03 | |
*** bhavik1 has joined #zuul | 16:22 | |
*** bhavik1 has quit IRC | 16:46 | |
*** jpena is now known as jpena|off | 16:58 | |
*** electrofelix has quit IRC | 17:08 | |
tobiash | pabelanger, corvus: looks like the hdd sensor has a race | 17:17 |
tobiash | http://logs.openstack.org/28/579428/1/gate/tox-py36/fb8808e/testr_results.html.gz | 17:17 |
tobiash | test_status_page | 17:17 |
corvus | i have to run to an appointment; will be back in a few hours | 17:25 |
*** elyezer_ is now known as elyezer | 17:39 | |
pabelanger | tobiash: corvus: I'll look here shortly | 18:09 |
*** gtema has quit IRC | 18:22 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Fix for referencing cloud image by ID https://review.openstack.org/579664 | 18:48 |
Shrews | tobiash: you may want to take a peek at 579664 ^^^. Not sure what the whole dict(id=...) part was trying to do. | 18:49 |
tobiash | Shrews: looking | 18:54 |
tobiash | that dict(id=...) part looks kinda weird | 18:54 |
gundalow | As part of AnsibleFest we will be having a *TWO DAY* Contributor Summit. Please feel free to add items to the agenda https://etherpad.openstack.org/p/ansible-summit-october-2018 Signup and discount codes will be made available shortly (~week) via the ansible-devel email list. | 19:01 |
clarkb | gundalow: that bookends ansiblefest proper? | 19:02 |
gundalow | clarkb: haha | 19:04 |
gundalow | I like the bookends | 19:04 |
Shrews | tobiash: gah, release note for the fix... always forget those | 19:11 |
clarkb | gundalow: mostly making sure that I keep things like that in mind when booking travel | 19:12 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Fix for referencing cloud image by ID https://review.openstack.org/579664 | 19:13 |
gundalow | clarkb: Yup | 19:15 |
gundalow | Feel free to ping the agenda around anywhere else it might make sense | 19:15 |
openstackgerrit | Merged openstack-infra/zuul master: model: fix AttributeError exception in freezeJobGraph https://review.openstack.org/579428 | 19:16 |
rcarrillocruz | mrhillsman: i know, i mean ... | 19:23 |
rcarrillocruz | let me put an example | 19:24 |
rcarrillocruz | the tox job | 19:24 |
rcarrillocruz | is a config project | 19:24 |
rcarrillocruz | but that runs tox, which exercises the repo being tested | 19:24 |
rcarrillocruz | so imagine i have like a job that is 'run test.yaml' | 19:24 |
rcarrillocruz | if i push a change to the repo being tested | 19:24 |
rcarrillocruz | whjich does | 19:24 |
rcarrillocruz | debug: var=myzuulsecret | 19:25 |
rcarrillocruz | i guess that would be exposed in the job log, unless the job is previously reviewed in a queue with post-review true | 19:25 |
rcarrillocruz | ? | 19:25 |
clarkb | rcarrillocruz: zuul sould fail if you use a secret in a pre review situation | 19:29 |
rcarrillocruz | does it ? | 19:29 |
rcarrillocruz | THEN that's good | 19:29 |
rcarrillocruz | i'll try to repro tomorrow | 19:30 |
rcarrillocruz | by fail you mean zuul detects debug tasks or the likes | 19:31 |
rcarrillocruz | i.e. how does it know zuul when accessing a secret var is legit or bogus | 19:31 |
clarkb | rcarrillocruz: zuul does not let you use secrets in that situation and would be a bug | 19:32 |
clarkb | that situation being pre review testing with config changes that are not merged | 19:32 |
rcarrillocruz | yeah, but in this case it would not be a config change...the job is in config, like tox/ansible-playbook test.yaml on the tested repo | 19:33 |
clarkb | rcarrillocruz: you must explicitly allow the job to use the secret though | 19:33 |
clarkb | rcarrillocruz: if you write it to disk such that tox can read it then yes you have probably done something bad | 19:33 |
pabelanger | ran into this issue with RDO zuulv3 migration last week, for the purpose of finishing migration, we just added that specific to be trusted in config project repo. But ya, was a tox job | 20:36 |
corvus | rcarrillocruz, clarkb, pabelanger: there are a lot of ways to construct a job so that part of the job happens in a trusted environment with secrets and part happens in an untrusted environment (clarkb and pabelanger suggested some of those ways; there may be others). however, if you find it's not possible to construct the job you need like that, and you still want it to run pre-merge on untrusted code, then | 20:54 |
corvus | you might consider the option of a restricted check pipeline. a pipeline that has a requirement that a core reviewer leave a vote (or label) saying "it's okay to run sensitive check jobs on this change". | 20:54 |
pabelanger | +1 we'll likely do that in rdoproject in the next few days, once we unfreeze the config project. Just need to have a discussion first about it with everybody | 20:56 |
rcarrillocruz | it's not more a question as to how to structure a job. What it's not clear to me is by having a trusted job in a config repo that merely does 'run this entrypoint from checked out PR' can be protected from a malicious user to put a debug/print statement on the secret var | 20:56 |
rcarrillocruz | that's why i can only think of a pipeline with post-review true | 20:57 |
rcarrillocruz | corvus ^ ? | 20:57 |
rcarrillocruz | i may be overcautious, but trying to protect myself from getting a secret leaked out and a bot farm on an AWS account billed to our corp account :S | 20:58 |
corvus | rcarrillocruz: it's a little more nuanced than that. for instance, in openstack, we have jobs with access to secrets which run arbitrary code ("tox -e py35") but there's no way to access the secrets from the code that runs tox. that's because secrets attach to *playbooks* not jobs. | 20:58 |
corvus | rcarrillocruz: so, if the playbook which runs tests from the entrypoint is defined in a job with a secret, then that playbook (and therefore whatever code the 'run tests' entrypoint runs) has access to those secrets | 20:59 |
corvus | rcarrillocruz: it sounds like you might be writing a job like "use some aws credentials to test a proposed change to this ansible module". that sounds like the kind of job where the proposed change is the thing that needs access to the secrets. that probably can't be written safely to run pre-review. that kind of job probably needs a restricted check (post-review) pipeline. | 21:01 |
corvus | rcarrillocruz: however, a job like "i'm going to set up something in aws, then hand it to an ansible module with a proposed change" -- that could be written in two stages where only the first stage has access to the secrets. | 21:01 |
pabelanger | just remember to delete secrets from disk after vpn is created, so next jobs can't just echo that configuration | 21:03 |
rcarrillocruz | well, the idea would be to run ansible from an ephermeral node, so when is finished node is gone | 21:03 |
rcarrillocruz | corvus: yah, it's the primer. It's a role that 'create aws things with these creds'. It's not like i can prepare something in aws and act on it afterwards from proposed change | 21:04 |
corvus | i think that's probably the deciding question: does code in the proposed change need access to the secrets? if so, you probably need the job to be in a post-review pipeline. if not, you can probably (carefuly) construct it with multiple playbooks and run it pre-review. | 21:04 |
corvus | rcarrillocruz: yeah, then my guess is this one needs to be post-review. | 21:05 |
rcarrillocruz | so, check and gate 'restricted' | 21:05 |
rcarrillocruz | in the new multitenant world, i can have whatever pipelines right? | 21:05 |
rcarrillocruz | in my tenant i mean | 21:05 |
pabelanger | yes | 21:06 |
corvus | rcarrillocruz: yep. you need to make sure that folks know that they're reviewing the code for attempts to subvert the ci system :). so basically, don't +2/merge-label/whatever the change if it has "echo $PASSWORD" in the ci script. :) | 21:06 |
rcarrillocruz | heh | 21:06 |
rcarrillocruz | now | 21:06 |
rcarrillocruz | post-review in GH world | 21:06 |
rcarrillocruz | in my mind, post-review in gerrit means 'oh, you put +1 on the change, now we go run jobs' | 21:06 |
rcarrillocruz | how that translates in GH? | 21:07 |
corvus | rcarrillocruz: i'd do it with a label | 21:07 |
rcarrillocruz | approve is +1 | 21:07 |
rcarrillocruz | and mergeit is +A | 21:07 |
rcarrillocruz | we have mergeit label as +A on our tenant pipelines, 'approves' are like +2.. | 21:07 |
corvus | rcarrillocruz: either use whatever existing label means "i have reviewed this and think it's okay to merge", or add a new one that means "it's okay to run the real-cloud tests on this pr" | 21:08 |
rcarrillocruz | i wonder if i could make approves the o-k for post-review | 21:08 |
rcarrillocruz | so it has to be a label ? | 21:08 |
corvus | rcarrillocruz: 'approve' sounds reasonable, or if folks don't like it, add a new one | 21:08 |
rcarrillocruz | ok, thx folks | 21:08 |
rcarrillocruz | much appreciated | 21:08 |
corvus | rcarrillocruz: there are probably options other than label, but i think that's the best | 21:08 |
corvus | (you might be able to do "comment left by project contributor" or something) | 21:09 |
corvus | but i think labels were made for this | 21:09 |
corvus | Shrews, tobiash: i wrote the code updated by 579664 in I8411c627f9136339d1b0eb35632d6b2a27ab7a81 | 21:13 |
corvus | let's see if we can reconcile that before we merge 664 | 21:14 |
Shrews | corvus: oh? my 'git log' foo must have failed. i thought tobiash wrote it | 21:14 |
corvus | oh, hrm, it goes back further than that :) | 21:15 |
corvus | (that's definitely a relevant commit) | 21:16 |
Shrews | corvus: it seemed to be an untested path | 21:16 |
corvus | i wrote the previous incarnation too: I83f2902be4b9b73a949461b7f14da548066b9562 | 21:17 |
corvus | so the way i understand it (through the mists of time) is that if we don't pass in a dict, shade would treat it as a name. using the "dict(id=" lets us specify it's really an id. | 21:20 |
Shrews | corvus: except sending id=BLAH to get_image would never work there | 21:22 |
Shrews | the param is name_or_id | 21:22 |
Shrews | and there is a get_image_by_id for the other thing | 21:23 |
Shrews | so i don't think this ever really worked? | 21:23 |
Shrews | oh, yes it did | 21:24 |
Shrews | if the thing we pass has an 'id' attr, it dtrt | 21:25 |
Shrews | sneaky use of the shade api there | 21:25 |
corvus | at the time it was written, it would have passed "image=dict(id=UUID)" | 21:25 |
Shrews | yeah. we probably should have documented what was happening there | 21:26 |
Shrews | ok, so my fix isn't what we really want. will have to look more closely at why it could be failing | 21:27 |
Shrews | corvus: thx | 21:27 |
corvus | Shrews: np; thank you :) | 21:28 |
Shrews | the shade authors were either way too clever, or way to drunk | 21:29 |
corvus | i'm looking at the bug report, and i assume the first example is a type (it should say "image-id" instead of image-name there?) | 21:29 |
Shrews | corvus: yeah, that's what i assumed. my included test produced the same error message before the fix | 21:30 |
corvus | Shrews: apparently the api is "call the method with these args and a comment" :) | 21:30 |
Shrews | it was one of those things were we discovered after the fact that if we had the id, it would be a lot less expensive to query using it | 21:31 |
Shrews | so we were clever (but i also think we were drunk when we wrote the cleverness) | 21:31 |
corvus | the current call chain looks the same to me -- i think it should still be passing "image=dict(id=UUID)" | 21:33 |
Shrews | corvus: i think it's a different path (checking label readiness) | 21:34 |
Shrews | corvus: OpenstackProvider.getImage() call | 21:35 |
corvus | oh | 21:35 |
corvus | i was just about to ask where WARNING nodepool.driver.openstack.OpenStackProvider: Provider default is configured to use 0e2e2e05-99e5-422b-b9e3-7e4a28d9088e as the cloud-image for label my_label and that cloud-image could not be found in the cloud. comes from? | 21:35 |
corvus | so i guess that's failing before the createserver call | 21:35 |
Shrews | looks to me like it should still work, but i must be missing something | 21:36 |
corvus | ah found it | 21:36 |
corvus | sorry, i mean i found that error message | 21:36 |
corvus | provider labelReady | 21:36 |
Shrews | yup | 21:37 |
Shrews | my test failed because the fake provider doesn't handle the dict. i'm not sure why it fails in production (which uses shade which should handle it ok) | 21:37 |
corvus | sorry, i should have updated the fake provider to support that and added a test | 21:38 |
corvus | but it sounds like first we should double check this for real | 21:38 |
Shrews | corvus: yes, need a real openstack test to find this i think | 21:38 |
Shrews | will try to track that down tomorrow. | 21:39 |
corvus | Shrews: thank you for helping fix my stuff | 21:40 |
corvus | Shrews: i'll go ahead and make a patch to update the fake and add a test, but put it on hold until we verify it for realz | 21:40 |
corvus | er, well, i'll use the test you wrote. i'll just update the fak. | 21:41 |
Shrews | corvus: sure | 21:41 |
Shrews | maybe updating the fake provider will reveal the failure in a different path | 21:41 |
corvus | ooh. fingers crossed. :) | 21:41 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Fix race condition with executor hdd sensor https://review.openstack.org/579700 | 21:52 |
*** hashar has quit IRC | 22:01 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Fix race condition with executor hdd sensor https://review.openstack.org/579700 | 22:05 |
clarkb | pabelanger: on ^ is zuul expected to create thta dir? I expect not but returning false may prevent it from doing so if it is expected to create it | 22:07 |
pabelanger | clarkb: yah, zuul should create the state_dir if missing, but don't believe returning false will affect that | 22:09 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Add test for referencing cloud image by ID https://review.openstack.org/579702 | 22:09 |
clarkb | pabelanger: returning false means that th executor shouldnt execute doesn't it? | 22:09 |
pabelanger | clarkb: the job shouldn't run ya | 22:10 |
pabelanger | but executor will continue to run | 22:10 |
clarkb | my grepping doesn't show it creating that dir though | 22:10 |
pabelanger | maybe we should only start sensors once zuul-executor is fully started | 22:10 |
pabelanger | 1 sec | 22:10 |
corvus | maybe it's just created in the test? | 22:11 |
corvus | though i would expect that to be created before the executor starts | 22:12 |
pabelanger | clarkb: I think https://git.zuul-ci.org/cgit/zuul/tree/zuul/executor/server.py#n1858 creates it | 22:12 |
pabelanger | because of makedirs | 22:12 |
pabelanger | plugin_dir is build up from state_dir | 22:13 |
pabelanger | corvus: I was thinking of doing that, but looks like zuul creates it. I could fix it that way too | 22:14 |
clarkb | pabelanger: the HDDSensor is created after that makedirs | 22:14 |
corvus | pabelanger: the governor doesn't start until after that line | 22:14 |
pabelanger | Hmm, maybe I am fixing it wrong | 22:15 |
pabelanger | http://logs.openstack.org/28/579428/1/gate/tox-py36/fb8808e/testr_results.html.gz was the failure | 22:15 |
pabelanger | tests.unit.test_web_urls.TestSuburl | 22:16 |
corvus | the exception in test_disk_accountant_kills_job references the tmpdir from test_template_removal_from_branch | 22:18 |
corvus | most of the test failures are due to the unclean kazoo shutdown in test_template_removal_from_branch | 22:20 |
pabelanger | okay, so consider it a test failure rather then race? | 22:21 |
corvus | yeah, i don't think there's a problem with the hdd sensor | 22:21 |
pabelanger | okay, thanks for looking | 22:22 |
Shrews | corvus: oh, i thought you were going to put a new PS on top of my review. Should I abandon mine? | 22:22 |
corvus | Shrews: let's just keep them both in WIP until we know what's going on? mine is really just yours but without the code change and reno (you're still the git author) | 22:23 |
Shrews | *nod* | 22:24 |
corvus | Shrews: mine passes the test, so we do need to poke at a real openstack | 22:24 |
clarkb | per ianw's email it would be helpful if we could get https://review.openstack.org/#/c/563789/ and https://review.openstack.org/#/c/563790/ moving (the second is subject ot hte deprecation notice but the first should be good to go in now I think) | 22:24 |
pabelanger | corvus: clarkb: are we at a point to consider restarting zuul in openstack-infra tomorrow and maybe tagging a new release? Now that JS has been landed | 22:25 |
corvus | pabelanger: i want to get more info from tobiash, tristanC, and mordred about the /zuul issue mentioned earlier. hopefully mordred will be out from under paperwork tomorrow | 22:25 |
clarkb | pabelanger: my only concern with it is I will become progressively more afks middle of hte week | 22:25 |
clarkb | but if others are around to work thorugh problems should be fine | 22:26 |
*** dtruong_ has quit IRC | 22:27 | |
pabelanger | wfm, hdd sensor is about the last thing RDO project needs to finish zuul migration, then we start deleting jenkins things | 22:27 |
*** dtruong has joined #zuul | 22:28 | |
pabelanger | skipping of zuul.child_jobs via zuul_return is another, but less important: https://review.openstack.org/578230/ | 22:28 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: DNM: Exercise log-inventory in base-test https://review.openstack.org/579708 | 22:40 |
*** rlandy has quit IRC | 23:35 | |
*** threestrands has joined #zuul | 23:43 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!