sfbender | Logan V created software-factory/sf-config master: Allow LECM to renew certificates https://softwarefactory-project.io/r/13872 | 00:14 |
---|---|---|
*** rcarrillocruz has quit IRC | 02:51 | |
sfbender | Merged software-factory/sf-config master: Allow LECM to renew certificates https://softwarefactory-project.io/r/13872 | 03:21 |
*** rcarrillocruz has joined #softwarefactory | 04:16 | |
*** nilashishc has joined #softwarefactory | 04:20 | |
*** rcarrillocruz has quit IRC | 04:21 | |
*** rcarrillocruz has joined #softwarefactory | 04:25 | |
*** nilashishc has quit IRC | 04:29 | |
*** nilashishc has joined #softwarefactory | 04:35 | |
*** jangutter has joined #softwarefactory | 07:29 | |
*** jangutter_ has joined #softwarefactory | 07:32 | |
*** jangutter_ has quit IRC | 07:32 | |
*** jangutter_ has joined #softwarefactory | 07:33 | |
*** nilashishc has quit IRC | 07:34 | |
*** jangutter has quit IRC | 07:36 | |
*** nilashishc has joined #softwarefactory | 07:36 | |
*** chandankumar has joined #softwarefactory | 07:40 | |
*** jpena|off is now known as jpena | 07:44 | |
*** zoli is now known as zoli|afk-fbcI | 08:15 | |
*** zoli|afk-fbcI is now known as zoli | 08:15 | |
spredzy | tristanC: o/ | 08:26 |
spredzy | If you have write permission, would you mind approving https://github.com/ansible/zuul-config/pull/30 ? | 08:26 |
tristanC | spredzy: sure, done | 08:34 |
spredzy | thanks :) | 08:34 |
spredzy | tristanC: https://ansible.softwarefactory-project.io/zuul/status.html | 08:39 |
spredzy | I got an internal server error | 08:40 |
tristanC | hum, let me check | 08:44 |
tristanC | arg, Out of memory: Kill process 4440 (zuul-scheduler) score 818 or sacrifice child | 08:47 |
spredzy | :/ | 08:51 |
*** chandankumar has quit IRC | 08:51 | |
spredzy | I don't have the Internal Server Error anymore, just Tenant ansible isn't ready | 08:52 |
spredzy | I suppose I need to wait a lil | 08:52 |
tristanC | spredzy: yes, scheduler is restarting | 08:52 |
spredzy | Now it's back | 08:52 |
spredzy | \o/ | 08:52 |
spredzy | tristanC: Question regarding https://github.com/ansible/awx/pull/2309 | 09:00 |
spredzy | I made it Depends-On: https://github.com/ansible/zuul-jobs/pull/16 | 09:00 |
spredzy | But apparently the changes haven't caught up | 09:00 |
spredzy | Would you know what I am doing wrong ? | 09:00 |
tristanC | spredzy: according to status, it seems like the changes are stacked up correctly | 09:02 |
spredzy | But the jobs run are not the ones from the change | 09:03 |
spredzy | ie. tox-api-lint vs. tox-awx-api-lint | 09:03 |
spredzy | The content of the #16 change doesn't seem to be applied | 09:03 |
tristanC | spredzy: indeed, not sure what's going on | 09:06 |
spredzy | tristanC: I can merge it, see if it still applies and then revert the change. | 09:09 |
spredzy | (If needed) | 09:09 |
spredzy | tristanC: Can it be possible that the git repo for zuul-jobs is somewhat out-of-sync ? | 09:10 |
spredzy | On Friday I made a mistake and merged a commit manually on zuul-jobs rather than tagging it 'mergeit' | 09:10 |
spredzy | Not sure if it can be related | 09:10 |
tristanC | spredzy: it shouldn't, zuul should be force pull | 09:11 |
spredzy | ack. So no idea about why it isn't applying the change | 09:11 |
spredzy | I'll merge zuul-jobs PR and see if it does what is expected | 09:12 |
spredzy | tristanC: so yes, I really have the feeling that the latest ansible/zuul-jobs is not taken into consideration | 09:18 |
spredzy | tristanC: https://ansible.softwarefactory-project.io/zuul/status.html | 09:18 |
tristanC | spredzy: here are the logs: https://ansible.softwarefactory-project.io/paste/show/F4CARz8bc8hatZiIWaI1/ | 09:18 |
tristanC | spredzy: i think there is an issue with zuul-jobs branch, there should only be a master branch | 09:19 |
tristanC | spredzy: as you can see from that paste, the project-template is applied from 3 different branches, master, awx_template and clean_static_branch | 09:20 |
tristanC | spredzy: you shouldn't create and merge such branches, it's confusing zuul | 09:20 |
spredzy | Let me remove them from Github, and let the team know to work on their fork and not directly push the branches in ansible/zuul-jobs | 09:21 |
tristanC | spredzy: we could also setup branch protection, as documented here: https://ansible.softwarefactory-project.io/docs/user/zuul_user.html#configure-branch-protection | 09:23 |
tristanC | then in zuul, we can make it exclude unprotected branches | 09:23 |
spredzy | Yep, I agree with this proposal | 09:25 |
spredzy | tristanC: ansible/zuul-jobs has only master now, but the configuration is still not caught up | 09:29 |
spredzy | Zuul is applying old config to pipeline | 09:30 |
*** zoli is now known as zoli|lunch | 09:30 | |
*** zoli|lunch is now known as zoli | 09:30 | |
tristanC | spredzy: hum, it seems like zuul still has the branches... let me reload it | 09:31 |
spredzy | ok | 09:33 |
spredzy | tristanC: looks better :) | 09:35 |
*** sshnaidm is now known as sshnaidm|lnch | 09:40 | |
gundalow | tristanC: spredzy do we need to enable `Require pull request reviews before merging`: | 09:42 |
gundalow | > When enabled, all commits must be made to a non-protected branch and submitted via a pull request with the required number of approving reviews and no changes requested before it can be merged into a branch that matches this rule. | 09:42 |
tristanC | gundalow: it's up to you, we just removed the requirements in zuul gate pipeline | 09:43 |
spredzy | tristanC: do you see any issue with those job definitions ? https://github.com/ansible/zuul-jobs/blob/master/zuul.d/jobs.yaml | 09:46 |
spredzy | 2018-10-08 09:37:11.749693 | TASK [tox : Require tox_envlist variable] | 09:46 |
spredzy | 2018-10-08 09:37:11.791933 | static | skipping: Conditional result was False | 09:46 |
spredzy | I see that in the log, not really understanding why | 09:46 |
tristanC | spredzy: check ara-report ? | 09:46 |
spredzy | Hmm. OSError: [Errno 2] No such file or directory: '/home/zuul/src/github.com/ansible/awx' | 09:48 |
tristanC | spredzy: the skip is fine, the task is actually a "fail" module | 09:49 |
spredzy | tristanC: Looking at the sequence here https://ansible.softwarefactory-project.io/logs/09/2309/e0c7a7becea574aefe32dd9964c9d033a0223751/check/tox-awx-api-lint/5593fd2/job-output.txt.gz#_2018-10-08_09_37_09_030982 | 09:51 |
spredzy | I don't see tox (not in ensure mode) being triggered | 09:51 |
spredzy | Because the fail module failed in chdir'ing | 09:52 |
tristanC | spredzy: https://ansible.softwarefactory-project.io/logs/09/2309/e0c7a7becea574aefe32dd9964c9d033a0223751/check/tox-awx-api-lint/5593fd2/ara-report/file/8fde4eae-ccef-487e-8c6a-2aabfef3832c/#line-1 | 09:52 |
spredzy | OSError: [Errno 2] No such file or directory: '/home/zuul/src/github.com/ansible/awx' | 09:52 |
tristanC | spredzy: yes, the directory doesn't exist because you remove it in https://github.com/ansible/zuul-jobs/blob/master/playbooks/clean-static-node.yaml#L10 | 09:52 |
tristanC | perhaps we should clean before the prepare-workspace here: https://github.com/ansible/zuul-config/blob/master/playbooks/base-minimal/pre.yaml#L13 | 09:54 |
spredzy | Oops sorry I hadn't seen that (should have paid closer attention) | 09:54 |
*** sshnaidm|lnch is now known as sshnaidm | 09:59 | |
spredzy | tristanC: base-minimal only applies to runc-fedora, right? | 10:00 |
spredzy | https://github.com/ansible/zuul-config/blob/master/zuul.d/jobs.yaml#L43-L46 | 10:00 |
tristanC | spredzy: nodeset can be changed by child jobs | 10:04 |
tristanC | spredzy: the cleaning should check for nodepool static label | 10:05 |
*** jpena is now known as jpena|off | 10:07 | |
spredzy | I see | 10:10 |
spredzy | Just trying something real quick and if it doesn't work will go down the path you just suggested | 10:10 |
*** nilashishc has quit IRC | 10:14 | |
*** nilashishc has joined #softwarefactory | 10:15 | |
spredzy | tristanC: https://ansible.softwarefactory-project.io/zuul/status.html job has been stuck for 15min (linters + awx), anything currently happening ? | 10:19 |
tristanC | spredzy: i think it's because all executors are at capacity, i had to re-enqueue the job lost after the oom | 10:28 |
tristanC | spredzy: tomorrow we'll spin more executors | 10:28 |
tristanC | https://softwarefactory-project.io/grafana/d/000000001/zuul-status?panelId=66&fullscreen&orgId=1 | 10:28 |
spredzy | Oh ok nice. I always forget about those dashboards. Really need to get used to checking them out | 10:30 |
spredzy | tristanC: the change I submitted seems to have been enough for the use-case. All green | 11:09 |
spredzy | Thanks a lot for this morning's assistance. Really appreciated. Merci :) | 11:09 |
tristanC | spredzy: you're welcome, anytime :) | 11:14 |
*** jangutter_ has quit IRC | 12:27 | |
*** jangutter has joined #softwarefactory | 12:27 | |
pabelanger | tristanC: re: OOM how much memory does the zuul-scheduler server have? Are we running swap there also? | 14:39 |
pabelanger | I _think_ that is shared with other services, maybe with recent quota bump we can move to dedicated server also | 14:39 |
spredzy | pabelanger: hey. I have a question wrt to zuul | 15:06 |
spredzy | For some reason it doesn't seem to pull the proper PR | 15:06 |
spredzy | https://github.com/ansible/awx/pull/2266 | 15:06 |
spredzy | when I do a `cat .git/HEAD` | 15:06 |
spredzy | I get ref/head/devel rather than the proper PR content | 15:07 |
spredzy | Would you have a pointer on where I should start looking ? | 15:07 |
spredzy | proper PR reference* | 15:07 |
pabelanger | spredzy: where are you doing cat .git/HEAD ? | 15:08 |
spredzy | chdir: "{{ ansible_user_dir }}/{{ zuul.project.src_dir }}" | 15:09 |
pabelanger | spredzy: sorry, do you have log from zuul showing that | 15:10 |
spredzy | pabelanger: So that is the PR from zuul-jobs https://github.com/ansible/zuul-jobs/pull/22/files | 15:11 |
spredzy | pabelanger: here are the logs https://ansible.softwarefactory-project.io/logs/66/2266/24e2a9ace08d5e5fa05f5f6d95df40cf6ab087d7/check/tox-awx-api/c9d076f/ | 15:12 |
spredzy | AWX - Depends-On: zuul-jobs PR that displays cat .git/HEAD | 15:13 |
spredzy | tristanC | spredzy: it shouldn't, zuul should be force pull | 15:15 |
spredzy | pabelanger: tristanC said that this morning, so I am not very sure about what's going on | 15:16 |
pabelanger | force pull for what, the github PR? | 15:16 |
spredzy | We were talking about zuul-jobs, but I was supposing this also applies for Github PR | 15:17 |
pabelanger | spredzy: okay, can you state the issue again. Got lost in the weeds trying to understand what the awx jobs are doing. What is the issue you are seeing in zuul | 15:18 |
*** nilashishc has quit IRC | 15:20 | |
spredzy | So in awx, we have a static-node for our nodeset. So, the issue is that when zuul catches the recheck event, the content of the PR is not pulled | 15:20 |
spredzy | it references to refs/head/devel when it should reference to the proper references something like refs/pull/2266 | 15:20 |
pabelanger | spredzy: how is the static node content cleaned up between job runs? | 15:21 |
pabelanger | eg: how do we know if the job before is not affecting the current one | 15:21 |
spredzy | Currently, the folder "{{ ansible_user_dir }}/{{ zuul.project.src_dir }}" is not removed | 15:21 |
spredzy | Because I thought zuul would force a git pull of the PR | 15:21 |
pabelanger | no | 15:21 |
spredzy | but I might be missing it | 15:21 |
pabelanger | well | 15:21 |
pabelanger | when a zuul-executor starts a job, one of the first tasks it does is prepare-workspace which uses synchronize to push the git content to the remote node | 15:22 |
spredzy | https://github.com/openstack-infra/zuul-jobs/blob/master/roles/prepare-workspace/tasks/main.yaml | 15:22 |
spredzy | Yep | 15:23 |
pabelanger | in our case, we are using ephemeral nodes, so we know there is nothing on the far side. | 15:23 |
pabelanger | my question here is, how do we know there is nothing on the far side with a static node? | 15:23 |
pabelanger | is it possible prepare-workspace is having an issue | 15:23 |
pabelanger | and not getting the right git content on to the node | 15:23 |
spredzy | We need to think that "{{ ansible_user_dir }}/{{ zuul.project.src_dir }}" will always be present in the static node case | 15:24 |
spredzy | just not on the right refs | 15:24 |
pabelanger | well, prepare-workspace will do that | 15:25 |
pabelanger | however, it mostly has only been used with nodes from nodepool, so maybe we are hitting a bug | 15:25 |
pabelanger | but, you could also ensure that directory is first absent when a job starts / ends | 15:26 |
pabelanger | The other question I had, why does it need to be a static node? Why not use a VM from nodepool which we know is clean from start | 15:26 |
spredzy | yep, that's what tristanC suggested this morning but I thought I had found a way around it but it doesn't seem like it | 15:26 |
spredzy | pabelanger: I don't have the knowledge yet to answer this question. But will be able to answer by end of week | 15:27 |
pabelanger | spredzy: I talked a little with matburt at ansiblefest last week and offered to help a little. I still think defaulting to a vm first from nodepool, then run your containers on that is a great first step. It sounds like some of the things you are running into are either edge cases to static nodes or containers, which is fine. But remember you are likely one of the first people to try this, which is | 15:30 |
pabelanger | going to slow down development on awx. | 15:30 |
spredzy | I realize that, I'll try to grab more knowledge this week on why things are done this way and see what's preventing us from going to a "fresh" env for each job | 15:49 |
*** sshnaidm is now known as sshnaidm|afk | 15:51 | |
spredzy | pabelanger: do you know how I can access hosts.static.nodepool.label from https://ansible.softwarefactory-project.io/logs/25/2125/e7e6ccb181898dea46d9ec5a67f61087fcef7f2b/check/tox-awx-api/3b205a7/zuul-info/inventory.yaml ? | 16:01 |
pabelanger | spredzy: access how? | 16:01 |
spredzy | from within an ansible task | 16:01 |
spredzy | ie do this when: hosts.static.nodepool.label == 'static-ansible' | 16:02 |
pabelanger | if you are using the inventory, you can just reference static | 16:02 |
pabelanger | - hosts: static | 16:02 |
pabelanger | tasks: | 16:02 |
pabelanger | etc | 16:02 |
spredzy | its from all here what I am trying to do https://github.com/ansible/zuul-config/pull/32 | 16:03 |
spredzy | let me know if you think this isn't the proper way of addressing the issue we talked about earlier | 16:04 |
spredzy | s/state/file | 16:04 |
pabelanger | left comment | 16:10 |
spredzy | thanks /me reads | 16:11 |
*** zoli is now known as zoli|gone | 16:34 | |
*** zoli|gone is now known as zoli | 16:34 | |
spredzy | pabelanger: Thanks, I'll work on that tomorrow. Will have to go soon. | 16:36 |
spredzy | Do you have access to our zuul system? | 16:36 |
pabelanger | spredzy: I don't have SSH access anymore, I did when I was on SF team | 16:37 |
spredzy | ack | 16:37 |
pabelanger | but, I've been pushing on changes needed for ansible-network tenant | 16:37 |
spredzy | Actually I didn't ask, but when did your move happen? | 16:47 |
*** sshnaidm|afk is now known as sshnaidm | 16:48 | |
pabelanger | 3 weeks ago | 16:48 |
spredzy | So we moved in Ansible ~ the same time | 16:50 |
spredzy | Sept. 17 for me | 16:50 |
*** sshnaidm is now known as sshnaidm|afk | 17:32 | |
spredzy | pabelanger: re: We can create a new job, say awx-base, as untrusted, that is in zuul-jobs. have it parent to base (which is also untrusted). Then, in the post-run logic, after log collections, we can add logic to clean up static nodes. In zuul, we've also discussed the idea of a clean up handler, but don't think it has landed. | 17:46 |
spredzy | Currently all jobs are defined here https://github.com/ansible/zuul-jobs/blob/master/zuul.d/jobs.yaml | 17:47 |
spredzy | they all inherit from tox (from openstack-infra/zuul-jobs) | 17:47 |
spredzy | If I created an awx-base as suggested, copy/paste base-minimal in ansible/zuul-jobs, how would I make tox inherit from it? | 17:48 |
spredzy | or simply redefining - job: name: tox parent: awx-base would be enough? | 17:49 |
*** nilashishc has joined #softwarefactory | 18:04 | |
pabelanger | spredzy: yah, we'd need to create a new job called awx-tox here since today we cannot modify tox to inherit from another parent | 18:09 |
pabelanger | long term, we've talked about this use case, and should be able to do it once we have per project jobs | 18:09 |
pabelanger | right now tox job can only be defined once | 18:09 |
spredzy | So I should copy-paste tox from openstack-infra/zuul-jobs | 18:11 |
spredzy | name it: awx-tox | 18:12 |
spredzy | and have it inherit awx-base ? | 18:12 |
spredzy | pabelanger: is my understanding correct ? | 18:12 |
pabelanger | spredzy: let me think about it, because there is also unittests which tox parent too | 18:13 |
*** nilashishc has quit IRC | 18:13 | |
spredzy | Current PR looks like https://github.com/ansible/zuul-jobs/pull/22/files | 18:13 |
pabelanger | spredzy: this is why I said VMs will be much easier :) Now we need to redesign jobs | 18:13 |
spredzy | Yep I understand, and agree. But just dealing with what I have to deal with atm. | 18:14 |
spredzy | Be sure I'll push toward this way - this is extra-headache we could save ourselves from | 18:14 |
pabelanger | spredzy: okay, so for now, you are just using tox jobs right? | 18:15 |
pabelanger | let me work on etherpad for flow and will share for tomorrow | 18:15 |
spredzy | Yes, just tox jobs | 18:16 |
pabelanger | k | 18:16 |
spredzy | pabelanger: currently it fails with "Unable to freeze job graph: Job tox-awx-api-lint does not specify a run playbook" | 18:28 |
spredzy | if I try to use tox/parent or copy over without the dependencies | 18:28 |
spredzy | just so you know | 18:28 |
pabelanger | spredzy: can you link PR? | 18:28 |
spredzy | Pr is https://github.com/ansible/awx/pull/2266 | 18:29 |
spredzy | but it depends on https://github.com/ansible/zuul-jobs/pull/22 | 18:29 |
spredzy | This is the one where I am trying all the things | 18:29 |
pabelanger | spredzy: okay, will dig more in a bit. As a test to your clone issue, if you used the label ansible-fedora-28-4vcpu from nodepool, is the git state correct? | 18:31 |
pabelanger | you could also use ansible-fedora-28-1vcpu | 18:31 |
spredzy | lemme try it and let you know | 18:32 |
spredzy | pabelanger: do we have access to it ? | 18:35 |
spredzy | The nodeset "ansible-fedora-28-1vcpu" was not found. | 18:36 |
pabelanger | spredzy: yah, we should let your tenant load nodesets from ansible-network | 18:37 |
pabelanger | but you can just add: https://github.com/ansible-network/ansible-zuul-jobs/blob/master/zuul.d/nodesets.yaml#L25 | 18:37 |
pabelanger | for now | 18:37 |
pabelanger | spredzy: that will either run in limestone or vexxhost | 18:38 |
pabelanger | working to bring online a 3rd provider | 18:38 |
* spredzy googles limestone | 18:39 | |
spredzy | who's the 3rd name? | 18:39 |
pabelanger | https://www.limestonenetworks.com/ | 18:40 |
pabelanger | spredzy: some capacity we are testing thanks to logan- | 18:41 |
spredzy | Cool | 18:42 |
spredzy | Which zone are we using ? | 18:43 |
spredzy | pabelanger: https://ansible.softwarefactory-project.io/zuul/status.html | 18:43 |
pabelanger | spredzy: both | 18:43 |
spredzy | They end up in error | 18:43 |
spredzy | here https://ansible.softwarefactory-project.io/zuul/builds.html | 18:44 |
pabelanger | spredzy: for limestone? | 18:44 |
pabelanger | will need to check launcher | 18:44 |
spredzy | ah no, wait | 18:44 |
spredzy | type | 18:44 |
spredzy | typo | 18:45 |
spredzy | pabelanger: seems ok with the Fedora node | 18:50 |
spredzy | ie. proper file content is in there. Let me check the git commit | 18:51 |
pabelanger | spredzy: k, that is good news | 18:51 |
pabelanger | then we know something is wrong with static | 18:51 |
pabelanger | and not zuul | 18:51 |
spredzy | so the issue is: with a static-node kind, how to make sure "{{ ansible_user_dir }}/{{ zuul.project.src_dir }}" is up to date and/or cleaned | 18:52 |
spredzy | pabelanger: yes def. proper git sha1 on the fedora nodes | 18:54 |
* spredzy goes to dinner | 18:56 | |
spredzy | let me know if you come up with any idea Paul | 18:56 |
pabelanger | spredzy: yah, lets chat more tomorrow, maybe with matburt too | 19:10 |
matburt | I'm definitely happy to do that | 19:11 |
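
Editor's note: the static-node cleanup discussed above (remove the stale checkout before prepare-workspace syncs the new git state) could look roughly like the play below. This is only a sketch under assumptions, not the change that was actually merged: the label `static-ansible` is taken from the conversation, `nodepool.label` is assumed to be available as a host variable as shown in zuul-info/inventory.yaml, and the play is assumed to run in a pre-run playbook ahead of prepare-workspace.

```yaml
# Hypothetical pre-run play: a sketch, assuming it runs BEFORE prepare-workspace.
- hosts: all
  tasks:
    - name: Remove the project checkout left behind by a previous job
      file:
        path: "{{ ansible_user_dir }}/{{ zuul.project.src_dir }}"
        state: absent
      # Only touch static nodes; nodepool VMs start clean.
      # 'static-ansible' is the label mentioned in the log; adjust as needed.
      when:
        - nodepool is defined
        - nodepool.label == 'static-ansible'
```

Guarding on the label keeps the task a no-op for child jobs that switch the nodeset to ephemeral nodes, which is the concern tristanC raised about base-minimal being inherited.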