*** jamesmcarthur has joined #zuul | 00:21 | |
*** jamesmcarthur has quit IRC | 00:23 | |
*** jamesmcarthur has joined #zuul | 00:23 | |
*** jamesmcarthur has quit IRC | 01:05 | |
*** igordc has quit IRC | 01:08 | |
*** mattw4 has quit IRC | 01:09 | |
ianw | i'm coming to the conclusion there really is a problem with the log streamer when ansible is running under python3 ... do we really test that in production? | 01:10 |
---|---|---|
ianw | i think even on our bionic nodes, we're not setting ansible_python_interpreter | 01:10 |
clarkb | we set it to 2 iirc | 01:11 |
ianw | it works locally testing zuul_stream ... but reliably fails https://review.opendev.org/#/c/682275 :/ | 01:25 |
*** Goneri has quit IRC | 01:31 | |
*** rfolco has quit IRC | 02:28 | |
*** roman_g has quit IRC | 02:33 | |
*** jamesmcarthur has joined #zuul | 02:38 | |
*** bhavikdbavishi has joined #zuul | 02:43 | |
SpamapS | ianw: I use python3 exclusively in my setup. What's the problem? | 02:43 |
ianw | SpamapS: so you set python-path on your dib nodes to "/usr/bin/python3" as well? | 02:44 |
SpamapS | I don't have "dib nodes" ... I'm on AWS with packer-built AMI's. | 02:45 |
*** bhavikdbavishi1 has joined #zuul | 02:45 | |
SpamapS | I've been doing this a long time, I set ansible_python_interpreter=/usr/bin/python3 in my site variables. | 02:46 |
SpamapS | I think I did this before nodepool had a facility for that. | 02:46 |
ianw | hrm, well it's definitely failing in the gate remote tests, and i *think* what's happening is the streamer plugin is somehow failing | 02:47 |
SpamapS | And my only working OS is Ubuntu 18.04 | 02:47 |
*** bhavikdbavishi has quit IRC | 02:47 | |
ianw | you get the last task output, then http://paste.openstack.org/show/777058/ | 02:47 |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 02:47 | |
ianw | Ansible output terminated | 02:48 |
SpamapS | Hm, let me see | 02:48 |
SpamapS | in the executor? | 02:48 |
ianw | you can see the failure in https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_539/682275/2/check/zuul-tox-remote/5391677/testr_results.html.gz | 02:49 |
SpamapS | I see "Ansible output terminated" at the end of every job's ansible summary. | 02:49 |
SpamapS | but there's no traceback or anything | 02:49 |
ianw | sorry, i think the error is more "[Zuul] Log Stream did not terminate" | 02:50 |
*** persia has quit IRC | 02:50 | |
ianw | then the job aborts | 02:51 |
SpamapS | I do see that now and then | 02:51 |
ianw | i think that means the streaming callback plugin died, somehow ... but figuring out how is where i'm currently stumped :) | 02:52 |
SpamapS | ahh yeah, perhaps we need to wrap it in a try/except that writes the exception into a tempfile. | 02:53 |
ianw | hrrm, i could wrap all functions in a decorator for that ... | 02:55 |
*** persia has joined #zuul | 02:56 | |
*** jamesmcarthur has quit IRC | 02:57 | |
SpamapS | Yeah I guess since it's a plugin that's the way you'd have to do it. | 03:03 |
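The try/except idea that SpamapS and ianw discuss above could look roughly like the following Python sketch. It is not code from Zuul: the names `log_exceptions` and `wrap_public_methods` and the `DEBUG_LOG` path are hypothetical, chosen only for illustration. The sketch wraps every public method of a class so that any exception is appended, with its traceback, to a file in the temp directory before being re-raised.

```python
import functools
import inspect
import os
import tempfile
import traceback

# Hypothetical log location; any path the plugin process can write to would do.
DEBUG_LOG = os.path.join(tempfile.gettempdir(), "zuul_stream_debug.log")


def log_exceptions(func):
    """Append any exception raised by func to DEBUG_LOG, then re-raise it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            with open(DEBUG_LOG, "a") as f:
                f.write("Exception in %s:\n" % func.__name__)
                f.write(traceback.format_exc())
            raise
    return wrapper


def wrap_public_methods(cls):
    """Class decorator: apply log_exceptions to every plain public method."""
    for name, attr in list(vars(cls).items()):
        # Only plain functions are wrapped; static/class methods are left alone.
        if inspect.isfunction(attr) and not name.startswith("_"):
            setattr(cls, name, log_exceptions(attr))
    return cls
```

Applying `@wrap_public_methods` to a plugin class avoids editing each method by hand. Re-raising keeps the original behaviour intact, so the file is purely an extra diagnostic.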
ianw | http://paste.openstack.org/show/777059/ ... something seems to be going bananas | 03:17 |
ianw | constantly forking and "ansible_zuul_console_payload_69fl742k" is somehow involved ... this seems to suggest "zuul_console:" somehow :/ | 03:18 |
*** persia has quit IRC | 03:18 | |
*** persia has joined #zuul | 03:24 | |
*** jamesmcarthur has joined #zuul | 03:56 | |
*** jamesmcarthur has joined #zuul | 03:57 | |
ianw | i think that might be a red herring ... the console streamer is trying to open a file that never appears maybe | 04:00 |
*** jamesmcarthur has quit IRC | 04:51 | |
*** bolg has joined #zuul | 05:15 | |
*** pcaruana has joined #zuul | 05:16 | |
*** jamesmcarthur has joined #zuul | 05:21 | |
*** pcaruana has quit IRC | 05:29 | |
*** sshnaidm|afk is now known as sshnaidm|pto | 05:35 | |
*** AJaeger has quit IRC | 05:52 | |
*** sanjayu_ has joined #zuul | 05:54 | |
*** spsurya has joined #zuul | 05:55 | |
*** AJaeger has joined #zuul | 05:56 | |
*** sanjayu_ has quit IRC | 06:00 | |
*** saneax has joined #zuul | 06:01 | |
*** jamesmcarthur has quit IRC | 06:10 | |
*** themroc has joined #zuul | 06:27 | |
*** jamesmcarthur has joined #zuul | 06:32 | |
*** avass has joined #zuul | 06:36 | |
*** jamesmcarthur has quit IRC | 06:37 | |
*** jamesmcarthur has joined #zuul | 06:45 | |
openstackgerrit | Ian Wienand proposed zuul/zuul master: [dnm] testing python3 ansible https://review.opendev.org/682556 | 06:57 |
*** jamesmcarthur has quit IRC | 07:17 | |
*** tosky has joined #zuul | 07:27 | |
*** armstrongs has joined #zuul | 07:27 | |
*** jangutter has joined #zuul | 07:28 | |
*** saneax has quit IRC | 07:28 | |
*** armstrongs has quit IRC | 07:37 | |
*** jamesmcarthur has joined #zuul | 07:44 | |
*** bhavikdbavishi has quit IRC | 07:44 | |
*** jpena|off is now known as jpena | 07:46 | |
*** jamesmcarthur has quit IRC | 07:50 | |
*** jamesmcarthur has joined #zuul | 07:52 | |
ianw | https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_cfd/682556/1/check/zuul-tox-remote/cfd83a2/testr_results.html.gz | 07:53 |
ianw | mordred: ^ if you could figure out any clues as to why this seemingly small change to python3 leads to ^ https://review.opendev.org/#/c/682556/1/tests/base.py which appears to me to be an issue with the streaming? | 07:54 |
*** hashar has joined #zuul | 07:58 | |
mordred | ianw: 2019-09-17 07:19:04,069 zuul.AnsibleJob.output seems to be the last chunk that ran - and I don't know if that traceback is expected | 08:00 |
mordred | ianw: maybe there's a behavior/traceback change under 3 that's not getting caught properly? that test is testing if something doesn't exist - so *maybe* something that we're doing is not handling an error properly when running under 3? | 08:01 |
mordred | but - otherwise, no, I don't have an immediate thought | 08:01 |
ianw | mordred: yeah, i just can't find it :( | 08:02 |
ianw | 2019-09-17 07:20:04.902649 | ubuntu-bionic | b"2019-09-17 07:19:34,105 zuul.AnsibleJob.output DEBUG [e: 32de590007fe490da9e7b9cc89391a90] [build: c1a1bde54c19483a97a26774f9add01f] Ansible output: b'[Zuul] Log Stream did not terminate'" | 08:02 |
ianw | I think that is probably a problem? | 08:02 |
mordred | maybe? | 08:02 |
ianw | dunno, throwing in the towel on this one for today, anyway | 08:05 |
ianw | the bigger problem is that i want to use python-path python3 for fedora 30 -> https://review.opendev.org/682569 | 08:06 |
ianw | i think that for ansible >=2.8 zuul should be able to just leave ansible up to its own devices on this one (https://review.opendev.org/682275) but have to figure out what's going on with this failure first | 08:07 |
mordred | ianw: you've got all the fun ones :) | 08:11 |
*** jamesmcarthur has quit IRC | 08:15 | |
*** jamesmcarthur has joined #zuul | 08:16 | |
*** igordc has joined #zuul | 08:20 | |
*** bhavikdbavishi has joined #zuul | 08:25 | |
*** igordc has quit IRC | 08:28 | |
*** noorul has joined #zuul | 08:31 | |
noorul | hi | 08:31 |
noorul | Is there a way to define a project dependent on another project? | 08:32 |
SpamapS | noorul: https://zuul-ci.org/docs/zuul/user/config.html#attr-job.required-projects | 08:37 |
SpamapS | noorul: but that's job->project. | 08:37 |
SpamapS | so it can be inferred by project->job->required-project | 08:38 |
openstackgerrit | Merged zuul/zuul-jobs master: Add a netconsole role https://review.opendev.org/680901 | 08:38 |
noorul | SpamapS: Thanks! Which branch will it checkout? | 08:39 |
avass | noorul: https://zuul-ci.org/docs/zuul/user/config.html#attr-job.required-projects.override-checkout | 08:40 |
noorul | avass: Thank you! Is there an example? | 08:41 |
avass | noorul: So the same branch as the branch triggering the job unless override-checkout is specified | 08:41 |
noorul | avass: What happens if that branch does not exist in the required project? | 08:41 |
avass | noorul: Not sure, I guess the job would fail | 08:42 |
noorul | Oops! | 08:42 |
noorul | I am thinking how can I do the following | 08:43 |
noorul | I have two repositories repo1 and repo2 | 08:43 |
noorul | both of them have release branches | 08:43 |
noorul | rel_1.1.1 | 08:43 |
noorul | Now I have a private branch br1 and using that I raise a PR | 08:44 |
noorul | If I define repo2 as a required project for the job | 08:44 |
noorul | Any idea what will happen? | 08:44 |
ianw | mordred: when you say "not handling an error properly" do you mean in the streamer, or somewhere else in zuul that might decide to abort the job? http://paste.openstack.org/show/777068/ is what i see trying with local testing, but i can't make the streamer fail :/ | 08:46 |
avass | noorul: I guess that it would try to checkout your private branch on both repositories unless override-checkout attribute is specified | 08:48 |
avass | noorul: But I'm not sure exactly how it works since we're not using it yet :) | 08:49 |
avass | noorul: unless I'm reading this wrong | 08:51 |
avass | noorul: Seems strange to me that it would try to checkout the same ref for both projects since you can't guarantee that the same ref exists in both repos. | 08:55 |
mordred | ianw: yeah - I'm honestly not sure what I mean - I'm a bit grasping at straws there | 08:58 |
mordred | ianw: oh yeah - hrm. both are showing a traceback | 09:00 |
avass | mordred: could you shine some light on how this works? https://zuul-ci.org/docs/zuul/user/config.html#attr-job.required-projects.override-checkout | 09:02 |
avass | mordred: actually I mean this: https://zuul-ci.org/docs/zuul/user/config.html#attr-job.required-projects.override-checkout | 09:02 |
avass | mordred: which branch/ref does it checkout by default? master? | 09:03 |
mordred | well - by default it checks out the branch matching the target, and if it can't find that, it'll fall back to master. if override-checkout is defined, it'll use that | 09:04 |
avass | mordred: ah, that makes sense | 09:04 |
mordred | in the example from noorul above, the private branch br1 isn't relevant if it's the *source* of the PR | 09:04 |
mordred | assuming the PR is targeting one of the regular shared long-lived branches | 09:05 |
noorul | mordred: In my example I want repo2's rel_1.1.1 to be checked out | 09:06 |
noorul | mordred: But looks like master will be checked out | 09:06 |
mordred | noorul: if you are submitting the PR to repo1's rel_1.1.1 branch, zuul should also check out rel_1.1.1 of repo2 | 09:08 |
noorul | So, if br1 exists in repo2, it will be checked out, otherwise rel_1.1.1 | 09:10 |
noorul | Is my understanding correct? | 09:10 |
mordred | noorul: ah - no - sorry, I misunderstood what you meant by private branch. you mean you have a branch, br1, on the main shared repo1 and you are submitting PRs to that branch | 09:15 |
mordred | am I understanding that right? | 09:15 |
noorul | Not exactly | 09:16 |
noorul | I am submitting a PR from br1 to rel_1.1.1 of repo1 | 09:16 |
mordred | ah - awesome | 09:16 |
noorul | and say I have in the job required_projects - repo2 | 09:16 |
mordred | in that case, br1 shouldn't play into the decision making from zuul at all | 09:16 |
mordred | it's about which branch you are submitting a change *to* - so since you are submitting from br1 to rel_1.1.1 of repo1- then if you add repo2 into the required_projects, zuul should default to checking out rel_1.1.1 of repo2 | 09:17 |
noorul | I see | 09:18 |
mordred | so in this case you should not need an override-checkout and zuul should do the right thing | 09:18 |
noorul | What is the use case of override-checkout? | 09:18 |
*** pcaruana has joined #zuul | 09:19 | |
mordred | in case the repos don't share a common structure. for instance, I have a job in openstacksdk that tests against ansible and has required-projects: github.com/ansible/ansible ... in this case, I want to test stable/rocky of openstacksdk against stable-2.6 of ansible - so I use override-checkout to tell zuul about that | 09:20 |
mordred | you could also use it if you wanted to define an additional job that tested your rel_1.1.1 of repo1 against master of repo2 - to verify that a change worked both with a release and had future upwards compat, for instance | 09:21 |
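The selection rule mordred describes for a required project can be summarised in a few lines of Python. This is only a sketch of the behaviour as stated in the conversation, not Zuul's actual implementation; the function name `pick_checkout` and its parameters are hypothetical.

```python
def pick_checkout(target_branch, repo_branches, override_checkout=None,
                  fallback="master"):
    """Return the ref to check out in a required project.

    target_branch: the branch the change is proposed *to* (not its source).
    repo_branches: the branches that exist in the required project.
    override_checkout: explicit ref from the job definition, if any.
    """
    # An explicit override-checkout always wins.
    if override_checkout is not None:
        return override_checkout
    # Otherwise use the branch matching the change's target, if it exists.
    if target_branch in repo_branches:
        return target_branch
    # Otherwise fall back to master.
    return fallback
```

In noorul's case the change targets rel_1.1.1 of repo1 and repo2 also has rel_1.1.1, so the second branch of the function applies; the source branch br1 never enters the decision.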
noorul | does it support wildcard ? | 09:22 |
mordred | no - only direct values. however- the repos all have all of their branches in correct state, so if you need to get more clever, you can always do a git checkout in a job | 09:24 |
*** saneax has joined #zuul | 09:24 | |
mordred | https://opendev.org/opendev/system-config/src/branch/master/.zuul.yaml#L192-L229 <-- this is an example of a job where we want stable-2.15 of a bunch of gerrit repos on patches to master of opendev/system-config | 09:24 |
noorul | I see | 09:25 |
noorul | thanks for that example | 09:25 |
noorul | I have a very simple job http://paste.openstack.org/show/777072/ | 09:25 |
noorul | but it fails | 09:25 |
noorul | /bin/sh: 1: ./run_tests.sh: not found | 09:25 |
*** sanjayu_ has joined #zuul | 09:25 | |
noorul | But the file exists | 09:26 |
mordred | you need to change directories to your repo | 09:26 |
noorul | inside run_tests.sh? | 09:26 |
mordred | no - in the job - that shell command is going to be running with cwd of /home/zuul | 09:27 |
mordred | but your repo will be in something like /home/zuul/src/opendev.org/openstack/repo1 | 09:27 |
noorul | Hmm | 09:27 |
mordred | (repos are put in golang format on disk inside of the src dir) | 09:27 |
noorul | Is there an example? | 09:28 |
mordred | so - http://paste.openstack.org/show/777073/ | 09:28 |
avass | noorul: you probably want to put a chdir: {{ zuul.project.src_dir }} on the shell command | 09:28 |
*** saneax has quit IRC | 09:29 | |
mordred | https://opendev.org/opendev/system-config/src/branch/master/playbooks/zuul/gerrit/repos.yaml | 09:29 |
mordred | yah | 09:29 |
mordred | {{ zuul.project.src_dir }} is great for this case | 09:29 |
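To make the on-disk layout concrete, here is a small Python sketch of how the path mordred mentions is composed: a `src` directory, then the hostname of the source, then the project name. The helper `project_src_dir` is hypothetical and only mirrors the example path from the conversation; in a real job the value comes from `zuul.project.src_dir`.

```python
import posixpath


def project_src_dir(canonical_hostname, project_name, root="src"):
    """Build the relative checkout path: src/<hostname>/<project name>."""
    return posixpath.join(root, canonical_hostname, project_name)


# Relative to the job's working directory (/home/zuul in the example above).
relative = project_src_dir("opendev.org", "openstack/repo1")
absolute = posixpath.join("/home/zuul", relative)
```

Because the shell task starts in /home/zuul, a script inside the repo is only found after changing into that relative path, which is what `chdir` does.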
*** roman_g has joined #zuul | 09:30 | |
*** jamesmcarthur has quit IRC | 09:31 | |
*** jamesmcarthur has joined #zuul | 09:32 | |
*** noorul has quit IRC | 09:42 | |
openstackgerrit | Ian Wienand proposed zuul/zuul master: [dnm] testing python3 ansible https://review.opendev.org/682556 | 09:50 |
*** noorul has joined #zuul | 09:54 | |
noorul | mordred: http://paste.openstack.org/show/777075/ | 09:54 |
avass | noorul: found unacceptable key (unhashable type: 'AnsibleMapping') | 10:03 |
avass | noorul: https://docs.ansible.com/ansible/2.5/user_guide/playbooks_variables.html#hey-wait-a-yaml-gotcha | 10:03 |
avass | noorul: The jinja expression needs to be put in quotes otherwise ansible will think it's a yaml dictionary | 10:07 |
*** jamesmcarthur has quit IRC | 10:07 | |
*** pcaruana has quit IRC | 10:07 | |
avass | noorul: But only if a value is started with an expression | 10:07 |
avass | So it should have been chdir: "{{ zuul.project.src_dir }}", my fault :) | 10:08 |
*** bhavikdbavishi has quit IRC | 10:14 | |
noorul | avass: Got it | 10:16 |
*** recheck has quit IRC | 10:17 | |
noorul | Is it possible to define ansible role in non-trusted project? | 10:17 |
*** recheck has joined #zuul | 10:18 | |
*** recheck has quit IRC | 10:22 | |
*** recheck has joined #zuul | 10:23 | |
mordred | absolutely | 10:24 |
*** recheck has quit IRC | 10:24 | |
*** recheck has joined #zuul | 10:25 | |
openstackgerrit | Ian Wienand proposed zuul/zuul master: [dnm] testing python3 ansible https://review.opendev.org/682556 | 10:27 |
*** recheck has quit IRC | 10:28 | |
*** recheck has joined #zuul | 10:29 | |
*** recheck has quit IRC | 10:31 | |
*** recheck has joined #zuul | 10:31 | |
*** noorul has quit IRC | 10:34 | |
*** recheck has quit IRC | 10:37 | |
*** recheck has joined #zuul | 10:38 | |
*** jamesmcarthur has joined #zuul | 10:40 | |
*** recheck has quit IRC | 10:41 | |
*** recheck has joined #zuul | 10:41 | |
*** hashar has quit IRC | 10:44 | |
*** pcaruana has joined #zuul | 10:55 | |
*** noorul has joined #zuul | 11:00 | |
*** avass has quit IRC | 11:07 | |
*** hashar has joined #zuul | 11:09 | |
*** noorul has quit IRC | 11:11 | |
*** jamesmcarthur has quit IRC | 11:12 | |
*** jpena is now known as jpena|lunch | 11:17 | |
*** panda is now known as panda|ruck | 11:41 | |
*** sanjayu_ has quit IRC | 11:44 | |
*** gtema_ has joined #zuul | 11:53 | |
*** gtema_ has quit IRC | 11:56 | |
*** bhavikdbavishi has joined #zuul | 11:59 | |
*** rfolco has joined #zuul | 12:04 | |
*** jangutter_ has joined #zuul | 12:05 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Evaluate CODEOWNERS settings during canMerge check https://review.opendev.org/644557 | 12:07 |
*** jangutter has quit IRC | 12:08 | |
*** jamesmcarthur has joined #zuul | 12:10 | |
*** jamesmcarthur has quit IRC | 12:10 | |
*** jamesmcarthur_ has joined #zuul | 12:10 | |
*** jangutter_ has quit IRC | 12:11 | |
*** rlandy has joined #zuul | 12:18 | |
*** bhavikdbavishi1 has joined #zuul | 12:22 | |
*** gtema_ has joined #zuul | 12:23 | |
*** bhavikdbavishi has quit IRC | 12:24 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 12:24 | |
*** jamesmcarthur_ has quit IRC | 12:26 | |
*** pcaruana has quit IRC | 12:27 | |
*** openstackstatus has quit IRC | 12:28 | |
*** openstack has joined #zuul | 12:32 | |
*** ChanServ sets mode: +o openstack | 12:32 | |
*** sanjayu_ has joined #zuul | 12:33 | |
*** AJaeger has quit IRC | 12:34 | |
*** AJaeger has joined #zuul | 12:36 | |
*** jpena|lunch is now known as jpena | 12:37 | |
*** Goneri has joined #zuul | 12:39 | |
*** bhavikdbavishi has quit IRC | 12:46 | |
*** fdegir has quit IRC | 12:47 | |
*** fdegir has joined #zuul | 12:48 | |
*** mattymo has joined #zuul | 12:53 | |
*** gtema_ has quit IRC | 12:56 | |
*** pcaruana has joined #zuul | 12:59 | |
mattymo | Anyone familiar with nodepool that could help me out? I can get nodepool to do rhel7 registration just fine, but it unregisters at the end of build. That's okay. I want now to do RH registration when nodepool-launcher launches an openstack instance | 12:59 |
mattymo | or is this something beyond nodepool's scope? | 12:59 |
pabelanger | mattymo: yes, you'd need to set up a zuul pre-run playbook to do that | 13:03 |
pabelanger | nodepool no longer has the ability to modify a node at launch time, only zuul can | 13:03 |
mordred | what an interesting use case ... | 13:03 |
pabelanger | mattymo: other option, is create some sort of boot script, that does it when the node first launches | 13:04 |
pabelanger | we do that today with some dns things in opendev, and for ansible-network setting up network appliance config | 13:04 |
*** gtema_ has joined #zuul | 13:07 | |
mordred | pabelanger: this seems like a good description of a type of action that might (or might not) be worth pondering. because the activity in question is tied to a nodeset / label and isn't really job specific. I don't think I'd pondered the rhel activation use case before (I mean - obviously can do with a pre-run - it's just an interesting case to consider) | 13:07 |
fungi | it also could be the case that you want to restrict to some specific maximum number of rhel nodes running simultaneously because you only have a certain number of licenses? if so that starts to look a lot like a (perhaps label-specific) quota | 13:09 |
mattymo | pabelanger, is that dns boot script on a public git repo? | 13:10 |
fungi | i wonder if a nodepool driver shim might be a way to solve it | 13:10 |
pabelanger | mordred: yup, there are likely a few network appliance related things we can lump into that too. X that needs to be accomplished to finish the image build process. For now, we solved that with pre-run jobs in zuul, but it is complicated to control which playbooks run via hosts | 13:11 |
pabelanger | mattymo: https://opendev.org/openstack/project-config/src/branch/master/nodepool/elements/nodepool-base/finalise.d/89-unbound#L138 | 13:13 |
*** bhavikdbavishi has joined #zuul | 13:13 | |
*** brendangalloway has joined #zuul | 13:14 | |
tristanC | corvus: is it ok if I +3 the pagure patches from fbo you already +2? | 13:14 |
mattymo | thankfully in my case I have plenty of licenses | 13:14 |
mattymo | I just don't want creds to ever get stored on the target host | 13:14 |
mattymo | but the way my deployment runs, ansible runs only on the target host | 13:15 |
brendangalloway | I'm writing a job that redeploys a static host (by triggering a pxe boot) in the middle | 13:19 |
brendangalloway | However, I'm getting a failure on wait_for_reconnect timing out even though the host has come back up after the reinstall | 13:20 |
pabelanger | mattymo: yah, in that case, might be better to have zuul do that step. So, you can store that data as a secret in zuul | 13:20 |
brendangalloway | Running the ansible role outside of zuul succeeds, and I am able to reestablish a connection and continue the job | 13:20 |
pabelanger | then hope (which I haven't figured out or even tested) that the job doesn't try to leak the license info | 13:20 |
brendangalloway | Any suggestions on debugging the state of the executor at the time of the failure to figure out what is going wrong? | 13:22 |
pabelanger | brendangalloway: I'm not sure what wait_for_reconnect is, that something you created? | 13:23 |
pabelanger | or using wait_for_connection | 13:23 |
brendangalloway | pabelanger: yes, wait_for_connection | 13:24 |
pabelanger | brendangalloway: maybe first use wait_for to ensure SSH port is open? | 13:25 |
pabelanger | we do that today, and works well | 13:25 |
brendangalloway | ok, let me try that | 13:25 |
*** bhavikdbavishi has quit IRC | 13:25 | |
*** bhavikdbavishi has joined #zuul | 13:26 | |
*** pcaruana has quit IRC | 13:28 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - add support for git.tag.creation event https://review.opendev.org/679938 | 13:33 |
pabelanger | mordred: do you remember a time, where zuul might have been running multiple ansible shell tasks, on the same node from nodepool, at the same time? For some reason I thought that was a problem a while back. Basically, I've seen something odd, where we potentially have a shell task process running twice in zuul.a.c | 13:37 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - add support for git.tag.creation event https://review.opendev.org/679938 | 13:37 |
pabelanger | I want to say, it was something to do with the version of command that zuul shipped for ansible? | 13:38 |
*** avass has quit IRC | 13:47 | |
*** jamesmcarthur has quit IRC | 13:47 | |
*** sanjayu_ has quit IRC | 13:48 | |
*** pcaruana has joined #zuul | 13:59 | |
*** michael-beaver has joined #zuul | 14:00 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Disable rsh synchronize rsync_opts https://review.opendev.org/682657 | 14:12 |
mordred | pabelanger: I vaguely remember something something but also don't remember. clarkb ? | 14:16 |
corvus | clarkb: ^ 682657 | 14:17 |
*** jamesmcarthur has joined #zuul | 14:17 | |
clarkb | corvus: tristanC done | 14:18 |
clarkb | mordred: pabelanger I dont recall | 14:18 |
*** jamesmcarthur has quit IRC | 14:22 | |
corvus | tristanC: re pagure, yes -- https://review.opendev.org/679938 was the only thing i was worried about. i just left a comment on that (cc fbo) | 14:24 |
mordred | corvus: I left you a comment on your robot comments patch - possibly just to prove I'm actually reading these patches | 14:25 |
*** jamesmcarthur has joined #zuul | 14:27 | |
corvus | mordred: excellent question | 14:31 |
*** jamesmcarthur has quit IRC | 14:32 | |
corvus | mordred: i made an answer | 14:32 |
mordred | corvus: cool | 14:34 |
corvus | clarkb: do you think you could take a look soon at the gerrit stack starting at https://review.opendev.org/681132 ? | 14:37 |
clarkb | corvus: yes after my morning meeting I can take a look | 14:38 |
*** jamesmcarthur has joined #zuul | 14:38 | |
mordred | clarkb: I think you'll enjoy it | 14:38 |
*** mattymo has quit IRC | 14:42 | |
*** jamesmcarthur has quit IRC | 14:43 | |
*** bolg has quit IRC | 14:47 | |
*** jamesmcarthur has joined #zuul | 14:48 | |
*** pcaruana has quit IRC | 14:55 | |
openstackgerrit | Merged zuul/zuul master: Disable rsh synchronize rsync_opts https://review.opendev.org/682657 | 15:08 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add scheduler max_hold_age config option. https://review.opendev.org/682675 | 15:28 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Mark nodes as USED when deleting autohold https://review.opendev.org/664060 | 15:38 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Auto-delete expired autohold requests https://review.opendev.org/663762 | 15:38 |
brendangalloway | pabelanger: I'm trying to use wait_for to check that the ssh service has come back up, but I can't execute the task on the executor. Not sure I correctly understood your suggestion | 15:39 |
pabelanger | brendangalloway: do we block it? Which error are you seeing. In our multi node setup, we run it from nested ansible node to check if 2nd node is online | 15:41 |
Shrews | tristanC: I don't really follow your comments on https://review.opendev.org/679057. Why is the tenant needed to get or delete an autohold via the web API? I was following mhu's instructions there and I don't quite get why that's required (obviously not required via CLI). | 15:41 |
brendangalloway | pabelanger: "msg": "Executing local code is prohibited" | 15:41 |
pabelanger | k, so in that case you need to move it to trusted playbook or maybe run from nested ansible | 15:42 |
Shrews | corvus: is https://review.opendev.org/682675 what you had in mind to deal with the node expiration issue? | 15:43 |
brendangalloway | We're running a single node here - we asked previously about issues trying to run jobs with both static and openstack nodes. | 15:43 |
pabelanger | right, so the single node is still online, is that right? it is the static node you are doing a pxe with? | 15:44 |
brendangalloway | so we don't really have an option to spin up another node in the same job to defer the wait_for to | 15:46 |
brendangalloway | the single node is going down during the play and being pxe booted | 15:46 |
tristanC | Shrews: when a tenant REST endpoint is white-label, the user doesn't have access to /api/autohold, all their requests are scoped to /api/tenant/{user-tenant-name}/ | 15:47 |
mordred | why are we blocking wait_for ? | 15:47 |
pabelanger | ah, so yah in that case you need to move the wait_for into a trusted playbook, to run from executor | 15:47 |
brendangalloway | trusted playbooks are only allowed to run in a post environment correct? | 15:47 |
mordred | no - they can run anywhere - their content just isn't executed speculatively - so if you propose a change to one, the change has to land before it takes effect | 15:48 |
tristanC | Shrews: thus if we have /api/tenant/{tenant}/autohold to list autoholds (at L1105), then we should have /api/tenant/{tenant}/autohold/{id} | 15:48 |
mordred | but also - I want to see if there is a way we can allow wait_for - because it seems like a sensible thing to want to do | 15:48 |
mordred | brendangalloway: can you try using wait_for_connection instead? | 15:50 |
pabelanger | that didn't work | 15:50 |
brendangalloway | That was my original approach | 15:50 |
mordred | oh. weird | 15:50 |
mordred | k. | 15:50 |
pabelanger | wait_for, was to scan to see if port was open | 15:51 |
clarkb | corvus: https://review.opendev.org/#/c/682487/ is the stack you were asking for review on right? | 15:51 |
pabelanger | then try wait_for_connection | 15:51 |
pabelanger | brendangalloway: next option, would be shell ssh-keyscan / loop | 15:51 |
brendangalloway | but it's getting the 'unable to ssh' error | 15:51 |
pabelanger | but with executor, that is going to be blocked too | 15:51 |
pabelanger | IIRC | 15:51 |
brendangalloway | wait_for being allowed to execute would be ideal | 15:51 |
corvus | clarkb: that's the end, https://review.opendev.org/681132 is the start | 15:52 |
*** gtema_ has quit IRC | 15:52 | |
corvus | Shrews: generally yes -- i'll take a detailed look in a few | 15:52 |
*** bolg has joined #zuul | 15:52 | |
*** hashar has quit IRC | 15:53 | |
*** hashar has joined #zuul | 15:54 | |
brendangalloway | pabelanger: That would also still need to be executed by trusted or a third party? | 15:54 |
pabelanger | brendangalloway: yah, in this case, you are going to have very limited way to check in untrusted job | 15:54 |
mordred | brendangalloway: yeah - I think allowing wait_for to be used makes sense - it might be tomorrow before I can get enough headspace to dive in to the action plugin exclusions and figure it all out | 15:54 |
pabelanger | what I'd suggest, is figure out how to do the wait check, move that job into trusted, then parent to it from untrusted job | 15:55 |
mordred | pabelanger, brendangalloway: what didn't work with wait_for_connection (curious) | 15:55 |
*** mattw4 has joined #zuul | 15:56 | |
brendangalloway | pabelanger: that could be an option. Luckily the reformat is the first role called and once it's working it shouldn't need much changing | 15:56 |
pabelanger | not sure, I'm guessing that it's something with the socket from executor to node? | 15:57 |
brendangalloway | mordred: "msg": "SSH Error: data could not be sent to remote host "<redacted>". Make sure this host can be reached over ssh" | 15:57 |
pabelanger | wait | 15:57 |
pabelanger | so, if a new host is coming online with pxe boot | 15:57 |
pabelanger | it will have new hostkeys | 15:57 |
pabelanger | and zuul-executor hasn't accepted them | 15:57 |
*** hashar has quit IRC | 15:58 | |
brendangalloway | same playbook works when I run it from my laptop, and once nodepool sees the node, zuul accesses them just fine | 15:58 |
pabelanger | brendangalloway: are you preserving hostkeys? | 15:58 |
brendangalloway | pabelanger: that's already been solved | 15:58 |
pabelanger | kk | 15:58 |
brendangalloway | yes | 15:58 |
*** hashar has joined #zuul | 15:58 | |
pabelanger | I wonder if ssh-agent hasn't timed out or something | 15:58 |
pabelanger | brendangalloway: maybe try using meta reset_connection before wait_for_connection? | 15:59 |
pabelanger | that would cause ansible-playbook to create a new connection again | 15:59 |
brendangalloway | Is there some way I can introduce a delay on that? I'd like to wait a minute or two to make sure the connection is properly down before resetting and waiting for the connection | 16:02 |
pabelanger | wait_for_connection has delay / sleep / timeout settings | 16:03 |
*** noorul has joined #zuul | 16:03 | |
pabelanger | same with wait_for | 16:03 |
pabelanger | but you can also use pause task to hardcode a delay too | 16:03 |
brendangalloway | so wait_for_connection with ignore errors, reset connection, wait_for_connection again? | 16:04 |
pabelanger | reboot node, reset connection, wait_for_connection (delay / sleep / timeout) | 16:05 |
mordred | I look away for a second and come back to a very fun scrollback | 16:06 |
brendangalloway | when I call reset connection, will it just drop the connection until the next task? Or will it try to reconnect as part of the reset? | 16:06 |
pabelanger | it will reconnect | 16:06 |
pabelanger | on next task | 16:06 |
brendangalloway | sounds perfect, I will test that then | 16:06 |
pabelanger | I'm starting to think, when node is booting up again, something is going on with sshd on server side. And maybe connection is reset or something | 16:07 |
pabelanger | why I like wait_for, is you can look for SSHd headers in connection attempt | 16:07 |
pabelanger | eg: https://github.com/ansible/ansible-zuul-jobs/blob/master/playbooks/ansible-network-appliance-base/pre.yaml#L25 | 16:07 |
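The check pabelanger describes (wait until the port is open and the server actually presents an SSH banner) can be illustrated outside Ansible with a short Python sketch. This is not the playbook linked above; the function name `wait_for_ssh_banner` and its defaults are assumptions made for the example. It polls the port until the first bytes received start with `SSH-`, or gives up after a timeout.

```python
import socket
import time


def wait_for_ssh_banner(host, port=22, timeout=300.0, interval=5.0):
    """Poll host:port until it sends an SSH banner or the timeout elapses.

    Returns True once a banner starting with b"SSH-" is seen, else False.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval) as sock:
                sock.settimeout(interval)
                banner = sock.recv(64)
                if banner.startswith(b"SSH-"):
                    return True
        except OSError:
            # Refused, unreachable, or timed out: the host is not ready yet.
            pass
        time.sleep(interval)
    return False
```

An open port alone only shows that something is listening; seeing the banner shows that sshd has started answering, which is a stronger signal during a PXE reinstall where the port may flap.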
brendangalloway | we do a normal reboot at other stages in the play and wait_for_connection works fine | 16:09 |
*** hashar has quit IRC | 16:10 | |
noorul | Where does ansible push the code to the remote node? | 16:12 |
clarkb | noorul: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/prepare-workspace-git/README.rst that is the role that should be used and typically as an early action of a base job | 16:14 |
corvus | noorul: https://zuul-ci.org/docs/zuul/admin/quick-start.html#configure-a-base-job also has more info -- you might remember doing that part when you ran through the quickstart | 16:15 |
clarkb | corvus: left a question on https://review.opendev.org/#/c/681132/ | 16:15 |
corvus | clarkb, noorul: the quickstart uses 'prepare-workspace' -- maybe we should change it to prepare-workspace-git ? | 16:15 |
clarkb | corvus: ++ prepare-workspace-git will handle presence of a cache or no cache | 16:16 |
noorul | clarkb: I am using prepare-workspace | 16:16 |
clarkb | it is more flexible | 16:16 |
corvus | clarkb: yes. Do you want that in the form of a new patchset or followup? | 16:17 |
clarkb | corvus: considering there are already a few moving parts and we likely need a full stack to do anything useful a followup is probably fine | 16:17 |
corvus | clarkb: cool. i'll stage the fix that way and wait for you to finish the stack before pushing it up. | 16:18 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add autohold delete/info commands to web API https://review.opendev.org/679057 | 16:28 |
*** rlandy has quit IRC | 16:30 | |
*** hashar has joined #zuul | 16:33 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Remove outdated TODO https://review.opendev.org/682421 | 16:34 |
*** rlandy has joined #zuul | 16:35 | |
clarkb | corvus: left some thoughts on the big change https://review.opendev.org/#/c/680778 | 16:35 |
corvus | heh, i just replied to mordreds comment on that, i'll look at clarkb's now | 16:35 |
*** noorul has quit IRC | 16:45 | |
clarkb | corvus: https://review.opendev.org/#/c/681936/ I think I found a bug in that change | 16:47 |
clarkb | (and I -1'd because I think merging it as is would break opendev) | 16:47 |
corvus | clarkb: ++ | 16:48 |
*** rfolco is now known as rfolco|dentist | 16:51 | |
SpamapS | TIL that if you have a directory in your roles path that just has a README.rst in it, Ansible will still consider that a "role", and when you depend on that role, it will happily consider it having run successfully by not doing anything at all. | 16:54 |
clarkb | corvus: and comment on https://review.opendev.org/#/c/682487 I think all of my comments but the one on 681936 can be addressed as followups if you prefer | 16:54 |
SpamapS | This seems like.. a less than ideal default. | 16:54 |
*** pcaruana has joined #zuul | 16:55 | |
corvus | clarkb: what do you think of my response on that one? | 16:57 |
paladox | corvus robot comments are in 2.15.0. At least i don't remember anyone adding it in a point release :) | 16:57 |
corvus | paladox: cool, i'll set it to >=2.15.0 then | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add enqueue reporter action https://review.opendev.org/681132 | 16:58 |
paladox | corvus https://github.com/GerritCodeReview/gerrit/commit/3fde7e4e75f4653e4a56e6c38bc7718a3280bd9c#diff-a9ed91a039490d5fed094853d96608fe | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add no-jobs reporter action https://review.opendev.org/681278 | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add report time to item model https://review.opendev.org/681323 | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add Item.formatStatusUrl https://review.opendev.org/681324 | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Update gerrit pagination test fixtures https://review.opendev.org/682114 | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Support HTTP-only Gerrit https://review.opendev.org/681936 | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add autogenerated tag to Gerrit reviews https://review.opendev.org/682473 | 16:58 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Use robot_comments in Gerrit https://review.opendev.org/682487 | 16:58 |
corvus | clarkb: those exceeded my own threshold for followups, so i amended the original commits and redid the whole stack | 16:58 |
clarkb | corvus: wfm | 16:59 |
paladox | it's actually in 2.14 based on that commit! | 16:59 |
*** jpena is now known as jpena|off | 16:59 | |
clarkb | corvus: in https://review.opendev.org/#/c/682487/3..4/zuul/driver/gerrit/gerritconnection.py you updated the > to >= but left the version as 2.15.0 instead of 2.15.16. The rest of the stack lgtm now | 17:03 |
corvus | clarkb: yeah did that based on paladox's comment above | 17:04 |
clarkb | oh /me catches up on irc | 17:04 |
clarkb | aha thanks | 17:04 |
corvus | which came right after i left the response in gerrit, sorry i didn't update | 17:04 |
clarkb | in that case I think the whole stack is good | 17:05 |
Shrews | corvus: been wanting to review that stack, but been dealing with a stack of my own :/ | 17:06 |
Shrews | tristanC: you want to review https://review.opendev.org/682675 that deals with the node expiration issue? | 17:10 |
*** brendangalloway has quit IRC | 17:13 | |
tristanC | Shrews: left a comment | 17:18 |
Shrews | tristanC: Not sure what you're asking there. That code should act the same if the user supplied 0 or not | 17:20 |
tristanC | Shrews: if the user supplied 0, then the code defaults to the scheduler configuration value | 17:21 |
tristanC | Shrews: e.g. shouldn't we differentiate a supplied 0, an explicit no expiration, from the default cli value? | 17:21 |
Shrews | tristanC: oh, well, that brings up a good question. Would we ever want a user supplied value to exceed our zuul's configured max? If we've configured a max of 2 days, would we want to allow something greater than that? | 17:25 |
tristanC | Shrews: at the moment you can, but you would have to use a silly "--hold-expiration 99999999" to ensure a greater value than what is zuul's configured max | 17:26 |
Shrews | tristanC: right. either way, yes, that needs fixed. but i think that needs to be answered before i can fix it properly | 17:26 |
tristanC | Shrews: I guess supplied expiration should never exceed what is configured in zuul | 17:27 |
tristanC | Shrews: or perhaps it could on the cli, but not from the rest endpoint | 17:27 |
Shrews | corvus: after our call, maybe you have an opinion on that ^^^ | 17:30 |
*** michael-beaver has quit IRC | 17:40 | |
openstackgerrit | Merged zuul/zuul-jobs master: Update the base-roles test to use prepare-workspace-git https://review.opendev.org/680703 | 17:47 |
openstackgerrit | Merged zuul/zuul-jobs master: Clean non-bare remote repos https://review.opendev.org/680689 | 17:47 |
*** recheck has quit IRC | 17:49 | |
*** recheck has joined #zuul | 17:53 | |
*** themroc has quit IRC | 17:54 | |
*** igordc has joined #zuul | 18:16 | |
corvus | Shrews, tristanC: i don't think we want to differentiate cli vs rest -- i believe the cli is expected to move to use rest eventually anyway. the current nodepool setting is a true max -- it can't be exceeded. | 18:21 |
corvus | Shrews, tristanC: max_hold_expiration should probably match that. maybe we want to add another option though, a default? for the situations where you might not want to set a hard limit, but you don't want the default to be unlimited. | 18:22 |
corvus | er, 'max_hold_age' | 18:22 |
*** bhavikdbavishi has quit IRC | 18:30 | |
Shrews | corvus: ok. i can rework it for that | 18:41 |
Shrews | default_hold_expiration / max_hold_expiration is probably clearest | 18:47 |
corvus | mordred: i updated the gerrit stack, so if you have a sec to re-review the updated changes, that'd be swell | 18:55 |
*** armstrongs has joined #zuul | 19:09 | |
*** armstrongs has quit IRC | 19:15 | |
*** kerby has joined #zuul | 19:21 | |
*** bolg has quit IRC | 19:38 | |
*** spsurya has quit IRC | 19:48 | |
*** sean-k-mooney has quit IRC | 20:17 | |
*** sean-k-mooney has joined #zuul | 20:25 | |
*** panda|ruck is now known as panda|ruck|off | 20:29 | |
*** kerby has quit IRC | 20:43 | |
*** Goneri has quit IRC | 20:46 | |
openstackgerrit | Ian Wienand proposed zuul/zuul master: [dnm] testing python3 ansible https://review.opendev.org/682556 | 20:53 |
*** kerby has joined #zuul | 20:56 | |
*** hashar has quit IRC | 20:57 | |
*** pcaruana has quit IRC | 20:58 | |
corvus | zuul-maint: i pushed up https://review.opendev.org/682743 regarding http://lists.zuul-ci.org/pipermail/zuul-discuss/2019-September/001017.html | 20:59 |
openstackgerrit | Ian Wienand proposed zuul/zuul master: [dnm] testing python3 ansible https://review.opendev.org/682556 | 21:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 21:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Update gerrit pagination test fixtures https://review.opendev.org/682114 | 21:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Support HTTP-only Gerrit https://review.opendev.org/681936 | 21:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add autogenerated tag to Gerrit reviews https://review.opendev.org/682473 | 21:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Use robot_comments in Gerrit https://review.opendev.org/682487 | 21:15 |
*** rfolco|dentist is now known as rfolco | 21:23 | |
ianw | SpamapS / mordred: well it seems 10+ hours of debugging that python3 failure has come down to a single "b" character :) | 21:37 |
SpamapS | ianw: of course it has | 21:52 |
openstackgerrit | James E. Blair proposed zuul/project-config master: Add a third-party check pipeline to OpenDev https://review.opendev.org/682756 | 21:52 |
corvus | oops ^ wrong repo :) | 21:59 |
corvus | i've pushed HEAD as 3.10.2 for the security fix | 22:09 |
*** jamesmcarthur has quit IRC | 22:13 | |
openstackgerrit | Merged zuul/zuul master: Add enqueue reporter action https://review.opendev.org/681132 | 22:23 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Move reference pipelines out of the quickstart https://review.opendev.org/682760 | 22:36 |
corvus | we should merge ^ asap -- i'm not sure if it's causing the quick-start instability, but it's certainly not helping debug it and it's not doing new users any favors | 22:37 |
clarkb | hrm do zuul release notes only update when changes merge and not when we tag things? | 22:41 |
clarkb | I'm noticing that 3.10.2 isn't in the release notes yet | 22:41 |
clarkb | (still shows under in development) | 22:41 |
clarkb | there is a change in the gate so I guess we will know soon if that merges and updates release notes | 22:42 |
corvus | clarkb: yep. https://zuul-ci.org/docs/zuul/3.10.2/releasenotes.html is correct, but master is not | 22:42 |
corvus | until a change lands | 22:42 |
clarkb | aha | 22:42 |
clarkb | thanks! | 22:42 |
corvus | unintended consequence of promote | 22:42 |
corvus | oh wait, we don't promote on tag | 22:42 |
corvus | so yeah, i guess we need a "rebuild master docs" job on release | 22:43 |
corvus | so we'd build it twice, once for the tag, and once so that master sees the new tag | 22:43 |
corvus | but we can't use the same build for both because the tag may be behind master | 22:43 |
corvus | i think this is a peculiarity of projects that put their release notes in their docs, which isn't most of the reno users? | 22:43 |
clarkb | I think openstack reno usage publishes the release notes independent of the docs | 22:44 |
corvus | i sent the release announce email | 22:45 |
fungi | yes, openstack projects run a separate release notes job | 22:46 |
fungi | independent of docs jobs for the same repos | 22:46 |
corvus | i think we should be able to fix it with some override-checkout mojo | 22:47 |
corvus | i'm too braindead to work it out myself right now :) | 22:47 |
fungi | i don't think it's especially urgent, no | 22:48 |
clarkb | ya the next update to master will fix it | 22:48 |
clarkb | and that will happen shortly according to the zuul status page | 22:48 |
* mnaser just barging in with ideas | 22:48 | |
mnaser | how different/far is bwrap's security model from docker and friends | 22:49 |
mnaser | i.e. could we technically have a k8s native zuul-executor "driver" to use pods instead of bwrap | 22:49 |
mnaser | (coming in from a security approach and not as much of a "how much code has to be done") | 22:50 |
clarkb | I think it's quite a bit different to how people often deploy k8s but not so different than openshift's locked down pods | 22:50 |
clarkb | for example k8s on google by default gave every pod admin access to the account in the past (I think they have changed that since) | 22:50 |
mnaser | oof | 22:50 |
mnaser | i was thinking more on the container side of things, that stuff could technically be locked down via serviceaccount/rolebindings | 22:51 |
SpamapS | mnaser: bubblewrap as zuul uses it is pretty locked down. clarkb's assessment matches my own. | 22:51 |
clarkb | specifically it is there to limit blast radius if you manage to do something on the executor via ansible in an untrusted job | 22:51 |
fungi | yeah, at best docker and pals would be no better security-wise (they rely on the same kernel features after all) | 22:52 |
SpamapS | Essentially, if you can break out of bwrap, you probably have a local kernel root. | 22:52 |
clarkb | and often times the way k8s is deployed means pods are trusted to interact with the rest of the system | 22:52 |
clarkb | we want the opposite of that | 22:52 |
SpamapS | hm, that's not been my experience | 22:52 |
mnaser | right but there's a lot of stuff now to avoid exactly that (networkpolicy, securitycontext, etc) | 22:52 |
mnaser | i was just trying to compare the container vs bwrap aspect, if in some weird/odd way you could have pods _only_ instead of zuul-executors with bwraps inside them | 22:53 |
mnaser | then you can start scaling things out far more because you're not locking down an executors 'bwrap' to a single host | 22:53 |
SpamapS | Unless you allow privileged: true, my experience has been that most k8s nodes are set up to be pretty safe from container escape. | 22:53 |
mnaser | SpamapS: my only annoyance is the fact all services are exposed to everything by default | 22:54 |
mnaser | but networkpolicy can work around that easily | 22:54 |
clarkb | SpamapS: cloud providers like gke put cloud authentication stuff in those pods though | 22:54 |
clarkb | SpamapS: and from that you change the config to allow privileged true and win | 22:54 |
SpamapS | mnaser: that's life, IMO. zero trust networking is the cloud model that we follow. Everything requires auth. | 22:54 |
fungi | well, part of the problem for scaling it to hosts other than the executor is that we share files directly into the bubblewrap containers | 22:54 |
clarkb | I think it's more of a make things easy for users problem in popular deployments | 22:54 |
corvus | mnaser: it's come up before. i think it would be possible, perhaps desirable for various non-security purposes to have a container-based executor, but it's going to take a lot of planning and implementation. we're talking a pretty big spec, and it's absolutely going to depend on the successful completion of the zuul-operator spec. in the mean time, i agree with clarkb and SpamapS that it wouldn't be a | 22:55 |
corvus | security win. the more compelling reasons to do that have to do with, honestly, things like better integration with openshift. | 22:55 |
clarkb | not inherent to k8s, and openshift for example locks it down | 22:55 |
mnaser | on the subject of the zuul-operator, i am kinda under a very pressing deadline so i have been taking time to set things up and see what a golang based operator looks like | 22:55 |
corvus | (and yeah, fungi has hit on one of the fundamental design problems to be overcome) | 22:56 |
mnaser | and ive discovered more useful things like the ability to literally embed another operator that uses operator-sdk | 22:56 |
clarkb | https://cloud.google.com/kubernetes-engine/docs/concepts/security-overview#securing_instance_metadata for gke docs on the subject | 22:56 |
mnaser | so why ask the user to install the zookeeper operator when you can quite literally just include it as a dependency _inside_ your operator and it will start managing zookeeper, so 1 operator for everything | 22:56 |
corvus | mnaser: well, the spec addresses that with https://github.com/operator-framework/operator-lifecycle-manager | 22:57 |
mnaser | and im not talking about an extra pod, im talking about the apis and controllers living _inside_ the zuul-operator | 22:57 |
corvus | mnaser: would embedding be a better approach? | 22:57 |
mnaser | yeah, but this adds up a whole bunch of other things, the zuul-operator would manage Zookeeper types inside its namespace | 22:57 |
mnaser | i.e.: to get started, simply install the zuul-operator. and that's it. you're done. | 22:58 |
corvus | mnaser: i thought that would come out of the lifecycle manager too? | 22:58 |
mnaser | no OLM -- or -- "if you dont want to use OLM.... make sure you have the zookeeper and the pxc and the etc" | 22:58 |
mnaser | it sounds like OLM is actually an extra component that will make it so you have 3 operators running in your namespace, whereas in this case, you have one | 22:59 |
mnaser | this means you dont need anything _except_ the zuul-operator running | 22:59 |
SpamapS | Still | 22:59 |
mnaser | also some other fun things you can do with golang that you couldnt with the ansible one, i actually can do things like use the provided github credentials | 23:00 |
SpamapS | that's highly optimized | 23:00 |
mnaser | poll github | 23:00 |
SpamapS | I respect the desire to optimize it | 23:00 |
SpamapS | but there's a whole community to think about | 23:00 |
mnaser | and pull down repositories it's installed into | 23:00 |
SpamapS | 98% awesome, and maintainable by ansible-knowing folks is better than 100% awesome but only 10% of Zuul users can approach it. | 23:00 |
mnaser | so no need to necessarily list out all the repos you are using the app for (obviously some might want to explicitly list it, but it does simplify life in that manner) | 23:00 |
mnaser | i totally understand | 23:01 |
mnaser | anyways, i'd be happy to show what i have at some point if there's interest, i don't think i'd probably want to throw it out there to cause confusion for those seeking a zuul operator | 23:02 |
*** tosky has quit IRC | 23:11 | |
*** jamesmcarthur has joined #zuul | 23:24 | |
openstackgerrit | Merged zuul/zuul master: Move reference pipelines out of the quickstart https://review.opendev.org/682760 | 23:25 |
*** mattw4 has quit IRC | 23:34 | |
*** igordc has quit IRC | 23:39 | |
*** jamesmcarthur has quit IRC | 23:50 | |
*** rlandy has quit IRC | 23:51 |
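
Note on the hold expiration thread (17:10 to 18:47): the behaviour Shrews, tristanC and corvus converged on could be sketched roughly as below. This is only an illustration under stated assumptions, not the code in https://review.opendev.org/682675. The function name `resolve_hold_expiration` is hypothetical, the option names follow Shrews' suggestion of default_hold_expiration / max_hold_expiration, and treating 0 as "never expire" (for both the request and the options) is an assumption drawn from tristanC's "explicit no expiration" remark.

```python
from typing import Optional


def resolve_hold_expiration(requested: Optional[int],
                            default_hold_expiration: int = 0,
                            max_hold_expiration: int = 0) -> int:
    """Return the effective hold expiration in seconds.

    Hypothetical helper; assumes 0 means "never expire" everywhere.
    """
    # None means the user did not supply a value at all, which is
    # distinct from an explicit 0 ("no expiration").
    value = default_hold_expiration if requested is None else requested
    if value < 0:
        raise ValueError("hold expiration must not be negative")

    # The max is a true max: when one is configured, nothing may
    # exceed it, and "never expire" counts as exceeding it.
    if max_hold_expiration > 0:
        if value == 0 or value > max_hold_expiration:
            return max_hold_expiration
    return value
```

With this shape, an omitted value picks up the default, while an explicit 0 or an oversized value such as 99999999 is capped at the configured max, so the cli and the rest endpoint would behave the same way.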