pabelanger | SpamapS: ya, I've just been using zookeeper from fedora for testing, so centos-8 will have something. But, we can likely push zookeeper-lite from tristanC / dmsimard to COPR or something | 00:03 |
SpamapS | meh | 00:03 |
SpamapS | tarball seems to be working-ish | 00:03 |
pabelanger | I have an ansible-role for zookeeper I plan on adding tarball support for also | 00:03 |
pabelanger | cool | 00:03 |
SpamapS | pabelanger: -> BonnyCI/hoist | 00:03 |
pabelanger | SpamapS: are you centos now? | 00:04 |
SpamapS | pabelanger: aye | 00:05 |
mordred | jeblair: http://logs.openstack.org/02/500202/24/check/devstack/a0ebcd2/job-output.txt.gz#_2017-09-07_22_29_05_056316 | 00:06 |
mordred | jeblair: I think you got further | 00:06 |
pabelanger | SpamapS: cool | 00:07 |
SpamapS | sorta ;) | 00:07 |
pabelanger | :) | 00:08 |
ianw | SpamapS: if you want "shoved into a rpm" then i've got -> https://copr.fedorainfracloud.org/coprs/iwienand/zookeeper-el7/packages/ | 00:08 |
SpamapS | ianw: you're about 20 minutes late | 00:09 |
ianw | SpamapS: it's ok, nothing happens for years, then everything happens in 20 minutes ;) | 00:10 |
SpamapS | yep | 00:11 |
SpamapS | realistically, I'm just pushing this bonnyci work to completion, and then I'll probably just install Software Factory and be happy | 00:11 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add support for debian in configure-mirrors https://review.openstack.org/501537 | 00:11 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add support for fedora in configure-mirrors https://review.openstack.org/501538 | 00:18 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add support for opensuse in configure-mirrors https://review.openstack.org/501539 | 00:19 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Remove project pipeline definition from zuul-jobs https://review.openstack.org/501849 | 00:20 |
ianw | have i done something wrong or is https://review.openstack.org/#/c/501904 not getting picked up for testing? | 00:28 |
ianw | hmm, neither is a recheck of an old job | 00:29 |
ianw | it seems to be in a pretty hard loop around http://paste.openstack.org/show/620679/ | 00:36 |
ianw | jeblair / mordred: ^ ... | 00:39 |
mordred | ianw: hrm | 00:42 |
mordred | ianw: I agree with you - that definitely looks like a hard loop | 00:43 |
ianw | variations of | 00:43 |
ianw | 2017-09-08 00:43:14,011 INFO zuul.DependentPipelineManager: Reported change <Change 0x7fecd74a41d0 501849,2> status: all-succeeded: True, merged: True | 00:43 |
ianw | 2017-09-08 00:43:14,115 INFO zuul.DependentPipelineManager: Reported change <Change 0x7fecd7010c88 501538,5> status: all-succeeded: True, merged: True | 00:43 |
ianw | over and over | 00:43 |
mordred | ianw: 501849 removed a pipeline definition from zuul-jobs that was in the zuul-jobs repo | 00:45 |
mordred | ianw: a few bugs have flushed out related to pipeline removals - so maybe something went south? | 00:45 |
mordred | ianw: in any case, at that rate we're going to run out of disk space due to logging | 00:46 |
mordred | ianw: so how about we restart the scheduler (if there's not enough logging now there never will be) | 00:46 |
ianw | sgtm | 00:46 |
mordred | I have stopped it - restarting | 00:47 |
mordred | it's now reading its config | 00:48 |
dmsimard | what did I break | 00:49 |
mordred | ianw: I rechecked your change and it seems to be enqueued | 00:49 |
dmsimard | mordred: I made sure to put depends-on on my forklift patches to ensure things did not merge out of order | 00:50 |
dmsimard | mordred: https://review.openstack.org/#/q/topic:zuulv3-forklift | 00:50 |
mordred | dmsimard: I don't think it was that - my hunch is that we tickled a weird edge-case somewhere | 00:50 |
dmsimard | oh wow my patch series finally landed \o/ | 00:50 |
dmsimard | need +3 on two patches to bring back zuul-jobs jobs into place https://review.openstack.org/#/q/topic:zuulv3-forklift | 00:51 |
SpamapS | gah.. so many apt: calls to fix :-P | 00:52 |
mordred | ianw: although we're now just sitting in queued state | 00:53 |
mordred | ianw: nevermind. I'm just impatient | 00:53 |
dmsimard | I'm not sure I understand the deal with that apt issue | 00:54 |
dmsimard | ansible is unable to use python-apt until you do apt-cache update or something like that ? | 00:55 |
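dmsimard's guess is roughly right: the apt module can fail against a stale package cache. A hedged sketch of the task shape that sidesteps it (the package name is illustrative, not from the conversation):

```yaml
# Hypothetical task: refresh the apt cache before installing, since the
# apt module (via python-apt) can fail when the cache is stale or empty.
- name: Install a package with a cache refresh
  apt:
    name: zookeeperd        # illustrative package name
    state: present
    update_cache: yes       # roughly equivalent to running `apt-get update` first
    cache_valid_time: 3600  # skip the refresh if the cache is under an hour old
```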
ianw | dmsimard: just been looking at them, as i was doing 501904 | 00:55 |
ianw | mordred / dmsimard: are we at a point we can merge https://review.openstack.org/#/q/topic:zuulv3-forklift ? | 00:58 |
dmsimard | ianw: yes, it's tech debt from merging my configure-mirror tree | 00:58 |
ianw | it lgtm, if zuul's up to it :) | 00:58 |
ianw | everything seems to be moving, so ok | 00:58 |
dmsimard | ianw: for the ci creation script, go ahead | 00:59 |
dmsimard | ianw: if it's part of what would be the base job, add the role in the base.yaml test playbook | 00:59 |
dmsimard | that's where validate-host will end up as well | 00:59 |
dmsimard | although I still need to figure out what to do with that one, it has tasks that can only run on the executor.. | 01:00 |
ianw | dmsimard: yep ... so for testing purposes, just assert: that: tests that things look right? | 01:00 |
ianw | file is there, perms are right, etc? | 01:00 |
ianw | or is there a better way? | 01:00 |
dmsimard | ianw: checking that it runs without horribly failing is already a good start (the job will fail if it fails horribly) | 01:00 |
dmsimard | ianw: asserts are good too. | 01:00 |
dmsimard | ianw: there's some examples here if you want: https://github.com/ansible/ansible/tree/devel/test/integration | 01:00 |
dmsimard | ianw: some examples of what I've written https://github.com/ansible/ansible/blob/devel/test/integration/targets/sensu_client/tasks/main.yml | 01:01 |
dmsimard | https://github.com/ansible/ansible/tree/devel/test/integration/targets/include_vars | 01:01 |
ianw | excellent thanks; looks like roughly what i had in my head so a good sign ;) | 01:02 |
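The assert-style checks ianw describes (file is there, perms are right) might look like this; the path, mode, and owner are made-up examples, not from any real role:

```yaml
# Hypothetical verification tasks for a role's integration test:
# stat the file the role should have written, then assert on the result.
- name: Check the deployed file
  stat:
    path: /etc/example/mirror.conf   # illustrative path
  register: mirror_conf

- name: Verify it exists with the expected permissions
  assert:
    that:
      - mirror_conf.stat.exists
      - mirror_conf.stat.mode == '0644'
      - mirror_conf.stat.pw_name == 'root'
```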
*** harlowja has quit IRC | 01:11 | |
tristanC | ianw: fwiw, there is a zookeeper-lite rpm package built in software-factory repository | 01:33 |
ianw | tristanC: cool, is that more or less the tarball in rpm format? | 01:40 |
tristanC | ianw: it's built from source for el7, the trick was to remove all the client and netty, hence the "-lite" suffix | 01:42 |
tristanC | package is https://softwarefactory-project.io/kojifiles/repos/sf-2.6-el7-release/Mash/zookeeper-lite-3.4.10-1.el7.x86_64.rpm, .spec is https://softwarefactory-project.io/r/gitweb?p=software-factory/zookeeper-lite-distgit.git;a=tree | 01:42 |
ianw | tristanC: excellent, updated https://etherpad.openstack.org/p/zookeeper-epel7 in case any googlenauts find it | 01:54 |
dmsimard | wanted to be productive tonight but spent the evening recovering from rdo infrastructure outage | 01:54 |
dmsimard | ianw: if you want to take a stab at devstack centos/fedora/suse/debian go for it, I'll follow up tomorrow | 01:55 |
ianw | ok, see how i got this afternoon. need some lunch first! :) | 01:56 |
dmsimard | ianw: if you try out fedora, I can probably just pattern off of that for | 01:56 |
dmsimard | centos* and the others | 01:56 |
tristanC | ianw: thanks! | 01:58 |
*** olaph1 has quit IRC | 02:08 | |
*** xinliang has quit IRC | 02:12 | |
*** xinliang has joined #zuul | 02:24 | |
*** xinliang has joined #zuul | 02:24 | |
dmsimard | all the cloud outage things are fixed | 02:53 |
dmsimard | going to bed now gnight | 02:53 |
dmsimard | o/ | 02:53 |
*** jkilpatr has quit IRC | 02:59 | |
tristanC | SpamapS: there is an upcoming blog post about using software-factory as a third-party-ci: https://softwarefactory-project.io/r/#/c/9473/3/Openstack_3rd_Party_CI_with_SF_26/2017-08-28-openstack-3rd-party-ci-with-software-factory.html.md | 03:07 |
tristanC | SpamapS: and the general documentation is https://softwarefactory-project.io/docs/operator/deployment.html | 03:08 |
SpamapS | tristanC: awesome. :) I really do intend to give it a shot. I just figured I invested half a day in centos-ifying BonnyCI/hoist, I might as well finish :) | 03:08 |
tristanC | As you wish, I think the main difference is that software-factory uses rpm packages for everything and the sfconfig script automatically generates all the secrets | 03:09 |
SpamapS | there are likely tons of differences :) | 03:09 |
tristanC | may i ask how are you getting python3 on centos? | 03:10 |
SpamapS | but ultimately, if I can get a zuulv3 up that talks to our internal Github and starts running jobs on our cloud... I can show people the magic. :) | 03:10 |
SpamapS | tristanC: haven't crossed that bridge yet. :) | 03:10 |
SpamapS | I'm still in the stage of converting all the apt's to yums. | 03:11 |
SpamapS | and went through the really silly-feeling process of making mariadb wori | 03:11 |
SpamapS | work | 03:11 |
tristanC | alternatively, you could try using software-fatory repository so that you could yum install zookeeper, zuul and nodepool | 03:12 |
tristanC | well fwiw, all the distgits are available over softwarefactory-project.io gerrit, and package updates go through full integration tests so contributions are welcome too :-) | 03:14 |
SpamapS | neat | 03:14 |
ianw | jeblair / mordred : zuul has hung again, and the logfile is up to 16gb ... | 03:23 |
ianw | http://paste.openstack.org/show/620684/ | 03:24 |
ianw | i'm going to restart it, because this doesn't end well | 03:25 |
SpamapS | hrm so why isn't there python3.5 in EPEL? :-P | 03:35 |
SpamapS | or does python3.4 work ok for zuul/nodepool these days? :-P | 03:35 |
* SpamapS can't remember what the min version was | 03:35 | |
tristanC | SpamapS: you could get python35 using softwarecollections | 03:38 |
SpamapS | tristanC: ah | 03:38 |
tristanC | SpamapS: but then, zuul-executor needs libpython35 in /usr/lib because bwrap will drop a custom ld_library_path | 03:38 |
SpamapS | round and round we go | 03:39 |
SpamapS | tristanC: you're scoring points :) | 03:41 |
* SpamapS is having to update bonnyci's code to be very explicit about pip/python executables | 03:41 | |
tristanC | SpamapS: sort of, imo the real point is using zuul, whatever distribution/confmgmt ;) | 03:43 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: Expand PATH for SUSE systems to include /usr/sbin https://review.openstack.org/501737 | 03:45 |
clarkb | py35 is required because of the type annotations | 03:50 |
clarkb | I don't follow why bubblewrap factors into that | 03:51 |
tristanC | clarkb: when using software-collections, python35 is installed in /opt and the .so isn't declared in the main ld.so.conf | 03:52 |
clarkb | I had zuul running on py34 for gerrit testing but that was before the type annotations | 03:52 |
clarkb | tristanC: oh software collection specific thing | 03:52 |
tristanC | clarkb: yes, regarding python35 on centos7 with scl | 03:53 |
tristanC | the issue is that bwrap is setuid and it will drop the custom LD_LIBRARY_PATH from zuul-executor, so ansible-playbook will fail with a missing libpython35 | 03:55 |
clarkb | got it | 03:56 |
tristanC | though it's easy to fix, just symlink the python lib from /opt to /lib | 03:56 |
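tristanC's symlink workaround could be expressed as an Ansible task along these lines; the SCL library paths below are assumptions for illustration, not taken from the conversation:

```yaml
# Hypothetical workaround: expose the SCL python35 shared library on the
# default loader path, since setuid bwrap drops LD_LIBRARY_PATH.
- name: Symlink libpython3.5 out of the SCL prefix
  file:
    src: /opt/rh/rh-python35/root/usr/lib64/libpython3.5m.so.rh-python35-1.0   # illustrative SCL path
    dest: /usr/lib64/libpython3.5m.so.rh-python35-1.0
    state: link
```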
*** xinliang has quit IRC | 03:59 | |
ianw | dmsimard: ahh, i think we've had a bit of a split-brain thing happening between depends-on, zuulv2 and zuulv3 merging with your jobs moving tests. just untangling it now ... | 04:02 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Expand PATH for SUSE systems to include /usr/sbin https://review.openstack.org/501737 | 04:02 |
*** xinliang has joined #zuul | 04:06 | |
ianw | oh, you know what ... it's because there's no jenkins reporting on the depends-on | 04:10 |
SpamapS | weird... pip: is using 'pip2' instead of 'pip' which is causing all kinds of weirdness for me :-P | 04:27 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: Add integration tests for validate-host https://review.openstack.org/501543 | 04:34 |
tristanC | SpamapS: btw, you'll also need git-2 because zuul is using GIT_SSH_COMMAND, which is also avail in scl with rh-git29 | 05:05 |
tristanC | and well, if you don't want to bother with those details, you could get the service installed now using "yum install -y https://softwarefactory-project.io/repos/sf-release-master.rpm && yum install -y rh-python35-zuul-* rh-python35-nodepool-*" ... just saying | 05:11 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add a role to emit an informative header for logs https://review.openstack.org/501495 | 05:33 |
*** harlowja has joined #zuul | 06:19 | |
*** harlowja has quit IRC | 06:40 | |
*** fbo_ has joined #zuul | 06:43 | |
*** fbo_ is now known as fbo | 06:48 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Use connection type supplied from nodepool https://review.openstack.org/501976 | 06:58 |
*** yolanda has joined #zuul | 07:07 | |
*** electrofelix has joined #zuul | 08:38 | |
*** hashar has joined #zuul | 09:01 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support username also for unmanaged cloud images https://review.openstack.org/500808 | 09:04 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Add password to build and upload information https://review.openstack.org/502011 | 09:04 |
tristanC | I started the process of writing a SELinux policy for zuul, once it passes some integration tests i'd like to submit it upstream for review if you are ok with this | 09:55 |
tristanC | fwiw, the type enforcement file looks like this: https://softwarefactory-project.io/r/#/c/9593/2/zuul/sf-zuul.te | 09:56 |
*** jkilpatr has joined #zuul | 10:57 | |
*** olaph has joined #zuul | 12:03 | |
dmsimard | jeblair, mordred: so I was thinking last night.. with ara 1.0, there is the opportunity to make a *very* bare bones callback with (almost) just 'requests' as a dependency for use with the HTTP REST API. The python API leverages the flask app so it still depends on some things (less than the entire webapp, like no xstatic, etc.).. as things progress along, I'll see what the opportunities are for a more "bare | 12:08 |
dmsimard | bones" python API callback | 12:08 |
*** odyssey4me has joined #zuul | 12:43 | |
*** hashar has quit IRC | 13:28 | |
*** hashar has joined #zuul | 13:28 | |
*** dkranz has joined #zuul | 13:56 | |
rcarrillocruz | hey folks | 13:59 |
rcarrillocruz | how's bubblewrap installed | 13:59 |
rcarrillocruz | via distro package? | 13:59 |
rcarrillocruz | setting up a zuul, getting failures due to missing bwrap binary | 13:59 |
odyssey4me | rcarrillocruz are there instructions around for zuul v3 - or is this zuul v2.5 ? | 14:03 |
rcarrillocruz | setting up v3 | 14:03 |
rcarrillocruz | hmm | 14:05 |
rcarrillocruz | https://github.com/openstack-infra/puppet-zuul/blob/master/manifests/executor.pp | 14:05 |
mordred | rcarrillocruz: there's a PPA we use for ubuntu-xenial | 14:06 |
rcarrillocruz | yah, just enabled https://launchpad.net/~openstack-ci-core/+archive/ubuntu/bubblewrap/+index | 14:07 |
rcarrillocruz | thx | 14:07 |
mordred | rcarrillocruz: it's already in fedora - I'm not sure about the story for centos | 14:07 |
mordred | ++ | 14:07 |
rcarrillocruz | oh | 14:07 |
rcarrillocruz | fedora 26? | 14:07 |
rcarrillocruz | i mean, right now i'm in AIO, scheduler/merger/executor all in a xenial | 14:08 |
rcarrillocruz | but that's interesting | 14:08 |
rcarrillocruz | in other news | 14:09 |
rcarrillocruz | https://github.com/rcarrillocruz-org/zuul-tested-repo/pull/5 | 14:09 |
rcarrillocruz | i have GH reporter working | 14:09 |
rcarrillocruz | \o/ | 14:09 |
rcarrillocruz | odyssey4me: oh sorry, didn't follow, you are willing to spin up a v3? | 14:11 |
rcarrillocruz | i'm doing an install using ansible-role-zuul | 14:11 |
rcarrillocruz | getting notes of things i'm encountering that need manual fix | 14:11 |
rcarrillocruz | for later patches | 14:11 |
rcarrillocruz | but overall, the role sets up a v3 AIO nicely | 14:11 |
mordred | rcarrillocruz: wot! | 14:12 |
odyssey4me | rcarrillocruz yes, I'm particularly interested in nodepool at this stage - not sure how tightly coupled it is to zuul v3 | 14:12 |
mordred | odyssey4me: it can totally be run on its own | 14:12 |
rcarrillocruz | odyssey4me: well, i'm using nodepool standalone for a very rough CI in Ansible networking | 14:13 |
mordred | odyssey4me: nodepool in the feature/zuulv3 branch is the one you want if you're setting up a nodepool | 14:13 |
odyssey4me | ah, excellent - not that I wouldn't prefer our CI to use zuul... but for now we have jenkins :( | 14:13 |
rcarrillocruz | used ansible-role-nodepool | 14:13 |
rcarrillocruz | happy to help | 14:13 |
mordred | odyssey4me: so - there's a thing that I think would be GREAT if someone wrote | 14:13 |
mordred | odyssey4me: but I'm not going to because I'm busy | 14:13 |
odyssey4me | mordred aha, that'd be what I'm looking for then | 14:13 |
mordred | odyssey4me: which is a nodepool plugin for jenkins | 14:13 |
odyssey4me | rcarrillocruz so ansible-role-nodepool works with the feature branch version? | 14:14 |
rcarrillocruz | yep | 14:14 |
rcarrillocruz | by default it pulls feature/zuulv3 | 14:14 |
odyssey4me | mordred hmm, yes - that might actually just end up happening | 14:14 |
mordred | odyssey4me: nodepool v3 uses zookeeper for zuul to request nodes from it - it should be VERY easy for someone with the javas to write a plugin for jenkins that could request nodes from nodepool using the same api | 14:14 |
odyssey4me | rcarrillocruz sweet, that helps a bunch - thanks | 14:14 |
mordred | odyssey4me: I'd personally like it because I think it makes a really nice migration path for folks - or for situatoins where you want to run zuul and jenkins side by side | 14:15 |
rcarrillocruz | things you need to do before ansible-role-nodepool invocation: | 14:15 |
rcarrillocruz | 1.bootstrap python | 14:15 |
rcarrillocruz | 2. bootstrap pip | 14:15 |
rcarrillocruz | 3. generate ssh key | 14:15 |
mordred | odyssey4me: both a zuul and a jenkins witha zk plugin could totally consume nodes from the same nodepool with no problems | 14:15 |
rcarrillocruz | 4. copy over clouds.yaml to nodepool home folder | 14:15 |
rcarrillocruz | 5. invoke ansible-role-zookeeper | 14:16 |
odyssey4me | mordred yeah, I saw the intended lock feature - so that makes sense | 14:16 |
rcarrillocruz | 6. invoke ansible-role-nodepool, passing as param the nodepool_file_nodepool_yaml_src var (it creates nodepool.yaml) | 14:16 |
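rcarrillocruz's six steps above could be sketched as a wrapper playbook. The role names come from the conversation; the host group, package commands, and file paths are illustrative assumptions:

```yaml
# Hypothetical wrapper playbook for the bootstrap steps above.
- hosts: nodepool            # illustrative inventory group
  become: true
  pre_tasks:
    - name: Bootstrap python and pip (steps 1-2)
      raw: apt-get install -y python python-pip
    - name: Generate an ssh key for the nodepool user (step 3)
      user:
        name: nodepool
        generate_ssh_key: yes
    - name: Copy clouds.yaml into nodepool's home (step 4)
      copy:
        src: clouds.yaml
        dest: /home/nodepool/.config/openstack/clouds.yaml   # illustrative path
  roles:
    - ansible-role-zookeeper            # step 5
    - role: ansible-role-nodepool       # step 6
      nodepool_file_nodepool_yaml_src: nodepool.yaml
```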
rcarrillocruz | i'm in the process of creating a network-infra, it's private repo, once i split off secrets from it i wanna make it public | 14:17 |
rcarrillocruz | will ping you when done | 14:17 |
odyssey4me | rcarrillocruz that'd be great | 14:17 |
rcarrillocruz | but the nodepool playbook is pretty much the steps i depicted above | 14:17 |
odyssey4me | I wonder if we should etherpad that little procedure and share war stories as we go | 14:18 |
odyssey4me | I suppose a review to the repo would probably make better sense. | 14:18 |
odyssey4me | I'll take these as notes and prep a patch to the README | 14:18 |
odyssey4me | thanks rcarrillocruz once again | 14:18 |
rcarrillocruz | ++ | 14:18 |
rcarrillocruz | i need to app myself a doc patch describing what perms are bare minimum to create a zuul gh app | 14:20 |
rcarrillocruz | s/app/write | 14:20 |
*** olaph has quit IRC | 14:34 | |
*** olaph has joined #zuul | 14:36 | |
dmsimard | rcarrillocruz: what are you deploying on ? centos ? | 14:51 |
rcarrillocruz | xenial | 14:51 |
dmsimard | rcarrillocruz: okay -- if you deployed on centos I might have been able to help :) | 14:52 |
dmsimard | rcarrillocruz: https://www.rdoproject.org/blog/2017/03/standalone-nodepool/ | 14:53 |
dmsimard | out of software factory | 14:53 |
dmsimard | rcarrillocruz: oh, that might be for odyssey4me instead though | 14:53 |
rcarrillocruz | otoh, i believe sf is not on zuulv3 yet | 14:53 |
rcarrillocruz | ? | 14:53 |
dmsimard | rcarrillocruz: it's there: https://softwarefactory-project.io/zuul3/ | 14:54 |
dmsimard | rcarrillocruz: tristanC might have more details but I believe all the necessary bits to use it are there but the next version of SF should contain the "real" v3 release | 14:54 |
rcarrillocruz | orly? yanis told me SF v3 was still to be determined | 14:54 |
odyssey4me | thanks dmsimard - that may prove useful :) | 14:54 |
dmsimard | mordred: which brings the question, do you guys plan on merging feature/v3 back into master and start tagging releases at some point ? :) | 14:55 |
rcarrillocruz | hmm | 14:55 |
rcarrillocruz | sudo yum install -y --nogpgcheck https://softwarefactory-project.io/repos/sf-release-2.5.rpm | 14:55 |
rcarrillocruz | i wonder if that v3 was hand rolled | 14:55 |
rcarrillocruz | and is what yanis referred to, that there no v3 packages | 14:56 |
dmsimard | rcarrillocruz: v3 is not in 2.5 | 14:56 |
dmsimard | rcarrillocruz: it landed in 2.6 | 14:56 |
rcarrillocruz | yanis == spredzy on IRC for those who don't know | 14:56 |
dmsimard | rcarrillocruz: install 2.6 instead | 14:56 |
rcarrillocruz | ah, so SF releases are not related to zuul versioning | 14:56 |
mordred | dmsimard: yup! | 14:56 |
rcarrillocruz | good | 14:56 |
mordred | dmsimard: well - we plan on merging v3 back into master for sure | 14:57 |
mordred | dmsimard: and I believe we plan on tagging a 3.0 release once we think it's 'ready' for other folks | 14:57 |
dmsimard | rcarrillocruz, odyssey4me: docs here https://softwarefactory-project.io/docs/operator/deployment.html | 14:57 |
dmsimard | rcarrillocruz: correct, sf releases are not tagged according to zuul release :) | 14:58 |
mordred | dmsimard: ongoing releases are an interesting question - since we run CD from master (or will be once it's re-merged) - releases wind up being arbitrary snapshots -so we'll need to think about the semantics of that | 14:58 |
dmsimard | rcarrillocruz: 2.6 ships with jenkins, zuul-launcher (so zuul v2.5 without jenkins) *and* zuul v3, it's a release to help with transition | 14:58 |
rcarrillocruz | hmm, k | 14:59 |
rcarrillocruz | pabelanger: ^ | 14:59 |
tristanC | rcarrillocruz: the current roadmap is sf-2.x support both zuulv2 and zuulv3, and a future sf-3 will be zuulv3 only | 14:59 |
rcarrillocruz | we talked about SF and v3 not that long | 15:00 |
rcarrillocruz | ++ | 15:00 |
tristanC | rcarrillocruz: sf-2.6 does have a working "tech-preview" zuulv3, and we are waiting for the upstream release of zuulv3 to release it as part of sf-2.7 | 15:00 |
tristanC | the master repository (that will be sf-2.7) does have all the nodepoolv3/zuulv3 bits in place | 15:02 |
tristanC | + the static and opencontainer driver so that you can test the full stack without a cloud | 15:04 |
rcarrillocruz | OH | 15:05 |
rcarrillocruz | that's neat! | 15:05 |
rcarrillocruz | i thought static driver was in review | 15:05 |
rcarrillocruz | has that been merged? | 15:05 |
tristanC | rcarrillocruz: not yet but I've added it to the nodepool3.rpm so that it's easy to test | 15:06 |
jeblair | mordred: post-hoc -1 review of https://review.openstack.org/501886 | 15:06 |
tristanC | rcarrillocruz: and we just added an integration test that creates an oci slave to verify zuulv3 can merge a patch | 15:06 |
rcarrillocruz | hmm, what's ansible_user value on the executor | 15:07 |
rcarrillocruz | is it root by default? | 15:07 |
rcarrillocruz | the user it ssh to on nodepool nodes | 15:07 |
tristanC | rcarrillocruz: yes, nodepool needs root access to create slave | 15:08 |
tristanC | rcarrillocruz: you can have a look at this integration test: https://github.com/softwarefactory-project/sf-ci/blob/master/health-check/nodepool3.yaml | 15:08 |
tristanC | rcarrillocruz: which use this nodepool.yaml: https://github.com/softwarefactory-project/sf-ci/blob/master/health-check/templates/nodepoolV3.yaml.j2 | 15:08 |
tristanC | similarly, this zuulv3 verify a change can be merged: https://github.com/softwarefactory-project/sf-ci/blob/master/health-check/zuul3.yaml | 15:09 |
jeblair | pabelanger, clarkb: fyi see comments on https://review.openstack.org/501886 | 15:09 |
tristanC | those tests are run on every software-factory change :) | 15:09 |
pabelanger | morning | 15:23 |
pabelanger | just catching up on backscroll | 15:23 |
pabelanger | jeblair: clarkb: thanks, keep forgetting about that | 15:24 |
mordred | jeblair: GAH. thank you | 15:30 |
tristanC | leaving for denver soon, see you there folks! | 15:35 |
*** hashar is now known as hasharAway | 15:45 | |
rcarrillocruz | are we good to go https://review.openstack.org/#/c/500808/3 | 15:50 |
rcarrillocruz | i need this to consume user from nodepool settings, to log into network appliances images | 15:51 |
rcarrillocruz | mordred: ^, you +2'd previously | 15:58 |
rcarrillocruz | hmm, on the other hand, where in the zuul code do we consume 'username' from nodepool? i see in executor/server.py it creates ansible_user off executor.default_username, but don't see setting the user on hostvars off the nodepool node | 15:59 |
rcarrillocruz | i guess by passing the full provider, as it contains labels, then username | 16:02 |
mordred | tristanC: see you in denver! | 16:07 |
mordred | rcarrillocruz: tobiash has patches for zuul to consume the username field once it's there | 16:08 |
mordred | rcarrillocruz: you might also want to review https://review.openstack.org/#/c/500800/2 and https://review.openstack.org/#/c/453968/3 | 16:08 |
rcarrillocruz | thx for the +1 | 16:08 |
rcarrillocruz | i'm a bit torn exposing passwords on nodepool.yaml | 16:08 |
rcarrillocruz | maybe it should go in something like secure.yaml or the likes | 16:08 |
rcarrillocruz | like | 16:08 |
rcarrillocruz | nodepool.yaml have it in core review public repo | 16:08 |
rcarrillocruz | secrets in private repo or somewhere else | 16:09 |
rcarrillocruz | i refer to https://review.openstack.org/#/c/502011/ | 16:09 |
mordred | rcarrillocruz: I just left that same -1 | 16:10 |
mordred | rcarrillocruz: we have an optional secure.yaml already - I think the docs in this case should talk about using it for that field | 16:10 |
mordred | rcarrillocruz: also - I think we need doc mentions on securing zookeeper | 16:11 |
rcarrillocruz | my mem may fail on me, secure.yaml was for clouds.yaml passwords? | 16:12 |
pabelanger | rcarrillocruz: we used secure.yaml for database connection string | 16:14 |
pabelanger | right now, it is not used | 16:15 |
mordred | rcarrillocruz, pabelanger: both of you are right ... | 16:15 |
pabelanger | but, I think we want to use it for zookeeper auth at some point | 16:15 |
mordred | os-client-config supports a clouds.yaml and a secure.yaml | 16:15 |
mordred | and nodepool support a nodepool.yaml and a secure.yaml | 16:15 |
rcarrillocruz | what i'm getting at: will secure.yaml become a general thing to store nodepool creds | 16:15 |
pabelanger | Oh, TIL about os-client-config | 16:15 |
mordred | (os-client-config supports secure.yaml because nodepool did and it seemed like a nice thing to copy) | 16:15 |
rcarrillocruz | images creds | 16:15 |
rcarrillocruz | connection strings | 16:15 |
rcarrillocruz | etc | 16:15 |
pabelanger | and secure.yaml | 16:16 |
mordred | rcarrillocruz: secure.yaml already is a general thing in which any setting can go | 16:16 |
rcarrillocruz | if so, it sounds like we want to have a mini schema here | 16:16 |
rcarrillocruz | ack | 16:16 |
mordred | rcarrillocruz: it's basically just two files so you can, as an admin, put some in one file with stricter perms as you see fit | 16:16 |
mordred | or not, if your env is just you and you don't care :) | 16:16 |
rcarrillocruz | heh | 16:17 |
mordred | pabelanger: yah - we could split our nodepool clouds.yaml files if we wanted, put secure.yaml files out there with our passwords managed in system-config and put the other bits into project-config | 16:18 |
pabelanger | ++ | 16:18 |
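The split mordred describes might look like the following; the keys are illustrative (the conversation notes secure.yaml once held the database connection string), and secure.yaml is simply a second config file kept under stricter permissions:

```yaml
# nodepool.yaml -- world-readable config, kept in a public repo (illustrative keys)
labels:
  - name: centos-7
    min-ready: 1

# secure.yaml -- mode 0600, kept in a private repo or managed separately
mysql:
  name: nodepool
  user: nodepool
  passwd: not-a-real-password   # illustrative secret
```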
jeblair | tobiash: http://paste.openstack.org/show/620735/ errors related to connection cache | 16:21 |
rcarrillocruz | btw, timer trigger doesn't work on github repos | 16:23 |
rcarrillocruz | http://paste.openstack.org/show/620737/ | 16:23 |
rcarrillocruz | it's supposing events will have commits | 16:23 |
jeblair | that error means that we're not completing the reconfiguration process | 16:23 |
rcarrillocruz | that logic seems to need a move | 16:23 |
rcarrillocruz | will prep a patch | 16:23 |
jeblair | rcarrillocruz: thx | 16:23 |
jeblair | mordred, ianw: it is possible the maintainCache bug may be the cause of the stuck changes in queue. when we reconfigure (because we land a config change), we re-enqueue all the changes in new pipelines, then maintain the connection cache (which fails now and aborts the process). *after* that we re-establish the shared change queues in the pipeline postConfig method. it seems plausible that since that's not happening, a change that was in ... | 16:29 |
jeblair | ... a pipeline across a reconfiguration may not be removed from the correct queue. | 16:29 |
jeblair | i will prepare a revert | 16:29 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Revert "Enable maintainConnectionCache" https://review.openstack.org/502121 | 16:38 |
jeblair | mordred: ^ | 16:40 |
tobiash | jeblair: looks like maintainCache is untested yet in github | 16:48 |
jlk | huh, yeah I don't think we've tested that | 16:48 |
tobiash | +2 for the revert | 16:49 |
jeblair | since it was probably copied, it's worth double checking that the bug in the github version of that function isn't also in the gerrit version. it's possible we ran that on github before we would have run it on gerrit. in other words, that particular code path may not be tested in gerrit as well. | 16:51 |
tobiash | jeblair: yes, I assumed it worked in v2 and wasn't modified much | 16:53 |
mordred | jeblair: +A | 16:53 |
mordred | jeblair: and ah - your explanation of how that could cause the things we saw seems plausible | 16:54 |
pabelanger | mordred: jeblair: is something wrong with zuulv3.o.o? I am seeing what I think is a loop in the debug.log | 17:00 |
*** harlowja has joined #zuul | 17:00 | |
jeblair | i'll restart it | 17:02 |
jeblair | pabelanger: see scrollback and https://review.openstack.org/502121 | 17:02 |
pabelanger | jeblair: ah, thank you | 17:05 |
tobiash | mordred, rcarrillocruz: for production use of storing passwords in zookeeper we for sure need to secure zookeeper with auth and encryption | 17:11 |
tobiash | mordred, rcarrillocruz: both should be supported in zookeeper, but we will have to add tls support to kazoo | 17:12 |
clarkb | this is login credentials? | 17:12 |
clarkb | another option may be to negotiate those out of band like is done with ssh keys? | 17:13 |
tobiash | clarkb: yes, that's login credentials to the nodes | 17:14 |
tobiash | clarkb: then we would need secrets on the executor also for user supplied images and means on the executor to configure different credentials for different labels | 17:15 |
clarkb | tobiash: yes, which is how ssh works isn't it? | 17:17 |
clarkb | (the private key being the secret) | 17:17 |
tobiash | yes | 17:18 |
tobiash | currently there is just a single private key on the executor for all nodes | 17:18 |
tobiash | and accessing windows nodes unfortunately doesn't work with ssh | 17:19 |
clarkb | tobiash: but you can use ssl client cert auth with windows | 17:19 |
clarkb | which is similar ish to ssh keys | 17:19 |
tobiash | clarkb: hm, didn't think about that possibility yet | 17:22 |
tobiash | sounds cool | 17:22 |
tobiash | clarkb: so you suggest to add a winrm_client_cert setting to the executor to avoid password passing? | 17:23 |
clarkb | tobiash: ya, I mean I've never had to actually use it but seems like a good match up to how things work with ssh keys | 17:23 |
clarkb | and that might simplify bootstrapping | 17:24 |
mordred | it's worthwhile checking about how windows guests work on openstack, but also on aws - do you get an administrator password returned in the nova server or ec2 instance data from the API? | 17:24 |
mordred | mostly because if that's the mechanism available for those, then figuring out password passing will still need to happen at some point | 17:25 |
tobiash | mordred: in our v2 env the windows guests work quite well | 17:25 |
tobiash | they just take longer to boot | 17:25 |
tobiash | around 2min instead of 50s | 17:25 |
tobiash | but we preconfigured the login stuff in the image | 17:26 |
mordred | tobiash: how does auth work with that? also - someone was asking about image building too | 17:26 |
jeblair | 2017-09-08 17:26:19.542490 | ubuntu-xenial | 2017-09-08 17:26:19.542 | stack.sh completed in 944 seconds. | 17:27 |
jeblair | clarkb, mordred: ^ | 17:27 |
jeblair | http://logs.openstack.org/02/500202/25/check/devstack/dbd6e11/ | 17:27 |
tobiash | mordred: we built that image by hand and added cygwin ssh service with public key authentication such that nodepool and jenkins are happy | 17:27 |
tobiash | but windows was a side use case before and now we need to professionalize this with automated image building and so on | 17:28 |
clarkb | jeblair: nice | 17:29 |
tobiash | ok, client cert auth should be possible (although undocumented) with ansible: https://github.com/ansible/ansible/issues/16243 | 17:34 |
mordred | jeblair: WOOT | 17:38 |
mordred | tobiash: cool | 17:39 |
pabelanger | 502121 looks stuck in gate :( | 17:39 |
clarkb | does make me wonder why windows doesn't do ssh key auth for powershell. Its all bsd licensed too isn't it? | 17:39 |
jeblair | mordred, Shrews: do you know if anyone is working on the websocket streaming test failures? | 17:40 |
jeblair | pabelanger: i'll force-merge it and restart i guess | 17:40 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Revert "Enable maintainConnectionCache" https://review.openstack.org/502121 | 17:41 |
pabelanger | jeblair: ty | 17:41 |
tobiash | clarkb: https://blogs.msdn.microsoft.com/powershell/2015/10/19/openssh-for-windows-update/ | 17:43 |
tobiash | But still ansible would have to support win modules via ssh | 17:44 |
jeblair | clarkb: do you want to take a look at 496959, 496301, 500202 ? | 17:45 |
jeblair | mordred: and you the last 2 of those? ^ | 17:45 |
jeblair | i think that's a good checkpoint; we can land those and iterate from there (i think we still need to add the disk partitioning role to that, and then multinode, and tempest... :) | 17:46 |
clarkb | mordred: jeblair for 959 homedir is moving from /opt/stack/new to /opt/stack ? | 17:49 |
*** electrofelix has quit IRC | 17:50 | |
jeblair | clarkb: yep; since we're starting from scratch here, i'm trying to be as close to devstack default as possible (i'd like to actually have devstack do that -- it is capable of doing so, but there's some chicken/egg stuff with git repos we'd need to work out first) | 17:50 |
clarkb | and tempest homedir is default path which isn't a change I don't think | 17:51 |
clarkb | ? | 17:51 |
jeblair | clarkb: i believe that's correct | 17:52 |
mordred | jeblair: +2 from me all around | 17:53 |
clarkb | ya confirmed tempest isn't moving just continues to use default | 17:54 |
mordred | jeblair, Shrews: I am not working on websocket streaming test failures | 17:54 |
clarkb | jeblair: and the localrc sort method is not alnum or similar; it instead writes out vars before they are referenced? | 17:57 |
clarkb | ya ok thats what vargraph is | 17:59 |
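The ordering clarkb is describing (write each variable's definition before any line that references it) is essentially a topological sort over the variable-reference graph. A minimal sketch of that idea, with illustrative names only, not the actual vargraph code:

```python
import re

def order_vars(assignments):
    """Order shell-style VAR=value assignments so that every variable
    is emitted before any assignment that references it via $VAR or
    ${VAR}.  `assignments` maps name -> value string.  Illustrative
    only; the real localrc generator may differ.
    """
    ref = re.compile(r'\$\{?(\w+)\}?')
    emitted, order = set(), []

    def emit(name, seen=()):
        if name in emitted:
            return
        if name in seen:
            raise ValueError('circular reference via %s' % name)
        # Emit every referenced variable first.
        for dep in ref.findall(assignments[name]):
            if dep in assignments:
                emit(dep, seen + (name,))
        emitted.add(name)
        order.append('%s=%s' % (name, assignments[name]))

    for name in sorted(assignments):
        emit(name)
    return order
```

For example, `order_vars({'A': '1', 'B': '$A/2'})` emits the `A=` line before the `B=` line even though both start out alphabetized.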
mordred | jeblair: I just rechecked https://review.openstack.org/#/c/500365 to see how it goes | 18:05 |
pabelanger | mordred: jeblair: puppet has installed 502121 on zuulv3.o.o, safe to restart scheduler? | 18:05 |
mordred | pabelanger: fine by me - I just rechecked a job, so you may want to do graceful save/restore - or else I can just recheck once you're done | 18:06 |
pabelanger | I am not sure, I think zuul is stuck in a loop? cc jeblair | 18:06 |
mordred | oh- then absolutely safe to restart | 18:06 |
pabelanger | k | 18:06 |
pabelanger | okay, restarted | 18:07 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add migration tool for v2 to v3 conversion https://review.openstack.org/491805 | 18:09 |
mordred | jeblair, clarkb, pabelanger: ^^ that's ready for people to take stabs at | 18:09 |
mordred | I added a script "tools/run-migration.sh" that you can use to run it locally if project-config is adjacent to zuul in your local git structure | 18:10 |
mordred | it expects the project-config repo to have the zuul/mapping.yaml file in it though | 18:10 |
mordred | I also added a check job with that patch that runs the script on project-config and collects the output | 18:10 |
mordred | and I put in some notes at the top of the file on things that need to be implemented - I think we can divvy those up | 18:12 |
mordred | also, while there are some classes in this tool, it's also vintage mordred code, which means it's not always doing things in the sanest way | 18:12 |
clarkb | jeblair: I am +2 on those changes too. Not approving because have plumber here again and distracted | 18:15 |
mordred | jeblair: errors like this when running a job: | 18:18 |
pabelanger | oh noes | 18:18 |
mordred | "shade-ansible-devel-functional-devstack shade-ansible-devel-functional-devstack : ERROR Unable to find playbook /var/lib/zuul/builds/f244873046ed46a8abcdbb7a036008ea/work/src/git.openstack.org/openstack-infra/shade/playbooks/devstack/pre-run" | 18:18 |
pabelanger | 2017-09-08 18:17:00,357 DEBUG zuul.AnsibleJob: [build: 924886064b4f45d28785c6851c60bc27] Ansible output: b'ERROR! A worker was found in a dead state' | 18:18 |
pabelanger | that was on ze01 | 18:18 |
mordred | pabelanger: ugh | 18:19 |
pabelanger | we seem to still have our PPA version installed | 18:20 |
mordred | jeblair: how hard would it be to catch those in the job parser / validator? I mean, I guess it would involve the parser making additional cat job calls to get the playbook content | 18:20 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add migration tool for v2 to v3 conversion https://review.openstack.org/491805 | 18:20 |
mordred | pabelanger: also - could you review https://review.openstack.org/#/c/500320/ for me please? | 18:21 |
mordred | clarkb: or you - you +2'd it before, but then I squashed it with the previous patch | 18:22 |
pabelanger | looking | 18:22 |
mordred | pabelanger: and https://review.openstack.org/#/c/501001/ to go along with it | 18:23 |
jeblair | mordred: yeah, the problem with catching that at validation is that we don't know where to look. we'd have to either ask the merger for a list of *all* files, or have two round trips to the merger. | 18:24 |
mordred | jeblair: nod | 18:25 |
Shrews | jeblair: mordred: i was not aware of websocket test failures. example? | 18:26 |
mordred | Shrews: it's a sporadic failure | 18:26 |
mordred | Shrews: one sec- lemme find one | 18:27 |
jeblair | mordred, pabelanger, clarkb, jlk, tobiash, dmsimard, Shrews, anyone else: how about we take a look at mordred's migration code, and maybe convene around 20:00 utc and work on a plan for next steps / ptg? does that time work for folks? alternate suggestions? | 18:27 |
Shrews | mordred: didn't you submit a ipv6 patch for something yesterday? | 18:27 |
mordred | Shrews: http://logs.openstack.org/92/500592/6/gate/tox-py35-on-zuul/679b5c4/testr_results.html.gz | 18:27 |
mordred | Shrews: I did, and it has landed - and it still doesn't work :( | 18:27 |
mordred | jeblair: ++ I'll be there | 18:27 |
mordred | Shrews: test_websocket_streaming is the only failure in that log that's relevant | 18:28 |
*** hasharAway has quit IRC | 18:28 | |
*** hashar has joined #zuul | 18:29 | |
Shrews | mordred: any idea when this started happening? | 18:29 |
mordred | Shrews: yah - when we added IPv6 enabled test nodes into the v3 nodepool | 18:30 |
mordred | Shrews: best I can tell it only happens when we run those unittests on one of them | 18:30 |
pabelanger | jeblair: wfm | 18:30 |
dmsimard | jeblair: where is the migration code ? | 18:30 |
clarkb | jeblair: yes that should work for me, though I doubt I will be able to look at the code much. | 18:31 |
clarkb | good news is the saga of the plumbing and washing machine is almost over \o/ | 18:31 |
Shrews | mordred: hrm, seems more zuul_stream related (possibly). no logging output whatsoever | 18:31 |
Shrews | fun | 18:31 |
jlk | jeblair: sounds right | 18:33 |
Shrews | or finger server... hrm | 18:33 |
jeblair | dmsimard: mordred just pushed it up -- 491805 | 18:35 |
Shrews | i'm betting socketserver may not be ipv6 friendly | 18:36 |
Shrews | address_family = socket.AF_INET | 18:38 |
mordred | Shrews: gah. I thought I got all of those | 18:38 |
Shrews | likely culprit... seeing how to override | 18:38 |
Shrews | mordred: this is in socketserver itself | 18:38 |
mordred | Shrews: oh - that's in socketserver itself? | 18:38 |
mordred | nod | 18:38 |
Shrews | mordred: specifically, https://github.com/python/cpython/blob/3.4/Lib/socketserver.py#L415 | 18:40 |
tobiash | jeblair: I'll be around | 18:40 |
Shrews | mordred: so, we can override that value... but AF_INETV6 would not support ipv4, right? not sure how to tell which we should use | 18:41 |
mordred | Shrews: AF_INET6 supports both | 18:42 |
mordred | Shrews: you can grep in zuul source for a few places we use it | 18:43 |
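The fix being discussed, overriding `address_family` on the server class, can be sketched as below; on dual-stack hosts an `AF_INET6` socket bound to `::` also accepts IPv4 connections (provided `IPV6_V6ONLY` is off, which is the common Linux default). Class names here are illustrative, not the actual finger-streamer code:

```python
import socket
import socketserver

class DualStackServer(socketserver.ThreadingTCPServer):
    # socketserver defaults to AF_INET (IPv4 only); switching to
    # AF_INET6 and binding to '::' accepts both v4 and v6 clients
    # on dual-stack hosts.
    address_family = socket.AF_INET6
    allow_reuse_address = True

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Echo one chunk back to the client, just to have a handler.
        self.request.sendall(self.request.recv(1024))
```

Usage would be along the lines of `DualStackServer(('::', 0), EchoHandler)`, where port 0 picks a free port.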
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Also support IPv6 in the finger log streamer https://review.openstack.org/502137 | 18:46 |
Shrews | let's see what that gets us | 18:46 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Also support IPv6 in the finger log streamer https://review.openstack.org/502137 | 18:57 |
dmsimard | jeblair, mordred: added two small comments to mordred's patch, it's hard for me to review without seeing the end result -- I'll test with the provided bash script and report back if I see anything weird | 18:59 |
dmsimard | didn't see anything shocking | 18:59 |
mordred | dmsimard: sweet. I've got an update/followp coming in just a sec - so I'll go look at your comments before I push it up | 19:00 |
Shrews | mordred: where is your zuul/mapping.yaml file referenced in 491805? | 19:02 |
dmsimard | Shrews: I think that's where the mapping ends up being generated ? /me looks again | 19:03 |
dmsimard | ah, nope, the file is expected to be there indeed | 19:06 |
dmsimard | Shrews: https://review.openstack.org/#/c/491804/5/zuul/mapping.yaml | 19:06 |
tobiash | mordred: I didn't find the generated playbooks in the build result | 19:07 |
dmsimard | tobiash: I see the new layout in zuul.d/99converted.yaml but I don't see the actual jobs | 19:10 |
tobiash | dmsimard: yeah, also just noticed that | 19:10 |
dmsimard | would jobs be "compiled" at runtime and this is just a migration of the layout ? | 19:10 |
tobiash | how is it supposed to be distributed over the repos at the first try? | 19:11 |
dmsimard | tobiash: it's not afaik | 19:11 |
tobiash | so all will be in project-config at first? | 19:11 |
dmsimard | tobiash: it will live in project-config until projects start migrating over ? | 19:11 |
mordred | tobiash: we are not generating the playbooks yet | 19:16 |
mordred | that's one of the next things needed | 19:16 |
mordred | (we have a bunch of the code for that already in 2.5 that we can copy-pasta in) | 19:16 |
dmsimard | mordred: I wonder if we should prefix or suffix jobs that have been automatically migrated | 19:16 |
mordred | dmsimard: aha - but we do! :) | 19:17 |
mordred | dmsimard: legacy-{name}-ansible-func-centos-7 ... for instance | 19:17 |
dmsimard | mordred: yeah I saw legacy but it's not everywhere so I thought it was something else | 19:17 |
mordred | dmsimard: we have a bunch of places where we have already defined new jobs for tolks | 19:18 |
mordred | so people who were using gate-nova-python27 before just get "tox-py27" | 19:18 |
dmsimard | mordred: so you ended up defining nodesets after the node names just like we discussed ? | 19:18 |
mordred | but if we don't have a nice new job to migrate you to (that's what the mapping file is for) | 19:18 |
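The mapping behaviour mordred describes (rename to a known-good new job when one exists, otherwise fall back to a `legacy-` copy) could look roughly like this; the mapping format and helper name are guesses for illustration, not the actual migration-script code:

```python
def map_job_name(v2_name, mapping):
    """Map a v2 job name to its v3 equivalent.

    `mapping` is a dict of known v2 -> v3 renames (e.g. loaded from
    something like zuul/mapping.yaml); anything unmapped becomes a
    'legacy-' job, with the old 'gate-' prefix dropped.
    """
    if v2_name in mapping:
        return mapping[v2_name]
    name = v2_name
    if name.startswith('gate-'):
        name = name[len('gate-'):]
    return 'legacy-' + name
```

With `{'gate-nova-python27': 'tox-py27'}` as the mapping, a mapped job gets its shiny new name and everything else becomes `legacy-...`.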
mordred | dmsimard: oh - yah - need to write that patch, but yes, that idea makes this quite nice | 19:19 |
tobiash | mordred: looks like the jobs used in 99converted are also sill missing? | 19:19 |
mordred | tobiash: yes indeed | 19:19 |
dmsimard | mordred: yeah I don't see the nodesets but I see the intent of using that | 19:19 |
mordred | tobiash: the job+playbook content will come next | 19:19 |
dmsimard | I think jeblair was against the idea though | 19:19 |
tobiash | ah, ok | 19:19 |
dmsimard | or maybe it was about mixing label and name without requiring the other | 19:19 |
dmsimard | I forget | 19:19 |
mordred | dmsimard: that's a different thing- but the thing you just mentoined, defining some nodesets that match each of our labels - and also the old v2 multi-node labels | 19:20 |
mordred | makes for a nice transition | 19:20 |
mordred | I'll get that patch up in just a sec | 19:20 |
dmsimard | wfm | 19:20 |
mordred | WOOT | 19:24 |
mordred | " | 19:24 |
mordred | Accepted python3.5 into zesty-proposed. The package will build now and | 19:24 |
mordred | be available at | 19:24 |
mordred | https://launchpad.net/ubuntu/+source/python3.5/3.5.3-1ubuntu0~17.04.0 in | 19:24 |
mordred | a few hours, and then in the -proposed repository. | 19:24 |
mordred | clarkb, jeblair, pabelanger, Shrews, SpamapS: ^^ | 19:24 |
jeblair | yay! | 19:28 |
jeblair | dmsimard, mordred: i am all for doing the convenience nodeset definitions (they should be in project-config) | 19:29 |
jeblair | mordred: what things require ordereddict? (i'm concerned about the stuff going into zuul/lib/yamlutil) | 19:37 |
mordred | jeblair: if you don't use ordereddict the resulting file is illegible | 19:38 |
mordred | jeblair: happy to just copy-pasta all of that into migrate.py though | 19:38 |
jeblair | mordred: well, i was mostly looking at ordered_load, which would all be on the input side. what makes it all the way through the system ordered? | 19:39 |
mordred | jeblair: by illegible, I mean the source data has been maintained in alphabetical order, so scanning through and comparing source data to produced data with produced data being in an arbitrary order is hard to process | 17:39 |
pabelanger | mordred: cool | 19:40 |
jeblair | mordred: oh, you mean: "projects: [nova]" | 19:40 |
mordred | yah. - and also we have a generally accepted practice of having pipeline definitions look like project: name: foo template: - a - b check: ... etc | 17:41 |
jeblair | mordred: we could maybe drop that and re-alphabetize it (for project_name in sorted(self.layout['projects'].keys())) | 19:41 |
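The `ordered_load` being debated is typically the classic PyYAML-plus-OrderedDict recipe; a sketch of that pattern (zuul's actual `yamlutil` may differ, and on modern Pythons plain dicts already preserve insertion order, which weakens the case for keeping it in the library):

```python
import collections
import yaml  # PyYAML (third-party)

def ordered_load(stream):
    """Load YAML preserving mapping key order, so a round-tripped
    layout stays in the same order as the hand-maintained source."""
    class OrderedLoader(yaml.SafeLoader):
        pass

    def construct_mapping(loader, node):
        loader.flatten_mapping(node)
        return collections.OrderedDict(loader.construct_pairs(node))

    OrderedLoader.add_constructor(
        yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
        construct_mapping)
    return yaml.load(stream, OrderedLoader)
```

So `ordered_load('b: 1\na: 2\n')` yields keys in document order (`b`, then `a`) rather than whatever order a plain loader happens to produce.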
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Support IPv6 in the finger log streamer https://review.openstack.org/502137 | 19:42 |
pabelanger | mordred: jeblair: Shrews: SpamapS: so, before we dive into anything: we got another dead worker on ze01, let's upgrade to the -proposed version of python and see if we can reproduce | 19:42 |
jeblair | mordred: okay. i'm going to leave some comments about moving it into migrate and why. | 19:42 |
mordred | jeblair: kk | 19:42 |
mordred | jeblair: I'll do that in my next iteration | 17:42 |
jeblair | mordred: okay, left comment; that's all i have after a quick look, will play with it now | 19:45 |
mordred | jeblair: cool. also, please ignore the fact that the the output is missing a 'jobs' ... fixing that right now :) | 19:46 |
jeblair | pabelanger: probably a good plan | 19:46 |
tobiash | Shrews: have thoughts on 502137 | 19:51 |
Shrews | tobiash: b/c 'localhost' did not work for me | 19:52 |
tobiash | Shrews: ok, that's an argument | 19:52 |
Shrews | dammit. now some sort of race.... "Streamed: Build ID c8ace3ccb68d44198522a428ac1440b3 not found" | 19:57 |
dmsimard | Why are there nova patches in the v3 check queue ? | 20:02 |
dmsimard | Also, it seems like there's a glitch in the UI where you can't expand the box to see queued jobs if none of the jobs have started yet ? | 20:03 |
clarkb | o/ here for 2000UTC convening | 20:05 |
jlk | o/ same | 20:05 |
jeblair | pabelanger, Shrews, tobiash, mordred, dmsimard: ping ^ | 20:05 |
dmsimard | pong | 20:05 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add migration tool for v2 to v3 conversion https://review.openstack.org/491805 | 20:06 |
mordred | jeblair: heya | 20:07 |
jeblair | dmsimard: yeah, that's the thing that mordred wanted to fix with a ui patch (but it ended up making it harder for us to diagnose issues). we should still think about ways to improve it, but probably later at this point. in short, every change that *might* be enqueued *is* enqueued at least long enough for zuul to determine whether it should be. that's when it shows up there, and it may stay there with no jobs until the merger comes back with ... | 20:07 |
jeblair | ... information on which jobs it should run. | 20:07 |
tobiash | o/ | 20:07 |
dmsimard | jeblair: ah somehow I thought it was a merger queue thing so I wasn't entirely wrong | 20:07 |
dmsimard | ok, let's look at it later :) | 20:08 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add UPPER_CONSTRAINTS_FILE file if it exists https://review.openstack.org/500320 | 20:08 |
jeblair | how about we start real quick by going over the etherpad from earlier: https://etherpad.openstack.org/p/zuulv3-pre-ptg | 20:08 |
jeblair | we're really close on devstack jobs. i think there's a little more work to go into the v3 native job, and then we'll be ready to point people at it to start building new simple devstack jobs. | 20:09 |
jeblair | it's probably close enough at this point not to consider it a migration blocker. | 20:10 |
jeblair | that sound right to folks? | 20:10 |
mordred | agree | 20:10 |
clarkb | is the devstack job running tempest? | 20:10 |
mordred | we can emit jobs for devstack-legacy for the most of them | 20:10 |
clarkb | (I want to say the log I saaw did run tempest) | 20:11 |
jeblair | clarkb: not yet | 20:11 |
jeblair | clarkb: not the v3 native one (devstack-legacy, yes) | 20:11 |
clarkb | ah ok | 20:11 |
dmsimard | jeblair: devstack doesn't worry me too much, it's the non-devstack things that do | 20:12 |
tobiash | was large scale dynamic layout reconfiguration tested already? | 20:12 |
dmsimard | jeblair: especially deployment projects like puppet-openstack, openstack-ansible, kolla, tripleo.. | 20:12 |
jeblair | tobiash: no, only static configuration loading | 20:12 |
jeblair | tobiash: and that with a null configuration | 20:14 |
jeblair | okay, next on the list -- jobs with special slaves: | 20:14 |
jeblair | https://etherpad.openstack.org/p/zuulv3-special-jobs | 20:14 |
jeblair | mordred, pabelanger: what's left from that list? | 20:15 |
mordred | jeblair: there's 4 release jobs | 20:15 |
mordred | jeblair: npm-upload, jenkinsci-upload, mavencentral-upload and forge-upload | 20:16 |
jlk | oh crap, I need to finish up the translation-update | 20:16 |
jlk | upstream-translation-update | 20:16 |
jlk | and while I'm there, I might be able to do propose-translation-update{suffix} | 20:16 |
mordred | ++ those are the others | 20:17 |
jeblair | what's forge-upload? | 20:17 |
mordred | jeblair: the stack in the middle just need to be handled by the migration script, I'll add those to the mapping file in the next patch | 20:17 |
jeblair | puppetforge | 20:18 |
jeblair | that could probably be more clear in the future :) | 20:18 |
jeblair | (hardly the first thing called forge) | 20:19 |
mordred | right? | 20:19 |
jeblair | jenkinsci and mavencentral are probably not critical to cutover. npm, forge, and translations probably should be considered blockers? | 20:19 |
clarkb | even puppetforge is probably not critical | 20:19 |
pabelanger | jeblair: I'm just retesting publishing to afs, and tagging a release, then good to mark them as done | 20:19 |
clarkb | I don't think we consume any of our puppet modules via puppet forge | 20:19 |
clarkb | so question would be if puppet-openstack does | 20:20 |
jlk | https://review.openstack.org/#/c/499845/ needs to be rebased by me, but could use reviews after. | 20:20 |
pabelanger | jeblair: so, ready to work on next task | 20:20 |
jlk | it's for propose-project-config-update | 20:20 |
jeblair | clarkb: yeah, i was thinking of p-o | 20:20 |
dmsimard | jeblair, clarkb: puppet-openstack used to publish to the forge | 20:20 |
mordred | forge-upload is used by two projects | 20:20 |
dmsimard | but that was back in the day of hodgepoge PTL | 20:20 |
mordred | puppet-httpd-forge-upload and puppet-storyboard-forge-upload | 20:20 |
dmsimard | I don't think we publish anymore | 20:20 |
clarkb | mordred: if it is just those two then I don't think it is criticial to treat that job as a blocker | 20:21 |
jeblair | oh, then i agree we can lower priority of that | 20:21 |
jeblair | what about npm-upload? | 20:21 |
mordred | it's used by openstack/tripleo-ui | 20:21 |
mordred | and that's it | 20:21 |
jeblair | let's call that a blocker? | 20:22 |
mordred | yah - it shouldn't be too hard to translate | 20:22 |
jeblair | okay, so i put npm and translation-upload on the blocker list, and the rest on a new list of things to do right after cutover | 20:23 |
mordred | it runs a script from slave_scripts, so the pattern used in the other proposal jobs can be copied easily | 20:23 |
jeblair | okay, moving down the list (but saving migration script for last): migration docs | 20:24 |
jeblair | i'm happy with where they are at the moment, but i'm sure we will want to update them with info about the migration script once we have it | 20:24 |
jeblair | so let's call this done | 20:24 |
jeblair | the wget breakage is fixed | 20:25 |
jeblair | " docs jobs incorrecly publishing" | 20:25 |
jeblair | pabelanger: i think you just fixed that? | 20:25 |
mordred | yah | 20:25 |
mordred | jeblair: for migration docs - perhaps we should keep an etherpad going at the PTG to jot down notes as folks ask us questions | 20:26 |
jeblair | zuul-cloner shim -- Shrews that's in the base job now, right? | 20:26 |
Shrews | yep | 20:26 |
jeblair | mordred: ++ | 20:26 |
jeblair | configure-mirror parity | 20:26 |
jeblair | there are reviews there; sounds like we're almost done | 20:26 |
jeblair | dmsimard: when those 3 are landed, are we good? | 20:27 |
jlk | mordred: good idea. | 20:27 |
jeblair | (note, i think as part of this that populating /etc/ci is being separated from configure mirror role, which makes a lot of sense) | 20:27 |
dmsimard | jeblair: that should provide backwards compat for the /etc/ci/mirror_info.sh -- would need to compare it or, better, use it in a job that consumes it in order to tell if it's good | 20:28 |
clarkb | dmsimard: I believe the dib cross build jobs make extensive use of it | 20:29 |
jeblair | dmsimard: ack. devstack-legacy uses one line from it | 20:29 |
dmsimard | at first glance the reviews seem okay but I haven't had the chance to review them thoroughly yet | 20:29 |
dmsimard | jeblair: I believe tripleo uses that file too, would need to double check | 20:29 |
jeblair | ok. so still a blocker but likely we can wrap it up today | 20:30 |
jeblair | new servers | 20:30 |
jeblair | pabelanger: ^ what's the status there? | 20:30 |
dmsimard | are we keeping the same set of zuul mergers ? is there anything we are taking the opportunity to reinstall/install in xenial ? | 20:30 |
clarkb | dmsimard: I think we need to update to xenial because of python3.5 | 20:31 |
pabelanger | jeblair: I'd like to stand up nb01/nb02 shortly | 20:31 |
clarkb | the type annotations won't work as is in < 3.5 | 20:31 |
dmsimard | so we have 8 new mergers to stand up ? | 20:31 |
pabelanger | jeblair: still need to work on zuul-executor / zuul-mergers | 20:31 |
jeblair | okay; we don't strictly need to do nb01/nb02 | 20:32 |
clarkb | for the mergers I expect that we can rollback to v2 on xenial if we have to and just replace the existing set with xenial nodes | 20:32 |
jeblair | should we consider deferring that to save time+quota? | 20:32 |
jeblair | clarkb: that sounds reasonable | 20:32 |
*** hashar has quit IRC | 20:32 | |
pabelanger | clarkb: oh, so just upgrade not new servers | 20:33 |
pabelanger | I do think we need to do some puppet work for puppet-zuul on mergers | 20:33 |
clarkb | they are largely stateless (if you wave your hands around v2 needing to fetch from them) so easy enough to make rollback be redeploy on trusty as well | 20:34 |
jlk | I assume once in production we'll have a better idea of how to balance executors vs pure mergers | 20:35 |
jeblair | jlk: ++ | 20:36 |
jeblair | clarkb: what would you recommend we do? | 20:36 |
clarkb | jeblair: re mergers? maybe split the current 8 in two and have 4 trusty for easy rollback and 4 xenial for v3 | 20:37 |
jeblair | that works for me. we can also disable the merge-check pipeline temporarily to reduce load on them. | 20:37 |
jeblair | pabelanger: sound good ^? | 20:37 |
jeblair | if not, you can work it out later :) | 20:39 |
jeblair | last thing i just added to the list and can't believe i forgot -- logstash emitter | 20:39 |
jeblair | i started some prep for this a couple weeks ago | 20:39 |
jeblair | my plan is to have a trusted post-playbook in base that submits a background job to the logstash gearman queue | 20:40 |
dmsimard | emitter reminds me of the stuff like firehose/openstack-health, we don't need to do anything special for those ? | 20:40 |
jeblair | what it emits will be compatible with what we currently emit | 20:40 |
jeblair | basically, it'll be a copy-pasta of the jenkins-log-client code into a post-playbook, skipping the zmq step | 20:41 |
mordred | ++ | 20:41 |
jeblair | dmsimard: largely to assist with this, zuul emits no firehose yet | 20:41 |
pabelanger | jeblair: clarkb: sure, wfm | 20:41 |
dmsimard | ok. | 20:42 |
jeblair | regarding openstack-health, i think the subunit processor is hooked up to the logstash system | 20:42 |
clarkb | ya its the same sort of job submission with different parameters | 20:42 |
clarkb | should be straightforward to solve both of those together | 20:42 |
jeblair | ok, so probably our post playbook will submit 2 jobs | 20:43 |
mordred | jeblair: and the log-gearman-client.py, once zmq is done, only depends on gear, which is installed on the executor so should be piece of cake | 20:43 |
jeblair | ya | 20:43 |
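The jenkins-log-client pattern being copied boils down to submitting a background gearman job whose payload tells the logstash workers which build files to fetch and index. A stdlib-only sketch of building such a payload; the job name, field names, and `gear` usage in the comment are illustrative, reconstructed from the description above rather than copied from the real client:

```python
import json

def build_log_push_payload(build_uuid, log_url, source_files):
    """Construct the arguments for a background gearman job asking
    the logstash workers to fetch and index a build's log files.
    With the `gear` library this would be submitted roughly as:
        client.submitJob(gear.Job(b'push-log', payload),
                         background=True)
    so the post-playbook doesn't block on indexing.
    """
    return json.dumps({
        'build_uuid': build_uuid,
        'log_url': log_url,            # base URL the workers fetch from
        'source_files': source_files,  # e.g. ['job-output.txt']
        'retry': False,
    }).encode('utf-8')
```

A second, similarly shaped job with different parameters would cover the subunit/openstack-health submission mentioned above.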
jeblair | i think this will not take long to write; i am unsure whether i can also get it safely merged into the base job today. | 20:44 |
jeblair | i'll try to get as far as i can, but of the things so far, this seems most likely not to happen by EOD today | 20:45 |
jeblair | okay, last thing -- migration script | 20:45 |
jeblair | it runs! | 20:45 |
jeblair | and outputs 19.5 kilolines of configuration! | 20:45 |
jlk | egads | 20:45 |
jeblair | which doesn't include the job definitions yet :) | 20:45 |
clarkb | what is it if not job definitions? | 20:46 |
jeblair | (that's actually 600 more than the current layout.yaml, interestingly enough) | 20:46 |
clarkb | oh right layout | 20:46 |
jeblair | clarkb: project definitions (ie, what's in current layout.yaml) | 20:46 |
jlk | current makes more use of templates, no? | 20:46 |
jeblair | clarkb: but the "jobs" section with all the crazy regex stuff is gone, folded into the project-pipeline definitions | 20:47 |
jeblair | so it is slightly longer and *much* more readable | 20:47 |
jeblair | - legacy-grenade-dsvm-redis-zaqar: | 20:47 |
jeblair | voting: false | 20:47 |
jeblair | for instance ... can you guess whether that job is voting? :) | 20:47 |
jeblair | in v2 the answer is, no, you can not guess. | 20:47 |
mordred | \o/ | 20:48 |
jeblair | anyway | 20:48 |
jeblair | mordred: do you have incomplete code to do job configuration yet, or are we at the start of that? | 20:48 |
mordred | jeblair: the existing 2.5 code | 20:49 |
mordred | jeblair: so - no, I don't have it pulled out or stitched in to the migration script yet - but that was going to be the starting place | 20:50 |
jeblair | mordred: right, so we have that to do, as well as formulating the 'job:' configuration stanza | 20:50 |
jeblair | and somewhere in there we need the role to do ZUUL_ variable compatability | 20:50 |
mordred | yup | 20:51 |
jeblair | the change listed some other todos | 20:51 |
mordred | there is a todo at the top of the migration script in the comments, fwiw | 20:51 |
mordred | yah | 20:51 |
jeblair | shared job queues | 20:51 |
jeblair | also, filters from the jobs section | 20:51 |
mordred | yah - and continuing to add various mapping information to mapping.yaml | 20:51 |
jeblair | mordred: oh, so i guess the current voting: settings just came from '-nv' ? | 20:51 |
mordred | jeblair: yes. that is right | 20:52 |
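The `-nv` convention mordred confirms can be expressed as a tiny helper: in v2 a job's voting status was buried in jobs-section regexes, while the migration emits an explicit `voting: false` for anything named with the non-voting suffix. A simplified sketch, not the script's actual logic:

```python
def v2_job_is_voting(job_name):
    """In the v2 layout, jobs whose names end in '-nv' are
    non-voting by convention; everything else votes unless a
    jobs-section regex said otherwise (not modelled here).
    """
    return not job_name.endswith('-nv')
```

This is exactly why jeblair's example reads better in v3: the converted output carries `voting: false` on the project-pipeline entry instead of making the reader guess from the name and the regex soup.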
*** olaph has quit IRC | 20:53 | |
jeblair | mordred: this is great progress and i can see it coming together, but i feel like we probably have a couple of days work yet until we get to what we'd consider final output stage. would you concur? | 20:53 |
*** olaph has joined #zuul | 20:53 | |
mordred | jeblair: maybe? the migration/mapping stuff actually goes fairly quickly - but it's also 4pm on a friday here, so it's tough to say | 20:54 |
pabelanger | mordred: Hmm, release jobs are broken | 20:55 |
jeblair | based on the other stuff i still need to do, i probably personally can't pick up a migration script task until the plane flight on sunday | 20:55 |
pabelanger | mordred: I think mirror-workspace-git-repos is the issue | 20:55 |
pabelanger | as it doesn't checkout tag? | 20:55 |
jeblair | pabelanger: can we hold that for a minute? | 20:56 |
pabelanger | jeblair: sure | 20:57 |
jeblair | so here's a strawman -- personally, i don't think we're going to be in a position to cut over monday morning; i think it's more likely that we'll finish up the migration script and continue to flesh things out early in the week and we can perhaps attempt cutovers mid or late next week. | 20:58 |
mordred | jeblair: yah- that's kind of what it feels like to me too - we're able to make extreme progress when we're all heads-down on it, so knocking out the final things while we're in the room will likely work quite well | 20:59 |
dmsimard | FWIW I've put name on some risks at the bottom of the pad | 21:00 |
jlk | that's how it sounds to me as well | 21:00 |
dmsimard | And I also believe that doing this *tomorrow evening* is risky, everyone will be travelling | 21:00 |
jlk | I'd rather see us work hard on getting things right, rather than doing the cut over badly and firefighting frantically for the few days after | 21:00 |
mordred | ++ | 21:00 |
jeblair | dmsimard: ya thanks that helps a lot | 21:00 |
dmsimard | jeblair: I'm trying to de-risk things that are uncharted territory right now | 21:01 |
dmsimard | but my time is short, won't have time to hack on things tonight or tomorrow as I prep the family for the time I won't be around | 21:01 |
clarkb | ya and juggling knowing who is around when is hard when its all travel time | 21:02 |
mordred | at the ptg we'll also be in a decent position to do some targeted trial runs of some of the unknowns with folks - dmsimard mentioned non-devstack integration tests as an example | 21:02 |
pabelanger | I'll be working late today | 21:02 |
pabelanger | but travel in the morning to PTG, but around all after Saturday / Sunday | 21:02 |
dmsimard | what mordred said, I would very much like to de-risk the migration by opting in higher risk projects first gradually, somehow | 21:02 |
jeblair | dmsimard: we have to do the cutover all at once, but we *can* start running check jobs on any project we want | 21:03 |
mordred | we're pretty sure that our base jobs and ZUUL_ vars role should put things in the right state for those, but won't know until we try one | 21:03 |
dmsimard | jeblair: yes, sure, I'd basically like to try and run real "migrated" v3 jobs on projects ahead of the cutover | 21:03 |
mordred | luckily we can generate some, put a few into a .zuul.yaml and run a check job on them | 21:04 |
mordred | but for many of these projects we don't know how to assess if they are behaving properly, or if they aren't, what isn't happening that should | 21:04 |
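A dry run of the sort mordred describes would mean dropping a generated job into a project's `.zuul.yaml` check pipeline. A minimal hypothetical sketch — the job name is a placeholder, not an actual migrated job:

```yaml
# Hypothetical in-repo .zuul.yaml: run one migrated legacy job in check
# ahead of the cutover, without touching the central config.
- project:
    check:
      jobs:
        - legacy-example-dsvm-job
```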
clarkb | maybe instead of attempting first cutover on saturday we use daytime sunday to prep and get check jobs running in more places? | 21:05 |
dmsimard | mordred: I have experience and/or contacts in all the deployment projects to help | 21:05 |
clarkb | I think a lot of us arrive on saturday evening ish time so potentially sunday is good sit down and crank things out time | 21:05 |
mordred | clarkb: agree. fungi and I will be in tc/board meeting, but I usually laptop-hack in those anyway :) | 21:06 |
dmsimard | Yeah I'll be there sunday | 21:06 |
dmsimard | tristanC too | 21:06 |
jeblair | clarkb: i'm not expecting us to be in a position to even do trial cutovers saturday | 21:06 |
mordred | I get in saturday evening | 21:06 |
dmsimard | mordred: from China ? | 21:06 |
mordred | dmsimard: no - wound up not going to china | 21:06 |
dmsimard | Oh, ok, no jet lag then | 21:06 |
mordred | jeblair: yah- I think clarkb was suggesting we not try to do trial cutovers on saturday | 21:07 |
jeblair | so i think we just keep plugging away at this and whenever the migration script gives us output we can use, we start throwing some of it at zuulv3. but i don't think we should plan on any 'scheduled' events this weekend. | 21:07 |
dmsimard | Agreed | 21:07 |
mordred | agree | 21:08 |
dmsimard | Let's make as much progress to de-risk the migration as much as we can and see if it's doable thursday ? | 21:08 |
dmsimard | Cause, you know, read only friday | 21:08 |
jeblair | the last question i have is about notification/communication. the last thing we said about this to the dev community was 1 month ago, and we said we'd (probably) cutover monday. normally, we would have sent some reminder announcements, but the situation hadn't really gotten any clearer. do we want to send something out now? if so, what? | 21:09 |
jeblair | or do we want to cutover and say "surprise!" | 21:09 |
mordred | I definitely don't think we should attempt to throw the switch on friday :) | 21:09 |
mordred | jeblair: heh | 21:09 |
mordred | I think sending out an update is a good idea | 21:09 |
dmsimard | An update to say it's not happening either saturday or monday is the minimum | 21:10 |
mordred | something along the lines of "we're close, but we're not satisfied with the migration/cutover and are going to work on it for a few more days while we're in denver. blah blah safe than sorry see you soon kthxbai" | 21:10 |
clarkb | mordred: ++ | 21:10 |
jeblair | maybe also include "zuul might start leaving info on patches before then. we'll let you know when it happens. read this page: https://docs.openstack.org/infra/manual/zuulv3.html attend the ptg session on monday." | 21:11 |
clarkb | especially since I know a lot of people are wondering when they will get first crack at zuulv3 | 21:11 |
jlk | We should probably send something that outlines what we've discussed today, and set expectations that there is high chance of a cutover next week. | 21:11 |
mordred | jeblair: there is also a chunk of time monday scheduled in a room for some value of "us" to talk to some value of "people" about v3 and what it means | 21:11 |
mordred | jlk, clarkb: ++ | 21:11 |
jlk | or what mordred said | 21:11 |
jeblair | mordred: you want to draft that up? | 21:11 |
mordred | jeblair: yah - I can take a swing at it real quick | 21:11 |
dmsimard | jeblair: yes, something about potentially getting non-voting reviews from zuul as we dry run things could be good | 21:11 |
mordred | "if you start seeing comments from zuul, don't freak out - unless you want to" | 21:12 |
jeblair | okay, i think we have a Plan(tm) | 21:12 |
jeblair | anything else? | 21:12 |
Shrews | and possibly note that we are still holding "office hours" to discuss v3 things? | 21:13 |
jlk | We've got budget for the cut-over celebration right? | 21:13 |
jeblair | Shrews: ++ | 21:13 |
mordred | jlk: there's always budget for that | 21:13 |
jlk | hopefully it's before Thursday evening. | 21:13 |
jeblair | jlk, mordred: free booze! *now* i'm motivated! | 21:13 |
Shrews | free booze AND free steak is better motivation. just sayin'... | 21:14 |
jlk | I'm happy to donate my entire per diem for that day to the cause | 21:14 |
mordred | we may also want to ponder, before we show up, that there are a bunch of future/post-ptg zuulv3 related things that people are going to want to discuss while we're in denver | 21:14 |
jeblair | mordred: yes, and unfortunately, we still have a self-generated backlog of weeks/months | 21:15 |
mordred | and think about how to allocate some amount of time to that so people don't explode, but not so much that we stall on the last mile of rollout | 21:15 |
pabelanger | So, is there an updated list of things we need to focus on? Or has https://etherpad.openstack.org/p/zuulv3-pre-ptg been updated? | 21:15 |
mordred | it's also possible the time for that will just be at the bar in the evening :) | 21:15 |
jeblair | mordred: yeah. if we can collect use cases, etc, that will be good. if we can also convey it'll probably take a bit to get around to them on account of we still need to do some basic things that would be great | 21:16 |
jeblair | pabelanger: that's current | 21:17 |
jlk | probably need to make clear that cutover != v 3.0 release, right? | 21:17 |
jeblair | jlk: indeed | 21:17 |
jlk | that we still have some things to finish before calling it 3.0 | 21:17 |
pabelanger | okay, I've added my name to some tasks | 21:18 |
pabelanger | but release jobs appear broken, so making note | 21:18 |
Shrews | 3.0a alpha release | 21:19 |
jeblair | Shrews: i don't think we're even there yet | 21:19 |
jlk | 3.lol | 21:19 |
jeblair | more like that yeah | 21:19 |
Shrews | i will call it 3p0 to myself, b/c i find humor in that | 21:20 |
* Shrews must do dinner things now | 21:20 | |
pabelanger | I'm relocating as ansiblefest is over, going to get food then get back online | 21:20 |
jeblair | it's sort of like the opposite of tex version numbers. we started at 3.lololololololol, and we're at 3.lolol, almost down to 3.lol. :) | 21:21 |
jlk | soon enough, it'll be 3.yolo | 21:21 |
jeblair | friday | 21:21 |
jlk | (that's when the drinking begins) | 21:21 |
mordred | hah | 21:21 |
* jeblair puts away flask | 21:21 | |
jeblair | okay, i'm going to give my chair a break, then back to logstash things | 21:22 |
mordred | woot! | 21:22 |
mordred | jeblair: before you do logstash things ... | 21:22 |
jeblair | thanks everyone! | 21:22 |
jlk | cheers. I'm out for a bit too, dog needs a walk | 21:22 |
mordred | jeblair: I may have another zuul scheduler issue for you | 21:22 |
jeblair | too late already stood up | 21:22 |
mordred | jeblair: https://review.openstack.org/#/c/500365/ is not running any jobs | 21:23 |
mordred | jeblair: nor is its ancestor | 21:23 |
mordred | jeblair: the job content itself isn't a big deal - but zuul ignoring the uploads and the rechecks is potentially worrisome - I looked briefly and didn't see anything immediately | 23:23 |
jeblair | mordred: ack; i'll check logs in a bit | 21:24 |
mordred | jeblair, clarkb, jlk, pabelanger, Shrews, dmsimard: https://etherpad.openstack.org/p/6NTJxufjNU how's that look? | 21:37 |
pabelanger | looking | 21:38 |
mordred | jlk: if you get bored and run out of other things to do after the translation proposal things, https://review.openstack.org/502185 adds the secret needed for the npm upload job, so anybody can take the ball and run with that one too | 21:39 |
pabelanger | mordred: minor change, looks good | 21:41 |
mordred | pabelanger: ++ | 21:41 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Update roadmap in the README https://review.openstack.org/502188 | 21:52 |
mordred | pabelanger: so what's up with the release jobs? | 21:57 |
pabelanger | mordred: let me find log, but I don't think we were on the right tag, we ran against master | 21:58 |
mordred | nod | 21:58 |
pabelanger | mordred: http://logs.openstack.org/da/0a0162e080f09d5593effd4fcb837c3554d014da/release/release-openstack-python/44c7a5b/job-output.txt.gz#_2017-09-08_21_04_03_780620 | 21:59 |
mordred | pabelanger: also - I was just scanning through the project-template section of zuul/layout.yaml and I think there are a few more templates we could define based on content we have now - and a couple that we might need to add (there is an openstack-server-release-jobs that builds and uploads a tarball like normal but does not publish to pypi) | 22:00 |
mordred | pabelanger: cool, looking | 22:00 |
pabelanger | mordred: ya, happy to look and get the templates working | 22:01 |
mordred | pabelanger: if you happen to look at the project-template list, adding stuff as we go to that mapping file will, I think, help us not forget things (I know I've already lost track of a few of the things we have myself :) ) | 22:03 |
mordred | pabelanger: that looks like something EXTRA weird with Determine Local Head | 22:04 |
pabelanger | mordred: let me look at your migration script first | 22:04 |
pabelanger | so to get up to speed | 22:04 |
jeblair | mordred: msg looks good! | 22:06 |
jeblair | pabelanger: if you can make that release job run with keep enabled, that would be great. we can look at the local repo state on the executor. | 22:08 |
mordred | yah ... I can't find anything that would cause that output | 22:09 |
mordred | although it does make me want to change something in the base job ... | 22:09 |
mordred | oh. hah. I already wrote a change for this somewhere | 22:11 |
dmsimard | Would it be worthwhile to plug jeblair's Barcelona Zuul v3 talk ? | 22:17 |
dmsimard | I re-watched it recently to refresh my memory on things :D | 22:17 |
dmsimard | Otherwise message lgtm | 22:18 |
jeblair | dmsimard: heh, i feel like it's a bit dated and lacks practical information; but i'll let you/others decide if it's useful. | 22:18 |
dmsimard | I think it's easy for us to take a lot of things for granted or "common sense", the talk goes into a bit of the background for people unfamiliar with what even zuul v2 would be.. but I don't have a strong opinion, just thinking out loud | 22:20 |
*** olaph1 has joined #zuul | 22:21 | |
mordred | jeblair, pabelanger: https://review.openstack.org/#/c/501242/ would be nice to land - also would let us see the inventory we produced for the borked release job | 22:22 |
*** olaph has quit IRC | 22:23 | |
mordred | jeblair, dmsimard I added a PS to the end with a link to that talk and other things - I can't decide if I like including that section or not | 22:28 |
mordred | clarkb, jlk, pabelanger: ^^ thoughts? | 22:29 |
pabelanger | jeblair: mordred: sure, I can do --keep now | 22:29 |
dmsimard | mordred: worst that can happen is that people don't watch/read/care, best case we get more people on the same wavelength | 22:31 |
dmsimard | Doesn't sound like a bad tradeoff | 22:31 |
jeblair | mordred: we don't have a great way to share a module between two (roles, playbooks, etc) do we? | 22:34 |
dmsimard | jeblair: openshift-ansible has a solution for that | 22:35 |
jeblair | maybe symlinks? otherwise, we'd have to have a role for it, then include_role from 2 other roles? | 22:36 |
dmsimard | One sec | 22:36 |
jeblair | maybe we should handle /library like we handle /roles? | 22:36 |
jeblair | dmsimard: thx | 22:36 |
dmsimard | jeblair: https://github.com/openshift/openshift-ansible/tree/master/roles/lib_openshift | 22:36 |
dmsimard | jeblair: https://github.com/openshift/openshift-ansible/tree/master/roles/lib_utils | 22:36 |
dmsimard | They're basically roles with more or less just a library folder with modules/things in them | 22:37 |
mordred | dmsimard: don't you have to use a role before the module in it is accessible? | 22:38 |
dmsimard | mordred: meta dependencies | 22:38 |
dmsimard | meta dependencies will *run* the role.. but if there's nothing to run.. :) | 22:38 |
mordred | dmsimard: neither of those roles declare depends | 22:38 |
mordred | ah. yah | 22:38 |
dmsimard | mordred: https://github.com/openshift/openshift-ansible/blob/3409e6db205b6b24914e16c62972de50071f4051/roles/docker/meta/main.yml#L13 | 22:38 |
mordred | jeblair: so yah - make a role that just has library in it, then put dependencies: - library-role into meta/main.yaml of the roles that need it | 22:39 |
dmsimard | openshift-ansible does a lot of really cool things, I've sent a few patches their way. awesome stuff. | 22:39 |
mordred | jeblair: but - that's also basically what you were saying with include_role - same thing | 22:39 |
jeblair | cool (the library role will be the "submit a gearman job" and the depending roles will be "submit a logstash job" and "submit a subunit job") | 22:39 |
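The library-role pattern being discussed can be sketched as follows, using the role names jeblair mentions; the file names and module name are hypothetical stand-ins, not actual zuul-jobs content:

```yaml
# Sketch of the openshift-ansible pattern: a "library" role that ships
# only modules, pulled in via a meta dependency. The dependency "runs"
# the library role -- a no-op, since it has no tasks -- and makes its
# modules available to the depending role's tasks.
#
# roles/
#   submit-gearman-job/             <- library-only role
#     library/
#       submit_gearman_job.py       <- hypothetical shared module
#   submit-logstash-job/
#     meta/main.yaml contains:
dependencies:
  - role: submit-gearman-job
# tasks in submit-logstash-job can now use submit_gearman_job directly
```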
pabelanger | jeblair: mordred 76fa1c4a60314afe843e0c5cf8c96803 on ze01.o.o was the release job | 22:40 |
jeblair | mordred: yeah, but then i can use the module directly in a task? | 22:40 |
dmsimard | mordred, jeblair: this one is particularly cool, it's more or less the equivalent of our validate-host but on steroids https://github.com/openshift/openshift-ansible/tree/3409e6db205b6b24914e16c62972de50071f4051/roles/openshift_health_checker | 22:40 |
mordred | jeblair: root@ze01:/var/lib/zuul/builds/76fa1c4a60314afe843e0c5cf8c96803/work/src/git.openstack.org/openstack-dev/sandbox# git status | 22:40 |
mordred | Not currently on any branch. | 22:40 |
mordred | jeblair: yes | 22:40 |
jeblair | k. i'll do a local mockup first :) | 22:41 |
dmsimard | And then they even have a callback that prints what the actual failures were: https://github.com/openshift/openshift-ansible/blob/3409e6db205b6b24914e16c62972de50071f4051/roles/openshift_health_checker/callback_plugins/zz_failure_summary.py | 22:41 |
jeblair | mordred: huh, i thought i checked tags | 22:41 |
mordred | jeblair: well- status there is not showing what it does for me locally | 22:42 |
jeblair | that'll do it | 22:42 |
jeblair | git describe | 22:42 |
jeblair | 0.0.23 | 22:42 |
dmsimard | jeblair: happy to review your "zuul-lib" role when you got something going, add me as reviewer | 22:42 |
dmsimard | (btw we might want to move the module from validate-host there) | 22:43 |
mordred | jeblair: what's the state difference - locally if I do "git checkout 0.0.23" | 22:43 |
mordred | git branch shows me: | 22:43 |
mordred | * (HEAD detached at 0.0.22) | 22:43 |
jeblair | yeah me too | 22:44 |
jeblair | .git/HEAD is the same | 22:44 |
jeblair | huh, git status consults .git/logs/HEAD | 22:46 |
mordred | jeblair: so is this because it's cloned directly from /var/lib/zuul/executor-git/git.openstack.org/openstack-dev/sandbox to that ref rather than cloned and then checked out? | 22:47 |
jeblair | mordred: it should be cloned and checked out; i think executor is checking it out differently than git cli | 22:48 |
mordred | \o/ | 22:49 |
mordred | jeblair: so - fwiw, for tag jobs we do have zuul.tag for the main project | 22:50 |
dmsimard | oh, hey, we got a devstack-legacy pass on centos7 but a failure on suse so it's not too horrible | 22:51 |
dmsimard | is devstack on debian a thing | 22:52 |
jeblair | mordred: i think the only viable option is for the executor to make things correct in its work root, and to synchronize that exactly to the remote nodes. i don't want anything bypassing that process. | 22:52 |
dmsimard | ? | 22:52 |
mordred | jeblair: and the normal "find local branch" logic works fine for branches that aren't the one that's tagged | 22:52 |
mordred | jeblair: nod | 22:52 |
pabelanger | dmsimard: no | 22:53 |
pabelanger | dmsimard: centos should pass, I am not sure about opensuse. might want to ask dirk in openstack-infra | 22:53 |
dmsimard | pabelanger: it should, we have it in devstack-gate | 22:53 |
dmsimard | pabelanger: I'll try and see if I can find anything obvious | 22:54 |
dmsimard | in the meantime re-checking a cleaner patch with f26 thrown in | 22:54 |
clarkb | mordred: +2 on 501242 | 22:54 |
clarkb | didn't approve because I am very distracted by home things before having to get on a plane | 22:54 |
pabelanger | dmsimard: systemd failed to start peakmem_tracker | 22:54 |
pabelanger | no idea why | 22:55 |
dmsimard | pabelanger: what, on the suse job ? | 22:55 |
pabelanger | dmsimard: yup | 22:55 |
pabelanger | http://logs.openstack.org/47/502147/2/check/devstack-legacy-opensuse-423-tempest-dsvm-neutron-full/1d87625/logs/devstacklog.txt.gz#_2017-09-08_20_39_49_067 | 22:55 |
jeblair | this is what the executor does: https://etherpad.openstack.org/p/BPndl6rF47 | 22:56 |
jeblair | you can use that to reproduce the weird state | 22:56 |
dmsimard | pabelanger: "['pmap', '-XX', '1']' returned non-zero exit status 1" | 22:57 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Make validate-host read from site-variables https://review.openstack.org/500592 | 22:57 |
mordred | clarkb: thanks! | 22:57 |
pabelanger | dmsimard: this could be devstack bug, I think they did screen removal recently | 22:58 |
pabelanger | dmsimard: I'd check if zuulv2.5 jobs are working | 22:58 |
dmsimard | pabelanger: yeah doesn't look like -X is even an arg for pmap http://www.unix.com/man-page/suse/1/pmap/ | 22:58 |
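As the man page dmsimard links suggests, `-X`/`-XX` are procps-ng extensions that SUSE's pmap lacks; the same peak-memory figures are available portably from procfs. A small sketch of an alternative, not what devstack actually does:

```shell
# Peak virtual size (VmPeak) and peak RSS (VmHWM) straight from /proc,
# which works on any Linux regardless of which pmap is installed.
grep -E '^Vm(Peak|HWM)' /proc/self/status 2>/dev/null \
  || echo "VmPeak unavailable (no Linux procfs)"
```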
jeblair | mordred, pabelanger: maybe we want the executor to do: r.git.checkout('0.0.22') | 22:59 |
mordred | jeblair: maybe because of reset --hard -- to do that after setting the head reference? | 22:59 |
mordred | jeblair: yah - I agree - but it seems we only really need to do that for tags? | 22:59 |
pabelanger | dmsimard: http://logs.openstack.org/47/502147/2/check/gate-tempest-dsvm-neutron-full-opensuse-423-nv/a30d9c4/logs/devstacklog.txt.gz#_2017-09-08_21_00_40_012 failing on zuulv2.5. | 22:59 |
mordred | jeblair: like, we certainly don't want to try to checkout speculative refs :) | 22:59 |
jeblair | mordred: we're already using a different method for branches | 23:00 |
dmsimard | pabelanger: yeah I found https://review.openstack.org/#/c/496301/ which ran today | 23:00 |
mordred | jeblair: cool | 23:00 |
jeblair | mordred: speculative refs are on branches now, so we just checkout the branch in that case | 23:00 |
jeblair | (that's one of the super subtle awesome things about v3 :) | 23:00 |
dmsimard | pabelanger: I'll ping some suse folks on -infra | 23:00 |
pabelanger | jeblair: mordred: will defer to your expertise | 23:00 |
pabelanger | dmsimard: +1 | 23:00 |
mordred | jeblair: so yah - in that case I think r.git.checkout is gonna be more better | 23:01 |
jeblair | oh neat, we can even 'git checkout refs/tags/0.0.22' | 23:01 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Use 'git checkout' when checking out a tag https://review.openstack.org/502195 | 23:07 |
jeblair | mordred, pabelanger, clarkb: ^ we'll need to merge that and restart executors then try the release job again | 23:07 |
pabelanger | +2 | 23:10 |
mordred | jeblair: I like that idea that we can get branches and tags working the same | 23:12 |
jeblair | mordred: yeah, if it were any other day, i would have gone ahead and done that :) | 23:13 |
mordred | jeblair: ++ | 23:15 |
pabelanger | ze01 still has --keep, I've disabled it on ze02 already | 23:29 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Use 'git checkout' when checking out a tag https://review.openstack.org/502195 | 23:29 |
mordred | jeblair, pabelanger: the last couple of times I've restarted the scheduler it has seemed like I needed to restart the executors too | 23:32 |
mordred | jeblair, pabelanger: is that a thing? like, do we need to restart the executors after a full scheduler restart? | 23:32 |
mordred | or am I just too impatient | 23:32 |
jlk | mordred: might be impatient. I haven't seen that in my testing locally | 23:41 |
jlk | mordred: what I _have_ seen from time to time is that the way the VMs work that existing nodepool VMs wouldn't necessarily come back, because they check in at boot time, and since it's not boot time, they don't check in | 23:41 |
mordred | jlk: cool. I'll go with impatient :) | 23:43 |
jlk | mordred: the etherpad looks good, for a mail to go out. | 23:43 |
mordred | jeblair: figured out why the shade changes weren't being tested | 23:43 |
jlk | what time on Monday is the zuul session? | 23:43 |
mordred | jeblair: there was a dependency loop | 23:43 |
mordred | jlk: /me looks scared and hides | 23:43 |
mordred | jlk: it's sometime in the afternoon I think ... not really sure | 23:46 |
jeblair | mordred, jlk 2pm vail | 23:46 |
jeblair | https://ethercalc.openstack.org/Queens-PTG-Discussion-Rooms | 23:46 |
jlk | ah I'll probably miss it. I land at like 1pm pacific time. | 23:46 |
jlk | so I land right when that session is starting | 23:47 |
jeblair | mordred: updated your email etherpad to include that | 23:48 |
jeblair | mordred: what made you think you should restart the executors? | 23:48 |
mordred | jeblair: jobs weren't executing | 23:48 |
mordred | jeblair: but - I think last time it was early in the morning and I wasn't coffeed enough - so I think it should stay in the anecdote bucket | 23:49 |
pabelanger | mordred: I've had success with just scheduler restarts | 23:50 |
mordred | ok. good to know | 23:50 |
pabelanger | mordred: say when to try tagging again | 23:51 |
jlk | I do it a ton in docker so that I can alter code and just restart scheduler | 23:51 |
mordred | pabelanger: I have not restarted anything, but the change has landed | 23:51 |
jeblair | pabelanger: i don't see the fix commit on ze01 yet. we can manually update though if you're ready | 23:51 |
pabelanger | sure, I am ready if you want to manually update | 23:52 |
jlk | hrm, I'm looking at upstream-translation-update, trying to find where it sets the node to proposal | 23:53 |
jeblair | there were 3 zuul-executors running on ze01 again | 23:53 |
jlk | I see it runs a proposal-slave-cleanup though | 23:53 |
jlk | oh I see it, yaml buried the lede there. | 23:54 |
pabelanger | jeblair: what would cause that? | 23:55 |
jeblair | dmsimard: i may have aborted some devstack jobs with a restart just now | 23:55 |
jeblair | pabelanger: someone restarting it incorrectly? | 23:55 |
pabelanger | ok | 23:55 |
jeblair | systemd not doing the one thing it's supposed to be good at? | 23:55 |
dmsimard | jeblair: ok let me know when I can do a recheck | 23:55 |
jeblair | dmsimard: you're good | 23:55 |
pabelanger | jeblair: does that drop back to 1 process when zuulv3 mergers come online? | 23:55 |
jeblair | pabelanger: you can tag now | 23:56 |
pabelanger | k, tagging | 23:56 |
jeblair | pabelanger: well, i mean, we're never supposed to have 3 running. | 23:56 |
jeblair | is not related to mergers | 23:56 |
pabelanger | k | 23:56 |
pabelanger | sandbox 0.0.24 tagged | 23:57 |
jeblair | mordred: it looks like you were the last to restart ze01, can you tell me exactly what you did? | 23:57 |
mordred | jeblair: yah - I did a service zuul-executor stop - then I _believe_ I checked the process list for any python processes (which I do because I have absolutely no trust in our init scripts currently) | 23:58 |
pabelanger | :( | 23:58 |
pabelanger | 2017-09-08 23:57:55.899000 | ubuntu-xenial -> localhost | ERROR: AnsibleUndefinedVariable: 'zuul_traceroute_host' is undefined | 23:58 |
mordred | jeblair: I tend to not issue any starts until I've seen that there are no processes remaining | 23:58 |
mordred | pabelanger: oh good! | 23:58 |
pabelanger | let me see why | 23:59 |
mordred | I just landed the 'use site_variables' change | 23:59 |
jeblair | mordred: that sounds like a good procedure. maybe a straggler just escaped your attention. | 23:59 |
dmsimard | btw thanks #zuul for being patient with me while I learned zuul v3 and devstack/devstack-gate :D | 23:59 |
mordred | jeblair: yah. I really wish it didn't have to be a procedure | 23:59 |
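The stop/verify/start procedure mordred describes boils down to a wait-until-gone loop before issuing the start. Here it is sketched against a stand-in background process rather than the real zuul-executor service:

```shell
# Stand-in for `service zuul-executor stop`: a short-lived background job.
sleep 1 &
pid=$!
# Don't start the service again until no process remains; kill -0 only
# probes for existence, it sends no signal.
while kill -0 "$pid" 2>/dev/null; do
  sleep 0.2
done
echo "pid $pid is gone; safe to start the service again"
```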
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!