pabelanger | SpamapS: ya, I've just been using zookeeper from fedora for testing, so centos-8 will have something. But, we can likely push zookeeper-lite from tristanC / dmsimard to COPR or something | 00:03 |
SpamapS | meh | 00:03 |
SpamapS | tarball seems to be working-ish | 00:03 |
pabelanger | I have an ansible-role for zookeeper I plan on adding tarball support for also | 00:03 |
pabelanger | cool | 00:03 |
SpamapS | pabelanger: -> BonnyCI/hoist | 00:03 |
pabelanger | SpamapS: are you centos now? | 00:04 |
SpamapS | pabelanger: aye | 00:05 |
mordred | jeblair: http://logs.openstack.org/02/500202/24/check/devstack/a0ebcd2/job-output.txt.gz#_2017-09-07_22_29_05_056316 | 00:06 |
mordred | jeblair: I think you got further | 00:06 |
pabelanger | SpamapS: cool | 00:07 |
SpamapS | sorta ;) | 00:07 |
pabelanger | :) | 00:08 |
ianw | SpamapS: if you want "shoved into a rpm" then i've got -> https://copr.fedorainfracloud.org/coprs/iwienand/zookeeper-el7/packages/ | 00:08 |
SpamapS | ianw: you're about 20 minutes late | 00:09 |
ianw | SpamapS: it's ok, nothing happens for years, then everything happens in 20 minutes ;) | 00:10 |
SpamapS | yep | 00:11 |
SpamapS | realistically, I'm just pushing this bonnyci work to completion, and then I'll probably just install Software Factory and be happy | 00:11 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add support for debian in configure-mirrors https://review.openstack.org/501537 | 00:11 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add support for fedora in configure-mirrors https://review.openstack.org/501538 | 00:18 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add support for opensuse in configure-mirrors https://review.openstack.org/501539 | 00:19 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Remove project pipeline definition from zuul-jobs https://review.openstack.org/501849 | 00:20 |
ianw | have i done something wrong or is https://review.openstack.org/#/c/501904 not getting picked up for testing? | 00:28 |
ianw | hmm, neither is a recheck of an old job | 00:29 |
ianw | it seems to be in a pretty hard loop around http://paste.openstack.org/show/620679/ | 00:36 |
ianw | jeblair / mordred: ^ ... | 00:39 |
mordred | ianw: hrm | 00:42 |
mordred | ianw: I agree with you - that definitely looks like a hard loop | 00:43 |
ianw | variations of | 00:43 |
ianw | 2017-09-08 00:43:14,011 INFO zuul.DependentPipelineManager: Reported change <Change 0x7fecd74a41d0 501849,2> status: all-succeeded: True, merged: True | 00:43 |
ianw | 2017-09-08 00:43:14,115 INFO zuul.DependentPipelineManager: Reported change <Change 0x7fecd7010c88 501538,5> status: all-succeeded: True, merged: True | 00:43 |
ianw | over and over | 00:43 |
mordred | ianw: 501849 removed a pipeline definition from zuul-jobs that was in the zuul-jobs repo | 00:45 |
mordred | ianw: a few bugs have flushed out related to pipeline removals - so maybe something went south? | 00:45 |
mordred | ianw: in any case, at that rate we're going to run out of disk space due to logging | 00:46 |
mordred | ianw: so how about we restart the scheduler (if there's not enough logging now there never will be) | 00:46 |
ianw | sgtm | 00:46 |
mordred | I have stopped it - restarting | 00:47 |
mordred | it's now reading its config | 00:48 |
dmsimard | what did I break | 00:49 |
mordred | ianw: I rechecked your change and it seems to be enqueued | 00:49 |
dmsimard | mordred: I made sure to put depends-on on my forklift patches to ensure things did not merge out of order | 00:50 |
dmsimard | mordred: https://review.openstack.org/#/q/topic:zuulv3-forklift | 00:50 |
mordred | dmsimard: I don't think it was that - my hunch is that we tickled a weird edge-case somewhere | 00:50 |
dmsimard | oh wow my patch series finally landed \o/ | 00:50 |
dmsimard | need +3 on two patches to bring back zuul-jobs jobs into place https://review.openstack.org/#/q/topic:zuulv3-forklift | 00:51 |
SpamapS | gah.. so many apt: calls to fix :-P | 00:52 |
mordred | ianw: although we're now just sitting in queued state | 00:53 |
mordred | ianw: nevermind. I'm just impatient | 00:53 |
dmsimard | I'm not sure I understand the deal with that apt issue | 00:54 |
dmsimard | ansible is unable to use python-apt until you do apt-cache update or something like that ? | 00:55 |
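dmsimard's guess is roughly right: the apt module can fail against a stale package cache. A hedged sketch of the task shape that sidesteps it (the package name is illustrative, not from the conversation):

```yaml
# Hypothetical task: refresh the apt cache before installing, since the
# apt module (via python-apt) can fail when the cache is stale or empty.
- name: Install a package with a cache refresh
  apt:
    name: zookeeperd        # illustrative package name
    state: present
    update_cache: yes       # roughly equivalent to running `apt-get update` first
    cache_valid_time: 3600  # skip the refresh if the cache is under an hour old
```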
ianw | dmsimard: just been looking at them, as i was doing 501904 | 00:55 |
ianw | mordred / dmsimard: are we at a point we can merge https://review.openstack.org/#/q/topic:zuulv3-forklift ? | 00:58 |
dmsimard | ianw: yes, it's tech debt from merging my configure-mirror tree | 00:58 |
ianw | it lgtm, if zuul's up to it :) | 00:58 |
ianw | everything seems to be moving, so ok | 00:58 |
dmsimard | ianw: for the ci creation script, go ahead | 00:59 |
dmsimard | ianw: if it's part of what would be the base job, add the role in the base.yaml test playbook | 00:59 |
dmsimard | that's where validate-host will end up as well | 00:59 |
dmsimard | although I still need to figure out what to do with that one, it has tasks that can only run on the executor.. | 01:00 |
ianw | dmsimard: yep ... so for testing purposes, just assert: that: tests that things look right? | 01:00 |
ianw | file is there, perms are right, etc? | 01:00 |
ianw | or is there a better way? | 01:00 |
dmsimard | ianw: checking that it runs without horribly failing is already a good start (the job will fail if it fails horribly) | 01:00 |
dmsimard | ianw: asserts are good too. | 01:00 |
dmsimard | ianw: there's some examples here if you want: https://github.com/ansible/ansible/tree/devel/test/integration | 01:00 |
dmsimard | ianw: some examples of what I've written https://github.com/ansible/ansible/blob/devel/test/integration/targets/sensu_client/tasks/main.yml | 01:01 |
dmsimard | https://github.com/ansible/ansible/tree/devel/test/integration/targets/include_vars | 01:01 |
ianw | excellent thanks; looks like roughly what i had in my head so a good sign ;) | 01:02 |
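The assert-style checks ianw describes (file is there, perms are right) might look like this; the path, mode, and owner are made-up examples, not from any real role:

```yaml
# Hypothetical verification tasks for a role's integration test:
# stat the file the role should have written, then assert on the result.
- name: Check the deployed file
  stat:
    path: /etc/example/mirror.conf   # illustrative path
  register: mirror_conf

- name: Verify it exists with the expected permissions
  assert:
    that:
      - mirror_conf.stat.exists
      - mirror_conf.stat.mode == '0644'
      - mirror_conf.stat.pw_name == 'root'
```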
*** harlowja has quit IRC | 01:11 | |
tristanC | ianw: fwiw, there is a zookeeper-lite rpm package built in software-factory repository | 01:33 |
ianw | tristanC: cool, is that more or less the tarball in rpm format? | 01:40 |
tristanC | ianw: it's built from source for el7, the trick was to remove all the client and netty, hence the "-lite" suffix | 01:42 |
tristanC | package is https://softwarefactory-project.io/kojifiles/repos/sf-2.6-el7-release/Mash/zookeeper-lite-3.4.10-1.el7.x86_64.rpm, .spec is https://softwarefactory-project.io/r/gitweb?p=software-factory/zookeeper-lite-distgit.git;a=tree | 01:42 |
ianw | tristanC: excellent, updated https://etherpad.openstack.org/p/zookeeper-epel7 in case any googlenauts find it | 01:54 |
dmsimard | wanted to be productive tonight but spent the evening recovering from rdo infrastructure outage | 01:54 |
dmsimard | ianw: if you want to take a stab at devstack centos/fedora/suse/debian go for it, I'll follow up tomorrow | 01:55 |
ianw | ok, see how i got this afternoon. need some lunch first! :) | 01:56 |
dmsimard | ianw: if you try out fedora, I can probably just pattern off of that for | 01:56 |
dmsimard | centos* and the others | 01:56 |
tristanC | ianw: thanks! | 01:58 |
*** olaph1 has quit IRC | 02:08 | |
*** xinliang has quit IRC | 02:12 | |
*** xinliang has joined #zuul | 02:24 | |
*** xinliang has joined #zuul | 02:24 | |
dmsimard | all the cloud outage things are fixed | 02:53 |
dmsimard | going to bed now gnight | 02:53 |
dmsimard | o/ | 02:53 |
*** jkilpatr has quit IRC | 02:59 | |
tristanC | SpamapS: there is an upcoming blog post about using software-factory as a third-party-ci: https://softwarefactory-project.io/r/#/c/9473/3/Openstack_3rd_Party_CI_with_SF_26/2017-08-28-openstack-3rd-party-ci-with-software-factory.html.md | 03:07 |
tristanC | SpamapS: and the general documentation is https://softwarefactory-project.io/docs/operator/deployment.html | 03:08 |
SpamapS | tristanC: awesome. :) I really do intend to give it a shot. I just figured I invested half a day in centos-ifying BonnyCI/hoist, I might as well finish :) | 03:08 |
tristanC | As you wish, I think the main difference is that software-factory uses rpm packages for everything and the sfconfig script automatically generates all the secrets | 03:09 |
SpamapS | there are likely tons of differences :) | 03:09 |
tristanC | may i ask how are you getting python3 on centos? | 03:10 |
SpamapS | but ultimately, if I can get a zuulv3 up that talks to our internal Github and starts running jobs on our cloud... I can show people the magic. :) | 03:10 |
SpamapS | tristanC: haven't crossed that bridge yet. :) | 03:10 |
SpamapS | I'm still in the stage of converting all the apt's to yums. | 03:11 |
SpamapS | and went through the really silly-feeling process of making mariadb wori | 03:11 |
SpamapS | work | 03:11 |
tristanC | alternatively, you could try using software-fatory repository so that you could yum install zookeeper, zuul and nodepool | 03:12 |
tristanC | well fwiw, all the distgits are available over softwarefactory-project.io gerrit, and package updates go through full integration tests so contributions are welcome too :-) | 03:14 |
SpamapS | neat | 03:14 |
ianw | jeblair / mordred : zuul has hung again, and the logfile is up to 16gb ... | 03:23 |
ianw | http://paste.openstack.org/show/620684/ | 03:24 |
ianw | i'm going to restart it, because this doesn't end well | 03:25 |
SpamapS | hrm so why isn't there python3.5 in EPEL? :-P | 03:35 |
SpamapS | or does python3.4 work ok for zuul/nodepool these days? :-P | 03:35 |
* SpamapS can't remember what the min version was | 03:35 | |
tristanC | SpamapS: you could get python35 using softwarecollections | 03:38 |
SpamapS | tristanC: ah | 03:38 |
tristanC | SpamapS: but then, zuul-executor needs libpython35 in /usr/lib because bwrap will drop a custom ld_library_path | 03:38 |
SpamapS | round and round we go | 03:39 |
SpamapS | tristanC: you're scoring points :) | 03:41 |
* SpamapS is having to update bonnyci's code to be very explicit about pip/python executables | 03:41 | |
tristanC | SpamapS: sort of, imo the real point is using zuul, whatever distribution/confmgmt ;) | 03:43 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: Expand PATH for SUSE systems to include /usr/sbin https://review.openstack.org/501737 | 03:45 |
clarkb | py35 is required because of the type annotations | 03:50 |
clarkb | I don't follow why bubblewrap factors into that | 03:51 |
tristanC | clarkb: when using software-collections, python35 is installed in /opt and the .so isn't declared in the main ld.so.conf | 03:52 |
clarkb | I had zuul running on py34 for gerrit testing but that was before the type annotations | 03:52 |
clarkb | tristanC: oh software collection specific thing | 03:52 |
tristanC | clarkb: yes, regarding python35 on centos7 with scl | 03:53 |
tristanC | the issue is that bwrap is setuid and it will drop the custom LD_LIBRARY_PATH from zuul-executor, so ansible-playbook will fail with a missing libpython35 | 03:55 |
clarkb | got it | 03:56 |
tristanC | though it's easy to fix, just symlink the python lib from /opt to /lib | 03:56 |
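tristanC's symlink workaround could be expressed as an Ansible task along these lines; the SCL library paths below are assumptions for illustration, not taken from the conversation:

```yaml
# Hypothetical workaround: expose the SCL python35 shared library on the
# default loader path, since setuid bwrap drops LD_LIBRARY_PATH.
- name: Symlink libpython3.5 out of the SCL prefix
  file:
    src: /opt/rh/rh-python35/root/usr/lib64/libpython3.5m.so.rh-python35-1.0   # illustrative SCL path
    dest: /usr/lib64/libpython3.5m.so.rh-python35-1.0
    state: link
```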
*** xinliang has quit IRC | 03:59 | |
ianw | dmsimard: ahh, i think we've had a bit of a split-brain thing happening between depends-on, zuulv2 and zuulv3 merging with your jobs moving tests. just untangling it now ... | 04:02 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Expand PATH for SUSE systems to include /usr/sbin https://review.openstack.org/501737 | 04:02 |
*** xinliang has joined #zuul | 04:06 | |
ianw | oh, you know what ... it's because there's no jenkins reporting on the depends-on | 04:10 |
SpamapS | weird... pip: is using 'pip2' instead of 'pip' which is causing all kinds of weirdness for me :-P | 04:27 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: Add integration tests for validate-host https://review.openstack.org/501543 | 04:34 |
tristanC | SpamapS: btw, you'll also need git-2 because zuul is using GIT_SSH_COMMAND, which is also avail in scl with rh-git29 | 05:05 |
tristanC | and well, if you don't want to bother with those details, you could get the service installed now using "yum install -y https://softwarefactory-project.io/repos/sf-release-master.rpm && yum install -y rh-python35-zuul-* rh-python35-nodepool-*" ... just saying | 05:11 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add a role to emit an informative header for logs https://review.openstack.org/501495 | 05:33 |
*** harlowja has joined #zuul | 06:19 | |
*** harlowja has quit IRC | 06:40 | |
*** fbo_ has joined #zuul | 06:43 | |
*** fbo_ is now known as fbo | 06:48 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Use connection type supplied from nodepool https://review.openstack.org/501976 | 06:58 |
*** yolanda has joined #zuul | 07:07 | |
*** electrofelix has joined #zuul | 08:38 | |
*** hashar has joined #zuul | 09:01 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support username also for unmanaged cloud images https://review.openstack.org/500808 | 09:04 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Add password to build and upload information https://review.openstack.org/502011 | 09:04 |
tristanC | I started the process of writing a SELinux policy for zuul, once it passes some integration tests i'd like to submit it upstream for review if you are ok with this | 09:55 |
tristanC | fwiw, the type enforcement file looks like this: https://softwarefactory-project.io/r/#/c/9593/2/zuul/sf-zuul.te | 09:56 |
*** jkilpatr has joined #zuul | 10:57 | |
*** olaph has joined #zuul | 12:03 | |
dmsimard | jeblair, mordred: so I was thinking last night.. with ara 1.0, there is the opportunity to make a *very* bare bones callback with (almost) just 'requests' as a dependency for use with the HTTP REST API. The python API leverages the flask app so it still depends on some things (less than the entire webapp, like no xstatic, etc.).. as things progress along, I'll see what the opportunities are for a more "bare | 12:08 |
dmsimard | bones" python API callback | 12:08 |
*** odyssey4me has joined #zuul | 12:43 | |
*** hashar has quit IRC | 13:28 | |
*** hashar has joined #zuul | 13:28 | |
*** dkranz has joined #zuul | 13:56 | |
rcarrillocruz | hey folks | 13:59 |
rcarrillocruz | how's bubblewrap installed | 13:59 |
rcarrillocruz | via distro package? | 13:59 |
rcarrillocruz | setting up a zuul, getting failures due to missing bwrap binary | 13:59 |
odyssey4me | rcarrillocruz are there instructions around for zuul v3 - or is this zuul v2.5 ? | 14:03 |
rcarrillocruz | setting up v3 | 14:03 |
rcarrillocruz | hmm | 14:05 |
rcarrillocruz | https://github.com/openstack-infra/puppet-zuul/blob/master/manifests/executor.pp | 14:05 |
mordred | rcarrillocruz: there's a PPA we use for ubuntu-xenial | 14:06 |
rcarrillocruz | yah, just enabled https://launchpad.net/~openstack-ci-core/+archive/ubuntu/bubblewrap/+index | 14:07 |
rcarrillocruz | thx | 14:07 |
mordred | rcarrillocruz: it's already in fedora - I'm not sure about the story for centos | 14:07 |
mordred | ++ | 14:07 |
rcarrillocruz | oh | 14:07 |
rcarrillocruz | fedora 26? | 14:07 |
rcarrillocruz | i mean, right now i'm in AIO, scheduler/merger/executor all in a xenial | 14:08 |
rcarrillocruz | but that's interesting | 14:08 |
rcarrillocruz | in other news | 14:09 |
rcarrillocruz | https://github.com/rcarrillocruz-org/zuul-tested-repo/pull/5 | 14:09 |
rcarrillocruz | i have GH reporter working | 14:09 |
rcarrillocruz | \o/ | 14:09 |
rcarrillocruz | odyssey4me: oh sorry, didn't follow, you are willing to spin up a v3? | 14:11 |
rcarrillocruz | i'm doing an install using ansible-role-zuul | 14:11 |
rcarrillocruz | getting notes of things i'm encountering that need manual fix | 14:11 |
rcarrillocruz | for later patches | 14:11 |
rcarrillocruz | but overall, the role sets up a v3 AIO nicely | 14:11 |
mordred | rcarrillocruz: wot! | 14:12 |
odyssey4me | rcarrillocruz yes, I'm particularly interested in nodepool at this stage - not sure how tightly coupled it is to zuul v3 | 14:12 |
mordred | odyssey4me: it can totally be run on its own | 14:12 |
rcarrillocruz | odyssey4me: well, i'm using nodepool standalone for a very rough CI in Ansible networking | 14:13 |
mordred | odyssey4me: nodepool in the feature/zuulv3 branch is the one you want if you're setting up a nodepool | 14:13 |
odyssey4me | ah, excellent - not that I wouldn't prefer our CI to use zuul... but for now we have jenkins :( | 14:13 |
rcarrillocruz | used ansible-role-nodepool | 14:13 |
rcarrillocruz | happy to help | 14:13 |
mordred | odyssey4me: so - there's a thing that I think would be GREAT if someone wrote | 14:13 |
mordred | odyssey4me: but I'm not going to because I'm busy | 14:13 |
odyssey4me | mordred aha, that'd be what I'm looking for then | 14:13 |
mordred | odyssey4me: which is a nodepool plugin for jenkins | 14:13 |
odyssey4me | rcarrillocruz so ansible-role-nodepool works with the feature branch version? | 14:14 |
rcarrillocruz | yep | 14:14 |
rcarrillocruz | by default it pulls feature/zuulv3 | 14:14 |
odyssey4me | mordred hmm, yes - that might actually just end up happening | 14:14 |
mordred | odyssey4me: nodepool v3 uses zookeeper for zuul to request nodes from it - it should be VERY easy for someone with the javas to write a plugin for jenkins that could request nodes from nodepool using the same api | 14:14 |
odyssey4me | rcarrillocruz sweet, that helps a bunch - thanks | 14:14 |
mordred | odyssey4me: I'd personally like it because I think it makes a really nice migration path for folks - or for situatoins where you want to run zuul and jenkins side by side | 14:15 |
rcarrillocruz | things you need to do before ansible-role-nodepool invocation: | 14:15 |
rcarrillocruz | 1.bootstrap python | 14:15 |
rcarrillocruz | 2. bootstrap pip | 14:15 |
rcarrillocruz | 3. generate ssh key | 14:15 |
mordred | odyssey4me: both a zuul and a jenkins witha zk plugin could totally consume nodes from the same nodepool with no problems | 14:15 |
rcarrillocruz | 4. copy over clouds.yaml to nodepool home folder | 14:15 |
rcarrillocruz | 5. invoke ansible-role-zookeeper | 14:16 |
odyssey4me | mordred yeah, I saw the intended lock feature - so that makes sense | 14:16 |
rcarrillocruz | 6. invoke ansible-role-nodepool, passing as param the nodepool_file_nodepool_yaml_src var (it creates nodepool.yaml) | 14:16 |
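rcarrillocruz's six steps above could be sketched as a wrapper playbook. The role names come from the conversation; the host group, package commands, and file paths are illustrative assumptions:

```yaml
# Hypothetical wrapper playbook for the bootstrap steps above.
- hosts: nodepool            # illustrative inventory group
  become: true
  pre_tasks:
    - name: Bootstrap python and pip (steps 1-2)
      raw: apt-get install -y python python-pip
    - name: Generate an ssh key for the nodepool user (step 3)
      user:
        name: nodepool
        generate_ssh_key: yes
    - name: Copy clouds.yaml into nodepool's home (step 4)
      copy:
        src: clouds.yaml
        dest: /home/nodepool/.config/openstack/clouds.yaml   # illustrative path
  roles:
    - ansible-role-zookeeper            # step 5
    - role: ansible-role-nodepool       # step 6
      nodepool_file_nodepool_yaml_src: nodepool.yaml
```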
rcarrillocruz | i'm in the process of creating a network-infra, it's private repo, once i split off secrets from it i wanna make it public | 14:17 |
rcarrillocruz | will ping you when done | 14:17 |
odyssey4me | rcarrillocruz that'd be great | 14:17 |
rcarrillocruz | but the nodepool playbook is pretty much the steps i depicted above | 14:17 |
odyssey4me | I wonder if we should etherpad that little procedure and share war stories as we go | 14:18 |
odyssey4me | I suppose a review to the repo would probably make better sense. | 14:18 |
odyssey4me | I'll take these as notes and prep a patch to the README | 14:18 |
odyssey4me | thanks rcarrillocruz once again | 14:18 |
rcarrillocruz | ++ | 14:18 |
rcarrillocruz | i need to app myself a doc patch describing what perms are bare minimum to create a zuul gh app | 14:20 |
rcarrillocruz | s/app/write | 14:20 |
*** olaph has quit IRC | 14:34 | |
*** olaph has joined #zuul | 14:36 | |
dmsimard | rcarrillocruz: what are you deploying on ? centos ? | 14:51 |
rcarrillocruz | xenial | 14:51 |
dmsimard | rcarrillocruz: okay -- if you deployed on centos I might have been able to help :) | 14:52 |
dmsimard | rcarrillocruz: https://www.rdoproject.org/blog/2017/03/standalone-nodepool/ | 14:53 |
dmsimard | out of software factory | 14:53 |
dmsimard | rcarrillocruz: oh, that might be for odyssey4me instead though | 14:53 |
rcarrillocruz | otoh, i believe sf is not on zuulv3 yet | 14:53 |
rcarrillocruz | ? | 14:53 |
dmsimard | rcarrillocruz: it's there: https://softwarefactory-project.io/zuul3/ | 14:54 |
dmsimard | rcarrillocruz: tristanC might have more details but I believe all the necessary bits to use it are there but the next version of SF should contain the "real" v3 release | 14:54 |
rcarrillocruz | orly? yanis told me SF v3 was still to be determined | 14:54 |
odyssey4me | thanks dmsimard - that may prove useful :) | 14:54 |
dmsimard | mordred: which brings the question, do you guys plan on merging feature/v3 back into master and start tagging releases at some point ? :) | 14:55 |
rcarrillocruz | hmm | 14:55 |
rcarrillocruz | sudo yum install -y --nogpgcheck https://softwarefactory-project.io/repos/sf-release-2.5.rpm | 14:55 |
rcarrillocruz | i wonder if that v3 was hand rolled | 14:55 |
rcarrillocruz | and is what yanis referred to, that there no v3 packages | 14:56 |
dmsimard | rcarrillocruz: v3 is not in 2.5 | 14:56 |
dmsimard | rcarrillocruz: it landed in 2.6 | 14:56 |
rcarrillocruz | yanis == spredzy on IRC for those who don't know | 14:56 |
dmsimard | rcarrillocruz: install 2.6 instead | 14:56 |
rcarrillocruz | ah, so SF releases are not related to zuul versioning | 14:56 |
mordred | dmsimard: yup! | 14:56 |
rcarrillocruz | good | 14:56 |
mordred | dmsimard: well - we plan on merging v3 back into master for sure | 14:57 |
mordred | dmsimard: and I believe we plan on tagging a 3.0 release once we think it's 'ready' for other folks | 14:57 |
dmsimard | rcarrillocruz, odyssey4me: docs here https://softwarefactory-project.io/docs/operator/deployment.html | 14:57 |
dmsimard | rcarrillocruz: correct, sf releases are not tagged according to zuul release :) | 14:58 |
mordred | dmsimard: ongoing releases are an interesting question - since we run CD from master (or will be once it's re-merged) - releases wind up being arbitrary snapshots -so we'll need to think about the semantics of that | 14:58 |
dmsimard | rcarrillocruz: 2.6 ships with jenkins, zuul-launcher (so zuul v2.5 without jenkins) *and* zuul v3, it's a release to help with transition | 14:58 |
rcarrillocruz | hmm, k | 14:59 |
rcarrillocruz | pabelanger: ^ | 14:59 |
tristanC | rcarrillocruz: the current roadmap is sf-2.x support both zuulv2 and zuulv3, and a future sf-3 will be zuulv3 only | 14:59 |
rcarrillocruz | we talked about SF and v3 not that long | 15:00 |
rcarrillocruz | ++ | 15:00 |
tristanC | rcarrillocruz: sf-2.6 does have a working "tech-preview" zuulv3, and we are waiting for the upstream release of zuulv3 to release it as part of sf-2.7 | 15:00 |
tristanC | the master repository (that will be sf-2.7) does have all the nodepoolv3/zuulv3 bits in place | 15:02 |
tristanC | + the static and opencontainer driver so that you can test the full stack without a cloud | 15:04 |
rcarrillocruz | OH | 15:05 |
rcarrillocruz | that's neat! | 15:05 |
rcarrillocruz | i thought static driver was in review | 15:05 |
rcarrillocruz | has that been merged? | 15:05 |
tristanC | rcarrillocruz: not yet but I've added it to the nodepool3.rpm so that it's easy to test | 15:06 |
jeblair | mordred: post-hoc -1 review of https://review.openstack.org/501886 | 15:06 |
tristanC | rcarrillocruz: and we just added an integration test that creates an oci slave to verify zuulv3 can merge a patch | 15:06 |
rcarrillocruz | hmm, what's ansible_user value on the executor | 15:07 |
rcarrillocruz | is it root by default? | 15:07 |
rcarrillocruz | the user it ssh to on nodepool nodes | 15:07 |
tristanC | rcarrillocruz: yes, nodepool needs root access to create slave | 15:08 |
tristanC | rcarrillocruz: you can have a look at this integration test: https://github.com/softwarefactory-project/sf-ci/blob/master/health-check/nodepool3.yaml | 15:08 |
tristanC | rcarrillocruz: which use this nodepool.yaml: https://github.com/softwarefactory-project/sf-ci/blob/master/health-check/templates/nodepoolV3.yaml.j2 | 15:08 |
tristanC | similarly, this zuulv3 verify a change can be merged: https://github.com/softwarefactory-project/sf-ci/blob/master/health-check/zuul3.yaml | 15:09 |
jeblair | pabelanger, clarkb: fyi see comments on https://review.openstack.org/501886 | 15:09 |
tristanC | those tests are run on every software-factory change :) | 15:09 |
pabelanger | morning | 15:23 |
pabelanger | just catching up on backscroll | 15:23 |
pabelanger | jeblair: clarkb: thanks, keep forgetting about that | 15:24 |
mordred | jeblair: GAH. thank you | 15:30 |
tristanC | leaving for denver soon, see you there folks! | 15:35 |
*** hashar is now known as hasharAway | 15:45 | |
rcarrillocruz | are we good to go https://review.openstack.org/#/c/500808/3 | 15:50 |
rcarrillocruz | i need this to consume user from nodepool settings, to log into network appliances images | 15:51 |
rcarrillocruz | mordred: ^, you +2'd previously | 15:58 |
rcarrillocruz | hmm, on the other hand, where in the zuul code do we consume 'username' from nodepool? i see in executor/server.py it creates ansible_user off executor.default_username, but don't see setting the user on hostvars off the nodepool node | 15:59 |
rcarrillocruz | i guess by passing the full provider, as it contains labels, then username | 16:02 |
mordred | tristanC: see you in denver! | 16:07 |
mordred | rcarrillocruz: tobiash has patches for zuul to consume the username field once it's there | 16:08 |
mordred | rcarrillocruz: you might also want to review https://review.openstack.org/#/c/500800/2 and https://review.openstack.org/#/c/453968/3 | 16:08 |
rcarrillocruz | thx for the +1 | 16:08 |
rcarrillocruz | i'm a bit torn exposing passwords on nodepool.yaml | 16:08 |
rcarrillocruz | maybe it should go in something like secure.yaml or the likes | 16:08 |
rcarrillocruz | like | 16:08 |
rcarrillocruz | nodepool.yaml have it in core review public repo | 16:08 |
rcarrillocruz | secrets in private repo or somewhere else | 16:09 |
rcarrillocruz | i refer to https://review.openstack.org/#/c/502011/ | 16:09 |
mordred | rcarrillocruz: I just left that same -1 | 16:10 |
mordred | rcarrillocruz: we have an optional secure.yaml already - I think the docs in this case should talk about using it for that field | 16:10 |
mordred | rcarrillocruz: also - I think we need doc mentions on securing zookeeper | 16:11 |
rcarrillocruz | my mem may fail on me, secure.yaml was for clouds.yaml passwords? | 16:12 |
pabelanger | rcarrillocruz: we used secure.yaml for database connection string | 16:14 |
pabelanger | right now, it is not used | 16:15 |
mordred | rcarrillocruz, pabelanger: both of you are right ... | 16:15 |
pabelanger | but, I think we want to use it for zookeeper auth at some point | 16:15 |
mordred | os-client-config supports a clouds.yaml and a secure.yaml | 16:15 |
mordred | and nodepool support a nodepool.yaml and a secure.yaml | 16:15 |
rcarrillocruz | what i'm getting at: will secure.yaml become a general thing to store nodepool creds | 16:15 |
pabelanger | Oh, TIL about os-client-config | 16:15 |
mordred | (os-client-config supports secure.yaml because nodepool did and it seemed like a nice thing to copy) | 16:15 |
rcarrillocruz | images creds | 16:15 |
rcarrillocruz | connection strings | 16:15 |
rcarrillocruz | etc | 16:15 |
pabelanger | and secure.yaml | 16:16 |
mordred | rcarrillocruz: secure.yaml already is a general thing in which any setting can go | 16:16 |
rcarrillocruz | if so, it sounds like we want to have a mini schema here | 16:16 |
rcarrillocruz | ack | 16:16 |
mordred | rcarrillocruz: it's basically just two files so you can, as an admin, put some in one file with stricter perms as you see fit | 16:16 |
mordred | or not, if your env is just you and you don't care :) | 16:16 |
rcarrillocruz | heh | 16:17 |
mordred | pabelanger: yah - we could split our nodepool clouds.yaml files if we wanted, put secure.yaml files out there with our passwords managed in system-config and put the other bits into project-config | 16:18 |
pabelanger | ++ | 16:18 |
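The split mordred describes might look like the following; the keys are illustrative (the conversation notes secure.yaml once held the database connection string), and secure.yaml is simply a second config file kept under stricter permissions:

```yaml
# nodepool.yaml -- world-readable config, kept in a public repo (illustrative keys)
labels:
  - name: centos-7
    min-ready: 1

# secure.yaml -- mode 0600, kept in a private repo or managed separately
mysql:
  name: nodepool
  user: nodepool
  passwd: not-a-real-password   # illustrative secret
```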
jeblair | tobiash: http://paste.openstack.org/show/620735/ errors related to connection cache | 16:21 |
rcarrillocruz | btw, timer trigger doesn't work on github repos | 16:23 |
rcarrillocruz | http://paste.openstack.org/show/620737/ | 16:23 |
rcarrillocruz | it's supposing events will have commits | 16:23 |
jeblair | that error means that we're not completing the reconfiguration process | 16:23 |
rcarrillocruz | that logic seems to need a move | 16:23 |
rcarrillocruz | will prep a patch | 16:23 |
jeblair | rcarrillocruz: thx | 16:23 |
jeblair | mordred, ianw: it is possible the maintainCache bug may be the cause of the stuck changes in queue. when we reconfigure (because we land a config change), we re-enqueue all the changes in new pipelines, then maintain the connection cache (which fails now and aborts the process). *after* that we re-establish the shared change queues in the pipeline postConfig method. it seems plausible that since that's not happening, a change that was in ... | 16:29 |
jeblair | ... a pipeline across a reconfiguration may not be removed from the correct queue. | 16:29 |
jeblair | i will prepare a revert | 16:29 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Revert "Enable maintainConnectionCache" https://review.openstack.org/502121 | 16:38 |
jeblair | mordred: ^ | 16:40 |
tobiash | jeblair: looks like maintainCache is untested yet in github | 16:48 |
jlk | huh, yeah I don't think we've tested that | 16:48 |
tobiash | +2 for the revert | 16:49 |
jeblair | since it was probably copied, it's worth double checking that the bug in the github version of that function isn't also in the gerrit version. it's possible we ran that on github before we would have run it on gerrit. in other words, that particular code path may not be tested in gerrit as well. | 16:51 |
tobiash | jeblair: yes, I assumed it worked in v2 and wasn't modified much | 16:53 |
mordred | jeblair: +A | 16:53 |
mordred | jeblair: and ah - your explanation of how that could cause the things we saw seems plausible | 16:54 |
pabelanger | mordred: jeblair: is something wrong with zuulv3.o.o? I am seeing what I think is a loop in the debug.log | 17:00 |
*** harlowja has joined #zuul | 17:00 | |
jeblair | i'll restart it | 17:02 |
jeblair | pabelanger: see scrollback and https://review.openstack.org/502121 | 17:02 |
pabelanger | jeblair: ah, thank you | 17:05 |
tobiash | mordred, rcarrillocruz: for production use of storing passwords in zookeeper we for sure need to secure zookeeper with auth and encryption | 17:11 |
tobiash | mordred, rcarrillocruz: both should be supported in zookeeper, but we will have to add tls support to kazoo | 17:12 |
clarkb | this is login credentials? | 17:12 |
clarkb | another option may be to negotiate those out of band like is done with ssh keys? | 17:13 |
tobiash | clarkb: yes, that's login credentials to the nodes | 17:14 |
tobiash | clarkb: then we would need secrets on the executor also for user supplied images and means on the executor to configure different credentials for different labels | 17:15 |
clarkb | tobiash: yes, which is how ssh works isn't it? | 17:17 |
clarkb | (the private key being the secret) | 17:17 |
tobiash | yes | 17:18 |
tobiash | currently there is just a single private key on the executor for all nodes | 17:18 |
tobiash | and accessing windows nodes unfortunately doesn't work with ssh | 17:19 |
clarkb | tobiash: but you can use ssl client cert auth with windows | 17:19 |
clarkb | which is similar ish to ssh keys | 17:19 |
tobiash | clarkb: hm, didn't think about that possibility yet | 17:22 |
tobiash | sounds cool | 17:22 |
tobiash | clarkb: so you suggest to add a winrm_client_cert setting to the executor to avoid password passing? | 17:23 |
clarkb | tobiash: ya, I mean I've never had to actually use it but seems like a good match up to how things work with ssh keys | 17:23 |
clarkb | and that might simplify bootstrapping | 17:24 |
mordred | it's worthwhile checking about how windows guests work on openstack, but also on aws - do you get an administrator password returned in the nova server or ec2 instance data from the API? | 17:24 |
mordred | mostly because if that's the mechanism available for those, then figuring out password passing will still need to happen at some point | 17:25 |
tobiash | mordred: in our v2 env the windows guests work quite well | 17:25 |
tobiash | they just take longer to boot | 17:25 |
tobiash | around 2min instead of 50s | 17:25 |
tobiash | but we preconfigured the login stuff in the image | 17:26 |
mordred | tobiash: how does auth work with that? also - someone was asking about image building too | 17:26 |
jeblair | 2017-09-08 17:26:19.542490 | ubuntu-xenial | 2017-09-08 17:26:19.542 | stack.sh completed in 944 seconds. | 17:27 |
jeblair | clarkb, mordred: ^ | 17:27 |
jeblair | http://logs.openstack.org/02/500202/25/check/devstack/dbd6e11/ | 17:27 |
tobiash | mordred: we built that image by hand and added cygwin ssh service with public key authentication such that nodepool and jenkins are happy | 17:27 |
tobiash | but windows was a side use case before and now we need to professionalize this with automated image building and so on | 17:28 |
clarkb | jeblair: nice | 17:29 |
tobiash | ok, client cert auth should be possible (although undocumented) with ansible: https://github.com/ansible/ansible/issues/16243 | 17:34 |
mordred | jeblair: WOOT | 17:38 |
mordred | tobiash: cool | 17:39 |
pabelanger | 502121 looks stuck in gate :( | 17:39 |
clarkb | does make me wonder why windows doesn't do ssh key auth for powershell. Its all bsd licensed too isn't it? | 17:39 |
jeblair | mordred, Shrews: do you know if anyone is working on the websocket streaming test failures? | 17:40 |
jeblair | pabelanger: i'll force-merge it and restart i guess | 17:40 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Revert "Enable maintainConnectionCache" https://review.openstack.org/502121 | 17:41 |
pabelanger | jeblair: ty | 17:41 |
tobiash | clarkb: https://blogs.msdn.microsoft.com/powershell/2015/10/19/openssh-for-windows-update/ | 17:43 |
tobiash | But still ansible would have to support win modules via ssh | 17:44 |
jeblair | clarkb: do you want to take a look at 496959, 496301, 500202 ? | 17:45 |
jeblair | mordred: and you the last 2 of those? ^ | 17:45 |
jeblair | i think that's a good checkpoint; we can land those and iterate from there (i think we still need to add the disk partitioning role to that, and then multinode, and tempest... :) | 17:46 |
clarkb | mordred: jeblair for 959 homedir is moving from /opt/stack/new to /opt/stack ? | 17:49 |
*** electrofelix has quit IRC | 17:50 | |
jeblair | clarkb: yep; since we're starting from scratch here, i'm trying to be as close to devstack default as possible (i'd like to actually have devstack do that -- it is capable of doing so, but there's some chicken/egg stuff with git repos we'd need to work out first) | 17:50 |
clarkb | and tempest homedir is default path which isn't a change I don't think | 17:51 |
clarkb | ? | 17:51 |
jeblair | clarkb: i believe that's correct | 17:52 |
mordred | jeblair: +2 from me all around | 17:53 |
clarkb | ya confirmed tempest isn't moving just continues to use default | 17:54 |
mordred | jeblair, Shrews: I am not working on websocket streaming test failures | 17:54 |
clarkb | jeblair: and the localrc sort method is not alnum or similar; it instead writes out vars before they are referenced? | 17:57 |
clarkb | ya ok thats what vargraph is | 17:59 |
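The ordering clarkb is describing (write each variable's definition before any line that references it) is essentially a topological sort over the variable-reference graph. A minimal sketch of that idea, with illustrative names only, not the actual vargraph code:

```python
import re

def order_vars(assignments):
    """Order shell-style VAR=value assignments so that every variable
    is emitted before any assignment that references it via $VAR or
    ${VAR}.  `assignments` maps name -> value string.  Illustrative
    only; the real localrc generator may differ.
    """
    ref = re.compile(r'\$\{?(\w+)\}?')
    emitted, order = set(), []

    def emit(name, seen=()):
        if name in emitted:
            return
        if name in seen:
            raise ValueError('circular reference via %s' % name)
        # Emit every referenced variable first.
        for dep in ref.findall(assignments[name]):
            if dep in assignments:
                emit(dep, seen + (name,))
        emitted.add(name)
        order.append('%s=%s' % (name, assignments[name]))

    for name in sorted(assignments):
        emit(name)
    return order
```

For example, `order_vars({'A': '1', 'B': '$A/2'})` emits the `A=` line before the `B=` line even though both start out alphabetized.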
mordred | jeblair: I just rechecked https://review.openstack.org/#/c/500365 to see how it goes | 18:05 |
pabelanger | mordred: jeblair: puppet has installed 502121 on zuulv3.o.o, safe to restart scheduler? | 18:05 |
mordred | pabelanger: fine by me - I just rechecked a job, so you may want to do graceful save/restore - or else I can just recheck once you're done | 18:06 |
pabelanger | I am not sure, I think zuul is stuck in a loop? cc jeblair | 18:06 |
mordred | oh- then absolutely safe to restart | 18:06 |
pabelanger | k | 18:06 |
pabelanger | okay, restarted | 18:07 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add migration tool for v2 to v3 conversion https://review.openstack.org/491805 | 18:09 |
mordred | jeblair, clarkb, pabelanger: ^^ that's ready for people to take stabs at | 18:09 |
mordred | I added a script "tools/run-migration.sh" that you can use to run it locally if project-config is adjacent to zuul in your local git structure | 18:10 |
mordred | it expects the project-config repo to have the zuul/mapping.yaml file in it though | 18:10 |
mordred | I also added a check job with that patch that runs the script on project-config and collects the output | 18:10 |
mordred | and I put in some notes at the top of the file on things that need to be implemented - I think we can divvy those up | 18:12 |
mordred | also, while there are some classes in this tool, it's also vintage mordred code, which means it's not always doing things in the sanest way | 18:12 |
clarkb | jeblair: I am +2 on those changes too. Not approving because have plumber here again and distracted | 18:15 |
mordred | jeblair: errors like this when running a job: | 18:18 |
pabelanger | oh noes | 18:18 |
mordred | "shade-ansible-devel-functional-devstack shade-ansible-devel-functional-devstack : ERROR Unable to find playbook /var/lib/zuul/builds/f244873046ed46a8abcdbb7a036008ea/work/src/git.openstack.org/openstack-infra/shade/playbooks/devstack/pre-run" | 18:18 |
pabelanger | 2017-09-08 18:17:00,357 DEBUG zuul.AnsibleJob: [build: 924886064b4f45d28785c6851c60bc27] Ansible output: b'ERROR! A worker was found in a dead state' | 18:18 |
pabelanger | that was on ze01 | 18:18 |
mordred | pabelanger: ugh | 18:19 |
pabelanger | we seem to still have our PPA version installed | 18:20 |
mordred | jeblair: how hard would it be to catch those in the job parser / validator? I mean, I guess it would involve the parser making additional cat job calls to get the playbook content | 18:20 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add migration tool for v2 to v3 conversion https://review.openstack.org/491805 | 18:20 |
mordred | pabelanger: also - could you review https://review.openstack.org/#/c/500320/ for me please? | 18:21 |
mordred | clarkb: or you - you +2'd it before, but then I squashed it with the previous patch | 18:22 |
pabelanger | looking | 18:22 |
mordred | pabelanger: and https://review.openstack.org/#/c/501001/ to go along with it | 18:23 |
jeblair | mordred: yeah, the problem with catching that at validation is that we don't know where to look. we'd have to either ask the merger for a list of *all* files, or have two round trips to the merger. | 18:24 |
mordred | jeblair: nod | 18:25 |
Shrews | jeblair: mordred: i was not aware of websocket test failures. example? | 18:26 |
mordred | Shrews: it's a sporadic failure | 18:26 |
mordred | Shrews: one sec- lemme find one | 18:27 |
jeblair | mordred, pabelanger, clarkb, jlk, tobiash, dmsimard, Shrews, anyone else: how about we take a look at mordred's migration code, and maybe convene around 20:00 utc and work on a plan for next steps / ptg? does that time work for folks? alternate suggestions? | 18:27 |
Shrews | mordred: didn't you submit a ipv6 patch for something yesterday? | 18:27 |
mordred | Shrews: http://logs.openstack.org/92/500592/6/gate/tox-py35-on-zuul/679b5c4/testr_results.html.gz | 18:27 |
mordred | Shrews: I did, and it has landed - and it still doesn't work :( | 18:27 |
mordred | jeblair: ++ I'll be there | 18:27 |
mordred | Shrews: test_websocket_streaming is the only failure in that log that's relevant | 18:28 |
*** hasharAway has quit IRC | 18:28 | |
*** hashar has joined #zuul | 18:29 | |
Shrews | mordred: any idea when this started happening? | 18:29 |
mordred | Shrews: yah - when we added IPv6 enabled test nodes into the v3 nodepool | 18:30 |
mordred | Shrews: best I can tell it only happens when we run those unittests on one of them | 18:30 |
pabelanger | jeblair: wfm | 18:30 |
dmsimard | jeblair: where is the migration code ? | 18:30 |
clarkb | jeblair: yes that should work for me, though I doubt I will be able to look at the code much. | 18:31 |
clarkb | good news is the saga of the plumbing and washing machine is almost over \o/ | 18:31 |
Shrews | mordred: hrm, seems more zuul_stream related (possibly). no logging output whatsoever | 18:31 |
Shrews | fun | 18:31 |
jlk | jeblair: sounds right | 18:33 |
Shrews | or finger server... hrm | 18:33 |
jeblair | dmsimard: mordred just pushed it up -- 491805 | 18:35 |
Shrews | i'm betting socketserver may not be ipv6 friendly | 18:36 |
Shrews | address_family = socket.AF_INET | 18:38 |
mordred | Shrews: gah. I thought I got all of those | 18:38 |
Shrews | likely culprit... seeing how to override | 18:38 |
Shrews | mordred: this is in socketserver itself | 18:38 |
mordred | Shrews: oh - that's in socketserver itself? | 18:38 |
mordred | nod | 18:38 |
Shrews | mordred: specifically, https://github.com/python/cpython/blob/3.4/Lib/socketserver.py#L415 | 18:40 |
tobiash | jeblair: I'll be around | 18:40 |
Shrews | mordred: so, we can override that value... but AF_INETV6 would not support ipv4, right? not sure how to tell which we should use | 18:41 |
mordred | Shrews: AF_INET6 supports both | 18:42 |
mordred | Shrews: you can grep in zuul source for a few places we use it | 18:43 |
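The fix being discussed, overriding `address_family` on the server class, can be sketched as below; on dual-stack hosts an `AF_INET6` socket bound to `::` also accepts IPv4 connections (provided `IPV6_V6ONLY` is off, which is the common Linux default). Class names here are illustrative, not the actual finger-streamer code:

```python
import socket
import socketserver

class DualStackServer(socketserver.ThreadingTCPServer):
    # socketserver defaults to AF_INET (IPv4 only); switching to
    # AF_INET6 and binding to '::' accepts both v4 and v6 clients
    # on dual-stack hosts.
    address_family = socket.AF_INET6
    allow_reuse_address = True

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Echo one chunk back to the client, just to have a handler.
        self.request.sendall(self.request.recv(1024))
```

Usage would be along the lines of `DualStackServer(('::', 0), EchoHandler)`, where port 0 picks a free port.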
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Also support IPv6 in the finger log streamer https://review.openstack.org/502137 | 18:46 |
Shrews | let's see what that gets us | 18:46 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Also support IPv6 in the finger log streamer https://review.openstack.org/502137 | 18:57 |
dmsimard | jeblair, mordred: added two small comments to mordred's patch, it's hard for me to review without seeing the end result -- I'll test with the provided bash script and report back if I see anything weird | 18:59 |
dmsimard | didn't see anything shocking | 18:59 |
mordred | dmsimard: sweet. I've got an update/followp coming in just a sec - so I'll go look at your comments before I push it up | 19:00 |
Shrews | mordred: where is your zuul/mapping.yaml file referenced in 491805? | 19:02 |
dmsimard | Shrews: I think that's where the mapping ends up being generated ? /me looks again | 19:03 |
dmsimard | ah, nope, the file is expected to be there indeed | 19:06 |
dmsimard | Shrews: https://review.openstack.org/#/c/491804/5/zuul/mapping.yaml | 19:06 |
tobiash | mordred: I didn't find the generated playbooks in the build result | 19:07 |
dmsimard | tobiash: I see the new layout in zuul.d/99converted.yaml but I don't see the actual jobs | 19:10 |
tobiash | dmsimard: yeah, also just noticed that | 19:10 |
dmsimard | would jobs be "compiled" at runtime and this is just a migration of the layout ? | 19:10 |
tobiash | how is it supposed to be distributed over the repos at the first try? | 19:11 |
dmsimard | tobiash: it's not afaik | 19:11 |
tobiash | so all will be in project-config at first? | 19:11 |
dmsimard | tobiash: it will live in project-config until projects start migrating over ? | 19:11 |
mordred | tobiash: we are not generating the playbooks yet | 19:16 |
mordred | that's one of the next things needed | 19:16 |
mordred | (we have a bunch of the code for that already in 2.5 that we can copy-pasta in) | 19:16 |
dmsimard | mordred: I wonder if we should prefix or suffix jobs that have been automatically migrated | 19:16 |
mordred | dmsimard: aha - but we do! :) | 19:17 |
mordred | dmsimard: legacy-{name}-ansible-func-centos-7 ... for instance | 19:17 |
dmsimard | mordred: yeah I saw legacy but it's not everywhere so I thought it was something else | 19:17 |
mordred | dmsimard: we have a bunch of places where we have already defined new jobs for tolks | 19:18 |
mordred | so people who were using gate-nova-python27 before just get "tox-py27" | 19:18 |
dmsimard | mordred: so you ended up defining nodesets after the node names just like we discussed ? | 19:18 |
mordred | but if we don't have a nice new job to migrate you to (that's what the mapping file is for) | 19:18 |
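The mapping behaviour mordred describes (rename to a known-good new job when one exists, otherwise fall back to a `legacy-` copy) could look roughly like this; the mapping format and helper name are guesses for illustration, not the actual migration-script code:

```python
def map_job_name(v2_name, mapping):
    """Map a v2 job name to its v3 equivalent.

    `mapping` is a dict of known v2 -> v3 renames (e.g. loaded from
    something like zuul/mapping.yaml); anything unmapped becomes a
    'legacy-' job, with the old 'gate-' prefix dropped.
    """
    if v2_name in mapping:
        return mapping[v2_name]
    name = v2_name
    if name.startswith('gate-'):
        name = name[len('gate-'):]
    return 'legacy-' + name
```

With `{'gate-nova-python27': 'tox-py27'}` as the mapping, a mapped job gets its shiny new name and everything else becomes `legacy-...`.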
mordred | dmsimard: oh - yah - need to write that patch, but yes, that idea makes this quite nice | 19:19 |
tobiash | mordred: looks like the jobs used in 99converted are also sill missing? | 19:19 |
mordred | tobiash: yes indeed | 19:19 |
dmsimard | mordred: yeah I don't see the nodesets but I see the intent of using that | 19:19 |
mordred | tobiash: the job+playbook content will come next | 19:19 |
dmsimard | I think jeblair was against the idea though | 19:19 |
tobiash | ah, ok | 19:19 |
dmsimard | or maybe it was about mixing label and name without requiring the other | 19:19 |
dmsimard | I forget | 19:19 |
mordred | dmsimard: that's a different thing- but the thing you just mentoined, defining some nodesets that match each of our labels - and also the old v2 multi-node labels | 19:20 |
mordred | makes for a nice transition | 19:20 |
mordred | I'll get that patch up in just a sec | 19:20 |
dmsimard | wfm | 19:20 |
mordred | WOOT | 19:24 |
mordred | " | 19:24 |
mordred | Accepted python3.5 into zesty-proposed. The package will build now and | 19:24 |
mordred | be available at | 19:24 |
mordred | https://launchpad.net/ubuntu/+source/python3.5/3.5.3-1ubuntu0~17.04.0 in | 19:24 |
mordred | a few hours, and then in the -proposed repository. | 19:24 |
mordred | clarkb, jeblair, pabelanger, Shrews, SpamapS: ^^ | 19:24 |
jeblair | yay! | 19:28 |
jeblair | dmsimard, mordred: i am all for doing the convenience nodeset definitions (they should be in project-config) | 19:29 |
jeblair | mordred: what things require ordereddict? (i'm concerned about the stuff going into zuul/lib/yamlutil) | 19:37 |
mordred | jeblair: if you don't use ordereddict the resulting file is illegible | 19:38 |
mordred | jeblair: happy to just copy-pasta all of that into migrate.py though | 19:38 |
jeblair | mordred: well, i was mostly looking at ordered_load, which would all be on the input side. what makes it all the way through the system ordered? | 19:39 |
mordred | jeblair: by illegible, I mean the source data has been maintained in alphabetical order, so scanning through and comparing source data to produced data with produced data being in an arbitrary order is hard to process | 17:39 |
pabelanger | mordred: cool | 19:40 |
jeblair | mordred: oh, you mean: "projects: [nova]" | 19:40 |
mordred | yah. - and also we have a generally accepted practice of having pipeline definitions look like project: name: foo template: - a - b check: ... etc | 17:41 |
jeblair | mordred: we could maybe drop that and re-alphabetize it (for project_name in sorted(self.layout['projects'].keys())) | 19:41 |
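The `ordered_load` being debated is typically the classic PyYAML-plus-OrderedDict recipe; a sketch of that pattern (zuul's actual `yamlutil` may differ, and on modern Pythons plain dicts already preserve insertion order, which weakens the case for keeping it in the library):

```python
import collections
import yaml  # PyYAML (third-party)

def ordered_load(stream):
    """Load YAML preserving mapping key order, so a round-tripped
    layout stays in the same order as the hand-maintained source."""
    class OrderedLoader(yaml.SafeLoader):
        pass

    def construct_mapping(loader, node):
        loader.flatten_mapping(node)
        return collections.OrderedDict(loader.construct_pairs(node))

    OrderedLoader.add_constructor(
        yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
        construct_mapping)
    return yaml.load(stream, OrderedLoader)
```

So `ordered_load('b: 1\na: 2\n')` yields keys in document order (`b`, then `a`) rather than whatever order a plain loader happens to produce.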
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Support IPv6 in the finger log streamer https://review.openstack.org/502137 | 19:42 |
pabelanger | mordred: jeblair: Shrews: SpamapS: so, before we dive into anything: we got another dead worker on ze01, let's upgrade to the -proposed version of python and see if we can reproduce | 19:42 |
jeblair | mordred: okay. i'm going to leave some comments about moving it into migrate and why. | 19:42 |
mordred | jeblair: kk | 19:42 |
mordred | jeblair: I'll do that in my next iteration | 17:42 |
jeblair | mordred: okay, left comment; that's all i have after a quick look, will play with it now | 19:45 |
mordred | jeblair: cool. also, please ignore the fact that the the output is missing a 'jobs' ... fixing that right now :) | 19:46 |
jeblair | pabelanger: probably a good plan | 19:46 |
tobiash | Shrews: have thoughts on 502137 | 19:51 |
Shrews | tobiash: b/c 'localhost' did not work for me | 19:52 |
tobiash | Shrews: ok, that's an argument | 19:52 |
Shrews | dammit. now some sort of race.... "Streamed: Build ID c8ace3ccb68d44198522a428ac1440b3 not found" | 19:57 |
dmsimard | Why are there nova patches in the v3 check queue ? | 20:02 |
dmsimard | Also, it seems like there's a glitch in the UI where you can't expand the box to see queued jobs if none of the jobs have started yet ? | 20:03 |
clarkb | o/ here for 2000UTC convening | 20:05 |
jlk | o/ same | 20:05 |
jeblair | pabelanger, Shrews, tobiash, mordred, dmsimard: ping ^ | 20:05 |
dmsimard | pong | 20:05 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add migration tool for v2 to v3 conversion https://review.openstack.org/491805 | 20:06 |
mordred | jeblair: heya | 20:07 |
jeblair | dmsimard: yeah, that's the thing that mordred wanted to fix with a ui patch (but it ended up making it harder for us to diagnose issues). we should still think about ways to improve it, but probably later at this point. in short, every change that *might* be enqueued *is* enqueued at least long enough for zuul to determine whether it should be. that's when it shows up there, and it may stay there with no jobs until the merger comes back with ... | 20:07 |
jeblair | ... information on which jobs it should run. | 20:07 |
tobiash | o/ | 20:07 |
dmsimard | jeblair: ah somehow I thought it was a merger queue thing so I wasn't entirely wrong | 20:07 |
dmsimard | ok, let's look at it later :) | 20:08 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add UPPER_CONSTRAINTS_FILE file if it exists https://review.openstack.org/500320 | 20:08 |
jeblair | how about we start real quick by going over the etherpad from earlier: https://etherpad.openstack.org/p/zuulv3-pre-ptg | 20:08 |
jeblair | we're really close on devstack jobs. i think there's a little more work to go into the v3 native job, and then we'll be ready to point people at it to start building new simple devstack jobs. | 20:09 |
jeblair | it's probably close enough at this point not to consider it a migration blocker. | 20:10 |
jeblair | that sound right to folks? | 20:10 |
mordred | agree | 20:10 |
clarkb | is the devstack job running tempest? | 20:10 |
mordred | we can emit jobs for devstack-legacy for the most of them | 20:10 |
clarkb | (I want to say the log I saaw did run tempest) | 20:11 |
jeblair | clarkb: not yet | 20:11 |
jeblair | clarkb: not the v3 native one (devstack-legacy, yes) | 20:11 |
clarkb | ah ok | 20:11 |
dmsimard | jeblair: devstack doesn't worry me too much, it's the non-devstack things that do | 20:12 |
tobiash | was large scale dynamic layout reconfiguration tested already? | 20:12 |
dmsimard | jeblair: especially deployment projects like puppet-openstack, openstack-ansible, kolla, tripleo.. | 20:12 |
jeblair | tobiash: no, only static configuration loading | 20:12 |
jeblair | tobiash: and that with a null configuration | 20:14 |
jeblair | okay, next on the list -- jobs with special slaves: | 20:14 |
jeblair | https://etherpad.openstack.org/p/zuulv3-special-jobs | 20:14 |
jeblair | mordred, pabelanger: what's left from that list? | 20:15 |
mordred | jeblair: there's 4 release jobs | 20:15 |
mordred | jeblair: npm-upload, jenkinsci-upload, mavencentral-upload and forge-upload | 20:16 |
jlk | oh crap, I need to finish up the translation-update | 20:16 |
jlk | upstream-translation-update | 20:16 |
jlk | and while I'm there, I might be able to do propose-translation-update{suffix} | 20:16 |
mordred | ++ those are the others | 20:17 |
jeblair | what's forge-upload? | 20:17 |
mordred | jeblair: the stack in the middle just need to be handled by the migration script, I'll add those to the mapping file in the next patch | 20:17 |
jeblair | puppetforge | 20:18 |
jeblair | that could probably be more clear in the future :) | 20:18 |
jeblair | (hardly the first thing called forge) | 20:19 |
mordred | right? | 20:19 |
jeblair | jenkinsci and mavencentral are probably not critical to cutover. npm, forge, and translations probably should be considered blockers? | 20:19 |
clarkb | even puppetforge is probably not critical | 20:19 |
pabelanger | jeblair: I'm just retesting publishing to afs, and tagging a release, then good to mark them as done | 20:19 |
clarkb | I don't think we consume any of our puppet modules via puppet forge | 20:19 |
clarkb | so question would be if puppet-openstack does | 20:20 |
jlk | https://review.openstack.org/#/c/499845/ needs to be rebased by me, but could use reviews after. | 20:20 |
pabelanger | jeblair: so, ready to work on next task | 20:20 |
jlk | it's for propose-project-config-update | 20:20 |
jeblair | clarkb: yeah, i was thinking of p-o | 20:20 |
dmsimard | jeblair, clarkb: puppet-openstack used to publish to the forge | 20:20 |
mordred | forge-upload is used by two projects | 20:20 |
dmsimard | but that was back in the day of hodgepoge PTL | 20:20 |
mordred | puppet-httpd-forge-upload and puppet-storyboard-forge-upload | 20:20 |
dmsimard | I don't think we publish anymore | 20:20 |
clarkb | mordred: if it is just those two then I don't think it is criticial to treat that job as a blocker | 20:21 |
jeblair | oh, then i agree we can lower priority of that | 20:21 |
jeblair | what about npm-upload? | 20:21 |
mordred | it's used by openstack/tripleo-ui | 20:21 |
mordred | and that's it | 20:21 |
jeblair | let's call that a blocker? | 20:22 |
mordred | yah - it shouldn't be too hard to translate | 20:22 |
jeblair | okay, so i put npm and translation-upload on the blocker list, and the rest on a new list of things to do right after cutover | 20:23 |
mordred | it runs a script from slave_scripts, so the pattern used in the other proposal jobs can be copied easily | 20:23 |
jeblair | okay, moving down the list (but saving migration script for last): migration docs | 20:24 |
jeblair | i'm happy with where they are at the moment, but i'm sure we will want to update them with info about the migration script once we have it | 20:24 |
jeblair | so let's call this done | 20:24 |
jeblair | the wget breakage is fixed | 20:25 |
jeblair | " docs jobs incorrecly publishing" | 20:25 |
jeblair | pabelanger: i think you just fixed that? | 20:25 |
mordred | yah | 20:25 |
mordred | jeblair: for migration docs - perhaps we should keep an etherpad going at the PTG to jot down notes as folks ask us questions | 20:26 |
jeblair | zuul-cloner shim -- Shrews that's in the base job now, right? | 20:26 |
Shrews | yep | 20:26 |
jeblair | mordred: ++ | 20:26 |
jeblair | configure-mirror parity | 20:26 |
jeblair | there are reviews there; sounds like we're almost done | 20:26 |
jeblair | dmsimard: when those 3 are landed, are we good? | 20:27 |
jlk | mordred: good idea. | 20:27 |
jeblair | (note, i think as part of this that populating /etc/ci is being separated from configure mirror role, which makes a lot of sense) | 20:27 |
dmsimard | jeblair: that should provide backwards compat for the /etc/ci/mirror_info.sh -- would need to compare it or, better, use it in a job that consumes it in order to tell if it's good | 20:28 |
clarkb | dmsimard: I believe the dib cross build jobs make extensive use of it | 20:29 |
jeblair | dmsimard: ack. devstack-legacy uses one line from it | 20:29 |
dmsimard | at first glance the reviews seem okay but I haven't had the chance to review them thoroughly yet | 20:29 |
dmsimard | jeblair: I believe tripleo uses that file too, would need to double check | 20:29 |
jeblair | ok. so still a blocker but likely we can wrap it up today | 20:30 |
jeblair | new servers | 20:30 |
jeblair | pabelanger: ^ what's the status there? | 20:30 |
dmsimard | are we keeping the same set of zuul mergers ? is there anything we are taking the opportunity to reinstall/install in xenial ? | 20:30 |
clarkb | dmsimard: I think we need to update to xenial because of python3.5 | 20:31 |
pabelanger | jeblair: I'd like to stand up nb01/nb02 shortly | 20:31 |
clarkb | the type annotations won't work as is in < 3.5 | 20:31 |
dmsimard | so we have 8 new mergers to stand up ? | 20:31 |
pabelanger | jeblair: still need to work on zuul-executor / zuul-mergers | 20:31 |
jeblair | okay; we don't strictly need to do nb01/nb02 | 20:32 |
clarkb | for the mergers I expect that we can rollback to v2 on xenial if we have to and just replace the existing set with xenial nodes | 20:32 |
jeblair | should we consider deferring that to save time+quota? | 20:32 |
jeblair | clarkb: that sounds reasonable | 20:32 |
*** hashar has quit IRC | 20:32 | |
pabelanger | clarkb: oh, so just upgrade not new servers | 20:33 |
pabelanger | I do think we need to do some puppet work for puppet-zuul on mergers | 20:33 |
clarkb | they are largely stateless (if you wave your hands around v2 needing to fetch from them) so easy enough to make rollback be redeploy on trusty as well | 20:34 |
jlk | I assume once in production we'll have a better idea of how to balance executors vs pure mergers | 20:35 |
jeblair | jlk: ++ | 20:36 |
jeblair | clarkb: what would you recommend we do? | 20:36 |
clarkb | jeblair: re mergers? maybe split the current 8 in two and have 4 trusty for easy rollback and 4 xenial for v3 | 20:37 |
jeblair | that works for me. we can also disable the merge-check pipeline temporarily to reduce load on them. | 20:37 |
jeblair | pabelanger: sound good ^? | 20:37 |
jeblair | if not, you can work it out later :) | 20:39 |
jeblair | last thing i just added to the list and can't believe i forgot -- logstash emitter | 20:39 |
jeblair | i started some prep for this a couple weeks ago | 20:39 |
jeblair | my plan is to have a trusted post-playbook in base that submits a background job to the logstash gearman queue | 20:40 |
dmsimard | emitter reminds me of the stuff like firehose/openstack-health, we don't need to do anything special for those ? | 20:40 |
jeblair | what it emits will be compatible with what we currently emit | 20:40 |
jeblair | basically, it'll be a copy-pasta of the jenkins-log-client code into a post-playbook, skipping the zmq step | 20:41 |
mordred | ++ | 20:41 |
jeblair | dmsimard: largely to assist with this, zuul emits no firehose yet | 20:41 |
pabelanger | jeblair: clarkb: sure, wfm | 20:41 |
dmsimard | ok. | 20:42 |
jeblair | regarding openstack-health, i think the subunit processor is hooked up to the logstash system | 20:42 |
clarkb | ya its the same sort of job submission with different parameters | 20:42 |
clarkb | should be straightforward to solve both of those together | 20:42 |
jeblair | ok, so probably our post playbook will submit 2 jobs | 20:43 |
mordred | jeblair: and the log-gearman-client.py, once zmq is done, only depends on gear, which is installed on the executor so should be piece of cake | 20:43 |
jeblair | ya | 20:43 |
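The jenkins-log-client pattern being copied boils down to submitting a background gearman job whose payload tells the logstash workers which build files to fetch and index. A stdlib-only sketch of building such a payload; the job name, field names, and `gear` usage in the comment are illustrative, reconstructed from the description above rather than copied from the real client:

```python
import json

def build_log_push_payload(build_uuid, log_url, source_files):
    """Construct the arguments for a background gearman job asking
    the logstash workers to fetch and index a build's log files.
    With the `gear` library this would be submitted roughly as:
        client.submitJob(gear.Job(b'push-log', payload),
                         background=True)
    so the post-playbook doesn't block on indexing.
    """
    return json.dumps({
        'build_uuid': build_uuid,
        'log_url': log_url,            # base URL the workers fetch from
        'source_files': source_files,  # e.g. ['job-output.txt']
        'retry': False,
    }).encode('utf-8')
```

A second, similarly shaped job with different parameters would cover the subunit/openstack-health submission mentioned above.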
jeblair | i think this will not take long to write; i am unsure whether i can also get it safely merged into the base job today. | 20:44 |
jeblair | i'll try to get as far as i can, but of the things so far, this seems most likely not to happen by EOD today | 20:45 |
jeblair | okay, last thing -- migration script | 20:45 |
jeblair | it runs! | 20:45 |
jeblair | and outputs 19.5 kilolines of configuration! | 20:45 |
jlk | egads | 20:45 |
jeblair | which doesn't include the job definitions yet :) | 20:45 |
clarkb | what is it if not job definitions? | 20:46 |
jeblair | (that's actually 600 more than the current layout.yaml, interestingly enough) | 20:46 |
clarkb | oh right layout | 20:46 |
jeblair | clarkb: project definitions (ie, what's in current layout.yaml) | 20:46 |
jlk | current makes more use of templates, no? | 20:46 |
jeblair | clarkb: but the "jobs" section with all the crazy regex stuff is gone, folded into the project-pipeline definitions | 20:47 |
jeblair | so it is slightly longer and *much* more readable | 20:47 |
jeblair | - legacy-grenade-dsvm-redis-zaqar: | 20:47 |
jeblair | voting: false | 20:47 |
jeblair | for instance ... can you guess whether that job is voting? :) | 20:47 |
jeblair | in v2 the answer is, no, you can not guess. | 20:47 |
mordred | \o/ | 20:48 |
jeblair | anyway | 20:48 |
jeblair | mordred: do you have incomplete code to do job configuration yet, or are we at the start of that? | 20:48 |
mordred | jeblair: the existing 2.5 code | 20:49 |
mordred | jeblair: so - no, I don't have it pulled out or stitched in to the migration script yet - but that was going to be the starting place | 20:50 |
jeblair | mordred: right, so we have that to do, as well as formulating the 'job:' configuration stanza | 20:50 |
jeblair | and somewhere in there we need the role to do ZUUL_ variable compatability | 20:50 |
mordred | yup | 20:51 |
jeblair | the change listed some other todos | 20:51 |
mordred | there is a todo at the top of the migration script in the comments, fwiw | 20:51 |
mordred | yah | 20:51 |
jeblair | shared job queues | 20:51 |
jeblair | also, filters from the jobs section | 20:51 |
mordred | yah - and continuing to add various mapping information to mapping.yaml | 20:51 |
jeblair | mordred: oh, so i guess the current voting: settings just came from '-nv' ? | 20:51 |
mordred | jeblair: yes. that is right | 20:52 |
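The `-nv` convention mordred confirms can be expressed as a tiny helper: in v2 a job's voting status was buried in jobs-section regexes, while the migration emits an explicit `voting: false` for anything named with the non-voting suffix. A simplified sketch, not the script's actual logic:

```python
def v2_job_is_voting(job_name):
    """In the v2 layout, jobs whose names end in '-nv' are
    non-voting by convention; everything else votes unless a
    jobs-section regex said otherwise (not modelled here).
    """
    return not job_name.endswith('-nv')
```

This is exactly why jeblair's example reads better in v3: the converted output carries `voting: false` on the project-pipeline entry instead of making the reader guess from the name and the regex soup.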
*** olaph has quit IRC | 20:53 | |
jeblair | mordred: this is great progress and i can see it coming together, but i feel like we probably have a couple of days work yet until we get to what we'd consider final output stage. would you concur? | 20:53 |
*** olaph has joined #zuul | 20:53 | |
mordred | jeblair: maybe? the migration/mapping stuff actually goes fairly quickly - but it's also 4pm on a friday here, so it's tough to say | 20:54 |
pabelanger | mordred: Hmm, release jobs are broken | 20:55 |
jeblair | based on the other stuff i still need to do, i probably personally can't pick up a migration script task until the plane flight on sunday | 20:55 |
pabelanger | mordred: I think mirror-workspace-git-repos is the issue | 20:55 |
pabelanger | as it doesn't checkout tag? | 20:55 |
jeblair | pabelanger: can we hold that for a minute? | 20:56 |
pabelanger | jeblair: sure | 20:57 |
jeblair | so here's a strawman -- personally, i don't think we're going to be in a position to cut over monday morning; i think it's more likely that we'll finish up the migration script and continue to flesh things out early in the week and we can perhaps attempt cutovers mid or late next week. | 20:58 |
mordred | jeblair: yah- that's kind of what it feels like to me too - we're able to make extreme progress when we're all heads-down on it, so knocking out the final things while we're in the room will likely work quite well | 20:59 |
dmsimard | FWIW I've put name on some risks at the bottom of the pad | 21:00 |
jlk | that's how it sounds to me as well | 21:00 |
dmsimard | And I also believe that doing this *tomorrow evening* is risky, everyone will be travelling | 21:00 |
jlk | I'd rather see us work hard on getting things right, rather than doing the cut over badly and firefighting frantically for the few days after | 21:00 |
mordred | ++ | 21:00 |
jeblair | dmsimard: ya thanks that helps a lot | 21:00 |
dmsimard | jeblair: I'm trying to de-risk things that are uncharted territory right now | 21:01 |
dmsimard | but my time is short, won't have time to hack on things tonight or tomorrow as I prep the family for the time I won't be around | 21:01 |
clarkb | ya and juggling knowing who is around when is hard when its all travel time | 21:02 |
mordred | at the ptg we'll also be in a decent position to do some targeted trial runs of some of the unknowns with folks - dmsimard mentioned non-devstack integration tests as an example | 21:02 |
pabelanger | I'll be working late today | 21:02 |
pabelanger | but travel in the morning to PTG, but around all after Saturday / Sunday | 21:02 |
dmsimard | what mordred said, I would very much like to de-risk the migration by opting in higher risk projects first gradually, somehow | 21:02 |
jeblair | dmsimard: we have to do the cutover all at once, but we *can* start running check jobs on any project we want | 21:03 |
mordred | we're pretty sure that our base jobs and ZUUL_ vars role should put things in the right state for those, but won't know until we try one | 21:03 |
dmsimard | jeblair: yes, sure, I'd basically like to try and run real "migrated" v3 jobs on projects ahead of the cutover | 21:03 |
mordred | luckily we can generate some, put a few into a .zuul.yaml and run a check job on them | 21:04 |
mordred | but for many of these projects we don't know how to assess if they are behaving properly, or if they aren't, what isn't happening that should | 21:04 |
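A dry run of the sort mordred describes would mean dropping a generated job into a project's `.zuul.yaml` check pipeline. A minimal hypothetical sketch — the job name is a placeholder, not an actual migrated job:

```yaml
# Hypothetical in-repo .zuul.yaml: run one migrated legacy job in check
# ahead of the cutover, without touching the central config.
- project:
    check:
      jobs:
        - legacy-example-dsvm-job
```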
clarkb | maybe instead of attempting first cutover on saturday we use daytime sunday to prep and get check jobs running in more places? | 21:05 |
dmsimard | mordred: I have experience and/or contacts in all the deployment projects to help | 21:05 |
clarkb | I think a lot of us arrive on saturday evening ish time so potentially sunday is good sit down and crank things out time | 21:05 |
mordred | clarkb: agree. fungi and I will be in tc/board meeting, but I usually laptop-hack in those anyway :) | 21:06 |
dmsimard | Yeah I'll be there sunday | 21:06 |
dmsimard | tristanC too | 21:06 |
jeblair | clarkb: i'm not expecting us to be in a position to even do trial cutovers saturday | 21:06 |
mordred | I get in saturday evening | 21:06 |
dmsimard | mordred: from China ? | 21:06 |
mordred | dmsimard: no - wound up not going to china | 21:06 |
dmsimard | Oh, ok, no jet lag then | 21:06 |
mordred | jeblair: yah- I think clarkb was suggesting we not try to do trial cutovers on saturday | 21:07 |
jeblair | so i think we just keep plugging away at this and whenever the migration script gives us output we can use, we start throwing some of it at zuulv3. but i don't think we should plan on any 'scheduled' events this weekend. | 21:07 |
dmsimard | Agreed | 21:07 |
mordred | agree | 21:08 |
dmsimard | Let's make as much progress to de-risk the migration as much as we can and see if it's doable thursday ? | 21:08 |
dmsimard | Cause, you know, read only friday | 21:08 |
jeblair | the last question i have is about notification/communication. the last thing we said about this to the dev community was 1 month ago, and we said we'd (probably) cutover monday. normally, we would have sent some reminder announcements, but the situation hadn't really gotten any clearer. do we want to send something out now? if so, what? | 21:09 |
jeblair | or do we want to cutover and say "surprise!" | 21:09 |
mordred | I definitely don't think we should attempt to throw the switch on friday :) | 21:09 |
mordred | jeblair: heh | 21:09 |
mordred | I think sending out an update is a good idea | 21:09 |
dmsimard | An update to say it's not happening either saturday or monday is the minimum | 21:10 |
mordred | something along the lines of "we're close, but we're not satisfied with the migration/cutover and are going to work on it for a few more days while we're in denver. blah blah safe than sorry see you soon kthxbai" | 21:10 |
clarkb | mordred: ++ | 21:10 |
jeblair | maybe also include "zuul might start leaving info on patches before then. we'll let you know when it happens. read this page: https://docs.openstack.org/infra/manual/zuulv3.html attend the ptg session on monday." | 21:11 |
clarkb | especially since I know a lot of people are wondering when they will get first crack at zuulv3 | 21:11 |
jlk | We should probably send something that outlines what we've discussed today, and set expectations that there is high chance of a cutover next week. | 21:11 |
mordred | jeblair: there is also a chunk of time monday scheduled in a room for some value of "us" to talk to some value of "people" about v3 and what it means | 21:11 |
mordred | jlk, clarkb: ++ | 21:11 |
jlk | or what mordred said | 21:11 |
jeblair | mordred: you want to draft that up? | 21:11 |
mordred | jeblair: yah - I can take a swing at it real quick | 21:11 |
dmsimard | jeblair: yes, something about potentially getting non-voting reviews from zuul as we dry run things could be good | 21:11 |
mordred | "if you start seeing comments from zuul, don't freak out - unless you want to" | 21:12 |
jeblair | okay, i think we have a Plan(tm) | 21:12 |
jeblair | anything else? | 21:12 |
Shrews | and possibly note that we are still holding "office hours" to discuss v3 things? | 21:13 |
jlk | We've got budget for the cut-over celebration right? | 21:13 |
jeblair | Shrews: ++ | 21:13 |
mordred | jlk: there's always budget for that | 21:13 |
jlk | hopefully it's before Thursday evening. | 21:13 |
jeblair | jlk, mordred: free booze! *now* i'm motivated! | 21:13 |
Shrews | free booze AND free steak is better motivation. just sayin'... | 21:14 |
jlk | I'm happy to donate my entire per diem for that day to the cause | 21:14 |
mordred | we may also want to ponder, before we show up, that there are a bunch of future/post-ptg zuulv3 related things that people are going to want to discuss while we're in denver | 21:14 |
jeblair | mordred: yes, and unfortunately, we still have a self-generated backlog of weeks/months | 21:15 |
mordred | and think about how to allocate some amount of time to that so people don't explode, but not so much that we stall on the last mile of rollout | 21:15 |
pabelanger | So, is there an updated list of things we need to focus on? Or has https://etherpad.openstack.org/p/zuulv3-pre-ptg been updated? | 21:15 |
mordred | it's also possible the time for that will just be at the bar in the evening :) | 21:15 |
jeblair | mordred: yeah. if we can collect use cases, etc, that will be good. if we can also convey it'll probably take a bit to get around to them on account of we still need to do some basic things that would be great | 21:16 |
jeblair | pabelanger: that's current | 21:17 |
jlk | probably need to make clear that cutover != v 3.0 release, right? | 21:17 |
jeblair | jlk: indeed | 21:17 |
jlk | that we still have some things to finish before calling it 3.0 | 21:17 |
pabelanger | okay, I've added my name to some tasks | 21:18 |
pabelanger | but release jobs appear broken, so making note | 21:18 |
Shrews | 3.0a alpha release | 21:19 |
jeblair | Shrews: i don't think we're even there yet | 21:19 |
jlk | 3.lol | 21:19 |
jeblair | more like that yeah | 21:19 |
Shrews | i will call it 3p0 to myself, b/c i find humor in that | 21:20 |
* Shrews must do dinner things now | 21:20 | |
pabelanger | I'm relocating as ansiblefest is over, going to get food then get back online | 21:20 |
jeblair | it's sort of like the opposite of tex version numbers. we started at 3.lololololololol, and we're at 3.lolol, almost down to 3.lol. :) | 21:21 |
jlk | soon enough, it'll be 3.yolo | 21:21 |
jeblair | friday | 21:21 |
jlk | (that's when the drinking begins) | 21:21 |
mordred | hah | 21:21 |
* jeblair puts away flask | 21:21 | |
jeblair | okay, i'm going to give my chair a break, then back to logstash things | 21:22 |
mordred | woot! | 21:22 |
mordred | jeblair: before you do logstash things ... | 21:22 |
jeblair | thanks everyone! | 21:22 |
jlk | cheers. I'm out for a bit too, dog needs a walk | 21:22 |
mordred | jeblair: I may have another zuul scheduler issue for you | 21:22 |
jeblair | too late already stood up | 21:22 |
mordred | jeblair: https://review.openstack.org/#/c/500365/ is not running any jobs | 21:23 |
mordred | jeblair: nor is its ancestor | 21:23 |
mordred | jeblair: the job content itself isn't a big deal - but zuul ignoring the uploads and the rechecks is potentially worrisome - I looked briefly and didn't see anything immediately | 23:23 |
jeblair | mordred: ack; i'll check logs in a bit | 21:24 |
mordred | jeblair, clarkb, jlk, pabelanger, Shrews, dmsimard: https://etherpad.openstack.org/p/6NTJxufjNU how's that look? | 21:37 |
pabelanger | looking | 21:38 |
mordred | jlk: if you get bored and run out of other things to do after the translation proposal things, https://review.openstack.org/502185 adds the secret needed for the npm upload job, so anybody can take the ball and run with that one too | 21:39 |
pabelanger | mordred: minor change, looks good | 21:41 |
mordred | pabelanger: ++ | 21:41 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Update roadmap in the README https://review.openstack.org/502188 | 21:52 |
mordred | pabelanger: so what's up with the release jobs? | 21:57 |
pabelanger | mordred: let me find log, but I don't think we were on the right tag, we ran against master | 21:58 |
mordred | nod | 21:58 |
pabelanger | mordred: http://logs.openstack.org/da/0a0162e080f09d5593effd4fcb837c3554d014da/release/release-openstack-python/44c7a5b/job-output.txt.gz#_2017-09-08_21_04_03_780620 | 21:59 |
mordred | pabelanger: also - I was just scanning through the project-template section of zuul/layout.yaml and I think there are a few more templates we could define based on content we have now - and a couple that we might need to add (there is an openstack-server-release-jobs that builds and uploads a tarball like normal but does not publish to pypi) | 22:00 |
mordred | pabelanger: cool, looking | 22:00 |
pabelanger | mordred: ya, happy to look and get the templates working | 22:01 |
mordred | pabelanger: if you happen to look at the project-template list, adding stuff as we go to that mapping file will, I think, help us not forget things (I know I've already lost track of a few of the things we have myself :) ) | 22:03 |
mordred | pabelanger: that looks like something EXTRA weird with Determine Local Head | 22:04 |
pabelanger | mordred: let me look at your migration script first | 22:04 |
pabelanger | so to get up to speed | 22:04 |
jeblair | mordred: msg looks good! | 22:06 |
jeblair | pabelanger: if you can make that release job run with keep enabled, that would be great. we can look at the local repo state on the executor. | 22:08 |
mordred | yah ... I can't find anything that would cause that output | 22:09 |
mordred | although it does make me want to change something in the base job ... | 22:09 |
mordred | oh. hah. I already wrote a change for this somewhere | 22:11 |
dmsimard | Would it be worthwhile to plug jeblair's Barcelona Zuul v3 talk ? | 22:17 |
dmsimard | I re-watched it recently to refresh my memory on things :D | 22:17 |
dmsimard | Otherwise message lgtm | 22:18 |
jeblair | dmsimard: heh, i feel like it's a bit dated and lacks practical information; but i'll let you/others decide if it's useful. | 22:18 |
dmsimard | I think it's easy for us to take a lot of things for granted or "common sense", the talk goes into a bit of the background for people unfamiliar with what even zuul v2 would be.. but I don't have a strong opinion, just thinking out loud | 22:20 |
*** olaph1 has joined #zuul | 22:21 | |
mordred | jeblair, pabelanger: https://review.openstack.org/#/c/501242/ would be nice to land - also would let us see the inventory we produced for the borked release job | 22:22 |
*** olaph has quit IRC | 22:23 | |
mordred | jeblair, dmsimard I added a PS to the end with a link to that talk and other things - I can't decide if I like including that section or not | 22:28 |
mordred | clarkb, jlk, pabelanger: ^^ thoughts? | 22:29 |
pabelanger | jeblair: mordred: sure, I can do --keep now | 22:29 |
dmsimard | mordred: worst that can happen is that people don't watch/read/care, best case we get more people on the same wavelength | 22:31 |
dmsimard | Doesn't sound like a bad tradeoff | 22:31 |
jeblair | mordred: we don't have a great way to share a module between two (roles, playbooks, etc) do we? | 22:34 |
dmsimard | jeblair: openshift-ansible has a solution for that | 22:35 |
jeblair | maybe symlinks? otherwise, we'd have to have a role for it, then include_role from 2 other roles? | 22:36 |
dmsimard | One sec | 22:36 |
jeblair | maybe we should handle /library like we handle /roles? | 22:36 |
jeblair | dmsimard: thx | 22:36 |
dmsimard | jeblair: https://github.com/openshift/openshift-ansible/tree/master/roles/lib_openshift | 22:36 |
dmsimard | jeblair: https://github.com/openshift/openshift-ansible/tree/master/roles/lib_utils | 22:36 |
dmsimard | They're basically roles with more or less just a library folder with modules/things in them | 22:37 |
mordred | dmsimard: don't you have to use a role before the module in it is accessible? | 22:38 |
dmsimard | mordred: meta dependencies | 22:38 |
dmsimard | meta dependencies will *run* the role.. but if there's nothing to run.. :) | 22:38 |
mordred | dmsimard: neither of those roles declare depends | 22:38 |
mordred | ah. yah | 22:38 |
dmsimard | mordred: https://github.com/openshift/openshift-ansible/blob/3409e6db205b6b24914e16c62972de50071f4051/roles/docker/meta/main.yml#L13 | 22:38 |
mordred | jeblair: so yah - make a role that just has library in it, then put dependencies: - library-role into meta/main.yaml of the roles that need it | 22:39 |
dmsimard | openshift-ansible does a lot of really cool things, I've sent a few patches their way. awesome stuff. | 22:39 |
mordred | jeblair: but - that's also basically what you were saying with include_role - same thing | 22:39 |
jeblair | cool (the library role will be the "submit a gearman job" and the depending roles will be "submit a logstash job" and "submit a subunit job") | 22:39 |
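The library-role pattern being discussed can be sketched as follows, using the role names jeblair mentions; the file names and module name are hypothetical stand-ins, not actual zuul-jobs content:

```yaml
# Sketch of the openshift-ansible pattern: a "library" role that ships
# only modules, pulled in via a meta dependency. The dependency "runs"
# the library role -- a no-op, since it has no tasks -- and makes its
# modules available to the depending role's tasks.
#
# roles/
#   submit-gearman-job/             <- library-only role
#     library/
#       submit_gearman_job.py       <- hypothetical shared module
#   submit-logstash-job/
#     meta/main.yaml contains:
dependencies:
  - role: submit-gearman-job
# tasks in submit-logstash-job can now use submit_gearman_job directly
```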
pabelanger | jeblair: mordred 76fa1c4a60314afe843e0c5cf8c96803 on ze01.o.o was the release job | 22:40 |
jeblair | mordred: yeah, but then i can use the module directly in a task? | 22:40 |
dmsimard | mordred, jeblair: this one is particularly cool, it's more or less the equivalent of our validate-host but on steroids https://github.com/openshift/openshift-ansible/tree/3409e6db205b6b24914e16c62972de50071f4051/roles/openshift_health_checker | 22:40 |
mordred | jeblair: root@ze01:/var/lib/zuul/builds/76fa1c4a60314afe843e0c5cf8c96803/work/src/git.openstack.org/openstack-dev/sandbox# git status | 22:40 |
mordred | Not currently on any branch. | 22:40 |
mordred | jeblair: yes | 22:40 |
jeblair | k. i'll do a local mockup first :) | 22:41 |
dmsimard | And then they even have a callback that prints what the actual failures were: https://github.com/openshift/openshift-ansible/blob/3409e6db205b6b24914e16c62972de50071f4051/roles/openshift_health_checker/callback_plugins/zz_failure_summary.py | 22:41 |
jeblair | mordred: huh, i thought i checked tags | 22:41 |
mordred | jeblair: well- status there is not showing what it does for me locally | 22:42 |
jeblair | that'll do it | 22:42 |
jeblair | git describe | 22:42 |
jeblair | 0.0.23 | 22:42 |
dmsimard | jeblair: happy to review your "zuul-lib" role when you got something going, add me as reviewer | 22:42 |
dmsimard | (btw we might want to move the module from validate-host there) | 22:43 |
mordred | jeblair: what's the state difference - locally if I do "git checkout 0.0.23" | 22:43 |
mordred | git branch shows me: | 22:43 |
mordred | * (HEAD detached at 0.0.22) | 22:43 |
jeblair | yeah me too | 22:44 |
jeblair | .git/HEAD is the same | 22:44 |
jeblair | huh, git status consults .git/logs/HEAD | 22:46 |
mordred | jeblair: so is this because it's cloned directly from /var/lib/zuul/executor-git/git.openstack.org/openstack-dev/sandbox to that ref rather than cloned and then checked out? | 22:47 |
jeblair | mordred: it should be cloned and checked out; i think executor is checking it out differently than git cli | 22:48 |
mordred | \o/ | 22:49 |
mordred | jeblair: so - fwiw, for tag jobs we do have zuul.tag for the main project | 22:50 |
dmsimard | oh, hey, we got a devstack-legacy pass on centos7 but a failure on suse so it's not too horrible | 22:51 |
dmsimard | is devstack on debian a thing | 22:52 |
jeblair | mordred: i think the only viable option is for the executor to make things correct in its work root, and to synchronize that exactly to the remote nodes. i don't want anything bypassing that process. | 22:52 |
dmsimard | ? | 22:52 |
mordred | jeblair: and the normal "find local branch" logic works fine for branches that aren't the one that's tagged | 22:52 |
mordred | jeblair: nod | 22:52 |
pabelanger | dmsimard: no | 22:53 |
pabelanger | dmsimard: centos should pass, I am not sure about opensuse. might want to ask dirk in openstack-infra | 22:53 |
dmsimard | pabelanger: it should, we have it in devstack-gate | 22:53 |
dmsimard | pabelanger: I'll try and see if I can find anything obvious | 22:54 |
dmsimard | in the meantime re-checking a cleaner patch with f26 thrown in | 22:54 |
clarkb | mordred: +2 on 501242 | 22:54 |
clarkb | didn't approve because I am very distracted by home things before having to get on a plane | 22:54 |
pabelanger | dmsimard: systemd failed to start peakmem_tracker | 22:54 |
pabelanger | no idea why | 22:55 |
dmsimard | pabelanger: what, on the suse job ? | 22:55 |
pabelanger | dmsimard: yup | 22:55 |
pabelanger | http://logs.openstack.org/47/502147/2/check/devstack-legacy-opensuse-423-tempest-dsvm-neutron-full/1d87625/logs/devstacklog.txt.gz#_2017-09-08_20_39_49_067 | 22:55 |
jeblair | this is what the executor does: https://etherpad.openstack.org/p/BPndl6rF47 | 22:56 |
jeblair | you can use that to reproduce the weird state | 22:56 |
dmsimard | pabelanger: "['pmap', '-XX', '1']' returned non-zero exit status 1" | 22:57 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Make validate-host read from site-variables https://review.openstack.org/500592 | 22:57 |
mordred | clarkb: thanks! | 22:57 |
pabelanger | dmsimard: this could be devstack bug, I think they did screen removal recently | 22:58 |
pabelanger | dmsimard: I'd check if zuulv2.5 jobs are working | 22:58 |
dmsimard | pabelanger: yeah doesn't look like -X is even an arg for pmap http://www.unix.com/man-page/suse/1/pmap/ | 22:58 |
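As the man page dmsimard links suggests, `-X`/`-XX` are procps-ng extensions that SUSE's pmap lacks; the same peak-memory figures are available portably from procfs. A small sketch of an alternative, not what devstack actually does:

```shell
# Peak virtual size (VmPeak) and peak RSS (VmHWM) straight from /proc,
# which works on any Linux regardless of which pmap is installed.
grep -E '^Vm(Peak|HWM)' /proc/self/status 2>/dev/null \
  || echo "VmPeak unavailable (no Linux procfs)"
```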
jeblair | mordred, pabelanger: maybe we want the executor to do: r.git.checkout('0.0.22') | 22:59 |
mordred | jeblair: maybe because of reset --hard -- to do that after setting the head reference? | 22:59 |
mordred | jeblair: yah - I agree - but it seems we only really need to do that for tags? | 22:59 |
pabelanger | dmsimard: http://logs.openstack.org/47/502147/2/check/gate-tempest-dsvm-neutron-full-opensuse-423-nv/a30d9c4/logs/devstacklog.txt.gz#_2017-09-08_21_00_40_012 failing on zuulv2.5. | 22:59 |
mordred | jeblair: like, we certainly don't want to try to checkout speculative refs :) | 22:59 |
jeblair | mordred: we're already using a different method for branches | 23:00 |
dmsimard | pabelanger: yeah I found https://review.openstack.org/#/c/496301/ which ran today | 23:00 |
mordred | jeblair: cool | 23:00 |
jeblair | mordred: speculative refs are on branches now, so we just checkout the branch in that case | 23:00 |
jeblair | (that's one of the super subtle awesome things about v3 :) | 23:00 |
dmsimard | pabelanger: I'll ping some suse folks on -infra | 23:00 |
pabelanger | jeblair: mordred: will defer to your expertise | 23:00 |
pabelanger | dmsimard: +1 | 23:00 |
mordred | jeblair: so yah - in that case I think r.git.checkout is gonna be more better | 23:01 |
jeblair | oh neat, we can even 'git checkout refs/tags/0.0.22' | 23:01 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Use 'git checkout' when checking out a tag https://review.openstack.org/502195 | 23:07 |
jeblair | mordred, pabelanger, clarkb: ^ we'll need to merge that and restart executors then try the release job again | 23:07 |
pabelanger | +2 | 23:10 |
mordred | jeblair: I like that idea that we can get branches and tags working the same | 23:12 |
jeblair | mordred: yeah, if it were any other day, i would have gone ahead and done that :) | 23:13 |
mordred | jeblair: ++ | 23:15 |
pabelanger | ze01 still has --keep, I've disabled it on ze02 already | 23:29 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Use 'git checkout' when checking out a tag https://review.openstack.org/502195 | 23:29 |
mordred | jeblair, pabelanger: the last couple of times I've restarted the scheduler it has seemed like I needed to restart the executors too | 23:32 |
mordred | jeblair, pabelanger: is that a thing? like, do we need to restart the executors after a full scheduler restart? | 23:32 |
mordred | or am I just too impatient | 23:32 |
jlk | mordred: might be impatient. I haven't seen that in my testing locally | 23:41 |
jlk | mordred: what I _have_ seen from time to time is that the way the VMs work that existing nodepool VMs wouldn't necessarily come back, because they check in at boot time, and since it's not boot time, they don't check in | 23:41 |
mordred | jlk: cool. I'll go with impatient :) | 23:43 |
jlk | mordred: the etherpad looks good, for a mail to go out. | 23:43 |
mordred | jeblair: figured out why the shade changes weren't being tested | 23:43 |
jlk | what time on Monday is the zuul session? | 23:43 |
mordred | jeblair: there was a dependency loop | 23:43 |
mordred | jlk: /me looks scared and hides | 23:43 |
mordred | jlk: it's sometime in the afternoon I think ... not really sure | 23:46 |
jeblair | mordred, jlk 2pm vail | 23:46 |
jeblair | https://ethercalc.openstack.org/Queens-PTG-Discussion-Rooms | 23:46 |
jlk | ah I'll probably miss it. I land at like 1pm pacific time. | 23:46 |
jlk | so I land right when that session is starting | 23:47 |
jeblair | mordred: updated your email etherpad to include that | 23:48 |
jeblair | mordred: what made you think you should restart the executors? | 23:48 |
mordred | jeblair: jobs weren't executing | 23:48 |
mordred | jeblair: but - I think last time it was early in the morning and I wasn't coffeed enough - so I think it should stay in the anecdote bucket | 23:49 |
pabelanger | mordred: I've had success with just scheduler restarts | 23:50 |
mordred | ok. good to know | 23:50 |
pabelanger | mordred: say when to try tagging again | 23:51 |
jlk | I do it a ton in docker so that I can alter code and just restart scheduler | 23:51 |
mordred | pabelanger: I have not restarted anything, but the change has landed | 23:51 |
jeblair | pabelanger: i don't see the fix commit on ze01 yet. we can manually update though if you're ready | 23:51 |
pabelanger | sure, I am ready if you want to manually update | 23:52 |
jlk | hrm, I'm looking at upstream-translation-update, trying to find where it sets the node to proposal | 23:53 |
jeblair | there were 3 zuul-executors running on ze01 again | 23:53 |
jlk | I see it runs a proposal-slave-cleanup though | 23:53 |
jlk | oh I see it, yaml buried the lede there. | 23:54 |
pabelanger | jeblair: what would cause that? | 23:55 |
jeblair | dmsimard: i may have aborted some devstack jobs with a restart just now | 23:55 |
jeblair | pabelanger: someone restarting it incorrectly? | 23:55 |
pabelanger | ok | 23:55 |
jeblair | systemd not doing the one thing it's supposed to be good at? | 23:55 |
dmsimard | jeblair: ok let me know when I can do a recheck | 23:55 |
jeblair | dmsimard: you're good | 23:55 |
pabelanger | jeblair: does that drop back to 1 process when zuulv3 mergers come online? | 23:55 |
jeblair | pabelanger: you can tag now | 23:56 |
pabelanger | k, tagging | 23:56 |
jeblair | pabelanger: well, i mean, we're never supposed to have 3 running. | 23:56 |
jeblair | is not related to mergers | 23:56 |
pabelanger | k | 23:56 |
pabelanger | sandbox 0.0.24 tagged | 23:57 |
jeblair | mordred: it looks like you were the last to restart ze01, can you tell me exactly what you did? | 23:57 |
mordred | jeblair: yah - I did a service zuul-executor stop - then I _believe_ I checked the process list for any python processes (which I do because I have absolutely no trust in our init scripts currently) | 23:58 |
pabelanger | :( | 23:58 |
pabelanger | 2017-09-08 23:57:55.899000 | ubuntu-xenial -> localhost | ERROR: AnsibleUndefinedVariable: 'zuul_traceroute_host' is undefined | 23:58 |
mordred | jeblair: I tend to not issue any starts until I've seen that there are no processes remaining | 23:58 |
mordred | pabelanger: oh good! | 23:58 |
pabelanger | let me see why | 23:59 |
mordred | I just landed the 'use site_variables' change | 23:59 |
jeblair | mordred: that sounds like a good procedure. maybe a straggler just escaped your attention. | 23:59 |
dmsimard | btw thanks #zuul for being patient with me while I learned zuul v3 and devstack/devstack-gate :D | 23:59 |
mordred | jeblair: yah. I really wish it didn't have to be a procedure | 23:59 |
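The stop/verify/start procedure mordred describes boils down to a wait-until-gone loop before issuing the start. Here it is sketched against a stand-in background process rather than the real zuul-executor service:

```shell
# Stand-in for `service zuul-executor stop`: a short-lived background job.
sleep 1 &
pid=$!
# Don't start the service again until no process remains; kill -0 only
# probes for existence, it sends no signal.
while kill -0 "$pid" 2>/dev/null; do
  sleep 0.2
done
echo "pid $pid is gone; safe to start the service again"
```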
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!