openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: Add install and deploy openshift roles. https://review.openstack.org/608610 | 00:40 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: Add install and deploy openshift roles. https://review.openstack.org/608610 | 01:16 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: Add install and deploy openshift roles. https://review.openstack.org/608610 | 01:50 |
*** bhavikdbavishi has joined #zuul | 03:14 | |
*** threestrands has joined #zuul | 05:21 | |
*** threestrands has quit IRC | 05:21 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Add spec for scale out scheduler https://review.openstack.org/621479 | 06:23 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Add spec for scale out scheduler https://review.openstack.org/621479 | 06:24 |
openstackgerrit | Surya Prakash (spsurya) proposed openstack-infra/zuul master: dict_object.keys() is not required for *in* operator https://review.openstack.org/621482 | 06:35 |
*** quiquell|off is now known as quiquell | 07:10 | |
*** pcaruana has joined #zuul | 07:25 | |
*** quiquell is now known as quiquell|brb | 07:40 | |
*** bjackman has joined #zuul | 07:56 | |
bjackman | Just realised I didn't get a reviewer auto-assigned to my Zuul change, should I request one? | 07:59 |
bjackman | I thought tristanC was auto-assigned to my last one but maybe I misunderstood | 07:59 |
tristanC | bjackman: reviewers are not auto-assigned on review.openstack.org | 08:00 |
bjackman | tristanC: OK thanks | 08:00 |
tristanC | bjackman: feel free to assign me, or ask for review here using the change url | 08:01 |
tristanC | ask here* | 08:01 |
bjackman | tristanC: Cheers, I have assigned you | 08:02 |
*** gtema has joined #zuul | 08:07 | |
tristanC | bjackman: thank you for fixing gerrit-2.16 support :) | 08:09 |
bjackman | tristanC: No worries :) | 08:13 |
*** quiquell|brb is now known as quiquell | 08:18 | |
*** jpena|off is now known as jpena | 08:28 | |
*** themroc has joined #zuul | 08:50 | |
*** dkehn has quit IRC | 09:29 | |
*** sshnaidm|off is now known as sshnaidm | 09:43 | |
bjackman | What does the promote operation do? | 10:13 |
*** electrofelix has joined #zuul | 10:18 | |
*** electrofelix has quit IRC | 10:22 | |
*** bhavikdbavishi has quit IRC | 10:23 | |
*** electrofelix has joined #zuul | 10:31 | |
*** bjackman has quit IRC | 10:35 | |
*** bjackman_ has joined #zuul | 10:35 | |
*** bjackman_ is now known as bjackman | 10:35 | |
*** panda|pto is now known as panda | 10:47 | |
*** gtema has quit IRC | 11:02 | |
*** sshnaidm has quit IRC | 11:16 | |
*** sshnaidm has joined #zuul | 11:16 | |
*** sshnaidm has quit IRC | 11:18 | |
*** rfolco has joined #zuul | 11:18 | |
*** sshnaidm has joined #zuul | 11:19 | |
*** quiquell is now known as quiquell|brb | 11:21 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: Add install and deploy openshift roles. https://review.openstack.org/608610 | 11:44 |
*** quiquell|brb is now known as quiquell | 11:45 | |
*** electrofelix has quit IRC | 11:58 | |
*** electrofelix has joined #zuul | 12:03 | |
*** njohnston has joined #zuul | 12:06 | |
*** njohnston has quit IRC | 12:10 | |
*** njohnston has joined #zuul | 12:10 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift resource provider https://review.openstack.org/570667 | 12:13 |
*** jpena is now known as jpena|lunch | 12:34 | |
*** rlandy has joined #zuul | 12:54 | |
*** bjackman has quit IRC | 12:55 | |
*** jpena|lunch is now known as jpena | 13:26 | |
*** dkehn has joined #zuul | 13:44 | |
*** quiquell is now known as quiquell|lunch | 14:01 | |
*** smyers has quit IRC | 14:19 | |
*** smyers has joined #zuul | 14:21 | |
*** gtema has joined #zuul | 14:34 | |
*** quiquell|lunch is now known as quiquell | 14:40 | |
quiquell | AJaeger_, pabelanger: I think openstack/pbrx is missing configparser in requirements.txt | 14:41 |
*** gtema has quit IRC | 14:45 | |
mordred | quiquell: yeah - I think you're right (at least for python2) - if you install it under python3 it should be fine ... but we should add that for sure | 14:51 |
mordred | quiquell: you want to send a patch or want me to? | 14:51 |
quiquell | mordred: do we have to support python2? | 14:52 |
quiquell | mordred: what version of configparser do we need to pin to ? | 14:52 |
mordred | quiquell: I don't think we need to pin it - but we should mark it with ;python_version=='2.7' so it only installs for 2.7 | 14:53 |
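A minimal sketch of the requirements.txt entry mordred is describing; the environment marker is the one he quotes, so the dependency only installs on Python 2.7:

    # requirements.txt
    configparser;python_version=='2.7'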
mordred | quiquell: and yes - for now - there is another utility in pbrx that wants to be used under 2.7 by some people - I think we should likely split that out and make pbrx into two separate tools | 14:54 |
mordred | because I don't actually find python2 support for build-images to be useful :) | 14:54 |
pabelanger | agree, was just going to ask why python2 for image builds at all. But, maybe because of centos-7 | 14:58 |
quiquell | mordred: So maybe it's better to split things, and not add the configparser dep | 14:58 |
mordred | quiquell: yeah. I'll work on that today ... I think the multiple command thing is less useful than it should be | 15:00 |
mordred | pabelanger: indeed ... although I think maybe for now, since pbrx is pretty new and not used in many places, not supporting centos-7 is maybe not the worst thing? | 15:02 |
mordred | especially with 8 right around the corner | 15:02 |
pabelanger | Yup, I'd agree. Would love to only support python3 on rhel-8 / centos-8 | 15:03 |
*** themroc has quit IRC | 15:32 | |
*** hughsaunders has joined #zuul | 15:34 | |
corvus | hughsaunders: you were asking about how nodepool requests are distributed in #openstack-infra... | 15:34 |
hughsaunders | corvus: hey | 15:34 |
hughsaunders | yep | 15:34 |
corvus | hughsaunders: your summary is pretty accurate -- there's almost no communication between the nodepool launchers | 15:35 |
corvus | (they know which other launchers are online, but otherwise, they know nothing about any of the others) | 15:35 |
corvus | lemme dig up a link | 15:35 |
hughsaunders | ok thanks | 15:35 |
corvus | hughsaunders: the algorithm is still basically what's described here (from the original zuulv3 spec): http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html | 15:36 |
corvus | hughsaunders: (search for "A simple algorithm") | 15:37 |
corvus | hughsaunders: the original design goals there were simplicity, reliability, and fault tolerance | 15:37 |
corvus | hughsaunders: i think we'd all love to improve the situation to make it smarter (so that quota and capacity across different providers is taken into account) | 15:38 |
corvus | hughsaunders: i think the trick is that any changes need to take into consideration the behavior of the existing algorithm and ensure that we don't create any new edge cases where something can get wedged, or lose fault tolerance. | 15:39 |
corvus | hughsaunders: or, we could create an entirely new algorithm | 15:39 |
corvus | hughsaunders: either will be acceptable -- the main thing is that the current algorithm doesn't really have a lot of facilities for changes like that, and any changes need to be made with great care | 15:40 |
hughsaunders | corvus: My initial thought was to cause each provider to pause for a few seconds if they can't satisfy a request from ready nodes, or immediately lock it if they can. This gives other providers time to lock the request if they have ready capacity. After the timeout, any waiting thread tries to lock the request. | 15:40 |
hughsaunders | Not pretty, but it is simple :/ | 15:41 |
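A rough Python sketch of the timeout idea hughsaunders describes above; the handler shape and names like have_ready_nodes/try_lock_request are illustrative, not real nodepool APIs:

    import time

    GRACE_SECONDS = 5  # the "few seconds" pause is an assumed value

    def maybe_handle_request(provider, request):
        # Providers that can satisfy the request from already-ready nodes
        # lock it immediately; all others pause first, giving a provider
        # with ready capacity the first chance to lock it.
        if not provider.have_ready_nodes(request):   # illustrative method
            time.sleep(GRACE_SECONDS)
        return provider.try_lock_request(request)    # illustrative method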
hughsaunders | Other thoughts included some kind of auction where providers bid for a request depending on the number of nodes from the request they can fulfil from ready capacity. That would be neat but makes the algorithm much more complex. | 15:41 |
*** themroc has joined #zuul | 15:44 | |
corvus | hughsaunders: i think the timeout approach would work, but may be somewhat fragile -- in other situations we're dealing with nodepool taking many minutes to process the full request list | 15:45 |
corvus | hughsaunders: so i think an approach where launchers can communicate more about themselves might be better... though also.... | 15:45 |
corvus | hughsaunders: i think clarkb was suggesting (for other reasons) that we should have a two-phase system where node requests from zuul are processed in one phase and then any still-needed nodes are put onto a queue individually, so that node creation is less-directly tied to node requests. we may be able to incorporate your case into that too, by making sure that the initial request processor can just return ready nodes regardless of the launcher which supplied them. | 15:48 |
tobiash | corvus: I hope it was ok that I fixed your fixes to zuul (I think they landed already) | 15:49 |
corvus | hughsaunders: the bid idea would also work -- it's actually not too dissimilar from the current algorithm; i think the downside there is that it would essentially run the algorithm twice for each request (every launcher reports on whether it can fulfill requests during the ready phase, and then again afterwards (assuming no ready nodes)) | 15:50 |
corvus | tobiash: yes, thanks! were there any significant errors i should look at, or pep8 stuff? | 15:51 |
tobiash | corvus: just minor logic thing | 15:52 |
hughsaunders | corvus: It could slow things down for the min-ready:0 case, but I would have thought most users of nodepool are interested in min-ready>0. | 15:52 |
tobiash | corvus: https://review.openstack.org/#/c/621319/3/zuul/manager/__init__.py that's what I fixed | 15:52 |
pabelanger | hughsaunders: for ansible-network nodepool, we've stopped using min-ready nodes, mostly for the behavior you are talking about. We default to min-ready:0, which ends up tacking a few more seconds onto each job run. This is also because we pay $$$ for each node online. | 15:53 |
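For reference, min-ready is a per-label setting in the nodepool config; a minimal sketch (the label name is illustrative):

    labels:
      - name: ubuntu-bionic
        min-ready: 0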
tobiash | the filtered list didn't necessarily contain the item | 15:53 |
corvus | hughsaunders: yes, though for example, openstack almost never benefits from min-ready, because we are almost never idle enough to be able to build min-ready nodes. so taking that as an example, a busy system would spend longer running the algorithm twice, so it's sort of opposite-scaling. | 15:53 |
tobiash | corvus: so it needed a search for it and return the index or the list length as a fallback | 15:53 |
hughsaunders | corvus: I also thought about removing this check to see what happens, but I guess it would screw up accounting.. https://github.com/openstack-infra/nodepool/blob/master/nodepool/driver/__init__.py#L427 | 15:54 |
*** lennyb_ has quit IRC | 15:55 | |
*** jhesketh has quit IRC | 15:55 | |
corvus | tobiash: in what case? | 15:55 |
tobiash | in case the item is not yet enqueued | 15:55 |
corvus | tobiash: ah, we do call that before enqueuing, don't we. what about dependent.py? | 15:56 |
*** jhesketh has joined #zuul | 15:57 | |
tobiash | the dependent seemed to work | 15:57 |
corvus | tobiash: shouldn't it have the same problem? | 15:57 |
corvus | tobiash: (getNodePriority is overridden in dependent.py) | 15:57 |
*** lennyb has joined #zuul | 15:57 | |
tobiash | it has its own implementation of getNodePriority, does it? | 15:57 |
*** quiquell is now known as quiquell|off | 15:58 | |
tobiash | corvus: hrm the dependent implementation should throw an error in this case too, so either I lied or that is not happening in gate or is not tested in gate | 16:00 |
corvus | tobiash: maybe the case was when the item wasn't live? | 16:01 |
corvus | (which never happens in dependent) | 16:02 |
tobiash | that could be | 16:02 |
corvus | tobiash: yeah, i don't think getNodePriority is called before enqueuing | 16:02 |
corvus | tobiash: and the tests only failed after the change to filter out non-live items | 16:03 |
tobiash | corvus: makes sense | 16:04 |
corvus | of course -- why is this being called at all for non-live items? | 16:04 |
tobiash | corvus: I think I'll re-analyze this later | 16:04 |
tobiash | that's an interesting question | 16:04 |
corvus | ah, i think it's called unnecessarily inside of processOneItem, but it won't be used | 16:05 |
tobiash | ah yes, it's called before we know that there is no node request | 16:06 |
corvus | i'm going to double check that was the error reported, and if so, i think i might adjust the fix a bit | 16:07 |
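A rough sketch of the kind of guard being discussed; this is not the actual 621626 change, and the names (item.live, getNodePriority) are taken from the conversation above:

    def node_priority_for(manager, item):
        # Only live items ever get node requests, so skip the relative
        # priority calculation for non-live items; the value would never
        # be used for them anyway.
        if not item.live:
            return None
        return manager.getNodePriority(item)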
tobiash | great | 16:12 |
*** themroc has quit IRC | 16:21 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Don't calculate priority of non-live items https://review.openstack.org/621626 | 16:35 |
corvus | tobiash: how about that? ^ | 16:35 |
tobiash | corvus: I had exactly the same in mind | 16:39 |
mordred | tobiash, corvus: I just added the second +2 to that - any reason I should not go ahead and +A? | 16:40 |
corvus | mordred: nope, if it passes gate it's good :) | 16:40 |
tobiash | ++ | 16:41 |
clarkb | corvus: are there other bug fixes on relative priority that need review? or are we good to make another go at it once 621626 is in? | 16:54 |
corvus | clarkb: afaik gtg, and we could do it before 626, it's just a slight inefficiency | 16:54 |
*** gtema has joined #zuul | 16:58 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Extract out common config parsing for ConfigPool https://review.openstack.org/621642 | 17:18 |
*** manjeets has joined #zuul | 17:19 | |
Shrews | corvus: tobiash: tristanC: proposed change to how we deal with driver configs ^^^ | 17:20 |
*** gtema has quit IRC | 17:31 | |
SpamapS | hit a weird problem today with our Zuul | 17:36 |
SpamapS | A developer proposed a PR from origin/foo -> origin/master, then realized they shouldn't do PRs from origin, deleted origin/foo, and re-proposed as them/foo -> origin/master. The shas were the same, and zuul seemed to get confused and marked the post job as RETRY_LIMIT because of a crash | 17:38 |
SpamapS | http://paste.openstack.org/show/736584/ <-- the crash | 17:38 |
tobiash | SpamapS: you mean a *branch* named 'origin/foo'? | 17:40 |
tobiash | SpamapS: we had something similar: https://review.openstack.org/#/c/619245 | 17:42 |
SpamapS | no, the branch was named 'foo' | 17:43 |
SpamapS | I'm just using remote names symbolically to mean "the main repo" | 17:43 |
SpamapS | Also I think we need better docs (or I need to be shown them) on how to use zuul-enqueue-ref because it never does *anything* for me. | 17:44 |
mordred | SpamapS: was the branch actually named "feat/charity" ? | 17:45 |
mordred | SpamapS: also - nice edge case your developer found :) | 17:46 |
clarkb | zuul would've treated origin/foo PR source as merged and load config from that directly? then the new branch on then/foo would be unmerged | 17:47 |
clarkb | I'm guessing its that behavior in particular that has confused zuul? | 17:47 |
mordred | clarkb: the traceback looks more like zuul merger got confused by the branch going away | 17:47 |
clarkb | mordred: ya but only because it treats those branches as special I think? | 17:48 |
pabelanger | Yah, we had an issue in ansible-network when we deleted an origin branch; zuul didn't respond well and we had to purge the mergers | 17:48 |
SpamapS | mordred: yes, it was named feat/charity | 17:48 |
SpamapS | I also am now working to turn on the "only look at protected branches" flag | 17:49 |
SpamapS | which I should have turned on long ago | 17:49 |
mordred | SpamapS: cool. just making sure I was grokking the traceback properly mapping it to your thing | 17:49 |
clarkb | on the gerrit side we handle this by taking action on the ref updated new sha1 0*40 event | 17:49 |
clarkb | iirc | 17:49 |
pabelanger | SpamapS: +1 | 17:49 |
clarkb | does github send a similar event we can take action on when branches are deleted? | 17:49 |
pabelanger | clarkb: yah, I believe so | 17:50 |
pabelanger | https://developer.github.com/v3/activity/events/types/#deleteevent | 17:50 |
clarkb | ya, reading the merger code, it specifically keys off of '0'*40 to know when to delete the upstream branch locally in _alterRepoState | 17:52 |
clarkb | so we either need to have github driver set newrev to '0'*40 on those events, or pass that event through more directly and take the same action | 17:52 |
clarkb | SpamapS: do you have access to the webhook event that fired for that delete? reading pabelanger's doc link it seems that 'ref' is set to the ref name not the sha1. So I wonder if the assumptions in the github driver around EMPTY_GIT_REF are never hit | 17:55 |
clarkb | yup, github connection sets event.newrev to body.get('after'). Reading the docs pabelanger linked there is no 'after' key in the body for the delete event. The rest of the code does seem to do the right thing if newrev is set to '0'*40 though | 17:58 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Extract out common config parsing for ConfigPool https://review.openstack.org/621642 | 17:59 |
clarkb | oh and we only handle that in the push event which is weird. I don't know enough about the api to know if that will fire on deletes. Maybe if you delete via the empty push mechanism? | 18:00 |
SpamapS | clarkb: yeah I'll peek | 18:00 |
clarkb | tobiash: jlk ^ | 18:00 |
clarkb | I think the fix here is to add a handler for the delete event. Then create a translated internal zuul event that sets newrev to '0'*40. Zuul should handle that in the merger properly | 18:02 |
tobiash | I thought we have that already | 18:03 |
clarkb | tobiash: _event_push handles it if it shows up as a push | 18:04 |
clarkb | but I don't see an _event_delete | 18:04 |
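A rough sketch of the handler clarkb is proposing (this is not the actual 621665 change; _event_base is a hypothetical helper, and the event attributes mirror the discussion above):

    def _event_delete(self, body):
        # Translate a GitHub branch-delete webhook into a push-style
        # internal event whose newrev is the all-zeros sha, so the merger
        # prunes the deleted branch the same way it does for Gerrit.
        if body.get('ref_type') != 'branch':
            return None
        event = self._event_base(body)              # hypothetical shared helper
        event.ref = 'refs/heads/%s' % body['ref']   # the delete payload carries the ref name
        event.oldrev = None
        event.newrev = '0' * 40                     # the "empty" sha
        return event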
SpamapS | Hrm, I don't see any hooks for the branch delete | 18:04 |
SpamapS | clarkb: apparently the default app setup doesn't have us subscribed to branch/tag delete, only create | 18:06 |
SpamapS | so that's one thing to consider changing | 18:06 |
openstackgerrit | Merged openstack-infra/zuul master: Don't calculate priority of non-live items https://review.openstack.org/621626 | 18:07 |
tobiash | SpamapS: I think you ran into a race | 18:10 |
tobiash | Need to switch to laptop to better explain | 18:10 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul master: Handle github delete events https://review.openstack.org/621665 | 18:10 |
clarkb | tobiash: do we not need something like ^? | 18:11 |
SpamapS | tobiash: agreed | 18:11 |
clarkb | I guess if deletevent and pushevent always happen together then pushevent is sufficient | 18:11 |
SpamapS | I think there's another race that tobiash may be getting at | 18:13 |
SpamapS | the merge probably shouldn't checkout branch names | 18:13 |
SpamapS | it should probably checkout with a ref | 18:14 |
tobiash | clarkb: it's not about delete events, they don't have anything to do with mergers | 18:14 |
clarkb | tobiash: they do. _alterRepoState() in the merger uses those events to clean up branches that were deleted upstream of the merger | 18:15 |
clarkb | what should happen is that github sends the delete event when the branch is deleted. This causes _alterRepoState to remove that branch from the local repo | 18:16 |
clarkb | then later when the new branch with same sha1 is pushed the merger should fetch it and be happy because of the earlier reconciliation | 18:16 |
tobiash | let me check | 18:16 |
*** jpena is now known as jpena|off | 18:17 | |
tobiash | clarkb: I don't see a connection of _alterRepoState to any event | 18:19 |
clarkb | tobiash: it is called by getRepoState | 18:19 |
tobiash | clarkb: I still don't get the relation to an event | 18:20 |
clarkb | hrm ya, it's called on the tail end of merging things | 18:21 |
clarkb | which actually seems right | 18:21 |
clarkb | seems like what happens is if we have a change to merge we merge the change. If it's a repo update type event then we run getRepoState | 18:21 |
clarkb | this is in scheduleMerge in the pipeline manager | 18:21 |
tobiash | SpamapS, clarkb: so the race I have in mind works like this: have branch x, push branch y, merger merges and adds x and y into repostate, delete branch x, job now starts, takes the repo state, tries to restore x and y, and fails while restoring x | 18:22 |
tobiash | SpamapS, clarkb: we had these things with tags, and I've had a hacky patch in our zuul since august to mitigate that but haven't had time to push up a real fix for this | 18:23 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Fix broken setRefs whith missing objects https://review.openstack.org/621667 | 18:24 |
tobiash | clarkb, SpamapS: this is my hacky fix for this that we have in our zuul ^ | 18:25 |
tobiash | this is actually my only hacky fix that is not upstream yet ;) | 18:25 |
*** electrofelix has quit IRC | 18:31 | |
SpamapS | tobiash: yeah that seems like what happened | 18:37 |
tobiash | SpamapS: we had the same but with tags (someone deleted a tag), and from that point some mergers had that tag and some executors didn't, and everything was broken because tags are always in the repo state | 18:39 |
SpamapS | Tags are just refs of a different type so it makes sense. | 18:39 |
clarkb | tobiash: reading through the scheduler and pipeline manager I think my idea is mostly correct. githubconnection.getChange should create a Ref() change object for that event. That will then get processed by the queue processing done on the pipelines. This will then trigger the mergers to clean up the deleted branch via alterRepoState | 18:40 |
clarkb | some of the details may need fixing to get stuff mapped properly, but that seems to be how the gerrit side handles this | 18:41 |
tobiash | clarkb: the repostate is metadata that is attached to a queue item, which is carried through to the executor which will reproduce the same thing as the first merger | 18:43 |
clarkb | oh I see repo_state there isn't mapped onto disk | 18:44 |
clarkb | so we probably need to reconcile in multiple places | 18:44 |
tobiash | clarkb: when getting a merge command a merger updates its remote refs, performs the merge and creates a clean repostate data structure from that. That is carried through to all jobs of that change so the mergers that create the workspaces can reproduce the merges deterministically | 18:44 |
tobiash | this repostate contains all branches of the repo; in github, branches get created and deleted frequently | 18:46 |
tobiash | that's the reason you don't typically see this in gerrit but in github as it's common to work on short lived feature branches and delete them after the pr merges | 18:46 |
clarkb | right, is that repostate valid after a branch delete? | 18:46 |
clarkb | I don't see where we would currently clean it up because it relies on a push of sha1 0*40 to do so | 18:46 |
tobiash | we don't need to clean it up | 18:47 |
tobiash | the first merge comes without a repo state and takes the current remote refs at that time | 18:47 |
tobiash | that is perfectly fine | 18:47 |
clarkb | I see so it will start fresh and see the current valid list | 18:47 |
tobiash | the only problem is the race between creating that repo state and using it when in the middle there were upstream deletions | 18:48 |
tobiash | my hacky fix was to just ignore the now non-existing stuff | 18:48 |
clarkb | would my fix, which will update repo_state after such a deletion also fix it? | 18:49 |
clarkb | or is there still a race between that reconciliation and using it later | 18:49 |
tobiash | that wouldn't change anything | 18:49 |
clarkb | looking at _mergeItem we reset the repo (which cleans up stale refs on disk) then we restoreRepoState; in that case wouldn't we either remove the stale ref and be fine, or the stale ref would still be on disk and things would work until the next reset? | 18:52 |
tobiash | the only thing I can think of is really to ignore non-existing stuff from the repostate. In almost all cases this will be something unused and if not, a job trying to use a non-existing ref would fail anyway | 18:52 |
clarkb | if the order was reversed I would understand the race you describe. But we set an on disk state then introspect that so it should be consistent? | 18:52 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Make launcher debug slightly less chatty https://review.openstack.org/621675 | 18:53 |
clarkb | even save repo state ignores the remote state and uses the on disk state | 18:53 |
tobiash | clarkb: but the merger directly before did an update | 18:54 |
clarkb | tobiash: shouldn't that be ok? the mergers are processing repos serially. We reset the repo as the first step which will either say the ref is still valid or clean it up. After that is when we set repostate and it should be consistent with that reset? Is the problem we reset on subsequent merger actions but don't update the repostate? | 18:55 |
clarkb | reset (no delete). save repo state (no delete). then later reset again (delete). Don't resave repostate, just use the preexisting one? | 18:56 |
tobiash | clarkb: yes, that's correct, but we use the repo state on a different merger/executor, potentially hours later, in order to re-do exactly the same merge | 18:56 |
clarkb | got it | 18:57 |
tobiash | that hour is the time you're usually waiting for a node ;) | 18:57 |
tobiash | and during that anything can happen to remote branches (which is common in the github workflow) | 18:57 |
clarkb | can we reprune the repo state after subsequent resets? | 18:58 |
tobiash | so when re-doing the merge, it also performs a repo reset, restores the repo state, merges the change | 18:58 |
clarkb | I guess those subsequent resets happen external to that event so lining things up may be painful | 18:58 |
tobiash | yes, I think the only efficient way is to just ignore things that are missing. If the ref that we are trying to merge is missing during repo state restore the merge will fail anyway | 19:00 |
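A rough sketch of that "ignore missing refs" idea; this is not tobiash's actual 621667 patch, and object_exists/set_ref are hypothetical helpers:

    def restore_repo_state(repo, repo_state):
        # Apply a previously recorded repo state, skipping refs whose
        # objects have since disappeared upstream (e.g. a deleted branch
        # or tag). If the ref the merge actually needs is among the
        # missing ones, the merge itself will fail later anyway.
        for ref, sha in repo_state.items():
            if not object_exists(repo, sha):   # hypothetical helper
                continue
            set_ref(repo, ref, sha)            # hypothetical helper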
tobiash | that process is completely disconnected from any event from the driver; those are just used to modify the job configs, not to modify inflight changes | 19:01 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Set pool for error'ed instances https://review.openstack.org/621681 | 19:32 |
*** AJaeger_ is now known as AJaeger | 19:43 | |
*** manjeets_ has joined #zuul | 20:17 | |
*** manjeets has quit IRC | 20:18 | |
SpamapS | I'm building a slack notifier role. Does anyone know if there is a standard way, in a job, to get the URL to the Zuul status/builds/etc. API? | 20:19 |
clarkb | SpamapS: infra's zuul jobs set it in a zuul_return iirc | 20:20 |
clarkb | then that is used across the builds | 20:20 |
clarkb | https://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/set-zuul-log-path-fact/tasks/main.yaml consumed by https://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/emit-job-header/tasks/main.yaml#n20 then set as a return later iirc | 20:21 |
clarkb | and I think zuul_log_url is a site var? | 20:22 |
SpamapS | That's for logs | 20:23 |
SpamapS | I want the API | 20:23 |
SpamapS | I think a site var is fine | 20:23 |
SpamapS | I'll just document that in the role. | 20:23 |
SpamapS | "Set this site var" | 20:23 |
SpamapS | maybe we can make a convention for it | 20:23 |
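A sketch of what that convention could look like in a role; the variable name zuul_api_url and the task are made up for illustration:

    # Site variable the role documents and expects (illustrative name):
    #   zuul_api_url: https://zuul.example.org/api
    - name: Build a link to this change's builds
      set_fact:
        builds_url: "{{ zuul_api_url }}/builds?change={{ zuul.change }}"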
clarkb | ya, it's for logs, but it's a similar idea: "here is where the other part of the system you care about is located" | 20:24 |
*** manjeets_ is now known as manjeets | 21:20 | |
corvus | SpamapS: status_url is set in the web configuration: https://zuul-ci.org/docs/zuul/admin/components.html#attr-web.status_url | 21:29 |
*** techraving has joined #zuul | 21:40 | |
SpamapS | corvus: is there any way to get at that in a post job? | 21:49 |
corvus | SpamapS: oh, you said role. i thought you were writing a driver. | 21:52 |
corvus | SpamapS: i don't think so. but i think we should expose it. i think we should expose both the change-specific status page and the buildset page in zuul vars. | 21:54 |
corvus | i'm not sure how close to completion those are. | 21:55 |
SpamapS | mmmm yes that would be amazing. :) | 22:13 |
SpamapS | Well in this case, I want to build the various URLs for a slack message. | 22:13 |
SpamapS | And actually, I think I may want to chase that "cleanup job" idea soon. | 22:13 |
SpamapS | So the slack message can be for the buildset and not all the jobs. | 22:13 |
SpamapS | (I guess I can make one job that depends on all the others.. but feels like we can just model that in python a lot easier than yaml. :) | 22:14 |
SpamapS | I also just found a race in the EC2 driver that has been leaking nodes and fixed it so yay for that. | 22:15 |
* SpamapS still can't quite get the unit tests to pass reliably | 22:15 | |
*** pcaruana has quit IRC | 22:18 | |
SpamapS | So... FYI, for anyone doing monorepos (maybe I should email list this) ... zuul with monorepo works fine. However, you will want to be pretty aggressive about using files matchers. So.. many... jobs.. | 22:37 |
corvus | SpamapS: speaking of file matchers: i've been thinking that it would be useful for zuul to detect when a change modifies a given job and automatically have that job "match" | 22:44 |
SpamapS | corvus: ++ | 22:45 |
SpamapS | I have been splitting jobs in to groups in .zuul.d for that exact purpose. | 22:45 |
corvus | SpamapS: this came from openstack-infra where we are starting to use file matchers aggressively for changes to our ansible. occasionally we make changes to those jobs, and right now, if you touch .zuul.yaml, *all the jobs run*. this would let us omit .zuul.yaml from files matchers, and still have the (relevant) jobs run | 22:45 |
corvus | SpamapS: oh yeah, that's smart, why didn't i think of that. :) | 22:45 |
SpamapS | and then adding files: '^.zuul.d/group-jobs.yaml$' | 22:46 |
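A sketch of that layout, assuming illustrative job and path names:

    # .zuul.d/group-jobs.yaml
    - job:
        name: group-unit-tests
        files:
          - '^src/group/.*'
          - '^.zuul.d/group-jobs.yaml$'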
corvus | SpamapS: so yeah, sounds like we've both seen the same problem and that would help | 22:46 |
SpamapS | Definitely | 22:46 |
SpamapS | And it shouldn't be hard to do I think. | 22:46 |
corvus | nope, i'm pretty sure all the info is in memory | 22:46 |
corvus | just need to connect some dots | 22:47 |
*** irclogbot_3 has quit IRC | 22:47 | |
SpamapS | Yeah, I was thinking about something similar when writing a sandbox usage of a zuul-roles job too. | 22:48 |
SpamapS | (more for the case of testing jobs in real repos automatically, but similarly I was diving around the model seeing that most of the info is already there in the config and triggers) | 22:49 |
SpamapS | while we're spraying ideas around.. I think I want to write a role that lets me find common sha's with another branch and, consequently, dig out the log url for jobs that ran against the sha, so I can grab docker containers from that sha and not rebuild them (so I can have a promotion workflow). I feel like there was some other stuff going on with artifacts though. | 22:51 |
clarkb | SpamapS: I think people are using zuul_return to pass that info along? Though I guess that is within the context of a single pipeline | 22:52 |
clarkb | not between pipelines | 22:52 |
corvus | SpamapS: yeah, the artifact url storage work i'm doing stores artifact urls along with a build record. so my idea is to then, in a later pipeline, query the api to find the build and artifact urls, and use them. | 22:53 |
corvus | SpamapS: you should be able to query by "patchset" (in the github case) and get a build | 22:54 |
corvus | SpamapS: so once the new stuff lands, you may have what you need | 22:54 |
SpamapS | corvus: yeah that's perfect. | 22:55 |
corvus | SpamapS: (in the gerrit case, i'm imagining a "promote" pipeline which runs on change-merged events, and uses the change-merged event info to query the api to find the builds+artifacts) | 22:55 |
corvus | the same should work for github too (in the pr-closed case) | 22:55 |
corvus | SpamapS: looks like it's ready to merge: https://review.openstack.org/620427 | 22:56 |
corvus | i mean, ready for review. don't mean to be presumptuous. :) | 22:56 |
SpamapS | Yeah what I have is merge commits in my promotion branch (called 'prod') that refer to commits that were already built and tested in master. In those cases the containers are already pushed and tested, I just have to change a single field in a k8s object to make them "live". So I want the post job in the 'prod' branch to just find the exact image URL from that previous run and make it so. | 22:57 |
SpamapS | We do the --no-ff so we have metadata about when things were merged to 'prod' | 22:57 |
SpamapS | oh sweet I'll review post haste | 22:57 |
SpamapS | Right now, btw, I can totally fake this. I tag the containers with the newrev in post, so I can just look at the merge commit's children, and regenerate the tag. But I am not sure that will work for everything. | 22:59 |
corvus | SpamapS: you should be able to test out the query part of this now -- though http://zuul.openstack.org/api/builds?change=620427&patchset=1 isn't working for me and i'm not sure why (http://zuul.openstack.org/api/builds?change=620427 does work) | 23:01 |
corvus | haha | 23:01 |
corvus | because jobs never reported on ps1 | 23:01 |
corvus | http://zuul.openstack.org/api/builds?change=620427&patchset=2 works fine | 23:01 |
corvus | and so does http://zuul.openstack.org/api/builds?patchset=2 (though, obviously it's not useful in our case) | 23:02 |
* SpamapS peeks | 23:05 | |
corvus | patchset doesn't show up as an option in the web ui, but it is supported in the api | 23:06 |
SpamapS | https://zuul.gdmny.co/builds?change=11,patchset=cd60082ebb6b84a723a9a2fb2ea951b195c6add8 | 23:06 |
SpamapS | yeah that works | 23:06 |
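A rough Python sketch of that query, using the endpoint shape from the URLs above; the response field names (job_name, result, log_url) are assumptions about the builds API output:

    import requests

    def builds_for(zuul_api, change, patchset):
        # Query the Zuul builds API for one change/patchset.
        resp = requests.get('%s/builds' % zuul_api,
                            params={'change': change, 'patchset': patchset})
        resp.raise_for_status()
        return resp.json()

    for build in builds_for('http://zuul.openstack.org/api', 620427, 2):
        print(build.get('job_name'), build.get('result'), build.get('log_url'))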
SpamapS | Can I query a commit sha tho? | 23:08 |
*** irclogbot_3 has joined #zuul | 23:10 | |
SpamapS | Seems like nodepool+ec2 is ignoring my max-servers | 23:11 |
SpamapS | I think GitHub suffers from a deficiency here because there's no "pull request merge" event, there's a push, and you have to go dig out the PR backwards from the commit. :-P | 23:14 |
corvus | SpamapS: there's a closed event | 23:28 |
corvus | i suspect that happens on merge but am not positive, nor sure what additional info distinguishes that from hitting the close button, but i think it's worth looking into | 23:29 |
*** dkehn has quit IRC | 23:53 | |
*** dkehn has joined #zuul | 23:58 |