Tuesday, 2017-06-13

jeblair	jamielennox: when all the streaming stuff is in place, i expect us to have a finger multiplexer (probably built into the websocket multiplexer) so that we can have a single streaming server for all of the executors; the executors should not be end-user addressible. eg zuul.example.com/uuid	00:01
jamielennox	jeblair: yea, that was my understanding - but it means that there's really no reason to run the executor fingerd on 79, and therefore no reason for root on the executor	00:02
jeblair	jamielennox: that service will either want to listen on 79, or it could listen on another port and you could port-forward to it.	00:02
*** dmsimard has joined #zuul		00:02
jamielennox	the multiplexor might be 79	00:02
jeblair	jamielennox: yep, i think you've got it.	00:02
jamielennox	jeblair: so would it be reasonable to change the default port of the executor fingerd?	00:02
jeblair	jamielennox: yeah, i was thinking we'd do it when we had the multiplexer up and running, but if inverting that would be helpful now, we could probably go ahead and do it.	00:03
jamielennox	jeblair: it's purely the run as root that threw me, i've got it changed in local config i just wasn't sure if there was something else i was missing	00:05
mordred	tristanC: yes. zuul and nodepool both are fully py3 now	00:11
jlk	well	00:13
*** yolanda has quit IRC		00:14
mordred	jlk: no?	00:14
jlk	I think there is an open change to actually _use_ that setting in some ways	00:14
jlk	sorry that was meant for jamielennox and jeblair	00:14
mordred	ah - nod	00:15
jlk	https://review.openstack.org/473103	00:15
mordred	jlk: did you see the patch about github retries?	00:15
jlk	jamielennox: that's part of some changes	00:15
jlk	mordred: yeah, I -1d it	00:15
mordred	ah - I see that now - cool	00:15
jamielennox	jlk: ok, i can try and look at those - maybe the executor needs to register the finger port in zk when it picks up a job?	00:16
jamielennox	cause at the moment it's configurable, but those changes don't appear to be	00:17
jamielennox	oh, no, it's still configurable	00:17
mordred	jamielennox: the port is supported at least in the finger url bit when I added them	00:18
mordred	or, rather, here: https://review.openstack.org/#/c/473103/3	00:19
mordred	so they show up _somewhere_	00:19
jamielennox	mordred: yea, i'm trying to get back to the point where i can actually test that	00:21
jamielennox	stuck on py3 and some ssl issues, maybe today, depends how badly the dentist goes	00:22
pabelanger	jeblair: mordred https://review.openstack.org/#/c/473533/ and https://review.openstack.org/#/c/472861/ need to land to make zuul +1 again	00:25
pabelanger	that should help zuul +1 all our open changes	00:26
pabelanger	zuul (user)	00:26
pabelanger	https://review.openstack.org/#/c/473613/ would also be nice to get a few more nodes on nl01.o.o	00:26
*** yolanda has joined #zuul		00:37
openstackgerrit	David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add log streaming test https://review.openstack.org/471079	00:48
Shrews	stupid pep8	00:48
Shrews	jamielennox: should only need to change the finger port to not run as root	00:50
Shrews	that was the only reason	00:50
openstackgerrit	Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Improve fake github commit status handling https://review.openstack.org/473633	01:02
jlk	blah, pep8.	01:05
openstackgerrit	Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Improve fake github commit status handling https://review.openstack.org/473633	01:06
jamielennox	any zuul v3 outstanding patches?	01:09
jamielennox	bah	01:09
jamielennox	python3	01:09
jamielennox	getting some encoding issues	01:09
openstackgerrit	Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: github: retry pull_request() https://review.openstack.org/473301	01:14
openstackgerrit	Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: zuul_stream: handle empty line https://review.openstack.org/473222	01:34
*** yolanda has quit IRC		01:37
*** jiaohaolin1 has left #zuul		01:45
openstackgerrit	Jamie Lennox proposed openstack-infra/zuul feature/zuulv3: Encode webhook_token secret https://review.openstack.org/473674	03:36
openstackgerrit	Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Improve fake github commit status handling https://review.openstack.org/473633	03:36
*** bhavik1 has joined #zuul		04:37
*** yolanda has joined #zuul		05:54
*** isaacb has joined #zuul		06:28
*** yolanda has quit IRC		07:28
*** bhavik1 has quit IRC		08:19
*** hashar has joined #zuul		09:13
openstackgerrit	Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: RFC: zuul.d: configuration split https://review.openstack.org/473764	09:53
*** isaacb_ has joined #zuul		11:07
*** isaacb has quit IRC		11:09
*** jkilpatr has joined #zuul		11:10
*** isaacb_ has quit IRC		11:41
*** isaacb_ has joined #zuul		11:42
openstackgerrit	Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: RFC: zuul.d: configuration split https://review.openstack.org/473764	11:56
*** openstackgerrit has quit IRC		12:18
*** openstackgerrit has joined #zuul		12:23
openstackgerrit	Ricardo Carrillo Cruz proposed openstack-infra/nodepool feature/zuulv3: Create group for label type https://review.openstack.org/473809	12:23
rcarrillocruz	jeblair: handwavy patch for adding per-label groups ^	12:24
rcarrillocruz	i need to spawn a nodepool VM to test that	12:25
*** olaph has quit IRC		12:27
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add a success-url for status.json test https://review.openstack.org/473604	12:29
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Move streaming url formatting to model https://review.openstack.org/473090	12:29
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Support finger ports in finger URL https://review.openstack.org/473103	12:29
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add build.started state flag https://review.openstack.org/473811	12:29
Shrews	mordred: https://review.openstack.org/471079 is really for sure ready now. sorry i left a pep8 problem in there	13:17
*** _ari_\|conf is now known as _ari_		13:18
mordred	Shrews: +2	13:20
mordred	rcarrillocruz: lgtm	13:21
mordred	rcarrillocruz: from a zuul perspective, check out https://review.openstack.org/#/c/467611/ and make sure it's good enough for what you want moving forward	13:23
mordred	rcarrillocruz, jeblair, jlk: there is a question that I've been mulling related to that, btw - which is whether or not we should add the "openstack" host vars to our nodes that get added by the openstack ansible inventory ...	13:24
mordred	argument for is that if someone has a playbook for a thing that expects to run against some nodes that are running in openstack and is using those variables, they won't be able to directly test their playbook against the nodes we launch for them	13:25
mordred	argument against is that we actually dont' communicate anywhere "these are ubuntu-xenial nodes in openstack" - the openstack part is an impl detail	13:25
mordred	perhaps we should consider including an optional node type in the nodeset spec - so that a user could say "I want an ubuntu-xenial node in an openstack cloud" and if they did, we'd attach the openstack hostvars	13:26
mordred	but if they don't explicitly request that, we wouldn't do that	13:26
mordred	(that might also be a thing we need anyway for other types of workloads - "give me ubuntu-xenial in a container" or "give me ubuntu-xenial in GCE")	13:27
mordred	BUT - all of that also feeds in to a different thing I keep meaning on bringing up - which is "as a user, how do I get a list of available labels" - which seems like perhaps a good REST API call and dashboard page to add to tristanC's thing	13:28
pabelanger	jeblair: mordred: Shrews: would it be much effort if I requested finger @ze01.o.o list all running build ID and job names?	13:38
Shrews	pabelanger: the streamer doesn't have access to all available builds (not without processing on disk directories and trying to make a guess at which ones are valid)	13:39
pabelanger	Shrews: we don't store that info in zk today?	13:42
Shrews	pabelanger: not that i'm aware of	13:42
pabelanger	k	13:45
*** dkranz has joined #zuul		13:47
jlk	mordred: so I am thinking that we'd want extra details like that to come from the nodepool driver. If using the openstack nodepool driver, you get some openstack hints in your vars. If you're using the k8s driver, you get some k8s hints in inventory (if there are any), ditto GCP, AWS, and so on.	13:52
*** dmsimard has quit IRC		14:05
mordred	jlk: yah - I definitley think htey should come from there - but if they're on by default is that muddying an abstraction?	14:23
mordred	jlk: like, today, you get a node named "ubuntu-xenial" in your inventory with some zuul variables and an ip	14:24
mordred	but if we started adding the openstack dict to the variables for all openstack-provided nodes, would that be good? or bad?	14:24
mordred	pabelanger, Shrews: I think that's likely something we can/should add to the multi-plexor when we write it - because the multi-plexor would be in a good position to, at the very least, GET /status.json and then return a list	14:26
mordred	but for the executor streamers that seems like the wrong layer	14:26
jeblair	mordred: agreed re multiplexer	14:28
mordred	jlk: I definitely agree re the k8s/aws/gce hints etc - just mostly not sure if we should add those hints if the user hasn't requested a certain delivery model- or whether we should expose/allow that request	14:28
jeblair	mordred: is there any harm to supplying extra info like the openstack hints? as for requesting it -- i think the way you request something like that is via labels (ubuntu-xenial-container vs ubuntu-xenial-openstack) if that sort of thing is important to you.	14:30
mordred	jeblair: any harm is the question - I can't decide whether it's leaking abstraction details or not	14:36
mordred	jeblair: but for the labels thing - in my brain I was imagining that a deployer might have, for instance, an openstack provider, a k8s provider and an AWS provider - but similar to how we define an ubuntu-xenial across all of our clouds today and jobs don't care where it comes from, they define an ubuntu-xenial label across all three and jobs don't care	14:38
jeblair	mordred: yep, if you don't care, just call it 'ubuntu-xenial'. if you do care, call it 'ubuntu-xenial-openstack'. if you both do and don't care, make both labels.	14:39
mordred	jeblair: right. but if you make both labels, that would affect quota math	14:39
jeblair	mordred: nodepool v3 is noticeably less concerned with that than v0.	14:40
mordred	nod	14:40
mordred	jeblair: I was thinking if you had an ubuntu-xenial node on both and could further restrict it like label: ubuntu-xential provider: openstack	14:40
pabelanger	I'd appreciate a review on https://review.openstack.org/#/c/472853 please. I'd like to add our infra-root keys to nl01.o.o to make some debugging a little easier: https://review.openstack.org/#/c/472854	14:40
mordred	jeblair: but I can also see that being too much and not useful	14:41
jeblair	mordred: the only quota related thing that would come into play is min-ready, and i think that still maps pretty well here. "i want 10 ubuntu-xenial-openstack nodes min-ready, and only 1 ubuntu-xenial-k8s"	14:41
pabelanger	We should also consider adding auto-hold support back for zuulv3.o.o, its helpful when debugging job failures too	14:41
jeblair	pabelanger: it's in storyboard. so we have considered it and accepted that it needs to be done. no one has picked up the task yet.	14:42
mordred	jeblair: right- but "i want 10 ubuntu-xenial-openstack nodes min-ready" doesn't get you any min-ready of ubuntu-xenial nodes	14:43
* Shrews recommends a more non-generic, understandable term than "multiplexor". What's a good term for the thing that gives you the finger?		14:43
* fungi refrains from obvious wisecrack opening		14:43
jeblair	mordred: sure. ubuntu-xenial would need its own min-ready, and there would be a minor loss of efficiency there. if we really cared about that, we could probably map them to each other in nodepool.	14:43
Shrews	fungi: come on... you know you wanna	14:44
mordred	jeblair: yah - that was mostly what I was thinking -some sort of map or even an alias or something	14:44
jeblair	mordred: but "provider: openstack" inside of zuul seems like the abstraction violation. i'd very much prefer not to do that.	14:44
jeblair	mordred: if any of this stuff is important, let's make it happen in nodepool.	14:45
mordred	jeblair: yah - so, that's the question I had WRT exposing the openstack variables - is it an abstraction violation to do so if they're there	14:45
pabelanger	jeblair: sure, it would be great if we could find a volunteer to maybe work on it.	14:45
mordred	jeblair: and if it is a violation, is it a thing we should allow someone to opt-in to?	14:46
mordred	Shrews: how about "finger gateway"?	14:46
* rcarrillocruz back and reads scrollback		14:46
jeblair	mordred: i don't know what these variables are and what they are used for. but if they are useful, i don't see why we can't pass them over. and if we aren't sure we want to, we can make it an option on the nodepool side (add a "send openstack vars" option to the pool-label)	14:46
Shrews	mordred: that is a good (but non-humorous) alternative	14:47
Shrews	i like that	14:47
mordred	jeblair: maybe let's start with passing them, then see if we think they're a bad idea and need an option?	14:47
jeblair	mordred: do we have a need/use for them now?	14:48
mordred	jeblair: well - the concept came to mind because rcarrillocruz was talking about adding nodepool label to the groups nova metadata. this affects how the openstack dynamic inventory puts nodes in to groups. that mainly made me think that if people have playbooks that are expecting to be run in the context of the openstack dynamic inventory, we are not actually currently exposing the information those playbooks	14:50
mordred	are provided even though we have access to it	14:50
mordred	jeblair: (it's come up a few times in my brain as I've been reviewing the code around the node exchange, but just never bubbled all the way up to "discuss")	14:51
jeblair	mordred: rcarrillocruz's grouping issue is handled by zuul, right?	14:52
jeblair	mordred: i guess i'm having trouble understanding what automatic grouping would happen from openstack vars in a nodepool+zuulv3 scenario.	14:53
mordred	jeblair: oh! sorry - I understand the disconnect	14:53
mordred	jeblair: I don't mean for automatic grouping purposes - one sec, lemme make a quick paste	14:54
rcarrillocruz	yeah, groups in nodesets zuul is a superst of my patch	14:54
rcarrillocruz	trying to think of a use case i could use with what mordred is describing	14:54
rcarrillocruz	in just- nodepool scenarios	14:54
jeblair	pabelanger: yeah, i'm concerned that we're all getting a bit distracted by post zuul-v3 things and people are not working on the things we need to get to zuul v3 itself. there is still a considerable backlog and almost no one is working it down.	14:59
pabelanger	jeblair: agree. infra-root keys is the first step to helping	15:04
jeblair	pabelanger: i +3d that change	15:04
pabelanger	I think I'll also have to work on log publishing today too. It's been too difficult tracking all the open streams in terminals	15:05
jeblair	mordred: can you +3 https://review.openstack.org/473222 ?	15:12
jeblair	pabelanger: what do you we need to do regarding log publishing?	15:13
pabelanger	jeblair: right now we are not publishing any logs to logs.o.o. This is because only set it up before to publish to zuulv3-dev.o.o/logs. So we need to address that	15:15
jeblair	pabelanger: right. what's your plan?	15:16
pabelanger	jeblair: I'd like to see what people would like to do. Specfically, have we solved our log url / log path issues we discussed at the PTG? Today we just dropped everything into a top level build uuid folder. I am not sure that will work well on logs.o.o	15:18
jeblair	pabelanger: no, i think we should use the same log path system we have in production now. so any variables missing, we should add. first step then is probably to make a new post-playbook scp log copier that copies logs up to logs.o.o.	15:20
jeblair	pabelanger: it will have to be in a trusted post playbook and use an ssh key on the executor to do the copying because we can't use secrets yet because we haven't turned on ssl support for gearman.	15:20
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Improve fake github commit status handling https://review.openstack.org/473633	15:21
jeblair	(if we switch gearman to ssl, we can use secrets, then we can put the ssh key in a secret and do this in an untrusted repo)	15:21
pabelanger	jeblair: okay, do we have a SSL cert for gearman or will it just be a self-signed	15:21
jeblair	pabelanger: self-signed	15:21
pabelanger	k	15:21
pabelanger	we might as well do that first, let me see what needs to be done	15:22
jeblair	Shrews: i left a comment on https://review.openstack.org/471079	15:23
Shrews	jeblair: regarding the file buffering, i quickly ran into a test run where i got less data from the file than what the recv() returned	15:24
jeblair	Shrews, mordred: i know that zuul_console is unbuffered -- did we perhaps inadvertently make the callback plugin write buffered? or ....	15:25
Shrews	so i assumed that was due to buffering	15:25
jeblair	Shrews: recv() returned more data than the file?	15:25
Shrews	jeblair: yes	15:25
jeblair	Shrews: hrm... you strip newlines from the file... is it possible it was short by the number of newlines?	15:26
Shrews	jeblair: no, there were visible lines missing	15:26
jeblair	Shrews: are you thinking they were written to the file after you read the file but before you streamed it?	15:27
Shrews	jeblair: it's possible there is some sort of race around that	15:28
*** jkilpatr_ has joined #zuul		15:29
Shrews	jeblair: that's partly why i didn't also choose to keep calling recv() until the entire buffer size is returned (there is actually a recv flag to do that in a single call)	15:29
Shrews	not being able to guarantee that the file contents returned would meet the minimum	15:30
Shrews	i didn't want the test to have a possibility of hanging	15:30
*** jkilpatr has quit IRC		15:31
jeblair	Shrews: how's this for a crazy idea: open the log file for reading in the test, open the streaming connection and spawn a thread to read the whole stream in the background and return when complete, remove the flag file, wait for job to complete, read entire file in (the file is deleted now, but we still have an open handle so we'll be able to read it), wait for stream read to complete, compare, close file.	15:33
Shrews	jeblair: "return when complete" ... part of the problem is that we don't know when it is "complete"	15:35
jeblair	Shrews: does the finger server disconnect when the job ends?	15:36
Shrews	jeblair: in theory the server should break the connection on EOF	15:37
jeblair	Shrews: if that's the case (and it seems like if it isn't, we probably should make it so), if we wait to the end of the job we'll get that signal	15:38
*** isaacb_ has quit IRC		15:39
Shrews	jeblair: how do you propose coordinating that? ending the job while the streaming is ongoing doesn't guarantee we will get the entire file streamed, does it? the file could get removed from underneath the streamer	15:41
jeblair	Shrews: if so, that's a bug.	15:41
jeblair	Shrews: but i expect it to hold it open until it gets to the end of the file.	15:42
Shrews	this test is really getting super complex. i'll try to add that new bit though	15:43
jeblair	Shrews: i think it can be simpler this way	15:43
Shrews	hah	15:44
jeblair	Shrews: there's only one sync point in the job	15:44
Shrews	1?	15:44
jeblair	sure, laugh, whatever	15:44
jeblair	yes one sync point	15:44
jeblair	the job only has to wait until the test has opened the log file and made the initial stream connection	15:44
jeblair	then it can run to the end	15:44
jeblair	and after that, the test can wait until the build completes in the usual manner	15:45
Shrews	not a derisive laugh, btw. that came across poorly	15:45
jeblair	'sok :)	15:45
jeblair	i'll reread it as an evil cackle for fun	15:46
Shrews	more a maniacal, point-of-turning-evil laugh	15:47
openstackgerrit	Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Add region_name to zookeeper https://review.openstack.org/473887	15:50
pabelanger	jeblair: Shrews: ^ was something I noticed missing when we try to setup our mirrors in zuulv3.o.o. Would be interested in your feedback	15:51
mordred	jeblair, jlk: this is what I'm talking about http://paste.openstack.org/show/612438/	15:51
Shrews	pabelanger: lgtm	15:52
mordred	jeblair: the first thing is what we're emitting today	15:53
mordred	jeblair: if we add the "openstack" vars (which are added to a node's hostvars if you use the openstack dynamic inventory) you get the second thing	15:53
openstackgerrit	Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Update to nodepool_region_name from zookeeper data https://review.openstack.org/473889	15:55
pabelanger	^ is the zuul change too	15:55
mordred	pabelanger: actually - region is already in the inventory	15:55
jeblair	mordred: gotcha -- so just things like getting the ip info from there is an obvious usage (like, i have an ansible play that sets up some iptables stuff and reads openstack.networks.private to do that)	15:55
mordred	pabelanger: should be in the "nodepool_region" variable in the node's hostvars	15:55
mordred	jeblair: yah - I could also imagine a playbook looking at the volumes dict and deciding to do or not do something with that information	15:56
jeblair	mordred: nod	15:56
mordred	pabelanger: should be in the "nodepool_region" variable in the node's hostvars	15:57
rcarrillocruz	i can imagine this would be useful for $provider testing	15:57
rcarrillocruz	like	15:57
mordred	pabelanger: maybe we should also start saving the inventory file when we save logs	15:57
jeblair	mordred: i think pabelanger wants to transmit that to zuul though	15:57
rcarrillocruz	give me a xenial-openstack cos i want to test specific openstack things on it	15:57
pabelanger	mordred: yes, nodepool_region will always be null currently. The zuul patch problem pulls the data from zookeeper, and just renames it to nodepool_region_name, since we use region_name for zookeeper. However, if people okay with nodepool_region, I can omit that change	15:58
mordred	jeblair: yes. zuul already writes that to the inventory	15:58
rcarrillocruz	i see no harm on it, providing is opt-in	15:58
mordred	pabelanger: ah - I'm fine putting it in the zuul vars and not as nodepool_region	15:58
jeblair	mordred, rcarrillocruz: yeah, it's a big chunk of data... maybe we should have it disabled by default, but let folks opt in to sending it?	15:58
rcarrillocruz	++ from me for disabled by default	15:59
pabelanger	mordred: okay, nodepool_region_name is okay for you?	15:59
mordred	jeblair: yah. we can also put a pin in that and figure out the best mechanism to communicate it later	15:59
jeblair	mordred: ya, not urgent	15:59
mordred	pabelanger, jeblair: I'm fine either way: zuul.region_name or nodepool_region - but I think I lean towards zuul.region_name	16:00
mordred	pabelanger, jeblair: if we do zuul.region_name I think we should remove nodepool_region from the hostvars	16:00
jeblair	mordred, pabelanger: give me a minute, i'm really confused by these patches.	16:00
mordred	kk	16:00
pabelanger	now I am confused too	16:00
jeblair	473887 looks like it's adding something to the zk node data structure	16:01
pabelanger	http://paste.openstack.org/show/612440/ is the issue we have today, nodepool_region will always be null	16:02
jeblair	like, nodepool is not passing the region at all currently, regardless of whatever we name it. is that correct?	16:02
mordred	yes	16:02
pabelanger	jeblair: correct. We must have missed that before	16:03
jeblair	ok. then the zuul change is needed because 473887 chose a different name than what zuul was expecting	16:03
jeblair	okay, i think i understand these, and i think we should call the zk attribute "region". that's to match "provider" (which is not provider_name)	16:04
pabelanger	okay, I can make that change	16:04
mordred	jeblair, pabelanger: on 473889 - I put in a question on the review - but it's "should those variables be top-level or be moved to the zuul variables dict"	16:05
* mordred votes for move - but is fine if we pick the other thing		16:05
jeblair	mordred: hrm, it's in as a host var... but i guess they all just share the same namespace, huh?	16:06
jeblair	mordred: so yeah, i'm leaning toward move	16:06
jeblair	mordred: how do we accomplish that though?	16:06
mordred	it's easy - we just put them into the zuul dict - I can doa followup	16:07
jeblair	mordred: a zuul dict for a single host?	16:07
pabelanger	I'd be okay with a zuul.nodepool dict() if we went that roune	16:07
pabelanger	rout*	16:07
pabelanger	fetching coffee	16:07
*** hashar has quit IRC		16:07
jeblair	mordred: would it be like this https://etherpad.openstack.org/p/mHTWbqUQ3M ?	16:08
jeblair	(would those get merged?)	16:09
mordred	yes. to both questions	16:10
jeblair	cool, in that case, zuul.nodepool_region or zuul.nodepool.region both wfm.	16:10
*** isaacb has joined #zuul		16:11
Shrews	let's be careful with that. i'm not 100% sure how that would affect current nodes with the old style metadata	16:11
mordred	jeblair: nope. I take it all back.	16:12
jeblair	mordred: it overwrites the whole thing?	16:12
mordred	jeblair: we'd need to turn on variable dict merging to get merging, and we've been avoiding that on purpose so far	16:13
mordred	jeblair: so I think ignoring me and keeping them top-level per-node as we have them is fine	16:13
Shrews	oh, these aren't node metadata vars. nm	16:13
jeblair	ok. so i think we just revise pabelanger's patch for my bikeshed, then no change to zuul.	16:13
mordred	Shrews: cool - I was just about to ask which thing we should be careful about	16:13
mordred	jeblair: yup	16:13
pabelanger	Agree, I don't think we should enable dict merge setting	16:14
mordred	jeblair: we _could_ make nodepool a dict to be consistent with zuul being a dict	16:14
mordred	but I do not think it's essential that we do so	16:14
jeblair	mordred: agreed, that seems like it would be fine too, but <shrug>	16:14
jeblair	i guess we should decide rsn though. :)	16:14
pabelanger	nodepool dict seems okay to me	16:15
mordred	woot	16:15
jeblair	wfm	16:15
openstackgerrit	Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Add region to zookeeper https://review.openstack.org/473887	16:15
pabelanger	okay, ^ should mean we can abandon 473889	16:16
jeblair	pabelanger: yep. but if you want to move them into a dict, now's the time. :)	16:17
pabelanger	k, I can do that too	16:17
pabelanger	jeblair: mordred: still as a host_vars right?	16:20
mordred	pabelanger: yah - I think just right there in that same function	16:20
pabelanger	k	16:23
jlk	mordred: do you know of any consumers that would appreciate such variables? Or do you feel like "if you build it, they will come"?	16:28
openstackgerrit	Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Create nodepool dictionary for ansible inventory https://review.openstack.org/473889	16:29
mordred	jlk: yah. that's the real rub - I don't know of specific consumers- only that it's variables we provide for openstack inventory users ... and so far everything we've put upstream in ansible gets used way more than I expect :)	16:30
jlk	hah. I see there has been some more discussion. I agree it's a lot of info, maybe way more than should be exposed by default. As much as I hate adding more config toggles, this does seem like one.	16:31
*** isaacb has quit IRC		16:35
*** isaacb has joined #zuul		16:36
*** jkilpatr_ has quit IRC		16:39
*** jkilpatr has joined #zuul		16:39
rcarrillocruz	just tested https://review.openstack.org/#/c/473809/1 on my nodepool instance	16:41
rcarrillocruz	deleted a node	16:41
rcarrillocruz	confirm the per-label group is there	16:41
rcarrillocruz	jeblair: are we good to go ^ ?	16:41
jeblair	rcarrillocruz: almost -- we should call it 'label' and not type; left a comment on patch.	16:48
rcarrillocruz	ah ok, i follwed the Node convention	16:49
rcarrillocruz	thx, i'll amend	16:49
jeblair	yeah, this way it'll be one less thing we need to change :)	16:50
rcarrillocruz	so jeblair , i should wait for your patch to land or put a depends-on ?	16:50
jeblair	rcarrillocruz: i haven't written it yet, so why don't you just go ahead and rework yours to use 'label' wherever you can... so it might look like "nodepool_node_label=self._node.type", then i'll fix that when i change the rest of the zk stuff	16:51
rcarrillocruz	k	16:51
jeblair	rcarrillocruz: but at least you won't have to deal with the metadata variable changing	16:51
openstackgerrit	Merged openstack-infra/nodepool feature/zuulv3: Add region to zookeeper https://review.openstack.org/473887	16:54
openstackgerrit	Ricardo Carrillo Cruz proposed openstack-infra/nodepool feature/zuulv3: Create group for label type https://review.openstack.org/473809	16:55
*** jkilpatr has quit IRC		17:02
rcarrillocruz	thx folks	17:03
*** jkilpatr has joined #zuul		17:04
* SpamapS sits back down at laptop, refreshed from 2 days of legolanding.		17:04
*** isaacb has quit IRC		17:07
jlk	good times!	17:13
jlk	my boys were so bummed when we went that we were missing the opening of Ninjago by a couple weeks.	17:13
openstackgerrit	Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Add ssl support to gearman / gearman_server https://review.openstack.org/473916	17:15
pabelanger	jeblair: okay, I think we first need ^ for gearman SSL. I didn't see anything in zuul today for it. I'd also like to setup a unit test for SSL, but want to make sure that is the right step first	17:17
SpamapS	jlk: It was meh for me, but my boys did in fact enjoy it since it was the only video game they got to play all day.	17:19
SpamapS	jlk: (re: Ninjago)	17:19
jeblair	pabelanger: yep agreed -- though i think we need to use "has_option" because if we don't, doesn't configparser.get raise an exception if the value isn't there?	17:30
pabelanger	jeblair: Ya, that is something I wanted to test. I thought it defaulted to None, but untested.	17:31
pabelanger	working on generating our self-signed certs	17:31
jeblair	pabelanger: yeah, we need has_option: http://paste.openstack.org/show/612449/	17:33
pabelanger	jeblair: thanks	17:34
openstackgerrit	Merged openstack-infra/nodepool feature/zuulv3: Create group for label type https://review.openstack.org/473809	17:34
jeblair	jlk: cache change lgtm	17:37
jlk	yay!	17:37
jeblair	mordred: i commented on 473090	17:55
SpamapS	FYI, 7/11, CNCF would like another demo of Zuul, this time with v3. :)	18:12
mordred	SpamapS: sweet	18:14
mordred	jeblair: I have responded - thanks!	18:34
*** hashar has joined #zuul		18:36
jlk	SpamapS: care to take a look at https://review.openstack.org/472468 and perhaps give it workflow?	18:43
SpamapS	jlk: looking now	19:02
SpamapS	zomg summer WFH is such a tire fire sometimes	19:17
* SpamapS is going to have to escape to a coffee shop me thinks		19:17
jlk	:(	19:18
SpamapS	jlk: review progressing :)	19:21
Shrews	mordred: after the meeting, we might need to put our heads together on your "follow_log" streaming logic. It doesn't seem to stop streaming on EOF	19:43
mordred	Shrews: kk. the logic for serving on the node itself? or the logic for reading it on the executor?	19:45
Shrews	mordred: the serving code	19:46
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Implement a cache of github change objects https://review.openstack.org/472468	19:46
Shrews	mordred: like, maybe if the file has NOT been truncated, is it ok to just return False since we've only gotten to that point (i think) if we've read all of the file?	19:47
Shrews	returning False should terminate the streaming	19:47
mordred	Shrews: well, that is on purpose. it's basically doing a tail -f of the log and doesn't know if more things wil be added	19:47
mordred	Shrews: HOWEVER - now that we're doing logfile-per-command, we could potentially change that logic, since we don't need to support subsequent commands appending to the file	19:48
Shrews	mordred: that might work for the ansible plugin, but i'm talking about in reference to the finger streamer (where i copied that code)	19:49
mordred	gotcha	19:49
Shrews	doesn't seem like we'll ever really know if we've reached the true EOF	19:49
mordred	so - yah - let's rethink this part - because there have been some structural changes since it was written	19:50
mordred	Shrews: on the node, we can definitely write an EOF to the log file, because we don't re-use log files across tasks on the node anymore	19:51
mordred	Shrews: on the executor, we also know when we're done with the logfile - that's when we run zuul_console: state=absent	19:52
mordred	Shrews: so the executor streamer can write a "yup, I'm done"	19:52
mordred	Shrews: and then the finger streamer can notice that and stop	19:53
mordred	Shrews: right?	19:53
jeblair	mordred, Shrews: the executor streamer knows it's done when the file is deleted	19:53
Shrews	hrm, yeah. maybe a presence check is what's needed there	19:54
jeblair	so while the node streamer is doing a tail -f (and i agree, that can probably change now), the executor streamer can replace that "did the file get truncated?" check with a "did the file get deleted?"	19:54
mordred	jeblair: yes. agree	19:55
Shrews	right, we won't ever truncate that log	19:55
jeblair	though, we'll have to think about startup -- what if someone requests a stream before the job starts? then the file won't exist -- yet.	19:55
Shrews	the finger streamer returns "log file not found" in that case	19:55
* SpamapS smells a lockfile coming		19:55
jeblair	however, the executor doesn't return the streaming information/url until after the job starts... so... maybe it's okay?	19:55
jeblair	well, it sends it right before it runs ansible for the first time	19:56
Shrews	or, if the build dir doesn't exist yet, "invalid build uuid"	19:56
jeblair	so... i guess there's a tiny race there.	19:56
jeblair	Shrews: aha!	19:56
jeblair	Shrews: build dir exists, but log file doesn't means that the job hasn't started yet. so the streamer can wait for the log to show up (if we want to be nice).	19:56
jeblair	Shrews: build dir doesn't exist means the job is finished. terminate connection.	19:57
Shrews	jeblair: yeah, we could be nicer that way	19:57
mordred	++ to both	19:58
mordred	btw - the executor streamer can stop streaming even before the file is deleted if we want - "[Zuul] Task exit code: %s" is the last line emitted by the command module	20:00
mordred	OH - I say that - I actually see a logic flaw there	20:00
jeblair	mordred: there may be more tasks, so the executor streamer can't stop, right?	20:02
openstackgerrit	Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Implement pipeline requirement of github labels https://review.openstack.org/473962	20:02
mordred	jeblair: it can stop streaming that task	20:02
jeblair	mordred: oh, as a client. yes.	20:02
Shrews	jeblair: file exists check seems to work. good idea	20:02
jeblair	(i thought you made it do that? i recall a check for that line)	20:02
mordred	jeblair: it can also stop streaming that particular file	20:02
mordred	jeblair: I do on the executor	20:02
mordred	jeblair: but I was suggesting we could do the same check in the node streamer so that the thread streaming a given file could terminate	20:03
jeblair	mordred: ah, you said 'executor streamer' above. okay, resetting --	20:03
mordred	sorry. we have many pieces	20:04
jeblair	mordred: yeah, i think that's okay. just one thing we'll need to make sure of is that in both places (node and executor streamers), we flush all the data output before we close any open connections.	20:05
mordred	++	20:05
pabelanger	So, has anyboby tried using SSL with python gear recently? I keep getting SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:661) back from my SSL certs I have generated	20:06
Shrews	pabelanger: i added that code loooooong ago (4+ years?). worked at the time. :)	20:06
jeblair	Shrews: i think you added tests too? :)	20:07
pabelanger	Ya, that's what I see. I've been following https://dst.lbl.gov/~boverhof/openssl_certs.html for my server / client / ca certs	20:07
Shrews	wow, /me just realizing how long he has been in this openstack stuff	20:07
jeblair	Shrews: don't look down	20:07
pabelanger	I am starting to thing ssl_version=ssl.PROTOCOL_TLSv1 might be part of the issue	20:07
SpamapS	indeed	20:08
jeblair	ah, we use fixed certs in the tests, so we're not generating new ones or anything	20:08
jeblair	pabelanger: what emits that error?	20:11
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Make sure we always log the exit line https://review.openstack.org/473966	20:11
pabelanger	jeblair: gear.Server: http://paste.openstack.org/show/612463/	20:13
pabelanger	afk for a few minutes, need to stretch legs	20:13
openstackgerrit	David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Stop streaming from server when log file removed https://review.openstack.org/473967	20:13
Shrews	jeblair: mordred: ^^^ this seems to work	20:13
jeblair	pabelanger: are you sure the client is using ssl?	20:14
*** jkilpatr has quit IRC		20:15
pabelanger	jeblair: I believe so, let me confirm again	20:17
jeblair	pabelanger: there should be a "Using SSL" debug log line right after "Connecting to"	20:18
SpamapS	you can always 'openssl s_client -connect foo:4730' and try some admin commands :)	20:19
jeblair	ya	20:19
pabelanger	that is a good idea	20:20
pabelanger	jeblair: I don't see that, sorry was looking	20:20
pabelanger	let me first debug with geard client	20:20
*** jkilpatr has joined #zuul		20:31
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Use threads instead of processes in zuul_stream https://review.openstack.org/472850	20:38
openstackgerrit	David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add log streaming test https://review.openstack.org/471079	20:39
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Stop streaming from server when log file removed https://review.openstack.org/473967	20:39
Shrews	jeblair: okie dokie, reworked that ^^^ to use the thread for streaming. Although I could remove a couple of sync points from the job itself, I had to add some artificial spinwaits to wait for things to get populated	20:40
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Don't wait for forever to join streamer https://review.openstack.org/472839	20:40
Shrews	jeblair: oh, this is fun. ever seen this seeming non-related fail? http://paste.openstack.org/show/612464/	20:42
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Use display.display for executor debug messages https://review.openstack.org/472840	20:42
jeblair	Shrews: no, but i've never had a config/streamer/ directory -- did you lose that in some git operation somehow?	20:43
Shrews	jeblair: nope. that's being added for that test	20:44
Shrews	happened while running the test 100 times	20:44
Shrews	failed on run 71	20:44
jeblair	Shrews: yeah, but it looks like it's not where it's expected on your local system	20:45
jeblair	Shrews: did you perform any git operations while running those tests in the background?	20:45
Shrews	hrm, maybe doing the 'git review' during the run caused issues	20:45
Shrews	yeah	20:45
Shrews	must be it	20:45
jeblair	yes, it absolutely did as part of the rebase check.	20:45
* Shrews makes mental note to not do that		20:45
jeblair	Shrews: or 'git review -R' if you really want to	20:46
jeblair	clarkb: can you take a look at 472964 ?	20:48
clarkb	jeblair: mordred my preference on the log indexing/parsing/etc side of things is to not make things conditionally different	20:51
clarkb	it complicates the rules necessary to parse and index and query	20:51
jeblair	clarkb: generally, i agree -- is it worth having the "hostname" (eg, ubuntu-xenial) in every log line on every single-node job though? or is that a parsing complexity we'd be willing to accept in order to simplify the most common logs?	20:52
mordred	clarkb: ok- so the prepended hostname on every logline you don't think will bother folks?	20:52
clarkb	it wouldn't bother me, we often have people confused about what jobs ran on and that would make it pretty explicit	20:53
mordred	that's a good point	20:53
jeblair	(yes, though i think even now our pre-playbook outputs that at the top of the job)	20:53
clarkb	I don't feel strongly about it. The difference in logstash rules for that is an optional match group for hostname	20:55
clarkb	so fairly straightforward	20:56
mordred	jeblair: I can split the if len > 1 bit out into its own patch and put it at the end if we want to get the rest of that stack in but till talk about the hostname part	20:57
mordred	oh - I guess there is nothing pressing on top of it	20:57
jeblair	i also don't feel strongly. however, based on what i've seen, i think perhaps this is one of those cases where we can make users' lives better at a small cost to our logstash config complexity. so if it's not too hard to do, let's do that. if there's a compelling user story for always having it, we can always stick it back in.	20:58
mordred	jeblair: kk	20:58
clarkb	is there a problem we are trying to solve with it? just concern that too much info will make logs harder to read?	20:59
clarkb	and if ^ is the case why are we not worried about it for multinode?	20:59
jeblair	clarkb: well, it's required by multinode in order to have any hope of reading it	21:00
mordred	clarkb: yah. but for single node it winds up being a change from the current log format	21:00
clarkb	right, but if users are capable of that wouldn't they be capable of it on signle node too?	21:00
mordred	so since it's a change, it made me want to think about whether the change would increase or decrease user happiness	21:00
clarkb	I'm personally a fan of consistency	21:01
clarkb	also journalctl outputs with hostname by default too	21:01
jeblair	i think users will be able to handle it; i think it just might be annoying -- the same thing repeated on every line... though now that i say that... mordred, does the hostname perhaps make an easy way to distinguish between "ansible" log lines vs "console" log lines?	21:02
clarkb	(so having hostname there would potentialyl make it easier on people used to journalctl output)	21:02
jeblair	(now i wish i had one of these handy, but we're broken right now)	21:02
mordred	clarkb: is there an example default journalctl output lying around anywhere?	21:02
clarkb	mordred: just run journalctl -f on your laptop :)	21:02
clarkb	Jun 13 08:49:10 nibbler systemd[2605]: Started XFCE notifications service.	21:02
clarkb	for example	21:02
clarkb	nibbler is the hostname here	21:02
jeblair	http://zuulv3-dev.openstack.org/logs/0a06c318cf2340e7a99f08a7b6127164/ansible_log.txt	21:03
mordred	jeblair: yah - but a bunch has changed there	21:03
jeblair	mordred: hrm, it kind of does -- ignoring the stuff we've changed (like fixing dual timestamps, etc) -- the hostname does kind of make a nice indication that something is a console line vs an ansible line...	21:03
mordred	jeblair: yah - it's not terrible. why don't we just keep it for a bit and after we read logs for a little while revisit?	21:05
mordred	jeblair: so I'll pull just that bit out of the stack but leave the other bits so that it's trivial to re-add if we want	21:05
jeblair	mordred: sgtm!	21:05
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Change log streaming link to finger protocol https://review.openstack.org/437764	21:09
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Extract get_playhosts listing in zuul_stream to a method https://review.openstack.org/472964	21:09
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Only prepend hostname on multi-node plays https://review.openstack.org/473985	21:09
mordred	jeblair: kk. there is is reorganized a bit	21:10
pabelanger	okay	21:15
pabelanger	would love some body to validate my steps for SSL and gearman: http://paste.openstack.org/show/612468/	21:16
pabelanger	taking a break before I dive back into it	21:20
*** dkranz has quit IRC		21:27
Shrews	pabelanger: it may be that self-signed certs won't work? i seem to recall some sort of issue around those, but i really don't remember	21:32
SpamapS	pabelanger: I usually use CA.sh	21:32
jeblair	Shrews: i think pabelanger's example signs the certs with a ca key, so they shouldn't be (only) self-signed	21:33
Shrews	jeblair: k. like i said, my memory on that is fuzzy. 'self-signed' brought up bad feelings that i can't put a finger on atm	21:35
Shrews	wish i could remember, but that would risk bring up memories of libra	21:40
Shrews	shudder	21:40
jeblair	pabelanger: i confirm your error	21:47
jeblair	pabelanger: this process, however, works for me: http://paste.openstack.org/show/612472/	21:48
jeblair	pabelanger: i suspect the problem is the expiration -- if i look at the certs i generate with your instructions, their begin and end dates are the same	21:57
jeblair	pabelanger: oops, i'm wrong about that, nevermind	21:58
jeblair	pabelanger, Shrews: oh wait -- i think those might be self-signed!	22:01
jeblair	i mean, there's a CA cert there and all	22:01
jeblair	but i think if you enter the same values for the subject information, you get that error	22:02
jeblair	pabelanger: i can't tell what you entered since you didn't include that in the paste, or use the -subj line	22:02
jeblair	but that is something that's different about my procedure -- i use -subj and provide something different for each cert	22:03
pabelanger	jeblair: I just used the default values provided by openssl, I just pressed enter for most prompts	22:05
pabelanger	let me make sure client and server different values	22:05
pabelanger	jeblair: yes, that was the issue. By using a different subject for client / server / ca, I no longer get errors. Thank you	22:09
*** hashar has quit IRC		22:12
*** hashar has joined #zuul		23:02
openstackgerrit	Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Implement pipeline reject filter for github https://review.openstack.org/474001	23:10
* jlk hangs up his hat for the day.		23:11
*** hashar has quit IRC		23:28

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!