openstackgerrit | Paul Belanger proposed openstack-infra/nodepool master: Include host_id for openstack provider https://review.openstack.org/623107 | 00:08 |
pabelanger | corvus: clarkb, mordred: Shrews: ^ first 1/2 to collect host_id for openstack providers, this is to better help openstack-infra collect information for the current jobs timing out in a cloud. Interested in your thoughts, feedback. | 00:09 |
clarkb | pabelanger: well it's not a feature in any of the clouds right? | 00:10 |
clarkb | or does it pull from the api instead? | 00:10 |
clarkb | ah ya it is in the api interesting | 00:10 |
pabelanger | yah, looking at openstacksdk, we should get it | 00:11 |
pabelanger | nodepool dsvm test should help confirm | 00:11 |
clarkb | pabelanger: a common ish thing for me when debugging is to take the test node id and grep for that in the launcher debug log | 00:11 |
clarkb | that gets me lines like 2018-12-05 17:08:28,545 DEBUG nodepool.NodeLauncher-0000956882: Waiting for server 0b056afb-88e9-4d0f-8b3c-13f8363d7af2 for node id: 0000956882 and 2018-12-05 17:08:58,596 DEBUG nodepool.NodeLauncher-0000956882: Node 0000956882 is running [region: BHS1, az: nova, ip: 158.69.66.132 ipv4: 158.69.66.132, ipv6: 2607:5300:201:2000::576] | 00:12 |
clarkb | if we added the uuid and the host_id to the second line there, that would be a major win for me I think | 00:12 |
pabelanger | clarkb: kk, current patch doesn't log host_id, but should add it | 00:15 |
pabelanger | will do that in ps2 | 00:15 |
clarkb | pabelanger: if you do expose it on the zuul side too adding in the instance uuid to the zuul side would be helpful too I think | 00:16 |
clarkb | not sure if that is already there | 00:16 |
jhesketh | panda: perhaps long term there can be enough automation to actually run through the playbooks, but for now I was planning on preparing all the playbooks and tasks locally and spitting out the ansible-playbook commands that the user would need to run. The user can then modify the playbooks and set up an itinerary to match their local environment. | 00:17 |
jhesketh | To do it fully automatically we'd have to build in extra flags to point to hosts etc, and/or build in cloud launching functionality. Which is something I'd like to see, but as a part 2 or separate tool even. eg, you give the tool your cloud credentials and it does the rest. But it'd need to know a lot more about the image building | 00:18 |
pabelanger | clarkb: k, I'll look at uuid also | 00:23 |
clarkb | pabelanger: the nice thing about those two messages is I get commonly needed info (uuid, ip addrs, etc) | 00:26 |
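A hypothetical sketch of the kind of log line clarkb is asking for here — the format mirrors the "Node ... is running" message pasted above, but the helper and the node attribute names (external_id, host_id, etc.) are assumptions, not the actual nodepool NodeLauncher code:

```python
import logging

log = logging.getLogger("nodepool.NodeLauncher-0000956882")

def log_node_running(node):
    # Hypothetical helper: log the cloud-side uuid (external_id) and host_id
    # alongside the fields already shown above, so one grep on the node id
    # yields everything needed to correlate with a cloud-side incident.
    log.debug(
        "Node %s is running [region: %s, az: %s, ip: %s ipv4: %s, ipv6: %s, "
        "uuid: %s, host_id: %s]",
        node.id, node.region, node.az, node.interface_ip,
        node.public_ipv4, node.public_ipv6, node.external_id, node.host_id)
```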
*** manjeets_ has joined #zuul | 01:49 | |
*** manjeets has quit IRC | 01:51 | |
*** bhavikdbavishi has joined #zuul | 02:41 | |
*** bhavikdbavishi1 has joined #zuul | 02:44 | |
*** bhavikdbavishi has quit IRC | 02:45 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 02:45 | |
*** rlandy|bbl is now known as rlandy | 03:09 | |
*** rlandy has quit IRC | 03:10 | |
openstackgerrit | Paul Belanger proposed openstack-infra/nodepool master: Include host_id for openstack provider https://review.openstack.org/623107 | 03:12 |
*** bjackman has joined #zuul | 04:28 | |
bjackman | Is there a way to get your config-project changes tested pre-merge in a post-review pipeline? I tried but it didn't work; not sure if this is because of a config error on my part or just the way Zuul is | 05:59 |
bjackman | Ah OK, I think the real answer to my question is that where I have shared config that I want to be tested pre-merge, that should go in a shared untrusted project (equivalent to the zuul-jobs one) | 06:35 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: update status page layout based on screen size https://review.openstack.org/622010 | 06:43 |
*** goern has quit IRC | 06:58 | |
*** goern has joined #zuul | 07:08 | |
*** bhavikdbavishi has quit IRC | 07:13 | |
*** gtema has joined #zuul | 07:32 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Report tenant and project specific resource usage stats https://review.openstack.org/616306 | 07:33 |
*** pcaruana has joined #zuul | 07:58 | |
*** pcaruana is now known as muttley | 07:58 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: refactor jobs page to use a reducer https://review.openstack.org/621396 | 08:06 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: refactor job page to use a reducer https://review.openstack.org/623156 | 08:06 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: refactor tenants page to use a reducer https://review.openstack.org/623157 | 08:06 |
*** themroc has joined #zuul | 08:48 | |
*** AJaeger has quit IRC | 08:49 | |
*** AJaeger has joined #zuul | 08:51 | |
*** bhavikdbavishi has joined #zuul | 08:55 | |
*** sshnaidm|afk has quit IRC | 09:45 | |
*** sshnaidm|afk has joined #zuul | 09:46 | |
*** bhavikdbavishi has quit IRC | 09:49 | |
*** electrofelix has joined #zuul | 10:04 | |
*** dkehn has quit IRC | 10:05 | |
*** sshnaidm|afk is now known as sshnaidm | 10:12 | |
*** sshnaidm has quit IRC | 10:33 | |
*** sshnaidm has joined #zuul | 10:34 | |
*** jesusaur has quit IRC | 11:27 | |
*** jesusaur has joined #zuul | 11:31 | |
*** bhavikdbavishi has joined #zuul | 11:48 | |
*** sshnaidm is now known as sshnaidm|bbl | 12:08 | |
*** dkehn has joined #zuul | 12:39 | |
*** bjackman has quit IRC | 12:42 | |
*** gtema has quit IRC | 12:46 | |
*** bjackman has joined #zuul | 12:47 | |
*** rlandy has joined #zuul | 12:58 | |
*** muttley has quit IRC | 13:08 | |
*** bjackman has quit IRC | 13:09 | |
*** muttley has joined #zuul | 13:21 | |
*** muttley has quit IRC | 13:25 | |
*** muttley has joined #zuul | 13:26 | |
*** muttley has quit IRC | 13:29 | |
*** pcaruana has joined #zuul | 13:34 | |
*** pcaruana has quit IRC | 13:39 | |
*** rfolco has quit IRC | 13:41 | |
*** rfolco has joined #zuul | 13:41 | |
*** gtema has joined #zuul | 13:42 | |
*** pcaruana has joined #zuul | 13:43 | |
*** pcaruana has quit IRC | 13:47 | |
*** bhavikdbavishi has quit IRC | 13:53 | |
*** gtema has quit IRC | 13:53 | |
*** smyers_ has joined #zuul | 13:57 | |
*** smyers has quit IRC | 13:57 | |
*** smyers_ is now known as smyers | 13:57 | |
Shrews | corvus: tobiash: fwiw, i don't think https://review.openstack.org/622403 made much impact. I'm still seeing lots of empty nodes being left around (but thankfully cleaned up now) | 14:06 |
tobiash | Shrews: ok, so maybe we should consider switching to sibling locks | 14:07 |
tobiash | but that would be a harder transition and might require a complete synchronized zuul + nodepool upgrade and shutdown | 14:08 |
Shrews | yes, a bit more involved to do that | 14:08 |
Shrews | but at least not urgent now | 14:08 |
Shrews | at least we've learned something new about using zookeeper! :) | 14:10 |
Shrews | child locks + znode deletion == bad news | 14:10 |
tobiash | yepp :) | 14:13 |
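A minimal kazoo sketch of the hazard Shrews names above (the znode paths are hypothetical and this is not the actual nodepool code): when the lock lives as a child of the node's znode, a recursive delete of that znode destroys the lock out from under its holder.

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1:2181")
zk.start()

node_path = "/nodepool/nodes/0000956882"           # hypothetical node znode
lock = zk.Lock(node_path + "/lock", "launcher-a")  # lock is a *child* of the node

with lock:
    # Another worker that decides the node is gone can do this at any time:
    #   zk.delete(node_path, recursive=True)
    # The recursive delete removes the lock contender znodes too, so the
    # "held" lock silently evaporates and a second worker can acquire it
    # while we still think we own the node.
    pass
```

Sibling locks (a lock znode next to, rather than under, the node znode) avoid this, which is the transition tobiash mentions next.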
*** gtema has joined #zuul | 14:26 | |
*** smyers has quit IRC | 14:32 | |
*** smyers has joined #zuul | 14:32 | |
*** gtema has quit IRC | 14:44 | |
*** sshnaidm|bbl is now known as sshnaidm | 14:51 | |
mordred | tobiash: that reducers stack is really nice | 14:55 |
mordred | gah | 14:55 |
mordred | tristanC: ^^ | 14:55 |
mordred | t <tab> is a fail :) | 14:55 |
tobiash | :) | 14:55 |
mordred | tobiash: I approved the stack except for the last 2 | 14:56 |
tobiash | mordred: k, I'll check that out latest | 14:57 |
tobiash | later | 14:57 |
mordred | ++ | 14:57 |
tobiash | mordred: lgtm but I'm not feeling competent enough to +a it. | 15:02 |
*** njohnston_ is now known as njohnston | 15:10 | |
mordred | tobiash: yeah. these javascripts are just about at the edge of my brain abilities | 15:17 |
ssbarnea|rover | can we do something to avoid zuul spam with "Waiting on logger"? as in http://logs.openstack.org/30/621930/2/gate/tripleo-ci-centos-7-standalone/4fb356a/job-output.txt.gz | 15:18 |
mordred | ssbarnea|rover: we should probably instead figure out what broke the log streamer - do y'all reboot any of the VMs? | 15:27 |
mordred | or, alternately, if you're doing iptables on the vms those could be blocking access to the log streamer daemon | 15:28 |
rlandy | hello - I am testing out zuul static driver for use with some ready provisioned vms. I followed the nodepool.yaml configuration per https://zuul-ci.org/docs/zuul/admin/nodepool_static.html. The playbook setting up the multinode bridge fails - I think due to the fact that the private_ipv4 is set to null. The public_ipv4 value is populated with the 'name' ip. How can I get the static driver to set a private_ipv4? | 15:28 |
tobiash | rlandy: the static driver only knows one ip address so you need to set the private_ipv4 in your job if you're depending on it (or maybe do a fallback when setting up the multinode bridge) | 15:30 |
rlandy | tobiash: ok - set_fact on hostvars[groups['switch'][0]]['nodepool']['private_ipv4']? | 15:32 |
rlandy | setting a fallback would mean editing this role: https://github.com/openstack-infra/zuul-jobs/blob/master/roles/multi-node-bridge/tasks/peer.yaml#L16 | 15:33 |
tobiash | rlandy: you need the hostvars[]... cruft only for setting facts for a different machine | 15:33 |
rlandy | and I am not sure other users will be open to my editing that for the static driver case | 15:33 |
rlandy | tobiash: yep - ack - thanks for your help | 15:33 |
*** jhesketh has quit IRC | 15:34 | |
tobiash | rlandy: yes, I'm not familiar with this role so someone else (mordred, AJaeger ?) might be of help with the multi-node-bridge role | 15:34 |
*** jhesketh has joined #zuul | 15:35 | |
mordred | rlandy: that role should work on nodes that don't have a private ip though - we have clouds that give us vms with no private ip | 15:37 |
mordred | we should check with clarkb when he wakes up | 15:37 |
rlandy | mordred: looking at the inventory, I saw the private_ipv4 set to null and I assumed that was the cause of the error. I could be wrong. I am testing it out again with private_ipv4 set | 15:39 |
mordred | kk | 15:39 |
mordred | it also might not be terrible to allow setting a private_ipv4 in the static driver since it's a value we provide for the dynamic nodes too | 15:40 |
rlandy | or default to the public_ipv4 if private_ipv4 is null | 15:50 |
ssbarnea|rover | mordred: i didn't do anything myself and I see the "[primary] Waiting on logger" error on multiple jobs during the last 7 days. i don't know how to do a group-by in logstash to identify a pattern. | 15:55 |
Shrews | i wonder why that role is using private_ipv4 and not interface_ip | 16:00 |
Shrews | that's available in the inventory | 16:01 |
Shrews | mordred: do you know? ^^ | 16:02 |
mordred | Shrews: the role uses private_ipv4 if it's there to establish the network bridge between nodes - that lets us always have a consistent network jobs can use regardless of provider differences | 16:04 |
Shrews | that makes sense | 16:04 |
*** nilashishc has joined #zuul | 16:05 | |
clarkb | private ip is set to public ip if there is no private ip | 16:18 |
clarkb | the reason to use private over public when you have both is that vxlan/gre were not reliable over nat | 16:18 |
clarkb | rather than try and debug that I decided it was easier to just avoid the issue entirely | 16:19 |
clarkb | we can either make that ip behavior consistent in nodepool drivers or update the role to make that assumption instead | 16:19 |
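A hedged sketch of the driver-side option clarkb mentions (make the behavior consistent by falling back to the public address when a provider gives no private one). The attribute names mirror the node fields discussed in this log, but this is not the actual static-driver code:

```python
def set_node_addresses(node, public_ipv4, private_ipv4=None):
    # Assumption: mimic what the OpenStack driver effectively does -- if the
    # provider has no private address, reuse the public one so roles like
    # multi-node-bridge always find private_ipv4 populated.
    node.public_ipv4 = public_ipv4
    node.private_ipv4 = private_ipv4 or public_ipv4
    node.interface_ip = public_ipv4
```

The alternative discussed is the same fallback expressed inside the zuul-jobs role itself, which is what rlandy's WIP change (623294) pursues.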
*** themroc has quit IRC | 16:26 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Read old json data right before writing new data https://review.openstack.org/623245 | 16:30 |
pabelanger | mordred: Shrews: clarkb: in https://review.openstack.org/623107/ I'm trying to collect the host_id from wait_for_server in nodepool, but it seems to be empty: http://logs.openstack.org/07/623107/2/check/nodepool-functional-py35-src/46b9fb9/controller/logs/screen-nodepool-launcher.txt.gz#_Dec_06_04_00_07_442925 but 2 lines up I can in fact see hostId. Any ideas why that would be? | 16:41 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Add appending yaml log plugin https://review.openstack.org/623256 | 16:47 |
mordred | pabelanger: looking | 16:49 |
mordred | pabelanger: no - that makes no sense | 16:51 |
mordred | pabelanger: I'm landing so can't dig too deep for a few minutes - but I'm gonna put money on a bug :( | 16:52 |
pabelanger | mordred: okay, that is what I figured also. I can start to dig into it more locally today too | 16:52 |
mordred | pabelanger: cool. I'm guessing something in the conn.compute.servers() -> to_dict() -> normalize_server() sequence | 16:54 |
mordred | pabelanger: which is new and is the first step in making the shade layer consume the underlying sdk objects | 16:54 |
mordred | although looking at it it seems like all the things are in place properly to make sure you'd end up with a host_id | 16:55 |
clarkb | ssbarnea|rover: mordred: those test nodes are very memory constrained; I wonder if OOMKiller is targeting that process if it gets invoked | 17:01 |
clarkb | ssbarnea|rover: do those jobs capture syslog? we should be able to check for OOMKiller there | 17:01 |
rlandy | clarkb: wrt private_ipv4 for drivers that only define a public_ipv4, I am happy to put in a review to default the private_ipv4 value in the role but if making the behavior consistent is possible, I think that would be better | 17:12 |
pabelanger | mordred: ack, thanks for the pointers | 17:12 |
clarkb | rlandy: we may want to do both things now that I think about it more. Consistent driver behavior from nodepool is desirable as much as possible, but the roles should manage when they aren't consistent (maybe someone is running older nodepool) | 17:13 |
Shrews | tobiash: i think i found the race in test_handler_poll_session_expired. running for a bit locally before i push up the fix | 17:14 |
tobiash | Yay :) | 17:14 |
rlandy | clarkb: understood. I'll put in the role change for my own testing at least. Currently I am hacking up the job definition which is not a good way to go | 17:15 |
corvus | Shrews: if the deleted state didn't help, should we revert that patch? (but also, any idea why it didn't work?) | 17:27 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Fix race in test_handler_poll_session_expired https://review.openstack.org/623269 | 17:39 |
Shrews | corvus: i have no idea why it didn't work. as for removing it, the only upside to doing so is an easier upgrade path for operators (the code itself doesn't hurt anything afaict) | 17:40 |
corvus | tobiash: heads up on https://review.openstack.org/620285 | 17:41 |
Shrews | corvus: downgrading now might be tricky (we'd have to make sure there are no DELETED node states before we restarted) | 17:41 |
Shrews | an unlikely scenario, but we'd still have to check | 17:41 |
corvus | Shrews: hrm, maybe we keep DELETED in there for a little while? | 17:41 |
corvus | maybe until after the next release... | 17:41 |
corvus | sorry let me clarify | 17:41 |
corvus | maybe we should keep DELETED as an acceptable state, but remove the code which sets it | 17:42 |
corvus | then after the next release, also remove the state | 17:42 |
Shrews | corvus: it's possible that change is at least a little helpful, too. but hard to quantify | 17:42 |
corvus | i think one of two plans makes sense: 1) debug the DELETED patch and figure out how to make it work; or 2) agree that we should switch to sibling locks, remove the deleted state, and rely on the cleanup worker until we make the switch. | 17:43 |
Shrews | i think 2 is the real solution, but much harder to get there | 17:44 |
SpamapS | looks like /build/xxx doesn't know how to 404. | 17:45 |
corvus | yeah, we'll need coordination between zuul and nodepool for that | 17:45 |
SpamapS | it just... waits | 17:45 |
corvus | SpamapS: :( | 17:45 |
SpamapS | Ya.. throw it on the bug pile? ;-) | 17:45 |
corvus | SpamapS: there's sevear lines of code about returning a 404 in there | 17:45 |
corvus | wow | 17:45 |
corvus | several | 17:45 |
corvus | SpamapS: it does take a while, but http://zuul.openstack.org/api/build/foo returns 404 | 17:46 |
corvus | a while=3.7s | 17:46 |
clarkb | ssbarnea|rover: mordred: Looking at logs for cases with Waiting on logger. This happens when the run playbook seems to die with "[Zuul] Log Stream did not terminate". Then the post run playbook has the Waiting on logger errors, presumably because port 19885 is still held by the existing log stream daemon? | 17:47 |
tobiash | corvus: thanks, looking | 17:47 |
clarkb | ssbarnea|rover: mordred: It also seems that when this happens we have incomplete log stream for the run playbook, but ara shows that things kept going behind the scenes | 17:48 |
clarkb | http://logs.openstack.org/25/620625/2/gate/tripleo-ci-centos-7-standalone/70949b6/job-output.txt.gz#_2018-12-06_17_21_48_790889 is an example. This particular job failed trying to run delorean | 17:48 |
SpamapS | hm I may not have waited 3.7s | 17:48 |
clarkb | I don't find any OOMs so it is possibly a bug in zuul (with the cleanup of the streamer in run failing hence talking about it here and not in -infra) | 17:48 |
corvus | SpamapS: i don't see an index on uuid; that's probably why the response is slow | 17:48 |
corvus | so for us, it's 3.7 seconds for a table scan i guess | 17:49 |
SpamapS | def worth an index, but the UI still doesn't show the 404 | 17:49 |
SpamapS | Loading... forever | 17:49 |
corvus | (yeah, the table scan part of it takes 2.39s for us) | 17:50 |
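For the slow /api/build/<uuid> lookup corvus diagnoses here, the remedy would be an index on the uuid column. A hypothetical Alembic sketch — the table name, index name, and revision identifiers are assumptions, not zuul's actual migration:

```python
"""Hypothetical Alembic migration: index the build uuid column."""
from alembic import op

# revision identifiers (made up for this sketch)
revision = 'add_build_uuid_index'
down_revision = None

def upgrade():
    # Lets /api/build/<uuid> use an index lookup instead of the full-table
    # scan corvus measured at roughly 2.4 seconds.
    op.create_index('ix_build_uuid', 'zuul_build', ['uuid'])

def downgrade():
    op.drop_index('ix_build_uuid', table_name='zuul_build')
```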
SpamapS | In fact /api/build/foo show Loading... too.. hmm | 17:50 |
SpamapS | Why is that an HTML page and not a json response? | 17:50 |
corvus | SpamapS: oh, maybe you need to restart zuul-web? | 17:50 |
tobiash | Shrews, corvus: while reading that stats rework, I've a side question. How can zuul cancel a node request that is currently locked by a provider trying to fulfill it? | 17:51 |
SpamapS | I haven't upgraded recently. | 17:51 |
corvus | SpamapS: i've seen zuul-web get grumpy after a scheduler restart | 17:51 |
SpamapS | oh, it seems to switch on Accept | 17:52 |
SpamapS | Because my browser is sending Accept html, it's sending me HTML | 17:52 |
Shrews | tobiash: not currently possible | 17:53 |
corvus | SpamapS: erm, we don't have anything like that. you can hit the api in your browser | 17:53 |
corvus | tobiash, Shrews: well... um, it looks like it actually just deletes the request out from under nodepool. | 17:53 |
SpamapS | hm, no it's not Accept. | 17:53 |
SpamapS | I cannot hit the api in *my* browser | 17:53 |
Shrews | corvus: well, i mean, if we want to talk about out of bounds methods... | 17:53 |
SpamapS | -> zuul.gdmny.co | 17:53 |
corvus | tobiash, Shrews: without any consideration of the lock. | 17:53 |
SpamapS | (it's still not auth walled) | 17:54 |
tobiash | ah ok, that just works ;) | 17:54 |
SpamapS | I probably messed something up in the translation from mod_rewrite to Nginx. | 17:54 |
Shrews | corvus: tobiash: is this something we are actively seeing then? i thought it was a hypothetical question | 17:54 |
SpamapS | Curl'ing my api works | 17:54 |
SpamapS | but browsering it just shows Loading... | 17:54 |
corvus | SpamapS: if i shift-reload i get an api response. | 17:54 |
tobiash | Shrews: it was a hypothetical question | 17:54 |
SpamapS | corvus: *weird* | 17:54 |
corvus | Shrews: i'm sure this must happen in openstack-infra | 17:55 |
SpamapS | And of course the javascript fetches are getting json | 17:55 |
corvus | SpamapS: there may be something weird about the javascript service worker | 17:56 |
tobiash | Shrews, corvus: I'm also thinking how we would design this for the scheduler-executor interface | 17:56 |
SpamapS | I'm guessing there's a header combination that gets you HTML. | 17:56 |
tobiash | maybe with two locks, a modify-znode-lock and a processing-lock | 17:56 |
* SpamapS is out of time to investigate though | 17:56 | |
corvus | tobiash: okay you're getting way ahead of me here. what's this have to do with the scheduler-executor interface? | 17:57 |
tobiash | or to rephrase, locks for modifying the object, and a further ephemeral node that is held during processing | 17:57 |
tobiash | corvus: I'm thinking about the scale out scheduler | 17:58 |
corvus | tobiash: i know | 17:58 |
corvus | tobiash: oh, you're thinking we need a distinct lock for "i'm running the job" and a separate lock for modifying the job information | 17:58 |
tobiash | so I thought, if the executor holds the lock during processing, how can we cancel a build? | 17:58 |
tobiash | exactly | 17:58 |
tobiash | actually that's the same with the node-requests that are now just deleted | 17:59 |
corvus | tobiash: i agree, the situations are similar. | 17:59 |
corvus | we could probably do either thing: 2 locks, or, accept that "delete node out from under the lock" is a valid API :) | 18:00 |
corvus | i'm not sure if the current situation with requests is on-purpose or accidental. i'm not sure what nodepool will do at this point if the request disappears from under it. especially with the cache changes. | 18:01 |
tobiash | corvus: yes, for node-requests, but for jobs on the executor we might want not to delete it but leave it within the pipeline (if we follow your suggestion that the executors take their jobs directly from the pipeline data in zk) | 18:02 |
tobiash | corvus: with the cache changes the object is removed from the list, but if some other code path currently processes it it probably just gets errors when locking or saving the node | 18:03 |
Shrews | corvus: tobiash: well, nodepool explicitly looks for node requests to disappear during handling as an assumed error condition. we could just pretend that's a proper stop-now-please api | 18:03 |
corvus | tobiash: if we wanted to, i think we could get rid of canceled build records faster than we do now. if we wanted to, we could just delete the build record and have the executor detect that and abort. i'm not saying let's do it that way, but i do think it's an option. | 18:03 |
clarkb | ssbarnea|rover: mordred http://logs.openstack.org/25/620625/2/gate/tripleo-ci-centos-7-standalone/70949b6/logs/undercloud/var/log/journal.txt.gz#_Dec_06_16_09_23 I think maybe that is the issue. Running out of disk space? The log streaming reads off of disk and I could see where maybe the reads and the writes get sad if we run out of disk? | 18:04 |
corvus | Shrews: retro-engineering! | 18:04 |
corvus | Shrews: the fact that we're only now thinking about it probably means it's working okay :) | 18:04 |
Shrews | http://git.openstack.org/cgit/openstack-infra/nodepool/tree/nodepool/driver/__init__.py#n680 | 18:04 |
corvus | Shrews: that almost looks purposeful | 18:04 |
Shrews | right? | 18:05 |
corvus | Shrews: i'm feeling generous; i'm going to assume we engineered it that way and forgot :) | 18:05 |
Shrews | totes | 18:05 |
Shrews | corvus: that's how i (now) remember it | 18:05 |
corvus | ++ | 18:05 |
clarkb | ssbarnea|rover: mordred what is odd there is the job seems to start in that state, so this may be unrelated; however that could also possibly explain why delorean failed | 18:05 |
tobiash | corvus: ok, I think the delete will work in the executor case too | 18:07 |
corvus | tobiash: i think as part of this, we probably want to have the executor hold the lock on nodes in the future. so scheduler deletes build record; executor detects that and aborts job and releases node locks. | 18:08 |
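A small kazoo sketch of the "delete the record and let the holder notice" pattern corvus describes here (the paths, callback, and abort helper are hypothetical, not zuul code):

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1:2181")
zk.start()

build_path = "/zuul/builds/abcd1234"  # hypothetical build record znode

def abort_running_build():
    # Placeholder for the executor's "abort the job, release node locks" step.
    pass

def on_build_event(event):
    # One-shot watch callback: fires when the scheduler deletes (or changes)
    # the build record; deletion is the cancel signal discussed above.
    if event.type == "DELETED":
        abort_running_build()

# Register the watch; the executor does not need a lock on the record
# itself to notice that it went away.
zk.exists(build_path, watch=on_build_event)
```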
Shrews | corvus: tobiash: actually, we may want to move that nodepool check up a bit in the code. it will only reach that point if it's done launching all requested nodes | 18:09 |
corvus | Shrews: ++ should save some time | 18:09 |
Shrews | the 'if not launchesComplete(): return' is above that | 18:09 |
corvus | Shrews: though... that may be complex | 18:10 |
Shrews | yeah | 18:10 |
Shrews | just thinking of the consequences... | 18:10 |
corvus | Shrews: it's okay if we complete the request and then end up with some extra ready nodes. but if we want to abort mid-launch it'll get messy. | 18:10 |
tobiash | corvus: probably makes sense, I'll think about it | 18:10 |
SpamapS | corvus: btw, it's possible that my API being behind CloudFlare could cause weirdness. | 18:11 |
corvus | tobiash: i think the main driver there is -- it's really the executor using the nodes. a distributed scheduler may get restarted at any time and should have no consequence to running jobs. only if the executor running the job is restarted should the nodes be returned. | 18:11 |
SpamapS | I already can't use encrypt_secret.py through it. | 18:11 |
SpamapS | (CloudFlare blocks unknown user agents and you have to pay to whitelist things, something we'll do.. but.. not today ;) | 18:12 |
corvus | SpamapS: you may want to ask tobiash about building the web dashboard without support for service workers and see if that fixes any weirdness. | 18:12 |
tobiash | corvus: totally correct, I'm just thinking about who should request the nodes. I think this should still be the pipeline processor | 18:12 |
corvus | tobiash: yes. the hand-off to an executor will be a neat trick. :) | 18:12 |
tobiash | and the executor holding the lock on the nodes is absolutely the right thing | 18:13 |
tobiash | yeah, so the scheduler requests it, but the executor that got the job accepts it and locks the nodes | 18:14 |
clarkb | mordred: ssbarnea|rover ok where the run streaming stops in that job there is a nested ansible run which ara reports was interrupted; data will be inconsistent | 18:16 |
clarkb | and from that point forward we stop getting streaming. So something is happening there that affects more than just zuul | 18:16 |
openstackgerrit | Merged openstack-infra/zuul master: web: break the reducers module into logical units https://review.openstack.org/621385 | 18:20 |
*** electrofelix has quit IRC | 18:26 | |
*** cristoph_ has joined #zuul | 18:30 | |
SpamapS | So... I'm about to submit a slack notifier role for Zuul... wondering if we should stand up a slack (they're free) just for running test jobs. | 18:31 |
SpamapS | Also.. Ansible 2.8 has added a threading mechanism to the slack module that would be super useful for threading based on buildset.... wondering how we're doing on catching up to Ansible any time soon. | 18:32 |
Shrews | it feels like we just caught up to ansible, like, last week | 18:34 |
Shrews | we need to slow their momentum :) | 18:34 |
* Shrews plots an inside attack | 18:34 | |
tobiash | corvus: I'm +2 on 620285 | 18:38 |
tobiash | corvus: do we need to announce such a change on the mailing list? | 18:39 |
SpamapS | <2.6 .. wasn't 2.6 like.. over a year ago? | 18:40 |
tobiash | SpamapS: nope, we switched to 2.5 just one week before 2.6 has been released. And that was this year ;) | 18:41 |
tobiash | that has been merged in june... (https://review.openstack.org/562668) | 18:43 |
corvus | we need to find a volunteer for the support multiple ansible versions work | 18:43 |
clarkb | SpamapS: openstack infra's testing of ansible 2.8 shows that handlers are all going to break unless things are changed in 2.8 before release | 18:44 |
tobiash | I think I could at least support | 18:44 |
clarkb | so ya ++ to multi ansible instead | 18:44 |
SpamapS | Oh right ok 2.6 was July | 18:52 |
SpamapS | Seems like multi-ansible is a virtualenv+syntax challenge, yeah? | 18:53 |
tobiash | SpamapS: plus possibility to pre-install | 18:54 |
tobiash | (my zuul doesn't really have access to the internet) | 18:55 |
clarkb | if we used the venv module we should be able to document steps or supply a script to preinstall venv virtualenvs for zuul just as it would on demand | 18:55 |
SpamapS | There are some interesting things to think through, like: do we have 1 executor : many ansibles, or just 1:1 executor:ansible and make the ansible version a thing executors subscribe to (like "hey I can do 2.6") | 18:55 |
tobiash | SpamapS: also we need to inject different versions of the command module into different versions of ansible | 18:56 |
clarkb | tobiash: that is the biggest challenge I think | 18:56 |
SpamapS | Is there no way we can write one that works for all supported versions? | 18:56 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Fix race in test_handler_poll_session_expired https://review.openstack.org/623269 | 18:56 |
SpamapS | and just always inject that in to the module path.. | 18:56 |
tobiash | SpamapS: it could be that the latest one works with all of them by accident, but we don't know | 18:57 |
SpamapS | Anyway, yeah, would be great to have multi-version support, especially with how fast Ansible seems to be moving/breaking. | 18:57 |
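A rough sketch of the pre-install idea clarkb and tobiash raise above: build one virtualenv per supported Ansible release ahead of time, so deployments without internet access don't need to install on demand. The install root and version pins are assumptions for illustration only:

```python
import subprocess
import sys
from pathlib import Path

ANSIBLE_VERSIONS = ["2.5.15", "2.6.11", "2.7.5"]  # assumed version pins
BASE = Path("/var/lib/zuul/ansible")               # assumed install root

for version in ANSIBLE_VERSIONS:
    env = BASE / version
    # Create an isolated venv per Ansible release...
    subprocess.run([sys.executable, "-m", "venv", str(env)], check=True)
    # ...and pre-install the pinned Ansible into it (pip could be pointed at
    # a local mirror for air-gapped installs).
    subprocess.run([str(env / "bin" / "pip"), "install",
                    f"ansible=={version}"], check=True)
```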
*** nilashishc has quit IRC | 19:04 | |
openstackgerrit | Ronelle Landy proposed openstack-infra/zuul-jobs master: WIP: Default private_ipv4 to use public_ipv4 address when null https://review.openstack.org/623294 | 19:28 |
Shrews | wow, i don't think this nodepool test ever worked properly :( | 19:29 |
Shrews | anyone know of a way to have a mock.side_effect both execute code AND raise an exception? seems it's either one or the other | 19:34 |
openstackgerrit | Merged openstack-infra/zuul master: web: refactor info and tenant reducers action https://review.openstack.org/621386 | 19:35 |
clarkb | Shrews: have it call a fake? | 19:36 |
clarkb | then have that raise itself | 19:36 |
Shrews | clarkb: that doesn't work | 19:36 |
Shrews | it can either call the fake, or raise an Exc, but not both it would seem | 19:37 |
clarkb | Shrews: the fake does the raise | 19:37 |
Shrews | clarkb: that didn't work in my test | 19:38 |
Shrews | the raise is ignored | 19:38 |
Shrews | oh, there is something else wrong here. maybe that will work if i fix that | 19:41 |
clarkb | Shrews: http://paste.openstack.org/show/736784/ it works here | 19:41 |
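The pattern clarkb is pointing at (his paste is not reproduced here): give side_effect a fake callable that does the work and then raises itself. A minimal self-contained example:

```python
from unittest import mock

class SessionExpired(Exception):
    pass

real_calls = []

def fake_poll(*args, **kwargs):
    # Execute some real/fake logic first...
    real_calls.append((args, kwargs))
    # ...then raise, so the caller observes both the side effect and the error.
    raise SessionExpired("simulated ZooKeeper session loss")

handler = mock.Mock(side_effect=fake_poll)

try:
    handler("request-1")
except SessionExpired:
    pass

assert real_calls == [(("request-1",), {})]
```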
*** sshnaidm is now known as sshnaidm|afk | 19:43 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Fix race in test_handler_poll_session_expired https://review.openstack.org/623269 | 19:50 |
pabelanger | mordred: it seems we might already be passing in normalized data for server at: http://git.openstack.org/cgit/openstack/openstacksdk/tree/openstack/cloud/openstackcloud.py#n2144 because I can see host_id and has_config_drive data before we attempt to normalize again, which results in loss of data | 20:30 |
pabelanger | mordred: I am not familiar enough with code to figure out how to properly fix | 20:30 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Add spec for scale out scheduler https://review.openstack.org/621479 | 20:35 |
mordred | pabelanger: hrm. we shouldn't be double-normalizing :( | 20:35 |
mordred | pabelanger: OH | 20:36 |
pabelanger | Oh, HAHA | 20:37 |
pabelanger | http://logs.openstack.org/07/623107/2/check/nodepool-functional-py35/e49051c/controller/logs/screen-nodepool-launcher.txt.gz#_Dec_06_03_59_24_677871 | 20:37 |
pabelanger | this actually works with 0.20.0 | 20:37 |
pabelanger | but is a bug in master | 20:37 |
mordred | remote: https://review.openstack.org/623308 Deal with double-normalization of host_id | 20:38 |
mordred | pabelanger: ^^ | 20:38 |
pabelanger | mordred: https://review.openstack.org/621585/ I think that is what broke it | 20:38 |
mordred | pabelanger: I believe it's because what we're now starting from is an openstack.compute.v2.server.Server Resource object that we then run to_dict() on. the Resource object already coerces hostId into host_id - and the normalize function was only doing host_id = server.pop('hostId', None) - but there isn't a hostId in the incoming - only a host_id | 20:41 |
pabelanger | mordred: yes, exactly | 20:42 |
pabelanger | possible there are others, but haven't checked | 20:42 |
mordred | pabelanger: so I Think that patch above will fix this specific thign - the next step is actually to make that normalize function go away completely | 20:42 |
mordred | pabelanger: but I figure that's going to need slightly more care than a quick fix | 20:42 |
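From mordred's description above, the normalize step only popped the camelCase key, so the value the Resource layer had already coerced to snake_case was dropped. A hedged sketch of the shape of the fix, not necessarily the exact code in 623308:

```python
def _normalize_host_id(server):
    # The SDK Resource layer may already have renamed hostId -> host_id, so
    # accept either spelling instead of assuming the raw API form.
    host_id = server.pop('hostId', None)
    if host_id is None:
        host_id = server.pop('host_id', None)
    server['host_id'] = host_id
    return server
```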
openstackgerrit | Merged openstack-infra/zuul master: web: add error reducer and info toast notification https://review.openstack.org/621387 | 20:42 |
pabelanger | mordred: patch is missing trailing ) but worked | 20:44 |
pabelanger | mordred: clarkb: corvus: Shrews: okay, so https://review.openstack.org/623107/ is in fact working, if you'd like to review and confirm format of patch is something we want to actually do | 20:48 |
mordred | pabelanger: yay! I have updated the patch to add the appropriate number of )s | 20:51 |
pabelanger | mordred: +2 | 20:51 |
mordred | \o/ | 20:52 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Read old json data right before writing new data https://review.openstack.org/623245 | 20:59 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Add appending yaml log plugin https://review.openstack.org/623256 | 20:59 |
fungi | looks like that busy cycle for the executors lasted ~17 minutes | 21:05 |
fungi | er, wrong channel (sort of) | 21:05 |
SpamapS | mordred: there are ways to make json append-only too you know. | 21:10 |
fungi | i assumed the point of 623256 was more expressing a preference for yaml instead of json | 21:13 |
fungi | i do find it sort of disjoint that ansible takes yaml input and returns json output | 21:14 |
clarkb | fungi: well, yaml can be written to without first parsing the file | 21:16 |
clarkb | so its memory overhead is better | 21:16 |
fungi | ahh, yeah, i didn't consider that angle | 21:16 |
corvus | SpamapS: can you elaborate on your json thoughts? | 21:19 |
SpamapS | corvus: so there are some parsers that can handle this string as a "json stream": '{"field":1}\n{"field":2}\n' | 21:31 |
SpamapS | Which allows you to have append-only json | 21:31 |
SpamapS | But not all parsers do it | 21:31 |
corvus | SpamapS: i think that's the crux -- that we want the output to be valid normal json, not special zuul json | 21:32 |
corvus | (because we want this to be valid if the job dies at any point) | 21:33 |
SpamapS | Yeah, I could have sworn there was a standard for doing it but I can't find it, so I probably dreamed it. | 21:35 |
mordred | SpamapS: yah - I originally looked for a standard ... everything I could find with streaming json was just people doing really weird stuff | 21:42 |
mordred | but I figure - other than python's weird obsession with not including yaml support in the core language - everybody else seems to be able to parse it easily | 21:43 |
SpamapS | mordred: so in yaml to make it appendable you just have to indent everything by one and start with a "- " and, all good... +1 | 21:46 |
mordred | SpamapS: heh | 21:49 |
mordred | SpamapS: no - actually just separate sections with --- ... to make it a multi-document file | 21:49 |
mordred | SpamapS: it's actually k8s yaml files that gave me the idea | 21:50 |
SpamapS | Oh docs.. hm | 21:54 |
SpamapS | You'd be surprised how many yaml parsers do not support multi doc | 21:54 |
SpamapS | Mostly because they're short-sighted. | 21:54 |
SpamapS | "make maps into {my language's version of dict} and lists into {my language version of list} and done" | 21:55 |
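A small sketch of the two append-only approaches discussed — newline-delimited JSON objects versus a multi-document YAML stream separated by `---`. The sample data is made up; PyYAML's safe_load_all does handle multi-document input, which is the parser-support caveat SpamapS raises:

```python
import json
import yaml

# Append-only "json stream": one object per line, parsed line by line.
json_log = '{"field": 1}\n{"field": 2}\n'
json_docs = [json.loads(line) for line in json_log.splitlines() if line]

# Append-only YAML: separate each appended section with a '---' marker,
# making the file a multi-document stream that stays valid after every write.
yaml_log = "---\nfield: 1\n---\nfield: 2\n"
yaml_docs = list(yaml.safe_load_all(yaml_log))

assert json_docs == yaml_docs == [{"field": 1}, {"field": 2}]
```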
openstackgerrit | Merged openstack-infra/zuul master: Read old json data right before writing new data https://review.openstack.org/623245 | 21:55 |
mordred | SpamapS: at this point in my life, nothing surprises me | 22:03 |
mordred | SpamapS: well, except for bojack getting zero golden globe nominations | 22:03 |
SpamapS | I'm sure he has a long face. | 22:03 |
openstackgerrit | Ronelle Landy proposed openstack-infra/zuul-jobs master: WIP: Default private_ipv4 to use public_ipv4 address when null https://review.openstack.org/623294 | 22:04 |
openstackgerrit | Ronelle Landy proposed openstack-infra/zuul-jobs master: WIP: Default private_ipv4 to use public_ipv4 address when null https://review.openstack.org/623294 | 22:20 |
*** manjeets_ is now known as manjeets | 22:28 | |
*** dkehn has quit IRC | 22:52 | |
openstackgerrit | Ronelle Landy proposed openstack-infra/zuul-jobs master: WIP: Default private_ipv4 to use public_ipv4 address when null https://review.openstack.org/623294 | 22:53 |
SpamapS | Hey, I'm setting up a Slack just to test the slack notifier role I've built to submit to zuul-roles. Who would like to be added to that slack? Anybody? | 22:57 |
*** cristoph_ has quit IRC | 22:58 | |
*** dkehn has joined #zuul | 23:01 | |
openstackgerrit | Paul Belanger proposed openstack-infra/nodepool master: Include host_id for openstack provider https://review.openstack.org/623107 | 23:49 |
clarkb | pabelanger: does ^ depend on a fix in the sdk lib? | 23:50 |
pabelanger | clarkb: no, that was a failure with an unreleased version of openstacksdk. The ones on PyPI work | 23:52 |
pabelanger | we can add depends-on if we want however | 23:52 |