*** jamesmcarthur has quit IRC | 00:01 | |
*** jamesmcarthur has joined #zuul | 00:21 | |
*** jamesmcarthur has quit IRC | 00:52 | |
*** jamesmcarthur has joined #zuul | 01:08 | |
*** jamesmcarthur has quit IRC | 01:55 | |
*** jamesmcarthur has joined #zuul | 01:55 | |
*** jamesmcarthur has quit IRC | 02:00 | |
*** jamesmcarthur has joined #zuul | 02:05 | |
*** jamesmcarthur has quit IRC | 02:10 | |
*** jamesmcarthur has joined #zuul | 02:18 | |
*** jamesmcarthur has quit IRC | 02:22 | |
*** jamesmcarthur has joined #zuul | 02:24 | |
*** jamesmcarthur has quit IRC | 02:32 | |
*** jamesmcarthur has joined #zuul | 02:58 | |
*** jamesmcarthur has quit IRC | 03:31 | |
*** jamesmcarthur_ has joined #zuul | 03:31 | |
*** jamesmcarthur_ has quit IRC | 04:01 | |
*** jamesmcarthur has joined #zuul | 04:21 | |
*** zbr has quit IRC | 04:45 | |
*** zbr has joined #zuul | 04:46 | |
-openstackstatus- NOTICE: Our CI system has problems uploading job results to the log server and thus all jobs are failing. Do not recheck jobs until the situation is fixed. | 05:41 | |
*** ChanServ changes topic to "Our CI system has problems uploading job results to the log server and thus all jobs are failing. Do not recheck jobs until the situation is fixed." | 05:41 | |
*** jamesmcarthur has quit IRC | 05:45 | |
*** jamesmcarthur has joined #zuul | 06:23 | |
*** yolanda has joined #zuul | 06:33 | |
*** jamesmcarthur has quit IRC | 06:36 | |
*** shachar has quit IRC | 07:51 | |
*** shachar has joined #zuul | 07:51 | |
*** altlogbot_1 has quit IRC | 07:57 | |
*** altlogbot_0 has joined #zuul | 07:59 | |
*** ChanServ changes topic to "Discussion of the project gating system Zuul | Website: https://zuul-ci.org/ | Docs: https://zuul-ci.org/docs/ | Source: https://git.zuul-ci.org/ | Channel logs: http://eavesdrop.openstack.org/irclogs/%23zuul/ | Weekly updates: https://etherpad.openstack.org/p/zuul-update-email" | 12:32 | |
-openstackstatus- NOTICE: log publishing is working again, you can recheck your jobs that failed with "retry_limit" | 12:32 |
*** rfolco|ruck has quit IRC | 12:58 | |
*** tosky has joined #zuul | 13:40 | |
*** bhavikdbavishi has joined #zuul | 13:43 | |
*** bhavikdbavishi has quit IRC | 13:59 | |
*** bhavikdbavishi has joined #zuul | 15:56 | |
*** jamesmcarthur has joined #zuul | 16:19 | |
*** jamesmcarthur has quit IRC | 16:44 | |
*** jamesmcarthur has joined #zuul | 16:45 | |
*** armstrongs has joined #zuul | 16:46 | |
armstrongs | hi, question: i have set up a zookeeper cluster and hooked up multiple zuul-executors and nodepool launchers. The config references the zookeeper cluster. When i schedule jobs i keep seeing them land on the same executor. If i take the executor out of service it lands on another, but i am not seeing a distribution of jobs across executors. How do | 16:48 |
armstrongs | you make sure that jobs are distributed? | 16:48 |
*** jamesmcarthur has quit IRC | 16:57 | |
SpamapS | armstrongs: gearman distributes jobs based on response time | 16:59 |
armstrongs | ignore me i am talking nonsense had a config issue | 16:59 |
armstrongs | working now | 16:59 |
armstrongs | thanks | 16:59 |
SpamapS | armstrongs: it sends out a "wakeup" to every worker, and the first one that responds with "GRAB_JOB" wins. | 16:59 |
armstrongs | thanks for the info | 17:00 |
SpamapS | also zookeeper is only used for scheduler and nodepool IIRC | 17:00 |
armstrongs | ah ok so executor just connects to gearman | 17:00 |
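[Editor's aside: the wakeup/GRAB_JOB behavior SpamapS describes can be modeled with a tiny sketch. This is a toy simulation of the dispatch idea only, not Zuul's or gearman's actual code; the executor names and response times are invented for illustration.]

```python
# Toy model of gearman dispatch as described above: when a job arrives,
# the server wakes up every idle worker, and the first one to respond
# with GRAB_JOB wins. Response latency stands in for "who answers first".

def dispatch(job, workers):
    """workers: list of (response_time_ms, name) tuples.
    The fastest responder to the wakeup grabs the job."""
    wakened = sorted(workers)   # every worker is notified...
    return wakened[0][1]        # ...the fastest GRAB_JOB wins

executors = [(12, "executor-1"), (5, "executor-2"), (30, "executor-3")]
print(dispatch("build-a", executors))  # executor-2 responds first
```

Under this model a consistently faster executor wins every job, which is why the backoff heuristics fungi mentions next matter for evening out the load.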
fungi | we do also have backoff heuristics in the executors to make sure they stop claiming jobs if they reach certain resource thresholds, which should make sure the workload evens out a bit under heavy volume | 17:01 |
fungi | even if some of your executors are faster at claiming jobs than others | 17:01 |
armstrongs | yeah its looking good testing with 5 executors and it looks pretty distributed | 17:01 |
armstrongs | :) | 17:02 |
fungi | corvus: so, looking at mirror_info i expect the challenge faced is roughly the same as in the currently-used role... basically for debian 10/buster right now we'd want to omit a mirror_info.debian entry for "https://{{ zuul_site_local_mirror_host }}/debian {{ ansible_distribution_release }}-updates main" until debian has its first stable point release of buster (ideal would be if we could auto-detect presence of | 17:02 |
fungi | the buster-updates dist on our mirrors, but just being able to omit it until we know we should adjust configuration to put it to use once available) | 17:02 |
fungi | (...should be a viable alternative) | 17:05 |
fungi | also https://zuul-ci.org/docs/zuul-jobs/mirror.html#rolevar-mirror_info.debian doesn't mention how you enumerate which dist repositories you want to use at the given url (e.g. stretch vs stretch-backports vs stretch-updates) | 17:07 |
*** jamesmcarthur has joined #zuul | 17:11 | |
*** jamesmcarthur has quit IRC | 17:16 | |
corvus | fungi: re the last thing, i think that's the (perhaps poorly named) 'components' attribute? | 17:16 |
fungi | "components" seems to be for specifying things like main, contrib, non-free | 17:17 |
fungi | suites | 17:17 |
corvus | fungi: re the debian 10 thing -- ah, i see now. that seems like logic that we could put into the debian mirror role, since at least whether debian has had a release is globally applicable. if we wanted to make that site-local configuration, i guess we would need to add or change that data structure.... | 17:18 |
corvus | we should rename that suites then :) | 17:18 |
fungi | er, suites is actually what i'm calling dists, sorry... i'm going to rephrase using the field descriptors from the sources.list(5) manpage | 17:19 |
fungi | the schema for a sources.list entry is: | 17:20 |
fungi | <type> [options] <uri> <suite> [components] | 17:21 |
fungi | so "components" is the right term for things like "main contrib non-free" | 17:21 |
fungi | suite is something like "stretch" or "stretch-backports" or "stretch-updates" | 17:21 |
fungi | type is generally "deb" or "deb-src" | 17:22 |
fungi | so the terminology used in mirror_info.debian looks reasonable, we're just missing at least a couple more fields | 17:23 |
corvus | oh, somehow i missed the difference between suite and components | 17:24 |
corvus | i think i just saw "<uri> [components]" :) | 17:24 |
corvus | so sounds like we should add suite | 17:24 |
fungi | type can possibly be inferred (we can decide to either always include deb-src for every deb entry, or to never include it and assume jobs won't be consuming source packages) | 17:25 |
fungi | though if the mirror doesn't include source packages, then deb-src entries could result in apt update failures, i expect | 17:26 |
*** jamesmcarthur has joined #zuul | 17:52 | |
*** jamesmcarthur has quit IRC | 17:59 | |
*** tosky has quit IRC | 18:12 | |
*** jamesmcarthur has joined #zuul | 18:16 | |
*** jamesmcarthur has quit IRC | 18:21 | |
armstrongs | i have put the zuul web dashboard behind a load balancer and it is all working fine, apart from the streaming of logs. It seems for some web instances it isn't showing and just outputs "end of stream" as opposed to the running log. Is there anything special needed on the load balancer to get this to work for all instances? | 18:21 |
fungi | possible the websocket connection is getting aggressively timed out by the lb? | 18:25 |
*** bhavikdbavishi has quit IRC | 18:25 | |
armstrongs | it comes up eventually but just has a long delay | 18:33 |
armstrongs | like 5 or 6 seconds | 18:34 |
fungi | ahh, i think "end of stream" misleadingly displays until the javascript responsible is able to establish a connection, so sounds like something is getting delayed there maybe | 18:38 |
fungi | are you observing this when pulling up the log stream for builds which have been underway for a while, or only on builds as they're starting up? | 18:41 |
fungi | the console log streamer is started as an early part of the build, so isn't there instantly | 18:42 |
fungi | if it's not something obvious like that, i would probably resort to packet captures or access log analysis as the next step | 18:47 |
*** tflink has quit IRC | 19:11 | |
*** tflink has joined #zuul | 19:12 | |
*** tflink has quit IRC | 19:15 | |
*** tflink has joined #zuul | 19:17 | |
*** themroc has joined #zuul | 19:19 | |
*** themroc has quit IRC | 19:25 | |
*** jamesmcarthur has joined #zuul | 20:18 | |
SpamapS | armstrongs: some LB's don't handle websockets properly in HTTP mode. | 20:36 |
SpamapS | ELB Classics being one of those. | 20:36 |
*** jamesmcarthur has quit IRC | 20:54 | |
*** tosky has joined #zuul | 21:32 | |
armstrongs | observing it for jobs that have been running a while. Was trying this with 5 web nodes as a scale-up test; it seems the more web nodes there are, the more it happens. It's like the stream is pinned to a specific web server | 23:23 |
*** tosky has quit IRC | 23:25 | |
armstrongs | as i tried hitting specific web nodes directly behind the load balancer and some aren't getting streams. It's like 1 out of 10 don't show when running 10 concurrent jobs. | 23:25 |
fungi | could one of them lack the requisite connectivity from the fingergw/web daemons to 7900/tcp on the executors? | 23:37 |
fungi | or could 7900/tcp on some of your executors be blocked? | 23:37 |
fungi | https://zuul-ci.org/docs/zuul/admin/components.html | 23:38 |
*** panda is now known as panda|pubholiday | 23:41 | |
*** jamesmcarthur has joined #zuul | 23:58 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!