Tuesday, 2013-08-20

clarkb	jlk: our jenkins slaves are good at DDoSing our git server	00:01
clarkb	jlk: particularly when we point them at git-daemon	00:01
jlk	strange.	00:01
jlk	but your repos are significantly larger than Fedora's were	00:01
jlk	Fedora was thousands of small repos	00:01
jlk	Our hits were probably more distributed as well, distributed over time and network capabilities. RHT infrastructure had networking gear in between our servers and the Internet, I don't know what they did for throttling or whatnot	00:03
jeblair	jlk: did you use xinetd or run git-daemon itself?	00:04
jlk	good question! I believe I used whatever was packaged in EPEL	00:04
jlk	would have been rhel6 era	00:04
jeblair	jlk: that's pretty much what we're doing, which ends up being xinetd. so no particular tuning?	00:08
fungi	clarkb: any good reason not to pass --events on mysqldump runs? currently cronspamming us about skipping the mysql.event table on each server	00:08
jlk	jeblair: not that I remember.	00:09
jlk	I think I looked at one time at doing git export to just get the latest bits instead of doing a full clone, or doing shallow clones, on our build server	00:09
jlk	because it didn't need any history, just needed the bits	00:09
mordred	jeblair: ^^ that's a little bit what I was afraid of - we tend to absolutely slam the cloning infrastructure	00:10
jlk	ah, apparently they do use xinetd to throttle it a lot now	00:10
jlk	where "it" == anonymous clones	00:10
openstackgerrit	Clark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy. https://review.openstack.org/42784	00:11
jeblair	mordred: not really, we almost never clone	00:11
clarkb	I can't help myself	00:11
ttx	mordred: are you done merging back swift m-p tags to master, or should I keep the m-p branch alive for some more time ?	00:12
clarkb	that is completely untested but in theory made easy with the puppetlabs module	00:12
mordred	ttx: done with it	00:12
ttx	mordred: I can delete it now ?	00:12
mordred	ttx: also wrote a patch to potentially do it	00:12
mordred	ttx: yup	00:12
clarkb	jlk: aren't all git-daemon clones anonymous?	00:12
ttx	mordred: ok, on my way to final cleanup	00:12
mordred	ttx: https://review.openstack.org/#/c/41927/	00:12
jlk	well, yes, I'm not sure why I added that bit of data.	00:12
jlk	"data"	00:12
jeblair	clarkb, mordred: http://paste.openstack.org/show/44553/	00:13
jeblair	clarkb, mordred: that thread is just sitting there. best i can tell, it's not waiting on a lock. but it is holding one which is blocking everyone else.	00:13
jeblair	that should be the jjb update that changes the git url. it applied fine on jenkins01	00:13
mordred	jeblair: wow. that's stellar	00:14
jeblair	i'm leaning towards "try to manually kill that thread". any other ideas before i do that?	00:14
clarkb	jeblair: is it possibly waiting on a locked file?	00:15
*** pcrews has quit IRC		00:16
*** ^demon has joined #openstack-infra		00:16
ttx	mordred: there is a corner case in the merge-tags thing	00:16
jeblair	clarkb: it looks like a runaway regex	00:16
mordred	jeblair: I'd chalk that up to "java sucks sometimes"	00:17
mordred	ttx: yeah?	00:17
clarkb	fungi: uh I don't know	00:17
ttx	mordred: for stable/* I'm not sure you actually want to merge tags back... do you ?	00:17
* clarkb reads more manpages		00:17
*** nati_ueno has quit IRC		00:17
clarkb	pleia2: if you are really adventurous I think it would be cool to apply 42784 to your test server if it is still up	00:17
jeblair	mordred: 'cept gearman-plugin is a few rungs down the stacktrace	00:18
jeblair	mordred: so it's our fault	00:18
mordred	ttx: branch: ^(milestone-proposed).*$	00:18
ttx	mordred: i.e. when we tag 2013.1.3 on stable/grizzly, do we really want to merge the tags back to havana master ?	00:18
mordred	ttx: the job is configured to only run on milestone-proposed	00:18
mordred	since that's the only time we ever want to do this	00:18
ttx	mordred: at release time we use milestone-proposed too, and turn that into stable/*	00:18
*** ^d has quit IRC		00:18
clarkb	mordred: thoughts on fungi's --events mysqldump option?	00:19
mordred	ttx: but it's milestone-proposed when you make the tag, right?	00:19
clarkb	mordred: is that table useful or just noise?	00:19
ttx	mordred: so we push like, havana-rc2 tags to milestone-proposed while master switched to icehouse	00:19
mordred	clarkb: noise. we don't use it	00:19
mordred	ttx: yup. that's fine	00:19
clarkb	mordred: so better to redirect that warning message to /dev/null than to dump the table?	00:20
ttx	mordred: ok, just doublechecking	00:20
mordred	ttx: we _do_ want the final tag from havana milestone-proposed to be in master, so that the in-flight versions look "sensible"	00:20
mordred	but I agree, the following tags that are made on stable/* do not want to be merged to master	00:20
ttx	mordred: can that job generate a conflict ? Or is it always successful ?	00:21
mordred	ttx: and we're making it always a null-merge, so the merge will never bring changes from m-p to master	00:21
ttx	ok, guess that answers my question	00:21
mordred	ttx: it's always successful. it's using the merge strategy which says "just keep my version"	00:21
ttx	ack	00:21
ttx	+1ed	00:22
Alex_Gaynor	Is there anything I could be doing to help with the "ddosing ourselves with git" issue?	00:22
clarkb	Alex_Gaynor: right now we are switching to using https instead of git:// as apache deals with ddosing ourselves better	00:23
jeblair	clarkb, mordred: uh, wow, ok, it got unstuck.	00:23
mordred	jeblair: wow	00:23
Alex_Gaynor	clarkb: "apache deals with ddosing ourselves better", I feel like this encapsulates everything I feel about computering (for better and for worse) :)	00:23
clarkb	Alex_Gaynor: https://review.openstack.org/42784 is one potential way of moving back to using git:// but it needs testing and probably input from someone that knows haproxy better than me	00:23
Alex_Gaynor	clarkb: I can probably ping some HA proxy friends	00:23
clarkb	Alex_Gaynor: I am semi hoping we can abuse pleia2's test box if it is still around	00:24
jlk	seems really strange to make use of https to make things faster...	00:24
jlk	IIRC git:// isn't doing any encryption, which /should/ make it an easier process to handle.	00:24
jeblair	Alex_Gaynor, jlk: basically, git under xinetd has no socket queueing, so you're either under the 50 process limit, or over, in which case you get your connection dropped	00:24
jlk	interesting	00:24
jeblair	Alex_Gaynor, jlk: apache at least will let you separately tune how many things you run, vs how many things you queue	00:24
clarkb	and if we increase the connection limit we end up hitting cpu and disk hard	00:24
jlk	nod	00:25
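The difference described above, between dropping connections once a process limit is hit and queueing them behind a fixed number of workers, can be sketched with a toy simulation. This is only an illustration: the function name, the tick-based timing, and the burst size are assumptions, and only the limit of 50 comes from the discussion. It does not model xinetd, Apache, or haproxy themselves.

```python
def simulate(arrivals, max_running, max_queued, service_ticks):
    """Return (served, dropped) for a series of connection bursts.

    arrivals[t] is the number of new connections arriving at tick t.
    max_queued=0 models "over the limit means the connection is dropped".
    """
    running = []   # remaining ticks for each in-flight connection
    queued = 0     # connections waiting for a free slot
    served = dropped = 0
    tick = 0
    while tick < len(arrivals) or running or queued:
        # Advance in-flight work by one tick and retire what finished.
        running = [left - 1 for left in running]
        served += running.count(0)
        running = [left for left in running if left > 0]
        # Waiting connections take freed slots before new arrivals do.
        while queued and len(running) < max_running:
            queued -= 1
            running.append(service_ticks)
        # Admit new arrivals: run, else queue, else drop.
        for _ in range(arrivals[tick] if tick < len(arrivals) else 0):
            if len(running) < max_running:
                running.append(service_ticks)
            elif queued < max_queued:
                queued += 1
            else:
                dropped += 1
        tick += 1
    return served, dropped


# One burst of 300 connections against a limit of 50 processes.
print(simulate([300], max_running=50, max_queued=0, service_ticks=2))
print(simulate([300], max_running=50, max_queued=250, service_ticks=2))
```

With no queue, everything beyond the first 50 connections in the burst is lost; with a queue large enough for the burst, every connection is eventually served while the number of concurrent processes stays at 50.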
Alex_Gaynor	Is there anything we can point at github?	00:25
jeblair	so we can set a reasonable number of processes to run at once, and a larger queue	00:25
Alex_Gaynor	let them deal with the problem	00:25
mordred	Alex_Gaynor: hehehe	00:25
mordred	Alex_Gaynor: that's funny	00:25
jeblair	Alex_Gaynor: that's been our strategy up to this point	00:25
jlk	they appear to be moving away from git:// as much as they can	00:25
BobBall_Away	mordred: Now the only failure with VIRTUAL_ENV is grenade... not sure how to fix it though, since we're explicitly trying to perform an upgrade it sounds like it might be more difficult than I'd hope...	00:25
jlk	but that might just be because they can stick all sorts of tracking around http usage that they can't w/ git://	00:25
mordred	BobBall_Away: I think we just may need to do similar work there	00:26
jeblair	Alex_Gaynor: github still fails quite often, enough for our automagic to notice	00:26
mordred	BobBall_Away: or backport some of the changes to devstack stable/grizzly	00:26
mordred	BobBall_Away: but that's thrilling!	00:26
BobBall_Away	mordred: effectively the error seems to be it's running in the venv but things (such as pip) haven't been installed in it	00:26
jeblair	Alex_Gaynor: (i should say partial strategy -- we haven't used github in tests for a long time, but we still use it for cronjobs, etc)	00:26
mordred	BobBall_Away: I'm going to run out for a second, I'll look at grenade when I get back	00:26
BobBall_Away	very thrilling	00:26
BobBall_Away	I'm going to bed now	00:26
jlk	I think Fedora infrastructure also has multiple front ends for git	00:26
jlk	that use a shared FS	00:26
mordred	BobBall_Away: thanks for your help!	00:26
BobBall_Away	it's 1:30am and I've had enough :D	00:27
dstufft	use a CDN	00:27
dstufft	!	00:27
jlk	not positive though	00:27
Alex_Gaynor	dstufft: doing invalidation on a CDN'd git repo sounds awful	00:27
jlk	yikes	00:27
* mordred has a hunch multiple servers is going to wind up being in the cards eventually		00:27
dstufft	Alex_Gaynor: I dunno sounds like it wouldn't be that bad actually	00:27
lifeless	dstufft: I'm not aware of any git CDN's	00:27
Alex_Gaynor	lifeless: if you run git over HTTP(S) you can just use any HTTP pass-through CDN	00:28
clarkb	lifeless: the http stuff should CDN just fine	00:28
jeblair	mordred: yep. i just want it to be multiple good servers.	00:28
lifeless	Alex_Gaynor: clarkb: yeouch. No. Thanks.	00:28
jlk	multiple servers seems easy for read-only support. it's the read/write that's hard with a load balancer	00:28
Alex_Gaynor	master/slave git	00:28
mordred	jlk: we don't need read/write	00:28
mordred	we have a single write master	00:28
mordred	which is gerrit	00:28
jlk	and I really didn't want there to be two vastly different URLs for read-only clone vs write clone	00:28
jeblair	jlk: we are in the fortunate position of only needing to consider read-only mirrors here	00:29
mordred	which replicates to things	00:29
*** nati_ueno has joined #openstack-infra		00:29
jlk	mordred: oh right, that makes things a lot easier for you	00:29
mordred	yup	00:29
lifeless	Alex_Gaynor: clarkb: I presume you are aware of the way plain HTTP with git (and basically all VCS's) works, right ?	00:29
lifeless	Alex_Gaynor: clarkb: or perhaps I should say, I presume you aren't aware, or you wouldn't suggest a CDN be a good fit.	00:29
dstufft	pretend network latency doesn't exist and just fetch some files ? :V	00:29
lifeless	dstufft: thats part A of the terror. part B is to either do readv's, or to sporadically download the entire repo all over again, due to the rebalancing of 'pack' operations	00:30
openstackgerrit	Clark Boylan proposed a change to openstack-infra/config: Make mysql backup crons quiet. https://review.openstack.org/42785	00:30
clarkb	fungi: mordred ^ that should make mysqldump cronspam less annoying	00:30
dstufft	you can probably run multiple git slaves and just front them with haproxy proxying streams around; the only hard part would be determining if an incoming stream is read or write. if there's something obvious in the connect that lets you know if something is authenticated you can just shove all authenticated at the master and anonymous at the read slaves	00:31
clarkb	lifeless: if the repo hasn't changed the packs stay the same	00:31
jeblair	dstufft: all streams are read. :)	00:31
clarkb	lifeless: and iirc for large repos like nova you end up with several static packs as git leaves old stuff alone	00:31
jeblair	for us	00:31
jlk	dstufft: I don't think we have to worry about writes, everything is a read	00:32
jlk	dstufft: only gerrit has write access	00:32
dstufft	if everything is read then that's even easier	00:32
dstufft	just use haproxy as a TCP load balancer	00:32
dstufft	use whatever protocol you want, http, git, ssh, doesn't matter	00:33
clarkb	dstufft: https://review.openstack.org/42784	00:33
mordred	dstufft: that's what clarkb was looking in to earlier	00:33
dstufft	wtf is a pp file	00:33
mordred	dstufft: puppet	00:33
dstufft	oh	00:33
jeblair	dstufft, jlk, Alex_Gaynor: so here's the thing -- we spun up a 30g, 8vcpu cloud server for this, and ddosed it with jenkins (it's arguable whether it performed better or worse than the http setup we have on review.o.o)	00:34
jlk	that seems really bizarre, unless you're working with huge repos	00:34
dstufft	you mean the haproxy solution?	00:34
mordred	we have a LOT of activity :)	00:34
clarkb	dstufft: mordred that is a first stab at using haproxy to do queueing but it can be grown to handle multiple servers	00:34
jeblair	dstufft, jlk, Alex_Gaynor: before we spin up an army of maxsize(rackspacecloudservers) for this, i figure a little thought and testing of the tuning of one server might be in order.	00:34
mordred	clarkb: ports => '29418', ?	00:34
Alex_Gaynor	jeblair: so, suggestion from a friend of mine "instances=32"	00:35
Alex_Gaynor	jeblair: for xinetd	00:35
dstufft	oh you were just shoving a bigger server at it	00:35
Alex_Gaynor	I assume this forks 32 processes to handle requests	00:35
lifeless	clarkb: it tries to accommodate things yes, which makes the behaviour worse, because you get sporadic 'wtf is it doing' when it has to suck down the entire history again.	00:35
clarkb	dstufft: mordred or maybe we use lbaas to handle multiple services and keep the local haproxy for queueing	00:35
jlk	mordred: does all that activity require a full clone of the repo?	00:35
dstufft	what does rackspace have for HD's	00:35
*** dina_belova has joined #openstack-infra		00:35
*** rfolco has joined #openstack-infra		00:35
jeblair	Alex_Gaynor: we currently have the default of 50.	00:35
dstufft	spinning up more processes won't help if you're IO bound	00:35
clarkb	mordred: haproxy will listen on 9418 so I stuck gitdaemon on the alternate that gerrit uses	00:35
mordred	clarkb: ahhhh	00:36
mordred	clarkb: I agree with jeblair - let's see what a local haproxy queue will do to it	00:36
mordred	before we start adding in multi-machine lbaas	00:36
clarkb	mordred: definitely	00:36
mordred	but potentially yes	00:36
jeblair	i think we ought to do some real performance testing too	00:36
dstufft	where was the bottleneck?	00:36
*** coderanger has joined #openstack-infra		00:36
jeblair	where we figure out where the bottleneck actually is :)	00:36
coderanger	Alex_Gaynor: Fine :P	00:37
Alex_Gaynor	coderanger knows how haproxy works and junk	00:37
jeblair	and what kind of throughput we can get under different configurations	00:37
*** mriedem has joined #openstack-infra		00:37
Alex_Gaynor	coderanger: tl;dr; too many things trying to get stuff from git == ddosing ourselves	00:37
jlk	yeah, curious where the bottleneck is. Disk, or CPU, or network	00:37
clarkb	coderanger: Alex_Gaynor https://review.openstack.org/#/c/42784/1/modules/cgit/manifests/init.pp is the important file	00:37
dstufft	I think before you go changing your configs around you should figure out the bottleneck	00:37
coderanger	So cranking down maxconns won't buffer connections like it says in the review comment, it will just leave the socket in the listen queue	00:37
dstufft	because that's going to influence what the solution is a lot :V	00:38
coderanger	So if you are getting backed up, you are just going to end up with the kernel refusing conns	00:38
clarkb	coderanger: "anything behind that will queue" is what the commit message says. Is that completely wrong?	00:38
clarkb	ah	00:38
clarkb	well that doesn't help	00:38
coderanger	I mean it can smooth out spikes	00:39
*** michchap has joined #openstack-infra		00:39
coderanger	Up to whatever your max fds is	00:39
clarkb	coderanger: spikes are the current issue. Our jenkins slaves are a thundering herd	00:39
coderanger	Do you know the magnitude?	00:39
clarkb	coderanger: we need a semi deterministic way of making them wait in line if necessary	00:39
jeblair	#status ok	00:40
*** ChanServ changes topic to "Discussion of OpenStack Developer Infrastructure | docs http://ci.openstack.org | bugs https://launchpad.net/openstack-ci/+milestone/grizzly | https://github.com/openstack-infra/config"		00:40
*** dina_belova has quit IRC		00:40
coderanger	clarkb: If thats the way you want to go, make sure you set the backlog param in haproxy too :)	00:40
clarkb	coderanger: absolute worst case is something like ~300 connections all at once based on the number of slaves we have	00:40
clarkb	+ some fudge for random people using it too	00:41
coderanger	Ahh okay, for 300 conns thats fine as long as you know you can clear them	00:41
coderanger	Do the slaves retry on failure?	00:41
clarkb	coderanger: they do not, and that may help a little but not fix the problem	00:41
coderanger	If so, you can also just set the xinetd instances=32	00:41
coderanger	or probably do that anyway just for safety :)	00:41
coderanger	Any reason to not use Jenkins' "hash" support in the scm config?	00:42
coderanger	Thats been the default for a while now for exactly this reason	00:42
fungi	coderanger: we don't really use the scm plugin for this	00:42
clarkb	coderanger: because it has been useless for a long time. I believe mordred helped make it better but we tried switching to it and didn't for some other reason	00:43
clarkb	mordred: jeblair do you remember why we stuck with g-g-p?	00:43
coderanger	Ahh, manual build kickoff times every slave trying to pull down code?	00:43
jeblair	clarkb: because it has a nice echo statement	00:43
mordred	less work for jenkins to attempt to do	00:43
jeblair	coderanger: yeah, we 'manually' run 400-600 jobs per hour	00:44
jeblair	coderanger: obviously it's not manual, but that's the way jenkins sees it; they're triggered by a project gating system hooked up to our code review	00:44
coderanger	Yahr	00:44
coderanger	And to be clear, this is on recent-ish linux, right? :)	00:44
clarkb	coderanger: haproxy or jenkins?	00:45
mordred	well, the git server is running on centos6	00:45
coderanger	haproxy	00:45
coderanger	(this would do truly bad things on Windows)	00:45
mordred	we don't do windows	00:45
openstackgerrit	Clark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy. https://review.openstack.org/42784	00:45
fungi	using windows would be truly bad things	00:45
clarkb	^^ now with backlog	00:45
uvirtbot	clarkb: Error: "^" is not a valid command.	00:45
clarkb	uvirtbot: sssshhh	00:45
uvirtbot	clarkb: Error: "sssshhh" is not a valid command.	00:45
mordred	clarkb: yes. that looks good	00:46
coderanger	clarkb: Other thing to check is that no hooks on the git server are using the remote IP for anything (access control, logging?)	00:47
coderanger	Other than that, sounds like it will do what you want :)	00:47
clarkb	coderanger: we don't have server side hooks so we should be fine	00:47
jlk	I hadn't thought about hooks on a git-daemon pull	00:48
clarkb	coderanger: cool thanks	00:48
* jeblair runs again		00:49
clarkb	coderanger: what does the hash option to jenkins scm plugin do?	00:51
* fungi assumes it's hash-based load distribution		00:51
* Alex_Gaynor assumes it reuses the same clone but just fetches that hash		00:51
fungi	ooh, you're probably right	00:52
coderanger	Yeah, the scm plugin uses a cron-style config	00:52
coderanger	the hash flag just lets you do <hash based lb>/N	00:52
coderanger	Spreads out the thundering herd, but that only helps balance against multiple jobs	00:52
coderanger	not multiple slaves on the same job	00:52
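The hash-based spreading described above can be illustrated with a small sketch. It is not Jenkins's actual algorithm: the function name and the use of MD5 are assumptions, chosen only to show how a stable hash of the job name turns into a start offset within a polling period.

```python
import hashlib


def start_offset(job_name, period_minutes):
    """Map a job name to a stable minute offset in [0, period_minutes)."""
    digest = hashlib.md5(job_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % period_minutes


# Different jobs usually land on different minutes of a 15-minute period.
for name in ("gate-cinder-pep8", "gate-tempest-devstack-vm-postgres-full"):
    print(name, start_offset(name, 15))
```

Because the offset depends only on the job name, every slave running the same job gets the same offset, which matches the caveat that this balances across jobs but not across slaves on one job.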
clarkb	coderanger: if you want to see shiny graphs and current tests http://status.openstack.org/zuul/	00:52
* fungi guessed right		00:53
Alex_Gaynor	jeez, 600+ outstanding events	00:53
Alex_Gaynor	s/events/results/	00:53
clarkb	Alex_Gaynor: this is what happens before milestone 3 every single time	00:53
clarkb	Alex_Gaynor: for grizzly it was particularly painful	00:53
Alex_Gaynor	clarkb: ahaha, this is my first milestone I guess	00:54
clarkb	Alex_Gaynor: if we had the grizzly load today we would've been fine, but you guys keep writing more code :)	00:54
Alex_Gaynor	clarkb: sorry?	00:54
Alex_Gaynor	:D	00:54
Alex_Gaynor	clarkb: these events/results are all bottlenecked on git?	00:57
*** anteaya has quit IRC		00:58
lifeless	mordred: is the expectation that doing 'pip install -r requirements.txt' will grab everything a service needs?	00:58
lifeless	mordred: pyudev which neutron wants is not listed in its requirements.txt. I suspect it's a transitive dependency :(	00:58
clarkb	Alex_Gaynor: events definitely are. I don't think results are so it is weird to see results so high	00:59
clarkb	Alex_Gaynor: actually I take that back. results end up merging code in gerrit which would be bottlenecked too	00:59
clarkb	Alex_Gaynor: events is gerrit events input into zuul. Things like new patchset or new comment. results are results from jenkins	01:00
Alex_Gaynor	clarkb: I assume results are serialized, so it's really a head of the line problem?	01:01
clarkb	Alex_Gaynor: correct	01:02
*** lbragstad has joined #openstack-infra		01:02
clarkb	comparing cacti graphs for zuul and review.o.o this really seems to be a zuul problem	01:02
clarkb	mordred: jeblair fungi I think we should merge the change to point d-g at git.o.o	01:02
clarkb	jeblair: and I wonder if we shouldn't artificially throttle zuul, or at least have the option to	01:04
clarkb	I feel better when things are slow but under control :)	01:04
jeblair	clarkb: what?	01:05
clarkb	jeblair: see the queue lengths on the zuul status page	01:05
bodepd	was mgagne in here asking about redirects?	01:06
*** beagles has quit IRC		01:06
* bodepd searches logs...		01:06
clarkb	bodepd: he was at some point last week iirc	01:07
bodepd	clarkb: what was the verdict?	01:08
bodepd	clarkb: should I open a ticket?	01:08
bodepd	clarkb: we've got a lot of changes that need to happen, and decision to make based on if that happens	01:09
clarkb	bodepd: I want to say he made the change and it merged	01:09
clarkb	bodepd: check in the git log for openstack/config	01:09
bodepd	the repo does not exist	01:09
clarkb	er openstack-infra/config	01:09
*** pabelanger has quit IRC		01:10
bodepd	no, I meant stackforge/puppet-quantum	01:10
clarkb	oh renames	01:10
clarkb	he wanted puppet lint file redirects. I thought that is what you were talking about	01:11
clarkb	mordred: ^ rename question	01:11
bodepd	sorry for the lack of context	01:11
jeblair	i believe that repo has been renamed	01:11
bodepd	mordred: basically, a github redirect stackforge/puppet-quantum -> stackforge/puppet-neutron	01:11
bodepd	would be awesome	01:11
bodepd	I know it's possible to do if you are admin of an account	01:12
jeblair	bodepd: i'm opposed to that.	01:12
bodepd	jeblair: ok.	01:12
bodepd	jeblair: that is what I need to know. (if it is going to happen or not)	01:12
bodepd	b/c we have lots of code that needs to be updated otherwise	01:12
bodepd	jeblair: what is the reason against it?	01:13
jeblair	bodepd: sorry, it's an extremely busy time, we're even shorter staffed than normal, and we need to focus on keeping openstack running	01:13
clarkb	jeblair: the last log item for processing result events is from 00:25	01:13
*** xchu has joined #openstack-infra		01:13
*** pabelanger has joined #openstack-infra		01:13
jeblair	clarkb: yeah, i'm trying to figure out what it's doing	01:14
jeblair	clarkb: oh really, i thought this was the last	01:15
jeblair	2013-08-20 00:09:35,360 DEBUG zuul.Scheduler: Processing result event <Build 3133095c056a4d7ab064e05a01c7b310 of gate-tempest-devstack-vm-postgres-full>	01:15
pleia2	clarkb: am away from my laptop for a few hours, can do some tests later (my test server is still up)	01:16
clarkb	pleia2: awesome. That would be helpful as it seems like I am doing 2 other things at the moment	01:16
clarkb	pleia2: and I think it can wait for tomorrow	01:16
jeblair	oh you're right	01:17
jeblair	2013-08-20 00:25:24,949 DEBUG zuul.Scheduler: Processing result event <Build 339f8f6144644de8b354d56303879d7b of gate-cinder-pep8>	01:17
clarkb	jeblair: which is interesting because it is a result that shouldn't end up merging code or anything like that	01:20
*** lcestari has quit IRC		01:21
clarkb	jeblair: but that would trigger pipeline.manager.onBuildCompleted(build)	01:23
clarkb	jeblair: 42726,2 is in the check queue	01:25
jeblair	clarkb: any completion event triggers the pipeline processor	01:25
clarkb	jeblair: it does look like the gate queue is still being processed though?	01:26
jeblair	it does?	01:27
fungi	bodepd: per github redirects, i got the impression from the article on their site that it happens automagically when a repo is moved/renamed. but maybe not	01:28
clarkb	jeblair: well the existing changes are getting some updates. I think anything going through the global event loop is stuck	01:28
Alex_Gaynor	fungi: yes, when a repo is renamed the redirects should be automatic	01:28
clarkb	jeblair: though it looks like that is happening for check changes too. So status on the changish/eventqueueobject is being updated but the big while true loop is stuck so we don't update much more than that	01:28
clarkb	jeblair: are we stuck in the while self.processQueue loop in the pipeline manager?	01:30
clarkb	jeblair: https://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/scheduler.py#n1036	01:31
*** coderanger has left #openstack-infra		01:32
*** Ryan_Lane has quit IRC		01:32
*** mriedem has quit IRC		01:33
clarkb	jeblair: http://paste.openstack.org/show/44559/ is the last time I see that log message	01:34
jeblair	clarkb: it recently logged it again 2013-08-20 01:27:07,488 DEBUG zuul.IndependentPipelineManager: Starting queue processor: check	01:34
clarkb	jeblair: yeah my version of the debug log was out of date	01:35
jeblair	clarkb: did it move?	01:35
jeblair	clarkb: istr top of check had no running jobs	01:35
clarkb	jeblair: yeah looking at the log it seems to have moved	01:36
jeblair	clarkb: 2013-08-20 01:27:07,148 DEBUG zuul.Scheduler: Run handler sleeping	01:36
jeblair	2013-08-20 01:27:07,148 DEBUG zuul.Scheduler: Run handler awake	01:36
*** dina_belova has joined #openstack-infra		01:36
jeblair	clarkb: so basically it just spent 1 hour in one iteration of that loop	01:36
clarkb	jeblair: http://paste.openstack.org/show/44560/	01:36
clarkb	jeblair: yes	01:36
Alex_Gaynor	it looks like the queue started to move again?	01:37
Alex_Gaynor	at least a little	01:37
clarkb	Alex_Gaynor: yeah a little	01:37
clarkb	I need to head home or food will be cold. But I will check back in from there	01:37
clarkb	jeblair: tail -f /var/log/zuul/debug.log | grep 'zuul.*PipelineManager' is what I am running now to see it move	01:38
fungi	is the gerrit-overloaded-slowing-merges-and-result-posting theory still being batted around? with load average ~300 there and cpu pegged flat out, it seems reasonable for that to crawl	01:40
fungi	er, ~200 i guess	01:41
*** dina_belova has quit IRC		01:41
Alex_Gaynor	everything broke together is a pretty reasonable explanation it seems	01:41
jeblair	fungi: it's possible; but we didn't see this earlier when we were busier	01:41
fungi	mmm, point	01:42
Alex_Gaynor	so what changed such that things started moving again?	01:42
Alex_Gaynor	(there's still a ton of outstanding events/results)	01:43
HenryG	Trying to figure out what went wrong in gate-grenade-devstack-vm here: https://review.openstack.org/35085	01:44
HenryG	Help?	01:45
fungi	HenryG: could this be the client backwards compat issue which was causing problems earlier today? have you asked in #openstack-qa?	01:47
*** pcrews has joined #openstack-infra		01:47
mordred	yes it is	01:47
*** ftcjeff has joined #openstack-infra		01:47
mordred	HenryG: known issue from earlier. should be fixed now	01:47
HenryG	mordred: fungi: thanks. recheck bug #?	01:47
jeblair	HenryG: it's at the top of the page here: http://status.openstack.org/rechecks/	01:48
fungi	HenryG: yeah, looking at the console log for that change it looks the same	01:48
Alex_Gaynor	so I'm starting to think those queue counts can't possibly be right	01:49
jeblair	Alex_Gaynor: why? it's been stuck/slow for over an hour	01:49
Alex_Gaynor	jeblair: well, there are ~50 patches in there right now, how can there be 965 results (is that queue entirely jenkins results?)	01:50
jeblair	Alex_Gaynor: those are start and stop events for jenkins; something like more than 700 have arrived since the start of the slowness	01:53
Alex_Gaynor	so 50 * (say 6 tests per) * 2 still doesn't account for 900?	01:53
fungi	and yeah, it does seem from the cacti graphs that cpu/load have fallen dramatically on zuul in the past couple hours	01:53
Alex_Gaynor	Random other point: the SCP step for the logs seems to be slower today	01:54
jeblair	Alex_Gaynor: it's more than 6 jobs per change	01:55
jeblair	Alex_Gaynor: nova runs 13	01:55
jeblair	in the check queue	01:55
Alex_Gaynor	gah, good point, I guess it does add up	01:55
Alex_Gaynor	1k events :(	01:56
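The back-of-the-envelope estimate in this exchange can be written out as follows. The function name is hypothetical; the inputs (about 50 changes, 6 or 13 jobs per change, and a start plus a stop event per job) are the figures quoted in the conversation, and treating every change as running the same number of jobs is a simplification.

```python
def expected_result_events(changes, jobs_per_change, events_per_job=2):
    """Estimate queued result events: one start and one stop event per job."""
    return changes * jobs_per_change * events_per_job


print(expected_result_events(50, 6))   # the initial guess of 6 jobs per change
print(expected_result_events(50, 13))  # using the 13 jobs nova runs in check
```

At 6 jobs per change the estimate falls short of the observed count, while at 13 jobs per change it exceeds it, so a queue of roughly a thousand entries is plausible.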
*** nati_ueno has quit IRC		01:57
jeblair	i have attached a debugger.	01:58
jeblair	i need to get a stack trace, but the last time i tried that with gdb, the old trick i used to use didn't work	01:59
clarkb	gdb or pdb?	02:00
jeblair	gdb	02:00
jeblair	can you attach pdb to a running process?	02:00
Alex_Gaynor	attach a gdb, acquire the GIL, use pdb :)	02:00
jeblair	Alex_Gaynor: do you have instructions for that?	02:01
dstufft	you'll have to teach me how to do that someday Alex_Gaynor	02:01
Alex_Gaynor	if it's a recent gcc there's actually a python embedded that lets you do stuff	02:01
Alex_Gaynor	gdb&\	02:01
Alex_Gaynor	gdb*\	02:01
Alex_Gaynor	http://wiki.python.org/moin/DebuggingWithGdb has some details	02:02
jeblair	Alex_Gaynor: afaict, the 'py-bt' thing is a fedora-ism	02:02
jeblair	https://fedoraproject.org/wiki/Features/EasierPythonDebugging#New_gdb_commands	02:02
Alex_Gaynor	jeblair: it was originally developed by a redhat person for fedora, but it's upstream now	02:02
jeblair	oh. this is on precise	02:03
Alex_Gaynor	maybe debian/friends don't compile with the needed flags or something :(	02:03
jeblair	Alex_Gaynor: i think those are extra gdb commands	02:03
*** rfolco has quit IRC		02:04
jeblair	ah, they are also in the precise python dbg package	02:04
fungi	load average on review.o.o has collapsed too now	02:04
* fungi needs to head out to a dinner reservation. bbl		02:05
Alex_Gaynor	I need to head home from the office because at some point it became 7PM, I'll be around more when I'm home	02:05
*** rfolco has joined #openstack-infra		02:05
clarkb	jeblair: anything else I can be doing now to help?	02:09
jeblair	clarkb: i'm still unable to get a stacktrace. 'py-bt' just says (unable to read python frame information) for every frame	02:10
jeblair	clarkb: figuring out how to get a stacktrace from a running python on ubuntu precise is what i'm working on now. any help there would be appreciated	02:11
*** yaguang has joined #openstack-infra		02:11
clarkb	jeblair: ok	02:12
jeblair	clarkb: apparently those macros expect to be run with python-dbg, which of course is not how we started zuul	02:13
clarkb	jeblair: http://www.python.org/~jeremy/weblog/031003.html not quite a stack trace but possibly useful	02:13
*** xBsd has joined #openstack-infra		02:16
clarkb	jeblair: also http://svn.python.org/projects/python/trunk/Misc/gdbinit	02:17
jeblair	clarkb: i think the objects have changed since then	02:17
clarkb	jeblair: that gdbinit comes with a pystack function	02:18
*** ^demon has quit IRC		02:20
jeblair	clarkb: No symbol "co" in current context.	02:20
jeblair	clarkb: these all seem to be obsolete.	02:20
clarkb	:( yeah they are fairly old	02:20
* clarkb finds python2.7 branch		02:21
*** lbragstad has quit IRC		02:22
jeblair	clarkb: i think it's due to gcc optimizations	02:23
clarkb	jeblair: http://hg.python.org/cpython/file/c048b211f634/Misc/gdbinit doesn't seem different but I haven't actually diffed them	02:23
clarkb	jeblair: ah so the symbols just don't exist because gcc	02:23
jeblair	i wonder if we could even do Alex_Gaynor's pdb trick with the current level of symbol mangling	02:25
Alex_Gaynor	jeblair: if you can grab the GIL and use c execute simple string it should be possible	02:26
jeblair	Alex_Gaynor: that sounds easy but i have no idea how to go about that	02:26
Alex_Gaynor	When I'm at a computer and not my phone I'll try to find a reference	02:27
mordred	jeblair: I'm here - I do not know what I can do to be helpful to you	02:30
clarkb	mordred: we need a stacktrace from running zuul	02:30
mordred	http://www.jmcneil.net/2012/04/debugging-your-python-with-gdb-ftw/	02:31
mordred	reading this now	02:31
jeblair	mordred: my understanding of that is that it does not work because of gcc optimizations	02:32
mordred	jeblair: yeah. I believe you are correct	02:32
mordred	btw - symbol stripping, which debian is obsessed with, has no real noticeable benefit most times	02:33
mordred	and screws you in times like this	02:33
mordred	jeblair: have you installed python-dbg? sometimes deb packages extract the symbols and put them into external files	02:33
jeblair	thanks debian!	02:33
jeblair	mordred: yes i have	02:33
mordred	and gdb can be told to load them as symbol maps	02:33
mordred	let me see if i can get some info on that	02:34
jeblair	mordred: that made the backtraces look like this: #33 0x0000000000466a42 in PyEval_EvalFrameEx ()	02:34
jeblair	mordred: but still no understanding of arguments or local variables	02:34
*** eharney has joined #openstack-infra		02:34
mordred	so "p *co" does nothing	02:35
jeblair	No symbol "co" in current context.	02:35
mordred	awesome	02:35
jeblair	so, we could call this a wash	02:36
*** dina_belova has joined #openstack-infra		02:36
jeblair	and restart zuul using the 'python-dbg' binary	02:36
mordred	oh - wait	02:36
mordred	there's a thing dhellman tweeted about the other day	02:37
*** jfriedly has quit IRC		02:37
clarkb	this must be why people gentoo	02:37
jeblair	and if it happens again, we'd be in a better place (no idea what that would do to performance though, since i think it is doing refcount debugging as well)	02:37
jeblair	mordred: that's exciting; i'm holding for your tweet	02:37
jeblair	(i'll be really excited if the actual method is less than 140 characters)	02:37
mordred	ok. I don't think this is it, but, while I'm looking, look at: https://github.com/albertz/pydbattach	02:38
*** rfolco has quit IRC		02:38
*** dina_belova has quit IRC		02:38
jeblair	mordred: wilco	02:38
*** mriedem has joined #openstack-infra		02:40
jeblair	mordred: neat, but it's complicated, and i don't really want to audit it or compile/run it on our server right now	02:41
mordred	jeblair: ok. that's the closest thing I can find right now	02:41
mordred	I think calling it a wash and restarting zuul with python-dbg is our best bet	02:42
clarkb	wfm	02:42
clarkb	not elegant, but if it keeps things moving...	02:42
*** bingbu has joined #openstack-infra		02:43
jeblair	okay that's clearly more complicated than it seems	02:46
jeblair	ImportError: /usr/local/lib/python2.7/dist-packages/Crypto/Util/_counter.so: undefined symbol: Py_InitModule4_64	02:46
jeblair	ok, so i can just restart it as normal, and add some more debug lines to it i guess.	02:47
jeblair	maybe add a jenkins style "threadDump" command. won't that just be the best?	02:48
jeblair	zuul has been restarted. it has no queue.	02:48
*** mriedem has quit IRC		02:49
*** pcrews has quit IRC		02:49
clarkb	jeblair: that will work too	02:49
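A minimal sketch of the "threadDump" idea mentioned above, assuming a plain CPython process on a POSIX host. It is not zuul's actual implementation; the function names `format_thread_dump` and `install_dump_handler` are hypothetical and chosen only for illustration. It relies on `sys._current_frames()` and the `traceback` module, so it needs no debug symbols or gdb.

```python
import signal
import sys
import threading
import traceback


def format_thread_dump():
    """Return a text dump of the current Python stack of every live thread."""
    # Map thread ids to names so the dump is readable.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread: %s (%s)" % (names.get(thread_id, "unknown"), thread_id))
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    return "\n".join(lines)


def install_dump_handler(signum, out=sys.stderr):
    """Write a thread dump to `out` each time `signum` is delivered.

    Must be called from the main thread, because signal.signal() requires it.
    """
    def handler(sig, frame):
        out.write(format_thread_dump() + "\n")
        out.flush()

    signal.signal(signum, handler)
```

With something like `install_dump_handler(signal.SIGUSR2)` called at startup, `kill -USR2 <pid>` would print every thread's stack to stderr. The choice of SIGUSR2 is an assumption; any signal the daemon does not already use would do.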
Alex_Gaynor	well that doesn't sound good	02:49
mordred	jeblair: sigh. I believe, now that you mention, to use python-dbg, you will need -dbg versions of all of the c-based python libraries you might have installed	02:49
mordred	in addition to the -dbg versions of the c libraries they depend on	02:49
jeblair	mordred: lets move all our servers to rhel	02:50
mordred	jeblair: ok	02:50
clarkb	jeblair: or gentoo	02:50
mordred	jeblair: or gentoo - and we can compile from source ourselves	02:50
jeblair	mordred: https://bugs.launchpad.net/nova/+bug/937554/comments/13	02:51
uvirtbot	Launchpad bug 937554 in nova "Lots of problems with deleting a server immediately after create (dup-of: 934575)" [High,Fix committed]	02:51
uvirtbot	Launchpad bug 934575 in nova "notifier endless loops in is_primitive" [Medium,Fix released]	02:51
*** eharney has quit IRC		02:51
* mordred is looking at the debian packaging and cannot figure out why stack information is missing in the normal python		02:51
*** melwitt has quit IRC		02:51
mordred	they aren't passing stupid optimizer flags	02:51
jeblair	handy instructions for building your own python, in a nova bug report no less!	02:51
jeblair	mordred: "	02:52
jeblair	#Recompiling python with make "CFLAGS=-g -fno-inline -fno-strict-aliasing" solves this problem.	02:52
jeblair	mordred: ^ from that bug report; that help?	02:52
mordred	ahhhh	02:52
mordred	yes	02:52
mordred	-fno-inline	02:52
mordred	I forgot - python actually has a bunch of stuff defined in header files	02:52
mordred	so -O2 is going to wind up inlining the shit out of it	02:52
mordred	-O2 includes -finline-small-functions	02:55
mordred	-O0, which python-dbg is compiled with, does not	02:55
mordred	they're all compiled with -g but then dh_strip puts the symbols into python-dbg	02:55
mordred	none of that is helpful here	02:56
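To make the flag discussion above concrete, here is a small sketch that inspects a compiler flag string and reports whether inlining is likely to hide frame locals from gdb. The helper name `inlining_report` is hypothetical, and the rule it applies is a simplification taken from the conversation (-O2 and above inline small functions unless -fno-inline is given). `sysconfig.get_config_var` is used to read the flags recorded when the running interpreter was built.

```python
import sysconfig


def inlining_report(cflags):
    """Classify a flag string by how gdb-friendly the resulting build is likely to be."""
    flags = cflags.split()
    # With gcc the last -O option on the command line is the effective one.
    opt_levels = [f for f in flags if f.startswith("-O")]
    level = opt_levels[-1] if opt_levels else "-O0"
    optimized = level in ("-O2", "-O3")
    no_inline = "-fno-inline" in flags
    has_debug = any(f.startswith("-g") for f in flags)
    if optimized and not no_inline:
        verdict = "likely inlined: frame locals may be unavailable in gdb"
    else:
        verdict = "inlining suppressed or not requested by these flags"
    return {
        "debug_info": has_debug,
        "no_inline": no_inline,
        "optimized": optimized,
        "verdict": verdict,
    }


if __name__ == "__main__":
    # Either variable may be None on some platforms, so drop empty values.
    recorded = (sysconfig.get_config_var(name) for name in ("OPT", "CFLAGS"))
    print(inlining_report(" ".join(filter(None, recorded))))
```

Passing the flags from the bug report, `"-g -fno-inline -fno-strict-aliasing"`, reports inlining as suppressed, which matches why that rebuild made the gdb macros usable.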
*** afazekas_zz is now known as __afazekas_zz		03:02
jeblair	i have reverified all the changes that were approved and did not have a vrfy-2	03:04
*** rcleere has joined #openstack-infra		03:04
*** markmcclain has quit IRC		03:05
jeblair	i have had a very long day and am not useful. tomorrow i intend to work on nodepool. if anyone wants to add some more debugging or a threadDump feature to zuul, that would be great; otherwise, i'll get to it later this week	03:06
jeblair	also, i'm thinking we should have the gearman-plugin stop sending work status packets.	03:06
*** Ryan_Lane has joined #openstack-infra		03:07
Alex_Gaynor	so, are no builds happening right now?	03:07
clarkb	I can look into zuul threaddumps	03:07
clarkb	after I propose changes to add mysql backups (that should be quick)	03:07
jeblair	Alex_Gaynor: i restarted zuul, should be running now	03:07
Alex_Gaynor	jeblair: there doesn't appear to be anything on http://status.openstack.org/zuul/	03:07
clarkb	jeblair: are work status packets causing problems?	03:07
clarkb	Alex_Gaynor: refresh? there is stuff for me	03:07
*** pcrews has joined #openstack-infra		03:08
jeblair	Alex_Gaynor: you may need to reload it?	03:08
Alex_Gaynor	I don't even know. I hate browsers.	03:08
jeblair	clarkb: no, but we ignore them. just busy work.	03:08
mordred	jeblair: oh, for some reason I thought we were using them for status bars - I agree with anything you say	03:12
clarkb	mordred: that is what I thought they were for too	03:12
clarkb	and isn't zuul LOST status the result of not getting a status from gearman?	03:13
*** erfanian has quit IRC		03:13
jeblair	mordred: we could. what we actually do is grab the estimated time from the first one and then calculate it ourselves.	03:13
mordred	jeblair: ah. nice	03:13
jeblair	clarkb: no, it polls gearman to see if the job is still in the queue. that would be a reasonable thing to do though...	03:14
jeblair	clarkb: it would have helped with the jobs that got stuck in the jenkins queue and never ran	03:14
jeblair	clarkb: maybe we should keep it and just reduce the logging.	03:14
clarkb	++	03:14
jeblair	i've seen several jobs lost because of errors like this: https://jenkins02.openstack.org/job/gate-grenade-devstack-vm/2370/console	03:15
jeblair	i have no idea what's going on there. perhaps a dead slave (nodepool does not have a periodic job to recheck ssh access)	03:15
jeblair	but it seems to happen a lot for that.	03:15
Alex_Gaynor	For all the jobs that were lost when zuul was restarted, are the patch authors notified so they can recheck/reverify?	03:16
clarkb	Alex_Gaynor: no, but I think jeblair indicated he did it for them	03:16
jeblair	Alex_Gaynor: i reverified the ones that were approved;	03:16
Alex_Gaynor	Oh, that's nice of you!	03:17
jeblair	I have not done rechecks.	03:17
jeblair	it's hard to get a gerrit query for that.	03:17
Alex_Gaynor	Things that don't have a current status from jenkins	03:17
Alex_Gaynor	gerrit doesn't have an easy way to do that? :(	03:17
mordred	-label:Verified<=2 will get you the ones that are completely new - but it's hard to get the ones that may have had a new patchset uploaded since the last time they were check verified	03:19
mordred	because we don't clear the verified status on the start of a new check job like we do for the gate	03:19
mordred	actually, you'd want -label:Verified<=2 -label:Approved for the first one, to make sure that you're not catching a thing whose verified vote the gate has cleared	03:20
mordred	but still, you're still missing a ton there	03:21
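The gap described above (a change whose older patchset was verified but whose newest patchset never got a check run) is easier to see as a per-patchset test than as a single search query. The sketch below assumes a hypothetical record shape, a dict with a `patchsets` list where each entry has `number` and `votes`. It is not Gerrit's actual JSON schema, only an illustration of the selection logic.

```python
def needs_recheck(change):
    """True if the newest patchset of `change` carries no Verified vote at all."""
    latest = max(change["patchsets"], key=lambda ps: ps["number"])
    return not any(vote["label"] == "Verified" for vote in latest["votes"])


# Hypothetical sample data; the change numbers are made up for the example.
changes = [
    # Brand new change, never checked: a label-based query can find this one.
    {"id": 1, "patchsets": [{"number": 1, "votes": []}]},
    # Checked once, then a new patchset was uploaded and never checked.
    {"id": 2, "patchsets": [
        {"number": 1, "votes": [{"label": "Verified", "value": 1}]},
        {"number": 2, "votes": []},
    ]},
    # Newest patchset already has a Verified vote: nothing to do.
    {"id": 3, "patchsets": [
        {"number": 1, "votes": [{"label": "Verified", "value": 1}]},
    ]},
]

stale = [c["id"] for c in changes if needs_recheck(c)]
print(stale)
```

Change 2 is the case that is hard to catch with a label query alone when the Verified vote is not cleared at the start of a new check job.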
*** HenryG has quit IRC		03:25
*** zul has quit IRC		03:31
*** cthulhup has joined #openstack-infra		03:33
*** cthulhup has quit IRC		03:37
*** dina_belova has joined #openstack-infra		03:37
*** dina_belova has quit IRC		03:42
*** afazekas has joined #openstack-infra		03:42
*** boris-42 has joined #openstack-infra		03:49
bodepd	fungi: I just went through the following process: https://gist.github.com/bodepd/6276932	03:52
bodepd	fungi: and my redirects worked as expected. I did, however, use github's GUI, and I am not sure what process was used by your team	03:53
*** xBsd has quit IRC		03:53
*** jfriedly has joined #openstack-infra		03:55
*** wenlock has joined #openstack-infra		03:56
*** mberwanger has joined #openstack-infra		03:59
*** yaguang has quit IRC		03:59
*** vogxn has joined #openstack-infra		04:01
*** michchap_ has joined #openstack-infra		04:04
*** michchap has quit IRC		04:08
*** yaguang has joined #openstack-infra		04:12
*** ftcjeff has quit IRC		04:23
*** wenlock has quit IRC		04:24
*** dims has quit IRC		04:25
*** cthulhup has joined #openstack-infra		04:27
*** cthulhup has quit IRC		04:31
*** dina_belova has joined #openstack-infra		04:38
*** mberwanger has quit IRC		04:38
*** dina_belova has quit IRC		04:42
*** xBsd has joined #openstack-infra		04:47
*** reed has quit IRC		04:53
*** yaguang has quit IRC		04:59
*** rcleere has quit IRC		05:03
fungi	bodepd: yeah, mordred did the stackforge/puppet-{quantum,neutron} move, but not sure what he did in github land for it. our http://ci.openstack.org/gerrit.html#renaming-a-project recipe suggests "12. Rename the project in GitHub..." so i would assume that's what he did	05:07
*** dmakogon_ has joined #openstack-infra		05:08
*** yaguang has joined #openstack-infra		05:12
*** cthulhup has joined #openstack-infra		05:21
*** SergeyLukjanov has joined #openstack-infra		05:24
*** cthulhup has quit IRC		05:25
*** nicedice_ has quit IRC		05:29
mordred	fungi, bodepd I'm pretty sure I just deleted the old project and let the new project be created by manage_projects	05:34
*** dina_belova has joined #openstack-infra		05:38
*** dina_belova has quit IRC		05:43
*** thomasbiege has joined #openstack-infra		05:48
*** DennyZhang has joined #openstack-infra		05:55
*** mikal has quit IRC		05:55
*** thomasbiege1 has joined #openstack-infra		05:59
*** thomasbiege has quit IRC		06:02
*** thomasbiege1 has quit IRC		06:13
*** cthulhup has joined #openstack-infra		06:15
*** thomasbiege has joined #openstack-infra		06:17
*** cthulhup has quit IRC		06:20
*** dina_belova has joined #openstack-infra		06:39
*** dina_belova has quit IRC		06:43
*** tian has quit IRC		06:44
*** nayward has joined #openstack-infra		06:47
*** fbo is now known as fbo_away		06:49
*** SergeyLukjanov has quit IRC		06:50
*** jfriedly has quit IRC		06:52
bodepd	mordred: :( . I'm trying to reach out to some folks at github to see if they can help us setup those redirects	06:57
bodepd	mordred: I may need someone with actual credentials to approve it once I get a hold of the right person	06:58
*** michchap has joined #openstack-infra		07:00
*** xchu has quit IRC		07:00
*** michchap_ has quit IRC		07:02
*** cthulhup has joined #openstack-infra		07:09
*** SergeyLukjanov has joined #openstack-infra		07:11
*** xchu has joined #openstack-infra		07:12
*** cthulhup has quit IRC		07:14
*** SergeyLukjanov has quit IRC		07:14
*** ruhe has joined #openstack-infra		07:26
*** pblaho has joined #openstack-infra		07:29
*** boris-42 has quit IRC		07:34
*** SergeyLukjanov has joined #openstack-infra		07:38
*** dina_belova has joined #openstack-infra		07:39
*** michchap has quit IRC		07:39
*** michchap has joined #openstack-infra		07:39
odyi	bodepd: Simply contacting Github support had really good turn around on the redirects from puppetlabs/puppetlabs-* to stackforge/puppet-*.	07:41
odyi	They manually put them in long before I actually deleted the repositories.	07:42
*** odyssey4me3 has joined #openstack-infra		07:47
odyi	The "Approved" label that seems to be a part of each Gerrit project. What is it used for? Gerrit docs don't make mention of it so I assume it is a custom label.	07:48
* odyi also couldn't find it mentioned in any of the OpenStack/Gerrit workflow docs.		07:50
*** michchap has quit IRC		07:52
*** morganfainberg is now known as morganfainberg\|a		07:55
*** DennyZhang has quit IRC		07:56
*** SergeyLukjanov has quit IRC		08:00
*** vogxn has quit IRC		08:03
*** cthulhup has joined #openstack-infra		08:03
*** jpich has joined #openstack-infra		08:04
*** derekh has joined #openstack-infra		08:06
*** fbo_away is now known as fbo		08:07
*** cthulhup has quit IRC		08:08
*** xchu has quit IRC		08:09
*** alex_dolby has joined #openstack-infra		08:15
*** jhesketh has quit IRC		08:16
alex_dolby	hi guys.. i am running tox -epy26 in the python-novaclient component and getting an error about pbr versions	08:17
alex_dolby	setup.py and requirements.txt specify different pbr versions..	08:18
alex_dolby	any pointers?	08:18
*** mkerrin has quit IRC		08:20
*** dina_belova has quit IRC		08:21
*** ladquin has quit IRC		08:24
*** xchu has joined #openstack-infra		08:26
*** psedlak has joined #openstack-infra		08:27
*** SergeyLukjanov has joined #openstack-infra		08:27
*** boris-42 has joined #openstack-infra		08:40
*** cthulhup has joined #openstack-infra		08:57
*** vogxn has joined #openstack-infra		09:02
*** cthulhup has quit IRC		09:02
*** arezadr has quit IRC		09:12
*** dina_belova has joined #openstack-infra		09:22
*** dina_belova has quit IRC		09:26
*** bingbu has quit IRC		09:27
*** SergeyLukjanov has quit IRC		09:32
*** dina_belova has joined #openstack-infra		09:32
*** dina_belova has quit IRC		09:34
*** dina_belova has joined #openstack-infra		09:35
*** yaguang has quit IRC		09:43
*** odyssey4me3 has quit IRC		09:54
*** xchu has quit IRC		10:03
*** odyssey4me3 has joined #openstack-infra		10:05
*** dina_belova has quit IRC		10:09
*** LinuxJedi has quit IRC		10:09
*** ruhe has quit IRC		10:12
*** alexpilotti has joined #openstack-infra		10:12
*** odyssey4me3 has quit IRC		10:17
*** LinuxJedi has joined #openstack-infra		10:20
*** ruhe has joined #openstack-infra		10:21
*** odyssey4me3 has joined #openstack-infra		10:24
*** thomasbiege has quit IRC		10:28
*** SergeyLukjanov has joined #openstack-infra		10:37
*** nayward has quit IRC		10:45
dhellmann	mordred, jeblair : were you looking for https://github.com/dhellmann/smiley/ last night?	10:49
*** mkerrin has joined #openstack-infra		10:52
*** nayward has joined #openstack-infra		10:52
*** markmc has joined #openstack-infra		10:56
*** thomasbiege has joined #openstack-infra		11:02
*** dina_belova has joined #openstack-infra		11:09
jswarren	After the glanceclient fix yesterday, I reviewed three changes with "recheck no bug" about 12 hours ago. Jenkins has not re-reviewed them yet. Anything else I need to do?	11:09
jswarren	For example: https://review.openstack.org/#/c/40232/	11:11
*** SergeyLukjanov has quit IRC		11:12
*** dina_belova has quit IRC		11:14
*** vogxn has quit IRC		11:16
*** lcestari has joined #openstack-infra		11:17
*** vogxn has joined #openstack-infra		11:18
*** zul has joined #openstack-infra		11:19
*** dina_belova has joined #openstack-infra		11:19
*** dims has joined #openstack-infra		11:20
*** dina_belova has quit IRC		11:24
*** nayward has quit IRC		11:29
*** weshay has joined #openstack-infra		11:31
*** vogxn has quit IRC		11:31
*** ruhe has quit IRC		11:32
*** SergeyLukjanov has joined #openstack-infra		11:39
*** nayward has joined #openstack-infra		11:41
*** SergeyLukjanov has quit IRC		11:44
*** zul has quit IRC		11:46
*** pcm_ has joined #openstack-infra		11:46
*** vogxn has joined #openstack-infra		11:46
*** HenryG has joined #openstack-infra		11:49
openstackgerrit	Julien Danjou proposed a change to openstack/requirements: Add gevent https://review.openstack.org/42871	11:50
*** jjmb1 has quit IRC		11:58
*** afazekas is now known as afazekas_no_irq		11:59
*** yaguang has joined #openstack-infra		12:02
*** ruhe has joined #openstack-infra		12:06
*** rfolco has joined #openstack-infra		12:07
*** alex_dolby has quit IRC		12:09
*** vogxn has quit IRC		12:12
*** apcruz has joined #openstack-infra		12:18
*** mriedem has joined #openstack-infra		12:19
*** dina_belova has joined #openstack-infra		12:20
*** sandywalsh has quit IRC		12:22
*** sandywalsh has joined #openstack-infra		12:24
*** dina_belova has quit IRC		12:25
*** anteaya has joined #openstack-infra		12:27
*** SergeyLukjanov has joined #openstack-infra		12:35
*** ruhe has quit IRC		12:36
*** zul has joined #openstack-infra		12:37
*** dims has quit IRC		12:38
*** dprince has joined #openstack-infra		12:39
*** dkranz has joined #openstack-infra		12:39
*** dims has joined #openstack-infra		12:40
*** dina_belova has joined #openstack-infra		12:43
zul	so im curious why jenkins hasnt been triggered for https://review.openstack.org/#/c/41093/ and https://review.openstack.org/#/c/42789/	12:44
*** ruhe has joined #openstack-infra		12:47
markmc	zul, you know, I think I'm seeing this too with my nova reviews	12:47
* markmc looks		12:47
*** dina_belova has quit IRC		12:47
*** SergeyLukjanov has quit IRC		12:47
markmc	zul, ok, not seeing it now - but think I saw zuul missing some submissions yesterday	12:48
zul	hmmm	12:48
zul	is there a way to kick them off again?	12:49
markmc	looks like recheck doesn't work, I don't know of another way	12:51
markmc	just change the commit message of the first patch and re-submit	12:51
zul	ok	12:55
*** dkranz has quit IRC		12:55
anteaya	markmc zul there were issues yesterday with jenkins. The best I understand is that jenkins was ddosing our git server and there was much work to bring about a resolution. Reading the logs, I can not definitively point to a solution that was found. What you are seeing _may_ be related.	13:00
markmc	ok, thanks	13:01
*** jog0 is now known as jog0-away		13:01
zul	anteaya: cool thanks	13:01
anteaya	np	13:01
*** mberwanger has joined #openstack-infra		13:01
*** adalbas has quit IRC		13:03
*** kiall has quit IRC		13:08
*** dkliban has quit IRC		13:11
*** changbl has quit IRC		13:12
*** whayutin_ has joined #openstack-infra		13:14
*** weshay has quit IRC		13:16
*** xchu has joined #openstack-infra		13:20
*** w_ has quit IRC		13:23
*** sgviking has quit IRC		13:25
*** sgviking has joined #openstack-infra		13:25
*** sgviking has quit IRC		13:26
*** sgviking has joined #openstack-infra		13:26
*** lbragstad has joined #openstack-infra		13:27
*** HenryG has quit IRC		13:27
*** michchap has joined #openstack-infra		13:30
*** mberwanger has quit IRC		13:35
*** prad_ has joined #openstack-infra		13:37
*** burt has joined #openstack-infra		13:42
*** thomasbiege2 has joined #openstack-infra		13:43
mordred	dhellmann: yes	13:44
*** cppcabrera has joined #openstack-infra		13:45
*** thomasbiege has quit IRC		13:46
jd__	ttx, mordred, dhellmann, whoever, I'd need https://review.openstack.org/#/c/42871/ quickly to unblock Ceilometer CI failing	13:46
jd__	zul: ^	13:46
mordred	jd__: can you point me to the failing thing?	13:46
ttx	jd__: looks like I don't have +2 on requirements	13:47
* mordred wants to understand why our mirror builder isn't picking it up		13:47
ttx	mordred: I thought I had, but meh	13:47
jd__	mordred: http://logs.openstack.org/46/42846/1/check/gate-ceilometer-python27/caaca73/console.html.gz	13:47
mordred	thank you	13:47
jd__	mordred: Pymongo does not specify the dependency…	13:48
ttx	I can certainly spare the effort	13:48
mordred	jd__: o m g	13:48
mordred	jd__: SERIOUSLY?	13:48
mordred	I hate people	13:48
*** dina_belova has joined #openstack-infra		13:48
jd__	I couldn't agree more	13:48
jd__	I've opened a ticket upstream https://jira.mongodb.org/browse/PYTHON-558	13:48
mordred	aprvd	13:48
ttx	mordred: was I supposed to have +2 on requirements ? I forget what we originally said (discovered recently I wasn't subscribed to it)	13:49
mordred	ttx: I'm happy to give you +2 on them - makes sense for you to have it	13:49
*** whayutin_ has quit IRC		13:49
jd__	+1 :)	13:50
ttx	can't remember if I signed up for it or not	13:51
*** dina_belova has quit IRC		13:52
mordred	jd__: ok- there is feedback on that bug...	13:53
ttx	mordred: let me watch the reviews for some time to see if I actually care enough	13:53
jd__	mordred: just saw, I'm responding	13:53
mordred	jd__: I did too	13:53
jd__	ah	13:53
* jd__ lags		13:53
mordred	jd__: "gevent doesn't support python 3 or pypy" -- is there an internal feature of pymongo that you're using that's going to get us in trouble with python 3 and pypy support?	13:54
jd__	mordred: no, we use nothing fancy	13:54
mordred	k. cool	13:54
mordred	I'll be interested to see what's going on here	13:54
jd__	that's why I'm surprised we see errors about gevent now that we pull pymongo 2.6	13:54
*** michchap has quit IRC		13:57
*** dina_belova has joined #openstack-infra		13:58
*** ftcjeff has joined #openstack-infra		13:59
openstackgerrit	Russell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job temporarily https://review.openstack.org/42898	14:00
*** weshay has joined #openstack-infra		14:01
*** vogxn has joined #openstack-infra		14:01
jd__	ah now that talks about greenlet and I'm going to be lost in that again	14:02
* jd__ runs		14:02
*** dina_belova has quit IRC		14:03
openstackgerrit	Russell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job https://review.openstack.org/42898	14:06
*** dkliban has joined #openstack-infra		14:08
*** xBsd has quit IRC		14:10
*** xBsd has joined #openstack-infra		14:10
*** xBsd has quit IRC		14:12
mordred	jd__: I think we can remove use_greenlets	14:15
mordred	jd__: "If you need to use standard Python threads in the same process as Gevent and greenlets"	14:16
jd__	indeed, we don't use threads so that should be ok I guess	14:16
markmc	are you sure?	14:16
markmc	libraries have been known to spawn random threads :)	14:17
*** pabelanger has quit IRC		14:17
jd__	now I'm unsure and scared	14:17
mordred	markmc: well, let's solve that problem when we come to it for real - keeping the option means we're adding another python3 incompatibility	14:17
mriedem	dhellmann: ping	14:17
mordred	jd__: can we try a patch to ceilometer that removes the option?	14:18
markmc	jd__, cooperative coroutines mumble mumble ... oh, look over there!	14:18
jd__	mordred: sure, it'll take me a sec'	14:18
mordred	jd__: sending patch in...	14:18
jd__	mordred: cool	14:18
mordred	jd__: https://review.openstack.org/42906	14:20
*** changbl has joined #openstack-infra		14:20
jd__	mordred: ack, approving, if Jenkins' happy, we'll be too	14:21
mordred	great!	14:21
jd__	and we'll be able to revert gevent fortunately	14:21
*** xBsd has joined #openstack-infra		14:21
mordred	I already blocked that from merging	14:21
mordred	and https://jira.mongodb.org/browse/PYTHON-558?focusedCommentId=407277#comment-407277 for anyone who wants to play along	14:22
mriedem	dhellmann: nevermind	14:22
*** odyssey4me3 has quit IRC		14:22
ttx	nice turnaround on that bug report	14:22
mordred	dhellmann: I'm reading the mailing list as being in approval of giving Alex_Gaynor +2 on requirements...	14:23
mordred	dhellmann: shall we make that happen?	14:23
*** beagles has joined #openstack-infra		14:28
*** thomasbiege2 has quit IRC		14:28
*** rcleere has joined #openstack-infra		14:32
*** mrmartin has joined #openstack-infra		14:33
*** gordc has joined #openstack-infra		14:35
*** markmcclain has joined #openstack-infra		14:37
*** ruhe has quit IRC		14:38
gordc	hi folks, would anyone happen to know when the cron job runs to update CI mirror? i just made a release for a lib and was wondering when jenkins would pick it up ... or if i could force it to get picked up.	14:38
*** datsun180b has joined #openstack-infra		14:40
*** yaguang has quit IRC		14:40
*** __afazekas_zz has quit IRC		14:41
mordred	gordc: it runs after we land requirements changes - which lib? is it a thing that we should raise the min in openstack/requirements for?	14:47
*** odyssey4me4 has joined #openstack-infra		14:47
*** senk has joined #openstack-infra		14:47
*** derekh has quit IRC		14:50
gordc	mordred: it's for pycadf library (a new lib for audit data) -- i did not include a min since some changes were still being made around the time it was added	14:50
*** SergeyLukjanov has joined #openstack-infra		14:51
*** dina_belova has joined #openstack-infra		14:57
*** david-lyle has quit IRC		14:57
*** cthulhup has joined #openstack-infra		14:58
*** sandywalsh has quit IRC		15:01
*** ryanpetrello has joined #openstack-infra		15:03
*** wu_wenxiang has joined #openstack-infra		15:04
mriedem	gordc: hey, i noticed that this didn't automatically change the status/assignee of the bug in launchpad: https://review.openstack.org/#/c/42904/	15:04
mriedem	was going to ask dhellmann if the pycadf project is hooked up to launchpad via gerrit for status changes	15:05
gordc	mriedem: it probably isn't hooked up correctly. i created the launchpad project so good chance i mucked it up :)	15:06
wu_wenxiang	I find some commits haven't started their checks for a long time, for example: https://review.openstack.org/#/c/38963/ and https://review.openstack.org/#/c/42794/	15:06
*** ruhe has joined #openstack-infra		15:08
*** sridevi has joined #openstack-infra		15:08
*** xchu has quit IRC		15:08
jeblair	wu_wenxiang: leave a comment with "recheck no bug"; we had to restart zuul yesterday and it lost its queue	15:09
sridevi	Hi could someone help me with https://review.openstack.org/#/c/34801/	15:09
sridevi	I see "ERROR:root:Could not find any typelib for GnomeKeyring" failures	15:10
*** ^d has joined #openstack-infra		15:12
*** ^d has joined #openstack-infra		15:12
*** xBsd has quit IRC		15:12
wu_wenxiang	jeblair: Thanks	15:12
*** SlickNik has quit IRC		15:13
*** vogxn has quit IRC		15:13
*** SlickNik has joined #openstack-infra		15:14
*** pabelanger has joined #openstack-infra		15:15
*** wu_wenxiang has quit IRC		15:16
*** david-lyle has joined #openstack-infra		15:17
*** sandywalsh has joined #openstack-infra		15:17
ryanpetrello	jeblair: Can I bug you to take a peek at this review? https://review.openstack.org/#/c/42685/2	15:17
*** UtahDave has joined #openstack-infra		15:19
ryanpetrello	or clarkb for that matter	15:19
*** ruhe has quit IRC		15:20
jeblair	ryanpetrello: i'm hacking on a fix for a production problem we've been having right now, but i will make it a point to review it today if the rest of the team hasn't taken care of it	15:21
ryanpetrello	thanks	15:21
ryanpetrello	this obviously takes a back seat :)	15:21
*** ruhe has joined #openstack-infra		15:22
openstackgerrit	gordon chung proposed a change to openstack/requirements: assign a min version to pycadf https://review.openstack.org/42923	15:23
*** reed has joined #openstack-infra		15:23
*** dina_belova has quit IRC		15:24
*** sridevi has quit IRC		15:24
mordred	ryanpetrello: done	15:30
ryanpetrello	jeblair: Monty approved it, thanks	15:30
mordred	jeblair: anything I can do to help you?	15:30
ryanpetrello	(thanks)	15:30
openstackgerrit	A change was merged to openstack-infra/config: Add WSME to StackForge. https://review.openstack.org/42685	15:36
*** nayward has quit IRC		15:39
*** afazekas_no_irq is now known as afazekas		15:42
*** thomasbiege has joined #openstack-infra		15:42
*** vogxn has joined #openstack-infra		15:43
NobodyCam	jeblair: seems we have no core members on stackforge/pyghmi we did before the rename	15:45
*** rnirmal has joined #openstack-infra		15:46
*** zehicle_at_dell has joined #openstack-infra		15:47
mordred	NobodyCam: looking	15:49
NobodyCam	thank you mordred :)	15:50
openstackgerrit	Monty Taylor proposed a change to openstack-infra/config: Rename python-impi acl file to pyghmi https://review.openstack.org/42932	15:51
NobodyCam	w00t	15:51
mordred	NobodyCam: should be fixed soon	15:51
*** changbl has quit IRC		15:51
NobodyCam	:) TY	15:51
NobodyCam	mordred: shouldn't you be burning things about now?	15:52
mordred	NobodyCam: soon	15:52
NobodyCam	:)	15:52
*** mrodden has quit IRC		15:53
*** davidhadas has quit IRC		15:54
*** ruhe has quit IRC		15:55
openstackgerrit	A change was merged to openstack-infra/config: Rename python-impi acl file to pyghmi https://review.openstack.org/42932	15:58
*** xBsd has joined #openstack-infra		15:59
clarkb	morning	16:00
NobodyCam	good morning clarkb	16:01
clarkb	mordred jeblair: which production issue?	16:01
mordred	clarkb: I'm assuming the thing from yesterday	16:01
*** sridevi has joined #openstack-infra		16:02
mordred	clarkb: if you have a second, a ton of these: https://review.openstack.org/#/q/watchedby:mordred%2540inaugust.com+-label:CodeReview%253C%253D-1+-label:Verified%253C%253D-1+-label:Approved%253E%253D1++-status:workinprogress+-status:draft+-is:starred+-owner:mordred%2540inaugust.com,n,z	16:02
clarkb	mordred: which one :) it was like a horrible train wreck	16:02
mordred	clarkb: could use a second +2 and are trivial changes	16:02
*** sridevi has quit IRC		16:03
clarkb	mordred ok I have a couple things I want to fix while I am thinking of them but can look at those after	16:03
mordred	clarkb: k. they're not important, but most of them are simple enough to be 'while drinking first cup of coffee' fodder	16:03
clarkb	mordred jeblair what do you think of adding something like celery.contrib.rdb to zuul for stack traces and remote pdb	16:03
mordred	oy	16:04
mordred	something about using celery in a project that uses gear seems weird	16:04
clarkb	I would simplify and vendor it	16:04
*** mrodden has joined #openstack-infra		16:04
clarkb	mordred forget it is celery :) but their contrib.rdb module seems relatively decent and they have tests for it	16:04
mordred	neat	16:05
*** gyee has joined #openstack-infra		16:05
mordred	then why not just requirements celery?	16:05
clarkb	we could do that too... seems heavy for something like a contrib module. I could go either way vendor or require	16:06
NobodyCam	mordred: should that merge have fixed us?	16:06
mordred	NobodyCam: it'll take a minute	16:07
NobodyCam	ahh ok :) TY	16:07
mordred	NobodyCam: we have to wait for the git pull cron followed by the puppet agent - so it could be as long as 30 minutes from merge	16:07
*** jfriedly has joined #openstack-infra		16:08
*** gordc has left #openstack-infra		16:08
mordred	clarkb: also, your haproxy patch has 3 +2's : https://review.openstack.org/#/c/42784/ so I think whenever you want to land that and ride shotgun, you know, whatever	16:09
*** odyssey4me4 has quit IRC		16:11
jeblair	mordred, clarkb: i am reworking nodepool (as i mentioned yesterday)	16:11
*** pabelanger has quit IRC		16:12
jeblair	clarkb: the celery thing is heavyweight. i don't think we need a full remote debugger, we just need better logging, and the ability to get a stacktrace if something is stuck...	16:13
clarkb	jeblair: It needs an update. because the proxy is a single source we need to bump xinetd limits... i will propose that shortly	16:13
*** thomasbiege has quit IRC		16:13
pleia2	testing 42784 here now	16:13
jeblair	clarkb: and that's just for a desperate situation -- in reality we should always be able to figure out what's going on from logs. this is perhaps the first time we've been unable to do that with zuul. :(	16:13
clarkb	jeblair: ok, I figured remote debugger would give us that and more, but can just log stacktraces as a start	16:14
pleia2	clarkb: there are some errors for 42784, investigating and drafting up comment now	16:16
clarkb	pleia2 thanks. /me -> office	16:16
openstackgerrit	Russell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job https://review.openstack.org/42898	16:22
*** boris-42 has quit IRC		16:22
*** dina_belova has joined #openstack-infra		16:24
ryanpetrello	mordred: how long does it generally take for merged openstack-infra/config projects to show up in github.com/stackforge ?	16:26
mordred	ryanpetrello: usually quicker than this - let me look	16:26
ryanpetrello	thx	16:26
openstackgerrit	Monty Taylor proposed a change to openstack-infra/config: Make the gitweb links in gerrit point to git.o.o https://review.openstack.org/42694	16:27
*** pabelanger has joined #openstack-infra		16:27
*** markmc has quit IRC		16:29
clarkb	pleia2 try stopping xinetd first. It has port 9418	16:32
clarkb	or rather kick it to pick up the new config	16:33
*** nicedice_ has joined #openstack-infra		16:34
pleia2	clarkb: ah, yeah! it didn't pick up the new config, restarting it then starting haproxy is fine	16:34
*** xBsd has quit IRC		16:36
*** psedlak has quit IRC		16:36
clarkb	cool I will encode into puppet	16:37
*** adalbas has joined #openstack-infra		16:38
*** dina_belova has quit IRC		16:41
*** pycabrera has joined #openstack-infra		16:42
*** nati_ueno has joined #openstack-infra		16:42
*** kgriffs has joined #openstack-infra		16:42
*** nati_ueno has joined #openstack-infra		16:43
*** pblaho has quit IRC		16:43
pleia2	having some trouble getting it to clone with haproxy enabled, browsing logs	16:43
kgriffs	hey guys, Kurt here from the Marconi team. We'd like to enable logging and/or meetbot for #openstack-marconi - what's the recommended way to do this?	16:43
kgriffs	host it ourselves, or is there a shared bot?	16:43
*** cppcabrera has quit IRC		16:43
*** pycabrera is now known as cppcabrera		16:43
pleia2	kgriffs: there is a shared bot, hang on, I'll grab a recent review as an example	16:44
annegentle	modules/gerritbot/files/gerritbot_channel_config.yaml	16:44
annegentle	kgriffs: I think that's it. ^^	16:44
*** alexpilotti has quit IRC		16:44
pleia2	kgriffs: https://review.openstack.org/#/c/41512/	16:44
annegentle	pleia2: mine's not so recent, but https://review.openstack.org/#/c/21696/	16:44
annegentle	heh	16:44
pleia2	for logging it's modules/openstack_project/manifests/eavesdrop.pp	16:44
pleia2	not gerritbot	16:44
kgriffs	cool, thanks!	16:45
pleia2	gerritbot is the one that tells you updates in reviews merges and things :)	16:45
kgriffs	actually, I think we are in gerritbot	16:45
cppcabrera	yup, we have gerritbot running as of yesterday. :D	16:45
ryanpetrello	mordred: that seemed to work, thx :)	16:45
pleia2	kgriffs: once it's in eavesdrop you get logs up on http://eavesdrop.openstack.org/	16:45
ryanpetrello	I noticed, however that one of the groups was created - https://review.openstack.org/#/admin/groups/202,members - while the other, wsme-ptl, wasn't	16:46
dhellmann	mordred: I am, too. I was going to wait the number of days specified in https://wiki.openstack.org/wiki/Governance/Approved/CoreDevProcess but I don't have	16:46
kgriffs	pleia2: excellent	16:46
dhellmann	mordred: added Alex_Gaynor to requirements-core group in gerrit	16:48
openstackgerrit	Clark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy. https://review.openstack.org/42784	16:48
clarkb	pleia2: ^ slightly updated. You may want to try those settings as the xinetd ACLs are slightly relaxed to be more friendly to haproxy	16:48
pleia2	clarkb: great, thanks	16:49
openstackgerrit	Monty Taylor proposed a change to openstack-dev/pbr: Rework run_shell_command https://review.openstack.org/42337	16:49
openstackgerrit	Monty Taylor proposed a change to openstack-dev/pbr: Use wheel by default https://review.openstack.org/41255	16:49
mordred	ryanpetrello: you are now in wsme-core, so you should be able to add other people	16:51
mordred	as you see fix	16:51
mordred	fit	16:51
mordred	ryanpetrello: poking wsme-ptl	16:51
*** SlickNik has quit IRC		16:52
ryanpetrello	awesome, and thank you	16:52
mordred	NobodyCam: you should be set	16:52
*** SlickNik has joined #openstack-infra		16:52
mordred	ryanpetrello: I'm excited to have wsme moved in!	16:52
*** alexpilotti has joined #openstack-infra		16:52
dhellmann	mordred: cdevienne is looking forward to having more contributors :-)	16:53
NobodyCam	mordred: Thank you !!	16:53
mordred	dhellmann: :)	16:53
*** kgriffs has left #openstack-infra		16:53
*** afazekas has quit IRC		16:54
mordred	dhellmann: while you're here, could I get a second +2 on https://review.openstack.org/#/c/42515/ ? I have another patch that's wanting it and I'm trying to clear as much of my outstanding niggly stuff before I am out today	16:54
dhellmann	mordred: sure, looking now	16:54
mordred	dhellmann: (there's two other in requirements that could probably use love as well)	16:54
dhellmann	mordred: I've got a standup in 3 minutes, but after that can look at anything you'd like reviewed	16:56
clarkb	pleia2: anything I can do to help testing/debug git-daemon?	16:57
openstackgerrit	Alejandro Cabrera proposed a change to openstack-infra/config: feat: add marconi channel to eavesdrop https://review.openstack.org/42956	16:57
pleia2	clarkb: the patch helps us stop losing the puppet lottery (xinetd should have to look at the file it's subscribed to first before haproxy stuff happens) but still unable to clone from git:// with it enabled, looking for haproxy related logs now	17:00
*** ladquin has joined #openstack-infra		17:00
*** fbo is now known as fbo_away		17:01
*** jerryz has joined #openstack-infra		17:04
*** morganfainberg\|a is now known as morganfainberg		17:04
pleia2	gosh, looking for issues with git specifically is a fun google-fu problem	17:08
*** dprince has quit IRC		17:08
morganfainberg	Alex_Gaynor: ping	17:09
clarkb	pleia2: is it like googling for Go?	17:14
pleia2	yeah, and screen(1) :)	17:14
pleia2	might be an issue with my test instance though, it doesn't have a fqdn for one	17:14
fungi	i cannot, for the life of me, figure out how to adjust bugtask metadata for git-review on bug 1179008 (trying to set it to fix-committed for example). tried repeatedly over the past few days and every time i get a launchpad "timeout error..." ideas?	17:15
openstackgerrit	Clark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads. https://review.openstack.org/42959	17:15
uvirtbot	Launchpad bug 1179008 in python-neutronclient "rename requires files to standard names" [Medium,In progress] https://launchpad.net/bugs/1179008	17:15
pleia2	and ipaddress might show up weird on hpcloud (the local address the machine thinks it has in `ip addr` is not the public address)	17:15
* pleia2 manual tweaks		17:16
*** vipul is now known as vipul-away		17:16
koolhead17	pleia2: hi there	17:17
clarkb	jeblair: ^ 42959 is a bit of a WIP but I figured I would get that out sooner than later. I am still working on testing it (is the easiest way to do that to write a unittest?)	17:17
clarkb	fungi: it times out for me too. Maybe we attached too many projects to that bug?	17:18
pleia2	koolhead17: hey, hope you're enjoying your stay in SF :)	17:19
*** vipul-away is now known as vipul		17:22
koolhead17	pleia2: i am. :)	17:23
jeblair	clarkb: lgtm; you might need to actually run it in order to test it. also, i think there is something messed up with signals and using the internal gear server.	17:23
koolhead17	lets catch up sometime over weekend	17:23
mordred	jeblair: oh lovely	17:23
* koolhead17 waves jeblair mordred clarkb & everyone :D		17:23
mordred	hey koolhead17 - enjoyin SF?	17:23
koolhead17	yes sir. its great	17:24
koolhead17	:)	17:24
koolhead17	i might be in seattle for a day	17:25
clarkb	koolhead17: one day is not enough for seattle :P	17:25
Alex_Gaynor	morganfainberg: pong	17:25
koolhead17	clarkb: i know :(	17:26
clarkb	jeblair: is there still a dev zuul that I can use to test within a running system?	17:26
koolhead17	clarkb: won`t mind coming to portland for beer for few hr though. :D	17:26
koolhead17	Alex_Gaynor: hi there	17:26
morganfainberg	Alex_Gaynor: hey, wanted to follow up with you regarding: https://review.openstack.org/#/c/42455/ (since you, in theory could bump up to a +2 now, btw, gratz on core for requirements)	17:27
pleia2	clarkb: so netstat tells me git daemon isn't even running when not on the default port, so trying to fix that now	17:27
clarkb	jeblair: I have at least one small update to that. I realized that a reconfigure will also reconfigure logging so I am just going to get the logger each time I need to dump stack traces	17:27
morganfainberg	Alex_Gaynor: see if there was any outstanding concerns, since that's the next blocker for my caching stuff in keystone.	17:27
clarkb	pleia2: doesn't xinetd fork git-daemon's on demand as connections come in?	17:28
*** vogxn has quit IRC		17:28
fungi	however xinetd should be listening on that port	17:29
Alex_Gaynor	morganfainberg: I don't think there are any outstanding concerns, but I'll have to give it a once over before +2ing :) I'll come around in a few minutes to it	17:29
*** pcm_ has quit IRC		17:29
pleia2	clarkb: yeah, but it still should have: :::9418 :::* LISTEN 10606/xinetd	17:29
pleia2	as fungi says :)	17:29
morganfainberg	Alex_Gaynor: thanks! i appreciate it :)	17:29
clarkb	pleia2: haproxy will be 9418, xinetd on 29418	17:29
pleia2	right, haproxy shows up on 9418 and no xinetd at all	17:30
pleia2	can't get it to listen on 29418	17:30
clarkb	weird	17:31
* pleia2 confirms it's not selinux		17:31
*** pcm_ has joined #openstack-infra		17:32
*** SergeyLukjanov has quit IRC		17:34
clarkb	jeblair: woot, I wrote a small script that sits in a while loop with that signal handler configured and it seems to work	17:35
clarkb	jeblair: much easier testing that way than getting a complete zuul running	17:35
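A minimal sketch of the approach discussed here, assuming a plain Python process: a SIGUSR2 handler that logs the stack of every live thread and fetches the logger on each call, so a logging reconfiguration between signals is picked up. This is not the code from change 42959; the logger name and function names are hypothetical.

```python
import logging
import signal
import sys
import threading
import traceback


def format_thread_stacks():
    """Return one string holding the current stack of every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        name = names.get(thread_id, "unknown")
        chunks.append("Thread: %s (%s)\n" % (name, thread_id))
        chunks.append("".join(traceback.format_stack(frame)))
    return "".join(chunks)


def stack_dump_handler(signum, frame):
    # Fetch the logger on every call rather than caching it, so that a
    # logging reconfiguration between signals takes effect.
    log = logging.getLogger("example.stack_dump")  # hypothetical logger name
    log.debug(format_thread_stacks())


def install_handler():
    # SIGUSR2 only exists on POSIX platforms.
    signal.signal(signal.SIGUSR2, stack_dump_handler)
```

With the handler installed in a long-running process, `kill -USR2 <pid>` would write the thread stacks to the log at debug level.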
*** cthulhup has quit IRC		17:35
pleia2	Aug 20 17:36:39 git-vanilla xinetd[10709]: Service git expects port 9418, not 29418	17:36
pleia2	heh	17:36
pleia2	dear xinetd, do it anyway	17:37
*** mgagne has joined #openstack-infra		17:37
*** mgagne has quit IRC		17:37
*** mgagne has joined #openstack-infra		17:37
openstackgerrit	Anita Kuno proposed a change to openstack-dev/hacking: Testing how .html files are rendered by cgit. https://review.openstack.org/42961	17:42
*** zul has quit IRC		17:42
morganfainberg	Alex_Gaynor: looks like dhellmann got to it before you. thanks :)	17:46
Alex_Gaynor	morganfainberg: okey doke, sorry bout that, I'm writing some scripts to setup swift for some benchmarking :)	17:46
morganfainberg	Alex_Gaynor: not a problem man, was just following up with people today about it. thanks again!	17:47
openstackgerrit	Anita Kuno proposed a change to openstack-dev/hacking: Testing how .html files are rendered by cgit https://review.openstack.org/42961	17:48
clarkb	pleia2: maybe we should consider running it as a stand alone daemon?	17:48
clarkb	pleia2: and rely on haproxy to do the DDoS protection	17:48
pleia2	clarkb: so it looks like xinetd uses /etc/services to determine where it should bind stuff, by patching /etc/services I got it to work, but this seems sub-optimal	17:50
pleia2	(commented out the 9418 git lines, added ones for 29418)	17:51
*** dina_belova has joined #openstack-infra		17:52
clarkb	pleia2: so cloning works now? its a start :)	17:52
pleia2	yeah! This is with haproxy running: git clone git://15.185.127.146/openstack-infra/config.git	17:53
pleia2	browsing git-daemon docs, the /etc/services thing may actually be more git daemon and less xinetd	17:54
pleia2	so maybe we do need to change /etc/services	17:55
*** dina_belova has quit IRC		17:57
clarkb	ok	17:58
anteaya	pleia2: to add to your list of things to do, here is a patch consisting of an .html file I generated with rst2html: https://review.openstack.org/#/c/42961/	17:58
anteaya	let me know how it looks	17:58
clarkb	pleia2: that seems hacky though	17:59
*** cppcabrera is now known as cppcabrera_afk		17:59
pleia2	clarkb: yeah, so if we run it stand alone without --inetd we should be able to specify an alternate --port	18:00
clarkb	pleia2: I like that better	18:01
pleia2	I am not sure of the best way to do this, as "the centos way" is using xinetd to run services that don't have specific init scripts, git is just a command line "git daemon..."	18:03
clarkb	pleia2: ubuntu's git daemon package comes with an init script. we could vendor it for centos	18:03
clarkb	I am sure that the red hat folk in the channel want to beat me after saying that	18:03
pleia2	hehe	18:04
pleia2	so we'd just drop it in /etc/init.d/ ? I am really unfamiliar with rh init system stuff	18:04
pleia2	(well, after tweaking it to work properly, of course)	18:05
openstackgerrit	Clark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads. https://review.openstack.org/42959	18:07
clarkb	jeblair: ^ that comes with a test. Let me know what you think	18:07
clarkb	pleia2: yes, dropping it in /etc/init.d/ and having puppet ensure the service is enabled should be sufficient	18:07
clarkb	assuming that the debian/ubuntu script doesn't have a bunch of debianisms in it that centos won't like	18:08
*** changbl has joined #openstack-infra		18:09
pleia2	clarkb: looking now, it does - hard coded paths, /etc/default references, might actually be worth rewriting	18:09
pleia2	there are useful things I can pull from it though, hacking away	18:10
pleia2	anteaya: ok, I'll have a look in a little bit	18:12
pleia2	or use one someone already wrote http://robescriva.com/blog/2009/01/13/git-daemon-init-scripts-on-centos-52/	18:13
anteaya	k thanks	18:13
* pleia2 frowns at no license		18:14
pleia2	ah, easy enough to write own	18:14
clarkb	pleia2: let me know if there is anything I can do to help	18:16
clarkb	I half feel like I threw my crazy haproxy idea over the wall >_>	18:17
*** cthulhup has joined #openstack-infra		18:17
clarkb	was not my intention :)	18:17
*** zul has joined #openstack-infra		18:17
pleia2	no worries, it mostly worked, certainly didn't anticipate it being so cranky about non-standard ports, it shouldn't be like this :)	18:18
*** xBsd has joined #openstack-infra		18:23
*** melwitt has joined #openstack-infra		18:23
*** cthulhup has quit IRC		18:24
clarkb	pleia2: jeblair: mordred: Worth noting that the g-g-p times with https://git.o.o seem to be better than when against review.o.o on centos unittest slaves	18:25
clarkb	so maybe we should stop worrying too much about git://	18:25
pleia2	hmm, maybe there is a way I can edit the server_args line to support port	18:27
*** vipul is now known as vipul-away		18:30
*** vipul-away is now known as vipul		18:30
pleia2	not so much	18:31
reed	need a staging server for activity.openstack.org	18:32
pleia2	clarkb: maybe, seems unlikely that if we point everything at https that there will be enough load on git:// to cause problems	18:34
*** arezadr has joined #openstack-infra		18:35
jeblair	reed: do you want to write the puppet (we can point you to some docs), or do you want someone else to do it?	18:39
reed	jeblair, send me the puppet stuff, I'd like to learn	18:39
reed	(is that a good answer or what?)	18:39
*** markmcclain has quit IRC		18:40
pleia2	reed: http://ci.openstack.org/sysadmin.html#adding-a-new-server is a good start :)	18:40
anteaya	are we waiting for anything specific for this patch: https://review.openstack.org/#/c/38177/ Use cgit server instead of github for everything There is quite the lineup of green +'s on it	18:40
jeblair	reed: it's the most perfect answer ever. :)	18:40
* reed admires his most perfect answer ever, sipping coffee		18:41
jeblair	reed: http://ci.openstack.org/sysadmin.html#adding-a-new-server	18:41
jeblair	reed: you should actually start at the top of that doc	18:41
pleia2	anteaya: still working to tune the git server before we throw everything at it	18:41
jeblair	reed: it has background info, and also instructions on how to test	18:41
anteaya	pleia2: ah, okay	18:41
jeblair	reed: but the section i pointed to has the actual steps	18:41
jeblair	reed: and somewhere, there's mrmartin's change to add his staging server	18:42
jeblair	looking	18:42
reed	jeblair, oh, right... I can copy that too	18:42
jeblair	reed: https://review.openstack.org/#/c/42608/	18:42
reed	sweet	18:42
*** SergeyLukjanov has joined #openstack-infra		18:43
jeblair	reed, mrmartin: and sorry i haven't reviewed that yet. it is a high priority, after we get some of the operational issues we've been dealing with under control	18:43
*** SergeyLukjanov has quit IRC		18:43
jeblair	(this week is very busy due to a feature freeze deadline)	18:43
reed	np, mrmartin is on vacation today anyway	18:44
mordred	damn feature freeze	18:44
mordred	clarkb, pleia2: git-daemon wants us to edit /etc/services to run it on another port?	18:45
*** vipul is now known as vipul-away		18:46
pleia2	mordred: well, inetd does	18:46
pleia2	if running it from xinetd or using --inetd on the command line, you can't specify --port because it just does an /etc/services lookup and will only use what's in that file	18:47
pleia2	I vote that this is broken :)	18:47
pleia2	but it is what it is	18:47
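To illustrate the behaviour described above, here is a small sketch of a name-to-port lookup over /etc/services-style text. It is not xinetd's or git-daemon's actual code; the sample entries and function name are assumptions, and the patched sample mirrors the workaround of commenting out the 9418 lines and adding 29418 ones.

```python
def parse_services(text):
    """Map (service name, protocol) -> port from /etc/services-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, proto = fields[1].split("/", 1)
        if not port.isdigit():
            continue
        # The first field is the service name; later fields are aliases.
        for name in [fields[0]] + fields[2:]:
            table.setdefault((name, proto), int(port))
    return table


# Assumed sample entries, before and after the manual patch.
STOCK = """\
git             9418/tcp        # Git Version Control System
git             9418/udp
"""

PATCHED = """\
#git            9418/tcp        # Git Version Control System
#git            9418/udp
git             29418/tcp
git             29418/udp
"""

stock_port = parse_services(STOCK)[("git", "tcp")]
patched_port = parse_services(PATCHED)[("git", "tcp")]
```

On a real host, `socket.getservbyname("git", "tcp")` from the Python standard library performs the equivalent lookup against the system services database.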
*** openstack` has joined #openstack-infra		18:51
*** openstack has quit IRC		18:51
*** pabelanger has quit IRC		18:52
*** openstack` is now known as openstack		18:52
*** boris-42 has joined #openstack-infra		18:52
*** afazekas has joined #openstack-infra		18:53
mordred	pleia2: it seems like a very poor design	18:55
openstackgerrit	James E. Blair proposed a change to openstack-infra/nodepool: WIP: provider manager https://review.openstack.org/42973	18:56
openstackgerrit	Clark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads. https://review.openstack.org/42959	18:57
jeblair	mordred, clarkb: ^ that is my solution to the problems with rate limits we saw yesterday ^. i also think it's a bit cleaner and more reliable.	18:58
clarkb	jeblair: I will review after the meeting	18:58
jeblair	mordred, clarkb: it needs a little more work, and testing with a real provider instead of my fake one, but it's mostly there and worth a general review	18:58
clarkb	jeblair: the zuul change should be ready for review as well	18:59
jeblair	clarkb: thanks	18:59
jeblair	meeting time!	18:59
jeblair	almost	18:59
*** AJaeger has joined #openstack-infra		19:01
*** pabelanger has joined #openstack-infra		19:01
*** mriedem1 has joined #openstack-infra		19:02
*** cppcabrera_afk is now known as cppcabrera		19:03
*** mriedem has quit IRC		19:03
AJaeger	Hi infra team, I'd like to have some guidance and help on getting the Basic Install guide built now also for openSUSE - and thus on docs.openstack.org	19:05
AJaeger	annegentle guided me in https://review.openstack.org/#/c/41777/ to you.	19:06
*** thomasbiege1 has joined #openstack-infra		19:06
clarkb	AJaeger: we are in our weekly meeting currently, so we may be a bit slow to answer, but will catch up after the meeting	19:06
AJaeger	clarkb, sorry, didn't know. Ok, I'll stay around and let you finish your meeting. Thanks for the quick heads-up.	19:06
*** gyee has quit IRC		19:07
*** vipul-away is now known as vipul		19:07
*** vipul is now known as vipul-away		19:07
*** vipul-away is now known as vipul		19:07
AJaeger	clarkb, btw if I should send an email or use other means, just tell me	19:07
clarkb	AJaeger: IRC is probably easiest, it will just be maybe an hour before we can really answer your questions	19:08
AJaeger	clarkb, ok, thanks	19:08
*** markmcclain has joined #openstack-infra		19:15
Alex_Gaynor	So the amount of time between when a job finishes on jenkins, and when zuul records it as done seems way too large. Are there any known bottlenecks there, and what can be done to improve that?	19:15
*** fbo_away is now known as fbo		19:15
nati_ueno	Jenkins review on Gerrit got really readable! Nice	19:17
*** dprince has joined #openstack-infra		19:18
reed	jeblair, pleia2: since the activity-staging server needs to have apache and mysql should I draw inspiration from static.pp for the include::apache and various mods??	19:19
*** kiall_ has joined #openstack-infra		19:20
pleia2	reed: yes	19:21
reed	cool	19:21
jeblair	Alex_Gaynor: link to an example change?	19:22
mordred	nati_ueno: thanks! (jeblair did it)	19:22
*** vipul is now known as vipul-away		19:22
nati_ueno	jeblair: Thanks!	19:23
*** nati_ueno has quit IRC		19:26
*** jerryz has quit IRC		19:26
*** HenryG has joined #openstack-infra		19:26
*** cthulhup has joined #openstack-infra		19:26
Alex_Gaynor	jeblair: just random ones I'm noticing as they happen	19:27
*** nati_ueno has joined #openstack-infra		19:30
*** gordc has joined #openstack-infra		19:34
*** thomasbiege1 has quit IRC		19:34
*** cthulhup has quit IRC		19:37
*** cthulhup has joined #openstack-infra		19:41
*** xBsd has quit IRC		19:42
jeblair	Alex_Gaynor: don't forget about severed heads;	19:43
*** vipul-away is now known as vipul		19:43
jeblair	Alex_Gaynor: the head of the queue was just severed because it failed a test, but it's still running its tests and won't report until they are done	19:43
jeblair	Alex_Gaynor: (scroll to the bottom of the gate queue to see it)	19:43
Alex_Gaynor	jeblair: so the case I was looking at was the top item in the gate queue	19:44
Alex_Gaynor	s/queue/pipeline	19:44
russellb	btw, i put up this change earlier today to help free up some jenkins resources over the next couple weeks: https://review.openstack.org/#/c/42898/	19:45
*** zul has quit IRC		19:47
SlickNik	hey guys.	19:49
SlickNik	just wanted to report in that review.openstack.org is being much slower than usual.	19:50
fungi	SlickNik: yes, it's being used much more than usual	19:51
clarkb	SlickNik: yup, it is getting bogged down by all of the testing to test all of your code :) I think we just agreed to merge a change that will hopefully alleviate some of this	19:51
clarkb	jeblair: do you want to force merge that change or should I just go ahead and do it?	19:51
jeblair	clarkb: i'll do it	19:51
fungi	SlickNik: with the icehouse feature freeze looming, lots of people are trying to submit/review/merge much more code volume than usual	19:51
openstackgerrit	A change was merged to openstack-infra/devstack-gate: Use git.openstack.org as origin https://review.openstack.org/42693	19:52
clarkb	jeblair: thanks	19:52
SlickNik	Cool, thanks! Understandable with the FF looming.	19:52
SlickNik	And thanks for being on top of it (as usual).	19:52
SlickNik	Chers.	19:53
SlickNik	Cheers*	19:53
clarkb	SlickNik: in the mean time you will probably find that using git review -d and the gerrit ssh interface to be a little more responsive	19:53
clarkb	and do your reviews locally (not sure if you can do inline comments this way, but otherwise it should work)	19:53
openstackgerrit	Anne Gentle proposed a change to openstack-infra/config: Ensure that the release.path.name is set for the Block Storage https://review.openstack.org/42984	19:54
*** afazekas has quit IRC		19:54
ryanpetrello	anybody know if there's a generalized sphinx upload hook for pythonhosted.org ?	19:54
pleia2	clarkb: I'm heading out to lunch in a couple minutes (might run a bit long), will finish up init script upon my return!	19:54
ryanpetrello	that does e.g., http://pythonhosted.org/an_example_pypi_project/buildanduploadsphinx.html	19:54
ryanpetrello	similar to what the rtfd hook does, but uploads directly to pythonhosted.org?	19:55
ryanpetrello	if not, I'd be glad to experiment in writing one, just wanting to make sure it doesn't already exist...	19:55
*** markmc has joined #openstack-infra		19:55
mordred	ryanpetrello: we have not made one	19:55
ryanpetrello	I wonder if doc_upload has the same permissions as how maintainer roles work	19:56
mordred	at some point, I'd love to get a good general design/direction around rtfd/pythonhosted/docs.o.o	19:56
ryanpetrello	i.e., if you're a maintainer, you can upload docs	19:56
mordred	dhellmann, annegentle ^^	19:56
mordred	ryanpetrello: also, look at how we do pypi-upload	19:56
clarkb	ryanpetrello: note we don't use setup.py to upload stuff to pypi because ugh. Instead we have a wrapper around curl to do it so that we don't have to run arbitrary code	19:56
annegentle	mordred: I've met with Todd Morey in the last couple weeks to try to synch with www for design	19:56
jeblair	ryanpetrello: i looked into it briefly	19:56
mordred	ryanpetrello: it's probably more directly related to how we'd need to upload docs to pypi	19:56
annegentle	mordred: Sphinx does work well for dev docs	19:56
jeblair	ryanpetrello: it can be done by uploading a zipfile	19:56
clarkb	AJaeger: still around?	19:56
AJaeger	clarkb, Yes.	19:57
jeblair	ryanpetrello: so basically, it would be like the pypi-upload job	19:57
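As a rough sketch of the zipfile step mentioned above, assuming the Sphinx HTML has already been built into a directory: the function below packs that directory into a zip archive with paths relative to its root. The upload itself is not shown, and the function name and paths are hypothetical.

```python
import os
import zipfile


def zip_docs(html_dir, zip_path):
    """Pack html_dir into zip_path, storing paths relative to html_dir."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(html_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                archive.write(full, os.path.relpath(full, html_dir))
    return zip_path
```

The archive should be written outside `html_dir`, otherwise the walk would pick up the partially written zip itself.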
mordred	annegentle: main question is - which of the three available locations should we automatically upload to?	19:57
mordred	annegentle: or - should we upload to all of them?	19:57
annegentle	mordred: ah	19:57
clarkb	AJaeger: ok, give me a quick minute to settle back into doing stuff and I will do my best to answer your questions about new doc jobs	19:57
annegentle	mordred: one place.	19:57
dhellmann	mordred: we're looking at pythonhosted for wsme because that's one of the places it is already using	19:57
*** dina_belova has joined #openstack-infra		19:57
ryanpetrello	why not as many as you specify via hooks?	19:57
*** SergeyLukjanov has joined #openstack-infra		19:58
dhellmann	my preference is for rtfd.org, because that's what most people are doing for new projects	19:58
ryanpetrello	if elect pythonhosted vs rtfd	19:58
annegentle	ryanpetrello: why clutter the internet? :)	19:58
ryanpetrello	the submission process for those is quite different	19:58
dhellmann	annegentle: +1	19:58
ryanpetrello	no, I agree	19:58
annegentle	dhellmann: my issue with rtfd is we need the GA info to make good decisions about docs	19:58
jeblair	openstack projects should have their docs uploaded to docs.openstack.org	19:58
ryanpetrello	just saying we should give folks the flexibility to choose	19:58
dhellmann	for openstack stuff, I think we should just host it ourselves	19:58
annegentle	jeblair: yes	19:58
jeblair	stackforge projects can do whatever they want	19:58
mordred	sure	19:58
jeblair	and we do give them the flexibility to do that right now.	19:58
dhellmann	annegentle: right, this would just be for third-party or stackforge stuff	19:58
annegentle	jeblair: sure	19:58
ryanpetrello	right, Doug and I are mostly referring to stackforge in this context	19:58
annegentle	dhellmann: ok	19:58
*** ^demon has joined #openstack-infra		19:59
ryanpetrello	just suggesting that stackforge folks may find a "auto-upload to pythonhosted.org on release" useful	19:59
ryanpetrello	they currently have this for rtfd	19:59
ryanpetrello	just considering another option	19:59
dhellmann	yep	19:59
dhellmann	I think we should allow pythonhosted, but encourage rtfd where possible	19:59
annegentle	ryanpetrello: ok. nice that it happens on upload	20:00
ryanpetrello	+1	20:00
mordred	++	20:00
annegentle	ryanpetrello: but there are good reasons to ci docs	20:00
annegentle	I'd probably encourage continuous publishing	20:00
clarkb	AJaeger: we configure all of our jenkins jobs using the Jenkins Job Builder, http://ci.openstack.org/jjb.html	20:00
ryanpetrello	sure, s/on release/whenever is applicable	20:00
dhellmann	annegentle: good point	20:00
ryanpetrello	continuous, if it's right for your project/preference	20:01
*** lcestari has quit IRC		20:01
clarkb	AJaeger: that page is a good starting point for learning how JJB works. With the help of that page you should be able to grab an existing doc job that does something similar to what you want and copy pasta as needed without losing too much understanding of what is going on	20:01
*** ^d has quit IRC		20:01
clarkb	AJaeger: then the second thing you need to do is tell zuul to run that jenkins job when you need it to be run	20:02
*** mikal has joined #openstack-infra		20:02
clarkb	AJaeger: https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/zuul/layout.yaml is where you do that. http://ci.openstack.org/zuul.html has a brief zuul intro and links to more in depth docs	20:03
clarkb	AJaeger: so from a super high level your change will have two parts. 1. add job to jenkins with JJB and 2. tell zuul to run new job in layout.yaml	20:03
AJaeger	clarkb: Thanks, I'll check how the current guides are built and see whether I need to duplicate that setup or can somehow hook into it...	20:04
*** zehicle_at_dell has quit IRC		20:05
openstackgerrit	Clark Boylan proposed a change to openstack-infra/config: Make mysql backup crons quiet. https://review.openstack.org/42785	20:06
clarkb	jeblair: mordred fungi ^	20:06
clarkb	and now time for reviews	20:06
*** mikal has quit IRC		20:07
fungi	clarkb: lgtm. i'm popping out for lunch and then i'll try to review a few changes before my next meeting	20:09
clarkb	AJaeger: feel free to ask questions as they arise. I know I gave the high level info dump and wasn't very specific	20:10
AJaeger	clarkb, that helped a lot - I got the right pointer. I'll propose a change in a few minutes for you to review that I didn't miss anything...	20:12
*** mikal has joined #openstack-infra		20:13
openstackgerrit	Andreas Jaeger proposed a change to openstack-infra/config: Build Basic Install Guide for openSUSE https://review.openstack.org/42988	20:15
*** dmakogon_ has quit IRC		20:16
AJaeger	clarkb, my feeling is just that I'm missing something. That was too easy ;)	20:16
vipul	you guys aware of review.o.o being slow today?	20:17
clarkb	vipul: yes, we are DDoSing it with the jenkins slaves	20:17
vipul	ooh fun!	20:18
clarkb	we recently merged a devstack gate change that will point more tests to git.openstack.org which will hopefully alleviate the pressure on review.o.o but we need the currently running tests to flip over before we see	20:18
openstackgerrit	James E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager https://review.openstack.org/42973	20:19
clarkb	vipul: this is the typical pre feature freeze rush that never fails to break something	20:21
clarkb	vipul: tl;dr you need to write more code during H1 :)	20:21
*** pabelanger has quit IRC		20:21
vipul	clarkb: h1 is for recovering from all the hangovers at the summit :D	20:22
*** pcm_ has quit IRC		20:22
*** HenryG has quit IRC		20:23
*** mikal has quit IRC		20:24
jeblair	i think jenkins02 is experiencing a similar slowness as before; i've got jstack trying to get a thread dump; it is responding, but very slowly, and it has a bunch of offline nodes sitting around.	20:27
jeblair	clarkb, fungi: ^ i uploaded a polished version of the providermanager change; i'm about to start live-testing it	20:27
clarkb	jeblair: ok, it is next up in my queue.	20:28
jeblair	clarkb, fungi: i think i will also do something similar to serialize jenkins access, and try to deploy both of those together.	20:28
openstackgerrit	Clark Boylan proposed a change to openstack-infra/devstack-gate: Replace review.o.o with git.o.o. https://review.openstack.org/42989	20:28
clarkb	jeblair: ^ I noticed that needed doing	20:28
jeblair	clarkb: no it doesn't we don't use those anymore	20:29
clarkb	jeblair: well it needs doing at least for the README	20:29
clarkb	jeblair: the image building is elsewhere, maybe there should be a clean up d-g commit then do the git stuff on top of it	20:29
jeblair	clarkb: ok, sure, we can change the readme. i'm pretty sure the image building, whether run manually or nightly, is not causing current performance problems, so i deferred it	20:30
jeblair	similarly, i have deferred removing those things until there's a replacement	20:31
jeblair	(for manually running)	20:31
jeblair	clarkb: but can we at least avoid adding that to the gate queue until it's not busy?	20:31
clarkb	jeblair: ya	20:31
clarkb	I will WIP it	20:32
*** pabelanger has joined #openstack-infra		20:32
openstackgerrit	Jim Branen proposed a change to openstack/requirements: Allow use of hp3parclient 1.1.0. https://review.openstack.org/42991	20:32
clarkb	russellb: https://jenkins01.openstack.org/job/gate-nova-python26/1366/console seems to be a fairly frequent test failure	20:32
clarkb	jeblair: FYI ^ I think that has semi broken the gate (only nova runs that test so only nova is affected)	20:33
russellb	boris-42: ^^^	20:33
russellb	boris-42: can you help dig into that? since you (and your team) have been working most in that area	20:34
mordred	anteaya: when you get a moment, would you look at the scrollback in the meeting channel	20:35
mordred	anteaya: and the discussion of setting up a repo that we'll use for voting for TC motions?	20:36
markmc	russellb, I think someone from his team submitted a patch	20:36
* markmc digs it up		20:36
anteaya	mordred: I was following some of that	20:36
*** kiall_ is now known as Kiall		20:36
boris-42	russellb I am here	20:36
anteaya	mordred: am I the resource volunteered for duty?	20:36
markmc	russellb, it was victor, https://review.openstack.org/#/c/42649/	20:36
anteaya	:D	20:36
mordred	anteaya: yup	20:37
russellb	markmc: nicedice_	20:37
anteaya	okey dokey smokey	20:37
russellb	err, nice.	20:37
boris-42	russellb yeah this is already solved	20:37
mordred	anteaya: you know, if you want :)	20:37
anteaya	yeah yeah yeah	20:37
russellb	clarkb: looks like we have a patch up for that ... need to get it reviewed/merged though	20:37
anteaya	so the way I understand it, I go back through the TC meeting logs and pull out past decisions	20:37
clarkb	russellb: markmc: great. Note that any nova changes approved before that one probably won't merge	20:37
*** mikal has joined #openstack-infra		20:38
anteaya	and offer them up as patches to the repo	20:38
anteaya	that I am about to create	20:38
anteaya	to gather the history	20:38
markmc	clarkb, it only happens like 1 in every 5 times from what I've seen	20:38
anteaya	is that one of the tasks, apart from creating the repo itself	20:38
anteaya	ttx: what do we want to call this TC decision repo?	20:39
anteaya	at the very least, I will learn a lot about the history of the TC	20:39
russellb	clarkb: that change is approved now	20:42
boris-42	russellb nice	20:42
boris-42	russellb thanks	20:42
boris-42	=)	20:42
*** SergeyLukjanov has quit IRC		20:42
russellb	boris-42: yep, np	20:42
jeblair	clarkb, mordred: i think jstack is stuck in its deadlock detection.	20:43
mordred	jeblair: wow	20:44
*** dina_belova has quit IRC		20:44
*** cthulhup has quit IRC		20:45
*** cthulhup has joined #openstack-infra		20:45
clarkb	load on git.o.o is ~18 and under 1 on review.o.o	20:46
clarkb	jeblair: that is an impressive feat	20:46
openstackgerrit	Russell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job https://review.openstack.org/42898	20:46
jeblair	i'm attaching the debugger and will try that way	20:46
mordred	clarkb: woot!	20:46
clarkb	jeblair: I am working my way through the nodepool client manager change right now	20:46
*** cppcabrera is now known as cppcabrera_afk		20:47
notmyname	mordred: tags for getting pbr with swift...got a few minutes?	20:48
lifeless	Alex_Gaynor: do you know, is there a way to get a unicode string directly from a memoryview, rather than copy to bytestring, then decode to unicode string?	20:48
jeblair	clarkb, mordred: i think it's slow because there are so many nodes still attached to it (which is true because it is slow)	20:48
jeblair	mordred: got a few mins?	20:49
Alex_Gaynor	lifeless: apparently! codecs.utf_8_codecs(memoryview) seems to work (for example)	20:49
*** cthulhup has quit IRC		20:50
lifeless	Alex_Gaynor: ahha, thanks!	20:50
jeblair	mordred: i think jenkins02 needs to be stopped, and have all the nodes removed from its config.xml; all related nodes deleted from nova, and then started again.	20:50
clarkb	jeblair: that is no good. What do you think about an artificial throttle in zuul or nodepool, so that we can at least prevent it from overrunning itself	20:50
lifeless	Alex_Gaynor: though 2.7's codecs module has no utf_8_codecs attribute	20:51
jeblair	clarkb: i mentioned that i wanted to serialize access to jenkins, do you want something else?	20:51
Alex_Gaynor	lifeless: codecs.utf_8_decode(m)	20:52
dprince	jeblair: question on Gerrit comment syntax. I noticed recently that 'SUCCESS' is green.... and 'FAILED' is red. Is that HTML formatting that does that? or some sort of magic gerrit syntax you'd need to use?	20:52
lifeless	Alex_Gaynor: ahha! cool.	20:52
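The exchange above settles on codecs.utf_8_decode for decoding a memoryview without first copying it into a bytestring. A minimal sketch of that call, assuming Python 3 (the log is discussing 2.7); the sample data and variable names are illustrative only:

```python
import codecs

# Illustrative buffer; a real caller would already hold a memoryview
# over a socket or file buffer.
data = bytearray("h\u00e9llo w\u00f6rld".encode("utf-8"))
view = memoryview(data)

# codecs.utf_8_decode accepts any contiguous bytes-like object and
# returns a (text, bytes_consumed) tuple. The slice covers the first
# six bytes: "h", the two-byte "\u00e9", and "llo".
text, consumed = codecs.utf_8_decode(view[:6], "strict", True)
print(text, consumed)
```

On Python 3, str(view, "utf-8") is an equivalent spelling that also avoids the intermediate bytes copy.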
clarkb	jeblair: I think serializing access to jenkins is part of the answer, doing more to add a configurable queue length so that anything going over some limit blocks	20:52
jeblair	clarkb: if we wanted the whole system to be slow, we could have done nothing. it was self limiting earlier.	20:53
jeblair	clarkb: and still is	20:53
jeblair	clarkb: the point is to actually be able to run all of the tests we need to run	20:53
jeblair	clarkb: that's why we're scaling jenkins horizontally and adding more masters	20:53
clarkb	jeblair: I am not suggesting to make it slow, you can still make the limit arbitrarily high	20:53
jeblair	clarkb: what are you suggesting then?	20:54
clarkb	jeblair: but in cases like this we would be much better off putting a limit on how fast it can be	20:54
jeblair	clarkb: how fast what?	20:54
clarkb	jeblair: jobs per hour	20:54
jeblair	clarkb: are you talking about zuul?	20:54
clarkb	jeblair: or nodepool concurrent nodes	20:54
clarkb	jeblair: I am thinking of zuul and/or nodepool. They can both be throttled to take some of the pressure off of jenkins and gerrit	20:55
jeblair	clarkb: okay, so we just merged a change that will cause tests to not touch gerrit	20:55
*** mikal has quit IRC		20:56
jeblair	clarkb: zuul accesses gerrit serially when creating its changes	20:56
jeblair	ideally, we have just done quite a lot to take the pressure off of gerrit	20:56
jeblair	clarkb: so what pressure on gerrit do you want to relieve?	20:56
ttx	anteaya: openstack/governance ideally, though it's a bit overreaching	20:56
clarkb	jeblair: our major problem today and yesterday appears to be a thundering herd. If we can let them thunder at a tunable pace we should be able to rein in when jenkins runs faster than its shoes can move	20:56
jeblair	clarkb: i think you are over-generalizing	20:56
ttx	but openstack/tech-governance is a mouthful	20:56
anteaya	ttx: I'm fine with openstack/governance	20:57
clarkb	jeblair: I am trying to be generic, because next milestone it will be some other DDoS	20:57
ttx	and it's not as if we never renamed any project in the past	20:57
anteaya	do we want it in the openstack/ namespace or the openstack-infra/ namespace do you think, ttx?	20:57
ttx	well if one thing is openstack/, that would be it	20:57
anteaya	very good	20:57
clarkb	jeblair: and a generic pace enforcement will help us at least keep moving rather than needing emergency fixes to keep going	20:57
* anteaya goes back to looking up docs for creating a new git repo		20:58
jeblair	clarkb: overgeneralizing a problem does not help provide a solution. how do you write a patch to "don't cause problems"?	20:58
jeblair	clarkb: your second point	20:58
jeblair	clarkb: pressure on jenkins	20:58
ttx	mordred: your cookiecutter thing looks good -- looks like an automated mordred-goes-to-fix-your-project merge	20:59
jeblair	clarkb: we have seen that jenkins can run a lot of jobs, and have a lot of slaves	20:59
anteaya	ttx: I don't have any expectation of any gate or check tests for openstack/governance	20:59
jeblair	clarkb: but right now, we've seen issues with slaves not being removed from jenkins	20:59
ttx	anteaya: we could enforce some common template	20:59
jeblair	clarkb: i don't know why that is. there may be a bug in the gearman-plugin. the 'thundering herd' of deleted nodes may just be too much contention for that kind of operation.	21:00
ttx	anteaya: but not yet maybe	21:00
anteaya	ttx: got one in mind?	21:00
anteaya	ttx: very good	21:00
jeblair	clarkb: and as you observed earlier, jenkins does not do well if you do lots of things at once	21:00
clarkb	jeblair: ya	21:00
jeblair	clarkb: so serializing access to adding and removing nodes from jenkins may help with that	21:00
jeblair	clarkb: at least, we might get a better idea of what is going on	21:00
clarkb	jeblair: I am all for fixing the specific bottlenecks because I want to be able to do as many operations as possible. But I also think having some way of pulling back so that everything doesn't shut down is useful	21:01
jeblair	clarkb: anyway, you've had some good suggestions, and i'm trying to implement solutions for the problems we've seen based on them	21:01
jeblair	clarkb: that sounds great. i have no idea what you're talking about though.	21:01
anteaya	ttx: who do you want as core for openstack/governance?	21:02
clarkb	jeblair: I am not sure where we would want the control to go (probably in zuul) but being able to tell it to launch at most 300 jobs per hour or some number of jobs per minute/second etc will be useful so that in cases like now we can continue to run jenkins jobs without making the problem worse.	21:03
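The kind of launch limit clarkb describes ("at most 300 jobs per hour") is commonly built as a token bucket. The following is only a sketch of that general technique, not anything from zuul or nodepool; the class name JobThrottle, its parameters, and the injectable clock are all hypothetical:

```python
import time


class JobThrottle:
    """Token bucket: permits roughly rate_per_hour launches per hour."""

    def __init__(self, rate_per_hour, burst, clock=time.monotonic):
        self.fill_rate = rate_per_hour / 3600.0  # tokens added per second
        self.capacity = float(burst)             # largest allowed burst
        self.tokens = float(burst)               # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_launch(self):
        """Return True and spend a token if a launch is allowed right now."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.fill_rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


throttle = JobThrottle(rate_per_hour=300, burst=10)
if throttle.try_launch():
    print("launch permitted")
```

As jeblair argues in the lines that follow, a limit like this bounds the launch rate only; it does not by itself serialize node deletions.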
jeblair	clarkb: why would we want to do that? what problem does that solve?	21:03
ttx	anteaya: that's where it gets tricky. You want +2/-2 for TC members. And APRV for the chair (me)	21:03
* ttx is in a meeting		21:03
clarkb	jeblair: I also see that as being useful so that it can be tied to a PID loop (or similar) where it automatically increases the limit and decreases it based on job throughput or some other metric	21:03
anteaya	ttx: okay, sorry more questions later	21:04
clarkb	jeblair: right now it would potentially give jenkins a chance to catch back up on its own	21:04
jeblair	clarkb: catch up with what?	21:04
clarkb	jeblair: deleting nodes	21:04
jeblair	clarkb: oh, i don't think that has anything to do with it	21:04
clarkb	jeblair: or $otheroperation that has slowed to a crawl	21:04
jeblair	clarkb: it can't delete nodes because it's deleting nodes	21:04
jeblair	clarkb: not because it's running jobs	21:04
jeblair	clarkb: there _are_ things we can control to tune this whole system, but we need to tune the right things.	21:04
*** gyee has joined #openstack-infra		21:05
*** pblaho has joined #openstack-infra		21:05
jeblair	clarkb: if you want to rate-limit starting or stopping jobs, that can be done with zuul and gearman, in how they dispatch jobs	21:05
jeblair	clarkb: but setting an arbitrary jobs-per-hour limit doesn't address an actual problem.	21:05
clarkb	jeblair: right, I see it as a tool to help implement proper bottleneck fixes	21:06
jeblair	clarkb: i really don't think it will help	21:06
jeblair	clarkb: you're creating and tuning a parameter that has nothing to do with the systems that are actually running	21:07
clarkb	but it is a parameter that influences everything	21:07
jeblair	clarkb: for instance, it would do nothing to prevent mass simultaneous deletions of nodes, which is an ACTUAL problem	21:07
*** nati_ueno has quit IRC		21:07
jeblair	(or at least seems to be)	21:07
*** melwitt has quit IRC		21:08
*** melwitt1 has joined #openstack-infra		21:08
clarkb	just noticed that the zuul status timers don't do hours properly...	21:08
*** nati_ueno has joined #openstack-infra		21:08
clarkb	jeblair: but it would reduce the number of nodes that would be deleted together	21:08
jeblair	clarkb: no, the fix that i'm trying to write right now will do that	21:09
jeblair	clarkb: it will delete only one node from a jenkins at a time	21:09
jeblair	clarkb: why would you want to try to fix that another way?	21:09
clarkb	I am not suggesting this as a fix	21:09
jeblair	clarkb: what are you suggesting?	21:09
clarkb	you would still want to fix that particular problem with the change you are writing	21:09
clarkb	jeblair: I am suggesting that we have some way of slowing everything down to usable levels while you write that fix	21:10
*** rfolco has quit IRC		21:10
clarkb	we are very spiky and the ability to smooth out really big spikes will help in fixing the fallout	21:10
jeblair	clarkb: the fix i want to write will do that? why don't i just go write that instead of something else that won't fix it?	21:11
clarkb	because next week or during icehouse freeze we will run into similar yet different problems	21:11
*** cppcabrera_afk is now known as cppcabrera		21:14
*** fbo is now known as fbo_away		21:16
jeblair	mordred, fungi: ping	21:17
mordred	jeblair: pong	21:17
jeblair	mordred: can you clean up jenkins02?	21:17
mordred	jeblair: yes. is there a description of the problem in the scrollback?	21:18
*** vipul is now known as vipul-away		21:18
jeblair	mordred: yes	21:18
mordred	jeblair: great. I will find it	21:18
jeblair	mordred: thanks	21:18
*** vipul-away is now known as vipul		21:18
mordred	ttx: next year, can we move the nova FF one week prior? having me be only partially here due to burningman prep is not fantastic	21:18
mordred	jeblair: oh wow. ok. force stop ok yeah?	21:19
jeblair	mordred: yep	21:19
mordred	stopping	21:20
mordred	btw - salt-master has cpu pegged on puppetmaster - I'm going to restart it	21:20
jeblair	mordred: i thought we stopped all the minions? maybe stop the master too.	21:21
mordred	great	21:21
clarkb	we should make a second pass at cleaning up the salt stuff after featurefreeze	21:22
clarkb	I believe the minions are still going crazy after the ssh thing	21:22
clarkb	s/ssh/crypto/	21:22
jeblair	oh, we didn't stop them?	21:22
*** thomasbiege1 has joined #openstack-infra		21:22
reed	fungi, jeblair, pleia2: let me know if you think it may work https://review.openstack.org/#/c/42998/	21:23
clarkb	jeblair: we stopped them by hand, then restarted them then ran the rekey thing in hopes it would make them sane again	21:24
ttx	mordred: next year, you shall scream when I show the schedule on the screen	21:24
clarkb	jeblair: but it didn't we should probably just disable the minion service on the slaves	21:24
mordred	ttx: yes, I will	21:24
clarkb	ttx: I think he did	21:25
mordred	clarkb: oh, you're right	21:25
mordred	I did	21:25
mordred	I believe I mentioned something like "there's going to be a rush and I'm not going to be much help" if the FF is that week	21:25
*** thomasbiege1 has quit IRC		21:26
ttx	next year if we separate summit/conf it would happen earlier	21:26
mordred	perfect	21:26
lifeless	mordred: when do you leave for burning man	21:27
lifeless	?	21:27
openstackgerrit	James E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager https://review.openstack.org/42973	21:27
jeblair	clarkb: ^ live-tested	21:27
*** prad_ has quit IRC		21:27
openstackgerrit	Anita Kuno proposed a change to openstack-infra/config: Creating/adding the openstack/governance repository https://review.openstack.org/43002	21:27
jeblair	clarkb: i'm basically just going to do the same thing for jenkins now.	21:27
clarkb	jeblair: ok	21:28
clarkb	jeblair: I have only found one minor issue so far	21:28
clarkb	jeblair: but it won't cause any bugs	21:28
anteaya	mordred ^	21:29
mordred	jeblair: I've stopped jenkins02, and am currently working on deleting devstack slaves that were attached to it	21:29
anteaya	so in addition to this patch (I basically just followed the instructions for stackforge repos) what else do I have to do to create the repo?	21:29
mordred	lifeless: first thing in the morning	21:30
anteaya	do I just create it on my laptop and push it as an empty repo?	21:30
clarkb	lifeless: too soon	21:30
anteaya	giving it a .gitreview file	21:30
lifeless	mordred: ack	21:30
*** alexpilotti has quit IRC		21:33
mordred	jeblair: ERROR: n/a (HTTP 400)	21:34
mordred	jeblair: is that ^^ a symptom of az1 rate limiting?	21:34
Alex_Gaynor	so trying to access the jenkins pages for some of the running jobs on the zuul status page is resulting in 502s	21:35
jeblair	mordred: not that i'm aware; i don't see current rate limiting errors from nodepool	21:37
mordred	AWESOME	21:37
anteaya	are there tc meeting logs prior to October 2012? this link has October 2012 through to now but not prior: http://eavesdrop.openstack.org/meetings/tc/	21:37
jeblair	Alex_Gaynor: mordred is working on that	21:37
Alex_Gaynor	jeblair: okey doke (as always if I can help in some way, let me know)	21:37
mordred	jeblair: I'm getting that error a lot from running nova list and nova delete	21:37
mordred	btw - ERROR: n/a (HTTP 400) is a TERRIBLE error message	21:38
*** dprince has quit IRC		21:38
jeblair	mordred: OverLimit: This request was rate-limited. (HTTP 413)	21:40
mordred	ok	21:40
jeblair	mordred: ^ that's what that looks like (and just happened)	21:40
mordred	fantastic	21:41
*** boris-42 has quit IRC		21:41
*** cppcabrera has left #openstack-infra		21:42
mordred	jeblair: I'm not having much luck in deleting the nodes... how important is that part of the step?	21:44
jeblair	mordred: i think you can skip it, nodepool should be able to clean up	21:47
jeblair	mordred: it will be slow about it, which probably isn't a bad thing	21:47
mordred	jeblair: ok. then I'm going to delete the node section from config.xml and restart	21:47
jeblair	mordred: just the devstack nodes	21:48
*** mrmartin has quit IRC		21:49
*** prad_ has joined #openstack-infra		21:50
*** AJaeger has quit IRC		21:51
*** thomasbiege1 has joined #openstack-infra		21:51
mordred	jeblair: jenkins02 is starting	21:55
mordred	jeblair: and yes - just the devstack nodes were deleted	21:55
*** dina_belova has joined #openstack-infra		21:55
*** weshay has quit IRC		21:55
* fungi is caught up on scrollback from lunch and reviewing gate-performance-improving changes as a first priority		21:57
clarkb	jeblair: woo finally got through that change	21:58
clarkb	jeblair: the only major concern I have is with the default timeout used by the manager code	21:58
pleia2	oh, my lunch was productive, got to talk to a redhat admin who thinks that for our use case running git daemon as a service makes more sense than xinetd anyway since we're using it so much, feel less bad about writing the init script now ;)	21:59
Alex_Gaynor	So is this how it works every feature freeze? We fix the latest rounds of bottlenecks ?	21:59
clarkb	Alex_Gaynor: yes	21:59
*** dina_belova has quit IRC		22:00
*** thomasbiege1 has quit IRC		22:00
clarkb	pleia2: oh good	22:00
mordred	Alex_Gaynor: each time, the feature freeze has been significantly larger than the previous too	22:00
Alex_Gaynor	mordred: sure, that was the underlying premise of my statement, I didn't mean to imply we weren't making progress :)	22:00
clarkb	Alex_Gaynor: the number of changes that go in the week before feature freeze is not only much greater than the previous feature freeze but much greater than the weeks before it	22:00
*** gyee has quit IRC		22:01
*** markmc has quit IRC		22:02
notmyname	mordred: I can do the needful this afternoon for the tagging process to get pbr working	22:02
*** rnirmal has quit IRC		22:02
*** mriedem1 has quit IRC		22:03
*** markmcclain has quit IRC		22:03
mordred	notmyname: ok. from my side, I believe we can do that	22:05
notmyname	mordred: here's, IMO, a simple thing I think will make it all work	22:07
*** burt has quit IRC		22:08
mordred	ooh. I like simple things	22:08
notmyname	mordred: we tag today with 1.9.2 and consume that version number (ie we won't ever "release" a 1.9.2). This will let pbr do the right thing and create version numbers that sort properly	22:08
notmyname	mordred: if we have another minor release, it will be 1.9.3	22:08
notmyname	mordred: but most likely will be 1.10.0 anyway	22:09
mordred	well... we could do that ...	22:09
mordred	but it will cause a 1.9.2 to be released to tarballs.o.o	22:09
mordred	but I'm ok with that if you are	22:09
openstackgerrit	Clark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads. https://review.openstack.org/42959	22:09
notmyname	mordred: I don't see that as a problem, but do you have an alternate suggestion?	22:09
clarkb	jeblair: ^ now with documentation	22:09
jeblair	clarkb: just looked at your comment	22:09
mordred	notmyname: tagging 1.9.2-dev - which will not cause a release to be cut	22:10
mordred	and will map closely to your current version in tree	22:10
jeblair	on cleanupServer in providerManager...	22:10
notmyname	mordred: to quote from clay on the pbr patch "Rather than waiting for imminent merge, we really should get a 1.9.2 tag on the origin repo now so the git based versioning works in sane fashion for review. I don't really care about 1.9.2-dev which doesn't parse by distutils.version.StrictVersion anyway."	22:10
clarkb	jeblair: about the timeout value	22:10
jeblair	clarkb: yeah	22:10
mordred	notmyname: ok. I'm sold by that	22:11
jeblair	clarkb: so the timeout loop is a big loop that runs inside of the thread that is trying to delete the server	22:11
notmyname	mordred: ya, mostly the last line	22:11
notmyname	mordred: and if you haven't you should read his full comment on https://review.openstack.org/#/c/28892/	22:11
notmyname	mordred: but I think we can go forward with a 1.9.2 tag and then merge the patch	22:12
jeblair	clarkb: inside of that loop, it puts a task on the queue to get the server, and waits for that to complete	22:12
jeblair	clarkb: so i don't think anything about the timeout value changes	22:12
jeblair	clarkb: overall, we still wait, er, an hour for the server to be deleted (in a thread that is pretty much dedicated to trying to delete the server)	22:12
jeblair	clarkb: but that shouldn't affect anything else, other than every 2 seconds, that thread asks the provider thread to check on the server	22:13
mordred	notmyname: reading now	22:13
notmyname	mordred: so I think that leaves it here: I'll approve/merge the pbr patch when I see the 1.9.2 tag on master upstream	22:13
jeblair	clarkb: (if a lot of servers are being slow to be deleted, everything else about that provider will be slow too, but i think that's desirable. mostly.)	22:13
*** ^demon has quit IRC		22:13
*** gyee has joined #openstack-infra		22:14
*** ^d has joined #openstack-infra		22:14
*** pblaho has quit IRC		22:14
clarkb	jeblair: will it not prevent other tasks from running? for some reason I thought it would, but that function is called from outside the manager thread and does the poll loop there	22:14
*** ^d has quit IRC		22:14
*** ^d has joined #openstack-infra		22:14
clarkb	jeblair: so I think I was concerned about nothing	22:14
clarkb	The running of the delete task runs in the manager thread which is quick	22:15
clarkb	jeblair: I will update my vote	22:15
jeblair	clarkb: exactly, all of those methods just put a task on the manager's queue, running those tasks happens in the dedicated thread, and all the tasks should be simple 1:1 nova api calls	22:15
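The pattern jeblair describes (callers put a task on the manager's queue, and one dedicated thread runs the tasks one at a time) can be sketched as below. This is not the nodepool ProviderManager code; Task, SerialManager, and their methods are hypothetical names, and the submitted lambda stands in for a real API call:

```python
import queue
import threading


class Task:
    """One unit of work; the submitting thread blocks in wait()."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.result = None
        self.error = None
        self._done = threading.Event()

    def run(self):
        try:
            self.result = self.func(*self.args)
        except Exception as exc:  # hand the failure back to the caller
            self.error = exc
        finally:
            self._done.set()

    def wait(self):
        self._done.wait()
        if self.error is not None:
            raise self.error
        return self.result


class SerialManager(threading.Thread):
    """Runs submitted tasks one at a time in a single dedicated thread."""

    def __init__(self):
        super().__init__(daemon=True)
        self._queue = queue.Queue()

    def run(self):
        while True:
            task = self._queue.get()
            if task is None:  # sentinel placed by stop()
                break
            task.run()

    def submit(self, func, *args):
        """Queue a call and block the *calling* thread until it finishes."""
        task = Task(func, *args)
        self._queue.put(task)
        return task.wait()

    def stop(self):
        self._queue.put(None)


manager = SerialManager()
manager.start()
print(manager.submit(lambda a, b: a + b, 2, 3))
manager.stop()
```

Because a long poll loop lives in the caller and only issues short tasks, waiting on one slow deletion does not hold the manager thread, which matches the conclusion clarkb reaches above.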
clarkb	jeblair: done	22:16
clarkb	jeblair: pleia2 http://logs.openstack.org/93/42593/4/gate/gate-grenade-devstack-vm/6de9e45/logs/devstack-gate-setup-workspace-new.txt	22:18
anteaya	ttx: when you are around but not in a meeting, here is my first attempt: https://review.openstack.org/#/c/43002/	22:18
*** ^d has quit IRC		22:19
clarkb	jeblair: pleia2: I think that may be replication related	22:19
clarkb	though I am not sure because I would've expected git to make that more atomic	22:20
mordred	notmyname: ok. yes. I think it's a well written comment, and I appreciate the willingness to go along.	22:20
*** dkliban has quit IRC		22:20
mordred	notmyname: do you want me to cut a tag? or do you want to do it?	22:20
notmyname	mordred: I can't make tags for swift (unless that's changed)	22:20
notmyname	mordred: if I have the perms, I'd be happy to do it	22:21
jeblair	clarkb: i agree, it wfm locally	22:21
clarkb	jeblair: pleia2 http://paste.openstack.org/show/44689/ is what I see in the apache log	22:21
mordred	ttx: you around?	22:21
jeblair	clarkb: what a strange error	22:22
clarkb	jeblair: ya, file exists though and has timestamps from days in the past	22:22
pleia2	that is odd, it's just ssh that replicates so it shouldn't be doing something like deleting it first (huh, would it?)	22:22
notmyname	mordred: after midnight in paris right now..	22:22
mordred	notmyname: ok. I'll just do it	22:23
jeblair	notmyname, mordred: he's not in that timezone	22:23
notmyname	ah, ok then :-)	22:23
clarkb	pleia2: I don't expect it to and the mod time on that dir is from the 13th	22:23
notmyname	mordred: ok. who has permission to push tags? with the change to pbr is that changing?	22:23
notmyname	jeblair: clarkb: ^ ?	22:23
mordred	notmyname: no - it should be still ttx since it's a server project	22:23
notmyname	ok	22:24
mordred	notmyname: the main change is that it won't need to commit to change the version anymore	22:24
mordred	notmyname: so the chances of your milestone-proposed branch being any different than master are _REALLY_ low :)	22:24
openstackgerrit	Jim Branen proposed a change to openstack/requirements: Allow use of hp3parclient 2.0 https://review.openstack.org/42991	22:25
mordred	notmyname: 5c6f0015d56478108a623cf65641a39ea91fc2b5 work for you?	22:25
notmyname	mordred: confirm. 5c6f0015d56478108a623cf65641a39ea91fc2b5	22:25
*** changbl has quit IRC		22:26
mordred	notmyname: done	22:26
notmyname	mordred: thanks	22:27
notmyname	mordred: final tests on pbr branch	22:27
notmyname	rd	22:27
clarkb	I wonder	22:29
*** lbragstad has quit IRC		22:29
clarkb	jeblair: pleia2 so apache is allowed to read the pack and idx files directly without talking to the git http thing	22:33
clarkb	jeblair: pleia2 and that is what appears to have failed	22:33
*** jungleboyj has joined #openstack-infra		22:33
jungleboyj	Can anyone answer questions about how the Transifex Translations are being automatically done?	22:34
clarkb	pleia2: any chance selinux is involved?	22:34
clarkb	jungleboyj: yes I can, whats up?	22:34
jungleboyj	clarkb: Awesome. Thank you!	22:35
*** jhesketh has joined #openstack-infra		22:35
pleia2	clarkb: good question, it shouldn't since everything in /var/lib/git should have the right selinux magic to serve it up to httpd	22:35
pleia2	clarkb: but this is getting quite far out of my git expertise to understand what is happening git-wise (pack and idx files?)	22:36
clarkb	pleia2: in .git/objects/pack	22:36
jungleboyj	clarkb: I am working on Cinder and noticed that we had some english strings that were coming out wrong. When I look at the .po files for en_US I see that it has a msgstr defined that is either incomplete or altogether wrong. Trying to figure out the right way to fix that. I had gone through and removed all the msgstr s (msgstr="") since it doesn't make sense to translate English to English but now I see the latest	22:37
mordred	jungleboyj: can you define "coming out wrong" ?	22:37
clarkb	pleia2: the pack files contain a bunch of object files all compressed together, I believe the idx files tell git where to look in that compressed blob for specific objects	22:37
clarkb	pleia2: that particular file has been in place since the 13th though	22:38
pleia2	clarkb: I see, so that doesn't sound to me like anything strange that selinux would have a problem with inside /var/lib/git/	22:38
clarkb	jungleboyj: can you link to a particular example in a proposed change?	22:38
clarkb	jungleboyj: and I think the way i18n works it does make sense to translate English to English depending on the locale :)	22:39
jungleboyj	mordred: I had the string _("Failure creating image %s. Error %s", vol_id, error) or something like that. In the .po the msgstr for that was just "Failure creating image" and that was all that was printed to the logs.	22:39
lifeless	bad translator, no cookie	22:39
*** apcruz has quit IRC		22:40
*** sandywalsh has quit IRC		22:40
* clarkb updates cinder repo		22:40
*** shardy is now known as shardy_afk		22:40
clarkb	pleia2: the normal permissions all look fine. I don't know why else apache would fail to see a dir	22:41
*** nijaba has quit IRC		22:42
mgagne	With JJB, has anyone had the great idea to use parameterized jobs in job-group?	22:42
jungleboyj	clarkb: Here is the specific example: https://review.openstack.org/#/c/40948/2/cinder/locale/en_US/LC_MESSAGES/cinder.po Line 583	22:42
pleia2	clarkb: /var/log/audit.log is where selinux logs violations, so you can look there	22:43
clarkb	pleia2: thanks	22:43
jungleboyj	msgid "Failed to copy image to volume: %(reason)s"	22:43
jungleboyj	msgstr "Failed to copy image to volume"	22:43
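The msgid/msgstr pair quoted above shows a translation that silently drops the %(reason)s parameter. A small check for that class of problem might look like the sketch below; the function name and regular expression are illustrative, it only handles named %(name)s-style placeholders, and it works on already-extracted strings rather than parsing .po files:

```python
import re

# Named printf-style placeholders such as %(reason)s or %(count)d.
PLACEHOLDER = re.compile(r"%\((\w+)\)[sdrf]")


def missing_placeholders(msgid, msgstr):
    """Return the named placeholders that msgid has but msgstr drops."""
    if not msgstr:
        # An empty msgstr means "untranslated": gettext falls back to msgid.
        return set()
    return set(PLACEHOLDER.findall(msgid)) - set(PLACEHOLDER.findall(msgstr))


print(missing_placeholders(
    "Failed to copy image to volume: %(reason)s",
    "Failed to copy image to volume",
))
```

For the Cinder example this reports the dropped reason parameter, which is why only the truncated message reached the logs.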
clarkb	jungleboyj: we treat transifex as the source of truth for those msgstrs	22:45
clarkb	jungleboyj: the old string there may have been a casualty of babel doing a fuzzy translation and not understanding the %(reason)s; I am not actually sure there	22:46
jungleboyj	clarkb: Ok, well, in the case of Cinder the msgstrs are incomplete or wrong. Need to figure out how to fix it. Saw the same thing in other projects too.	22:46
clarkb	jungleboyj: but for patchset 1 the removal of the msgstr would've come from transifex or the update_catalog that we run prior to updating from transifex	22:46
clarkb	jungleboyj: yeah, things were wrong at one point because babel allows fuzzy translations by default, we have since disabled that. Let me get you a link to the script that proposes these changes	22:47
fungi	jungleboyj: i have seen translations from the "c" source language to en get extremely stale because nobody is checking them for some projects, so eventually the source strings grow different numbers of format string parameters than the obsolete en versions which should normally be identical	22:47
clarkb	jungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/propose_translation_update.sh	22:48
clarkb	jungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/propose_translation_update.sh#L46-L55 is the most relevant section. I wonder if this is fallout from when we didn't prevent fuzzy matches	22:49
fungi	jungleboyj: i did a fairly massive pass through nova some months back to clean up english translations (which basically resulted in me duplicating the source strings)	22:49
fungi	i'm not familiar with what the impact from fuzzy matches might be though	22:50
clarkb	jungleboyj: from git blame http://paste.openstack.org/show/44691/ that was long enough ago to be when fuzzy matching was allowed so I think that is the issue	22:50
*** mikal has joined #openstack-infra		22:51
clarkb	fungi: jungleboyj: we may want to reseed them all with non fuzzy strings based on what is in transifex to get past the cruft that babel let through initially	22:51
*** mikal has quit IRC		22:52
*** prad_ has quit IRC		22:52
fungi	i take it there's no way to identify a fuzzy vs. non-fuzzy translation of a string solely from the pofile	22:53
*** sandywalsh has joined #openstack-infra		22:53
notmyname	mordred: patch merged (merging) and email sent to ML	22:53
mordred	notmyname: woot!	22:53
notmyname	mordred: thanks for your help on it	22:53
mordred	notmyname: thanks for yours! I believe pbr is much better today than it was originally due to addressing your concerns	22:54
*** nijaba has joined #openstack-infra		22:54
clarkb	fungi: there is the # fuzzy comment, but I think babel may not remove those when it has a non fuzzy translation	22:54
clarkb	fungi: which makes it a little painful to work with	22:54
jungleboyj	clarkb: So, let me make sure that I understand. There are some old en translations that didn't happen properly because fuzzy matching was allowed.	22:54
*** ftcjeff has quit IRC		22:55
*** markmcclain has joined #openstack-infra		22:55
notmyname	mordred: in my email I said, "If you have any issues, just ask Monty. Preferably after 10pm on Tuesdays" ;-)	22:55
*** michchap has joined #openstack-infra		22:55
mordred	clarkb: speaking of i18n, we should get swift on the transifex bandwagon - they already use babel and everything	22:55
fungi	clarkb: right. unless we actually expect un-fuzzed translations to result in the #fuzzy comment also getting removed, no way to tell just from the translated string itself	22:55
mordred	clarkb: and their translations are in top level like I sort of want everyone else's to be :)	22:55
mordred	notmyname: I look forward to those questions :)	22:56
clarkb	jungleboyj: correct	22:56
jungleboyj	clarkb: If that is the case, how can I get fixes for those strings that got fuzzed up.	22:56
clarkb	jungleboyj: you can translate them in transifex, or I think it is still possible to propose a patch that fixes them, but that may not be the case. I will have to double check that	22:57
openstackgerrit	Elizabeth Krumbach Joseph proposed a change to openstack-infra/config: Swap git daemon in xinetd for service https://review.openstack.org/43012	22:57
*** mkirk_ has quit IRC		22:58
jungleboyj	clarkb: Forgive all the noob questions. How do I translate them in transifex?	22:58
clarkb	jungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/upstream_translation_update.sh#L42-L53 we still push local git contents back to transifex so you can propose a fix in git if you like	22:58
*** mkirk_ has joined #openstack-infra		22:58
clarkb	jungleboyj: I have actually never done it :) but I believe you log into https://transifex.com find the cinder project and then you can either update strings in your browser or use the tx tool	22:59
*** gordc has left #openstack-infra		22:59
jungleboyj	clarkb: Ok.	22:59
jungleboyj	clarkb: FYI, the pot file doesn't have any msgstrs defined in it. Will changing the pos make a difference?	23:00
clarkb	the pot file is a template, it should not have any msgstrs in it	23:00
clarkb	the .po files contain the actual translations	23:00
*** rcleere has quit IRC		23:01
openstackgerrit	Elizabeth Krumbach Joseph proposed a change to openstack-infra/config: Swap git daemon in xinetd for service https://review.openstack.org/43012	23:01
jungleboyj	clarkb: That is what I thought. So, I would need to actually put the changes in the POs.	23:01
*** sgviking has quit IRC		23:02
*** dkliban has joined #openstack-infra		23:02
clarkb	jeblair: pleia2 mordred https://jenkins01.openstack.org/job/gate-neutron-pep8/434/console ugh. I think centos and ubuntu must be sufficiently different that this doesn't work quite right. Or something replication related	23:02
clarkb	jungleboyj: yup	23:02
jungleboyj	clarkb: Once I do that, is there something I need to do to get a new transifex import to happen?	23:03
*** jpich has quit IRC		23:03
clarkb	jungleboyj: using transifex's tx tool you can pull the pos and push them back to transifex if you want to use their workflow	23:03
clarkb	jungleboyj: we import from transifex once a day per project	23:03
clarkb	so you don't need anything special it should just happen	23:03
jungleboyj	clarkb: Ok, and you don't recommend clearing out all the english msgstrs ? Just fix the ones that are wrong?	23:04
clarkb	jungleboyj: right. as en_US is different than C	23:04
jeblair	clarkb: yeah, three differences: replication over ssh, operating system, git version	23:04
clarkb	and different than en_UK and so on	23:04
jungleboyj	clarkb: Ok. Thank you so much for the help!	23:04
pleia2	clarkb: I think it's a rewrite problem! pulling that file from /cgit works, but not the direct git.openstack.org/openstack/neutron/... location	23:05
clarkb	pleia2: interesting	23:05
openstackgerrit	Mathieu Gagné proposed a change to openstack-infra/jenkins-job-builder: Job-specific subst. in a job group's job list https://review.openstack.org/43013	23:05
*** mrodden has quit IRC		23:06
clarkb	pleia2: /cgit will be served by cgit though right?	23:07
clarkb	pleia2: so possibly completely different processes	23:07
pleia2	clarkb: right	23:07
pleia2	but at least the files do exist and are servable by apache somewhere	23:07
pleia2	might be right about git version weirdness	23:08
jeblair	clarkb: maybe check if that file exists on disk?	23:08
pleia2	cgit is serving it	23:08
jeblair	pleia2: could be cached	23:09
pleia2	ah	23:09
jeblair	pleia2: if it exists on disk and apache does not serve it, it's as you say, a rewrite problem	23:09
jeblair	pleia2: if not, we're back to where we were	23:09
clarkb	jeblair: the files do exist on disk, at least the ones that I have seen	23:09
clarkb	s/seen/looked at/	23:09
*** sgviking has joined #openstack-infra		23:09
jeblair	clarkb: does openstack/neutron/objects/pack/pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx ?	23:10
clarkb	https://github.com/openstack-infra/config/blob/master/modules/cgit/templates/git.vhost.erb#L19-L30 for those following along.	23:10
clarkb	jeblair: checking	23:10
clarkb	jeblair: yes -r--r--r--. 1 cgit cgit 4488 Aug 20 06:18 pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx	23:11
jeblair	pleia2: sounds like you're on to something	23:12
clarkb	jeblair: pleia2 does the RewriteRule and ScriptAlias conflict?	23:12
pleia2	hmm	23:12
clarkb	oh you know	23:13
*** jerryz has joined #openstack-infra		23:13
clarkb	actually no that can't be it	23:13
pleia2	the regex for pack\|idx seems right	23:14
clarkb	pleia2: yeah that comes straight from the git http man page iirc	23:14
*** dims has quit IRC		23:15
*** ken1ohmichi has joined #openstack-infra		23:18
*** ryanpetrello has quit IRC		23:20
openstackgerrit	James E. Blair proposed a change to openstack-infra/nodepool: Add JenkinsManager https://review.openstack.org/43014	23:21
openstackgerrit	James E. Blair proposed a change to openstack-infra/nodepool: Add an ssh check periodic task https://review.openstack.org/43015	23:21
openstackgerrit	James E. Blair proposed a change to openstack-infra/nodepool: Change credentials-id parameter in config file https://review.openstack.org/43016	23:21
openstackgerrit	James E. Blair proposed a change to openstack-infra/nodepool: Reduce timeout when waiting for server deletion https://review.openstack.org/43017	23:21
openstackgerrit	James E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager https://review.openstack.org/42973	23:21
mgagne	which repo should I clone to test? I was able to clone stackforge/puppet-glance and openstack/python-heatclient without problem	23:21
clarkb	mgagne: neutron and nova appear to currently be failing fairly frequently according to the logs	23:21
mgagne	clarkb: is it therefore an intermittent issue?	23:22
pleia2	clarkb: so can it get to some pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx files?	23:23
clarkb	mgagne: yes, it seems to be intermittent	23:23
pleia2	er, .idx files	23:23
clarkb	pleia2: I am not sure yet, actually let me try getting that file directly	23:23
clarkb	pleia2: mgagne: this may in part depend on the local state of your repo	23:23
mgagne	clarkb: I'm cloning from scratch, are tests fetching and checking out a specific ref instead?	23:24
clarkb	mgagne: tests will clone if the repo doesn't already exist otherwise they will do a remote update to fetch what they are missing	23:25
clarkb	pleia2: directly fetching one of those neutron files with wget fails. This must've been what you tested before	23:25
clarkb	pleia2: for whatever reason I thought you tested with a git clone which does work	23:25
pleia2	clarkb: I just tested via web browser	23:26
clarkb	pleia2: looking at the vhost cgit will serve anything not under .*/objects because ScriptAlias / /usr/libexec/git-core/git-http-backend/ will never be used as we rewrite / to /cgit	23:27
clarkb	pleia2: oh but we rewrite ^/$ to /cgit so anything like /openstack/foo should go to git-http-backend right?	23:28
pleia2	clarkb: yeah, I think those rewrite things are not for cgit	23:28
*** mrodden has joined #openstack-infra		23:28
pleia2	clarkb: I think they are just for git-http-backend	23:29
pleia2	fungi added them in a change to support git-http-backend	23:29
*** changbl has joined #openstack-infra		23:29
*** dims has joined #openstack-infra		23:30
*** HenryG has joined #openstack-infra		23:31
jeblair	clarkb: ^ the new stack of nodepool changes is in production	23:32
fungi	yup	23:32
jeblair	clarkb: (i did reduce that timeout, btw, because i think it was ridiculously large)	23:32
fungi	from an hour to...?	23:33
*** ken1ohmichi has quit IRC		23:33
jeblair	10 mins	23:33
* fungi nods. sounds sane		23:33
jeblair	which is just, well, large. :)	23:33
fungi	s/ridiculously//	23:33
pleia2	clarkb: confirmed, I don't have any of the pack rewrite rules in my test instance and I can download packs via cgit (hi fungi!)	23:33
clarkb	pleia2: I think it may be an selinux thing	23:34
clarkb	pleia2: httpd itself will access the git files when they hit the AliasMatches	23:35
* fungi retries to grok where the ^/$ rewrite could conflict at all with the git-http-backend cgi scriptalias		23:35
clarkb	but httpd runs under a different selinux type	23:35
clarkb	I am very quickly learning about selinux types so that I can test	23:35
jeblair	selinux would show that error	23:35
jeblair	clarkb: look in audit.olg	23:35
jeblair	log	23:35
clarkb	audit.log was a pain to look at ...	23:36
pleia2	hah	23:36
pleia2	can grep for git probably	23:36
clarkb	but I think I just get annoyed when there are no timestamps. I will look again	23:36
fungi	clarkb: well, there are timestamps, you just need to learn to read unixtime directly ;)	23:37
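On reading those timestamps: audit.log records carry a `msg=audit(SECONDS.MILLIS:SERIAL)` stamp, where the first field is Unix time. A small Python sketch of converting it follows. The helper name `audit_time` and the sample record are made up for illustration and are not output from the server discussed here.

```python
import re
from datetime import datetime, timezone

# Matches the msg=audit(<seconds>.<millis>:<serial>) stamp in an audit record.
AUDIT_STAMP = re.compile(r"msg=audit\((\d+)\.(\d+):(\d+)\)")


def audit_time(line):
    """Return the record's timestamp as an aware UTC datetime, or None."""
    match = AUDIT_STAMP.search(line)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


# Hypothetical record, for illustration only.
SAMPLE = 'type=AVC msg=audit(1377041700.000:42): avc:  denied  { read } for  pid=1234 comm="httpd"'

print(audit_time(SAMPLE).isoformat())  # 2013-08-20T23:35:00+00:00
```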
clarkb	I don't see any AVC messages in audit.log	23:38
mgagne	clarkb: I think it has to do with the way packs are generated. Could be that they are generated on-the-fly and there are contention issues on git.o.o due to the high volume of clone, fetch, etc.	23:38
mgagne	clarkb: https://www.kernel.org/pub//software/scm/git/docs/git-update-server-info.html	23:39
clarkb	mgagne: it seems to know where the files are though, it just can't get them	23:39
mgagne	clarkb: a curl returns the file? Could it be a caching issue? Or is it a timing issue, where by the time you test the existence of the file, it got generated? Trying to figure out what has been tried/tested.	23:41
*** rfolco has joined #openstack-infra		23:42
clarkb	mgagne: wgetting the file that failed to fetch on a jenkins slave fails, but the file is on disk and has been there for at least hours	23:42
clarkb	mgagne: https://jenkins01.openstack.org/job/gate-neutron-pep8/434/console has a list of things that can't be fetched	23:42
clarkb	mgagne: however changing the root of the url to /cgit you are able to get the file	23:43
clarkb	mgagne: so it is only when apache attempts direct access via https://github.com/openstack-infra/config/blob/master/modules/cgit/templates/git.vhost.erb#L28-L29 that it fails	23:43
jeblair	further evidence the scriptalias is not working: the actual apache error log message says "File does not exist: /var/lib/git/openstack/neutron"	23:44
jeblair	and that _doesn't_ exist	23:44
jeblair	because it's /var/lib/git/openstack/neutron.git	23:44
jeblair	so presumably the scriptalias directive to use the smart http server would normally translate that,	23:44
clarkb	oh that may be it	23:44
pleia2	oh wow, right	23:45
jeblair	but it's not, so apache is just trying to serve a simple file	23:45
pleia2	https://git.openstack.org/openstack/neutron.git/objects/pack/pack-8dd2daf4e48bc336b39e06bcb5612bdc2c7bec7c.idx works!	23:46
pleia2	nice one jeblair	23:46
jeblair	but looking at that, i think we're trying to get apache to just serve the files	23:46
jeblair	it looks like the aliasmatch directives are intended to take precedence, and then scriptalias catches the rest	23:47
mrodden	any idea why i'm seeing this in my tox runs? http://paste.openstack.org/show/44692/	23:47
mrodden	cannot import setuptools	23:47
clarkb	jeblair: the config comes from https://www.kernel.org/pub/software/scm/git/docs/git-http-backend.html	23:47
mrodden	but it actually installs setuptools 1.0 above...	23:47
jeblair	clarkb: yeah, and it's the same as on review	23:47
*** mriedem has joined #openstack-infra		23:48
jeblair	clarkb: what if the git smart http server is providing the wrong urls?	23:48
jeblair	(git version difference)	23:48
clarkb	jeblair: could be	23:49
mgagne	GIT_PROJECT_ROOT has a trailing slash	23:49
mgagne	could it be?	23:49
clarkb	mrodden: the uninstall of distribute that happens first is causing the problem I think	23:49
mgagne	doc doesn't show/use trailing slash	23:49
clarkb	mrodden: try updating tox?	23:50
pleia2	mgagne: perhaps, maybe if it has a trailing slash it does assume neutron/ and won't expand to neutron.git/	23:50
mrodden	clarkb: ok i'm on 1.4	23:50
mrodden	1.4.2 i think	23:50
clarkb	there is a trailing slash on review.o.o, but I can go ahead and update it on git.o.o and restart apache to check	23:51
mrodden	wow they have 1.6.0 out now...	23:51
clarkb	mrodden: there has been a lot of churn around setuptools and distribute merging	23:51
clarkb	mrodden: so there are a bunch of updates from tools	23:51
fungi	well, we have trailing / on GIT_PROJECT_ROOT for the gerrit servers and zuul in fact	23:51
*** UtahDave has quit IRC		23:51
mrodden	crazy	23:51
openstackgerrit	Joshua Hesketh proposed a change to openstack-infra/zuul: Move gerrit specific result actions under reporter https://review.openstack.org/42644	23:52
openstackgerrit	Joshua Hesketh proposed a change to openstack-infra/zuul: Add support for emailing results via SMTP https://review.openstack.org/42645	23:52
openstackgerrit	Joshua Hesketh proposed a change to openstack-infra/zuul: Separate reporters from triggers https://review.openstack.org/42643	23:52
clarkb	fungi: yeah but this is the only server with this version of git	23:52
clarkb	anyways restarting apache now	23:52
clarkb	didn't help	23:53
pleia2	nope :\	23:53
jeblair	uh, so there are very few references to pack files in the gerrit logs	23:54
clarkb	jeblair: maybe it isn't working there either?	23:54
mordred	clarkb: oh - interesting	23:54
jeblair	some of them are to '.git' dirs, and they work, some omit '.git' and are 404s	23:54
pleia2	same thing here	23:55
jeblair	by very few, i mean 1 client this week.	23:55
clarkb	warning hack: what if we just symlink openstack/foo to openstack/foo.git on disk?	23:55
clarkb	and handle both cases?	23:55
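The stop-gap floated here, making `openstack/foo` resolve to `openstack/foo.git` on disk, could be sketched roughly as below. This is only an illustration under stated assumptions: the two-level `<root>/<org>/<repo>.git` layout is inferred from the paths quoted in the log, the function name `link_bare_names` is hypothetical, and nothing here is taken from jeepyb or the actual server configuration.

```python
import os


def link_bare_names(root):
    """For each <root>/<org>/<repo>.git, add a <repo> symlink next to it.

    Returns the list of symlink paths that were created.
    """
    created = []
    for org in sorted(os.listdir(root)):
        org_dir = os.path.join(root, org)
        if not os.path.isdir(org_dir):
            continue
        for name in sorted(os.listdir(org_dir)):
            if not name.endswith(".git"):
                continue
            link = os.path.join(org_dir, name[: -len(".git")])
            # Leave anything already present at that path untouched.
            if not os.path.lexists(link):
                # Relative target, so the link survives moving the root.
                os.symlink(name, link)
                created.append(link)
    return created
```

Calling `link_bare_names("/var/lib/git")` would be the equivalent of the manual neutron test described in the following lines, applied to every repository; running it twice creates nothing new the second time.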
pleia2	clarkb: it hurts, but if we do we can do it in the jeepyb script	23:56
jeblair	clarkb: maybe to stop the bleeding? but we really should figure out the problem.	23:57
clarkb	jeblair: I agree	23:57
clarkb	let me add a neutron symlink then try grabbing that idx file again	23:57
clarkb	that will at least tell us if this is the only problem	23:58
* pleia2 nods		23:58
jeblair	(i don't think we should add it to jeepyb, (unless we decide it's the actual solution) we'll never fix it)	23:58
pleia2	jeblair: ah, ok	23:58
jeblair	mordred: i forgot a step earlier: set the nodes to deleted in nodepool	23:59
jeblair	i'll do that now	23:59