13:31:39 #startmeeting powervm_ci_meeting
13:31:39 Meeting started Thu Dec 8 13:31:39 2016 UTC and is due to finish in 60 minutes. The chair is esberglu_. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:31:40 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:31:42 The meeting name has been set to 'powervm_ci_meeting'
13:31:49 Hey guys
13:32:26 * adreznec waves
13:33:40 o/
13:34:36 #topic status
13:34:51 So those runs are _slowly_ going through
13:35:13 Yeah
13:35:16 The runs themselves seem to be fine
13:35:18 how slow?
13:35:20 Seeing some scary times out there on the queue
13:35:52 thorst_: I can ping you the zuul ip if you want to look
13:36:06 I have it
13:36:08 10 hours?
13:36:13 But like 10 hours
13:36:14 Yeah
13:36:28 Do we know what's bogging things down yet?
13:36:31 are any actually going?
13:36:59 the jenkins has a ton of idle VMs.
13:37:40 Yeah. There are 3 going through right now, 20 - 30 have gone through in the last 12 hours
13:37:46 Yeah
13:38:00 It doesn't actually look like anything's been running for all that long
13:38:03 That's about the volume I would expect
13:38:07 but things have been in the queue for a long time
13:38:13 Its just they sit around in the queue forever first
13:39:16 Which means the queue just keeps getting bigger
13:41:03 Ok, so I think we need to nail down exactly what's causing the initial queuing to build up
13:41:28 If it's git issues, then we probably need to invest in mirrors at this point
13:42:57 did we tell Zuul to only let 3 through at a time?
13:43:16 wasn't there some gate in zuul about throughput?
13:43:34 I don't really know how this could be git...what's that train of thought there?
13:44:21 (not that a mirror is a bad idea...)
13:44:38 No the only zuul conf that changed was moving nova from silent to check pipeline
13:44:45 hmm
13:45:13 Well we were seeing those git performance issues yesterday, and one theory was that we were hitting some kind of internal timeouts doing the clones/fetches
13:45:40 ahh, cause zuul does some sort of clone
13:45:51 which I don't understand...I'd have thought that was just in the Jenkins slave VM
13:45:51 Because we could see it attempting to do the same fetch multiple times on different PIDs
13:46:03 We were seeing these git fetch
13:46:10 That seemed to just be looping
13:46:15 Not sure we have enough data to say that concretely
13:46:21 But it was a theory
13:46:54 The only other thing that I thought it might be
13:47:09 There are these changes in the queue that depend on like 10 other changes
13:47:31 And some of the changes are having merge issues
13:48:01 why is zuul doing this? ssh -i /var/lib/zuul/ssh/id_rsa -p 29418 powervmci@review.openstack.org git-upload-pack '/openstack/nova'
13:48:18 Here's an example of one of those changes https://review.openstack.org/#/c/337789/
13:48:32 Not sure
13:49:34 that process has been running for a while
13:51:27 Interesting...
13:51:29 the commit message in zuul about why that runs is "I'll document this later"
13:51:48 That should never really be a particularly long-running command
13:52:27 I suggest we kill that proc
13:52:30 and see if we unwedge.
13:52:54 Sure
13:53:23 thorst_: How long is a while?
13:53:27 Hours?
13:53:51 says 08:48 in the ps aux output
13:54:15 so under 5 min.
13:54:19 its done now.
13:54:36 Yeah I killed it. Another one just popped up in its place
13:55:09 did you kill that second one?
13:55:13 they just seem to be really slow
13:55:14 No
13:56:13 Right
13:56:35 wonder what git-upload-pack does
13:56:44 needs some investigation, because I don't think a clone would help that...
13:58:52 well...when in doubt, just run by hand.
13:59:05 it returns quite the amount of data.
14:00:22 I think it does discovery/fetching of objects from git during a fetch
14:00:39 Not 100% sure on that
14:02:45 alright...so is that the status. Figure out why we're wedged.
14:02:56 (since we're over on time in the meeting)
14:03:09 Yeah
14:03:16 Clearly we need longer than 30 minutes to investigate this
14:03:28 just running the command ourselves may take 30 minutes
14:03:33 Yeah. Other than that I put a wiki page up for CI
14:03:48 If you guys want to take a look. Still need to finish a few sections and polish it up
14:04:56 Where did it land?
14:04:58 Novalink wiki?
14:05:34 Neo dev wiki
14:05:49 Subpage under PowerVM CI System
14:06:04 Ok
14:06:12 _WIP_ CI System and Deployment
14:06:16 so that is also for wangqwsh as you train him to be able to redeploy the CI?
14:06:20 Yep
14:06:38 excellent. And if we do need a git mirror, that may be a good project for wangqwsh to drive
14:07:24 #endmeeting