17:02:03 <krtaylor> hey everybody, it's that time again
17:02:13 <krtaylor> anyone here for CI working group?
17:02:37 <rfolco> o/
17:02:45 <mmedvede> hey krtaylor
17:02:51 <patrickeast> hey
17:03:06 <krtaylor> hi rfolco mmedvede patrickeast
17:03:17 <krtaylor> patrickeast, haven't seen you in a while
17:04:19 <sweston> \o
17:04:21 <krtaylor> here's the agenda for today:
17:04:24 <asselin_> hi, I'm half here
17:04:27 <krtaylor> #link https://wiki.openstack.org/wiki/Meetings/ThirdParty#8.2F18.2F15_1700_UTC
17:04:38 <krtaylor> hi sweston asselin_
17:04:46 <sweston> hi krtaylor
17:05:09 <krtaylor> any quick announcements? deadlines?
17:05:19 <krtaylor> none on the agenda
17:06:32 <krtaylor> #topic Common CI
17:06:46 <krtaylor> asselin_ I understand if you are too busy to discuss
17:07:06 <krtaylor> LOTS of patches
17:07:18 <krtaylor> #link https://review.openstack.org/#/q/topic:downstream-puppet,n,z
17:07:35 <asselin_> no updates from me. Still working through patches and reviews. There's progress which is good news.
17:07:50 <krtaylor> several look really close to merge
17:09:31 <krtaylor> ok, next then
17:09:37 <krtaylor> #topic Spec to have infra host scoreboard
17:09:51 <krtaylor> this is moving along
17:10:32 <krtaylor> although not NEARLY as fast as we were hoping when jeblair and I first discussed this idea
17:10:57 <krtaylor> the idea was to push this out fast, then start working on radar
17:10:58 <krtaylor> sigh
17:11:15 <mmedvede> the main block is that it would not be as temporary as was initially intended
17:11:16 <krtaylor> anyway, it should be fairly close if we can get reviewers
17:11:34 <krtaylor> why couldn't it be mmedvede ?
17:11:41 <sweston> what can folks do to hurry it along?
17:11:42 <patrickeast> i'll try and take another look at it today
17:11:53 * patrickeast has been busy with cinder things :(
17:12:30 <mmedvede> because we are asking infra team to deploy it, there needs to be infrastructure for maintaining it
17:12:54 <krtaylor> #link https://review.openstack.org/#/c/194437/
17:12:57 <mmedvede> I am writing puppet module to deploy scoreboard. Almost done
17:13:43 <krtaylor> it would be good to get more reviews so I could stop having to push a refresh for every nit
17:14:00 <sweston> can this be set up in such a way that when we are ready to suggest radar hosting, it can be easily dropped in?
17:14:38 <mmedvede> patrickeast: should we name puppet module puppet-ci_scoreboard or puppet-scoreboard?
17:14:47 <krtaylor> mmedvede, for the naming, there are so many dashboards now proposed, I felt it added clarity without too much overhead
17:15:00 <krtaylor> but will change it if you feel strongly about it
17:15:34 <krtaylor> sweston, that is the intention, at least the url, vm, etc would be there
17:15:49 <sweston> krtaylor: good enough
17:16:19 <mmedvede> krtaylor: no strong feelings. If people agree with naming, need more opinions
17:16:45 <patrickeast> yea i don't really have strong feelings on the puppet module naming either
17:17:15 <krtaylor> darn white space, I checked it too (then added that line) sigh
17:17:26 <krtaylor> anyway, new patchset pushed just now
17:17:52 <krtaylor> check it out and review, the more +1's we get on it, the more likely that the infra folks will assist
17:18:32 <mmedvede> krtaylor: my -1 was not about whitespace. I try not to -1 for style
17:18:37 <sweston> Will do
17:18:50 <mmedvede> it was for section still missing about gerrit account requirement
17:19:01 <krtaylor> mmedvede, understood, all good, I didn't think so
17:19:30 <krtaylor> oh, I didn't add that, crap
17:19:32 <krtaylor> will do
17:20:03 <krtaylor> mmedvede, what do you think? I feel like it should have its own id
17:21:11 <krtaylor> anyone else have any comments on the hosting spec?
17:21:14 <mmedvede> krtaylor: it might be a necessity
17:21:22 <krtaylor> mmedvede, agreed
17:21:33 <mmedvede> patrickeast: do you think the scoreboard would be able to handle the load once it is used by more people?
17:21:55 <krtaylor> mmedvede, I based this spec on other hosting specs, and it was not mentioned in them, it may be a "given"
17:22:21 <mmedvede> patrickeast: it uses flask. I know it is possible to also use apache alongside flask to make it more resilient
17:22:23 <patrickeast> mmedvede: maybe, my biggest concern would be how it is serving static files. we should at some point switch it to using apache or something
17:22:32 <patrickeast> mmedvede: yea exactly
17:22:50 <krtaylor> or put that work into radar
17:23:03 <patrickeast> mmedvede: it should be ok, the one i have in a little aws vm uses like <2GB of ram peak and like .1 cpu load on average
17:23:17 <mmedvede> krtaylor: hmm, maybe. I wanted to make the account requirement explicit, so infra team knows what they are getting into :)
17:23:23 <patrickeast> mmedvede: i don't anticipate a ton more folks would start hitting it
17:23:32 <krtaylor> .1, that much  :)
17:24:03 <krtaylor> mmedvede, I'll add it right after this, thought I did
17:24:16 <krtaylor> too many tasks atm
17:24:39 <mmedvede> patrickeast: I remember your aws instance was down sometimes, did you figure out the reason?
17:24:49 <patrickeast> ' 17:21:11 up 169 days, 16:59,  1 user,  load average: 0.10, 0.10, 0.13 '
17:24:50 <patrickeast> from uptime
17:25:26 <patrickeast> mmedvede: it's something wrong with flask/python socket handling, i haven't tracked it down yet but also haven't had much time to look at it
17:26:08 <mmedvede> patrickeast: I think after official announcement, the ci-dashboard.o.o might get more traffic. We can try to harden it, but it would be easier once it is running.
17:26:09 <patrickeast> mmedvede: would probably be fixed if we migrate towards an apache integrated solution
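A minimal sketch of the "apache integrated" setup discussed above, assuming mod_wsgi: Apache loads a WSGI script and calls its module-level `application` object, and can serve static files itself. The file name `scoreboard.wsgi` and the response body are hypothetical, and a plain WSGI callable stands in for the real Flask app (a Flask app object is also a WSGI callable and could be assigned to `application` instead).

```python
# scoreboard.wsgi (hypothetical name): entry point that Apache mod_wsgi would load.
# mod_wsgi looks for a module-level callable named `application` by default.
# A plain WSGI callable stands in for the scoreboard's Flask app here.

def application(environ, start_response):
    """Answer every request with a small plain-text body."""
    body = b"scoreboard placeholder\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

With this shape, Apache owns the listening sockets and process management, which is the part patrickeast suspects is misbehaving in the standalone Flask server.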
17:27:45 <krtaylor> mmedvede, I am proposing a Work Item to create a user account for 'ci-dashboard'
17:27:52 <krtaylor> see any problems with that?
17:28:16 <krtaylor> thought a generic account name could be reused for whatever solution is deployed in the future
17:29:00 <mmedvede> krtaylor: the thing is, it would need to be account managed by infra team
17:29:23 <mmedvede> e.g. they would need to manage private ssh key
17:29:48 <krtaylor> mmedvede, wouldn't that be a work item?
17:30:05 <krtaylor> so you are thinking a dependency?
17:31:34 <mmedvede> krtaylor:  Are you talking about work item inside the spec?
17:31:54 <krtaylor> yes, in the Work Item section
17:32:22 <krtaylor> I can note that it would need to be created and maintained by the infra team
17:32:30 <mmedvede> +1
17:32:41 <krtaylor> since none of us have those acls
17:33:09 <krtaylor> ok, cool, I'll finish that asap
17:33:25 <krtaylor> any other comments on hosting the dashboard spec?
17:34:37 <krtaylor> BTW, lightning session topic: "Using CI Dashboard to check on a CI system's health"
17:34:45 <krtaylor> for Tokyo
17:34:54 <krtaylor> I'm just sayin....  :)
17:35:42 <krtaylor> ok, let's move on
17:35:45 <krtaylor> #topic Radar spec
17:36:06 <krtaylor> sweston graciously moved the spec to our third party tool repo
17:36:18 <krtaylor> #link https://review.openstack.org/#/c/211713/
17:37:08 <krtaylor> so, there are two ways to address this
17:37:22 <krtaylor> 1) is to merge it then patch it for spec changes
17:37:48 <krtaylor> 2) is to wait till we all agree on its content, then merge it, meaning that the design is complete
17:37:54 <krtaylor> I am leaning toward 1
17:37:57 <krtaylor> comments?
17:38:14 <mmedvede> the (2) kind of defeats the purpose of moving it
17:38:31 <sweston> I am leaning toward 1 as well, this spec has been hanging too long for approval, and it is blocking progress
17:38:31 <mmedvede> we could work on it at its original location
17:38:54 <krtaylor> well, moving it always implied that it would be re-proposed to infra after we worked on it
17:39:26 <krtaylor> mmedvede, yes, long history here, it was agreed to move it
17:40:12 <krtaylor> mmedvede, that would have been the preferred approach, but it was too confusing for some reason
17:40:41 <krtaylor> we can't wait to improve CI system trust
17:41:04 <krtaylor> the systems that are busting their behinds to push reliable results need to be trusted
17:41:21 <krtaylor> and in order to do that we have to show developers the test results
17:42:02 <krtaylor> hence, the tactical/strategic approach
17:42:26 <sweston> krtaylor: +1
17:42:27 * krtaylor gets off his soapbox
17:43:12 <krtaylor> so, do we agree on merging it first, then patching design ideas and corrections?
17:44:15 <krtaylor> if no one disagrees, then I'll merge it this afternoon
17:44:32 <sweston> krtaylor: I vote for merging it as it is now.  I will be writing Gerrit queries and integrating some data into Radar over the next two weeks, and I would prefer to have the spec approved
17:44:34 <patrickeast> +1 for merging it
17:45:23 <mmedvede> hard for me to have a good opinion, I did not work on many specs to understand what is best, so I abstain :)
17:45:46 <asselin_> +1 merge
17:45:50 <krtaylor> done
17:46:06 <krtaylor> so, next up
17:46:31 <krtaylor> #topic Patches
17:46:36 <krtaylor> #link https://review.openstack.org/#/q/project:stackforge/third-party-ci-tools+status:open,n,z
17:47:07 * krtaylor looking
17:47:32 <patrickeast> i need to update my FC passthrough one
17:48:17 <patrickeast> asselin_: did you and hemnafk figure out how to get the offline check one to work?
17:48:57 <asselin_> patrickeast, that's on hold a bit
17:49:00 <krtaylor> patrickeast, looks like some minor changes
17:49:18 <asselin_> patrickeast, I'd like to get yours landed first, and then look at detaching at the end of the job
17:49:30 <krtaylor> patrickeast, explain offline check?
17:49:43 <asselin_> and then include the offline check
17:49:43 <patrickeast> krtaylor: the HBAs can get into an 'offline' state
17:49:51 <patrickeast> krtaylor: and then the passthrough will fail
17:50:05 <patrickeast> krtaylor: the idea is to have some notification early on that it is happening
17:50:18 <patrickeast> asselin_: that makes sense
17:50:24 <krtaylor> ah, interesting, thanks for the education
17:50:37 <asselin_> patrickeast, or include that check after the detach before the attach
17:51:28 <asselin_> patrickeast, perhaps we can make it more general, when fc fails altogether, send an e-mail (if configured)
17:52:01 <patrickeast> asselin_: yea i was thinking it might be a better nagios kind of check since they don't seem to recover automatically
17:52:08 <patrickeast> needs some manual intervention (right now)
17:53:10 <krtaylor> ah, it is well documented in the next patch, my bad
17:54:09 <krtaylor> so a quick open discussion then
17:54:26 <krtaylor> #topic Open Discussion
17:54:27 <mmedvede> krtaylor: there was another topic, about stackforge migration
17:54:41 <mmedvede> small thing - we need to remember to add our repo to the list
17:54:57 <mmedvede> #link http://lists.openstack.org/pipermail/openstack-infra/2015-August/003069.html
17:54:59 <krtaylor> ah, just added, I needed to refresh
17:55:34 <mmedvede> that is all for that :). I do not believe there is a wiki page yet
17:56:46 <krtaylor> mmedvede, yeah, not seeing a page yet
17:57:02 <krtaylor> but it is good to keep that on the agenda, we don't want to miss the "move"
17:57:07 <krtaylor> thanks mmedvede
17:57:55 <krtaylor> any other topics?
17:58:10 <asselin_> I've been running into a recent issue with nodepool and juno openstack clouds.
17:58:47 <patrickeast> asselin_: i have been having issues with it too, what problem are you getting?
17:58:48 <asselin_> don't update unnecessarily...still working on the fix
17:59:12 <asselin_> {"error": {"message": "Project ID not found: admin (Disable debug mode to suppress these details.)", "code": 401, "title": "Unauthorized"}}[
17:59:31 <asselin_> {"error": {"message": "User e082a15d2e6b490ba8329e60e7f092ea is unauthorized for tenant admin (Disable debug mode to suppress these details.)", "code": 401, "title": "Unauthorized"}}[
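For anyone comparing notes on these failures, a small sketch that pulls the status code and message out of a Keystone-style JSON error body like the ones pasted above. The helper name `summarize_error` is hypothetical; it assumes only the `error` object with `code`, `title`, and `message` keys shown in the paste.

```python
import json


def summarize_error(body):
    """Return (code, title, message) from a JSON error body.

    Assumes the shape seen in the paste above:
    {"error": {"message": ..., "code": ..., "title": ...}}
    """
    err = json.loads(body)["error"]
    return err["code"], err["title"], err["message"]


# Sample body copied from the first error above (without the trailing "[").
sample = (
    '{"error": {"message": "Project ID not found: admin '
    '(Disable debug mode to suppress these details.)", '
    '"code": 401, "title": "Unauthorized"}}'
)
code, title, message = summarize_error(sample)
print(code, title, message)
```

Both pasted errors carry code 401, so logging the message alongside the code is what distinguishes the "Project ID not found" case from the "unauthorized for tenant" case.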
