15:01:00 <krtaylor> #startmeeting third-party
15:01:01 <openstack> Meeting started Wed Feb 18 15:01:00 2015 UTC and is due to finish in 60 minutes. The chair is krtaylor. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:04 <openstack> The meeting name has been set to 'third_party'
15:01:11 <krtaylor> Hi everyone
15:01:17 <patrickeast> hey
15:01:17 <ja_> moin moin
15:01:18 <mmedvede> hi
15:01:29 <omrim> Hello
15:01:34 <krtaylor> time for another Third Party CI WG meeting
15:02:01 <lennyb> hi
15:02:06 <lennyb> hi
15:02:38 <krtaylor> looks like we have a good group today
15:02:39 <rfolco> o/
15:03:01 <krtaylor> here is the agenda:
15:03:05 <krtaylor> #link https://wiki.openstack.org/wiki/Meetings/ThirdParty#2.2F18.2F15_1500_UTC
15:03:51 <krtaylor> #topic Announcements
15:04:19 <asselin_> hi
15:04:32 <krtaylor> I'll start off by reminding everyone of gerrit being upgraded March 21st
15:04:41 <krtaylor> hi asselin
15:05:02 <ja_> what is the "impact" of the upgrade to users?
15:05:32 <asselin_> ja_, for those with firewalls blocking the port, it means firewall updates
15:05:39 <krtaylor> ja_, really only if your CI system needs some kind of firewall egress configuration
15:05:48 <ja_> ok thx
15:05:51 <krtaylor> asselin beat me to it
15:06:20 <krtaylor> ok, any other quick announcements before we move on?
15:07:09 <krtaylor> #topic Third-party CI documentation
15:07:59 <krtaylor> ok, so we still have some work to do here, especially in running-your-own
15:08:41 <krtaylor> and I am thinking that it is slowed due to the need to walk through it
15:09:26 <krtaylor> we have some patches, which reminds me
15:09:57 <krtaylor> rfolco, can you change your topic to 'third-party-ci-documentation' on your patch
15:10:20 <krtaylor> that is for everyone - so we can track all with one query
15:10:32 <krtaylor> #link https://review.openstack.org/#/q/topic:third-party-ci-documentation,n,z
15:10:37 <rfolco> kragniz, summary line ?
15:10:46 <rfolco> krtaylor, ^
15:10:59 <rfolco> (sorry)
15:11:25 <krtaylor> rfolco, you should have a little writing pad next to Topic in the upper left of your patch review
15:11:35 <krtaylor> you can edit the topic in gerrit
15:12:11 <rfolco> topic is master, is that one ?
15:12:53 <krtaylor> rfolco, just below Branch
15:13:12 <rfolco> done https://review.openstack.org/#/c/155864/
15:13:16 <krtaylor> thanks
15:13:45 <krtaylor> lennyb, since you are getting started, any comments on the running-your-own doc would be helpful as well
15:14:22 <krtaylor> ja_, your continued input is appreciated too
15:14:51 <krtaylor> ok, onward
15:15:03 <krtaylor> #topic Spec for in-tree 3rd party ci
15:15:14 <lennyb> krtaylor, thanks I will try to document
15:15:26 <krtaylor> asselin, you have a new rev on the spec
15:15:28 <asselin_> so I updated the spec a bit yesterday
15:15:39 <krtaylor> #link https://review.openstack.org/#/c/139745/
15:15:56 <asselin_> yes, it is now a 'priority effort' for openstack-infra
15:16:02 <krtaylor> I haven't had a chance to review it yet, will today
15:16:17 <krtaylor> yea!
15:16:34 <asselin_> yes, very excited about that! :)
15:17:18 <krtaylor> asselin, it will enable a lot of goodness
15:17:44 <rfolco> asselin, is the refactor an override on infra puppet, a fork, or what ? could you please clarify ?
15:18:23 <asselin_> rfolco, the refactor is to allow the puppet scripts to be more easily reused
15:19:04 <asselin_> rfolco, there are lots of sections in system-config that are needed, but not easily reusable
15:19:51 <rfolco> I read the spec and I had the impression it was a fork from infra scripts
15:20:15 <asselin_> that's today's solution
15:20:41 <asselin_> rfolco, could you comment on the specific sections? I will try to clarify
15:21:12 <rfolco> asselin, will do, thx
15:21:37 <krtaylor> yes, and just like the puppet module split-out, a great opportunity for the third-party WG to get involved and help out
15:21:38 <asselin_> #link https://review.openstack.org/#/c/137471/
15:21:49 <asselin_> ^^ this is a related spec that will help a lot as well
15:23:20 <krtaylor> any other questions on in-tree ?
15:23:34 <krtaylor> an action for everyone: please go read the spec
15:24:04 <krtaylor> thanks for the overview asselin
15:24:29 <krtaylor> oops asselin_
15:24:41 <asselin_> np :)
15:24:49 <krtaylor> ok, next
15:24:52 <krtaylor> #topic Repo for third party tools
15:25:09 <krtaylor> like last week, I am socializing the idea of creating a repo
15:25:41 <krtaylor> a place for ci teams to share their scripts and other goodies that make their job easier
15:26:19 <asselin_> +1 I like the idea
15:26:21 <krtaylor> if the consensus is that it is a good idea, I'll see about getting that set up
15:26:51 <patrickeast> i like this idea, you thinking an openstack repo or just a public github kind of thing?
15:26:57 <krtaylor> there are several tools available, but unless you know about someone's github account, you don't know they exist
15:27:18 <krtaylor> actually, a stackforge repo
15:27:29 <patrickeast> gotcha
15:27:46 <krtaylor> we'd have to discuss the organization of it, etc
15:28:19 * krtaylor wonders if we'd need a spec to propose the idea formally
15:28:54 <patrickeast> that might be a good idea, or at least a wiki page or something to capture what we decide for organization
15:29:08 <krtaylor> well, I haven't found anyone that hated the idea...yet
15:29:40 <asselin_> A wiki or etherpad might be good to start.
15:29:52 <krtaylor> I know we have tools internally that we could share, scripts, dashboards, etc
15:29:59 * krtaylor agrees
15:30:38 <krtaylor> #action krtaylor to set up wiki for third-party CI WG repo, and/or etherpad
15:31:46 <krtaylor> ok, goodness, thanks everyone for the input
15:31:54 <krtaylor> let's move on
15:31:59 <krtaylor> #topic Restart monitoring dashboard effort
15:32:05 <krtaylor> sweston, ping?
15:32:48 <krtaylor> there is an effort to have a public monitoring dashboard, basically a new/nicer/more featured radar
15:33:20 <krtaylor> some good comments here:
15:33:26 <krtaylor> #link https://review.openstack.org/#/c/135170/
15:34:01 <krtaylor> it would replace the need to change status on ThirdPartySystems, at least eventually
15:35:01 <krtaylor> sweston has been swamped with work from his dayjob, but everything is available, it just needs input, reviews, and ideas to converge
15:36:00 <krtaylor> ok, we can come back to that if time permits, but I want to get to the next bit
15:36:47 <krtaylor> #topic Highlighting Third-Party CI Service
15:37:15 <krtaylor> continuing on the success of rfolco's discussion of PowerKVM CI
15:37:37 <krtaylor> this week we have Pure Storage CI
15:38:01 <krtaylor> patrickeast, can you share a brief intro on your system
15:38:06 <patrickeast> yep
15:38:16 <krtaylor> maybe some problems you had and how you solved them
15:38:35 <patrickeast> so, first off i made some stuff to share
15:38:41 <patrickeast> http://ec2-54-67-119-204.us-west-1.compute.amazonaws.com/ci_stuff.svg
15:38:49 <patrickeast> https://github.com/patrick-east/os-ext-testing-data
15:38:58 <patrickeast> a poorly drawn diagram of our setup
15:39:04 <patrickeast> and what we use for our data repo
15:39:23 <krtaylor> #link http://ec2-54-67-119-204.us-west-1.compute.amazonaws.com/ci_stuff.svg
15:39:31 <krtaylor> #link https://github.com/patrick-east/os-ext-testing-data
15:39:49 <patrickeast> as of a month (almost 2) ago i have switched our system over to using asselin's https://github.com/rasselin/os-ext-testing scripts
15:40:12 <krtaylor> nice
15:40:16 <patrickeast> prior to that we had started with the instructions on Jay Pipes' blog post and cobbled together a system without nodepool
15:40:28 <patrickeast> we ran into all kinds of issues with re-using static slaves though
15:40:38 <asselin_> nice pic
15:40:56 <krtaylor> yep, without nodepool due to setup requirements?
15:41:12 <patrickeast> we went that way originally just due to lack of knowing any better
15:41:35 <krtaylor> ah, ok
15:42:00 <asselin_> patrickeast, what's "RDO"?
15:42:13 <krtaylor> RedHat Repo?
15:42:17 <patrickeast> https://openstack.redhat.com/Main_Page
15:42:31 <patrickeast> it's like their open source version of the redhat openstack stuff
15:42:41 <patrickeast> similar to centos vs rhel
15:43:13 <patrickeast> it made it very very easy to get set up with openstack
15:44:12 <patrickeast> so, as you may have noticed on the diagram, we have the nice high speed data connections that are currently not used... that's on my list of todos
15:44:20 <patrickeast> we are testing our cinder driver
15:44:22 <krtaylor> patrickeast, so I take it you are only testing cinder patches
15:44:27 <patrickeast> correct
15:44:37 <patrickeast> right now we are only listening for openstack/cinder changes on master
15:44:56 <patrickeast> and run the volume api tempest tests
15:45:31 <patrickeast> we are planning to add a FC cinder driver for our array in L-1
15:45:48 <patrickeast> so i'll be adding support for that into the system early in L
15:46:02 <krtaylor> what was a really tricky part that you had to work through?
15:46:37 * asselin_ has fc ci scripts to share in some future repo tbd
15:47:08 <patrickeast> probably the hardest part was figuring out how to properly configure everything... all told there are like 50 config files involved between the openstack provider and ci system
15:47:33 <patrickeast> this is where that documentation push is really going to shine
15:48:09 <asselin_> patrickeast, does rdo help out with the openstack provider configs? or are those the ci config to point to the provider?
15:48:48 <patrickeast> it does get everything set up and working, but we've had to go back through and customize things a bit
15:49:00 <patrickeast> like where nova stores instances, and glance keeps images
15:49:05 <patrickeast> due to partitioning on the system
15:49:17 <patrickeast> and we had to delete all of its automatic network setup and do our own
15:49:36 <krtaylor> patrickeast, we had a similar situation, but as we worked through everything, we found ways to use upstream as-is and have less delta
15:50:49 <patrickeast> yea my goal is to try and reduce that when we add in the FC testing
15:51:10 <patrickeast> right now it's only a single initiator we test with
15:51:29 <patrickeast> i've got 2 more even bigger ones on the rack next to it waiting to be hooked up with the array
15:51:48 <patrickeast> for those ones i'm hoping to improve upon the current setup a bit
15:52:00 <krtaylor> patrickeast, have you automated any other parts of the system for your testing?
15:52:41 <patrickeast> nothing significant, we've added in some scripts to clean up the array once it is done testing
15:53:07 <krtaylor> created any monitoring framework?
15:53:20 <patrickeast> actually yea, a little one
15:53:29 <patrickeast> https://github.com/patrick-east/os-ext-testing-data/tree/master/tools/server_monitor
15:53:31 <patrickeast> so
15:53:38 * asselin_ looking
15:53:44 <patrickeast> we ran into a few times where the system would start failing for X reason
15:53:48 * krtaylor looks too
15:53:48 <patrickeast> either disk out of space
15:53:56 <patrickeast> or the job was unregistered
15:54:01 <patrickeast> or the array went down
15:54:20 <patrickeast> so i made a little script that sits on the master and sends email alerts whenever something like that happens
15:54:30 * patrickeast doesn't know how to use nagios
15:54:43 <krtaylor> hehheh
15:54:53 <krtaylor> we are looking at using nagios also
15:55:18 <krtaylor> hm, something else to share with the community...
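[editor's note] The monitoring approach patrickeast describes above — a small script on the master that emails an alert when the disk fills up or a job breaks — could be sketched roughly like this. The threshold, paths, and mail addresses are illustrative assumptions, not taken from his server_monitor code:

```python
import shutil
import smtplib
from email.message import EmailMessage

DISK_THRESHOLD = 0.90  # alert when a partition is more than 90% full (tunable)

def disk_alerts(paths, threshold=DISK_THRESHOLD, usage_fn=shutil.disk_usage):
    """Return one human-readable alert string per path over the threshold."""
    alerts = []
    for path in paths:
        total, used, _free = usage_fn(path)
        frac = used / total
        if frac > threshold:
            alerts.append("%s is %.0f%% full" % (path, frac * 100))
    return alerts

def send_alert(alerts, to_addr, smtp_host="localhost"):
    """Mail the collected alerts (addresses here are hypothetical)."""
    msg = EmailMessage()
    msg["Subject"] = "CI health alert"
    msg["From"] = "ci-monitor@example.com"
    msg["To"] = to_addr
    msg.set_content("\n".join(alerts))
    with smtplib.SMTP(smtp_host) as s:
        s.send_message(msg)
```

Keeping the check separate from the send makes the alert logic testable without a mail server; a cron entry running the check every few minutes would complete the picture.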
15:55:44 <patrickeast> my company's IT and internal ci teams use nagios quite a lot, so i'm hoping to get their help one day to make this integrated with all their dashboards and stuff
15:55:58 <asselin_> what does -infra use?
15:56:00 <patrickeast> but for now it's nice to get an email instead of checking in and seeing it failed the last 50 builds
15:56:34 <krtaylor> asselin_, eyeballs :)
15:56:56 <asselin_> I configured zuul to send me e-mails on job status. I check the results periodically.
15:57:03 <asselin_> also a good way to fill up your mailbox
15:57:20 <krtaylor> +1000, yeah we did that, then turned it off
15:57:31 <asselin_> but looking for something better. thanks patrickeast :)
15:57:50 <krtaylor> excellent, thanks for sharing that
15:57:53 <patrickeast> oh, also, not sure if anyone else is interested, but https://github.com/patrick-east/os-ext-testing-data/blob/master/tools/clean_purity.py is the cleaning script, it parses the cinder logs for any volumes/hosts/whatever from that particular test run and wipes out any left over
15:58:12 <patrickeast> we have a relatively small max volume limit on our arrays so it becomes an issue if we aren't aggressive about it
15:58:15 <krtaylor> #link https://github.com/patrick-east/os-ext-testing-data/tree/master/tools/server_monitor
15:58:37 <krtaylor> #link https://github.com/patrick-east/os-ext-testing-data/blob/master/tools/clean_purity.py
15:58:40 <lennyb> you can also query Jenkins with the json api to see last job status
15:59:23 <patrickeast> yep, that's all the server_monitor script does, although i dialed it back to just alerting when things hit the fan (health score of 0)
15:59:36 <patrickeast> since i was tired of emails for actual failures on bad patches
15:59:57 <krtaylor> well, we are close to time
16:00:06 <lennyb> yeah, we now get emails only after N failed jobs
16:00:23 <krtaylor> thank you patrickeast for sharing this info about your system
16:00:26 <patrickeast> np
16:00:36 <asselin_> big thanks!
16:00:36 <patrickeast> let me know if you guys have more questions
16:00:41 <krtaylor> thanks everyone, great meeting!
16:00:44 <mmedvede> patrickeast: thank you, very good
16:00:56 <asselin_> patrickeast, a few. will ask offline
16:01:01 <krtaylor> #endmeeting
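
[editor's note] lennyb's suggestion of polling Jenkins over its JSON API can look something like the sketch below. The `/job/<name>/lastBuild/api/json` endpoint is standard Jenkins, with a `result` field of SUCCESS, FAILURE, etc.; the host and job names here are hypothetical:

```python
import json
from urllib.request import urlopen

def build_status(payload):
    """Pull the interesting fields out of a Jenkins lastBuild JSON payload."""
    data = json.loads(payload)
    return {
        "number": data.get("number"),
        "result": data.get("result"),        # None while the build is running
        "building": data.get("building", False),
    }

def fetch_last_build(base_url, job):
    """Query Jenkins for a job's last build, e.g.
    fetch_last_build("http://jenkins.example.com:8080", "dsvm-tempest-full")
    (hypothetical host and job name)."""
    url = "%s/job/%s/lastBuild/api/json" % (base_url, job)
    with urlopen(url) as resp:
        return build_status(resp.read())
```

Looping `fetch_last_build` over your jobs and counting consecutive FAILUREs would give the "alert only after N failed jobs" behavior lennyb mentions.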
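
[editor's note] The log-parsing idea behind clean_purity.py — scan the cinder logs for resource IDs touched during the run, then delete any leftovers on the array — might start with a UUID scan like this. This is a sketch only; the real script's matching and deletion logic is its own:

```python
import re

# Cinder volume IDs are UUIDs; this pattern spots them in log text.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b")

def leftover_volume_ids(log_text):
    """Return the unique volume-looking UUIDs mentioned in a cinder log."""
    return sorted(set(UUID_RE.findall(log_text)))
```

A cleanup pass would then compare these IDs against what is still present on the backend and remove the strays — important, as patrickeast notes, when the array has a small maximum volume count.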