19:03:49 <jeblair> #startmeeting infra
19:03:50 <openstack> Meeting started Tue Nov 19 19:03:49 2013 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:54 <openstack> The meeting name has been set to 'infra'
19:03:58 <jeblair> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:04:01 <krtaylor> o/
19:04:40 <jeblair> #link http://eavesdrop.openstack.org/meetings/infra/2013/infra.2013-11-12-19.03.html
19:04:53 <jeblair> #topic Actions from last meeting
19:05:07 <jeblair> #action jeblair move tarballs.o.o and include 50gb space for heat/trove images
19:05:10 <jeblair> clarkb: etherpad?
19:05:32 <clarkb> etherpad-dev is dead, didn't get to etherpad.o.o yet. Plan to kill it post meeting
19:05:55 <clarkb> that work kept getting bumped for more important things over the week :/
19:06:33 <jeblair> clarkb: what are the things you want to double check before killing?
19:06:44 <clarkb> jeblair: that the db backups properly overlap
19:06:47 <jeblair> clarkb: (and more specifically, anything you need help/coordination with)
19:06:55 <clarkb> so that I don't lose any potentially useful data
19:07:03 <clarkb> I shouldn't need help with that
19:07:22 <jeblair> cool
19:07:31 <jeblair> #topic Trove testing (mordred, hub_cap)
19:07:55 <hub_cap> hey. nothing to report here. still getting caught up w/ trove
19:08:21 <hub_cap> going to touch base in the next few days tho regarding resuming progress :)
19:08:30 <jeblair> ok. makes sense, given the schedule of the past couple weeks.
19:08:33 <jeblair> hub_cap: sounds good!
19:08:44 <jeblair> #topic Tripleo testing (lifeless, pleia2)
19:09:02 <pleia2> working through some basics with dprince and derekh, nothing to report on the infra side at the moment though
19:10:31 <mordred> o/
19:10:34 <jeblair> #topic Wsme / pecan testing (sdague, dhellmann)
19:10:56 <jeblair> sdague: might be busy/afk today?
19:10:59 <clarkb> yup
19:11:02 <jeblair> dhellmann: ping
19:11:05 <clarkb> I do have an update from the d-g side though
19:11:07 <fungi> all week i think sdague said
19:11:10 <dhellmann> hi
19:11:20 <clarkb> to get the havana periodic/bitrot jobs in I have started my refactoring of the d-g jobs
19:11:40 <clarkb> I am trying to do one small step at a time for my sanity and for reviewers' sanity
19:11:50 <fungi> there was also a suggestion i think that sqlalchemy-migrate ought to get the same love as pecan and wsme as far as dependency gating goes?
19:12:01 <clarkb> but I think a path towards having better d-g job templates is emerging which should make the wsme jobs easy
19:12:07 <dhellmann> we are trying really hard to get rid of sqlalchemy-migrate
19:12:17 <fungi> that sounds like a better solution anyway ;)
19:12:38 <dhellmann> we have https://review.openstack.org/#/c/54333/ open for running tox tests for pecan to gate against wsme, ceilometer, and ironic
19:12:51 <mordred> well, we are - but I think that adding pecan/wsme-like gating to it would be good, based on the recent breakage
19:13:00 <dhellmann> ok, I wasn't aware of any breakage
19:13:16 <mordred> s-m made a release and it broke things :)
19:13:22 <dhellmann> more testing is better, I was just trying to help avoid extra work
19:14:28 <anteaya> mordred: s-m?
19:14:37 <clarkb> if the JJB job refactor ends up where I would like to have it end up it may not be a whole lot of extra work
19:14:43 <jeblair> anteaya: sqlalchemy-migrate
19:14:49 <anteaya> k
19:15:03 <jeblair> clarkb: does the jjb refactor conflict with https://review.openstack.org/#/c/54333/5 ?
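[Editor's note: the d-g job templating clarkb describes might look roughly like the following Jenkins Job Builder fragment. This is only a hedged sketch — the template, builder, and variable names here are invented for illustration and are not the actual openstack-infra configuration.]

```yaml
# Illustrative JJB job-template; all names and builders are hypothetical.
- job-template:
    name: 'check-tempest-devstack-vm-{branch-designator}'
    node: devstack-precise

    builders:
      - devstack-checkout
      - shell: |
          #!/bin/bash -xe
          export DEVSTACK_GATE_TEMPEST=1
          export ZUUL_BRANCH_OVERRIDE={branch-override}
          ./devstack-vm-gate-wrap.sh

# Instantiating the template for another branch is then a small, mechanical change:
- project:
    name: devstack-gate-jobs
    jobs:
      - 'check-tempest-devstack-vm-{branch-designator}':
          branch-designator: 'stable-havana'
          branch-override: 'stable/havana'
```

[With templates shaped like this, adding jobs for a new stable branch reduces to one new template instantiation plus the corresponding zuul layout.yaml entries, which appears to be the simplification clarkb is after.]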
19:15:23 <clarkb> jeblair: it shouldn't, I am talking about the d-g jobs specifically
19:15:28 <jeblair> clarkb: k
19:15:37 <clarkb> since they have grown very unwieldy and need better templating
19:16:34 <jeblair> clarkb: so your change would make it easier to add full devstack testing for wsme, which is additional to the tox jobs in 54333
19:16:41 <clarkb> yup
19:16:47 <jeblair> cool, makes sense
19:17:10 <jeblair> end-of-topic?
19:17:25 * dhellmann has nothing to add
19:17:30 <jeblair> dhellmann: thanks
19:17:36 * mordred thins refactor is great
19:17:39 <jeblair> clarkb: is etherpad still a separate topic, or already covered?
19:17:44 <clarkb> jeblair: covered
19:17:46 <jeblair> mordred: thin refactors are the best
19:17:53 <jeblair> #topic Retiring https://github.com/openstack-ci
19:18:11 <jeblair> who wanted to talk about this?
19:18:26 <fungi> we talked about it, but wanted to get your input too
19:18:26 <pleia2> someone stumbled upon it recently and it was confusing
19:18:55 <pleia2> want to at least see about deleting publications and gerrit-trigger-plugin
19:19:05 <pleia2> or close the whole thing down
19:19:17 <jeblair> pleia2: i think there is more history in publications that needs to be pushed into infra/pub
19:19:22 <clarkb> or, add a dummy project with a README that says go to openstack-infra instead
19:19:31 <jeblair> clarkb: like https://github.com/openstack-ci/moved-to-openstack-infra
19:19:32 <fungi> clarkb: that's already there
19:19:37 <clarkb> ah
19:19:54 <jeblair> pleia2: the existing historical publications need to be pushed and tagged so they show up
19:19:59 <jeblair> pleia2: then i think publications can go away
19:20:04 <pleia2> ok
19:20:08 <jeblair> that leaves gerrit-trigger-plugin
19:20:25 <fungi> jeblair: i thought we retained the history for pubs and then cleaned it, so the master branch still has those commits, they just need tags
19:20:45 <fungi> or better, separating into branches
19:20:50 <fungi> first
19:21:19 <jeblair> fungi: yeah, i think to be compatible with the new system, we need to make some new commits that move each pub into a top level and then tag those
19:22:08 <jeblair> gerrit-trigger-plugin is a genuine fork; i'm not sure if all our changes were upstreamed
19:22:35 <mordred> I don't believe they were
19:22:41 <jeblair> i think we need to figure out the status of that, and decide whether it's a useful historical artifact
19:23:22 <fungi> so it sounds like we have a couple bugs/action items out of this topic... branchify/tag the old pubs, and decide the fate of g-t-p
19:23:37 <jeblair> yep
19:24:06 <jeblair> i'm not in a position to volunteer for those right now, but will certainly do publications if no one gets to it first.
19:24:18 <pleia2> I could use some practice with tag/branch fun if someone will be available to answer questions as I go
19:24:27 <fungi> pleia2: i can help you on that
19:24:32 <clarkb> pleia2: happy to
19:24:40 <pleia2> ok cool, action me for digging into publications then
19:24:55 <jeblair> #action pleia2 add historical publications tags
19:25:00 <jeblair> pleia2: thanks!
19:25:19 <jeblair> #action jeblair file bug about cleaning up gerrit-git-prep repo
19:25:21 <jeblair> gah
19:25:25 <jeblair> #action jeblair file bug about cleaning up gerrit-trigger-plugin
19:25:35 <jeblair> #topic Savanna testing (SergeyLukjanov)
19:25:43 <jeblair> SergeyLukjanov: ping
19:25:51 <SergeyLukjanov> nothing interesting to report atm
19:25:52 <SergeyLukjanov> o/
19:25:58 <SergeyLukjanov> is it ok to make a job to build images using savanna-image-elements and publish them to tarballs.o.o?
19:26:05 <SergeyLukjanov> just to clarify
19:26:26 <jeblair> SergeyLukjanov: absolutely; i believe trove will need to do something similar
19:26:29 <jeblair> mordred, hub_cap: ^
19:26:31 <mordred> yah
19:26:33 <mordred> that's correct
19:26:38 <mordred> we'd like to generalize that
19:26:54 <SergeyLukjanov> yep, we talked about this need with hub_cap at summit
19:27:22 <SergeyLukjanov> and eventually it'll be great to run integration tests on these images...
19:27:40 <ruhe> SergeyLukjanov, would it be possible to run Savanna integration tests on those images before they get published?
19:28:05 <SergeyLukjanov> ruhe, we can publish master and release images I think
19:28:25 <SergeyLukjanov> and eventually add integration tests to the gate pipeline
19:28:39 <SergeyLukjanov> once we sort out 'hadoop vs. nested virt.'
19:28:46 <fungi> similar to how we do branch-tip, pre-release and release tarballs i expect
19:29:04 <SergeyLukjanov> so, nothing to add from my side, still hope to start creating CRs this week
19:29:44 <SergeyLukjanov> fungi, yep, it should work ok
19:30:06 <jeblair> SergeyLukjanov: ok cool, thanks
19:30:09 <jeblair> #topic Goodbye Folsom (ttx, clarkb, fungi)
19:30:18 <fungi> #link http://lists.openstack.org/pipermail/openstack-stable-maint/2013-November/001723.html
19:30:27 <fungi> i've officially eol'd the integrated projects
19:30:34 <mordred> woot
19:30:40 <pleia2> \o/
19:30:55 <fungi> still need to know what we're supposed to do (if anything) with things like devstack, grenade, oslo-i, manuals, tempest, reqs
19:30:57 <jeblair> yaaay!
19:31:09 <jeblair> fungi: kill them all
19:31:14 <clarkb> with fire!
19:31:24 <fungi> they have stable/folsom branches. tag then delete, same as the rest?
19:31:51 <jeblair> fungi: i believe that is the correct thing to do. note that many of them will have significant trouble landing patches to those branches now. :)
19:32:17 <fungi> and the topic more properly should have been goodbye folsom, hello havana since clarkb worked on getting the new stable/havana jobs in last week
19:32:18 <clarkb> definitely reqs, devstack and tempest
19:32:31 <clarkb> since we can't effectively test them on folsom anymore
19:32:42 <clarkb> oh and grenade, might as well do them all
19:32:51 <fungi> clarkb: i approved your job templating last night and it seems to have worked
19:33:11 <clarkb> yes we have periodic/bitrot jobs for havana now and changes to d-g are tested against havana and grizzly as well as master
19:33:33 <fungi> though dprince's grenade change is still grinding in the gate of doom
19:33:42 <fungi> #link https://review.openstack.org/57066
19:34:32 <fungi> (to move the base for master tests to havana instead of grizzly)
19:34:46 <jeblair> fungi, clarkb, dprince: thanks for this!
19:34:56 <jeblair> #topic Jenkins 1.540 upgrade (zaro, clarkb)
19:35:10 <dprince> fungi: I am actually a bit perplexed by my grenade failure there.
19:35:46 <dprince> fungi: can take that offline w/ sdague/dtroyer perhaps though
19:35:58 <fungi> dprince: i suspect it's just the volume of tempest tests being run against d-g coupled with the nondeterminism in the gate right now
19:36:02 <zaro> the reason we want to upgrade jenkins is that there was a major fix to reduce the number of threads by 75%
19:36:18 <sdague> dprince, there is a grenade patch that needs to land as well for that to work
19:36:25 <clarkb> and with recent jenkins trouble, we figure it is worth a shot to upgrade and run a jenkins that doesn't have so much overhead
19:36:27 <sdague> maurosr was working on it, I don't know its status
19:36:47 <anteaya> sdague: hello
19:36:49 <zaro> clarkb has latest jenkins on jenkins-dev.o.o
19:36:53 <dprince> fungi: thanks, sdague: can you point me to it... or in the right direction?
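[Editor's note: the "tag then delete" EOL procedure fungi mentions can be demonstrated locally. Below is a minimal sketch using a throwaway repository; in production the tag would be signed and pushed to Gerrit before the remote branch is removed, and the names here are illustrative.]

```shell
# Scratch repo standing in for a real project; the EOL tag preserves
# the branch tip so its history stays reachable after branch deletion.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=eol -c user.email=eol@example.com \
    commit -q --allow-empty -m "initial commit"
git branch stable/folsom

# Tag the branch tip, then delete the branch.
git tag folsom-eol stable/folsom
git branch -D stable/folsom

git tag -l folsom-eol          # the tag remains
git branch --list 'stable/*'   # the branch is gone (prints nothing)
```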
19:36:58 <clarkb> I upgraded jenkins-dev yesterday, it seems to be fine, but the jenkins test script in config is very old d-g and old zuul oriented
19:37:03 * maurosr reading
19:37:34 <fungi> and the release notes for new jenkins mention a lot of churn in parts of the api we use, i think
19:37:35 <clarkb> jeblair: fungi: mordred: any opinions on how we should test jenkins-dev? (they upgraded the ssh-slaves and credentials plugins that are bundled with jenkins and changed some of the permissions around node creation/update/delete)
19:37:47 <mordred> oy. that sounds like fun
19:37:56 <jeblair> clarkb: i think the api calls it exercises should still be about the same, right? add/remove nodes, etc...
19:38:14 <clarkb> jeblair: possibly. the actual calls are made in the zuul and d-g source code
19:38:23 <sdague> maurosr, this is the patch that takes the version specific upgrade scripts into that separate directory, instead of just based on the branch we're in in grenade
19:38:25 <jeblair> clarkb: ah, heh.
19:38:27 <clarkb> jeblair: so I think rewriting it to use nodepool is what we may need to do?
19:38:43 <clarkb> or, and this was my crazy idea this morning,
19:38:47 <anteaya> could we introduce a new jenkins master with the upgrade? and then move over from there?
19:38:54 <clarkb> maybe we can point prod nodepool at jenkins-dev?
19:39:07 <clarkb> I don't think prod nodepool at jenkins-dev will work due to ssh key mismatches
19:39:11 <maurosr> sdague: yup, I'm finishing it, I can submit it today (yesterday and friday I had some troubles, and the holiday didn't let me work on it)
19:39:34 <maurosr> but will submit it today for sure
19:39:47 <jeblair> clarkb: so maybe we need a nodepool-dev? ugh.
19:39:50 <maurosr> replacing the other one which cleans everything
19:39:52 <clarkb> anteaya: that is one possibility but we should be able to do an upgrade in place
19:40:03 <clarkb> jeblair: that was another thing I considered, possibly just run it on jenkins-dev
19:40:04 <fungi> clarkb: mordred has had success getting nodepool running on his laptop... maybe something similar could be pointed at jenkins-dev instead of needing a nodepool-dev server?
19:40:25 <jeblair> colocating the nodepool-dev service on jenkins-dev sounds fine
19:40:42 <fungi> oh, or locally installed on jenkins-dev itself. yeah, not a bad idea at all
19:40:59 <anteaya> clarkb k
19:41:25 <jeblair> clarkb: sound like a plan?
19:41:27 <clarkb> sure
19:41:38 <clarkb> zaro: any chance you want to work on the puppet to do that?
19:41:38 <jeblair> cool
19:42:02 <zaro> clarkb: can do.
19:42:09 <jeblair> clarkb, zaro: in fact, i think that may actually be how the old devstack-gate stuff was set up on jenkins-dev.
19:42:31 <clarkb> #action zaro setup dev nodepoold on jenkins-dev
19:42:39 <jeblair> #topic New devstack job requirements (clarkb)
19:43:28 <clarkb> one of the things that came out of adding havana d-g jobs was that we have two large classes of d-g jobs. There are d-g jobs that run against patches and d-g jobs that run periodically against the tip of $branch
19:44:09 <clarkb> I would like to propose that we require any new d-g jobs to supply both forms as two different templates so that when icehouse rolls around adding d-g jobs for it is as simple as updating zuul layout.yaml
19:44:36 <clarkb> with havana I was juggling a lot of missing pieces and I think staying on top of that through a cycle would be better
19:44:59 <jeblair> clarkb: i agree with the proposal in principle; i may want to see your refactor before agreeing to the specifics
19:45:00 <fungi> and i think the current state with regard to that is great now, after your last change went in
19:46:20 <clarkb> jeblair: that's fair, there is a little more work to coalesce the branch-specific check jobs with the rest of the check/gate jobs and stable branch periodic jobs with master periodic jobs
19:46:40 <clarkb> right now we have ~4 distinct classes of d-g job and I think I can roll that into two
19:46:53 <clarkb> so getting it to two is a good first step
19:47:56 <jeblair> cool; if there are any new d-g jobs while clarkb works on this, we should probably run those changes by him to make sure it fits with this work
19:48:09 <fungi> yes, agreed
19:48:26 <jeblair> #topic Increased space yet again on static.o.o (fungi)
19:48:39 <fungi> this was more of a public service announcement
19:48:52 <pleia2> time to add some disk space monitoring?
19:49:02 <jeblair> fungi: was this during the summit, or once again, afterwards?
19:49:18 <fungi> i increased the logs volume during the summit to 4tb, but then caught it again last week just before it filled up and pushed it to 5tb
19:49:28 <fungi> http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=717&rra_id=all
19:49:36 <fungi> er
19:49:41 <fungi> #link http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=717&rra_id=all
19:50:06 <fungi> i confirmed the weekly deletion cron job is working as intended to expire 6-month-old content
19:50:21 <fungi> we're just on an ever-increasing treadmill of log collection
19:50:44 <jeblair> fungi: that's exciting.
19:50:54 <fungi> similarly, docs-draft is up to 400gb now
19:51:11 <clarkb> we have a giant firehose filling water balloons
19:51:24 <fungi> anyway, that's all i wanted to mention on the topic
19:51:35 <jeblair> fungi: i think we should make those volumes be at 50%
19:51:44 <fungi> jeblair: i'm happy to do that
19:51:50 <jeblair> fungi: cool, thanks
19:52:07 <fungi> #action fungi grow logs and docs-draft volumes so they're 50% full
19:52:09 <jeblair> i believe we had expected them to stabilize. we might have been wrong about that. :(
19:52:25 <clarkb> so I haven't been able to confirm this yet
19:52:48 <clarkb> but the nova logs exploded in size (~8MB compressed for n-cpu??) due to the iso8601 logging
19:53:02 <clarkb> fungi: any idea if that caused a significant uptick in log size?
19:53:27 <fungi> clarkb: it was hard to tell, but i'd expect compression to water down that increase if it's repetitive lines
19:53:51 <jeblair> hrm, total artifact size by zuul job would be a nice metric; the new log uploading system could report that.
19:54:20 <fungi> also, there was no obvious uptick in utilization on the volume, just a fairly steady linear progression there
19:54:28 <clarkb> gotcha
19:54:29 <jeblair> #topic Open discussion
19:54:31 * zaro has a meeting topic, sorry came in late.
19:54:48 <zaro> new jjb release
19:55:05 <zaro> anybody have issues with that?
19:55:11 <clarkb> oh ya, we had someone submit a bug requesting a new release and I am +1 on doing it
19:55:23 <pabelanger> Also wanted to remind people about https://review.openstack.org/#/c/56107/ maybe get some feedback about the packaging import
19:55:35 <clarkb> I do think we should get the new jjb-ptl group change in so that zaro and maybe others can do the releases
19:55:54 <clarkb> #link https://review.openstack.org/#/c/56823/
19:56:01 <jeblair> clarkb: jjb has a ptl?
19:56:04 <anteaya> I'm continuing to work in -neutron
19:56:15 <clarkb> jeblair: no, but read the change comments for why it was named that way :)
19:56:24 <clarkb> jeblair: mordred wanted consistency
19:56:49 * mordred doesn't feel strongly about it
19:56:53 <jeblair> clarkb: well, the group is named that way to remind us to keep it small. :)
19:57:09 <mordred> but I know we've pushed back on people before making a thing not called -ptl
19:57:13 <clarkb> yup
19:57:16 <clarkb> I am fine with the name
19:57:21 <jeblair> mordred: i'm not opposed to the name
19:57:22 <mordred> that said - perhaps -ptl is the wrong name for that role and -release-manager is a better name?
19:57:27 * mordred doesn't want to bikeshed too much
19:57:28 <jeblair> mordred: i'm not sold on the _idea_
19:57:33 <mordred> ah
19:57:35 <mordred> gotcha
19:57:56 <clarkb> jeblair: so I suggested it because openstack-ci-core has really taken a back seat in JJB reviews lately
19:58:09 <jeblair> clarkb: one of us has been on vacation
19:58:11 <clarkb> I think a subset of the jjb core group is in a better position to cut releases
19:59:50 <fungi> and that's about it for time
19:59:59 <jeblair> clarkb: the people who can make releases should definitely be a subset of the jjb core group. it currently is.
20:00:12 <jeblair> #endmeeting