19:02:48 <jeblair> #startmeeting infra
19:02:49 <openstack> Meeting started Tue Nov 26 19:02:48 2013 UTC and is due to finish in 60 minutes.  The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:50 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:53 <openstack> The meeting name has been set to 'infra'
19:02:57 <jeblair> #topic Actions from last meeting
19:03:12 <jeblair> this should shock no one
19:03:15 <jeblair> #action jeblair file bug about cleaning up gerrit-trigger-plugin
19:03:33 * fungi feigns shock
19:03:46 <jeblair> #action jeblair move tarballs.o.o and include 50gb space for heat/trove images
19:03:56 <jeblair> pleia2: how do you want to track the work on historical publications?
19:04:30 <jeblair> pleia2: want to just open a bug about it?
19:04:35 <pleia2> jeblair: sounds good
19:04:48 <pleia2> last week was kind of chaotic, so I didn't ask anyone to help me get started
19:04:49 <jeblair> funzo: how are static volumes?
19:04:52 <fungi> bit of a snag here...
19:05:02 <fungi> #link http://www.rackspace.com/knowledge_center/product-faq/cloud-block-storage
19:05:10 <fungi> "What's the maximum number of Cloud Block Storage volumes I can attach to a single server instance?"
19:05:16 <jeblair> fungi: gee, sorry about your nick there.
19:05:17 <fungi> "You may have up to 14 Cloud Block Storage volumes attached to a single Server."
19:05:26 <mordred> hahaha
19:05:29 <fungi> i like funzo
19:05:36 <fungi> could be my new friday nick
19:05:37 <jeblair> yay cloud?
19:05:52 <fungi> so anyway, yeah. suggestions? slowly migrate some pvs from 0.5t to 1t?
19:06:04 <jeblair> so we're at 5.5T out of a possible 14T
19:06:07 <jeblair> right?
19:06:12 <clarkb> that's the best thing I can come up with
19:06:13 <fungi> i can shift data from smaller to larger cinder volumes and replace them until we have enough
19:06:19 <jeblair> fungi: i think that's the way to go
19:06:31 <fungi> not a possible 14t either, no. see the faq ;)
19:06:39 <soren> I think that's a technical limitation.
19:06:41 <fungi> "The limit for Volumes and storage is 10 TB total stored or 50 volumes per region (whichever is reached first)."
19:06:58 <jeblair> also, jog0 and sdague merged a patch thursday that should _greatly_ reduce the log size
19:07:02 <fungi> soren: yes, i've seen a lot of references to xen being unable to present more than 16 block devices into a domu
19:07:09 <soren> Up to 8 partitions per disk, up to 16 disks, where two are assigned by Nova by default.
19:07:10 <fungi> at least older xen versions
19:07:11 <clarkb> jog0: sdague: have a link to that change?
19:07:27 <soren> ...and that's all you get with 256 minor numbers to choose from.
19:07:34 <jeblair> clarkb: it cleaned up the isomumblemumble log spam
19:07:42 <soren> Uh..
19:07:45 <jeblair> clarkb: something like 10G -> 2G
19:07:46 <clarkb> jeblair: nice
19:07:53 <mordred> so it's possible this might put more fuel on the 'figure out swift' fire?
19:07:54 <soren> Unless, of course you know how to do basic arithmetic.
19:08:05 * soren shuts up now before he makes more of an arse of himself.
19:08:06 <jeblair> mordred: i'm not sure that fire needs more fuel
19:08:13 <mordred> well, right :)
19:08:36 * mordred adds fuel to exploding gas bombs...
19:08:41 <fungi> so anyway, i'm happy to wiggle a few pvs this week, just wanted to make sure there weren't better options to get us short-term gains with minimal additional effort before i pressed ahead on that
19:08:49 <jeblair> jhesketh is not here, but i hope to get together with him soon to find out if he's planning on working on that and supply him with any help needed
19:09:01 <jeblair> fungi: i think that's the best thing to do now
19:09:06 <fungi> #action fungi grow logs and docs-draft volumes so they're 50% full
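(For reference, the migration fungi describes — swapping 0.5T cinder volumes for 1T ones underneath the existing LVM volume groups — would look roughly like the sketch below. The volume group name "main" and the device paths are placeholders rather than the real layout on the static server, and the new cinder volume is assumed to already be attached.)

    # move extents off an old 0.5TB PV onto a freshly attached 1TB cinder volume
    sudo pvcreate /dev/xvdh            # hypothetical device node of the new 1TB volume
    sudo vgextend main /dev/xvdh       # "main" is a placeholder VG name
    sudo pvmove /dev/xvdc /dev/xvdh    # migrate data off the old 0.5TB PV
    sudo vgreduce main /dev/xvdc       # drop the old PV from the VG
    sudo pvremove /dev/xvdc            # the old cinder volume can then be detached/deleted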
19:09:11 <jeblair> and hopefully we'll get on swift before it becomes an issue
19:09:20 <sdague> clarkb: https://review.openstack.org/#/c/56471/
19:09:30 <clarkb> #link https://review.openstack.org/#/c/56471/
19:09:35 <sdague> that's the review on the logs
19:09:39 <jeblair> and if we don't quite make that, we can start ratcheting down the logs we keep (1 year to 9 months, for instance)
19:09:48 <clarkb> jeblair: fungi ++ good intermediate fix
19:10:02 <mordred> jeblair: ++
19:10:07 <fungi> jeblair: don't we already only keep 6 months of logs?
19:10:16 <jeblair> fungi: maybe update the docs to say to only add 1TB volumes
19:10:24 <pleia2> fungi: yeah
19:10:27 <fungi> jeblair: will do
19:10:30 <jeblair> fungi: i thought it was 2 releases or 1 year.  i might be wrong.
19:10:39 <pleia2> it's 6 months, one release
19:10:41 <fungi> #action fungi update docs for static to recommend 1t cinder volumes
19:10:47 <jeblair> nope i'm wrong.
19:10:51 <jeblair> -type f -mtime +183 -name \*.gz -execdir rm \{\} \; \
19:10:57 <fungi> yeah, it's a metric crapton of logs
19:11:06 <jeblair> so, er, 4 months then, i guess.  yeesh.
19:11:10 <fungi> anyway, we've beaten this item to death for now
19:11:12 <jeblair> hopefully we won't need that.
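(The find fragment jeblair pasted above is the core of the log-expiry job; -mtime +183 is roughly six months, which matches pleia2's figure. A rough reconstruction of how the full command might read — the log root path and the surrounding cron wrapping are assumptions, not the real crontab:)

    # prune compressed job logs older than ~183 days (~6 months); the path is illustrative
    find /srv/static/logs -type f -mtime +183 -name '*.gz' -execdir rm {} \;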
19:11:22 <jeblair> #topic Trove testing (mordred, hub_cap)
19:11:33 <jeblair> mordred, hub_cap: what's the latest?
19:11:43 <mordred> jeblair: I have done nothing on this in the last week - hub_cap anything on your end?
19:13:04 <mordred> hub_cap: also, I'm a bit cranky that you have someone working on turning on your non-tempest tests when you have not finished getting tempest up and going
19:13:22 <jeblair> wow
19:14:38 <mordred> hub_cap: in fact, I'm sorry my brain didn't fire on this properly the other day - I believe we should -2 any changes from you that add support for your other thing until you've got tempest wired up
19:14:50 <jeblair> mordred: is there such a change?
19:15:01 <hub_cap> hey i'm ok w/ that. i just figured you wanted both and i'd rather he do the legacy tests
19:15:04 <mordred> jeblair: we told them how to do it a day or two ago
19:15:17 <mordred> we want everything - but you need tempest tests to be integrated
19:15:21 <mordred> so you should really get those going
19:15:26 <hub_cap> i didnt know there was an order
19:15:28 <mordred> and I want your legacy tests deleted
19:15:33 <hub_cap> yes mordred i agree
19:15:33 <mordred> because O M G
19:15:46 <hub_cap> i know..... i know :)
19:15:48 <mordred> so, let's get your tempest stuff wired up, _then_ we can get your additional things added
19:15:51 <mordred> k?
19:15:52 <hub_cap> kk
19:16:10 <hub_cap> good by me!
19:16:11 <jeblair> hub_cap, mordred: when do you think that might happen?
19:16:32 <mordred> hub_cap: what are we blocking on for that for you right now? anything on my end?
19:16:42 <hub_cap> nothing is blocking other than me not doing the work
19:16:44 <hub_cap> thats the blocker heh
19:16:53 <mordred> great. I'll start poking you more then
19:16:56 <hub_cap> dirty
19:17:16 <hub_cap> :)
19:17:23 <mordred> #action mordred to harass hub_cap until he's written the tempest patches
19:17:53 <jeblair> #topic Tripleo testing (lifeless, pleia2)
19:18:09 <lifeless> hi
19:18:29 <pleia2> continuing to patch tripleo for the multi test environments, and working on networking now
19:18:30 <lifeless> uhm, we just did this in #openstack-meeting-alt
19:18:38 <lifeless> I wonder if we can dedup the topics somehow
19:18:41 <pleia2> indeed we did!
19:18:50 <pleia2> for more, see tripleo meeting :)
19:19:19 <jeblair> okay, i'm not sure we need that detail
19:19:26 <jeblair> pleia2: can you relate this to infra?
19:19:41 <pleia2> no updates for infra at this time
19:20:09 <jeblair> okay.  so 'still hacking on test-running-infra' is more or less the status
19:20:13 <pleia2> yep
19:20:27 <lifeless> progress is being made
19:20:33 <lifeless> visibly so
19:20:58 <jeblair> ok
19:21:13 <jeblair> #topic Savanna testing (SergeyLukjanov)
19:21:31 <SergeyLukjanov> working on setting up jobs for d-g
19:21:33 <mordred> I've seen patches for this
19:21:36 <jeblair> so i think we basically just told SergeyLukjanov to wait just a bit for the savanna devstack tests
19:21:42 <jeblair> which is an unusual thing for us to do!
19:21:47 <mordred> jeblair: yay us!
19:22:02 <jeblair> but clarkb is doing a d-g refactor, and we want to get the savanna jobs into that refactor
19:22:09 <SergeyLukjanov> I'll rebase my change on clarkb's
19:22:11 <clarkb> but there is a good reason for that. I am reasonably happy with how the d-g job refactor turned out and want people building on top of that
19:22:11 <jeblair> so hopefully it shouldn't be long
19:22:22 <jeblair> i look forward to reviewing it! :)
19:22:47 <SergeyLukjanov> and I already have some draft code of api tests for savanna
19:23:04 <SergeyLukjanov> and hope to make a patch later this weel to tempest
19:23:12 <SergeyLukjanov> later this week*
19:23:33 <jeblair> SergeyLukjanov: excellent! sdague ^
19:23:46 <sdague> cool
19:24:12 <jeblair> #topic Goodbye Folsom (ttx, clarkb, fungi)
19:24:19 <fungi> heyhey
19:24:23 <clarkb> it's gone!
19:24:31 <mordred> woohoo
19:24:33 <jeblair> i bring it up again because we linked to this review last week: https://review.openstack.org/#/c/57066/
19:24:36 <fungi> yep, folsom is gone, havana tests are all implemented now
19:24:41 <jeblair> and it's abandoned due to -1
19:24:47 <fungi> oh
19:24:56 <jeblair> so i wanted to check: is the grenade situation straightened out, or do we have work to do still?
19:25:10 <sdague> it's still in process
19:25:17 <fungi> i think there is more to do. dprince?
19:25:41 <fungi> ahh, right. sdague, you said there was still some grenade code support missing for that change?
19:26:00 <sdague> yeh, the first one is maybe merging today
19:26:34 <SergeyLukjanov> sdague, just to clarify - is it ok to start with only /smth api tests and client?
19:26:44 <SergeyLukjanov> sdague, I mean with only one 'endpoint'
19:27:06 <jeblair> #link https://review.openstack.org/#/c/57066/
19:27:18 <jeblair> sdague: can you link the change you're referring to?
19:27:22 <sdague> SergeyLukjanov: yeh, I think so, but it will take looking at the patch when it comes in to be sure
19:27:58 <SergeyLukjanov> sdague, sure, thx
19:28:07 <sdague> jeblair: https://review.openstack.org/#/c/57744/
19:28:12 <jeblair> #link https://review.openstack.org/#/c/57744/
19:28:44 <dprince> fungi: sorry, too many meetings, let me catch up here...
19:29:37 <ttx> yeeha
19:30:20 <dprince> fungi: https://review.openstack.org/#/c/57066/
19:30:38 <dprince> fungi: I was waiting on some other grenade core fixes first though.
19:30:44 <fungi> dprince: right, that was the question. sdague caught us up
19:31:08 <fungi> so i think we're cool on this topic for the moment?
19:31:18 <dprince> fungi: Cool. FWIW this stuff is actually blocking parts of a Nova patch series for me. So I'll know when it's done!
19:31:22 <clarkb> sounds like it
19:32:08 <jeblair> #topic Jenkins 1.540 upgrade (zaro, clarkb)
19:32:31 <jeblair> i think the change to get nodepool up is in progress
19:32:42 <clarkb> yup zaro pushed that
19:32:50 <jeblair> so let's check back in on this later
19:32:57 <jeblair> #topic New devstack job requirements (clarkb)
19:33:11 <jeblair> clarkb: you have a change up, yeah?
19:33:14 <clarkb> I do
19:33:31 <clarkb> #link https://review.openstack.org/#/c/58370/
19:33:36 <jeblair> when i find changes to devstack-gate.yaml, i've been suggesting they coordinate with you
19:33:53 <jeblair> except if they are for non-official projects, in which case i think we want those jobs in a different file
19:34:03 <clarkb> jeblair: thank you, it has been helpful, I have been leaving a long comment on those changes linking back to 58370 with details
19:34:05 <jeblair> (and in that case, i don't think they need to consider this refactor)
19:34:22 <clarkb> jeblair: right I think they can continue to abuse devstack for whatever reason
19:35:03 <clarkb> 58370 is handy because it clearly shows how to write devstack gate job templates that are useful in all the places we want to use them: check, gate, check on stable branch for d-g changes, and periodic bitrot jobs for releases
19:35:17 <clarkb> and we cover all of these bases with two templates per logical job. Overall I am pretty happy with it
19:35:36 <clarkb> also WSME/Pecan can overload branch-designator to have special jobs just for them that are otherwise identical to the gate jobs
19:35:54 <clarkb> so this will help us integrate the world
19:36:21 <jeblair> clarkb: how does that work? (i haven't looked at the change)  what would they set, for example?
19:37:21 <clarkb> jeblair: the tail end of the job names has -{branch-designator} in it. I have been using that to distinguish between stable-grizzly and stable-havana and master for periodic jobs and use -default for things that run against proposed changes on the proposed branch
19:37:42 <jeblair> got it
19:37:47 <clarkb> jeblair: WSME/pecan could put wsme-default in that var instead and get a new job that is otherwise identical but that zuul won't put in the gate with everyone else
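(A toy illustration of the branch-designator parameterization clarkb describes, written as a shell snippet that feeds a made-up job-template to jenkins-job-builder's test mode. The template, project, node, and builder contents here are hypothetical simplifications; change 58370 has the real definitions.)

    cat > example.yaml <<'EOF'
    - job-template:
        name: '{pipeline}-tempest-dsvm-full-{branch-designator}'
        node: devstack-precise
        builders:
          - shell: 'echo devstack-gate would run here'

    - project:
        name: example-project
        jobs:
          - '{pipeline}-tempest-dsvm-full-{branch-designator}':
              pipeline: check
              branch-designator: default          # job run against proposed changes
          - '{pipeline}-tempest-dsvm-full-{branch-designator}':
              pipeline: periodic
              branch-designator: stable-havana    # bitrot job pinned to a branch
    EOF
    jenkins-jobs test example.yaml -o output/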
19:38:18 <jeblair> #topic  Jenkins Job Builder Release (zaro, clarkb)
19:38:26 <jeblair> #link https://pypi.python.org/pypi/jenkins-job-builder/0.6.0
19:38:28 <jeblair> exists ^
19:38:39 <clarkb> oh cool so that got done ++ for doing that
19:38:44 <jeblair> cut maybe an hour ago
19:38:57 <jeblair> #topic  Puppetboard (anteaya, Hunner, pleia2)
19:38:57 <clarkb> I can update the bug that asked us to cut a release if that hasn't been done yet
19:39:01 <jeblair> clarkb: ++
19:39:05 <pleia2> o/
19:39:18 <pleia2> so last week Hunner gave anteaya and I an internal demo of puppetboard
19:39:23 <fungi> clarkb: and probably need to switch other outstanding bugs from committed to released too
19:39:34 <pleia2> it's pretty cool, has published logs from servers, basic stats
19:39:49 <pleia2> faster than dashboard, but does require use of puppetdb (which we don't currently use)
19:39:56 <Hunner> I have code, but still working out the apache manifest stuff (it's an older module version that my prototype used). So not yet pushed to review
19:40:13 <pleia2> did we have any other requirements?
19:40:19 <jeblair> where does puppetdb run?
19:40:23 <Hunner> Puppetdb is essentially: swap out the puppet.conf lines that post to the dashboard, and add the lines for sending to puppetdb
19:40:25 <pleia2> puppet master
19:40:29 <pleia2> (right?)
19:40:40 <Hunner> Currently the puppetdb would be running on the puppetboard box, since it's kind of related
19:40:58 <jeblair> okay, i like keeping the master simple.
19:41:02 <Hunner> Currently masters -> dashboard; in the future masters -> puppetdb -> puppetboard
19:41:07 <fungi> that makes the most sense to me. it's not privileged in any way, right?
19:41:08 <Hunner> Yeah, low impact to masters
19:41:17 <Hunner> It's not privileged, no
19:41:26 <fungi> yeah, better on the puppetboard system then
19:41:50 <jeblair> Hunner: i see that it stores facts and catalogs from each node; does that include hieradata?
19:42:02 <clarkb> it does run on postgres (which may or may not be a problem)
19:42:13 <Hunner> One point to note is: just like mysql running on the dashboard box, postgresql will run on the puppetdb/puppetboard box
19:42:16 <jeblair> Hunner: specifically, i'm wondering if this puts plaintext passwords on more systems.
19:42:30 <Hunner> jeblair: It does not store hieradata; just the facts, reports, and compiled catalogs
19:42:38 <jeblair> Hunner: great
19:42:42 <Hunner> Think of hieradata like manifests... neither of those are outputs
19:42:47 <Hunner> Only inputs
19:42:57 <Hunner> puppetdb stores the outputs
19:43:04 <Hunner> (facts, catalogs, reports)
19:43:08 <jeblair> *nod*
19:43:18 <Hunner> And is HTTP REST API queryable
19:43:35 <Hunner> But that's extra shiny that you don't need to care about
19:43:59 <fungi> i can see caring about it down the road. being able to query system status for other things could be very, very helpful
19:44:21 <Hunner> It's essentially trying to be a centralized "best guess" at the whole infrastructure
19:44:32 <Hunner> So yeah, useful down the road for what you say
19:45:03 <Hunner> But as a gui report server ("What's failing? Oh...") it works great
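(A rough sketch of the puppet.conf swap Hunner describes — pointing the masters' report and catalog storage at puppetdb instead of the dashboard. The hostname is a placeholder, the dashboard-era lines being removed are not shown, and a real setup also needs the puppetdb terminus package and a routes.yaml for facts.)

    # on each puppet master: send reports and catalogs to puppetdb
    sudo tee -a /etc/puppet/puppet.conf <<'EOF'
    [master]
    reports = store,puppetdb
    storeconfigs = true
    storeconfigs_backend = puppetdb
    EOF
    # point the puppetdb terminus at the host running puppetdb/puppetboard
    sudo tee /etc/puppet/puppetdb.conf <<'EOF'
    [main]
    server = puppetboard.example.org
    port = 8081
    EOF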
19:45:32 <jeblair> the postgres thing is kind of a bummer, since we're trying to move to cloud databases (which are only mysql right now), but i think we can live with it.
19:46:08 <jeblair> so what's next?  wait for Hunner to finish apache manifests?
19:46:24 <Hunner> Yep. Hopefully have a review by next meeting
19:46:52 <jeblair> pleia2: any other questions?
19:46:57 <jeblair> Hunner: thank you very much!
19:47:03 <pleia2> that's it from me
19:47:10 <fungi> i take it postgres is a hard requirement (substring search or other non-mysql feature)
19:47:10 <pleia2> thanks Hunner
19:47:34 <Hunner> fungi: At this time yes. I could ask for the details and share, since I'm curious too
19:47:42 <fungi> cool. thanks!
19:47:55 <jeblair> #topic Multi-node testing (mestery)
19:47:55 <Hunner> The backend is actually swappable, though the only two existing backends are postgres and in-memory
19:48:05 <mestery> hi
19:48:32 <mestery> So, anteaya set me up with this slot to discuss this, mostly wanted to ask some questions and get some direction around this.
19:49:22 <mordred> gah. had a network glitch. someone talk to me about hard reqs for postgres at some point please
19:49:22 <jeblair> Hunner: maybe you can pass on what you find to mordred
19:49:30 <jeblair> mestery: what are your questions?
19:49:46 <mestery> For the Neutron ML2 plugin, we would like to do gate testing in a multi-VM environment to test out the tunneling capabilites and new drivers in ML2.
19:49:55 <mestery> So a) is that possible today (multi-node testing in the gate)?
19:50:09 <mordred> Hunner: yeah - grab me offline and let's chat about  that - I'll try not to troll you tooo much
19:50:50 <mordred> mestery: we're working on some solutions around that in the tripleo testing workstream
19:51:03 <jeblair> mestery: a) it's not possible today
19:51:06 <mestery> mordred: Great!
19:51:24 <mordred> a) it's not ready yet - but something tells me that we'll have to cook up the same things to do it outside of that workstream, so perhaps you guys could lend a hand to that if you have some bandwidth
19:51:37 <mestery> mordred: Perfect, I think that would be good!
19:51:42 <jeblair> mordred: i'm hesitant to suggest that as a solution; i'm not sure it will suffice
19:51:42 <mestery> We have some ML2 folks who could help with this.
19:51:57 <fungi> mestery: i take it it's not feasible to test out tunneling entirely within a single devstack install on one machine
19:52:15 <jeblair> mordred: afaik, the multi-node tripleo work is focused on a limited set of multi-node hardware environments
19:52:23 <fungi> (i.e. each network device as a vm in devstack)
19:52:28 <mordred> jeblair: that is a good point
19:52:29 <mestery> fungi: We can test tunneling perhaps, but we need multiple nodes to run agents on each node for testing as well.
19:52:47 <mestery> fungi: The thing we want to test is the agent communication as well and that path.
19:53:07 <jeblair> mordred: and at the moment, the work the tripleo folks are doing is not intended to be reusable in infra
19:53:15 <fungi> in this case "nodes" means more than one neutron controller instance?
19:53:18 <mestery> I apologize, I'm still learning a lot of this infra code as well, so please bear with me. :)
19:53:45 <pleia2> mestery: dprince, lifeless, derekh and I are having a status chat on Google+ about our work with tripleo testing in a bit (maybe right after this meeting? I can ping you), you're welcome to be a fly on the wall if you want to see where we are at
19:53:59 <mestery> fungi: No, one control node, multiple compute nodes.
19:54:17 <jeblair> pleia2: any chance you guys could open that up a bit?
19:54:19 <fungi> i wonder if we could set up more than one nova service on one devstack machine
19:54:25 <pleia2> jeblair: oh yeah, anyone is welcome
19:54:27 <jeblair> pleia2: we don't use g+ for openstack development
19:54:31 <mestery> pleia2: Cool! Shoot me the info, if I can make it I will join.
19:54:45 <pleia2> jeblair: oh, that, maybe we should use the asterisk
19:55:27 <jeblair> mestery: i'd like to get you some straightforward answers to your question
19:55:42 <mestery> jeblair: So it sounds like it's not supported, and it will take some work to make it happen?
19:56:06 <jeblair> mestery: please feel free to check out what the tripleo folks are doing, but i do not want you to be misled into thinking that is the shortest or most certain path to multi-node gate testing.
19:56:19 <mestery> jeblair: Understood, I'll do that. Thank you!
19:56:40 <fungi> and it's also worth exploring whether multi-node testing (in the sense we've been discussing) is necessary for what you want to test too
19:56:42 <jeblair> mestery: for the at-scale testing we do on virtual machines in the gate, we do need a multi-node solution
19:56:59 <mestery> jeblair: OK.
19:57:16 <mestery> fungi: It is necessary, because we need multiple Neutron nodes, 1 control node and one with neutron agents on it.
19:57:25 <jeblair> mestery: however, it's a few steps away, and probably won't be available for a while.  we need to have non-jenkins test workers running for that, and there are still some things to do before we can even get to that point.
19:57:49 <Hunner> For multi-node testing with virtual machine groups, there is rspec-system or its rewritten counterpart beaker-rspec. At least that's what we use (if I understand your requirements)
19:57:51 <mestery> jeblair: Understood. This came up in our Neutron ML2 meeting last week, thus my interest in talking to everyone here.
19:58:09 * ijw sneaks out of the woodwork
19:58:37 <clarkb> Hunner: the problem here is we use public cloud resources that intentionally hobble the networking things we can get away with
19:58:40 <ijw> I've run VM groups for Openstack testing by starting a VM that, in turn, installs and starts others.  I wasn't using devstack, I was using something stackforge-based, but there's no reason why devstack wouldn't also work.
19:59:05 <clarkb> and we are trying to test networking things
19:59:19 <Hunner> Sounds good. Carry on
19:59:58 <mestery> ijw: Thanks, will check that out. Some things to think about here, thanks everyone.
20:00:32 <jeblair> the problem statement for us is more like: we have a pool of nodes that are all online and connected to zuul -- how do we get a collection of those running a single job.
20:00:41 <jeblair> we brainstormed about that a bit at the summit and have some ideas
20:00:49 <jeblair> but regardless, they are still a few steps away
20:00:54 <jeblair> and our time is up
20:01:02 <jeblair> so thanks everyone, and sorry about the topics we didn't get to
20:01:12 <jeblair> if there's something urgent, ping in -infra
20:01:14 <jeblair> #endmeeting