19:01:18 <ianw> o/
19:01:20 <krotscheck> o/
19:01:20 <jeblair> #link agenda https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:01:21 <fungi> hey there
19:01:21 <jesusaurus> o/
19:01:34 <jeblair> #link previous meeting http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-09-09-19.04.html
19:01:48 <zaro> o/
19:01:59 <jhesketh> o/
19:02:40 <jeblair> we've established some areas of priority, to help reviewers target their work to make sure we keep moving on the big-picture things we decided on at the summit and meetup
19:02:48 <jeblair> so the first part of the meeting is addressing those
19:02:55 <jeblair> #topic Puppet 3 Migration
19:03:08 <jeblair> last week we said that by now, we hoped that we had a puppetmaster up and were checking hosts
19:03:17 <fungi> most of the nails are in puppet 2's coffin now
19:03:20 <jeblair> in fact what we did was just do the entire migration on thursday and friday
19:03:28 <anteaya> yay
19:03:42 <clarkb> and are now cleaning up the last few things
19:03:53 <nibalizer> good job everyone!
19:03:54 <nibalizer> http://puppetboard.openstack.org/fact/puppetversion
19:03:55 <fungi> yeah, i think we have changes proposed for all the cleanup now?
19:04:05 <anteaya> anything left that has to happen prior to Sept. 30?
19:04:21 <fungi> give puppet 2.7 a going away party
19:04:21 <jeblair> very much thanks to nibalizer who managed to get most of the prep for that done before we started
19:04:28 <anteaya> yay nibalizer
19:04:42 <fungi> yes, huge thanks nibalizer!
19:04:47 <nibalizer> woo, happy to help!
19:05:03 <nibalizer> what is the fate of ci-puppetmaster? will we be turning it off?
19:05:09 <jeblair> yep
19:05:20 <jeblair> did someone move the launch scripts over?
19:05:23 <fungi> it should be fine to turn off now, in fact
19:05:29 <fungi> jeblair: yeah, moved and tested
19:05:39 <clarkb> is there anything else on that host we might want?
19:05:48 <clarkb> (logs maybe?)
19:05:49 <krtaylor> o/
19:05:51 <jeblair> yeah, so we have passwords, hiera, launch scripts... i think that should be it.
19:06:12 <fungi> #link https://review.openstack.org/121654
19:06:21 <fungi> there's a test server you can qa if you like
19:06:47 <clarkb> fungi-test.o.o iirc
19:06:53 <clarkb> I hopped on it briefly and it looked good
19:07:11 <fungi> i test-drove the server a little and seemed sane to me too
19:07:28 <fungi> or i wouldn't have un-wip'd that change ;)
19:08:00 <jeblair> okay, so we can delete at our leisure
19:08:24 <anteaya> who will do that?
19:08:29 <nibalizer> cool
19:09:23 <fungi> anteaya: any of the infra root admins can. i'll do it once the cleanup changes merge, if nobody beats me to it
19:09:28 <anteaya> kk
19:09:31 <fungi> it takes, literally, seconds
19:09:34 <jeblair> #topic  Swift logs
19:10:21 <jeblair> #link https://etherpad.openstack.org/p/swift_logs_next_steps
19:10:27 <jeblair> jhesketh prepared that before the meeting ^
19:10:35 <jeblair> #link https://review.openstack.org/#/c/109485
19:10:41 <fungi> jhesketh you workhorse
19:11:04 <jeblair> oh cool, so i hadn't reviewed that because i thought we might still be experimenting with the test job
19:11:20 <jeblair> but if that's first on the list, i'm assuming that the experimental job is working and we're ready to proceed
19:11:43 <clarkb> it is working, but there was perceived slowness?
19:11:53 <jhesketh> yep, the experimental job has been working well for a while (although it hasn't been run regularly)
19:11:54 <jeblair> clarkb: in what way?
19:12:04 <clarkb> jeblair: in the fetching of logs
19:12:11 <jhesketh> clarkb, jeblair: the slowness is in the fetching
19:12:16 <clarkb> jeblair: I don't think we quantified it super well yet
19:12:21 <clarkb> maybe we should try doing that too?
19:12:28 <jhesketh> so os-loganalyze can be slow
19:13:04 <jeblair> do we want to quantify that with the current experimental job, or should we merge 109485 and work from there?
19:13:32 <clarkb> it will probably help to have the larger dataset?
19:13:32 <anteaya> I'm for merging and debugging
19:13:44 <fungi> having a broader sample set may help bring the performance issues to light, and perhaps inflict pain on people to improve them
19:13:48 <anteaya> if we merge the job will be run more frequently
19:13:49 <jeblair> jhesketh: we check disk first, right?  so 109485 isn't a behavior change on its own
19:13:51 <jhesketh> there's no reason to block on merging 109485 imo... Unless we want to back out of having logs in swift, we can work on speeding up serving in parallel
19:14:02 <clarkb> ++
19:14:17 <fungi> that change lgtm too
19:14:19 <jhesketh> jeblair: actually, that's a good point... we check swift first and fall back to disk
19:14:35 <jeblair> oh, so we'll actually end up making viewing logs for all python jobs slow
19:14:42 <jhesketh> maybe we should hold off, and/or move a less impacting jjb job over
19:14:53 <fungi> viewing and presumably indexing (for logstach workers?)
19:15:15 <clarkb> fungi: yeah the logstash workers will be hit but the pipelining should smooth it over for them
19:15:16 <fungi> obviously i meant logstache
19:15:26 <clarkb> I think the bigger concern is for humans looking to debug their test
19:15:31 <jeblair> yeah, let's go for reduced impact
19:16:02 <anteaya> what is a less impacting jjb job?
19:16:15 <clarkb> the infra jobs are all candidates imo
19:16:22 <anteaya> I agree with that
19:16:23 <jhesketh> okay, I'll take a todo to pick a less impacting job and also put up some swift vs disk comparisons
19:16:23 <clarkb> since we can/should be aware of this work
19:16:47 <fungi> we do seem to enjoy inflicting pain on ourselves first
19:16:54 <fungi> makes sense
19:16:59 <anteaya> it does seem to be a pattern
19:17:41 <jeblair> #action jhesketh rework 109485 to impact only infra jobs
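[editor's note] The lookup order discussed above (swift first, falling back to disk) and the swift vs disk comparison jhesketh took as a todo can be sketched roughly as follows. This is a minimal illustration, not os-loganalyze's actual code: the `fetch_log` function and the two fetcher callables are hypothetical names, and the in-memory dicts stand in for the real swift and filesystem backends.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical fetcher type: returns the log bytes, or None when the
# object is not present in that backend.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_log(path: str, from_swift: Fetcher, from_disk: Fetcher) -> Tuple[str, bytes, float]:
    """Try swift first, then fall back to disk.

    Returns (source, data, elapsed_seconds) so the two backends can be
    compared when quantifying the perceived slowness.
    """
    start = time.monotonic()
    source = "swift"
    data = from_swift(path)
    if data is None:
        # Not in swift: fall back to the copy on local disk.
        source = "disk"
        data = from_disk(path)
    if data is None:
        raise FileNotFoundError(path)
    return source, data, time.monotonic() - start


# Stand-in backends (assumptions for the example only).
swift_store = {"logs/a.txt": b"from swift"}
disk_store = {"logs/a.txt": b"from disk", "logs/b.txt": b"only on disk"}

source_a, data_a, _ = fetch_log("logs/a.txt", swift_store.get, disk_store.get)
source_b, data_b, _ = fetch_log("logs/b.txt", swift_store.get, disk_store.get)
```

With this ordering, any job whose logs are uploaded to swift is served from swift even when a disk copy exists, which is why moving only the infra jobs first limits the impact.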
19:18:06 <jeblair> #topic  Config repo split
19:18:25 <jeblair> #link https://review.openstack.org/#/c/121650/
19:18:42 <anteaya> I have put up a patch to create the new project-config repo
19:19:01 <jeblair> very cool :)
19:19:01 <anteaya> please share your thoughts to ensure I have the tests and acl file as you would like them to be
19:19:16 <anteaya> I am reading up on git filter branch and will be playing with it
19:19:41 <jeblair> anteaya: cool.  we want to import this from a repo built with filter branch, so we may want to wip your change until that is ready
19:19:45 <anteaya> once I feel confident that I can filter config so the selected repos are in their own repo and that they are removed from config
19:19:55 <anteaya> I will let you know so we can do the freeze and such
19:20:03 <anteaya> I can do that
19:20:14 <anteaya> if I can get some feedback on the tests and acl
19:20:23 <jeblair> we also have a bit of work to prepare for the (system-config) repo itself
19:20:25 <anteaya> I would like to get that confirmed before I move on
19:20:37 <anteaya> okay I will practice with filter branch
19:20:48 <anteaya> let me know what I can do to get config in shape
19:20:51 <jhesketh> should we also time this to a change to remove the said files from the config repo?
19:20:59 <anteaya> yes
19:21:05 <jeblair> we need to update quite a number of places where we currently reference the config files in there to use the new repo instead
19:21:08 <anteaya> they need to happen during the same freeze
19:21:14 <jeblair> and yeah, we should do all of those things around the same time
19:21:18 <jhesketh> ie have a dependent change pre-approved so it goes in at the same time, avoiding patches proposing to both repos
19:21:21 <anteaya> in my eyes
19:21:49 <jeblair> anteaya: so once you're done with the filter-branch, we should probably still delay merging that change until everything is ready to go at once
19:21:56 <jeblair> anteaya: and you can run filter-branch again right before we do it
19:22:08 <jeblair> to make sure the new repo has the latest changes
19:22:12 <anteaya> jeblair: yes
19:22:25 <anteaya> jeblair: I hope to have a command I can run anytime
19:22:29 <jeblair> cool
19:23:16 <jeblair> so the other part of this section is the lots-of-puppet-modules split
19:23:33 <jeblair> yesterday we switched over to running the new apply integration test
19:24:03 <jeblair> this is really cool -- we're using the zuul cloner to check out config and the puppet-storyboard module and then we run puppet apply
19:24:47 <clarkb> these are the first jobs to use the zuul cloner right?
19:24:49 <fungi> it's also a great next-step to further gutting ref management out of devstack-gate
19:24:49 <jeblair> (the clone mapper that hashar added to zuul cloner came in handy, it lets us map "openstack-infra/puppet-storyboard" into /etc/puppet/modules/storyboard in one line of yaml)
19:25:05 <jeblair> clarkb, fungi: yup
19:26:05 <nibalizer> i have one review up to split a module out, it has some feedback and i'll put a new patchset up soon
19:26:17 <jeblair> nibalizer: is there a story for this spec?
19:27:06 <jeblair> nibalizer: i don't see one.  would you please create one in storyboard and update the spec to link to it? http://specs.openstack.org/openstack-infra/infra-specs/specs/puppet-modules.html
19:27:22 <nibalizer> sure
19:27:42 <jeblair> then we should create a task for each puppet repo, so that people can assign those tasks to themselves as they work on it
19:28:08 <jeblair> it would be good to start slow and break out just one module at a time at first to make sure we have the process right
19:28:13 <jeblair> then i think we can go open season :)
19:28:25 <fungi> after the first couple are behind us, they'll make good low-(medium?)hanging-fruit tasks
19:28:55 <jeblair> anyway, as we do it, we should be able to add them to the integration test so that we can be relatively sure that we're not breaking ourselves as we go
19:28:56 <anteaya> fit for third party participation I am hoping
19:29:18 <anteaya> so documentation of the process is greatly appreciated
19:29:30 <jeblair> anteaya: it should be in the spec; if it changes, we should update the spec
19:29:42 <anteaya> great
19:29:44 <jeblair> (specs are not written in stone.  they are written in rst!)
19:29:53 <anteaya> I do believe it is yes
19:29:57 <anteaya> stone is too slow
19:30:08 <jeblair> anything else on these?
19:30:20 <fungi> anteaya: stone flows faster at higher temperatures
19:30:27 <anteaya> fungi: that it does
19:30:29 <jeblair> #topic  Nodepool DIB
19:30:40 <jeblair> i was hoping mordred would be by to share the status here
19:31:04 <fungi> is he at openstack silicon valley?
19:31:15 <jeblair> i looked at the stack this morning and found that the bottom of the stack, despite 4 revisions since, hasn't addressed a pretty fundamental concern i brought up
19:31:32 <jeblair> so the bottom is now at -2 basically just to get attention.  :(
19:31:35 <anteaya> we should talk to yolanda maybe?
19:32:04 <jeblair> i think she's out, but she didn't seem to know the reasoning when she commented on an earlier patchset
19:32:12 <anteaya> hmmm
19:32:30 <jeblair> so basically, i think we're waiting for mordred to finish this, or someone to take it over
19:33:13 <jeblair> if someone wants to take it over, let me know.
19:33:24 <jeblair> fungi: oh, yes i believe he is at ossv
19:33:43 <jeblair> #topic  Docs publishing
19:34:09 <jeblair> i haven't started on this yet, and probably won't for a bit yet
19:34:42 <jeblair> if anyone wants to get started on it, feel free (and let me know).  otherwise it's probably going to be a few weeks before i start on it in earnest.
19:34:58 <zaro> what is this?
19:35:00 <clarkb> I can probably take up the dib stuff again since I poked at it before
19:35:09 <jeblair> zaro: http://specs.openstack.org/openstack-infra/infra-specs/specs/doc-publishing.html
19:35:16 <ianw> jeblair: re d-i-b; i'm very interested in this, but don't want to unilaterally take things over
19:35:16 <clarkb> should be able to get up to speed on it relatively quickly
19:35:31 <clarkb> ianw: maybe we can work together on it?
19:35:32 <jeblair> also, it's probably good for some of the swift logs stuff to settle out before we really start on docs
19:35:43 <fungi> ianw: unilaterally taking over mordred's changes is a tradition around here ;)
19:35:58 <clarkb> jeblair: I think that is sane otherwise we will be context switching too much
19:36:08 <krotscheck> it really is.
19:36:18 <jeblair> bilaterally taking over mordred's changes is less traditional but should be fine! :)
19:36:38 <jeblair> #topic  Jobs on trusty
19:36:41 <fungi> well, the serving things from swift work has implications on the docs publishing as well
19:36:53 <fungi> so having those lessons learned behind us could help
19:37:08 <ianw> clarkb: happy to ... just i talked about things with yolanda and she was at the time actively working on things, but if that is no longer the case, cool
19:37:10 <fungi> jobs on trusty!
19:37:36 <fungi> #link https://review.openstack.org/121931
19:37:55 <fungi> that'll be ready to merge as soon as i'm done confirming the remaining bare-trusty image updates complete
19:38:10 <fungi> #link https://etherpad.openstack.org/p/py34-transition
19:38:38 <fungi> that's getting whittled down though there are still a number of outstanding changes linked there which need to merge, and other projects which still need fixes
19:38:59 <fungi> a few have yet to be investigated
19:39:15 <clarkb> the big ones are related to broken things in py34 which makes this a bit difficult
19:39:19 <fungi> overall the majority of our working and voting python33 jobs run well under 3.4 as well
19:39:42 <fungi> but yeah, we do need at least one ubuntu sru to the python3.4 package in trusty
19:39:55 <jeblair> fungi: is that in progress?
19:40:20 <fungi> i believe the ubuntu package maintainer has not yet triaged the bug
19:40:26 <clarkb> the bug is filed, lifeless noted it is a good sru candidate, but unsure of where to go from there
19:40:34 <clarkb> hunt down the package maintainer?
19:40:42 <fungi> with torches
19:40:44 <jeblair> yell at zulcss?
19:41:12 <fungi> now, now... we don't want to make zul ragequit
19:41:29 <fungi> but yeah, i'll try to help get it more visibility
19:41:34 <jeblair> i think he likes being yelled at
19:41:42 <jeblair> at least, that's what mordred told me
19:41:57 <fungi> it's currently impacting oslo.messaging's unit tests on 3.4
19:42:09 <fungi> segfault in the interpreter, even
19:42:13 <clarkb> fungi: and potentially any ubuntu software run on py3.4
19:42:19 <fungi> right
19:42:27 <clarkb> since it is a subtle gc bug, figuring out all the affected things is hard
19:42:48 <jeblair> you should say since it's a segfault, it might be a security bug.
19:42:49 <fungi> #link https://launchpad.net/bugs/1367907
19:42:51 <uvirtbot> Launchpad bug 1367907 in python "Segfault in gc with cyclic trash" [Unknown,Fix released]
19:43:05 <fungi> (for those not wanting to dig it out of the etherpad)
19:44:09 <fungi> might be a stretch to tease code execution out of an improper cast in the gc, but denial of service is a possibility i suppose
19:44:43 <jeblair> anything else?
19:44:45 <fungi> also it's happening on teardown looks like
19:45:26 <fungi> nah, that covers current state for getting rid of the py3k-precise nodes, but not sure what the current state is for the other outstanding precise migration needs
19:45:53 <fungi> at some point we can hopefully at least simplify if not remove the custom parameter function
19:46:00 <jeblair> clarkb: maybe we can check on that for next week?
19:46:18 <jeblair> #topic  Manila project renaming (fungi, bswartz)
19:46:49 <clarkb> check on the bug?
19:47:28 <jeblair> clarkb: other parts of the precise->trusty transition
19:47:47 <fungi> for scheduling the manila project move, i have stuff going on this weekend (wife's birthday, inlaws visiting) and also won't be around thursday, so unless we want to do the manila rename friday i'll have to bow out. otherwise we punt to next week
19:48:04 <jeblair> i could do friday
19:48:16 <clarkb> friday is good here as well
19:48:31 <fungi> okay, let's say friday then... early afternoon pst?
19:48:41 <jeblair> things are less insane than in recent weeks, we can probably swing it with only a minor disruption in service.
19:48:42 <fungi> or late morning pst?
19:49:19 <jeblair> early afternoon works for me if it works for you, fungi
19:49:25 <fungi> 19:00 utc good?
19:49:39 <fungi> or maybe 20:00 so it doesn't hit lunch?
19:50:05 <jeblair> 20:30?
19:50:06 <clarkb> ++ to 2000
19:50:10 <clarkb> or 2030
19:50:13 <fungi> 20:30's good
19:50:20 <jeblair> (since it takes a bit to prepare)
19:50:34 <fungi> i'll send an e-mail to the -dev ml to give everyone including manila devs a heads up
19:50:41 <jeblair> #agreed rename manila at 20:30 utc on friday sept 17
19:50:46 <jeblair> oops
19:50:48 <jeblair> #agreed rename manila at 20:30 utc on friday sept 19
19:51:05 * jeblair just remembered undo
19:51:09 <fungi> hah
19:51:16 <jeblair> #topic  Fedora/Centos testing updates (ianw 09-16-2014)
19:51:25 <ianw> hey, we can skip most of this
19:51:31 <ianw> f20-bare nodes merged, thanks
19:51:35 <ianw> i'll keep an eye on them
19:51:48 <ianw> got a d-i-b update.  still working on the centos7 images in d-i-b
19:52:03 <jeblair> cool, and we're obviously not quite ready to use it anyway
19:52:18 <ianw> i am told that HP have production ready centos7 images, so i will be keeping an eye on that and hoping to bring up nodes there when it's ready
19:52:36 <ianw> that's all for that
19:52:40 <jeblair> #topic  Nodepool min-ready issues (ianw 09-16-2014)
19:53:00 <jeblair> #link https://review.openstack.org/#/c/118939/
19:53:04 <jeblair> has 2 +2s
19:53:07 <ianw> is the only holdup with this change just review backlog?  if a different approach is wanted, i can work on it
19:53:21 <jeblair> so we could probably merge it at will
19:53:42 <jeblair> if anyone wants to review it, do so soon, otherwise i'll merge it, say, tomorrow?
19:53:58 <jeblair> and maybe we can slip in a friday nodepool restart
19:54:26 <ianw> ok, i'll watch out for updates
19:54:38 <jeblair> #topic  Log Download (ianw 09-16-2014)
19:54:53 <jeblair> #link https://review.openstack.org/#/c/120317/
19:54:57 <ianw> so i really would like to download a bundle of logs when debugging gate failures
19:55:16 <ianw> is that review on the right track, or would we rather see it done some other way
19:55:26 <jeblair> sdague: ^ fyi
19:55:37 <ianw> wget --recursive overriding robots.txt kind of sucks
19:55:50 <jeblair> jhesketh points out that it should perhaps be included in os-loganalyze
19:55:52 <ianw> and sends down uncompressed logs
19:55:55 <jhesketh> I would like to discuss if it fits within os-loganalyze
19:56:15 <jeblair> which kind of makes sense to me, since we're really looking at that as our interface to the logs now
19:56:15 <jhesketh> which has started diverging from just log markup
19:56:38 <jhesketh> well it raises the question of if it should be doing that, but I'm not sure we want to get into that discussion
19:57:38 <jeblair> well, we've already made that choice
19:58:01 <fungi> it seems like a reasonable fit, and a reasonable feature request
19:58:02 <ianw> so is the general conclusion move it as a feature of os-loganalyze?
19:58:18 <fungi> in my opinion, yeah
19:58:21 <jeblair> ianw: can you look into whether that makes sense?
19:58:27 <clarkb> this is a crazy idea so maybe ignore me, but what if the tests ship a tarball only
19:58:38 <clarkb> then loganalyze can serve from within that? that doesn't deal with swift well
19:58:40 <clarkb> nevermind
19:59:08 <fungi> why doesn't it deal with swift well?
19:59:15 <ianw> ok, i'll look at putting it in there
19:59:27 <clarkb> fungi: because we would have to retrieve the entire tarball to get a single file
19:59:38 <jeblair> (or at least potentially the whole file)
19:59:46 <clarkb> fungi: which will only make the slowness worse
19:59:47 <fungi> oh, i get it. yeah without local caching that's probably badbadbadness
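[editor's note] The concern clarkb raises about a tarball-only upload can be illustrated with a small sketch. It assumes a gzip-compressed tarball held entirely in memory; `build_tarball` and `read_member` are hypothetical helper names, not part of os-loganalyze. A gzip tar has no index, so locating one member means decompressing the stream from the start, and in the worst case the server needs the entire object from swift to return a single log file.

```python
import io
import tarfile
from typing import Dict


def build_tarball(files: Dict[str, bytes]) -> bytes:
    """Pack the given name -> content mapping into a gzip-compressed tarball."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member(tarball_bytes: bytes, name: str) -> bytes:
    """Return one file from the tarball.

    The caller must already hold the tarball bytes: the archive is scanned
    sequentially from the beginning until the member is found.
    """
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:gz") as tar:
        member = tar.extractfile(name)  # raises KeyError if name is absent
        if member is None:
            raise ValueError(f"{name} is not a regular file")
        return member.read()


# Example data (made up for illustration).
bundle = build_tarball({
    "console.html": b"<pre>ok</pre>",
    "logs/syslog.txt": b"boot",
})
syslog = read_member(bundle, "logs/syslog.txt")
```

Local caching of the downloaded tarball, as fungi notes, would be needed to avoid paying that cost on every single-file request.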
20:00:05 <anteaya> time
20:00:25 <anteaya> thanks to jhesketh for being here!
20:00:26 <jeblair> thanks everyone; we'll move topics we didn't get to to the top of the agenda next time
20:00:30 <jeblair> #endmeeting