#openstack-meeting log

19:03:22 <jeblair> #startmeeting infra
19:03:23 <openstack> Meeting started Tue Sep  3 19:03:22 2013 UTC and is due to finish in 60 minutes.  The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:25 <clarkb> fungi: I really hope you spoke like that on the cruise
19:03:26 <openstack> The meeting name has been set to 'infra'
19:03:31 <jeblair> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:03:37 <jeblair> #link http://eavesdrop.openstack.org/meetings/infra/2013/infra.2013-08-27-19.02.html
19:03:39 <fungi> clarkb: only most of the time ;)
19:04:28 <jeblair> #topic Operational issues update
19:04:51 <jeblair> i think this is kind of a non-topic at this point...
19:04:58 <jeblair> except there are some folks back from vacation
19:05:17 * fungi is very back, and very trying to catch up on what happened
19:05:19 <ttx> o/
19:05:27 <jeblair> so, maybe if fungi or sdague or anyone else has questions, we could cover what happened while they were gone
19:05:39 <clarkb> ++
19:05:54 <fungi> i think i grok most of what got done with the git.o.o fan out/ha
19:06:12 <fungi> looks like additional slaves got added too
19:06:33 <fungi> anything else major last week?
19:06:35 <jeblair> the really short version is:  nodepool is stable and fast, git.o.o is stable and fast, static.o.o is stable and fast, and jenkins is still jenkins.
19:06:37 <sdague> I'm still in dig out mode, so I'll refrain from asking questions until I've gone through the relevant lists
19:06:43 <sdague> nice
19:07:01 <jeblair> sdague, fungi: the summary from last week might be useful:
19:07:03 <jeblair> #link http://eavesdrop.openstack.org/meetings/infra/2013/infra.2013-08-27-19.02.log.html
19:07:08 * fungi nods
19:07:15 <clarkb> there were also some changes to zuul
19:07:44 <clarkb> including lots of bugfixes and a rolled back set of optimizations that turned out to not help
19:08:00 <ttx> jeblair: was wondering... is it just calmer now, or are we processing so fast that the queue stays small ?
19:08:10 <clarkb> the test environment was not representative of production hence the miss on those optimizations
19:08:19 <jeblair> yep.  it turns out having 30,000 refs in the zuul repo meant that when it fetched new changes it was slow.  we got rid of them.  we'll figure something out long term
19:09:16 <jeblair> ttx: i think we're seeing fairly typical load at this point (though not the atypically high load we saw around proposal freeze)
19:09:32 <fungi> got it. some sort of automated expiring/pruning of old zuul refs at some point i guess
19:09:44 <jeblair> ttx: i think the infra changes and the switch to testr has really sped things up
19:10:23 <clarkb> ttx: jeblair: ya our slowest tests are under 30 minutes now
19:10:54 <fungi> wow. that's a pretty amazing development
19:11:08 <jeblair> i don't anticipate that we'll make any major changes this week (H3 is on friday)
19:11:13 <clarkb> ++
19:11:25 <fungi> that'll hopefully give me a chance to catch up anyway
19:12:01 <ttx> jeblair, clarkb awesome
19:12:08 <jeblair> after that, we have a disruptive change to zuul to add reporter support (which is pretty cool -- i think we'll be able to have it send summary email reports for the bitrot jobs)
19:12:29 <jeblair> (disruptive in this case means graceful shutdown and momentary outage)
19:13:00 <jeblair> we want to further improve responsiveness by having all of the slaves be managed by nodepool
19:13:15 <fungi> no argument there
19:13:22 <jeblair> and then i've been working on a new scheduler algorithm for zuul, which could speed up throughput even further
19:13:25 <fungi> except special-purpose slaves maybe?
19:13:35 <jeblair> (at the cost of using _even more_ test nodes)
19:13:41 <jeblair> fungi: yep
19:13:57 <clarkb> jeblair: the really nice thing about that zuul scheduling algorithm is it simplifies zuul's internals
19:13:59 <jeblair> #link https://review.openstack.org/#/c/44346/
19:14:16 <jeblair> yeah, so far (before tests), it's +52 -135 lines of code
19:14:42 <jeblair> (and docs)
19:15:28 <jeblair> anything else on this topic?
19:15:47 <clarkb> we might want to consider scaling back git.o.o after the freeze
19:16:21 <jeblair> yeah, after we figured out what was going on with packed-refs, it turns out it's a bit overbuilt.
19:16:53 <jeblair> #link http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=2
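The automated expiring/pruning of old zuul refs that fungi raises under this topic was left open in the meeting ("we'll figure something out long term"). As a rough illustration only, and not what the team implemented, a pruning pass might look like the sketch below. The `refs/zuul` prefix, the 30-day cutoff, and the helper names `stale_refs` and `prune` are assumptions for the example; it relies on `git for-each-ref` with the `%(committerdate:unix)` format and on `git update-ref -d`.

```python
import subprocess
import time


def stale_refs(for_each_ref_output, max_age_days, now=None):
    """Return ref names whose commit timestamp is older than max_age_days.

    Each input line is expected to look like "<refname> <unix timestamp>".
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for line in for_each_ref_output.splitlines():
        parts = line.rsplit(" ", 1)
        # Skip blank lines and refs without a usable commit date.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        refname, timestamp = parts
        if int(timestamp) < cutoff:
            stale.append(refname)
    return stale


def prune(repo_path, prefix="refs/zuul", max_age_days=30):
    """Delete stale refs under prefix (hypothetical defaults) in repo_path."""
    listing = subprocess.check_output(
        ["git", "for-each-ref",
         "--format=%(refname) %(committerdate:unix)", prefix],
        cwd=repo_path, text=True)
    for ref in stale_refs(listing, max_age_days):
        subprocess.check_call(["git", "update-ref", "-d", ref], cwd=repo_path)
```

Keeping the selection logic in `stale_refs` separate from the git calls means the cutoff rule can be checked against sample text without touching a real repository.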
19:17:45 <jeblair> #topic Backups
19:17:58 <jeblair> clarkb: what's the latest?
19:18:33 <clarkb> #link https://review.openstack.org/#/c/44129/
19:19:01 <clarkb> now that people are back from vacation that can get another review. Backups are in place on etherpad(-dev) and review(-dev) if you want to look at them
19:19:03 <clarkb> fungi: ^
19:19:23 <fungi> clarkb: added to the top of my reading lisr
19:19:25 <fungi> list
19:19:26 <clarkb> once that change merges (I hope to merge it today) next step is getting bup running on those hosts so that the backups end up offsite
19:19:51 <clarkb> er backups are not in place on review.o.o yet, just review-dev
19:20:06 <clarkb> 44129 adds mysql backups to review.o.o and wiki.o.o
19:20:40 <clarkb> hopefully by the time the next meeting rolls around we will have proper mysql backups for these hosts
19:21:17 <jeblair> yay!  that sounds so professional!  :)
19:22:00 <jeblair> #topic Asterisk server
19:22:02 <jeblair> russellb: ping
19:22:09 <russellb> pong
19:22:19 <jeblair> so i spun up like lots of asterisk servers
19:22:49 <jeblair> now i think we just need to schedule a time to dial into (some of?) them and see if we notice a difference in quality
19:23:03 <russellb> OK, sounds like a plan.
19:23:11 <russellb> today and tomorrow are not good for me ...
19:23:20 <russellb> other than that, most days are as good as any other
19:23:27 <russellb> (feature freeze rush)
19:23:45 <jeblair> okay, maybe friday morning then?  or we can wait till next week if it would be better
19:23:57 <russellb> friday is OK
19:24:08 <clarkb> friday works for me
19:24:09 <russellb> looks like i have a meeting at 11 AM Eastern
19:24:28 <russellb> so anything outside of that hour is fine
19:25:03 <jeblair> 10am pacific, 1pm eastern (17 utc) ?
19:25:12 <russellb> wfm
19:25:12 <fungi> sounds good to me
19:25:41 <jeblair> #action jeblair send email about asterisk testing friday, 1700 utc
19:25:59 <jeblair> thanks!
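The proposed slot (10am Pacific, 1pm Eastern, 17 UTC) can be double-checked with a few lines of Python. This is only a sanity check, assuming the Friday in question is 2013-09-06 and that daylight time is in effect (PDT = UTC-7, EDT = UTC-4); the fixed offsets and the `to_utc` helper are assumptions of the example, not something from the meeting.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assuming US daylight time in early September.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")


def to_utc(year, month, day, hour, tz):
    """Convert a local wall-clock hour in tz to a UTC datetime."""
    local = datetime(year, month, day, hour, tzinfo=tz)
    return local.astimezone(timezone.utc)


pacific = to_utc(2013, 9, 6, 10, PDT)
eastern = to_utc(2013, 9, 6, 13, EDT)
print(pacific.isoformat(), eastern.isoformat())
```

Both values land on the same instant, 17:00 UTC, which matches the time recorded in the action item.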
19:26:09 <jeblair> #topic puppet-dashboard (pleia2, anteaya)
19:26:17 <pleia2> anteaya: joined just in time!
19:26:22 <anteaya> yay phone guy
19:26:35 <anteaya> go pleia2
19:26:57 <pleia2> so last week the puppet-dashboard server filled up, and we decided that instead of rescuing it again we should just finally reinstall it on a new (non-legacy) server
19:27:10 <pleia2> #link https://bugs.launchpad.net/openstack-ci/+bug/1218631
19:27:11 <uvirtbot> Launchpad bug 1218631 in openstack-ci "Build new puppet-dashboard server" [Undecided,New]
19:27:43 <pleia2> anteaya outlined our options in a comment from this morning
19:28:16 <pleia2> #1 is the easiest, anteaya is suggesting #2 - discuss! :)
19:28:39 <jeblair> i like easy and supported
19:28:46 <pleia2> that's not an option
19:29:12 <pleia2> so #2 requires a different version of ruby than what ships with precise, which makes it tricky
19:29:19 <jeblair> ugh
19:29:28 <clarkb> doesn't precise offer both 1.8 and 1.9?
19:29:34 <pleia2> anteaya: or will it work on 1.8.7?
19:29:42 <anteaya> #1 just takes us back into the same pigeon holes that we have now, that ended up with a broken dashboard
19:29:51 <pleia2> clarkb: oh, if it does it would be nice, let's see...
19:29:53 <anteaya> #2 will work on 1.8.7
19:30:03 <jeblair> anteaya: i think we ended up with a broken dashboard because we didn't delete old stuff
19:30:09 <anteaya> I don't recommend 1.8.7, but #2 option will work on it
19:30:40 <clarkb> looks like 1.9.1 is on precise with some 1.9.3
19:30:40 <jeblair> #2 is worth considering in case there's a security problem with the dashboard itself, it would be nice to have an upgrade channel there
19:30:41 <anteaya> jeblair: yes, because the db optimization rake task is really unique, and broke our db at least once
19:30:51 <fungi> precise offers ruby 1.9.3.0 in its ruby1.9.1 package
19:30:55 <pleia2> anteaya: oh ok, I think I misread but now rereading it I do see that you said 1.8.7 would work
19:30:55 <anteaya> so we didn't run the db optimization task again
19:30:59 <jeblair> anteaya: that wasn't my understanding at all.
19:31:00 <clarkb> fungi: nice
19:31:26 <anteaya> jeblair: okay, what did you understand?
19:31:34 <jeblair> anteaya: but perhaps it's not worth going into, if we all agree that we want to run the sodabrew fork for other reasons
19:31:59 <anteaya> 1.8.7 will work with sodabrew
19:32:16 <anteaya> I lean heavily toward 1.9.3 but 1.8.7 will work with it
19:32:21 <clarkb> I think we should run the fork for the reason jeblair lists. And I think we should be able to run it with either version of ruby
19:32:23 <jeblair> anteaya: why 193?
19:32:49 <anteaya> it is being heavily supported by the ruby community and 1.8.7 has bascially been end of lifed
19:32:56 <anteaya> I can't speel
19:32:58 <anteaya> spell
19:33:15 <anteaya> clarkb: +
19:33:36 <jeblair> as long as canonical are supporting 187, it's not a big deal, but it seems they're supporting both, so that's not much of a tiebreaker
19:33:48 <jeblair> 187 means it may perhaps share more puppet configuration with our other hosts...
19:33:55 <pleia2> jeblair: +1
19:34:06 <jeblair> i worry a little about the puppet needed in order to have a different ruby version on one (puppet-managed) host
19:34:10 <anteaya> in terms of selecting which of the 4 options, no not a tiebreaker
19:34:23 <fungi> worth speculating... will running dashboard under 1.9.3 and generating puppet reports from machines using 1.8.7 work? i had gotten the impression puppet used some ruby-specific data types in its reports which changed starting in 1.9.1 which was part of the issue i ran into puppeting fedora 18
19:34:37 <clarkb> fungi: oh interesting
19:34:39 <jeblair> fungi: gah!
19:34:48 <clarkb> fungi: it does use yaml which supports serializing objects
19:34:57 <anteaya> fungi: I haven't heard of this
19:35:06 <clarkb> fwiw I think we do need to start thinking long term about using newer puppet
19:35:17 <anteaya> but if it is true, I second jeblair "gah!"
19:35:30 <anteaya> clarkb: or newer provisioning
19:35:48 <fungi> anteaya: i'll dig up my old notes. puppet report in puppet 2.7 wouldn't work with the ruby 1.9.3 on fedora 18 as a result
19:35:57 <jeblair> here's how i'm leaning: sodabrew with 187, so that there's less work to stand it up, and it is similar to the rest of the hosts.
19:36:15 <anteaya> fungi: okay, and yes I would like to see your notes once you find them
19:36:20 <jeblair> and hopefully that tides us over until we have newer puppet (+ newer ruby, presumably)
19:36:28 <pleia2> jeblair: +1, and re-evaluate once we start moving to puppet3 and newer ruby
19:36:28 <clarkb> jeblair: ++
19:36:37 <anteaya> wfm
19:36:42 <jeblair> and if something comes up, we can probably upgrade that host to 193 if necessary
19:36:44 <fungi> anteaya: several ruby 1.9.3 fixes were added in puppet 3 and never backported to 2.7 was the gist
19:36:52 <anteaya> jeblair: hopefully yes
19:36:54 <pleia2> the next Ubuntu LTS comes out in 8 months, so that's a nice target and it will have newer ruby
19:37:03 <anteaya> fungi: ah that sounds about right
19:37:11 <pleia2> so, the next question is how we want to pull in sodabrew fork
19:37:22 <pleia2> vcsrepo trunk?
19:37:29 <anteaya> yay new Ubuntu LTS
19:37:47 <jeblair> pleia2: i think so, unless they're building .debs?
19:37:54 <pleia2> no debs as far as I could see
19:38:05 <clarkb> if they have tagged releases we could vcsrepo on those
19:38:24 <pleia2> clarkb: I looked at the tagged releases, they are all old
19:38:40 <clarkb> in that case trunk :) or maybe a specific commit off of trunk
19:39:29 <pleia2> ok, I'll work with anteaya to get this rolling
19:39:37 <anteaya> sounds good
19:40:18 <jeblair> cool, thanks!
19:40:22 <jeblair> #topic Bug day on Tuesday September 10th at 1700UTC
19:40:38 <pleia2> that's just an announcement really
19:40:40 <jeblair> tsia? :)
19:40:52 <pleia2> over 170 bugs right now, lots of new we should browse through
19:41:00 <jeblair> that's the second day of the doc sprint, but i believe i'll be around
19:41:15 <jeblair> s/sprint/bootcamp/
19:41:28 <clarkb> pleia2: will you be attending that too
19:41:30 <clarkb> ?
19:41:37 <pleia2> clarkb: wasn't planning on it
19:41:47 <jeblair> i'm doing some speaking the first day
19:41:53 <jeblair> about automation, etc
19:41:58 <clarkb> cool, was going to suggest maybe moving it if more than jeblair was going to be there
19:42:04 <clarkb> but I think we will get by :)
19:42:13 <pleia2> yeah, we can ping him about specific updates as needed
19:42:14 <jeblair> i don't expect to attend the second day
19:42:21 <jeblair> (though that may change)
19:42:27 <pleia2> ok
19:42:39 <annegentle> bug day! Woohoo!
19:43:00 <jeblair> big day
19:43:02 <annegentle> too bad it overlaps
19:43:03 <annegentle> yeah
19:43:15 * ttx should add a bug counter for infra
19:43:28 <anteaya> ttx file a bug report
19:43:29 <ttx> too bad damn Launchpad does not do stats all by itself
19:43:29 <fungi> ttx: we don't want to break your gauge
19:43:37 <ttx> (hint hint)
19:43:52 <jeblair> #topic Open discussion
19:43:58 <ttx> summit.o.o is up for icehouse summit, still running manually on top of openstack-infra/odsreg code... haven't had time to complete infra migration there unfortunately. Will be announced tomorrow.
19:43:59 <annegentle> o/
19:44:11 <annegentle> just wanted to see who all from infra will be at doc boot camp
19:44:21 <annegentle> we're getting the first of the new tshirt design
19:44:23 <jeblair> ttx: it's at least heading in the right direction
19:44:37 <jeblair> ttx: when are the elections?
19:44:42 <mrodden> oh i had a question... was there any more discussion on the pypi mirror stuff?
19:44:47 <ttx> jeblair: it also now has the feature to handle "absence of summit" gracefully, so it's almost there
19:44:48 <anteaya> annegentle: you have shirts for your bootcamp?
19:44:53 <jeblair> annegentle: i'll be there for the 1st day, at least, second if you need me
19:44:55 <clarkb> ttx: does that host need mysql backups as well? we could potentially start with a little puppet to do that
19:44:57 <mrodden> rsync vs lmirror ... etc
19:45:04 <annegentle> anteaya: not for the bootcamp specifically but openstack tshirts
19:45:14 <anteaya> annegentle: ah okay
19:45:20 <ttx> clarkb: it does need backup, although it runs on sqlite atm
19:45:28 <pleia2> annegentle: I wasn't planning on attending during the days, but I'm local-ish (SF) so if folks are meeting for an evening meal I'd love to visit
19:45:29 <ttx> clarkb: I run manual backup informally
19:45:36 <clarkb> ttx: ok
19:45:38 <annegentle> jeblair: thanks
19:45:48 <annegentle> pleia2: you are welcome to join us Monday evening
19:45:50 <ttx> clarkb: like I said, I just missed 2 weeks to finalize it properly :/
19:45:52 <clarkb> mrodden: no, I don't think that has been brought up again. We have been busy making everything else work
19:46:04 <pleia2> annegentle: great, I'll be in touch for time/location
19:46:08 <annegentle> pleia2: and actually I'd love another local driver if you're available
19:46:29 <pleia2> annegentle: my husband has our car on weekdays, but he works in mt view so I'll see what I can do and let you know
19:47:03 <annegentle> pleia2: thanks!
19:47:14 <clarkb> mrodden: there is still a need for other folks to build a repository locally? I feel like leaning towards more general mirror building scripts might be the least painful way to do that
19:47:47 <pleia2> re: this week, I'm going to be out Wednesday evening - Thursday, but I'll be back Friday :)
19:48:01 <mrodden> clarkb: i have one i run internally behind the firewall
19:48:08 <mrodden> and more importantly "on-site"
19:48:13 <jeblair> #link http://amo-probos.org/post/15
19:48:35 <mrodden> it's a bunch of scripts i threw together...
19:48:39 <jeblair> I wrote a blog post about some of the zuul/jenkins-related system changes we made over the past year
19:48:55 <anteaya> yay
19:48:58 <jeblair> fungi, sdague: ^ more catch-up reading if you want
19:49:02 <fungi> added
19:49:02 <pleia2> jeblair: nice!
19:49:08 <clarkb> mrodden: we have scripts too :) but lifeless doesn't like that they come with a lot of dependencies. We are working on splitting them into their own project
19:49:17 <sdague> jeblair: will check out
19:49:34 <clarkb> I think mordred deleted the old project that had the pypimirror name
19:49:42 <clarkb> we should be able to start working on that transition now
19:49:58 <mrodden> clarkb: oh... i have that change to re-create so we can move run_mirror.py over
19:50:07 <mrodden> it probably was auto-abandoned
19:50:14 <jeblair> mrodden: ah yeah, we should be able to merge that now if you want to restore it
19:50:41 <mrodden> https://review.openstack.org/#/c/39399/
19:51:02 <mrodden> restored
19:51:25 <clarkb> cool, I will take another look at that change. it may need a rebase
19:51:35 <clarkb> mrodden: maybe you want to test if it is mergeable locally and rebase if necessary?
19:51:57 <mrodden> clarkb: will do. i have some other fires i'm working on at the moment, but i'll try to get to it...
19:52:46 <clarkb> I plan on updating the openstack infra publications talk with logstash info (I am giving that talk at openstack on ales at the end of the month) are there other things people would like to be added to that?
19:52:49 <clarkb> jeblair: fungi ^
19:53:08 <jeblair> clarkb: the overview talk?
19:53:12 <clarkb> jeblair: ya
19:53:16 <jeblair> clarkb: cool
19:53:32 <fungi> ahh, that one
19:53:35 <pleia2> it would be nice to get publications in general sorted so people can access them (and I'd like to add mine)
19:53:45 <pleia2> via the web
19:53:52 <clarkb> pleia2: I think that may have gotten sorted out
19:54:05 <fungi> i think the index generation job/script may still be broken
19:54:06 <pleia2> http://docs.openstack.org/infra/publications/ still is Forbidden
19:54:25 <fungi> i haven't dug into it other than to confirm it seemed not to generate an index.html in there
19:54:42 <clarkb> pleia2: append overview to that and it works so you can at least directly link
19:54:49 <pleia2> clarkb: that's just one talk
19:54:53 <pleia2> we have lots of slides
19:54:53 <clarkb> I can look into the index.html when modifying the talk
19:55:11 <pleia2> the index page should create a listing of all the slide decks, including /overview
19:55:14 <fungi> that would be greatly appreciated
19:55:23 <clarkb> ok, added to the list
19:55:28 <pleia2> thanks
19:55:36 <fungi> pleia2: the script tries to, at least. probably something trivial missing
19:55:43 <pleia2> and I could use instructions for adding my talk, maybe something to add to ci.openstack.org
19:55:47 <pleia2> fungi: *nods*
19:55:52 <clarkb> pleia2: good idea
19:55:59 <clarkb> I can do that too
19:56:03 <pleia2> you rock \o/
19:57:57 <jeblair> thanks everyone!  talk to you friday, i hope.  :)
19:58:00 <jeblair> #endmeeting