19:02:47 <jeblair> #startmeeting infra
19:02:48 <openstack> Meeting started Tue Jul 30 19:02:47 2013 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:49 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:51 <openstack> The meeting name has been set to 'infra'
19:02:59 <jeblair> "and is due to finish in 60 minutes" ! neato :)
19:03:02 <mordred> o/
19:03:08 <fungi> too cool
19:03:25 <jeblair> #topic asterisk server
19:03:38 <jeblair> most of the topics from last meeting were around the asterisk server anyway...
19:03:54 <jeblair> including that i signed up for a DID, so it has a phone number now...
19:03:57 * mordred bows down to the new asterisk server overlord
19:03:57 * fungi missed the last meeting
19:04:11 * fungi forgot to see what got assigned to him in absentia
19:04:15 <jeblair> fungi: i don't think you did; at least based on irc logs
19:04:30 <jeblair> it looks like there was no meeting last week
19:04:33 <clarkb> correct
19:04:43 <jeblair> pabelanger, russellb: should we all dial in?
19:04:53 <jeblair> to see if we can stress test the conf server a bit?
19:04:59 <russellb> jeblair: sure
19:05:02 <pabelanger> should be able to
19:05:12 * fungi has a telegraph^H^H^H^H^Hphone handy
19:05:21 <pabelanger> sip:conference@openstack.org
19:05:39 <russellb> pabelanger: going to use g722?
19:05:41 <clarkb> I will need a real number as android's clients are derpy and I am still without a proper headset
19:05:46 <jeblair> there's that, and the phone number is 512-808-5750
19:05:52 <pabelanger> no, was going to use DID
19:05:57 <russellb> i'll probably just use the DID as well ...
19:06:00 <pabelanger> russellb: no client
19:06:14 <russellb> i have a client, but i like my desk phone more, heh
19:06:17 <fungi> conference id number?
19:06:19 <jeblair> 6000
19:06:29 <pabelanger> in
19:06:40 <pabelanger> asterisk CLI looks good
19:07:29 <russellb> 6 people on
19:07:59 <pabelanger> load is nothing
19:08:58 <anteaya> I can get on with skype and the phone number
19:09:11 <pabelanger> *CLI
19:09:15 <pabelanger> confbridge list
19:09:21 <anteaya> using the sip client, Jitsi still fails for me
19:09:30 <pabelanger> anteaya: likely a codec issue
19:09:41 <anteaya> pabelanger: probably
19:09:58 <russellb> jeblair: *CLI> core show channels ... shows all channels as calls from the SIP provider
19:10:06 <anteaya> no
19:10:57 <anteaya> yes a little choppy for me too
19:11:39 <mordred> cell phone
19:11:53 <anteaya> I am just listening, not transmitting
19:12:08 <anteaya> who was that?
19:13:07 <pabelanger> nice
19:13:23 <russellb> pbx*CLI> channel originate Local/6000@public application playback spam
19:14:39 <jeblair> i have silence now...
19:14:48 <clarkb> jeblair: we are still talking
19:14:52 <jeblair> neat!
19:14:54 <fungi> silence huh?
19:15:10 <fungi> we have more callspam
19:15:18 <pabelanger> CPU is spiking, so we'd need to see why
19:15:27 <russellb> she will say anything.
19:15:29 <russellb> almost anything.
19:15:34 <anteaya> ha ha ha
19:15:47 <pabelanger> tt-monkeys FTW
19:18:06 <anteaya> just with the small group, the sound is about equivalent to one of the board meeting conference calls, as a listener
19:18:12 <anteaya> fungi has good sound
19:18:22 * ttx lurks
19:18:29 * mordred waves at ttx
19:18:37 <mordred> ttx: you want to dial in from france?
19:18:54 * mordred got dropped
19:19:12 <ttx> mordred: i'll pass, unless you REALLY need that tested
19:19:19 <anteaya> skype is charging me money to listen in
19:19:27 * ttx multitasks
19:19:36 <anteaya> so far I have paid 40 cents
19:19:47 <jeblair> ttx: not yet; i think we'd get a local number later
19:21:25 <jeblair> #action clarkb add pbx to cacti
19:21:32 <davidlenwell> what is the conf number ?
19:21:37 <anteaya> that actually was quite good
19:21:39 <jeblair> we all just hung up
19:21:43 <davidlenwell> oh .. too late
19:21:44 <anteaya> we just ended the session
19:21:51 <jd__> what? are we having a party on the phone or something?
19:21:53 <russellb> https://wiki.openstack.org/wiki/Infrastructure/Conferencing
19:22:04 * jd__ hides
19:22:05 <davidlenwell> jd__: we missed it
19:22:07 <fungi> we can try to bum-rush it again next week once we have it in cacti to see what the impact is
19:22:08 <russellb> i documented it!
19:22:20 <pabelanger> you could totally hook the meeting room bot into asterisk too, once the meeting starts, a new conference room is created and the password outputted
19:22:25 <pabelanger> don't think that would be too hard
19:22:44 <jeblair> neat :)
19:22:50 <russellb> pabelanger: it's only software
19:22:54 <ttx> russellb: "in his spare time, the hero of cloud computing sets up asterisk"
19:23:00 <russellb> ttx: heh
19:23:13 <fungi> ttx: like you're one to talk, writing your own bug tracker
19:23:20 <ttx> russellb: not jealous at all.
19:23:21 <pabelanger> russellb: would be a fun integration
19:24:11 <jeblair> so i think next we want to get a handle on resource usage there... fwiw asterisk is currently almost idle now that we've hung up.
19:24:30 <jeblair> so i'm guessing the high cpu usage is related to our call
19:24:36 <clarkb> I think fail2ban is probably a reasonable thing to try as well
19:24:43 <clarkb> it will act as a rate limiter for badness
19:24:55 <anteaya> or spammers just target us when there is something going on
19:25:35 <jeblair> shall we move on?
19:25:47 <russellb> there was no spam while we were on the call fwiw
19:25:55 <russellb> so it was just conference related
19:25:55 <pabelanger> jeblair: yes. Transcoding was the hit
19:26:06 <pabelanger> but I wouldn't expect it to be spiking
19:26:13 <russellb> pabelanger: dangit, should have looked to see what codecs were being used ...
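clarkb's fail2ban suggestion would amount to a jail watching the asterisk logs and banning SIP scanners that fail repeatedly. A rough sketch of what that might look like — the stock asterisk filter is assumed to be available, and the log path, ports, and thresholds here are illustrative guesses, not the deployed config:

```ini
# Hypothetical /etc/fail2ban/jail.local fragment: rate-limit SIP
# registration scanners hitting the PBX. Filter name, log path, and
# numbers are assumptions for illustration only.
[asterisk]
enabled  = true
filter   = asterisk
port     = 5060,5061
protocol = udp
logpath  = /var/log/asterisk/messages
maxretry = 5
bantime  = 3600
```

This would not stop call spam arriving over the DID (that traffic comes from the provider, not from scanners probing SIP directly), but it does act as the "rate limiter for badness" described above.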
19:26:29 <pabelanger> ya, I was trying to see that
19:26:37 <pabelanger> I think I saw some gsm with ulaw
19:26:43 <russellb> they were all from the provider, so i can just call back and check
19:27:08 <russellb> ulaw (which is good)
19:27:15 <russellb> k, can move on
19:27:26 <jeblair> ok
19:27:30 <pabelanger> might just be confbridge
19:27:32 <fungi> too attractive to spend the whole hour on a phone system ;)
19:27:36 <jeblair> #topic multiple jenkins masters
19:28:14 <jeblair> the multi-step process to get to the point where we can have multiple masters is almost complete...
19:28:33 <jeblair> we just need to do something to ensure that the bitrot jobs only run in one place (and also that their logs are correctly processed)
19:29:14 <zaro> do we also need this? https://bugs.launchpad.net/openstack-ci/+bug/1082803
19:29:15 <uvirtbot> Launchpad bug 1082803 in openstack-ci "Manage jenkins global config and plugin list" [Medium,Triaged]
19:29:17 <jeblair> to that end, I think i can have a working timer trigger for zuul today, which means zuul/gearman can dispatch those jobs, and we can stop using the jenkins timer triggers for them
19:29:18 <clarkb> jeblair: reviewing your multiple triggers zuul change is on my list of things to do once we are done with bugs
19:29:37 <jeblair> zaro: that would be nice, but i think i will defer it for now...
19:29:53 <jeblair> because i'd like to have multiple masters within a few days (or weeks at the most)
19:30:04 <clarkb> jeblair: I agree. There is a pressing need for multiple masters, which we can manage by hand until we have the automagic to do it with tools
19:30:24 <fungi> especially since multiple will initially be just two
19:30:30 <fungi> not counting the old one
19:30:38 <jeblair> i believe we're hitting performance problems related to the rate at which we are adding/removing slaves from jenkins for the devstack tests
19:30:54 <jeblair> so being able to scale that up is the motivating factor for this
19:31:08 <clarkb> jeblair> yes, the slave launch logs indicate it is taking up to a minute to add each slave
19:31:10 <fungi> that certainly sounds like the sort of unusual use case the jenkins devs wouldn't have optimized for
19:31:11 <clarkb> which is really really slow
19:31:31 <jeblair> after we have multiple masters, i plan on looking at more efficient and reliable ways of managing slaves.
19:31:51 <zaro> that is awesome. i am talking about the gearman-plugin at this year's jenkins conf. will have something good to show!
19:32:09 <jeblair> zaro: cool!
19:32:12 <jeblair> when is that?
19:32:25 <zaro> oct 23-24
19:32:30 <zaro> want to come down?
19:32:34 <mordred> jeblair: on the efficient and reliable ways of managing slaves - I'd love to chat about that once you're there
19:33:15 <zaro> it's in palo alto.
19:33:38 <jeblair> mordred: my thoughts so far are that d-g should be a daemon instead of a bunch of jenkins jobs, and that gearman-plugin should (optionally) handle offlining slaves when jobs are done (since it's in a good place to do that)
19:33:51 <mordred> ah yes. that. totally with you
19:34:21 <jeblair> mordred: the daemon will be able to manage the nova boots in a better way. we may still have bottlenecks adding and removing slaves...
19:34:48 <jeblair> however, we'll have a better handle on that in that we'll be able to tune how many api calls happen in parallel, etc...
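jeblair's zuul timer trigger would let the bitrot jobs move out of per-master Jenkins cron triggers and into zuul's own layout, so zuul/gearman dispatches them to exactly one place. A hypothetical layout.yaml fragment — the syntax here is approximate, and the pipeline, job, and project names are placeholders, not the real config:

```yaml
# Hypothetical zuul layout.yaml fragment: a periodic pipeline driven
# by a timer trigger instead of Jenkins cron. Names and the cron
# expression are illustrative assumptions.
pipelines:
  - name: periodic
    manager: IndependentPipelineManager
    trigger:
      timer:
        - time: '0 6 * * *'

projects:
  - name: openstack/example-project
    periodic:
      - periodic-example-python27-bitrot
```

Because zuul hands the job to gearman, whichever master picks it up runs it, and the job runs only once regardless of how many masters are attached.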
19:34:54 <mordred> agree
19:35:09 <jeblair> also, we won't be making all these api calls at the same time jenkins itself is managing 16 running jobs (which are making the api calls)
19:35:33 <mordred> _after_ that, clarkb and I were supposing about kexec re-purposing of slaves, so that there would be fewer adds/removes
19:35:38 <clarkb> using a daemon will definitely make a lot of pain points less insane
19:35:40 <mordred> but I agree that tuning what we have first
19:35:49 <mordred> will let us grok the other things sanely
19:35:55 <jeblair> ah yeah, that could be cool. the only thing about that is that i don't think we can count on always doing that
19:36:07 <jeblair> some jobs will completely break the node and it will need to be destroyed and replaced
19:36:17 <jeblair> but 90% of them may be able to be reused, which could be a huge win
19:36:28 <clarkb> jeblair: yup
19:36:41 <clarkb> jeblair: and a daemon should be smart enough to know when it can and can't make use of kexec
19:37:13 <jeblair> that could be the normal case, and then if we fail to kexec after a minute or two, we kill it and let, say, a hypothetical future low-watermark load based system decide that it needs to replenish the pool.
19:37:44 <mordred> jeblair: yup. and a hung kexec probably will not respond to health pings :)
19:37:52 <jeblair> (and hopefully, we can apply a bunch of this to the regular nodes too)
19:38:02 <clarkb> that would be amazing
19:38:06 <fungi> sounds great
19:39:01 <jeblair> cool, sounds like we have general consensus on a very long road ahead :)
19:39:09 <jeblair> #topic requirements and mirrors
19:39:18 <jeblair> mordred: you want to talk about what's going on in this area?
19:39:26 <mordred> ugh
19:39:28 <mordred> not really
19:39:37 <mordred> can I just close my eyes and make it go away?
19:39:50 <jeblair> maybe?
19:39:51 <mordred> so ... there are several issues
19:40:00 <jeblair> (it's worth a shot)
19:40:10 <mordred> one is that pip installs things not in dependency graph order
19:40:33 <mordred> but in a strange combo of first-takes-precedence and sometimes highest-takes-precedence
19:41:09 <mordred> which makes things gigantically problematic for devstack when things change, because sequencing will cause something other than what you expected to be installed
19:41:12 <mordred> SO
19:41:15 <clarkb> I did open a bug against this problem with upstream pip; dstufft thought it may be fixed in 1.5
19:41:18 <mordred> current things on tap to fix this
19:41:23 <clarkb> but that doesn't help us today
19:41:40 <mordred> sdague and I are working on getting the requirements update.py script in good shape
19:41:50 <mordred> with the idea that in setup_develop in devstack
19:42:09 <mordred> the first step will be to run the update.py script from requirements on the repo to be setup_develop'd
19:42:18 <sdague> yes, we're close, except for the fact that update.py doesn't handle oslo tarball urls
19:42:23 <mordred> this will ensure a consistent worldview of which packages should be installed
19:42:25 <sdague> I think that's actually the only missing piece
19:42:28 <clarkb> #link https://github.com/pypa/pip/issues/988 there is apparently some undocumented feature that may be useful in mitigating this
19:42:39 <mordred> sdague: grab most recent trunk - I think that's fixed now
19:42:49 <sdague> mordred: really?
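The requirements-sync step mordred and sdague describe — update.py rewriting a project's requirements to match the global list before devstack's setup_develop installs anything — can be sketched roughly as below. This is a simplified illustration of the idea, not the real update.py; the function name and the crude version-specifier parsing are assumptions:

```python
def sync_requirements(global_reqs, project_reqs):
    """Return project_reqs rewritten so that every package it lists
    uses the version line from global_reqs, giving one consistent
    worldview of which versions should be installed.

    Both arguments are lists of requirement lines like "six>=1.4.1".
    Lines for packages absent from the global list are kept as-is.
    """
    def pkg_name(line):
        # Crude parse: the package name is everything before the
        # first version-comparison character.
        for i, ch in enumerate(line):
            if ch in "<>=!":
                return line[:i].strip()
        return line.strip()

    pins = {pkg_name(line): line for line in global_reqs}
    return [pins.get(pkg_name(line), line) for line in project_reqs]
```

For example, with a global list of `["six>=1.4.1"]`, a project line `"six>=1.3"` would be rewritten to `"six>=1.4.1"`, sidestepping pip's install-order surprises by never presenting conflicting specifiers in the first place.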
19:42:58 <sdague> if so I can kick my devstack job again
19:43:04 <jeblair> mordred: setup_develop sounds like a good idea
19:43:09 <mordred> then - once that's happening, we should be able to gate requirements on devstack running with requirements
19:43:14 <mordred> so we can know if a new req will bork devstack
19:43:28 <jeblair> awesome
19:43:38 <mordred> and then, with the two of those in place, we should be in good shape to auto-propose patches to projects on requirements changes landing
19:43:53 <mordred> incidentally, I also just wrote a patch to update.py to make it sync the setup.py file
19:43:59 <mordred> so those really will be managed boilerplate
19:44:05 <clarkb> mordred: maybe you can get dstufft to give us a tl;dr on the undocumented feature referred to in the pip bug
19:44:19 <clarkb> and see if that actually does help us
19:44:22 <mordred> k.
19:44:30 <mordred> I'll ask him
19:44:31 <sdague> also, I've got unit testing for update.py inbound, as soon as mordred does a pbr fix
19:44:45 <mordred> yup. we have a pbr bug to bug folks about to help this
19:44:50 <sdague> because tox in requirements/ is fun :)
19:44:54 <mordred> then I'm also going to get per-branch mirrors up
19:45:34 <mordred> and finally - I think we should ditch installing from pip and figure out a way to auto-create debs and install from those - because we're spending WAY too much time on this
19:46:05 <sdague> I think that just moves the pain elsewhere
19:46:08 <mordred> but I am not working on that
19:46:09 <jeblair> mordred: won't that put us in the "can't use python-foo version X because it's not packaged yet" boat again?
19:46:26 <clarkb> jeblair: yes, that was my worry when we were discussing this the other day
19:46:43 <mordred> possibly to both of you ... but we are spending a LOT of effort on this
19:46:47 <jeblair> mordred: which i'm okay with, if we think we're in a better place to deal with that now (less churn, more packagers, cloud archive, etc)
19:46:51 <mordred> and turning up every corner case in python
19:47:07 <mordred> I don't think it's the right thing to work on right now
19:47:20 <mordred> but I do think perhaps at the summit, we should discuss what it might look like in earnest
19:47:20 <sdague> I think it's a summit session, honestly
19:47:26 <mordred> jinx
19:47:31 <jeblair> sounds good to me
19:47:34 <clarkb> ++
19:47:43 <mordred> I still agree that we do not want to be in the business of shipping debs or rpms
19:47:46 <jeblair> mordred: (and i agree with you in principle)
19:47:59 <mordred> but like we've talked about for infra packages, operationally helpful packages might be helpful
19:48:24 <jeblair> we're at a very different place than we were 2 years ago
19:48:38 <mordred> very much so
19:48:44 <mordred> https://review.openstack.org/39363 btw
19:48:46 <jeblair> anything else on this topic?
19:49:00 <mordred> for those who feel like looking at a pbr change related to helping the requirements work
19:49:13 <jeblair> #topic Gerrit 2.6 upgrade (zaro)
19:49:31 <jeblair> zaro: how's it going?
19:49:39 <zaro> ohh. it's going..
19:49:53 <zaro> i think i have gerrit with WIP votes working now.
19:50:03 <clarkb> with one small asterisk right?
19:50:26 <jeblair> is that with a patch?
19:50:29 <zaro> nope! just figured out the bit of prolog..
19:50:35 <mordred> neat!
19:50:39 <fungi> prolog!
19:50:40 <clarkb> zaro: but it requires the patch for the change owner right?
19:50:42 <zaro> yes, it's gerrit core.
19:50:44 <mrodden> clarkb: thanks, didn't realize the check_uptodate.sh was just nova
19:50:56 <jeblair> zaro: have you proposed it upstream yet?
19:51:14 <pleia2> o/
19:51:24 <zaro> i just got it all to work. will submit a patch to upstream this week.
19:51:25 <fungi> pleia2 found an internet
19:51:30 <sdague> so couldn't we get WIP equiv by making APROV -1,0,+1 and letting an author -1 APROV his/her own patches?
19:51:34 <jeblair> zaro: neat! :)
19:51:36 <zaro> let's see what they say.
19:51:41 <jeblair> sdague: that's actually the plan we discussed...
19:51:45 <sdague> oh... ok :)
19:51:47 <mordred> if they're amenable to the patch in general
19:52:05 <mordred> do we think we'd be willing to roll out a 2.6 with only that patch
19:52:06 <jeblair> sdague: and i believe the "let an author..." bit is what zaro was patching to support
19:52:11 <zaro> i have been in discussions with mfink, and i did it on his suggestion.
19:52:21 <mordred> oh awesome
19:52:23 <clarkb> jeblair: and/or adding a WIP category, but both changes need the patch zaro wrote to be expressible in the ACLs
19:52:25 <jeblair> sdague: (existing gerrit acls don't support that operation)
19:52:26 <sdague> ok, I had thought that permission was already in
19:52:28 <sdague> ok
19:52:57 <zaro> ok. that's it for now.
19:53:06 <mordred> I think, given a history of carrying a whole string of patches, carrying one patch for a cycle would not be as terrible
19:53:13 <jeblair> zaro: i think mfink has a very strong voice in the gerrit community, so if he likes it, that's great. :)
19:53:30 <zaro> good to hear.
19:53:43 <jeblair> mordred: let's give this "develop upstream" thing a shot, eh? :)
19:53:55 <mordred> jeblair: sure!
19:54:08 * mordred just wasn't sure what the 2.7 schedule was looking like
19:54:12 <jeblair> zaro: i'm very excited, thanks!
19:54:18 <clarkb> me too
19:54:22 <jeblair> #topic cgit server status (pleia2)
19:54:31 <jeblair> pleia2: welcome!
19:54:34 <pleia2> so, the one thing I wanted to talk about is addressing for grabbing git repos
19:54:44 * mordred welcomes our new git.o.o overlords
19:55:08 <pleia2> the plan right now is to do what fedora does (it's easy) and do git://git.o.o/heading/project and http://git.o.o/cgit/heading/project
19:55:13 <pleia2> so they aren't the same :(
19:55:30 <pleia2> git.kernel.org makes it so they are both git.kernel.org/prod/
19:55:30 <mordred> I think jeblair has convinced me that I can get over my issues with the cgit in the url
19:55:35 <pleia2> err /pub
19:55:40 <clarkb> pleia2: I think we can lie and put them all under /cgit
19:55:48 <clarkb> even though cgit doesn't do git protocol
19:55:57 <pleia2> clarkb: yeah, that's really easy
19:56:00 <jeblair> pleia2: i'm okay with that because you have indicated cgit really wants to be set up like that, and fedora and kernel work that way...
19:56:01 <jeblair> er
19:56:14 <mordred> I do not want to add cgit to the git:// url
19:56:20 <fungi> does kernel.org use rewrites or something? (if first node in the path matches an org, rewrite the url)
19:56:29 <jeblair> you want to put the git-protocol repos under 'cgit/'? that doesn't sound good to me
19:56:34 <mordred> if we lie about anything, I want the cgit to go away from the urls in places
19:56:49 <pleia2> fungi: probably, I clicked down into a project to see what urls they provide visually and:
19:56:53 <pleia2> git://git.kernel.org/pub/scm/bluetooth/bluez.git
19:56:53 <pleia2> http://git.kernel.org/pub/scm/bluetooth/bluez.git
19:56:53 <pleia2> https://git.kernel.org/pub/scm/bluetooth/bluez.git
19:56:53 <mordred> but I'm fine with having the web and the clone be different
19:56:55 <pleia2> ^^ ie
19:56:56 <uvirtbot> pleia2: Error: "^" is not a valid command.
19:57:16 <jeblair> https://git.kernel.org/pub/scm/bluetooth/bluez.git is not cgit
19:57:20 <jeblair> it's regular git
19:57:26 <pleia2> ah, interesting
19:57:47 <clarkb> so maybe that is what we do: put the regular git http daemon behind /pub, put git:// behind /pub, and then cgit can have /cgit?
19:57:48 <mordred> right - I think git clone http:// and git clone git:// should have the same urls
19:57:48 <jeblair> so i think it makes sense to serve cgit from /cgit
19:57:54 <clarkb> mordred: I agree
19:58:02 <clarkb> and I agree with jeblair
19:58:03 <fungi> worth noting, you don't have to clone from cgit, you can clone from http(s) published copies of the git trees
19:58:09 <fungi> er, that
19:58:10 <mordred> and I don't think we need pub - I think that can go in root
19:58:13 <jeblair> and let's serve http and git protocols without a prefix
19:58:21 <mordred> git clone git://git.openstack.org/openstack/nova.git
19:58:22 <clarkb> wfm
19:58:30 <jeblair> mordred: +1
19:58:42 <jeblair> #topic Py3k testing open for business (fungi, zul, dprince, jd__, jog0)
19:58:49 <jeblair> fungi: 1 minute!
19:58:51 <fungi> just a quick update
19:58:57 <fungi> we're basically ready
19:59:03 <mordred> w00t
19:59:05 <fungi> couple of reviews which need last-minute attention...
19:59:07 <jeblair> yaay!
19:59:12 <jeblair> oh
19:59:15 <fungi> #link https://review.openstack.org/#/q/status:open+project:openstack-infra/config+branch:master+topic:py3k,n,z
19:59:29 <jeblair> oops
19:59:44 <jeblair> (plug for my new gate test which will catch missing jobs!)
19:59:54 <fungi> but that's all the updates i have time for in here. we can pick it up in #-infra
20:00:05 <jd__> what's missing?
20:00:09 <jeblair> so close!
20:00:12 <fungi> jd__: see link
20:00:30 <fungi> jd__: we missed that when reviewing your change earlier
20:00:37 * ttx whistles innocently
20:00:43 <jd__> too bad
20:00:47 <jeblair> thanks everyone!
20:00:47 <markmc> hey
20:00:50 <russellb> hi
20:00:54 <jeblair> #endmeeting
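The URL scheme settled on in the cgit discussion (cgit web UI under /cgit, git:// and http clones straight from the root, no /pub prefix) could be wired together roughly as below. This is a hypothetical sketch only — every path, alias, and flag here is an assumption for illustration, not the deployed configuration:

```apache
# Hypothetical Apache vhost fragment for git.openstack.org.

# cgit web UI lives under /cgit ...
ScriptAlias /cgit /usr/lib/cgi-bin/cgit.cgi
Alias /cgit-data /usr/share/cgit

# ... while smart-HTTP clones come straight from the root, so
# "git clone http://git.openstack.org/openstack/nova.git" works.
SetEnv GIT_PROJECT_ROOT /var/lib/git
SetEnv GIT_HTTP_EXPORT_ALL
ScriptAliasMatch "^/(.*\.git/(info/refs|git-upload-pack))$" \
    /usr/libexec/git-core/git-http-backend/$1

# git:// clones from the same root would be served separately by
# git-daemon, e.g.: git daemon --base-path=/var/lib/git --export-all
```

With this shape, `git clone git://git.openstack.org/openstack/nova.git` and the http equivalent share the same path, and only the browsable web UI carries the /cgit prefix.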