19:02:38 <jeblair> #startmeeting infra
19:02:38 <mordred> o/
19:02:38 <openstack> Meeting started Tue Apr 9 19:02:38 2013 UTC. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:39 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:41 <openstack> The meeting name has been set to 'infra'
19:02:44 <olaph> o/
19:03:08 <jeblair> mordred: do you have anything before you get on a plane?
19:03:20 <mordred> jeblair: I need to unbreak 2.6 for pbr projects
19:03:25 <jeblair> #topic mordred plane flight
19:03:55 <jeblair> we had wanted to just wait for rhel to fix that for us
19:04:06 <jeblair> fungi: but i'm guessing that's not going to happen today
19:04:10 <mordred> yeah - but rhel seems unhappy too I thought?
19:04:25 <mordred> or is it legit just 2.6 ubuntu that's giving us the problem?
19:04:25 <jeblair> mordred: and pbr is broken now
19:04:26 <fungi> i believe dprince was still working on backports from master
19:04:51 <fungi> which is to say we could switch to rhel6 today for master/havana afaik
19:05:09 <jeblair> mordred: so what if we switch the pbr projects to rhel for 2.6?
19:05:21 <mordred> jeblair: let's try that as step one
19:05:32 <mordred> jeblair: and see how it goes before trying a plan b
19:05:39 <jeblair> mordred: i'm worried that otherwise it involves puppet hacking
19:05:43 <mordred> it _should_ solve the underlying problem
19:05:43 <fungi> mordred: have an example of a job you'd like me to fire on a rhel6 slave as a test?
19:06:00 <jeblair> mordred: which would be very useful, i mean, clarkb thinks we are going to run into this again with python3
19:06:06 <fungi> or is it already testing successfully on rhel6 non-voting?
19:06:15 <clarkb> jeblair: I actually spoke to someone about that, and assuming pip -E works properly we should be able to patch the puppet provider
19:06:27 <clarkb> or we run puppet twice with two different python envs set
19:06:29 <mordred> fungi: gate-os-config-applier-python26 and gate-gear-python26
19:06:44 <mordred> jeblair: yah. I believe we ultimately need to solve this more resiliently
19:06:46 <jeblair> mordred, clarkb: you want to fight over which of you hacks that? :)
19:06:55 <mordred> but I think that doing that with some time to think about it properly would be nice
19:06:55 <clarkb> jeblair: I did install zmq for the log pusher script through apt and not pip to avoid this problem :)
19:07:06 <fungi> mordred: those are also failing on rhel6...
19:07:07 <jeblair> clarkb: also, that's preferred anyway. :)
19:07:09 <fungi> #link https://jenkins.openstack.org/job/gate-os-config-applier-python26-rhel6/
19:07:20 <mordred> fungi: GREAT
19:07:35 <fungi> no idea if they're failing the same way, but they're failing
19:07:37 <clarkb> jeblair: I can bring it back up with finch over at puppetlabs, and see if we can hack something useful for all puppet users
19:07:38 <jeblair> mordred: fascinating -- and that's with a 2.6 egg
19:08:00 <mordred> ok. that blows my previous theory
19:08:39 <clarkb> basically requires me to do more pip testing and feed him the info so that we can get a patch into the provider
19:08:50 <mordred> how about I spin up a rhel env and debug and get back to everyone. in the meantime, os-config-applier and gear could disable python2.6 tests for the time being if they're blocked
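For context on the "more pip testing" clarkb mentions: the underlying question is whether the same requirement installs cleanly under each interpreter on the slave. A minimal sketch of that kind of check, assuming virtualenv and both interpreters are on PATH (the interpreter list and package name are illustrative, not the team's actual test matrix):

```python
#!/usr/bin/env python
# Rough sketch of a per-interpreter pip check: try installing the same
# requirement under each python and report failures. INTERPRETERS and
# REQUIREMENT are illustrative assumptions.

import subprocess
import tempfile

INTERPRETERS = ['python2.6', 'python2.7']  # assumes both are on PATH
REQUIREMENT = 'pbr'                        # illustrative package

def try_install(python, requirement):
    """Install `requirement` into a throwaway virtualenv for `python`."""
    venv = tempfile.mkdtemp(prefix='pip-test-')
    subprocess.check_call(['virtualenv', '-p', python, venv])
    return subprocess.call([venv + '/bin/pip', 'install', requirement])

if __name__ == '__main__':
    for python in INTERPRETERS:
        rc = try_install(python, REQUIREMENT)
        print('%s: %s' % (python, 'ok' if rc == 0 else 'failed (rc=%d)' % rc))
```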
19:08:58 <jeblair> mordred: +1
19:09:05 <clarkb> sounds good
19:09:09 <jeblair> clarkb: +1 as well
19:09:13 <mordred> (not ideal, but, you know, neither project will die without 2.6 for a day)
19:09:16 <mordred> clarkb: ++
19:09:27 * mordred will add better 2.6 testing to pbr as well
19:09:30 * mordred cries
19:09:35 <jeblair> mordred: and did you see the zuul bug i filed?
19:09:44 * dprince is willing to help out if needed too
19:09:46 <jeblair> #link https://bugs.launchpad.net/zuul/+bug/1166937
19:09:47 <uvirtbot> Launchpad bug 1166937 in zuul "Option to group multiple jobs together in job trees" [Wishlist,Triaged]
19:09:54 <mordred> jeblair: I did not. I'll look
19:09:59 <mordred> awesome
19:10:20 <mordred> dprince - if you happen to get bored and figure out why https://jenkins.openstack.org/job/gate-os-config-applier-python26-rhel6/ is breaking in the next hour before I get back online, I will buy you a puppy
19:10:22 <jeblair> i encapsulated what we talked about, including after you dropped off
19:10:29 <mordred> jeblair: thanks!
19:10:47 <jeblair> mordred: i think that implementation is mostly in zuul/model.py
19:10:58 <anteaya> dprince: make sure it is a house-trained puppy
19:10:59 <mordred> excellent
19:11:00 <jeblair> mordred: and a little bit in the config parser in scheduler.py
19:11:07 <dprince> mordred: I already have one (a humpy one)
19:11:14 <mordred> nice
19:11:18 <mordred> ok - me run to plane
19:11:21 <mordred> back online in a bit
19:11:29 <jeblair> mordred: godspeed
19:11:53 <jeblair> there were no actions from last meeting
19:11:59 <jeblair> #topic gerrit/lp groups
19:12:25 <fungi> mmm, did we still have any to-dos on that?
19:12:42 <fungi> i think it's wrapped up aside from any other cleanup ttx might have wanted to do in lp
19:12:52 <jeblair> fungi: did the ptl change land?
19:13:01 <fungi> i was just checking...
19:13:24 <fungi> #link https://review.openstack.org/25806
19:13:30 <fungi> merged yesterday
19:13:53 <jeblair> woo
19:13:55 <fungi> oh, i probably should add a note in that groups cleanup bug of ttx's
19:14:03 <jeblair> #topic grenade
19:14:19 <jeblair> dtroyer pointed me at some changes he wants to merge first, and then...
19:14:30 <jeblair> we can cut stable/grizzly branches of grenade and devstack
19:14:50 <jeblair> and then i think we'll be set to run non-voting grenade jobs widely on both master and stable/grizzly
19:15:14 <clarkb> \o/
19:15:20 <clarkb> is it working now?
19:15:29 <jeblair> clarkb: it has occasionally succeeded
19:16:14 <clarkb> nice
19:16:16 <jeblair> clarkb: i haven't really analyzed the failures to know more about when it succeeds/fails
19:17:09 <jeblair> #topic gearman
19:17:40 <jeblair> so on my side, i wrote a new python gearman client that is much more suited to how we want to use it in zuul
19:17:43 <zaro> this is depressing me
19:17:58 <zaro> i've been debugging.
19:18:02 <jeblair> #link https://github.com/openstack-infra/gear
19:18:11 <zaro> finally figured out exactly why we're getting double builds.
19:18:57 <zaro> it is because an error occurs when attempting to re-register functions while the current build is running.
19:19:17 <jeblair> zaro: i can't see a need to register functions while a build is running
19:19:43 <zaro> when the error occurs on the worker, it will close the connection with gearman and then reopen it, but the build is still on the gearman queue so it runs again.
19:20:34 <zaro> jeblair: the code i've got re-registers on events from jenkins.
19:20:55 <jeblair> zaro: right, but it doesn't need to do that while a build is running
19:20:57 <zaro> jeblair: you might want to register at any time.
19:21:24 <zaro> jeblair: you mean block until the build finishes?
19:21:30 <jeblair> zaro: functions are registered per-worker; a worker doesn't need to change its functions while a build is running
19:22:26 <jeblair> zaro: i would postpone changing functions until after the build is complete (which i included in my sketch of a worker routine i sent the other day)
19:22:46 <zaro> jeblair: i see what you mean. i was looking for a way to get more granular in registering, but didn't see a way. i can look again.
19:24:28 <zaro> ok. will try this approach again. can't remember why i gave up last time.
19:24:33 <jeblair> zaro: i'm of the opinion that the gearman-java GearmanWorkerImpl makes too many assumptions about how it's being used; I think we probably will need to write our own GearmanWorker. I'm still reading, but I'd like you to consider that as you continue to dig into it.
19:25:08 <zaro> will do.
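The approach jeblair describes — deferring any function re-registration until the worker is idle, so the connection is never dropped mid-build — might look roughly like the following sketch. Method names follow the gear library linked above (Worker, addServer, registerFunction, getJob, sendWorkComplete/sendWorkFail), but treat them and the run_build() helper as assumptions, not the worker routine jeblair actually sent:

```python
# A minimal sketch of "postpone changing functions until after the
# build is complete": registration changes arriving mid-build are
# queued and applied only between jobs.

import gear

def run_build(job):
    # hypothetical placeholder for actually executing the jenkins build
    return 'done'

class DeferringWorker(object):
    def __init__(self, client_id, server):
        self.worker = gear.Worker(client_id)
        self.worker.addServer(server)
        self.pending = None  # function names waiting to be (re)registered

    def update_functions(self, names):
        # Called from the jenkins event thread at any time; we only
        # record the request here and apply it when the worker is idle.
        self.pending = set(names)

    def run(self):
        while True:
            if self.pending is not None:
                for name in self.pending:
                    self.worker.registerFunction(name)
                self.pending = None
            job = self.worker.getJob()   # blocks until gearman hands us work
            try:
                job.sendWorkComplete(run_build(job))
            except Exception:
                job.sendWorkFail()
```

Because registration only ever happens in the idle gap between getJob() calls, the error/reconnect/re-run cycle zaro observed has no window in which to occur.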
19:25:43 <jeblair> #topic pypi mirror/requirements
19:26:07 <clarkb> we are gating openstack/requirements on the ability to install all requirements together
19:26:38 <jeblair> and when https://review.openstack.org/#/c/26490/ merges, we will actually be running the requirements gate jobs for projects
19:27:51 <jeblair> we probably need to make the jobs and repo branch-aware pretty soon...
19:28:06 <jeblair> i think that depends on how openstack/requirements wants to handle branches. maybe a summit question.
19:28:47 <clarkb> I do have a question about reviewing openstack/requirements. currently we have +2 and approve perms, but it seems like we should defer to the PTLs for most of those reviews?
19:29:35 <jeblair> clarkb: i think so. i only really intend on weighing in when it seems to affect build/test oriented things...
19:29:47 <fungi> i've been refraining from approving them in most cases if it's only ci core votes on them, unless there's some urgency
19:30:26 <fungi> usually only when it's breaking the gate or holding back a ci project
19:30:39 <jeblair> i don't feel i have a lot of input on random library versions, so yeah, i'd say we should be conservative and mostly openstack-common and the ptls should be weighing in most of the time
19:30:40 <jeblair> fungi: +1
19:31:04 <clarkb> cool. I figured we had the perms to sort out problems, but wasn't sure if we had been asked to actually manage the repo
19:31:08 <fungi> er, poor wording. not a ci project but rather ci work on an openstack project which uses the requirements repo
19:31:40 <jeblair> markmc did explicitly want us to be involved, so afaik, we're not stepping on anyone's toes.
19:32:10 <jeblair> we didn't accidentally get perms to the repo, we really are supposed to have them. :)
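Roughly, "gating on the ability to install all requirements together" boils down to pip-installing the full list into one fresh environment and failing if anything conflicts. A sketch of that idea; the default file name and the use of virtualenv are assumptions about the job, not its actual definition:

```python
# Co-installability check: a non-zero pip exit below means some pair
# of requirements can't be satisfied at the same time.

import subprocess
import sys
import tempfile

def check_coinstallable(requirements_file):
    venv = tempfile.mkdtemp(prefix='reqs-gate-')
    subprocess.check_call(['virtualenv', venv])
    return subprocess.call([venv + '/bin/pip', 'install',
                            '-r', requirements_file]) == 0

if __name__ == '__main__':
    path = sys.argv[1] if len(sys.argv) > 1 else 'global-requirements.txt'
    sys.exit(0 if check_coinstallable(path) else 1)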
19:32:57 <jeblair> #topic releasing git-review
19:33:03 <fungi> it happened
19:33:10 <jeblair> woo!
19:33:16 <fungi> 1.21 is on pypi, manually this time
19:33:33 <fungi> 1.22 may be automated, if the gods are willing
19:33:52 <fungi> i've been running back through and closing out bug reports if they're fixed in 1.21
19:34:02 <jeblair> fungi: do we need to schedule a chat (perhaps when mordred is around) about pbr/etc for git-review?
19:34:29 <fungi> yes, some time in one of those rare moments when he's not on a plane
19:34:41 <jeblair> fungi: i'll put it on the agenda so we don't forget
19:34:51 <fungi> thanks
19:35:07 <jeblair> #topic baremetal testing
19:35:31 <pleia2> so the tripleo folks have been changing up diskimage-builder and how it creates the bootstrap node a bit
19:36:04 <pleia2> so I've been testing changes as they come along so we're ready once they're ready to start doing formal testing
19:37:13 <pleia2> also been working on getting this going https://github.com/openstack-infra/devstack-gate/blob/master/README.md#developer-setup but keep bumping into issues with the instructions (they're a bit slim, need some local additions and modifications)
19:37:23 * ttx lurks
19:37:32 <jeblair> pleia2: they may have bitrotted too, let me know if you have questions
19:37:44 <jeblair> pleia2: i haven't actually had to follow those in months
19:37:49 <pleia2> did have a wip-devstack-precise-1365534386.template.openstack.org started on hpcloud this morning though, even if it failed once it tried to grab hiera data from puppet
19:38:09 <pleia2> jeblair: great, thanks
19:38:24 <pleia2> just trying to work with it to get a feel for how this works
19:38:31 <pleia2> (aside from just reading scripts)
19:38:50 <pleia2> that's about it though for baremetal
19:38:56 <fungi> that's awesome progress
19:39:03 <anteaya> o/
19:39:03 <jeblair> pleia2: you may need to combine the 'install_jenkins_slave.sh' trick of running puppet apply with the devstack-gate developer setup to avoid it trying to talk to our puppetmaster
19:39:28 <pleia2> jeblair: makes sense, thanks
19:39:32 <fungi> pleia2: did you get past the sqlite db content requirements, i guess?
19:39:44 <pleia2> fungi: yeah, devananda got me sorted :) (I'll be updating the docs)
19:39:53 <jeblair> pleia2: thanks much. :)
19:39:59 <fungi> excellent
19:40:33 <jeblair> i think the config portion of the sqlite db should become a yaml file (though the status portion should probably remain a sqlite db)
19:40:55 <jeblair> that is not high on my todo list. :(
19:41:23 <jeblair> #topic open discussion
19:41:37 <clarkb> logstash
19:41:45 <anteaya> o/ I added openstackwatch as an agenda item to the wrong wiki page
19:41:53 <jeblair> anteaya: which page?
19:42:06 <anteaya> https://wiki.openstack.org/wiki/Meetings/CITeamMeeting
19:42:13 <fungi> anteaya: yeah, it should have been https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting
19:42:13 * clarkb blazes ahead with logstash because it should be short
19:42:21 <anteaya> yeah
19:42:22 <anteaya> d'oh
19:42:23 <jeblair> clarkb: go
19:42:39 <clarkb> logstash is running on logstash.openstack.org. you can get the web gui and query it at http://logstash.openstack.org
19:42:39 <jeblair> anteaya: (didn't know that existed; will delete)
19:42:44 <anteaya> k
19:42:51 <clarkb> currently all jenkins job console logs should be getting indexed there.
19:43:42 <clarkb> I may delete data for cleanup purposes
19:44:03 <clarkb> and some of the current data is ugly, but it is getting better as I add filters to logstash to properly parse things
19:44:18 <jeblair> clarkb: i'd say feel free to delete/reset at will as we work this out, up until (if/when) we decide logstash is the primary repository instead of logs.o.o
19:44:18 <clarkb> over the course of 24 hours we have added over 21 million log lines
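The cleanup clarkb mentions is usually done by dropping whole daily elasticsearch indexes over HTTP. A sketch of that, assuming the default logstash-YYYY.MM.DD index naming and a local elasticsearch on port 9200 — both assumptions, not the team's actual setup:

```python
# Delete daily logstash indexes older than a retention window.
# ES_URL, RETAIN_DAYS, and the index naming are assumptions.

import datetime
import urllib2

ES_URL = 'http://localhost:9200'  # assumed elasticsearch endpoint
RETAIN_DAYS = 14                  # illustrative retention window

def delete_index(name):
    req = urllib2.Request('%s/%s' % (ES_URL, name))
    req.get_method = lambda: 'DELETE'  # urllib2 defaults to GET/POST
    urllib2.urlopen(req)

if __name__ == '__main__':
    today = datetime.date.today()
    # sweep a month's worth of days past the retention boundary
    for age in range(RETAIN_DAYS, RETAIN_DAYS + 30):
        day = today - datetime.timedelta(days=age)
        try:
            delete_index(day.strftime('logstash-%Y.%m.%d'))
        except urllib2.HTTPError:
            pass  # index already gone
```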
19:44:44 <fungi> any feel for how much retention we can reasonably shoot for there?
19:44:46 <clarkb> the index for today (UTC time) is up to almost 12GB compressed
19:44:55 <clarkb> and this is just console logs
19:45:16 <clarkb> at the end of today UTC time I will run the optimize operation on that index to see if that results in a smaller index
19:45:43 <jeblair> clarkb: that is a lot.
19:46:15 <clarkb> yeah, we may need to aggressively filter and/or run a proper elasticsearch cluster if we want to use this for long-term storage
19:46:17 <jeblair> clarkb: as in, almost certainly too much, especially since it's a fraction of what we're storing.
19:46:25 <pleia2> yeah, I think we're averaging about 2G/day compressed in static form
19:46:59 <clarkb> fwiw I think logstash is viable as a short-term storage location for easy querying
19:47:12 <clarkb> then have an archive like logs.openstack.org for long-term storage.
19:47:30 <clarkb> then we can run the log-pusher script over particular logs to reshove them into logstash if we need something from the past
19:48:01 <clarkb> that's about all I have
19:48:14 <jeblair> clarkb: if we have to compromise on what we use it for, we should actually set out some goals and requirements and make sure we achieve them.
19:48:27 <jeblair> clarkb: good summit conversation fodder
19:48:29 <pleia2> reminds me, someone might want to check that my find command to rotate logs is behaving as expected
19:48:33 <clarkb> jeblair: yup
19:48:37 <pleia2> pretty sure it should have deleted some by now
19:49:09 <fungi> pleia2: maybe not... as i said before, i didn't restore any logs from prior to september 26 when i rebuilt the server
19:49:19 <clarkb> anteaya: I think you are up
19:49:31 <anteaya> openstackwatch is alive: http://rss.cdn.openstack.org/cinder.xml
19:49:37 <anteaya> but serves no content
19:49:43 <pleia2> fungi: oh right, I had an off-by-one in my head month-wise
19:49:55 <anteaya> this is what it should be serving: http://rss.chmouel.com/cinder.xml
19:50:11 <anteaya> so somewhere part of the script is not getting what it expected
19:50:32 <anteaya> so the question came up: do we stay with making swift work, or do we go with serving xml files?
19:50:34 <jeblair> anteaya: i thought the hypothesis was that review-dev was overwriting it?
19:50:49 <anteaya> that was a potential hypothesis, yes
19:50:57 <anteaya> clarkb might be able to expand on that more
19:51:13 <fungi> jeblair: i chimed in later when i got back from dinner and pointed out that the config on review-dev lacks swift credentials, so it could not
19:51:20 <jeblair> ah
19:51:28 <anteaya> I too had missed that, thank you fungi
19:51:36 <anteaya> so in terms of a way forward
19:51:57 <anteaya> stay with swift, or go with xml, was my understanding of the question
19:52:05 <fungi> and also suggested that the stdout capability openstackwatch has as a fallback would be a useful way to troubleshoot it
19:52:39 <jeblair> yes, it seems the lack of content is a separate question from the output format. that just needs debugging.
19:52:45 <anteaya> so at this point, do we have a way to debug what is running?
19:53:06 <anteaya> at least to understand why no content is being served?
19:53:16 <jeblair> i think it would be useful for this group to decide what we actually want to do with this
19:53:18 <fungi> anteaya: you can run it yourself but don't put swift credentials in the config, and it should spew on stdout what it would otherwise upload to swift
19:53:29 <jeblair> what service are we trying to provide? and how should we host that service?
19:53:34 <fungi> at least from what i could tell reading through the script
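For readers unfamiliar with the script: the pipeline openstackwatch implements is "query gerrit for new changes on a project, render them as RSS", and the stdout fallback fungi describes just prints the feed instead of uploading it to swift. A from-scratch illustration of the same idea — not the jeepyb script itself; the host and query are the obvious guesses, not confirmed details of its config:

```python
# Render open gerrit changes for one project as RSS on stdout.
# PROJECT is illustrative; gerrit's ssh query interface emits one JSON
# object per line plus a trailing stats row that has no change id.

import json
import subprocess
from xml.sax.saxutils import escape

PROJECT = 'openstack/cinder'  # illustrative project

def gerrit_query(project):
    out = subprocess.check_output(
        ['ssh', '-p', '29418', 'review.openstack.org',
         'gerrit', 'query', '--format=JSON',
         'project:%s status:open' % project])
    for line in out.splitlines():
        change = json.loads(line)
        if 'id' in change:  # skip the stats row
            yield change

def to_rss(project, changes):
    items = ''.join(
        '<item><title>%s</title><link>%s</link></item>'
        % (escape(c['subject']), escape(c['url'])) for c in changes)
    return ('<rss version="2.0"><channel><title>%s</title>%s'
            '</channel></rss>' % (escape(project), items))

if __name__ == '__main__':
    # no swift credentials involved: just print what would be uploaded
    print(to_rss(PROJECT, gerrit_query(PROJECT)))
```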
19:53:53 <anteaya> yes, well, chmouel's feed bears witness to that
19:54:09 <anteaya> but I am at a loss as to why our configuration of the script serves no content
19:54:24 <fungi> well, the short description is "rss feeds of new changes uploaded for review on individual projects"
19:54:29 <clarkb> jeblair: I think the service here is providing reviewers/interested parties an alternative to the gerrit project watches and email
19:54:30 <chmouel> maybe it would be best to generate static xml files rather than uploading to swift?
19:55:02 <jeblair> clarkb: that sounds like a useful service; so in that case, i think we should have it automatically generate a feed for every project on review.o.o
19:55:04 <anteaya> chmouel: any idea why our config would serve no content yet yours does?
19:55:31 <chmouel> humm i'm not sure
19:55:36 <chmouel> let me check the scrollback
19:55:53 <fungi> chmouel: that's one suggestion which came up. basically modify it so that we serve those rss xml files directly from the apache instance on our gerrit server
19:56:00 <anteaya> chmouel: the script is the same: https://github.com/openstack-infra/jeepyb/blob/master/jeepyb/cmd/openstackwatch.py
19:56:01 <jeblair> if we want to make this a more seamless integration with gerrit, then i think we should host it at review.o.o, perhaps at a url like 'review.openstack.org/rss/org/project.xml'
19:56:22 <fungi> and then link that in the gerrit theme?
19:56:32 <chmouel> jeblair: the proper solution would be for gerrit itself to provide rss feeds :)
19:56:35 <jeblair> we could have it either read project.yaml or 'gerrit ls-projects' to get the list
19:57:05 <jeblair> chmouel: true. :) people so rarely volunteer for java hacking projects around here.
19:57:32 <chmouel> heh, fair - java+xml is not much fun
19:58:11 <jeblair> fungi: the theme linking will take some thought, i think, especially a way to handle the per-project feeds
19:58:33 <anteaya> jeblair: should we continue to mull it over and discuss it again next week?
19:58:40 <anteaya> I'm not feeling a decision is nigh
19:58:42 <fungi> right, i'm not immediately coming up with any great ideas as to how to make that visible in the gerrit interface
19:59:03 <fungi> the projects list is not something people hit often, for example
19:59:06 <chmouel> next week is probably going to be beer^H^Hsummit time :)
19:59:18 <clarkb> plenty of time for discussion :)
19:59:19 * anteaya notes to use the correct wiki page next time
19:59:25 <jeblair> yeah, let's think about that. there's always just documenting it in the wiki; but it would be nice to get some kind of link going on.
19:59:33 <jeblair> thanks everyone!
19:59:41 <jeblair> see you next week, in person, i hope
19:59:44 <ttx> see you all soon!
19:59:52 <chmouel> see you soon!
19:59:57 <jeblair> #endmeeting
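Appendix to the openstackwatch discussion above: jeblair's suggestion of a per-project feed at review.openstack.org/rss/org/project.xml, driven by 'gerrit ls-projects', might be wired up roughly as below. The docroot path and the build_feed() stub are assumptions for illustration, not a decided design:

```python
# Enumerate every gerrit project and write one feed per project under
# a docroot apache could serve as review.openstack.org/rss/<org>/<project>.xml.

import errno
import os
import subprocess

DOCROOT = '/var/www/rss'  # assumed apache docroot

def list_projects():
    out = subprocess.check_output(
        ['ssh', '-p', '29418', 'review.openstack.org',
         'gerrit', 'ls-projects'])
    return [p for p in out.splitlines() if p]

def build_feed(project):
    # placeholder; the gerrit-query/RSS rendering step is sketched
    # earlier in the openstackwatch discussion
    return ('<rss version="2.0"><channel><title>%s</title>'
            '</channel></rss>' % project)

def write_feed(project, xml):
    path = os.path.join(DOCROOT, project + '.xml')
    try:
        os.makedirs(os.path.dirname(path))  # e.g. /var/www/rss/openstack
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
    with open(path, 'w') as f:
        f.write(xml)

if __name__ == '__main__':
    for project in list_projects():
        write_feed(project, build_feed(project))
```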