17:01:20 <thinrichs1> #startmeeting CongressTeamMeeting
17:01:20 <openstack> Meeting started Tue Mar 3 17:01:20 2015 UTC and is due to finish in 60 minutes. The chair is thinrichs1. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:01:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:01:24 <openstack> The meeting name has been set to 'congressteammeeting'
17:01:42 <thinrichs1> Who do we have this week?
17:02:43 <thinrichs1> jwy: I see you're here. How's the Horizon UI going?
17:03:32 <jwy> hi, good, i pushed what's there so far: https://review.openstack.org/#/c/160722/
17:03:51 <jwy> for policy creation and deletion
17:04:02 <jwy> few more things to do for that
17:04:14 <arosen1> Hi
17:04:38 <jwy> hi
17:05:00 <jwy> also talked with Yali more about policy abstraction
17:05:09 <jwy> she is working on a func spec for that
17:06:20 <jwy> i also made some updates to horizon and the docs since the datasources are now retrieved by id instead of name
17:06:44 <jwy> waiting for review on those. i think the ci might still be broken?
17:07:07 <thinrichs1> Not sure about the CI. We've been having trouble with our cloud of late.
17:07:26 <arosen1> Yea, hopefully that will be under control soon
17:07:33 <thinrichs1> It's great that you're making progress!
17:07:38 <arosen1> somehow the python unit tests in our repo broke
17:07:56 <arosen1> I haven't tracked down how this is possible yet (I wanna ping the guys in the infra channel about it)
17:08:09 <arosen1> but i have a fix that should merge in a sec that makes the tests pass again.
17:08:10 <thinrichs1> Did the Makefile change get merged yet?
17:08:24 <arosen1> https://review.openstack.org/#/c/160680/
17:08:48 <arosen1> thinrichs1: not yet; that doesn't unblock the tests.
17:09:35 <arosen1> hrm, weird, i see the tests did pass on your patch
17:09:35 <thinrichs1> The tests *should* be failing without that patch.
17:09:46 <thinrichs1> We shouldn't be able to parse anything.
17:09:55 <arosen1> thinrichs1: the tests passed on my patch that isn't rebased on yours though.
17:10:11 <thinrichs1> Locally is different b/c you still have the output of the Makefile sitting around.
17:10:26 <arosen1> thinrichs1: I don't mean locally.
17:10:39 <arosen1> If you click on that link jenkins +1'ed it.
17:11:04 <thinrichs1> Maybe there's a Python-path thing happening somehow.
17:11:05 <thinrichs1> Not sure.
17:11:20 <arosen1> also i'm not sure how we could break the unit tests without jenkins stopping it from merging
17:11:23 <thinrichs1> But Jenkins also let in the deletion of the config file.
17:11:25 <arosen1> since they would have had to pass at one point.
17:12:00 <arosen1> in that case i'm sure the unit tests were passing then (not sure which config file you're talking about).
17:12:05 <thinrichs1> The missing test_datasource_driver_config.py.
17:12:09 <arosen1> ah yea!
17:12:23 <arosen1> That one i want to dig deep and figure out how this could occur
17:12:32 <jwy> sorry, which patch are we talking about that passed
17:12:46 <arosen1> jwy: https://review.openstack.org/#/c/160680/
17:13:06 <jwy> i see failure for http://logs2.aaronorosen.com/80/160680/1/check/dsvm-tempest-full-congress-pg-nodepool/917a568
17:13:20 <arosen1> jwy: yea the cloud is flaky right now
17:13:30 <arosen1> the gateways are overloaded and connections time out :(
17:13:48 <jwy> which part are you saying passed?
17:14:09 <arosen1> the python unittests
17:14:26 <arosen1> they failed to pass in the gate pipeline but they passed on check
17:14:35 <jwy> ok
17:14:38 <thinrichs1> This one passed too but probably shouldn't have.
17:14:38 <thinrichs1> https://review.openstack.org/#/c/158489/
17:15:12 <arosen1> thinrichs1: anyways let's do a little investigating later on and try and nail down what happened.
17:15:36 <thinrichs1> arosen1: Sounds good.
17:16:25 <thinrichs1> Back to status updates.
17:16:29 <thinrichs1> jwy: thanks for the update.
17:16:36 <thinrichs1> arosen1: want to give a status update?
17:17:33 <arosen1> thinrichs1: sure
17:17:57 <arosen1> so I haven't done much on congress the last week or so. Most of my time was sucked up trying to help debug our cloud
17:18:22 <arosen1> that's it from me for now...
17:19:49 <thinrichs1> arosen1: thanks.
17:19:57 <thinrichs1> sarob couldn't attend but sent his status via email.
17:20:15 <thinrichs1> He cleaned up https://launchpad.net/congress/kilo
17:20:30 <thinrichs1> He tagged https://github.com/stackforge/congress/tree/2015.1.0b2
17:20:47 <thinrichs1> He is still working out getting the tarred file out to http://tarballs.openstack.org/ as part of the process
17:21:25 <thinrichs1> This seems like a good step toward getting us into the OS release cadence.
17:22:06 <thinrichs1> I believe kilo3 is mid-March, so a couple weeks away.
17:22:56 <thinrichs1> I think we'll delay code freeze to, say, 3-4 weeks before the summit, since we don't have a ton of stabilization to do.
17:23:45 <thinrichs1> That should give us enough time to do some testing, round out the features/bugs that we want available, and still have time to work on some specs for the summit.
17:24:00 <thinrichs1> How does that sound?
17:24:23 <arosen1> sounds good to me!
17:24:59 <alexsyip> same here
17:25:00 <jwy> sure
17:25:16 <thinrichs1> So that'll be the plan moving forward then.
17:25:24 <thinrichs1> alexsyip: want to give a status update?
17:25:47 <alexsyip> I’ve been working on high availability for the congress server + datasource drivers.
17:26:22 <alexsyip> For a first cut, we’ll run two completely replicated congress servers, where each replica will fetch data from the data sources
17:26:31 <alexsyip> and clients can make api calls to either replica
17:27:06 <alexsyip> Writes (for things like rules and datasource config) will go to the database, and the congress servers will pull new changes on a periodic basis.
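The first-cut design alexsyip outlines (API writes land in a shared database; each replica periodically pulls changes it has not yet applied) can be sketched roughly as follows. This is an illustrative model only; the class names and the in-memory stand-in for the database are assumptions, not Congress code:

```python
# Sketch of the pull-based replication alexsyip describes: writes
# (rules, datasource config) go to a shared database, and each replica
# periodically polls for changes it has not yet applied.  The in-memory
# "database" and all names here are illustrative only.

class SharedDb:
    """Stand-in for the shared SQL database of rules/config."""
    def __init__(self):
        self.rows = []          # (seq, payload) pairs, append-only

    def write(self, payload):
        self.rows.append((len(self.rows) + 1, payload))

    def changes_since(self, seq):
        return [r for r in self.rows if r[0] > seq]

class Replica:
    """One server replica that pulls new changes on a timer."""
    def __init__(self, db):
        self.db = db
        self.last_seen = 0      # highest sequence number applied so far
        self.applied = []

    def poll_once(self):
        # Would run on a periodic timer in a real server.
        for seq, payload in self.db.changes_since(self.last_seen):
            self.applied.append(payload)
            self.last_seen = seq

db = SharedDb()
a, b = Replica(db), Replica(db)
db.write("rule: p(x) :- q(x)")   # an API write to either replica lands here
a.poll_once()                     # each replica picks it up independently
b.poll_once()
assert a.applied == b.applied == ["rule: p(x) :- q(x)"]
```

Because each replica tracks only the highest sequence number it has applied, polling is idempotent and either replica can serve reads after catching up.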
17:28:09 <arosen1> nice, sounds good to me alexsyip
17:28:27 <thinrichs1> Sounds like the right first cut to me.
17:28:36 <thinrichs1> Excited about HA!
17:28:37 <alexsyip> Currently, I’m setting up a tempest test to run in this configuration.
17:30:00 <alexsyip> that’s all
17:30:10 <thinrichs1> That reminds me—we should fix up our logging so we don't fill up the disk (and crash).
17:30:28 <arosen1> thinrichs1: where did you see this problem?
17:30:31 <arosen1> in devstack?
17:30:38 <thinrichs1> alexsyip: do you have a blueprint for HA?
17:30:39 <arosen1> I think that logging that jwy pointed out was in horizon
17:30:57 <alexsyip> I’m working on an HA
17:30:59 <alexsyip> blueprint.
17:31:06 <thinrichs1> arosen1: if we're going for HA, and we never empty out the logs, we'll eventually fill up the disk.
17:31:16 <thinrichs1> And crash, thereby making HA harder.
17:31:17 <alexsyip> Actually, I think I already made the blueprint, but not the spec
17:31:28 <arosen1> thinrichs1: I don't think that's really related to HA per se
17:31:34 <arosen1> are you talking about syslog?
17:31:42 <arosen1> like output from congress-server logs?
17:31:50 <thinrichs1> Just the other day I saw ceilometer fill up the disk with its log.
17:32:00 <thinrichs1> arosen1: yes
17:32:22 <arosen1> thinrichs1: the ceilometer in cloud or devstack?
17:32:30 <alexsyip> Here’s the blueprint: https://blueprints.launchpad.net/congress/+spec/query-high-availability
17:32:37 <arosen1> I think logrotate is probably not configured in that case thinrichs1
17:32:59 <thinrichs1> arosen1: so you're saying it's easy to fix the logging so it doesn't fill up the disk?
17:33:06 <arosen1> yes
17:33:12 <arosen1> logrotate does it for you
17:33:29 <arosen1> it tar.gz's your logs and deletes them eventually if it's running out of diskspace
17:33:29 <thinrichs1> If we don't have that turned on, let's turn it on by default now.
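A minimal sketch of the logrotate setup arosen1 describes, for a standalone (non-devstack) install. The log path, schedule, and retention count here are assumptions for illustration, not Congress defaults:

```
# Hypothetical /etc/logrotate.d/congress stanza; path and numbers assumed.
/var/log/congress/*.log {
    daily
    rotate 7          # keep a week of rotated history
    compress          # gzip old logs, as arosen1 notes
    missingok
    notifempty
    copytruncate      # rotate without restarting congress-server
}
```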
17:33:51 <arosen1> it shouldn't ever run out of diskspace in devstack though if it's duplicating the output to screen.
17:34:11 <arosen1> on a proper install logrotate would handle this.
17:34:23 <arosen1> sorry, battery is about to die on the train :(
17:34:58 <thinrichs1> Whatever we need to do to make sure we don't fill up the disk (whether we're running as part of devstack or standalone), let's add a bug/blueprint to make sure that happens.
17:35:16 <thinrichs1> If it's a deployment option, let's make sure it's documented and on by default.
17:37:44 <thinrichs1> I added a blueprint for this and made it a dependency of query-high-availability.
17:37:55 <thinrichs1> I guess I'll give a quick status update.
17:38:25 <thinrichs1> Now that datasources are spun up/down at runtime, we need to be more careful with how we deal with column references in policy rules.
17:38:43 <thinrichs1> Remember that a column reference is where we identify the value for a column using its name, not its position.
17:39:16 <thinrichs1> Example: p(id=x, name=y) asks for the column named 'id' to be variable x and the column named 'name' to be variable y.
17:39:41 <thinrichs1> Previously we were compiling these into the usual datalog version *at read-time*.
17:40:07 <thinrichs1> To do that we needed to know the schema for all the tables *at read-time*.
17:40:42 <thinrichs1> If our schema for table p has columns ('name', 'id', 'status'), then we would compile the example above into...
17:40:47 <thinrichs1> p(y, x, z)
17:41:14 <thinrichs1> But since datasources are spun up/down at runtime, we no longer know the schema at read-time, so we can't do this compilation any longer.
17:41:28 <thinrichs1> So I'm adding support for column references into the heart of the evaluation algorithms.
17:41:59 <thinrichs1> We'll still be able to do schema-consistency checking (making sure people don't reference non-existent columns), but we'll want to do it whenever a schema change occurs, e.g.
spin up a new datasource.
17:42:08 <thinrichs1> Hope that made some sense.
17:42:46 <thinrichs1> It should be transparent to the user.
17:43:15 <thinrichs1> That's it for me.
17:43:19 <jwy> is this related to the issue with a policy table like nova:servers being empty now?
17:43:33 <thinrichs1> jwy: Yes, that's where I noticed the problem.
17:43:56 <thinrichs1> Say we have a rule like p(x) :- q(id=x) in the database.
17:44:25 <thinrichs1> Sorry.. different rule.
17:44:30 <thinrichs1> p(x) :- nova:q(id=x)
17:44:48 <thinrichs1> If we start up/restart Congress and try to load that rule, we won't know the schema for q because Nova hasn't necessarily been spun up yet.
17:45:02 <thinrichs1> So we won't load it; we'll throw an error.
17:45:20 <jwy> ah
17:45:36 <thinrichs1> jwy: the original problem was that we weren't even trying to load the rules into the policy engine.
17:45:46 <thinrichs1> So we didn't get an error. We just didn't load anything.
17:45:56 <thinrichs1> Then once I fixed it to actually load the rules, I saw the errors.
17:46:20 <jwy> glad you found those!
17:46:28 <thinrichs1> jwy: thanks for pointing those out.
17:46:48 <thinrichs1> I think it's probably a weird case when this would actually be problematic.
17:46:56 <thinrichs1> But it's just not functional as it stands.
17:47:17 <thinrichs1> And if someone deletes a datasource named 'nova' and creates a new one also called 'nova', we weren't doing the right thing.
17:47:33 <thinrichs1> This should fix all that.
17:47:46 <thinrichs1> Okay. Time to open it up for discussion.
17:48:07 <thinrichs1> Oops.. first: is anyone else here that wants to give a status update?
17:49:03 <thinrichs1> Okay—open discussion it is.
17:49:06 <thinrichs1> #topic open discussion
17:51:36 <jwy> it's daylight savings this weekend, so the local time for this meeting for some folks will change
17:51:42 <jwy> starting next week
17:52:00 <thinrichs1> jwy: thanks for the reminder!
17:53:06 <thinrichs1> Thanks for the meeting all!
See you next week (an hour later b/c of daylight savings for some of us, I believe).
17:53:09 <thinrichs1> #endmeeting