#openstack-meeting log

00:00:44 <thinrichs> #startmeeting CongressTeamMeeting
00:00:45 <openstack> Meeting started Thu Aug 20 00:00:44 2015 UTC and is due to finish in 60 minutes.  The chair is thinrichs. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:00:46 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:00:49 <openstack> The meeting name has been set to 'congressteammeeting'
00:00:55 <RuiChen> hi thinrichs
00:01:23 <thinrichs> Hi RuiChen
00:02:16 <thinrichs> I have just a couple of technical things on the agenda:
00:02:29 <thinrichs> 1. Discuss tables with differing numbers of columns
00:02:35 <thinrichs> 2. Gating on tempest tests
00:02:46 <thinrichs> 3. Progress on distributed architecture
00:02:51 <thinrichs> Is there anything else?
00:04:24 <thinrichs> If anything comes to mind, pipe up.
00:04:30 <RuiChen> if you have anything that needs my help, please feel free to assign it to me :0
00:04:34 <RuiChen> :)
00:05:10 <thinrichs> RuiChen: thanks!  Some of the blueprints for the distributed architecture are unclaimed.
00:05:27 <RuiChen> yeah, I see it
00:05:29 <thinrichs> #link https://blueprints.launchpad.net/congress
00:06:08 <thinrichs> Not as many as I had thought though.
00:06:47 <thinrichs> Let's start with discussing this code change:
00:06:51 <thinrichs> #link https://review.openstack.org/#/c/213283/
00:07:00 <Yingxin1> :)
00:07:13 <thinrichs> Yingxin1: want to describe the problem it's solving?
00:07:22 <Yingxin1> yes
00:08:17 <Yingxin1> A table is used to generate other tables, report errors, and execute actions.
00:09:43 <Yingxin1> But I found that a table with differing numbers of columns fails to do any of them.
00:10:18 <Yingxin1> And currently there are many bugs in supporting tables with differing numbers of columns
00:10:29 <Yingxin1> So I pushed this patch.
00:11:23 <thinrichs> Right.  The main reason we've been hesitant to eliminate differing column counts for a single table is error.
00:11:32 <thinrichs> The error table, that is.
00:11:45 <thinrichs> It's always been useful for us to write the rule..
00:11:50 <thinrichs> error(vm) :- blah
00:11:54 <thinrichs> and also write the rule...
00:12:00 <thinrichs> error(vm, net) :- blah
00:12:40 <thinrichs> I asked Murano if they were using multi-arity tables, and they said no.
00:12:51 <thinrichs> So there's no customer we know of that this would break.
00:13:32 <Yingxin1> Yes, but an error table that collects many errors should also tell where these errors come from.
00:13:39 <thinrichs> In the long run, we want to enable people to write different kinds of error statements.
00:14:08 <thinrichs> Yingxin1: could you explain what you mean by "where these errors come from"?
00:14:13 <thinrichs> Maybe with an example.
00:15:49 <Yingxin1> Say there are many rules generating errors into an 'error()' table.
00:16:06 <Yingxin1> Then there will be many kinds of errors in a single error table.
00:16:55 <Yingxin1> But a single error table cannot tell what kind of error it is.
00:17:08 <Yingxin1> So it confuses users.
00:18:11 <thinrichs> I don't think that having multiple error tables will be enough to explain where the error came from, but it might help classify the type of error.
00:18:29 <thinrichs> It sounds like you want multiple error tables.
00:19:00 <thinrichs> When someone asks for all the policy violations, we still need a way of displaying all those error tables.
00:19:25 <thinrichs> That is, having multiple tables doesn't change the work we need to do at the CLI/Horizon layer.
00:19:50 <thinrichs> It just enables us to add a syntax check.
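A minimal sketch of the kind of syntax check mentioned above, assuming rules arrive as plain strings with a simple `name(args) :- body` head. The regular expression and the `arity_conflicts` helper are hypothetical and are not taken from the Congress code base.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches a rule head such as "error(vm, net) :- blah".
HEAD = re.compile(r"^\s*(\w+)\s*\(([^)]*)\)\s*:-")


def arity_conflicts(rules):
    """Return {table: sorted column counts} for tables whose heads disagree."""
    seen = defaultdict(set)
    for rule in rules:
        match = HEAD.match(rule)
        if match is None:
            continue  # not a simple rule head; ignored in this sketch
        name, args = match.groups()
        columns = [a for a in args.split(",") if a.strip()]
        seen[name].add(len(columns))
    return {name: sorted(counts) for name, counts in seen.items() if len(counts) > 1}


conflicts = arity_conflicts([
    "error(vm) :- blah",
    "error(vm, net) :- blah",
    "p(x) :- q(x)",
])
```

Here `conflicts` flags only `error`, because it is the one table defined with both one and two columns.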
00:20:25 <thinrichs> I think in the long run, I'd lean toward turning 'error' into a modal operator.
00:20:30 <thinrichs> So we'd write something like…
00:20:38 <thinrichs> error[table1(vm)] :- blah
00:20:47 <thinrichs> error[table2(vm, net)] :- blah
00:20:57 <thinrichs> Then someone can ask for all the errors with a query like:
00:21:07 <thinrichs> give me all x such that error[x] is true.
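The modal query can be illustrated roughly as follows. This is only a sketch under assumptions: the in-memory `facts` list, the `(modal, table, row)` layout, and the `modal_query` helper are all hypothetical, and nothing here reflects how Congress stores or evaluates policy.

```python
# Hypothetical in-memory store: each fact is (modal, table, row).
facts = [
    ("error", "table1", ("vm1",)),
    ("error", "table2", ("vm2", "net7")),
    (None, "table1", ("vm3",)),  # a table1 row that is not flagged as an error
]


def modal_query(facts, modal):
    """Return every (table, row) pair x such that modal[x] holds."""
    return [(table, row) for m, table, row in facts if m == modal]


errors = modal_query(facts, "error")
```

One query returns rows from tables with different column counts, while each individual table keeps a single fixed column count.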
00:22:00 <thinrichs> What do we all think?
00:22:06 <Yingxin1> Yes, error table can be a special one.
00:22:35 <RuiChen> how can we get the rows of the different error tables?
00:22:42 <RuiChen> congress policy row list chenrui_p error ?
00:23:23 <thinrichs> With Yingxin1's change, you'd ask multiple queries:
00:23:32 <thinrichs> congress policy row list chenrui_p error1
00:23:38 <thinrichs> congress policy row list chenrui_p error2
00:24:13 <thinrichs> Another change would have a special CLI command, e.g.
00:24:23 <Yingxin1> Or the error tables can have a naming convention
00:24:40 <thinrichs> congress policy row list modal chenrui_p error
00:24:52 <Yingxin1> Congress can search for them automatically.
00:25:01 <thinrichs> Yingxin1: you still need the special API call to do that.
00:25:32 <thinrichs> I'm proposing a clean way to implement that API call and to write the policy rules.
00:26:28 <Yingxin1> I think so.
00:27:06 <thinrichs> RuiChen: any objection to eliminating differing column-numbers for a single table?
00:27:17 <thinrichs> It's time to move on.
00:27:56 <RuiChen> no, but I think maybe we need a spec to clarify it?
00:28:57 <thinrichs> I don't think we need a spec.  The code is already written.
00:29:12 <thinrichs> It's more a usability question than a technical question.
00:29:56 <thinrichs> Let's follow up on comments on the code itself.
00:30:06 <thinrichs> #topic Gating on Tempest tests
00:30:23 <thinrichs> I've been trying to get the tempest tests working so we can add them to our gate.
00:30:33 <thinrichs> Ensuring no code gets merged unless it passes the tempest tests.
00:30:58 <thinrichs> This came up again this week when Congress wouldn't start in devstack, which the Murano team noticed.
00:31:20 <thinrichs> The problem seems to be that the tempest tests fail > 50% of the time.
00:31:28 <thinrichs> But it's a different test that fails each time.
00:31:30 <thinrichs> Here's the latest.
00:31:45 <thinrichs> http://logs.openstack.org/27/214327/5/experimental/gate-congress-dsvm-api/95cf8cd/console.html
00:32:04 <thinrichs> Search for: Failed 2 tests
00:32:31 <thinrichs> I've tried upping the timeouts and disabling tests, just so we can get basic devstack integration tested on each checkin.
00:32:40 <thinrichs> But that hasn't worked out so well.
00:33:54 <thinrichs> At this point, I'm thinking about just disabling all but a couple of tests that work reliably
00:34:15 <thinrichs> and gating on those.
00:34:25 <thinrichs> Then we can fix the flaky tests later.
00:34:31 <thinrichs> Thoughts?
00:34:39 <pballand> +1 for that, even if the tests are trivial
00:36:11 <thinrichs> pballand: I saw your comment; do you know if SKIPtest() works?
00:37:03 <pballand> I don’t - I just figured if you were renaming methods, using “SKIP” instead of XXX would be better
00:37:28 <alexsyip> I second thinrichs's plan.
00:37:32 <thinrichs> Oh—got it.  Thought you were saying I was using the wrong method name.
00:37:47 <thinrichs> Will do.  Seems tempest doesn't like us to skip tests.
00:37:49 <thinrichs> #link http://docs.openstack.org/developer/tempest/HACKING.html#skipping-tests
00:38:45 <thinrichs> Related question for the group: anyone know how to stop tempest from running entire files of tests?
00:39:13 <thinrichs> B/c we have nova, neutron, cinder, swift, etc. running in devstack, tempest is running all of the tests it has for all of those services too.
00:39:18 <thinrichs> Instead of just the Congress tests.
00:40:39 <Sayaji> thinrichs: We can provide a file with the list of tests to run to tempest
00:40:40 <RuiChen> maybe we can remove the file name prefix 'test***'
00:41:06 <thinrichs> Sayaji: any docs on that?  Can we do it in the gate pipeline?
00:41:55 <thinrichs> BTW: found the proper way to skip tests in tempest:
00:41:56 <thinrichs> #link http://docs.openstack.org/developer/tempest/HACKING.html#test-skips-because-of-known-bugs
00:42:09 <Sayaji> thinrichs: I have to look it up
00:42:11 <thinrichs> TLDR: Use the @skip_because decorator
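To show the idea of skipping a test while citing a known bug, here is a self-contained stand-in built on the standard library's `unittest`. The `skip_because` defined below is a hypothetical imitation written for this sketch, not tempest's own decorator, and the test class and method names are made up; the bug number is the Launchpad bug mentioned later in this meeting, used only as an illustration.

```python
import functools
import unittest


def skip_because(bug):
    """Stand-in decorator (not tempest's code): skip a test, citing a bug number."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            raise unittest.SkipTest("Skipped until bug %s is resolved" % bug)
        return wrapper
    return decorator


class TestPolicy(unittest.TestCase):
    @skip_because(bug="1486246")
    def test_flaky(self):
        self.fail("never reached")

    def test_reliable(self):
        self.assertTrue(True)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPolicy)
result = unittest.TestResult()
suite.run(result)
```

The skipped test is recorded as a skip rather than a failure, so a gate that runs only on failures would still pass while the flaky test stays visible in the report.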
00:42:32 <thinrichs> Sayaji: could you look it up and send us a link (email or whatever)?
00:42:47 <Sayaji> thinrichs: Sure, will do that
00:43:12 <thinrichs> Sayaji: thanks!
00:43:24 <Sayaji> thinrichs: np
00:43:34 <thinrichs> Next topic: status reports.
00:43:36 <thinrichs> #topic status
00:43:56 <thinrichs> Mainly on the distributed architecture, but anything the group needs to discuss is fine too.
00:44:15 <thinrichs> Since we're short on time, just volunteer if you have something.
00:44:47 <alexsyip> I checked in some changes required for rule synchronization.
00:45:05 <pballand> I pushed a draft of the dist-cross-process-dse spec.  Design is still in progress, but it would be helpful if people could check the requirements
00:45:16 <alexsyip> I started addressing some comments for the startup script.
00:45:20 <pballand> s/requirements/Problem description/
00:45:22 <alexsyip> That’s all.
00:45:22 <thinrichs> alexsyip: was that the change that eliminated duplicates from the DB?
00:45:28 <alexsyip> Yes
00:45:33 <zhenzanz> Tim: I took your bug https://bugs.launchpad.net/congress/+bug/1486246 Neutron and HA tempest tests broken
00:45:33 <openstack> Launchpad bug 1486246 in congress "Neutron and HA tempest tests broken" [High,New] - Assigned to Zhenzan Zhou (zhenzan-zhou)
00:45:52 <thinrichs> alexsyip: I had to revert it because it wouldn't work with MySQL
00:46:02 <zhenzanz> do you have detailed logs?
00:46:14 <alexsyip> mysql doesn’t support unique?
00:46:27 <thinrichs> alexsyip: mysql requires the field to have a fixed width for it to be unique
00:46:39 <thinrichs> And I think it could be at most 255 chars long.
00:46:50 <thinrichs> (I could be wrong about that one though.)
00:46:57 <alexsyip> That’s unfortunate.
00:47:21 <thinrichs> alexsyip: yep.  Also, we need to write a DB migration script every time we change the schema.
00:47:33 <thinrichs> Turns out to be just a matter of running a script.
00:47:39 <thinrichs> I had forgotten about it entirely.
00:48:03 <thinrichs> #link https://github.com/openstack/congress/tree/master/congress/db/migration
00:48:27 <thinrichs> zhenzanz: turns out it's not just the neutron tests.
00:48:35 <thinrichs> zhenzanz: there are a bunch that nondeterministically fail
00:48:55 <thinrichs> zhenzanz: see above discussion about tempest tests and gate.
00:49:26 <thinrichs> pballand: cool!
00:49:32 <zhenzanz> ok, thanks. Sorry I'm late
00:49:46 <thinrichs> #link https://review.openstack.org/#/c/214893/1/specs/liberty/dist-cross-process-dse
00:50:54 <thinrichs> Let's all make a point of getting pballand comments on the spec over the next few days.
00:51:02 <thinrichs> #action all review dist-cross-process-dse spec
00:51:34 <thinrichs> zhenzanz: no worries.  Figured you had missed that bit of the meeting.
00:51:38 <pballand> thinrichs: thanks - unfortunately there is no full design there, but I wanted feedback sooner rather than later to make sure we are in agreement on the problem we are solving
00:52:03 <thinrichs> pballand: sounds like a good idea to me.
00:52:30 <thinrichs> pballand: remember that we already have broad consensus for the approach
00:52:45 <thinrichs> pballand: so I wouldn't be timid about getting it out there.
00:53:00 <pballand> sure
00:53:25 <pballand> I am working on improving my DseNode skeleton, and updating the design incrementally
00:53:47 <thinrichs> pballand: nice!  I think a lot will become clearer once we have a first cut of that.
00:53:55 <thinrichs> That brings to mind…
00:54:27 <thinrichs> I was wondering if it's worth debugging the RPC functionality that we're adding *before* moving to the fully distributed architecture.
00:54:46 <thinrichs> That is, we could get everything working over RPC using the existing DSE
00:54:46 <pballand> what do you mean by debugging?
00:55:04 <thinrichs> and then once we move to the RPC of oslo.messaging, at least we've eliminated some basic problems.
00:56:08 <thinrichs> So we'd be sending messages over the deepsix messaging functionality.
00:56:44 <thinrichs> Instead of reaching into the policy engines and datasources directly when implementing the API.
00:57:11 <thinrichs> Or would that be too much work?
00:57:23 <thinrichs> We already have an implementation of RPC using deepsix (though we abandoned it).
00:57:33 <thinrichs> We could resurrect it.
00:58:37 <pballand> not sure…
00:58:49 <thinrichs> 2 minutes left.
00:58:52 <thinrichs> Anything else?
00:59:03 <pballand> if we use the same rpc method name, though, it may make the transition easier
00:59:17 <thinrichs> pballand: agreed.
01:00:04 <thinrichs> It'll help us make sure we've removed all references to external services and singletons.
01:00:10 <thinrichs> Time's up.
01:00:31 <thinrichs> Let's keep making good progress!
01:00:35 <thinrichs> Bye
01:00:38 <thinrichs> #endmeeting