00:01:23 <ekcs> #startmeeting congressteammeeting
00:01:24 <openstack> Meeting started Thu Dec  8 00:01:23 2016 UTC and is due to finish in 60 minutes.  The chair is ekcs. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:01:25 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:01:27 <openstack> The meeting name has been set to 'congressteammeeting'
00:02:01 <ekcs> hi all! hope everyone’s having a nice week.
00:03:08 <ekcs> a couple topics I have for today.
00:03:22 <ekcs> - gating issues (again!)
00:03:23 <ekcs> - ocata-2
00:03:24 <ekcs> - open discussion
00:03:26 <ekcs> anything else?
00:05:31 <ekcs> ok then let’s get started.
00:05:43 <ekcs> #topic gating issues
00:07:19 <ekcs> Gate has been blocked again.
00:07:45 <ekcs> This time it’s mysterious subunit parser errors in the pythonXX tests. For example: http://logs.openstack.org/29/254429/8/check/gate-congress-python27-ubuntu-xenial/b3dacaf/testr_results.html.gz
00:08:44 <ekcs> clarkb had some thoughts on the ML, but I’ve still not been able to diagnose and/or fix.
00:08:50 <ekcs> #link http://lists.openstack.org/pipermail/openstack-dev/2016-December/108497.html
00:09:24 <thinrichs> I've never seen this kind of error
00:09:34 <thinrichs> What is subunit.parser even doing?  Are we calling it?
00:09:39 <ramineni_> ekcs: but in logs all tests seem to pass right
00:10:05 <ramineni_> thinrichs: it's called by testr i guess
00:10:07 <clarkb> it's the data protocol between testr and the test runners
00:10:18 <ekcs> I tried disabling test_congress_haht, which seems to have reduced the issue, but not eliminated it. #link https://review.openstack.org/#/c/408326/
00:10:24 <clarkb> so that you can have massively parallel testing (potentially across many machines)
00:10:44 <clarkb> I also noticed that when I tried to run the tests locally the test suite leaked a process
00:10:47 <clarkb> (not sure if related)
00:11:57 <ekcs> oh hi clarkb ! I tried disabling the tests that could leak the process, which seems to have reduced the problem, but not entirely.
00:12:41 <thinrichs> clarkb: thanks.  Any thoughts on how we debug?  Know what some of the common causes of that kind of subunit failure might be?
00:13:03 <ekcs> so anyway that’s just an update. holding up quite a few existing and coming patches. wish I had more to say about it.
00:13:14 <clarkb> ya I posted my ideas on the mailing list
00:13:38 <clarkb> basically run the tests without testr and see if the subunit stream is corrupting (or even run without subunit and see if things are crashing)
00:14:05 <clarkb> I don't know that I have ever seen this particular issue before. Most common issues are related to running tests in parallel (which you don't seem to be doing here)
00:15:02 <ekcs> thanks clarkb ! could you elaborate a little bit on what could cause this when tests are run in parallel?
00:16:14 <clarkb> I don't mean that parallel testing would cause this
00:16:33 <clarkb> just that test suite related failures are common with tests run in parallel (because the tests can't assume that sharing resources is safe)
00:16:44 <clarkb> I haven't seen this before either serial or parallel
00:17:00 <thinrichs> clarkb: I just reread your note on the ML.  Very thorough.  Thanks.
00:17:35 <ekcs> clarkb: got it. thanks!
00:17:43 <masahito> hi. sorry late. previous meeting didn't end in time.
00:17:47 <thinrichs> Do we have multiple processes running that are writing to the same log?
00:18:26 <ekcs> hi masahito !
00:18:31 <clarkb> thinrichs: they use stdout, so potentially yes
00:19:01 <clarkb> likely if you are forking additional things you will want to close stdout or reattach to some other fd
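A minimal sketch of clarkb's suggestion above, assuming the extra processes are started with Python 3's subprocess module. The helper name spawn_quiet and the log_path parameter are hypothetical and are not taken from the Congress test suite. The idea is only that a child which never inherits the test runner's stdout cannot interleave its own output with the subunit stream.

```python
import subprocess
import sys


def spawn_quiet(cmd, log_path=None):
    """Start a child process that cannot write to the parent's stdout.

    Hypothetical helper: with no log_path the child's output is discarded,
    otherwise stdout and stderr are appended to the given file.
    """
    if log_path is None:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    # The child receives its own copy of the file descriptor, so the parent
    # can close the file as soon as Popen returns.
    with open(log_path, "ab") as log:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
        )


if __name__ == "__main__":
    # The child prints, but nothing reaches this process's stdout.
    child = spawn_quiet([sys.executable, "-c", "print('noise')"])
    child.wait()
```

Whether this is the actual cause of the parser errors is not established in the discussion; the sketch only shows the mechanism being suggested.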
00:19:47 <ekcs> thinrichs: we might have that in test_congress_haht. But same issue occurred even after disabling that. I’ll have to dig deeper.
00:20:32 <ekcs> Well anyway I’m going to keep at it and hopefully have a resolution.
00:21:11 <thinrichs> ekcs: is haht the only test with multiple processes?
00:21:16 <ekcs> feel free to dig into it too all. even a temp work-around would be really helpful at this point.
00:21:31 <thinrichs> okay
00:21:52 <clarkb> also might be worth reaching out to lifeless directly
00:21:55 <ekcs> thinrichs: I’m pretty sure. but not 100%
00:21:58 <clarkb> as that may mean more to him as testr maintainer
00:22:22 <ekcs> thinrichs: I will verify that.
00:22:43 <ekcs> got it thanks a lot clarkb !
00:22:59 <ekcs> so anything else on this topic before we move on?
00:26:14 <ekcs> ok moving on then. we can come back to it in open discussion if there are more thoughts.
00:26:23 <ekcs> #topic ocata-2
00:27:11 <ekcs> again time flies by in this cycle. or maybe in every cycle. but ocata-2 is due next week.
00:28:14 <ekcs> here are the bugs targeting ocata-2. #link https://launchpad.net/congress/+milestone/ocata-2
00:28:46 <ekcs> let’s talk a little bit about which bugs we need to retarget.
00:29:05 <ekcs> I’m guessing all the medium priority ones would be retargeted.
00:30:02 <ekcs> “Remove openstack incubator code” is actually already done. I’ll mark it later.
00:31:39 <ekcs> “two nodes may insert rules causing unsupported recursion” turns out to be a bit complex and I’m going to need more time for that. I have a rough spec up today #link https://review.openstack.org/408352
00:33:44 <ekcs> any other thoughts?
00:34:32 <thinrichs> For this one...
00:34:34 <thinrichs> https://bugs.launchpad.net/congress/+bug/1637172
00:34:34 <openstack> Launchpad bug 1637172 in congress "rule using policy:table(…) reference fails to create" [High,Confirmed] - Assigned to Tim Hinrichs (thinrichs)
00:34:56 <thinrichs> there's the patch that's basically trying to not subscribe to services that don't exist.
00:35:27 <thinrichs> Pre oslo-messaging we subscribed to everything, but oslo-messaging doesn't like that.  Correct?
00:36:25 <thinrichs> Ideally a policy-writer could write something like p :- foo:bar(x) and whenever foo is a datasource, the policy engine ensures it's subscribed to foo.
00:36:26 <ekcs> I think so. Here’s the patch: https://review.openstack.org/#/c/404478/
00:36:38 <thinrichs> But when 'foo' isn't a datasource, the policy engine should know to not subscribe.
00:37:06 <masahito> I think so.
00:37:30 <thinrichs> So it sounds like the right solution is to have the policy engine know it's got say N potential datasources that it's interested in and there are M datasources that actually exist.
00:37:46 <thinrichs> So then it should maintain the invariant that it is always subscribed to the intersection of M and N.
00:38:18 <thinrichs> That invariant requires action when (i) the policies change, which is the part we're handling in that patch and (ii) the datasources change, which we don't handle.
00:38:46 <thinrichs> Do we have an event-handler inside the policy-engine for when the available datasources change?  (Perhaps on a heartbeat message?)
00:41:43 <ekcs> thinrichs: hmmm. we may be able to tweak dse_node to maintain each subscription as a potential subscription, so all the logic is inside DSE.
00:43:07 <ekcs> thinrichs: so subscribe_table in DSE declares interest in a potentially not-yet existent datasource. dse_node handles hooking up the data if that datasource comes into existence.
00:43:33 <ekcs> need to be careful with the initial snapshot to get things to work right.
00:44:19 <thinrichs> Okay—just a thought
00:44:51 <ekcs> but regardless seems we all agree it’s important to allow the rule creation before the datasource exists right?
00:45:18 <thinrichs> Yes
00:46:11 <masahito> yes
00:47:06 <ekcs> yea it should be doable. the whole system is set up so that the publisher just posts to a common channel and the subscribers pick up the messages they are interested in. details will need to be worked out.
00:48:23 <ekcs> need to be careful with interactions with lazy polling, differential update vs snapshot, etc.
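The invariant thinrichs describes above (always be subscribed to the intersection of the datasources the policies reference and the datasources that actually exist) can be sketched roughly as follows. This is an illustration under stated assumptions, not Congress or DSE code: the class SubscriptionTracker, its methods, and the subscribe/unsubscribe callbacks are all hypothetical names. Snapshot handling, lazy polling, and differential updates, which ekcs flags as the delicate parts, are deliberately left out.

```python
class SubscriptionTracker:
    """Keeps `subscribed` equal to wanted & available (hypothetical helper)."""

    def __init__(self, subscribe, unsubscribe):
        self._subscribe = subscribe      # callback taking a datasource name
        self._unsubscribe = unsubscribe  # callback taking a datasource name
        self.wanted = set()      # datasources referenced by policy rules
        self.available = set()   # datasources that currently exist
        self.subscribed = set()  # invariant: wanted & available

    def set_wanted(self, names):
        """Case (i): the policies changed."""
        self.wanted = set(names)
        self._reconcile()

    def set_available(self, names):
        """Case (ii): the set of existing datasources changed."""
        self.available = set(names)
        self._reconcile()

    def _reconcile(self):
        """Subscribe and unsubscribe until the invariant holds again."""
        target = self.wanted & self.available
        for name in sorted(self.subscribed - target):
            self._unsubscribe(name)
        for name in sorted(target - self.subscribed):
            self._subscribe(name)
        self.subscribed = target
```

With this shape, a rule such as p :- foo:bar(x) can be created before foo exists: foo sits in wanted, and the subscription is made only when foo later shows up in available.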
00:49:31 <ekcs> anything else on ocata-2 or related bugs/patches? feel free to comment/target/tag on the bugs themselves too.
00:50:09 <masahito> I want your opinions on this https://bugs.launchpad.net/congress/+bug/1638742
00:50:09 <openstack> Launchpad bug 1638742 in congress "support out-of-the-box policies" [High,New] - Assigned to Masahito Muroi (muroi-masahito)
00:50:41 <masahito> I can imagine 3 options for it.
00:51:14 <masahito> 1. writing real policy and policy rules in local.conf
00:52:00 <masahito> 2. writing a file path in local.conf that points to a file holding the policy and policy rules
00:52:45 <masahito> 3. writing a URL for a repository outside the devstack node and downloading the policy from it
00:53:24 <thinrichs> I'd lean toward (2).
00:53:30 <ramineni_> a json file would be good for storing policies?
00:53:33 <ramineni_> +1
00:53:54 <thinrichs> Writing policy inside local.conf will make it difficult to manage multiple policies.  Downloading from a URL would be nice, but probably v2.
00:54:42 <ekcs> makes sense.
00:54:48 <thinrichs> A JSON file would make sense.  It would let us capture the policy name and other metadata.
00:54:59 <thinrichs> +1 ramineni_
00:55:07 <masahito> ok, 2 looks better for us.
00:55:44 <masahito> and will try to use a json file to hold the pre-defined policies.
00:55:48 <masahito> thanks all.
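As a rough illustration of option 2 with a JSON file, the sketch below loads one policy file and checks that it carries a name and a list of rules. The file layout (keys "name", "description", "rules") and the function load_policy_file are assumptions made for this example; no format was agreed in the meeting beyond "a JSON file that can capture the policy name and other metadata".

```python
import json

# Assumed minimal schema: a policy name plus a list of rule strings.
REQUIRED_KEYS = ("name", "rules")


def load_policy_file(path):
    """Read a JSON policy file and return it as a dict (hypothetical format)."""
    with open(path) as f:
        policy = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in policy]
    if missing:
        raise ValueError(
            "policy file %s is missing keys: %s" % (path, ", ".join(missing))
        )
    if not all(isinstance(rule, str) for rule in policy["rules"]):
        raise ValueError("policy file %s: every rule must be a string" % path)
    return policy
```

A file for this loader might look like {"name": "example", "description": "demo policy", "rules": ["p(x) :- foo:bar(x)"]}, with local.conf holding only the path to it.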
00:56:42 <ekcs> ok last 4 minutes.
00:56:48 <ekcs> #topic open discussion
00:57:12 <ekcs> anything to wrap up or bring up?
00:59:49 <ekcs> all right that’s all the time we have. Thanks all and have a great week!
01:00:06 <ekcs> #endmeeting