00:01:23 #startmeeting congressteammeeting
00:01:24 Meeting started Thu Dec 8 00:01:23 2016 UTC and is due to finish in 60 minutes. The chair is ekcs. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:01:25 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:01:27 The meeting name has been set to 'congressteammeeting'
00:02:01 hi all! hope everyone’s having a nice week.
00:03:08 a couple topics I have for today.
00:03:22 - gating issues (again!)
00:03:23 - ocata-2
00:03:24 - open discussion
00:03:26 anything else?
00:05:31 ok then let’s get started.
00:05:43 #topic gating issues
00:07:19 Gate has been blocked again.
00:07:45 This time it’s mysterious subunit parser errors in the pythonXX tests. For example: http://logs.openstack.org/29/254429/8/check/gate-congress-python27-ubuntu-xenial/b3dacaf/testr_results.html.gz
00:08:44 clarkb had some thoughts on the ML, but I’ve still not been able to diagnose and/or fix.
00:08:50 #link http://lists.openstack.org/pipermail/openstack-dev/2016-December/108497.html
00:09:24 I've never seen this kind of error
00:09:34 What is subunit.parser even doing? Are we calling it?
00:09:39 ekcs: but in logs all tests seem to pass right
00:10:05 thinrichs: its called by testr i guess
00:10:07 its the data protocol between testr and the test runners
00:10:18 I tried disabling test_congress_haht, which seems to have reduced the issue, but not eliminated it. #link https://review.openstack.org/#/c/408326/
00:10:24 so that you can have massively parallel testing (potentially across many machines)
00:10:44 I also noticed that when I tried to run the tests locally the test suite leaked a process
00:10:47 (not sure if related)
00:11:57 oh hi clarkb ! I tried disabling the tests that could leak the process, which seems to have reduced the problem, but not entirely.
00:12:41 clarkb: thanks. Any thoughts on how we debug? Know what some of the common causes of that kind of subunit failure might be?
00:13:03 so anyway that’s just an update. holding up quite a few existing and coming patches. wish I had more to say about it.
00:13:14 ya I posted my ideas on the mailing list
00:13:38 basically run the tests without testr and see if the subunit stream is corrupting (or even run without subunit and see if things are crashing)
00:14:05 I don't know that I have ever seen this particular issue before. Most common issues are related to running tests in parallel (which you don't seem to be doing here)
00:15:02 thanks clarkb ! could you elaborate a little bit on what could cause this when tests are run in parallel?
00:16:14 I don't mean that parallel testing would cause this
00:16:33 just that test suite related failures are common with tests run in parallel (because the tests can't assume that sharing resources is safe)
00:16:44 I haven't seen this before either serial or parallel
00:17:00 clarkb: I just reread your note on the ML. Very thorough. Thanks.
00:17:35 clarkb: got it. thanks!
00:17:43 hi. sorry late. previous meeting didn't end in time.
00:17:47 Do we have multiple processes running that are writing to the same log?
00:18:26 hi masahito !
00:18:31 thinrichs: they use stdout, so potentially yes
00:19:01 likely if you are forking additional things you will want to close stdout or reattach to some other fd
00:19:47 thinrichs: we might have that in test_congress_haht. But same issue occurred even after disabling that. I’ll have to dig deeper.
00:20:32 Well anyway I’m going to keep at it and hopefully have a resolution.
00:21:11 ekcs: is haht the only test with multiple processes?
00:21:16 feel free to dig into it too all. even a temp work-around would be really helpful at this point.
00:21:31 okay
00:21:52 also might be worth reaching out to lifeless directly
00:21:55 thinrichs: I’m pretty sure. but not 100%
00:21:58 as that may mean more to him as testr maintainer
00:22:22 thinrichs: I will verify that.
00:22:43 got it thanks a lot clarkb !
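A minimal sketch of the "close stdout or reattach to some other fd" suggestion above, assuming the extra processes are started from Python with the standard `subprocess` module. The helper name `run_child_quietly` is hypothetical and is not taken from the Congress test suite; it only illustrates keeping a child's output away from the parent's stdout, where a binary stream such as subunit may be flowing.

```python
import subprocess
import sys


def run_child_quietly(args):
    """Run a child process without letting it write to our stdout.

    The child's stdout is captured through a pipe and its stderr is
    discarded, so nothing the child prints can interleave with whatever
    the parent process emits on its own stdout.
    (Hypothetical helper, for illustration only.)
    """
    completed = subprocess.run(
        args,
        stdin=subprocess.DEVNULL,    # child gets no access to our stdin
        stdout=subprocess.PIPE,      # child output goes to a pipe, not our stdout
        stderr=subprocess.DEVNULL,   # child errors are dropped in this sketch
        check=False,
    )
    return completed.returncode, completed.stdout
```

For example, `run_child_quietly([sys.executable, "-c", "print('noise')"])` returns the exit code and the captured bytes; a real test would more likely send the child's output to a log file than discard stderr.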
00:22:59 so anything else on this topic before we move on?
00:26:14 ok moving on then. we can come back to it in open discussion if there are more thoughts.
00:26:23 #topic ocata-2
00:27:11 again time flies by in this cycle. or maybe in every cycle. but ocata-2 is due next week.
00:28:14 here are the bugs targeting ocata-2. #link https://launchpad.net/congress/+milestone/ocata-2
00:28:46 let’s talk a little bit about which bugs we need to retarget.
00:29:05 I’m guessing all the medium priority ones would be retargeted.
00:30:02 “Remove openstack incubator code” is actually already done. I’ll mark it later.
00:31:39 “two nodes may insert rules causing unsupported recursion” turns out to be a bit complex and I’m going to need more time for that. I have a rough spec up today #link https://review.openstack.org/408352
00:33:44 any other thoughts?
00:34:32 For this one...
00:34:34 https://bugs.launchpad.net/congress/+bug/1637172
00:34:34 Launchpad bug 1637172 in congress "rule using policy:table(…) reference fails to create" [High,Confirmed] - Assigned to Tim Hinrichs (thinrichs)
00:34:56 there's the patch that's basically trying to not subscribe to services that don't exist.
00:35:27 Pre oslo-messaging we subscribed to everything, but oslo-messaging doesn't like that. Correct?
00:36:25 Ideally a policy-writer could write something like p :- foo:bar(x) and whenever foo is a datasource, the policy engine ensures it's subscribed to foo.
00:36:26 I think so. Here’s the patch: https://review.openstack.org/#/c/404478/
00:36:38 But when 'foo' isn't a datasource, the policy engine should know to not subscribe.
00:37:06 I think so.
00:37:30 So it sounds like the right solution is to have the policy engine know it's got say N potential datasources that it's interested in and there are M datasources that actually exist.
00:37:46 So then it should maintain the invariant that it is always subscribed to the intersection of M and N.
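A minimal sketch of the invariant just described, assuming both sets are plain Python sets of datasource names. `SubscriptionTracker` and its methods are hypothetical names for illustration; they are not the Congress policy engine or DSE API. The idea is only that the subscribed set is recomputed as the intersection whenever either input set changes.

```python
class SubscriptionTracker:
    """Keeps the subscribed set equal to (wanted intersected with existing).

    Hypothetical illustration, not Congress code.
    """

    def __init__(self):
        self.wanted = set()      # datasources referenced by policy rules (the "N")
        self.existing = set()    # datasources that actually exist (the "M")
        self.subscribed = set()  # always wanted & existing after each update

    def _reconcile(self):
        # Recompute the target set and report what has to change.
        target = self.wanted & self.existing
        to_subscribe = target - self.subscribed
        to_unsubscribe = self.subscribed - target
        self.subscribed = target
        return to_subscribe, to_unsubscribe

    def set_wanted(self, names):
        """Call when the policies change."""
        self.wanted = set(names)
        return self._reconcile()

    def set_existing(self, names):
        """Call when the set of available datasources changes."""
        self.existing = set(names)
        return self._reconcile()
```

With this shape, a rule that references a datasource that does not exist yet simply leaves that name in `wanted`; the subscription is made later, when the name shows up in `existing`.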
00:38:18 That invariant requires action when (i) the policies change, which is the part we're handling in that patch and (ii) the datasources change, which we don't handle.
00:38:46 Do we have an event-handler inside the policy-engine for when the available datasources change? (Perhaps on a heartbeat message?)
00:41:43 thinrichs: hmmm. we may be able to tweak dse_node to maintain a subscription as a potential subscription. so all the logic is inside DSE.
00:43:07 thinrichs: so subscribe_table in DSE declares interest in a potentially not-yet-existent datasource. dse_node handles hooking up the data if that datasource comes into existence.
00:43:33 need to be careful with the initial snapshot to get things to work right.
00:44:19 Okay, just a thought
00:44:51 but regardless seems we all agree it’s important to allow the rule creation before the datasource exists right?
00:45:18 Yes
00:46:11 yes
00:47:06 yea it should be doable. the whole system is set up so that the publisher just posts to a common channel and the subscribers pick up the messages they are interested in. details will need to be worked out.
00:48:23 need to be careful with interactions with lazy polling, differential update vs snapshot, etc.
00:49:31 anything else on ocata-2 or related bugs/patches? feel free to comment/target/tag on the bugs themselves too.
00:50:09 I want your opinion on this https://bugs.launchpad.net/congress/+bug/1638742
00:50:09 Launchpad bug 1638742 in congress "support out-of-the-box policies" [High,New] - Assigned to Masahito Muroi (muroi-masahito)
00:50:41 I can imagine 3 options for it.
00:51:14 1. writing real policy and policy rules in local.conf
00:52:00 2. writing a file path in local.conf that points to a file with the policy and policy rules
00:52:45 3. writing a URL for a repository outside the devstack node and downloading the policy from it
00:53:24 I'd lean toward (2).
00:53:30 a json file would be good for storing policies?
00:53:33 +1
00:53:54 Writing policy inside local.conf will make it difficult to manage multiple policies. Downloading from a URL would be nice, but probably v2.
00:54:42 makes sense.
00:54:48 A JSON file would make sense. It would let us capture the policy name and other metadata.
00:54:59 +1 ramineni_
00:55:07 ok, 2 looks better for us.
00:55:44 and will try to use a json file to hold pre-defined policies.
00:55:48 thanks all.
00:56:42 ok last 4 minutes.
00:56:48 #topic open discussion
00:57:12 anything to wrap up or bring up?
00:59:49 all right that’s all the time we have. Thanks all and have a great week!
01:00:06 #endmeeting
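A rough sketch of what option 2 with a JSON file could look like, under stated assumptions. The file layout (top-level keys `name`, `description`, `rules`, each rule an object with a `rule` string) is invented here for illustration and was not decided in the meeting; `parse_policy_document` and `load_policy_file` are hypothetical helper names, not Congress functions.

```python
import json


def parse_policy_document(text):
    """Parse a JSON policy document into (name, description, rules).

    Assumed layout (hypothetical, not an agreed Congress format):
      {"name": "...", "description": "...", "rules": [{"rule": "..."}]}
    """
    data = json.loads(text)
    name = data["name"]                       # the policy name is required
    description = data.get("description", "")  # other metadata is optional
    rules = [entry["rule"] for entry in data.get("rules", [])]
    return name, description, rules


def load_policy_file(path):
    """Read the file at the configured path and parse it."""
    with open(path) as handle:
        return parse_policy_document(handle.read())
```

Keeping the parsing separate from the file read means the same code could later serve the URL option (3), since only the way the text is obtained would change.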