00:10:51 <thinrichs> #startmeeting CongressTeamMeeting
00:10:52 <openstack> Meeting started Thu Mar 10 00:10:51 2016 UTC and is due to finish in 60 minutes. The chair is thinrichs. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:10:53 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:10:56 <openstack> The meeting name has been set to 'congressteammeeting'
00:11:19 <masahito> thinrichs: I tested congress for Mitaka release and my presentation was accepted.
00:11:38 <masahito> https://www.openstack.org/summit/austin-2016/summit-schedule/events/7199
00:11:53 <masahito> that's the highlight.
00:12:20 <thinrichs> Congratulations!
00:12:56 <thinrichs> I'm looking forward to it.
00:13:31 <masahito> The title will be slightly changed because of some reasons, but the topic of it won't be changed.
00:14:12 <thinrichs> While we're on the topic of Austin, I requested 3 working rooms just like at the last summit.
00:14:29 <thinrichs> But there are 11 more teams requesting space this summit, so we'll see if we can get all those rooms.
00:14:43 <thinrichs> There's supposedly more room in Austin than Tokyo.
00:15:26 <thinrichs> ramineni_: want to give a status update?
00:16:12 <ramineni_> thinrichs: yes
00:16:27 <ramineni_> thinrichs: I also did some testing around, and raised some patches for bugs reported
00:17:18 <thinrichs> ramineni_: any noteworthy bugs?
00:17:24 <ramineni_> thinrichs: now our tempest should be stable as probably all the outstanding tempest changes are merged now
00:17:29 <thinrichs> masahito: same question to you
00:18:19 <thinrichs> ramineni_: great! Thanks for keeping on top of that. Gate failures due to tempest changes can really hurt the team's overall productivity. But you've kept us moving along!
00:18:21 <ramineni_> thinrichs: no, didn't find any critical ones as such
00:19:52 <thinrichs> Moving on then.
00:19:56 <thinrichs> ekcs: progress report?
00:20:14 <ekcs> Been doing testing.
I encountered 500 errors on policy creation & deletion, but I couldn't reproduce them once I restarted congress.
00:20:14 <ekcs> Could be a database connection issue. I want to investigate further at some point, but probably not a priority right now.
00:21:34 <thinrichs> No clue as to how to replicate? That sounds like it could be a race condition.
00:22:01 <thinrichs> We changed around quite a bit of code at the API layer, so there could easily be a subtle bug.
00:22:15 <ekcs> I don’t think it’s a race condition because it was very consistent. Until I restarted congress, then it’s not there.
00:22:15 <thinrichs> Did you find the problem in the logs so we know where the error happened?
00:22:22 <ekcs> Yes.
00:22:42 <ekcs> I see the exception and stack trace in logs.
00:23:34 <ekcs> The bugs are here if anyone is interested: https://bugs.launchpad.net/bugs/1554707
00:23:34 <openstack> Launchpad bug 1554707 in congress "500 when deleting policy" [Undecided,Invalid]
00:23:42 <ekcs> https://bugs.launchpad.net/bugs/1554712
00:23:43 <openstack> Launchpad bug 1554712 in congress "create policy fails" [Undecided,Invalid]
00:25:04 <thinrichs> The first one looks like potentially a real problem. Like a type mismatch.
00:25:39 <thinrichs> Trying to figure out why that would disappear after a restart
00:26:23 <thinrichs> Did you try inserting/deleting elements into the database directly to see if you could replicate?
00:26:24 <ekcs> Second one (create) I can see how that would disappear because db connection re-established and synchronize no longer fails.
00:26:41 <ekcs> first one (delete) I don’t understand either why it would go away.
00:27:13 <ekcs> thinrichs: no. Not sure what you’re suggesting.
00:28:04 <thinrichs> Both the 500s are caused by the same error: that event.target is a string but is expected to be an object.
00:28:05 <ekcs> thinrichs: say create policy. then delete it from db. then delete policy from congress?
00:29:20 <ekcs> thinrichs: Yup.
00:29:30 <ekcs> I can dig further if that’s desirable.
00:29:32 <thinrichs> ekcs: these feel like real bugs (probably the same real bug).
00:30:00 <thinrichs> ekcs: I'll mark these as critical so we make sure to figure out what's happening
00:30:10 <thinrichs> ekcs: it'd be great if you could dig into it a bit.
00:30:14 <ekcs> got it.
00:30:50 <thinrichs> Maybe even just manually looking through the code path for create/delete policy to figure out how that event.target might not be an object.
00:31:52 <ekcs> ok
00:32:08 <thinrichs> ekcs: off the top of my head, I'm surprised there's any mention of events when we are creating/deleting policies
00:32:32 <thinrichs> ekcs: definitely ping me if you want help
00:32:52 <ekcs> ok got it.
00:33:18 <thinrichs> I'll do a quick update.
00:33:44 <thinrichs> Worked a bit on one of the bugs ramineni_ found
00:33:54 <thinrichs> That code seems more or less ready...
00:34:16 <thinrichs> At least for more review
00:34:20 <thinrichs> #link https://review.openstack.org/#/c/289650/
00:34:44 <thinrichs> Also last week I did the release of Mitaka3 for both the server and client.
00:35:28 <thinrichs> Been waiting to do a round of testing til this first batch of bug fixes goes in.
00:35:53 <thinrichs> I think that's about it.
00:36:23 <thinrichs> Let's open up for discussion.
00:36:26 <thinrichs> #topic open discussion
00:36:31 <thinrichs> Anyone have anything?
00:36:34 <bryan_att> I have a quick update
00:36:51 <bryan_att> I've been spending most of my time getting the OPNFV Brahmaputra release out the door, and automating the Congress install on it.
00:36:54 <thinrichs> bryan_att: meant to ask if you had resolved that glance problem
00:37:10 <bryan_att> I sent logs about the issue I reported about glance images not showing up in the glance table.
00:37:33 <bryan_att> In my Ravello deploy the same issue did not occur. I will try to repeat it on my NUC deploy using the JOID and Apex OPNFV installers.
Also the list of issues I reported earlier will be re-tested and, if they are still there, sent to the list individually.
00:37:52 <bryan_att> (I have a working install of OPNFV B on Ravello with Congress and my test webapp for it. I'll be using that for demos and testing while I'm at ONS next week.)
00:38:07 <bryan_att> My Congress install script is now totally automated, though still in bash. I'm getting Puppet training so I can implement it in Puppet. That will be the common install tool I will focus on across OPNFV installer projects.
00:38:13 <bryan_att> that's about it
00:38:20 <thinrichs> bryan_att: Lots of progress!
00:38:31 <thinrichs> bryan_att: I saw your email with the logs about glance.
00:38:42 <thinrichs> masahito, ramineni_: did either of you get a chance to look at the logs?
00:38:59 <thinrichs> Nothing stood out to me as pointing toward a solution.
00:39:10 <bryan_att> if it reappears I will send additional logs
00:39:23 <thinrichs> bryan_att: that'd be great!
00:39:32 <masahito> I took a glance at the log.
00:39:56 <masahito> It looks like the client works well.
00:40:13 <bryan_att> yes, just doesn't send any image rows!
00:40:24 <thinrichs> Everything but that. :)
00:40:36 <ramineni_> looks like sync didn't happen yet?
00:41:04 <bryan_att> I tried the test after a few hours and the same result
00:41:14 <thinrichs> That could easily do it. But I thought we would pull the data immediately if we hadn't already pulled it.
00:42:01 <ramineni_> bryan_att: ya, did you see any errors in congress log while polling data?
00:42:19 <bryan_att> no, none that stood out but I will look closer
00:42:41 <ramineni_> oh ok
00:42:58 <bryan_att> The #1 unique thing here is that I am running Congress in an LXC container on the main OpenStack controller node.
00:43:11 <thinrichs> Off the top of my head I don't remember this: do we have the --trace option available for the row list command?
00:43:37 <thinrichs> That would at least print out the search it's trying to do.
00:43:50 <thinrichs> Though maybe for a basic datasource list query that won't help much.
00:44:04 <thinrichs> Oh, but I just remembered we have a writeup on how to debug these kinds of problems…
00:44:18 <bryan_att> I'll try that (--trace) also.
00:44:46 <thinrichs> #link https://congress.readthedocs.org/en/latest/troubleshooting.html#datasource-troubleshooting
00:44:51 <bryan_att> sorry for taking so much time - thanks for the ideas and help!
00:44:52 <thinrichs> That's for the datasource one in particular.
00:45:19 <thinrichs> bryan_att: our pleasure. It's wonderful getting feedback from real users!!
00:45:46 <bryan_att> trying to get there...
00:46:37 <thinrichs> Any other topics for discussion?
00:46:57 <ekcs> thinrichs: Back to the event.target, do you know if they are meant to be strings or objects or either?
00:46:58 <ekcs> I see this comment in agnostic:
00:47:00 <ekcs> def _update_obj(self, events, theory_string):
00:47:01 <ekcs> """Apply events.
00:47:03 <ekcs> Checks if applying EVENTS is permitted and if not
00:47:04 <ekcs> returns a list of errors. If it is permitted, it
00:47:06 <ekcs> applies it and then returns a list of changes.
00:47:07 <ekcs> In both cases, the return is a 2-tuple (if-permitted, list).
00:47:08 <ekcs> Note: All event.target fields are the NAMES of theories, not
00:47:09 <ekcs> theory objects. theory_string is the default theory.
00:47:10 <ekcs> """
00:48:35 <ekcs> My best guess right now is they’re always meant to be strings. The failure occurs because of a simple mistake of trying to use it as an object. But the line is reached so rarely.
00:48:37 <thinrichs> I imagine the comment is correct: that event.target is a string. Though quite possibly later we destructively modify that field and turn it into an object.
00:49:20 <thinrichs> ekcs: would make sense. What I don't understand is why update_obj is called during a create_policy.
00:49:34 <ekcs> reached so rarely because policy creation/deletion doesn’t lead to disabled events normally.
00:49:34 <thinrichs> I wonder if there's a cascading delete: deleting the policy results in deleting all the rules in the policy.
00:49:41 <ekcs> ok that’s helpful. I'll look further.
00:50:00 <thinrichs> Last thoughts?
00:51:14 <ekcs> thinrichs: another quick question.
00:51:50 <thinrichs> ekcs: ok
00:52:19 <ekcs> which code section is responsible for ordering the evaluation of rules so that eg negated literals are only evaluated when ground?
00:52:58 <ekcs> is it in topdown?
00:53:34 <thinrichs> Definitely not topdown. We do it statically.
00:53:49 <thinrichs> I think it's in agnostic.py:Runtime._update_obj_datalog
00:53:58 <thinrichs> You'll see the update_would_cause_errors
00:54:16 <ekcs> hmm ok. thanks.
00:54:36 <thinrichs> Actually… it's the update() call in that function,
00:54:43 <thinrichs> which is implemented in nonrecursive.py,
00:54:56 <thinrichs> which then calls compile.reorder_for_safety
00:54:57 <ekcs> ok.
00:55:57 <thinrichs> Thanks all!
00:56:00 <thinrichs> #endmeeting
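Note on the last question in the log (ordering rule bodies so that negated literals are only evaluated when ground): the general idea of a static reordering pass can be illustrated with the minimal sketch below. This is not the Congress code and does not reproduce compile.reorder_for_safety; the Literal class, the is_variable convention, and the function name reorder_for_safety_sketch are all hypothetical, chosen only for illustration. It assumes that positive literals bind their variables and that a negated literal may be scheduled only once every variable it mentions is already bound.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


def is_variable(arg: str) -> bool:
    # Assumption for this sketch: arguments starting with a letter are
    # variables; quoted strings and numbers are constants.
    return arg[:1].isalpha()


@dataclass(frozen=True)
class Literal:
    """Hypothetical body literal, e.g. p(x, y) or not q(x)."""
    table: str
    args: Tuple[str, ...]
    negated: bool = False

    def variables(self) -> Set[str]:
        return {a for a in self.args if is_variable(a)}


def reorder_for_safety_sketch(body: List[Literal]) -> List[Literal]:
    """Return the body reordered so each negated literal is ground when reached.

    Greedy pass: repeatedly take the first remaining literal that is either
    positive or negated with all of its variables already bound.
    """
    remaining = list(body)
    ordered: List[Literal] = []
    bound: Set[str] = set()
    while remaining:
        for lit in remaining:
            if not lit.negated or lit.variables() <= bound:
                break
        else:
            # Only negated literals with unbound variables are left.
            raise ValueError("unsafe rule: negated literal can never be ground")
        remaining.remove(lit)
        ordered.append(lit)
        if not lit.negated:
            # Only positive literals bind variables.
            bound |= lit.variables()
    return ordered


if __name__ == "__main__":
    body = [
        Literal("q", ("x",), negated=True),
        Literal("p", ("x", "y")),
        Literal("r", ("y",), negated=True),
    ]
    for lit in reorder_for_safety_sketch(body):
        print(("not " if lit.negated else "") + lit.table + str(lit.args))
```

Because the ordering is computed once when a rule is inserted, a top-down evaluator can then walk the body left to right without re-checking groundness, which matches the "we do it statically" remark above. A body consisting only of a negated literal with unbound variables is rejected here with ValueError.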