19:59:54 <redrobot> #startmeeting barbican
19:59:55 <openstack> Meeting started Mon Mar 23 19:59:54 2015 UTC and is due to finish in 60 minutes. The chair is redrobot. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:59:57 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:59:59 <openstack> The meeting name has been set to 'barbican'
20:00:18 <redrobot> #topic Roll Call
20:00:26 <kfarr> o/
20:00:26 <jvrbanac> _o/
20:00:27 <rellerreller> o/
20:00:27 <dave-mccowan> o/
20:00:27 <igueths> o/
20:00:31 <elmiko> o/
20:00:44 <tkelsey> o/
20:00:46 <jaosorior> o/
20:00:57 <chellygel> ヽ( ̄д ̄)ノ
20:01:18 <redrobot> lots of barbicaneers here today
20:01:23 <redrobot> #topic Action Items
20:01:48 <redrobot> first, we had an update to oslo_log in Castellan by jaosorior
20:02:00 <elmiko> yay
20:02:03 <redrobot> which looks like it's already merged
20:02:09 <arunkant> o/
20:02:27 <redrobot> thanks for that jaosorior
20:02:35 <jaosorior> redrobot: no prob
20:02:42 <redrobot> #link http://eavesdrop.openstack.org/meetings/barbican/2015/barbican.2015-03-16-20.01.html
20:02:49 <redrobot> ^^ link for action items, btw
20:03:01 <redrobot> the second item was that I was going to reach out to tsv about quotas
20:03:11 <redrobot> buuuuut I totally dropped the ball on that
20:03:18 <redrobot> in any case
20:03:22 <redrobot> I did bump the Blueprint to Liberty
20:03:36 <redrobot> since we haven't seen tsv in a few weeks.
20:03:49 <redrobot> #action redrobot to reach out to tsv about quota BP
20:04:01 <redrobot> I'll definitely reach out to him this week.
20:04:06 <redrobot> moving on...
20:04:10 <redrobot> #topic Kilo-3
20:04:28 <redrobot> Kilo-3 was released last week
20:04:30 <redrobot> #link https://launchpad.net/barbican/+milestone/kilo-3
20:04:40 <redrobot> thanks everyone for all the contributions to the project!
20:04:54 <redrobot> unfortunately CA stuff isn't quite fully baked, but we'll get it fixed soon
20:05:42 <redrobot> the next milestone is going to be Release Candidate 1
20:05:48 <redrobot> #link https://launchpad.net/barbican/+milestone/kilo-rc1
20:06:14 <redrobot> we're in feature freeze, so no blueprints will be added to the release, only the ones that have FFE status that are already listed there.
20:06:35 <redrobot> FFE = Feature Freeze Exception
20:07:18 <redrobot> any questions about Kilo-3 or RC-1 ?
20:07:37 <arunkant> For per secret policy change which is in RC1 list, I am waiting for code review comments. I have some comments on part 1. Will be helpful to have some comments on part 2/3/4
20:08:12 <redrobot> #info we need more reviews of the blueprints in FFE
20:08:18 <redrobot> ok, moving on
20:08:39 <redrobot> #topic ACL User storage, csv vs table
20:09:08 <redrobot> #link https://review.openstack.org/#/c/164334/
20:09:19 <arunkant> redrobot, added this as one of comment mentioned to get consensus around this.
20:09:48 <redrobot> there were a few comments in the review regarding the data structure used to store the users that are added via the per-user ACL
20:10:10 <redrobot> I was a bit concerned that we're storing all users that were granted access in a comma separated list
20:10:44 <redrobot> my concern was mainly at the limit it imposes by storing every user id in a single column.
20:11:02 <redrobot> by arunkant's calcutaions it would be ~ 1000 user ids
20:11:29 <redrobot> the alternative would be to add a separate table to hold secret_id -> user_id mapping, and join on that table when doing the ACL check
20:11:31 <arunkant> The field type is text field which is similar to mysql CLOB field which can store 64 kilobytes..
20:12:14 <redrobot> yeah, the goal is to try to reach consensus on the best approach for the ACL list storage
20:12:15 <rellerreller> What is the resistance for not using a table? Or is there any?
20:12:43 <redrobot> rellerreller there is some concern by arunkant about the performance implication of having to join tables for every secret access.
20:13:37 <arunkant> Having a separate table requires sql join. And all of the lookups are driven by secret id or container id. This lookup is done as policy enforcement logic so will impact most of barbican requests.
20:14:31 <rellerreller> Will the user's access control be checked on every request?
20:15:05 <rellerreller> What is the cost of SQL join vs split and rebuild strings?
20:15:06 <redrobot> I would think that we would only check the per-user ACL after failing the typical Project check ?
20:15:27 <arunkant> Yes..eventually when ACL operation are checked for write, delete operations...Currently its only for read secret and container
20:16:39 <rellerreller> My vote is for table
20:16:44 <arunkant> Also there is no benefit of having user ids as separate columns..as there are no queries done specifically.
20:17:12 <rellerreller> arunkant what do you mean exactly?
20:17:22 <arunkant> Just to be clear..this will means two tables..one for secret and one for container users.
20:17:53 <redrobot> I think it could be done with one table that maps ResourceID to UserID ?
20:18:30 <arunkant> I meant lookup is always done for secret id or container id and then related acl record is looked. There are no lookup done only by users.
20:19:42 <rellerreller> Do you mean there is no query that asks does this user have permission to this secret?
20:20:47 <arunkant> Its always looked up by secret id first..not directly give me all acls for this users?
20:22:44 <arunkant> I means queries are always looked up by secret and container id and then narrowed down to what ACL operation and the users are there for that secret or container
20:23:50 <redrobot> I'm not sure I understand what the lookup argument against a separate table is?
20:24:04 <redrobot> sure we can craft a query that will produce the data we need
20:24:56 <redrobot> anyone else care to weigh in?
20:24:56 <rellerreller> I think one of the arguments for the table is that you can do other queries easily, like find out which secrets a user has access to based upon the ACL.
20:24:58 <redrobot> bueller?
20:25:00 <jaosorior> I think arunkant's argument against the table-based solution is the performance
20:25:16 <rellerreller> I'm not sure that really matters to me.
20:25:49 <alee> sorry -- got here late -- whats the issue?
20:25:57 <jaosorior> rellerreller: depends on the usage for barbican. If we are going to get a ton of requests, and have to do joins every single time, then it might get painful. But I'm not really sure how much it really affects
20:26:14 <arunkant> Yes..you can but in this case, it may not be that useful as lookup criteria has secret id. Anyway..the downside is performance as it will impact all of barbican operations
20:26:29 <redrobot> alee for per-user-acl, data storage, csv list of IDs, vs a join table where we map resource to users
20:27:12 <rellerreller> I'm not convinced that join will have that much impact. If it does then we should examine other areas of DB.
20:27:20 <alee> redrobot, arunkant and we think a join table will be more performant?
20:28:03 <redrobot> alee, rellerreller and I would prefer a table. It does not impose an arbitrary limit to the # of ids (the width of the column), and it can be useful in other operations
20:28:07 <rellerreller> jcoffman has phd in DB. We can get him online today or tomorrow and ask him.
20:28:14 <redrobot> alee, arunkant is concerned about the performance implications
20:28:47 <jvrbanac> +1 about getting a db expert's opinion
20:28:50 <arunkant> Its and additional call to get users from a separate table instead of reading it from one column within ACL table
20:29:07 <arunkant> s/and additiona/an additional
20:29:14 <jaosorior> rellerreller: +1
20:29:18 <redrobot> arunkant yes, this is true, but it would only happen when the user's project fails to grant access
20:30:05 <redrobot> #action rellerreller to check with jcoffman about per-user-acl performance concern
20:30:15 <arunkant> No...it will happen as part of getting data ready for policy enforcement..so the needed queries will be done and then passed to policy logic
20:30:16 <jaosorior> redrobot: I thought the ACL would have presedence over the project
20:30:31 <alee> redrobot, arunkant - ok I dont really have a preference. when the spec was written, it was written from the point of view of getting something that made sense. I do worry about optimizing before we have any real knowledge that the performance implications
20:30:46 <alee> and yeah - acl takes precedences over project.
20:30:59 <rellerreller> I just called him.
20:31:07 <rellerreller> He will online in a few seconds.
20:31:26 <redrobot> jaosorior it does from a functional pov, in the impl, I think we would want to check the project-level access first, since this is the normal workflow for all of openstack, and only if the project-level access fails, then we would query the user-level acl
20:32:14 <rellerreller> joel-coffman There is an outstanding question on database. It involves CSV in a column vs having a mapping table.
20:32:41 <joel-coffman> okay
20:32:54 <rellerreller> Basically we are mapping secrets to users for access control. Proposal 1) Have a column of CSV (i.e. "user1,user2,user3"
20:32:57 <joel-coffman> sorry to be joining late
20:32:58 <alee> redrobot, no - thats done through the policy enforcement. we read any acls if they exist and pass to the policy layer. the policy layer specifies access if project or acl.
20:33:04 <arunkant> redrobot, data needed for policy enforcement is queried first and then passed to enforcement logic. So queries gathering that are going to execute first
20:33:12 <rellerreller> Proposal 2) Create a new tables with map from secret ID to user ID
20:33:24 <alee> redrobot, we're not trying to access policy layer twice.
20:34:23 <rellerreller> joel-coffman Those are the two options. CSV vs Table. What do you think?
20:34:55 <joel-coffman> A separate table is definitely the preferred approach from a database consistency and normalization perspective
20:35:17 <redrobot> alee arunkant I stand corrected.
20:35:21 <arunkant> There is no query which are just going to query users table...and this will only have two columns (secret id , userid)
20:35:57 <redrobot> arunkant but we could either modify the query, or add a new query?
20:36:02 <arunkant> But it seems like case of too much normalization as it does not have any other data..
20:36:05 <redrobot> joel-coffman +1 thanks
20:36:05 <joel-coffman> CSV would almost certainly entail more implementation effort and introduces the possibility of update anomalies, etc. that databases are designed to prevent
20:36:12 <jaosorior> joel-coffman: Do you think it will affect performance in a meaninful manner?
20:36:58 <joel-coffman> The amount of data shouldn't be an issue...
20:37:17 <joel-coffman> A separate table is the typical means of expressing a many-to-many relationship
20:37:51 <jaosorior> joel-coffman: alright. +1 for the separate table on my side
20:37:54 <joel-coffman> jaosorior: can you give me an example query or operation that requires the information
20:38:00 <arunkant> So here its going to be one-to-many relationship. And we will need separate table for container as well
20:38:51 <alee> joel-coffman, just to be clear - what we have here is a table of acls -- each entry contains secret_id/container_id, operation and a list of users.
20:39:12 <alee> we're talking about replacing the list of users with references to another table
20:39:39 <alee> so I'm guessing the other table will have something like "acl.id" and userid
20:39:58 <alee> and we'd do a query to get all the userids associated with an acl.
20:40:43 <alee> seems like there are more db queries / db work for an extra table
20:40:47 <joel-coffman> alee: what will you do with the user ids?
20:40:56 <redrobot> compare it to an incoming id
20:40:57 <joel-coffman> will you also need any user information?
20:41:03 <alee> they are passed to the policy enforcement layer
20:41:03 <joel-coffman> okay
20:41:04 <redrobot> so the query could be smarter with a table...
20:41:06 <alee> no
20:41:36 <joel-coffman> so a separate table should be much faster
20:41:40 * redrobot is starting to think he needs to revisit all the policy stuff
20:41:50 <joel-coffman> index both the ACL id and the user id
20:42:06 <arunkant> joel-coffman, No. It will be queried by secret_id and then we get linked users id based on acl.id
20:42:55 <joel-coffman> sorry, my mistake
20:43:01 <alee> arunkant, right - I think he meant that the second table will be linked by acl.id?
20:43:11 <joel-coffman> index *secret id* and user id
20:43:29 <arunkant> joel-coffman, there are no other user attributes. just user ids.
20:43:45 <alee> either way - if its going to be much faster to have a separate table - then +1 for me
20:43:48 <joel-coffman> database can optimize the lookup based on the indexes => zero implementation effort
20:44:42 <joel-coffman> parsing the CSV for comparison is linear in the number of CSV entries :-(
20:45:45 <joel-coffman> plus you also get some other advantages like revoking all ACLs for a particular user
20:46:01 <arunkant> Its an additional lookup..so not sure if it can be faster. We just need to think that this is additional query done for barbican requests..
20:46:10 <rellerreller> I like that additional feature
20:46:11 <joel-coffman> which only requires lookups based on the user id index instead of reading every ACL entry
20:47:15 <rellerreller> joel-coffman arunkant is talking about using ORM to get the secret object.
20:47:37 <rellerreller> We call get secret and that has eager fetch to retrieve all of its properties, including the acl.
20:48:04 <arunkant> joel-coffman: yes..that will be benefit if there is use case like that.
20:48:08 <rellerreller> I think he is talking about that instead of issuing a DB query to say, "Does this user have access to this secret."
20:48:53 <rellerreller> I think revocation is a plausible use case. I think being notified of a bad user and revoking everything sounds like a good use case.
20:49:59 <redrobot> rellerreller +1 I think in the long run a separate table will be much more useful
20:50:01 <joel-coffman> arunkant: the overhead of a single query shouldn't be significant
20:50:47 <joel-coffman> (would be interested to know if that's *not* the case, particularly as the database scales)
20:51:37 <joel-coffman> redrobot: +1 the risk of a few milliseconds for an additional query is usually well-worth the price given the advantages
20:52:06 <redrobot> we're running out of time for the day
20:52:23 <redrobot> but I think most of us are leaning towards having a separate table
20:52:35 <arunkant> Okay. So if consensus is too have another table I will add that. So there will be 2 new tables.
20:52:44 <arunkant> s/too/to
20:53:57 <redrobot> #agreed we'll use a separate table to track individual user_ids for per-user-policy
20:54:31 <redrobot> arunkant I don't think there's a benefit to combining into a single table? Two tables should be fine.
20:54:52 * redrobot debates wether we can answer Castellan questions in 5 minutes
20:55:08 <redrobot> #topic Castellan Initial Release
20:55:27 <arunkant> redrobot, yes..so one table for container and second one for secret users per operation.
20:55:40 <redrobot> I have two outstanding patches that I would like to see make it into the initial release
20:55:42 <redrobot> #link https://review.openstack.org/#/q/status:open+project:openstack/castellan,n,z
20:55:45 <redrobot> kfarr
20:56:04 <kfarr> Thanks redrobot
20:56:05 <redrobot> to answer your question, the Castellan project has its own Launchpad
20:56:21 <redrobot> #link
20:56:30 <redrobot> #link https://launchpad.net/castellan
20:57:04 <kfarr> redrobot, what about the bug tracker? https://bugs.launchpad.net/castellan
20:57:29 <redrobot> kfarr sorry about that. seems it was not turned on
20:57:32 <redrobot> kfarr should be there now.
20:57:52 <kfarr> Oh, that was fast!
20:58:13 <redrobot> kfarr trying to squeeze the topic into 5 min ;)
20:58:32 <elmiko> is there a doc CR in the making for castellan?
20:58:49 <redrobot> elmiko not that I'm aware of
20:59:04 <redrobot> elmiko but keep your radar on, since I'll probably be writing a lot of it for the upcoming integration talk in Vancouver
20:59:12 <elmiko> awesome, thanks
20:59:20 <elmiko> i'm really eager to start using it
20:59:45 <redrobot> ok, we're almost out of time. arunkant I'll get to the usage topic next week, or you can ping me in the main channel
20:59:49 <redrobot> #endmeeting
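
The "separate table" option agreed above could look roughly like the sketch below. It is not Barbican's actual schema or code: the table name secret_acl_users, the index name, and all function names are hypothetical, and SQLite stands in for the real database only so the example runs on its own. It shows the indexed per-user check and the "revoke everything for one user" query raised in the discussion, next to the linear CSV scan it would replace.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical secret -> user mapping table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE secret_acl_users (
            secret_id TEXT NOT NULL,
            user_id   TEXT NOT NULL,
            PRIMARY KEY (secret_id, user_id)   -- serves lookups by secret id
        );
        -- Second index so "all rows for this user" avoids scanning every ACL row.
        CREATE INDEX ix_secret_acl_users_user ON secret_acl_users (user_id);
        """
    )
    return conn


def grant(conn, secret_id, user_id):
    """Add a user to a secret's ACL; granting twice is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO secret_acl_users (secret_id, user_id) VALUES (?, ?)",
        (secret_id, user_id),
    )


def has_access(conn, secret_id, user_id):
    """Answer 'does this user have access to this secret' with one indexed query."""
    row = conn.execute(
        "SELECT 1 FROM secret_acl_users WHERE secret_id = ? AND user_id = ?",
        (secret_id, user_id),
    ).fetchone()
    return row is not None


def revoke_user(conn, user_id):
    """Remove a user from every ACL and return how many grants were removed."""
    cur = conn.execute("DELETE FROM secret_acl_users WHERE user_id = ?", (user_id,))
    return cur.rowcount


def csv_has_access(csv_users, user_id):
    """CSV-column alternative: split the stored string and scan it linearly."""
    return user_id in csv_users.split(",")
```

With the CSV column, revoking one user everywhere would mean reading and rewriting every ACL row; with the mapping table it is the single DELETE in revoke_user.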