#openstack-meeting-alt log

19:59:54 <redrobot> #startmeeting barbican
19:59:55 <openstack> Meeting started Mon Mar 23 19:59:54 2015 UTC and is due to finish in 60 minutes.  The chair is redrobot. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:59:57 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:59:59 <openstack> The meeting name has been set to 'barbican'
20:00:18 <redrobot> #topic Roll Call
20:00:26 <kfarr> o/
20:00:26 <jvrbanac> _o/
20:00:27 <rellerreller> o/
20:00:27 <dave-mccowan> o/
20:00:27 <igueths> o/
20:00:31 <elmiko> o/
20:00:44 <tkelsey> o/
20:00:46 <jaosorior> o/
20:00:57 <chellygel> ヽ(￣д￣)ノ
20:01:18 <redrobot> lots of barbicaneers here today
20:01:23 <redrobot> #topic Action Items
20:01:48 <redrobot> first, we had an update to oslo_log in Castellan by jaosorior
20:02:00 <elmiko> yay
20:02:03 <redrobot> which looks like it's already merged
20:02:09 <arunkant> o/
20:02:27 <redrobot> thanks for that jaosorior
20:02:35 <jaosorior> redrobot: no prob
20:02:42 <redrobot> #link http://eavesdrop.openstack.org/meetings/barbican/2015/barbican.2015-03-16-20.01.html
20:02:49 <redrobot> ^^ link for action items, btw
20:03:01 <redrobot> the second item was that I was going to reach out to tsv about quotas
20:03:11 <redrobot> buuuuut I totally dropped the ball on that
20:03:18 <redrobot> in any case
20:03:22 <redrobot> I did bump the Blueprint to Liberty
20:03:36 <redrobot> since we haven't seen tsv in a few weeks.
20:03:49 <redrobot> #action redrobot to reach out to tsv about quota BP
20:04:01 <redrobot> I'll definitely reach out to him this week.
20:04:06 <redrobot> moving on...
20:04:10 <redrobot> #topic Kilo-3
20:04:28 <redrobot> Kilo-3 was released last week
20:04:30 <redrobot> #link https://launchpad.net/barbican/+milestone/kilo-3
20:04:40 <redrobot> thanks everyone for all the contributions to the project!
20:04:54 <redrobot> unfortunately CA stuff isn't quite fully baked, but we'll get it fixed soon
20:05:42 <redrobot> the next milestone is going to be Release Candidate 1
20:05:48 <redrobot> #link https://launchpad.net/barbican/+milestone/kilo-rc1
20:06:14 <redrobot> we're in feature freeze, so no blueprints will be added to the release, only the ones that have FFE status that are already listed there.
20:06:35 <redrobot> FFE = Feature Freeze Exception
20:07:18 <redrobot> any questions about Kilo-3 or RC-1 ?
20:07:37 <arunkant> For per secret policy change which is in RC1 list, I am waiting for code review comments. I have some comments on part 1. Will be helpful to have some comments on part 2/3/4
20:08:12 <redrobot> #info we need more reviews of the blueprints in FFE
20:08:18 <redrobot> ok, moving on
20:08:39 <redrobot> #topic ACL User storage, csv vs table
20:09:08 <redrobot> #link https://review.openstack.org/#/c/164334/
20:09:19 <arunkant> redrobot, added this as one of comment mentioned to get consensus around this.
20:09:48 <redrobot> there were a few comments in the review regarding the data structure used to store the users that are added via the per-user ACL
20:10:10 <redrobot> I was a bit concerned that we're storing all users that were granted access in a comma separated list
20:10:44 <redrobot> my concern was mainly at the limit it imposes by storing every user id in a single column.
20:11:02 <redrobot> by arunkant's calculations it would be ~ 1000 user ids
20:11:29 <redrobot> the alternative would be to add a separate table to hold secret_id -> user_id mapping, and join on that table when doing the ACL check
20:11:31 <arunkant> The field type is text field which is similar to mysql CLOB field which can store 64 kilobytes..
20:12:14 <redrobot> yeah, the goal is to try to reach consensus on the best approach for the ACL list storage
20:12:15 <rellerreller> What is the resistance to using a table? Or is there any?
20:12:43 <redrobot> rellerreller there is some concern by arunkant  about the performance implication of having to join tables for every secret access.
20:13:37 <arunkant> Having a separate table requires sql join. And all of the lookups are driven by secret id or container id. This lookup is done as policy enforcement logic so will impact most of barbican requests.
20:14:31 <rellerreller> Will the user's access control be checked on every request?
20:15:05 <rellerreller> What is the cost of SQL join vs split and rebuild strings?
20:15:06 <redrobot> I would think that we would only check the per-user ACL after failing the typical Project check ?
20:15:27 <arunkant> Yes..eventually when ACL operation are checked for write, delete operations...Currently its only for read secret and container
20:16:39 <rellerreller> My vote is for table
20:16:44 <arunkant> Also there is no benefit of having user ids as separate columns..as there are no queries done specifically.
20:17:12 <rellerreller> arunkant what do you mean exactly?
20:17:22 <arunkant> Just to be clear..this will mean two tables..one for secret and one for container users.
20:17:53 <redrobot> I think it could be done with one table that maps ResourceID to UserID ?
20:18:30 <arunkant> I meant lookup is always done for secret id or container id and then related acl record is looked. There are no lookup done only by users.
20:19:42 <rellerreller> Do you mean there is no query that asks does this user have permission to this secret?
20:20:47 <arunkant> Its always looked up by secret id first..not directly give me all acls for this users?
20:22:44 <arunkant> I mean queries are always looked up by secret and container id and then narrowed down to what ACL operation and the users are there for that secret or container
20:23:50 <redrobot> I'm not sure I understand what the lookup argument against a separate table is?
20:24:04 <redrobot> sure we can craft a query that will produce the data we need
20:24:56 <redrobot> anyone else care to weigh in?
20:24:56 <rellerreller> I think one of the arguments for the table is that you can do other queries easily, like find out which secrets a user has access to based upon the ACL.
20:24:58 <redrobot> bueller?
20:25:00 <jaosorior> I think arunkant's argument against the table-based solution is the performance
20:25:16 <rellerreller> I'm not sure that really matters to me.
20:25:49 <alee> sorry -- got here late -- whats the issue?
20:25:57 <jaosorior> rellerreller: depends on the usage for barbican. If we are going to get a ton of requests, and have to do joins every single time, then it might get painful. But I'm not really sure how much it really affects
20:26:14 <arunkant> Yes..you can but in this case, it may not be that useful as lookup criteria has secret id. Anyway..the downside is performance as it will impact all of barbican operations
20:26:29 <redrobot> alee for per-user-acl, data storage, csv list of IDs, vs a join table where we map resource to users
20:27:12 <rellerreller> I'm not convinced that join will have that much impact. If it does then we should examine other areas of DB.
20:27:20 <alee> redrobot, arunkant and we think a join table will be more performant?
20:28:03 <redrobot> alee, rellerreller and I would prefer a table.  It does not impose an arbitrary limit to the # of ids (the width of the column), and it can be useful in other operations
20:28:07 <rellerreller> jcoffman has phd in DB. We can get him online today or tomorrow and ask him.
20:28:14 <redrobot> alee, arunkant is concerned about the performance implications
20:28:47 <jvrbanac> +1 about getting a db expert's opinion
20:28:50 <arunkant> Its and additional call to get users from a separate table instead of reading it from one column within ACL table
20:29:07 <arunkant> s/and additiona/an additional
20:29:14 <jaosorior> rellerreller: +1
20:29:18 <redrobot> arunkant yes, this is true, but it would only happen when the user's project fails to grant access
20:30:05 <redrobot> #action rellerreller to check with jcoffman about per-user-acl performance concern
20:30:15 <arunkant> No...it will happen as part of getting data ready for policy enforcement..so the needed queries will be done and then passed to policy logic
20:30:16 <jaosorior> redrobot: I thought the ACL would have precedence over the project
20:30:31 <alee> redrobot, arunkant - ok I dont really have a preference.  when the spec was written, it was written from the point of view of getting something that made sense.  I do worry about optimizing before we have any real knowledge of the performance implications
20:30:46 <alee> and yeah - acl takes precedence over project.
20:30:59 <rellerreller> I just called him.
20:31:07 <rellerreller> He will be online in a few seconds.
20:31:26 <redrobot> jaosorior it does from a functional pov, in the impl, I think we would want to check the project-level access first, since this is the normal workflow for all of openstack, and only if the project-level access fails, then we would query the user-level acl
20:32:14 <rellerreller> joel-coffman There is an outstanding question on database. It involves CSV in a column vs having a mapping table.
20:32:41 <joel-coffman> okay
20:32:54 <rellerreller> Basically we are mapping secrets to users for access control. Proposal 1) Have a column of CSV (i.e. "user1,user2,user3")
20:32:57 <joel-coffman> sorry to be joining late
20:32:58 <alee> redrobot, no - thats done through the policy enforcement.  we read any acls if they exist and pass to the policy layer.  the policy layer specifies access if project or acl.
20:33:04 <arunkant> redrobot, data needed for policy enforcement is queried first and then passed to enforcement logic. So queries gathering that are going to execute first
20:33:12 <rellerreller> Proposal 2) Create a new table with a map from secret ID to user ID
20:33:24 <alee> redrobot, we're not trying to access policy layer twice.
20:34:23 <rellerreller> joel-coffman Those are the two options. CSV vs Table. What do you think?
20:34:55 <joel-coffman> A separate table is definitely the preferred approach from a database consistency and normalization perspective
20:35:17 <redrobot> alee arunkant I stand corrected.
20:35:21 <arunkant> There is no query which is just going to query the users table...and this will only have two columns (secret id , userid)
20:35:57 <redrobot> arunkant but we could either modify the query, or add a new query?
20:36:02 <arunkant> But it seems like case of too much normalization as it does not have any other data..
20:36:05 <redrobot> joel-coffman  +1 thanks
20:36:05 <joel-coffman> CSV would almost certainly entail more implementation effort and introduces the possibility of update anomalies, etc. that databases are designed to prevent
20:36:12 <jaosorior> joel-coffman: Do you think it will affect performance in a meaningful manner?
20:36:58 <joel-coffman> The amount of data shouldn't be an issue...
20:37:17 <joel-coffman> A separate table is the typical means of expressing a many-to-many relationship
20:37:51 <jaosorior> joel-coffman: alright. +1 for the separate table on my side
20:37:54 <joel-coffman> jaosorior: can you give me an example query or operation that requires the information
20:38:00 <arunkant> So here its going to be one-to-many relationship. And we will need separate table for container as well
20:38:51 <alee> joel-coffman, just to be clear - what we have here is a table of acls -- each entry contains secret_id/container_id, operation and a list of users.
20:39:12 <alee> we're talking about replacing the list of users with references to another table
20:39:39 <alee> so I'm guessing the other table will have something like "acl.id" and userid
20:39:58 <alee> and we'd do a query to get all the userids associated with an acl.
20:40:43 <alee> seems like there are more db queries / db work for an extra table
20:40:47 <joel-coffman> alee: what will you do with the user ids?
20:40:56 <redrobot> compare it to an incoming id
20:40:57 <joel-coffman> will you also need any user information?
20:41:03 <alee> they are passed to the policy enforcement layer
20:41:03 <joel-coffman> okay
20:41:04 <redrobot> so the query could be smarter with a table...
20:41:06 <alee> no
20:41:36 <joel-coffman> so a separate table should be much faster
20:41:40 * redrobot is starting to think he needs to revisit all the policy stuff
20:41:50 <joel-coffman> index both the ACL id and the user id
20:42:06 <arunkant> joel-coffman, No. It will be queried by secret_id and then we get linked users id based on acl.id
20:42:55 <joel-coffman> sorry, my mistake
20:43:01 <alee> arunkant, right - I think he meant that the second table will be linked by acl.id?
20:43:11 <joel-coffman> index *secret id* and user id
20:43:29 <arunkant> joel-coffman, there are no other user attributes. just user ids.
20:43:45 <alee> either way - if its going to be much faster to have a separate table - then +1 for me
20:43:48 <joel-coffman> database can optimize the lookup based on the indexes => zero implementation effort
20:44:42 <joel-coffman> parsing the CSV for comparison is linear in the number of CSV entries :-(
20:45:45 <joel-coffman> plus you also get some other advantages like revoking all ACLs for a particular user
20:46:01 <arunkant> Its an additional lookup..so not sure if it can be faster. We just need to think that this is additional query done for barbican requests..
20:46:10 <rellerreller> I like that additional feature
20:46:11 <joel-coffman> which only requires lookups based on the user id index instead of reading every ACL entry
20:47:15 <rellerreller> joel-coffman arunkant is talking about using ORM to get the secret object.
20:47:37 <rellerreller> We call get secret and that has eager fetch to retrieve all of its properties, including the acl.
20:48:04 <arunkant> joel-coffman: yes..that will be benefit if there is use case like that.
20:48:08 <rellerreller> I think he is talking about that instead of issuing a DB query to say, "Does this user have access to this secret."
20:48:53 <rellerreller> I think revocation is a plausible use case. I think being notified of a bad user and revoking everything sounds like a good use case.
20:49:59 <redrobot> rellerreller +1  I think in the long run a separate table will be much more useful
20:50:01 <joel-coffman> arunkant: the overhead of a single query shouldn't be significant
20:50:47 <joel-coffman> (would be interested to know if that's *not* the case, particularly as the database scales)
20:51:37 <joel-coffman> redrobot: +1 the risk of a few milliseconds for an additional query is usually well-worth the price given the advantages
20:52:06 <redrobot> we're running out of time for the day
20:52:23 <redrobot> but I think most of us are leaning towards having a separate table
20:52:35 <arunkant> Okay. So if consensus is too have another table I will add that. So there will be 2 new tables.
20:52:44 <arunkant> s/too/to
20:53:57 <redrobot> #agreed we'll use a separate table to track individual user_ids for per-user-policy
20:54:31 <redrobot> arunkant I don't think there's a benefit to combining into a single table?  Two tables should be fine.
20:54:52 * redrobot debates whether we can answer Castellan questions in 5 minutes
20:55:08 <redrobot> #topic Castellan Initial Release
20:55:27 <arunkant> redrobot, yes..so one table for container and second one for secret users per operation.
20:55:40 <redrobot> I have two outstanding patches that I would like to see make it into the initial release
20:55:42 <redrobot> #link https://review.openstack.org/#/q/status:open+project:openstack/castellan,n,z
20:55:45 <redrobot> kfarr
20:56:04 <kfarr> Thanks redrobot
20:56:05 <redrobot> to answer your question, the Castellan project has its own Launchpad
20:56:21 <redrobot> #link
20:56:30 <redrobot> #link https://launchpad.net/castellan
20:57:04 <kfarr> redrobot, what about the bug tracker? https://bugs.launchpad.net/castellan
20:57:29 <redrobot> kfarr sorry about that. seems it was not turned on
20:57:32 <redrobot> kfarr should be there now.
20:57:52 <kfarr> Oh, that was fast!
20:58:13 <redrobot> kfarr trying to squeeze the topic into 5 min ;)
20:58:32 <elmiko> is there a doc CR in the making for castellan?
20:58:49 <redrobot> elmiko not that I'm aware of
20:59:04 <redrobot> elmiko but keep your radar on, since I'll probably be writing a lot of it for the upcoming integration talk in Vancouver
20:59:12 <elmiko> awesome, thanks
20:59:20 <elmiko> i'm really eager to start using it
20:59:45 <redrobot> ok, we're almost out of time.  arunkant I'll get to the usage topic next week, or you can ping me in the main channel
20:59:49 <redrobot> #endmeeting
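
Editor's note: the two ACL storage options debated above (a comma-separated column vs. a separate mapping table) can be illustrated with a small sketch. This is not Barbican's schema or code. The table names, column names, and helper functions below are hypothetical, and the example uses Python's built-in sqlite3 module instead of Barbican's actual database layer, purely so that it runs on its own. It only shows the shape of the two lookups and the "revoke everything for one user" operation mentioned in the discussion.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding both hypothetical layouts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Proposal 1: all user ids in one comma-separated text column.
        CREATE TABLE secret_acl_csv (
            secret_id TEXT PRIMARY KEY,
            users     TEXT NOT NULL
        );
        -- Proposal 2: one row per (secret, user) pair.
        CREATE TABLE secret_acl_users (
            secret_id TEXT NOT NULL,
            user_id   TEXT NOT NULL,
            PRIMARY KEY (secret_id, user_id)
        );
        -- Extra index so lookups by user alone do not scan the table.
        CREATE INDEX ix_secret_acl_users_user ON secret_acl_users (user_id);
        """
    )
    return conn


def grant(conn, secret_id, user_ids):
    """Record the same grant in both layouts (call once per secret)."""
    conn.execute(
        "INSERT INTO secret_acl_csv VALUES (?, ?)",
        (secret_id, ",".join(user_ids)),
    )
    conn.executemany(
        "INSERT INTO secret_acl_users VALUES (?, ?)",
        [(secret_id, user_id) for user_id in user_ids],
    )


def csv_has_access(conn, secret_id, user_id):
    """Proposal 1: fetch the column, split it, and scan the list in Python."""
    row = conn.execute(
        "SELECT users FROM secret_acl_csv WHERE secret_id = ?", (secret_id,)
    ).fetchone()
    return row is not None and user_id in row[0].split(",")


def table_has_access(conn, secret_id, user_id):
    """Proposal 2: let the database answer via the (secret_id, user_id) key."""
    row = conn.execute(
        "SELECT 1 FROM secret_acl_users WHERE secret_id = ? AND user_id = ?",
        (secret_id, user_id),
    ).fetchone()
    return row is not None


def revoke_user(conn, user_id):
    """Remove every grant for one user; returns the number of rows deleted."""
    cur = conn.execute(
        "DELETE FROM secret_acl_users WHERE user_id = ?", (user_id,)
    )
    return cur.rowcount
```

With the mapping table, revoking a user is a single indexed DELETE. With the CSV column, the equivalent operation would have to read and rewrite every ACL row that might mention the user, which is the "reading every ACL entry" cost raised in the meeting.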