18:00:02 #startmeeting keystone
18:00:03 Meeting started Tue Dec 19 18:00:02 2017 UTC and is due to finish in 60 minutes. The chair is lbragstad. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:05 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:07 The meeting name has been set to 'keystone'
18:00:07 #link https://etherpad.openstack.org/p/keystone-weekly-meeting
18:00:09 o/
18:00:09 agenda ^
18:00:11 o/
18:00:15 ping ayoung, breton, cmurphy, dstanek, edmondsw, gagehugo, henrynash, hrybacki, knikolla, lamt, lbragstad, lwanderley, kmalloc, rderose, rodrigods, samueldmq, spilla, aselius, dpar, jdennis, ruan_he
18:00:20 davidalles: ^
18:00:21 o/
18:00:24 o/
18:00:26 o/
18:00:27 o/
18:00:32 hello
18:00:36 we have all sorts of stuff to talk about today
18:00:42 hi
18:01:04 just a quick couple of announcements, gagehugo is gonna give a quick vmt update, then we'll get into the specs
18:01:36 we'll give folks a couple minutes to join
18:03:14 alright - let's get started
18:03:28 #topic announcements: feature proposal freeze
18:03:32 #info Feature proposal freeze for Queens is this week
18:03:45 for what we've accepted for this release, i don't think anything is in jeopardy
18:04:03 unified limits, application credentials, project tags, and system scope all have implementations that are up and ready for review
18:04:26 which means we will have a busy milestone for reviews, but glad to see things proposed
18:04:39 in addition to that
18:04:40 #info All targeted/expected specifications have merged
18:04:46 good job to all the folks doing hard work to make that happen
18:04:53 +2
18:05:10 yay
18:05:29 so - as far as what we discussed at the PTG and the Forum/Summit, we're consistent in our roadmap and our specification repository
18:05:34 which is a really good thing
18:05:53 especially now that unified limits merged
18:05:54 #link https://review.openstack.org/#/c/455709/
18:06:11 thanks cmurphy kmalloc wxy for pushing that through and capturing the details we got from sdake
18:06:16 sdague*
18:06:21 (sorry sdake!)
18:06:32 lbragstad it happens
18:06:39 (all the time:)
18:06:48 lol
18:07:03 :)
18:07:05 woot
18:07:46 tons of things are up for review for those looking to review some code
18:07:55 new features too that need to be played with
18:08:21 and without a doubt, some client stuff that still needs to be done based on the new feature work
18:08:34 (unified limits, application credentials, for sure)
18:08:58 there is no shortage of work, that's for sure :)
18:09:21 thanks again to folks for shaping up the release, things are looking good
18:09:31 #topic VMT coverage
18:09:33 gagehugo: o/
18:09:46 o/
18:10:13 so fungi and lhinds have looked over the vmt doc for keystonemiddleware and documented their findings here
18:10:23 #link https://etherpad.openstack.org/p/keystonemiddleware-ta
18:10:44 so now I think the idea is that the keystone team and them meet to finalize it
18:11:00 sounds like we need to review their findings
18:11:05 yeah
18:11:14 s/finalize/review and finalize/
18:11:33 we can set up a meeting after the holidays if that works
18:11:51 that sounds good to me
18:11:53 also there are drafts for keystoneauth and oslo.cache
18:12:08 #link https://review.openstack.org/#/q/topic:keystone-vmt
18:12:11 gagehugo: thanks! I appreciate you working so hard on this
18:12:26 kmalloc :) anytime
18:12:34 it's been an interesting process
18:12:48 that's all I got for that update
18:12:49 ++
18:12:54 cool
18:12:57 thanks gagehugo
18:13:12 #topic specification review and discussion
18:13:21 hrybacki: ruan__he davidalles_ o/
18:13:24 o/
18:13:28 #link https://review.openstack.org/#/c/323499/
18:13:34 here we are
18:13:49 The enterprise objective in telecom Cloud is to be able to Audit and Remediate the list of users and their roles on multiple openstack deployments inside one country, via dedicated processes run outside the openstack IaaS
18:13:54 uhm.
18:13:58 As a consequence we want to implement the Identity and Access Management features inside the IaaS regions so that local keystones are the source of truth on a 'per-openstack deployment' basis
18:14:01 my -2 disappeared from a patch.
18:14:12 hold on i need to go chat w/ infra. brb
18:14:12 kmalloc: doh!
18:14:20 still there kmalloc
18:14:24 currently what is missing is the proper way to synchronize the reference for projects
18:14:45 on the various openstack deployments
18:15:00 this is essentially an attempt to federate diverse clouds running different versions of openstack
18:15:01 nvm - terrible gerrit UI, old patches don't show scores.
18:15:12 ==> synchro is run from one external tool
18:15:22 which will use the standard keystone APIs
18:15:46 except for the keystone/createProject call, where we would like to 'force' the project's ID:
18:16:02 same openstack ID on all deployments
18:16:09 when we talk about an external tool, it's a module which coordinates multiple openstack deployments
18:16:21 correct
18:16:24 * knikolla sorry for being late
18:16:45 if we have this feature, the module will be able to create projects on different openstacks with the same id
18:17:02 this is what we call "synchronization"
18:17:27 shouldn't this be an upstream project to work out all the details of federated disjoint clouds?
18:17:48 it could be, I haven't presented this work to LOCC yet
18:18:10 but, don't we need to do that step by step?
18:18:25 davidalles: ruan__he when we talked, it sounded like you had some restrictions you had to adhere to?
18:19:16 my request to the keystone team is to start step-by-step: allowing to synchronize the project's ID across sites
18:19:40 (then we may open a refactoring for this external tool for IAM & ACM)
18:19:41 i am against the standard APIs being used for that.
18:19:42 ftr
18:19:51 this should be a system-scoped/admin only option IMO
18:20:01 kmalloc, this feature existed before
18:20:07 not in v3
18:20:16 it has never existed in v3
18:20:36 sounds like before noone fully expressed a need to 'port it' to v3?
18:20:52 sounds like no one fully..
18:21:00 no, we explicitly decided that we didn't want it in v3, it was brought up
18:21:09 and it was decided that v3 100% controls all ids.
18:21:23 ah, it is different!
18:21:39 for what reason do you want 100% control?
18:22:12 for a lot of reasons, mostly centering around the fact that we had issues with guaranteeing no conflicts for generated/derived resources
18:22:18 aka, mapping ending for users.
18:23:15 [I understand that CERN also has the same request because of the same requirement]: How do you identify an alternative approach
18:23:23 it prevents malicious use / issues creating a project that could then generate for mapped users a known set of IDs from a different IDP or to surreptitiously own a domain/project prior to the IDP being linked
18:23:27 to comply with the requirement?
18:23:33 if it is only an option, it will not change the standard usage
18:23:35 I believe that the revocation events + role assignment controls will become way more difficult to deal with if we enable that change, so I believe we should specify in that spec how we are gonna deal with that kind of sync events in that scenario
18:23:52 so, if you want this, I am going to tell you it cannot be in the standard APIs.
18:24:06 it must be done as a privileged (cloud admin/system scope)
18:24:26 it means a cloud admin/system scope is needed to do this work, not a "domain admin"
18:24:33 the issue is around domain-admins who can create projects.
18:24:39 even so - we should still think about the ramifications of the revocation argument
18:24:43 and similar effects.
18:24:52 why does this need to be part of the API instead of synchronizing the project table?
18:25:14 Agreed: let's enforce this via 'system scoping' only because this is a Cloud Service Provider privilege
18:25:25 cmurphy: because they are entirely different clouds
18:25:35 why not use k2k with a mapping which guarantees project -> project mappings.
18:25:39 ^^ does everyone understand this is their use case?
18:25:51 now. that said, the best option is to navigate allowing an external authority on project IDs similar to federated auth, that can be sync'd.
18:25:56 but that is new tech.
18:25:58 ... I understand that K2K is a synchro on Authentication, not authz
18:26:01 and i don't know how that works.
18:26:09 davidalles: ruan__he can you explain (high level) your project and how it differs from the way the cores are used to working with OpenStack?
18:26:38 k2k is basically SAML2 between keystones.
18:26:42 ok, we set up an upper layer to coordinate multiple independent openstack deployments
18:26:58 it is AuthN with AuthZ still being based upon the SP side of the handshake (RBAC)
18:27:34 I understand that the standard usage of openstack deployments is independent clouds, with customers consuming various clouds in a 'hybrid way' (from one or multiple CSPs)
18:27:35 admins manipulate directly on the upper layer, and the upper layer is in charge of sending requests to each openstack
18:27:59 but it's expected to be able to use the same token between each cloud?
18:28:11 so it's more like different regions in a single deployment, right?
18:28:14 lbragstad: yes, that is what they are going for. I don't know of anyone else that has tried this
18:28:23 things we also need to explicitly cover: what happens if cloudX already has the same ID as cloudY for a project. -- this is something we can't really fix (nor should we) but it is a major security concern [in your upper layer]
18:28:25 single openstack per region
18:28:40 being synced/controlled by an 'over the top layer' of custom software
18:28:47 if you're using the same token across different clouds, i am a hard -2 on all this.
18:28:55 we're not adding that support upstream in keystone.
18:28:59 Here for Telco cloud with multiple sites, the CSP is responsible for the creation of the customer accounts (projects on all sites, allocating the users and configuring their permissions)
18:29:08 with k2k you can trivially get a token for the other side.
18:29:09 it is just way way way too much exposure.
18:29:24 as a security expert, the same project with different project IDs in each openstack is a security concern!
18:29:30 k2k is explicitly set up and you're getting a token issued by the other side.
18:29:50 so revocations/project checks/etc are all managed on the SP side.
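For readers unfamiliar with the k2k flow referenced above, here is a minimal client-side sketch using keystoneauth: a token from the local keystone is exchanged, via the SAML2/ECP handshake, for a token issued by the remote keystone (the SP side). All URLs, credentials, and the service provider name 'site2' are placeholder assumptions, not values from the meeting.

    # Sketch only: exchange a local keystone token for a token issued by a
    # remote keystone registered as service provider 'site2' (hypothetical).
    from keystoneauth1 import session
    from keystoneauth1.identity import v3

    # Authenticate against the local ("IdP side") keystone first.
    local_auth = v3.Password(
        auth_url='https://keystone.site1.example.com/v3',
        username='operator',
        password='secret',
        user_domain_id='default',
        project_name='ops',
        project_domain_id='default',
    )

    # Keystone2Keystone wraps the local plugin and performs the SAML2/ECP
    # exchange with the remote keystone, returning a token issued by the SP.
    remote_auth = v3.Keystone2Keystone(local_auth, 'site2')
    remote_session = session.Session(auth=remote_auth)

    # The remote token is reused for its whole lifetime, so the exchange
    # cost is paid once per token, not once per API call.
    print(remote_session.get_token())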
18:30:12 and because the CSP is the Telco operator, it is forced to comply with local regulation with a hard IAM process and strong Audit & Compliance management
18:30:19 kmalloc lbragstad -- how do we handle very large consumers of OpenStack that want to push bounds? I think the 'opt in' approach here makes this an 'enter at your own risk' situation. Perhaps there is a process around 'experimental' features?
18:31:18 we used to mark APIs as experimental, but we haven't done that in a while
18:31:24 we also don't do extensions
18:31:30 hrybacki: simply put, I'm going to say if you want to use the same token on multiple clouds, you're doing a ton of synchronization already, all IDs must be in sync, so my stance is I am not conceding anything besides project IDs via system scope
18:31:44 'experimental' for CSP usage at 'system scoping' would be a very nice balance for us: and our openstack editor will support this with/without a support exception.
18:32:10 but on my side I am not requesting to synchronize all the IDs on all the resources
18:32:16 kmalloc, we have tested K2K, it has a performance issue for our requirement
18:32:33 is it possible to get bugs opened for those performance issues?
18:32:35 you're probably moving into realms where you're solving how to sync the backends for keystone behind the scenes. i don't want every resource to have the "submit an API request with an ID to make a special resource"
18:32:49 reminder: we only need to be able to control the assignment on a 'per-keystone deployment' basis
18:33:02 ruan__he: how about proper SSO [OIDC] or SAML to the individual keystones
18:33:03 ?
18:33:15 rather than "just take a token issued by keystone A to keystone B
18:33:15 ruan__he: once you get a token, you can reuse it for its lifetime with zero impact on performance
18:33:16 "
18:33:28 davidalles: that can be done with role assignments on shadow-users, right?
18:33:40 kmalloc: i agree, this sounds like a federation scenario
18:33:50 I am really concerned about all the other ids, this just feels like an opportunity for difficult to debug synchronization issues.
18:34:17 all the other resources (and their IDs) are in the 'project scoping'
18:34:20 jdennis: i think this is something that needs to be built into the backend in the write phase, a driver. not on top of the APIs.
18:34:24 then no request on them
18:34:59 davidalles: ruan__he is it possible to see what happened with your performance findings?
18:35:17 if there are things that aren't performant from a federation perspective upstream, we should fix those
18:35:44 if we make this feature an option, we can show you later how we achieve this federation
18:36:11 it sounded like you tried k2k already but hit performance road blocks?
18:36:17 lbragstad, we worked on k2k 2 years ago, and then we decided to go with another approach
18:36:50 were the issues that drove you to another approach performance related?
18:36:59 but you're asking to expose an API that may have significant issues, what do we do when the API is present and the problems are found?
18:37:13 another important point is that it's not only about the federation of 2-3 openstacks, but about maybe more than 10!
18:38:03 This requirement is on the shoulders of the CSP, to comply with regulations and hard OPS processes
18:38:13 jdennis: i really feel like this whole thing needs to be a driver.
18:38:29 and exactly we are dealing with tens of telco datacenters
18:38:29 jdennis: so it can coordinate centrally and provide acks/etc.
18:38:50 rather than a soft "we'll try and do it via public APIs"
18:39:03 davidalles: who is the CSP in this case? just so everyone can level set on terms
18:39:36 ruan__he: if we're federating 10 different deployments and it trips up because performance is bad, then we should try to fix those performance issues
18:39:58 The CSP is the Telco itself: both the IaaS provider
18:40:00 kmalloc, maybe a driver will be a good choice, but at least we should start by making it an option?
18:40:16 of the multiple Telco DCs and the customer who deploys the workloads
18:40:27 the benefit is a driver can be developed out of tree and you can build in all the logic to coordinate the project-id sync.
18:40:39 and if you're liking it, we can consider it for in-tree.
18:40:49 or it could be on a feature branch
18:41:28 we can contribute to this feature first, and be in charge of developing a driver later
18:41:37 * lbragstad thinks that could lead to interop issues
18:41:51 so, let me be really clear, a "new API" is a big hurgle
18:41:52 hurdle
18:42:06 we cannot ever remove an API
18:42:10 so.
18:42:24 to ALL: is driver logic acceptable when you assert that 'V3 must control the generation of IDs'; the driver is installed by the CSP
18:42:42 ... the 'system scoping' in another way
18:42:48 it's not a new API, but an option for an existing API
18:42:49 the driver is below the API, so - the "make a project" generates an ID in the manager/controller layer
18:42:57 kmalloc: are modifications to an existing api considered a new api?
18:42:58 no new API's, especially one that could be problematic
18:43:01 ruan__he: it is NOT an option for an existing API
18:43:28 hrybacki: i'm going to -2 changes to the project_api for specifying the ID.
18:43:50 this really needs to be a system-scope thing if we're doing it.
18:43:57 i'm willing to be open to that.
18:44:11 just not a modification of the current project_api.
18:44:12 now....
18:44:17 let's help ruan__he and davidalles develop a plan that is congruent with the core's thoughts
18:44:22 driver could be acceptable: but could this be done as experimental in Queens?
18:45:02 because RedHat company is our distro editor and they are working on a 'per Long Term Support' basis: Queens then the 'T' version
18:45:04 davidalles: it is something that would be best on a feature branch or out of tree as experimental - we've largely moved away from in-tree experimental (though lbragstad could say I'm wrong and we can do that again, i'd be ok with it)
18:45:27 so in driver logic, as a quick rundown of how it would work:
18:45:42 API request to make new project is received. the id is generated at the manager/controller layer
18:46:32 the driver would check/sync the project ID to the participating clouds (with ACKs) -- this is not via public APIs but through another mechanism that guarantees uniqueness and acks, and handles the 'cloud was down and needs to get the created ids after it comes back up' cases
18:46:39 driver would then store the project id locally.
18:47:06 ++
18:47:15 so - we developed k2k specifically for this
18:47:19 you need to handle the sync logic, and the "catchup from downtime" logic in the external service / keystone-comes-back-on-line.
18:47:20 kmalloc: acks being "yes, I can accept this id"?
18:47:31 jdennis: correct. or "I accepted and stored locally"
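As a rough illustration of the driver flow kmalloc outlines above (ID generated at the manager/controller layer, reserved on the participating clouds with ACKs, then stored locally), here is a minimal sketch. The IdCoordinator client and the SyncedProjectDriver subclass are hypothetical, not keystone APIs; subclassing keystone's SQL resource backend is an assumption about where such out-of-tree logic could hook in.

    # Illustrative only: a resource driver that reserves the project ID on
    # the peer clouds before storing it locally.
    from keystone.resource.backends import sql


    class IdCoordinator(object):
        """Placeholder for the hypothetical external coordination client."""

        def reserve(self, project_id):
            # Would contact every participating cloud and wait for ACKs
            # ("I can and did accept this id").
            raise NotImplementedError

        def release(self, project_id):
            # Would undo a reservation if the local write fails.
            raise NotImplementedError


    class SyncedProjectDriver(sql.Resource):

        def __init__(self):
            super(SyncedProjectDriver, self).__init__()
            self.coordinator = IdCoordinator()

        def create_project(self, project_id, project):
            # The ID was already generated at keystone's manager/controller
            # layer; reserve it on the peers first, then store it locally.
            self.coordinator.reserve(project_id)
            try:
                return super(SyncedProjectDriver, self).create_project(
                    project_id, project)
            except Exception:
                # Release the reservation so peers do not keep a dangling ID.
                self.coordinator.release(project_id)
                raise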
or "I accepted and stored locally" 18:47:33 if there are performance issues with k2k, then it'd be good to see what those are and go fix those 18:47:42 lbragstad: ++ 18:47:47 lbragstad: ++ 18:48:00 jdennis: probably a combination of "i can and did accept" 18:48:04 because that means there are still issues in that implementation and it's not useable for what we designed it for 18:48:21 and I 100% get the desire for the same project_ids across deployments 18:48:47 i think that is a fine thing to strive towards, i worry about public apis to do the sync 18:48:55 I haven't understand how the driver can garrantee the same projecd ID? 18:49:26 ruan__he: the driver can modify the values stored. we only display the id in the response 18:49:34 at the manager layer it would have to query the other cloud to get the ID and then use that instead of generating its own 18:49:49 so you could even ask "hey external service, issue me a unique-uuid-that-is-not-conflicted" you can do that 18:50:42 it seems to be a good temporal solution in short term 18:50:45 yep: seams possible: on site 'Y' or 'X', this request would result in 'the centralized ID generator knows the ID to be given because already allocate on site '1' 18:51:26 you're still dealing with something that is sync'd between sites, but if it is just a store of IDs you can do high-performance lookup table 18:51:42 and that should avoid the legal troubles of syncing the actual keystone data 18:51:58 since it's all random (cryptographically generated) IDs and nothing else. 18:52:34 now, that said, it would still be a requirement that the IDs are UUID-like in form, since other projects require that. 18:52:37 (e.g. nova) 18:53:00 i am very very concerned with changing the APIs and/or adding a new API. 18:53:21 is it possible to get the performance metrics where k2k starts to become unusable? 18:53:26 also look into https://github.com/openstack/mixmatch 18:53:30 but i would like to also fix k2k 18:53:31 if you decide to go k2k 18:53:40 if there are real performance issues. 18:53:50 it seems like this is exactly what k2k is meant to fix. 18:53:53 but if the modification of api doesn't impact others, I don't think that it will be an issue 18:53:54 it's a k2k proxy with caching. 18:54:19 I would still like to hear a ack/nack on the idea of running this as an experimental feature (for record) 18:54:19 that's my concern... we built something to solve this already, but we're going to try and solve it again by changing our API 18:54:24 ruan__he: you're missing the point. APIs are not allowed to be "config based behavior" in keystone. 18:54:33 so, once we land a change it's forever 18:54:41 and we need to support it 18:54:46 and we pledged to never remove v3. 18:54:58 we already built things for this, if they aren't working for perf reasons, i want to fix that 18:55:00 kmalloc, i get the point, that's why I called it an option 18:55:03 if they don't work for functional reasons. 
18:55:46 then i want to supply more functional support, but adding/changing APIs is a large hurdle if it could be done behind the scenes cleanly without impacting interop/security of the APIs (user-facing UX)
18:55:49 ruan__he: there is no such thing as an option, it's either in the API or not
18:55:50 the concern with an optional solution is that it can change across deployments (which is something we've been trying to not do openstack-wide)
18:55:54 jdennis: ++
18:56:25 it just makes interoperability really hard for consumers moving between clouds
18:56:30 i've -2'd a number of things that are "optional" proposed to keystone. Keystone should work 100% the same (API wise) across all deployments of that release.
18:56:53 ... OK... because I am not a source code specialist: is this driver mechanism already available: can we use it inside the existing Queens code
18:57:09 kmalloc, but at the same time, we should improve keystone for new use cases
18:57:19 without any source code modification ... ? or even on previous versions?
18:57:35 davidalles: you can load any driver via keystone's config
18:57:44 you could subclass the current one and write your logic over the top
18:57:57 so supply a driver and say "load this driver"
18:58:13 kmalloc, the way we use openstack becomes much wider than before
18:58:21 there is some minimal documentation here https://docs.openstack.org/keystone/latest/contributor/developing-drivers.html
18:58:28 #link https://docs.openstack.org/keystone/latest/contributor/developing-drivers.html
18:58:36 bah - cmurphy beat me to it!
18:58:56 understood: thanks.... Ruan, how do we measure the work effort to develop the driver? staff*weeks, and what's the level of knowledge needed to do it?
18:59:08 one minute remaining
18:59:18 and how do we share it with our partner?
18:59:18 we can move to -keystone to finish this up if needed
18:59:24 apologies for the lack of open discussion
18:59:26 brauld: are you there?
18:59:30 but the infra folks need the meeting room soon :)
18:59:40 i wrote an assignment driver myself, it only took a couple of days
18:59:49 a resource driver would not take that much work
18:59:54 drivers are easy imo
18:59:55 let's move to openstack-keystone
19:00:00 #endmeeting
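As a follow-up to the driver-loading discussion near the end of the meeting, this is a sketch of how an out-of-tree resource driver might be wired in, following the developing-drivers documentation linked above. The package name keystone_synced, the entry point name synced, and the class name are hypothetical; the keystone.resource entry point namespace and the [resource] driver option are assumptions about the relevant configuration, so check the linked documentation for the exact names.

    setup.cfg (in the out-of-tree package, hypothetical names):
        [entry_points]
        keystone.resource =
            synced = keystone_synced.driver:SyncedProjectDriver

    keystone.conf (point keystone at the registered driver):
        [resource]
        driver = synced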