17:04:49 #startmeeting keystone-office-hours
17:04:50 Meeting started Tue Jun 5 17:04:49 2018 UTC and is due to finish in 60 minutes. The chair is lbragstad. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:04:51 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:04:53 The meeting name has been set to 'keystone_office_hours'
17:25:04 lbragstad: ah yeah, i thought we basically came to a conclusion on that in-channel at the time, and that we'd review the available developer volunteers and approaches in Denver at the PTG, but the general thought was "YES WE WANT THIS", it just wasn't prioritized
17:25:16 so i didn't think we needed to discuss it in the meeting :P
17:25:25 but, reading through that now
17:28:13 i THINK people had the right idea
17:29:07 we really shouldn't be *removing records*. and yeah, obviously a deleted project would not be significantly different than a disabled one -- honestly, delete just being an alias for disable is practically acceptable from my PoV, it's just a terminology issue
17:29:29 we can't retrain end users to "please use disable instead of delete" because that just isn't feasible
17:30:14 imagine trying to get every random user of your cloud (who you mostly have no power over, they're just people who sign up / pay you) to use a different method that doesn't mesh with every other project in existence
17:30:34 "When we don't want a resource anymore, we delete it" <-- is what everyone thinks, for everything
17:31:14 i wasn't ever saying we need an undelete (though I'm sure it could essentially be done manually without too much trouble, just to get the project itself back in an urgent case)
17:31:51 Morgan Fainberg proposed openstack/keystone-specs master: Hierarchical Unified Limits https://review.openstack.org/540803
17:32:02 kmalloc: i was just about to start reading this
17:32:32 lbragstad: ^ addressing ayoung's concerns with the explicit endpoint data sharing and some reasoning for the strict two-level, calling out that we allow oversubscription
17:32:36 i think ayoung and I agree in principle and he's on the same page as I am generally
17:33:17 rm_work: the way i would accept this change (soft delete): change how delete works to always be soft delete, still do the "delete" process but maintain records
17:33:26 yep
17:33:28 that's basically all i want
17:33:30 rm_work: i am a hard -2 for a "new and special delete"
17:33:46 all deletes should become soft (for projects and other resources that must maintain records)
17:33:50 delete should just ... set the disabled flag, hide the record from returning in lists
17:33:58 no, do not use "disabled"
17:33:59 please
17:34:07 delete should still remove grants
17:34:08 etc
17:34:19 ah, k, yeah, that's prolly fine
17:34:27 i just mean "it should also do all the same things disable does"
17:34:34 IE, can't create tokens
17:34:36 right.
17:34:41 disable is a lot less invasive
17:34:57 what is the word for the trivial case of something
17:34:57 and whatever else, because a deleted project is a superset of disabled
17:35:04 disable is a toggle; soft delete should never [really] be expected to be toggled back except in rare cases
17:35:18 like...when a complicated solution has a base case that could be covered by a simpler solution?
17:35:25 you will also need a new API to purge records [sanely] or a keystone-manage command
17:35:37 based upon
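A minimal sketch of the soft-delete behavior being discussed above, assuming a toy in-memory store; none of these function or field names come from keystone's real driver interface, they are invented purely to illustrate "delete keeps the record, removes grants, and hides the project from normal listings."

```python
# Illustrative sketch only (no real keystone APIs): "delete" keeps the
# record, removes grants, and hides the project from normal listings.
from datetime import datetime, timezone

projects = {}      # project_id -> project dict
assignments = {}   # project_id -> list of role assignments

def soft_delete_project(project_id):
    project = projects[project_id]
    project["enabled"] = False                           # behaves like disable...
    project["deleted_at"] = datetime.now(timezone.utc)   # ...plus a delete marker
    assignments.pop(project_id, None)                    # grants are still removed

def list_projects(include_deleted=False):
    # regular users never see soft-deleted projects; admins could opt in
    return [p for p in projects.values()
            if include_deleted or p.get("deleted_at") is None]
```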
17:35:44 yeah i imagined something like what someone brought up, one sec
17:35:53 stepping away to grab lunch quick and i'm going to go through the unified limit stuff
17:36:11 ah yeah, you
17:36:14 hehe
17:36:15 ;)
17:36:16 kmalloc, I'm with you on the "don't just use disabled" etc.
17:36:21 16:43:25 and track deleted time with a "cleanup" option so "remove records for projects with deleted and deleted before
17:36:31 ayoung: ++ disabled has its use.
17:36:37 actually i just said that because i thought it'd be MORE palatable
17:36:45 rm_work I hear ya
17:36:47 "you didn't pay, we're turning your spigot off"
17:36:57 i don't really care how it works, as long as the project DB record stays in the DB, and doesn't show up to users but does show up to admins
17:37:01 delete is "you still haven't paid and we're closing your account"
17:37:23 rm_work: you'll probably want a new api for deleted record introspection
17:37:27 and we should be able to disable, but still get tokens to delete resources for a disabled project...maybe with app creds or something. But that is a different feature
17:37:28 and further deletes and such should probably 404? but that's getting deep into implementation details
17:37:41 ayoung: yep, different features
17:37:57 rm_work: if it's already deleted it works the same as already deleted today ;)
17:38:10 yeah, for most other projects you can do a ?deleted=True and it'll allow showing/listing deleted objects
17:38:35 rm_work: sure, just make sure it's independent of disabled=True or whatever we use for that
17:38:37 yes, for all intents and purposes to a regular user it should seem like nothing changed from the way it works now
17:38:47 but for admins, we should be able to go in and still see the project if we want
17:39:23 so the (implementation detail warning) change is going to be a new column and some constraint changes for project_name
17:39:27 as delete frees the project name
17:39:41 and some changes to how delete propagates (delete from -> update)
17:39:43 hmmm
17:40:10 if you make a unique constraint and deleted can be null, be aware that null is never considered to collide in mysql
17:40:23 so (PName, Null) doesn't collide with (Pname, Null)
17:40:24 heh, noted
17:40:57 yeah i honestly am not 100% sure whether I WILL be able to work on this or not, by the time Denver rolls around. will see.
17:41:00 sure.
17:41:20 I would like to. I don't at first glance feel like this should be too involved
17:41:27 but i am sure it will rabbit-hole quickly ;P
17:41:30 so what is the word. The 2 level quota is the (blank) of the multi level approach
17:41:42 logical extreme?
17:42:07 distillation?
17:44:27 rm_work, not quite the word I am looking for...that means the essence, and I am looking for the "trivial case" or something
17:44:32 and maybe that is the term I want?
17:44:54 anyway, I see no benefit of 2 level. It requires the same overhead as the multilevel
17:45:19 just forces you to a flat, wide tree, but it will be just as expensive to calculate and query
17:45:45 For any project, we need to know its parent
17:45:54 inside of nova
17:46:11 the multi level will then chain from parent to grand parent and so on
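A sketch of the schema change described above (new column plus relaxed uniqueness on project name, 17:39-17:40), written with SQLAlchemy. The table and column names are illustrative, not keystone's actual schema. Note the MySQL caveat kmalloc raises: NULL values never collide in a unique index, so the "only one live project per name" rule still needs an application-side guard or a sentinel value instead of NULL.

```python
# Illustrative SQLAlchemy schema sketch -- not keystone's real model.
from sqlalchemy import Column, DateTime, String, UniqueConstraint
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Project(Base):
    __tablename__ = 'project_example'
    id = Column(String(64), primary_key=True)
    domain_id = Column(String(64), nullable=False)
    name = Column(String(64), nullable=False)
    deleted_at = Column(DateTime, nullable=True)  # NULL means "not deleted"

    # Deleting frees the name for reuse, so deleted_at joins the constraint.
    # Caveat from the discussion: in MySQL a NULL never collides in a unique
    # index, so ('foo', NULL) and ('foo', NULL) would both insert -- keeping
    # a single live project per name needs an application check or a
    # non-NULL sentinel.
    __table_args__ = (
        UniqueConstraint('domain_id', 'name', 'deleted_at',
                         name='ixu_project_name_domain_deleted'),
    )
```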
17:46:33 the two level will not, but the same amount of info needs to be synced between keystone and e.g. nova
17:47:21 we can put restrictions on people moving quota around such that you can never allocate more quota to your child projects than you yourself have, and then there is no "your parent quota was exceeded"
17:47:35 that is just a gotcha that we need to make sure we cover, not fundamental to the two level
17:47:40 so, yeah, 2 level dumb
17:47:44 except even in the case of that, you run into
17:47:49 non-strict enforcement
17:48:09 or does it.
17:48:19 *me drinks more coffee*
17:48:50 right so no oversub quota
17:49:23 no over sub quota
17:49:41 ayoung: oversub is the fundamental piece
17:49:54 that dictates need for two-level
17:50:08 kmalloc, ah, I remember
17:50:19 yeah, the oversub is a side effect of not squaring things with Keystone
17:50:31 it is kinda fundamental, but I think 2 level suffers from it as well
17:50:37 sortof...
17:50:44 but oversub is a method that people do use
17:50:54 say A->B A->C each has 4 units of a parent 8 unit quota
17:51:01 "i give you 100 cores, and each child 100 cores even though i only have 100"
17:51:09 the quota system can tell you're in aggregate over
17:51:11 B maxes out
17:51:29 but people like being able to oversub and then just buy up when the children in aggregate hit the limit
17:51:30 maxes out at 4, C has 0
17:51:51 A then removes from B, creates D
17:51:54 oversub is a headache, but common practice
17:51:55 A->D gets 4
17:52:03 D allocates all
17:52:11 no, the not squaring with keystone is solved with the store upwards concept
17:52:12 C still has 4 quota, but it puts A over the limit
17:52:34 so we can be sure we can (cheaply?) calculate the total usage
17:52:42 meh
17:52:56 oversub is a choice made on representation to child projects
17:53:17 store upwards means what? That A cannot remove quota from B?
17:53:30 Cuz that is really where the oversub comes from
17:54:31 no, consumption of quota is stored up to the parent
17:54:35 in aggregate
17:55:02 so A cannot assign quota to B?
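The A/B/C/D walk-through above, written out as a tiny simulation (pure illustration, no real quota code): every child stays within its own limit, yet once D spends the reallocated quota, any claim C is still entitled to pushes the aggregate under A past A's limit.

```python
# Pure illustration of the oversubscription gotcha described above.
limits = {"A": 8, "B": 4, "C": 4}   # A grants 4 to B and 4 to C
usage = {"B": 4, "C": 0}            # B maxes out, C has used nothing

# A reclaims B's *limit* (B's resources stay), creates D with 4
limits["B"] = 0
limits["D"] = 4
usage["D"] = 4                      # D allocates everything it was promised

aggregate = sum(usage.values())     # 4 (B) + 0 (C) + 4 (D) = 8
print(aggregate <= limits["A"])     # True -- A is exactly at its limit

# C is still entitled to 4 under its own limit, but any claim now pushes
# the aggregate under A past 8, so C gets a surprising "over quota".
usage["C"] = 1
print(sum(usage.values()) <= limits["A"])  # False
```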
17:55:18 no hierarchy? That seems to contradict the name of the spec
17:55:21 so when i make a claim on quota in C (a->b->c) it stores a record upwards that consumption for C is used and A gets a record of the aggregate of b+c
17:55:40 it's about storing claim aggregate usage upwards
17:55:44 in the hierarchy
17:55:53 right
17:56:00 it means worst case to check usage we go to the top of the tree
17:56:06 that doesn't have to square with keystone ever
17:56:20 but the cost is based on the number of nodes in the tree, not the depth
17:56:21 because if A tries to game the system i am fine with saying "over quota"
17:56:49 i am less ok with something D does affecting a different branch of the tree
17:56:52 under A
17:56:59 a->b->c and a->d
17:57:01 yeah, I don't think that is possible
17:57:09 with oversub it is
17:57:11 it's only possible if A plays games, not D
17:57:37 oversub means C+B+D might have more "allowed [not consumed]" quota than A
17:57:45 so D could use all the quota for B and C
17:57:51 with no games being played by A
17:57:52 Let me mull it over to see if there are peer-to-peer games, but I think that only A can do *that* to its Pledges
17:57:54 that is bad UX
17:58:13 oversub is only possible if A plays games
17:58:24 by not removing actual resources when it removes quota
17:58:25 in the model proposed oversub is explicitly allowed without games
17:58:38 A may allocate its entire quota to A, B, C, and D
17:58:43 at the same time
17:58:51 that is just "don't divide the quota on sub projects" and is supported by multi level as well
17:58:54 but the consumption check prevents use once A's quota is hit
17:59:04 yeah, that is covered in multilevel
17:59:13 so, 2 fundamental issues:
17:59:21 you are saying the fact that A can have deep trees just makes it more surprising?
17:59:22 1) Oversub makes for icky ux beyond ~2 layers
17:59:30 Got it
17:59:34 yeah
17:59:42 what if we don't allow oversub beyond two levels?
17:59:45 because D has no insight into its peers really
17:59:49 or...
17:59:59 so it's very surprising when D runs out of quota
18:00:04 that was the idea of quota pools
18:00:17 the pool id is the id of the project that owns the quota
18:00:24 which is you or someone in your tree
18:00:31 i think the solution to store aggregate consumption to the parent(s) is easier than needing to defrag quota pools.
18:00:49 kmalloc, perhaps
18:00:51 it's net the same thing.
18:00:53 got a meeting
18:00:55 sure.
18:01:11 chat later
18:03:19 ok - does quota mean limit or usage?
18:04:27 we've been referring to limit and usage as two distinct, but related, things
18:04:40 and i'm not exactly sure what people mean by quota now
18:09:36 lbragstad: i always specify Limit as what is stored in keystone (allowance)
18:09:52 lbragstad: and quota claim/consumption as what is used (not stored in keystone)
18:10:03 just for clarity, quota is BOTH things, so it needs a specifier
18:10:17 you can't say quota without being explicit about which side you're looking at
18:10:24 https://docs.openstack.org/keystone/latest/admin/identity-unified-limits.html#limits-and-usage
18:10:47 kmalloc, OK, store up can and should be done even in the multi-level approach
18:10:49 we may need to update those docs to reflect it
18:11:01 the mechanism is this:
18:11:04 ayoung: totally, i added a comment to the spec to highlight it
18:11:17 kmalloc: how so? those docs look fine to me
18:11:24 there is no distinction between allocating quota for a new resource or splitting for a sub quota
18:11:27 lbragstad: if we are unclear
18:11:27 they describe keystone as being the place for limits and usage being calculated by services
18:11:44 lbragstad: right, i haven't dug deep, i said "may" assuming we crossed the definitions somewhere
18:12:05 it wasn't an "it's wrong" comment, sorry, split attention between the two docs (new spec, old current documentation)
18:12:27 so A->B->C: add a resource to C, we add the resource to C, reduce 1 from C available, bubble up, reduce 1 from B available, bubble up, reduce 1 from A.
18:12:34 so - is a "quota system" still referring to the coordination between limits and keystone and usages at the service?
18:12:36 ayoung: basically.
18:12:49 lbragstad: both.
18:13:05 limits in keystone*
18:13:06 lbragstad: you really can't have one without the other (well you can, but it's silly)
18:13:17 you need both limits (allowance) and usage
18:13:24 right - and when people reference quota, that's what they are talking about
18:13:26 ?
18:13:30 yes.
18:13:34 ok - that helps
18:14:04 if you're tracking usage without limits for billing purposes, it's just "usage"
18:14:13 quota adds in an allowance/cap
18:14:13 lbragstad, so you still want to pursue 2 level? And if so, only due to pressure to get the spec approved?
18:14:45 ayoung: i'm still about 10 steps behind understanding the reason why you don't like it
18:15:10 i'm trying to catch up
18:15:26 lbragstad, i think the 2 level restriction is not really buying anything, and is as expensive as the multi-level
18:15:38 why half-ass something when you can whole-ass it?
18:16:03 * ayoung apologizes
18:16:44 ayoung: ok - so 1.) you don't see the value in it 2.) you consider it still expensive, yeah?
18:17:20 lbragstad, and 3) we will be doing damage control once we expand to the multi level
18:17:21 what makes you think it doesn't buy us anything?
18:17:28 lbragstad: if we have to look up every child, it is expensive beyond a very small number of children
18:17:48 ok - let's just start with one
18:17:50 right, and that is not based on depth of tree, but on number of children
18:17:53 for the sake of me being slow
18:18:26 If we have a wide flat tree with 1000 nodes the work is the same as a deep tree with 1000 nodes
18:18:28 so, i won't comment on the two/not-two level. i am not opposed/for it [with the exception of oversub/non-oversub ux issues]
18:19:00 And I won't hold things up, just want it to be a deliberate decision
18:19:13 by work are you referring to usage calculation?
18:19:17 yes
18:19:28 collecting and calculating usage when making a new claim
18:19:42 the "are we over quota" check
18:19:45 and same for storage requirements
18:19:49 fwiw - the reasoning behind two-level wasn't because calculating usage was harder for either
18:20:26 if you have 1000 projects, usage calculation will be the same regardless of what the tree looks like
18:20:34 so - sure, i'll agree there
18:20:45 the main reason for the two-level bit was from Dublin
18:20:53 and that issue, i have a solution for [see comment on spec] the wide tree issue
18:20:53 in filling out support tickets
18:21:56 if someone can't create an instance because someone in a third cousin project used up the last of the limit allocation, what is that person supposed to put in a support ticket?
18:22:06 that is actually useful for operators?
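A minimal sketch of the "store consumption upwards" mechanism being described at 18:12:27: every claim against a project is also recorded against each ancestor, so the "are we over quota" check never has to walk the children. This is purely illustrative, not oslo.limit or keystone code.

```python
# Illustrative only: record each claim against the project and its ancestors.
parents = {"C": "B", "B": "A", "A": None}   # A -> B -> C
limits = {"A": 8, "B": 4, "C": 4}
consumed = {"A": 0, "B": 0, "C": 0}         # aggregate usage per node

def ancestry(project):
    while project is not None:
        yield project
        project = parents[project]

def claim(project, amount):
    # Check every level first: a claim in C must fit C, B, and A.
    for node in ancestry(project):
        if consumed[node] + amount > limits[node]:
            raise RuntimeError(f"over quota at {node}")
    # Then bubble the consumption up the tree.
    for node in ancestry(project):
        consumed[node] += amount

claim("C", 3)
print(consumed)   # {'A': 3, 'B': 3, 'C': 3}
```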
18:22:34 with two level project hierarchies, everyone knows the parent
18:22:44 is oversubscription of children actually useful? i don't have an answer
18:23:14 i'm not being rhetorical here, is it useful, if it is, then the UX issue you describe is real
18:23:31 if i'm an operator
18:23:33 if it isn't useful, then we can not allow oversub and dodge the issue.
18:23:41 and I manage a deployment with a depth of 30 levels
18:24:03 and someone at the 29th level reports a ticket like that
18:24:18 i'm going to have a *real* long day tracking down where in the tree i need to make adjustments
18:24:44 right, but this only comes up if the aggregate limit (allowance) for all children is large than the parent
18:24:52 (width/depth included)
18:24:58 so is oversubscription useful?
18:25:08 s/large/larger
18:25:23 according to the requirements johnthetubaguy proposed from CERN, it sounds like it is
18:25:34 ok, that is fine then, i didn't have a good answer
18:25:43 let me see if i can find it
18:26:34 that to me tells us that in the case we allow oversub we need to be either limiting depth or adding smarts like "quota=100, oversub_children_allowance=[true|false]" so the operator can divine where the top of that "omg something borked" is
18:26:35 http://lists.openstack.org/pipermail/openstack-dev/2017-February/111999.html
18:26:51 and i don't really like oversub as a dynamic property of the individual limit
18:26:53 "Sub project over commit is OK (i.e. promising your sub projects more is OK, sum of the commitment to subprojects>project is OK but should be given an error if it actually happens)"
18:27:11 ok, overcommit is a requirement, that answers my question
18:27:50 overcommit is something we need to address in the quota limit model.
18:28:03 i imagine it being useful for deployments looking for resources to flow between sub-projects
18:28:05 and that makes deep trees tricky
18:28:20 and not having to be tightly coupled to tinkering with limits
18:28:29 i agree
18:28:30 when things are in flux
18:28:52 but i wanted to be sure we had that clearly delineated
18:29:05 because if it wasn't, i was going to ask more critically why we are allowing overcommit
18:29:46 that was the reason wxy and johnthetubaguy included it in their proposals, afaict
18:29:53 * kmalloc nods.
18:29:56 soooooo then.
18:30:10 i see two ways out of this.. well 3
18:30:13 1) as is, two level limit
18:30:26 2) overcommit=true (on the limit definition)
18:30:45 3) config value for max limit and hang the ux issues [please don't pick this one]
18:31:08 4) clever solution to error reporting to end users
18:31:16 (see, off-by-one errors, common)
18:31:43 and honestly, i am fine with any of those options.
18:31:47 except 3
18:31:50 3 is a bad option
18:32:15 i'm inclined to say option 1 is the most direct.
18:32:48 ok - so what is #1
18:33:07 pretty much what we have detailed in the specification?
18:34:26 yep
18:34:29 as the spec sits
18:34:36 not much changes.
18:35:40 ok - what about #2
18:35:52 does that not have the two-level requirement?
18:38:43 nothing specific
18:39:02 just allows for operators to know where in the tree to start looking if 29 deep you say you're out of quota
18:39:09 maybe overcommit only starts ate level 28
18:39:10 at*
18:39:28 i am not a huge fan of that btw.
18:39:39 just figured i'd float options as I saw them
18:39:52 i think the 2-level bit is about as good as we're going to get for now.
18:40:53 and #3 restricts creating new projects that exceed two levels deep?
18:41:08 via config
18:41:12 which is just bad design
18:41:19 but... we do that kind of stuff elsewhere
18:41:28 "oh i want 20 deep, cool i set the config"
18:41:31 well - is that in a way similar to option #1?
18:41:35 sortof
18:41:42 but i'd rather have it hard-set
18:41:51 3 means API behavior is different based on config
18:41:54 which i REALLY hate
18:42:17 #1 does that in a way by opting into the model
18:42:27 i'm more inclined to give it a pass if it's not a 3 way config (am i using quota enforcement, what level deep, and keystone side)
18:42:34 s/3/2
18:42:50 #1 is "I opt into enforcement" cool, that changes api behavior anyway
18:42:58 because enforcement is centrally managed
18:43:01 (sortof)
18:43:20 3 is "I opt into enforcement and enforcement may behave differently depending on config"
18:43:31 mm
18:43:46 i prefer the more specific "X enforcement means X behavior"
18:43:58 than "X enforcement could be X, Y, Z, Q, R, G" behavior
18:44:03 pick one, good luck knowing
18:44:13 so would we modify the project hierarchy depth to be ignored in option #1?
18:44:22 we could.
18:44:39 or we create a new enforcement model that does explicitly that
18:44:42 we do similar things with the security_compliance configuration section
18:44:50 "multi-level-overcommit-enabled"
18:44:57 vs "two-level-overcommit-enabled"
18:45:22 security compliance is a bit more special because of how PCI-DSS works
18:45:35 well - in that we explicitly say
18:45:45 "this only matters if you are using the sql driver"
18:45:46 quota has no implications with outside certification on if you can process credit cards...if for some reason your cloud is in scope
18:46:16 quota is very much internal to openstack
18:46:36 vs potentially very externally impacting
18:46:52 we could have something similar for the tree depth configuration option saying "this option is ignored if you're using the two-level strict enforcement model"
18:46:58 correct.
18:47:13 and in fact, i'd deprecate the option in keystone to change the depth
18:47:18 set it to some reasonable max
18:47:29 and let the quota enforcement model dictate the amount
18:47:36 if it changes from the "Reasonable max"
18:49:40 lbragstad: i'd increase the max_depth to 10, and deprecate the option
18:50:06 and if someone has extra deep tree(s), we let them stay, just no new extra deep ones
18:50:23 then the quota model dictates the max depth, no multiple options to reconcile.
18:50:45 * lbragstad thinks
18:51:49 also keep in mind what happens if someone enables this enforcement model and already has a tree 5 projects deep
18:51:58 does it error and say "NO WAY"
18:52:01 or?
18:52:11 cmurphy: brought that up in review
18:52:23 the specification review*
18:52:27 yep
18:52:49 we do something like that with the fernet token provider
18:53:22 if you're using fernet tokens and we can't find the key repository we fail
18:53:29 on start up
18:53:34 that's fine.
18:53:48 just as long as we have the expected behavior documented
18:54:18 anyway, i've covered my view
18:54:37 do we keep that same view if a deployment has a tree 4 projects deep?
18:54:50 it should be consistent
18:54:52 and they attempt to use the strict-two-level enforcement model?
18:55:10 if we are saying this enforcement only works with two-level, it only works with two level
18:55:17 fail on start up, have them switch back to flat, fix the project structure, and try again?
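A sketch of the kind of pre-flight check discussed here and in the "doctor command" idea just below: walk every project tree and report every place the hierarchy is deeper than the chosen enforcement model allows, rather than failing on the first hit. The data structures are invented for illustration and do not reflect keystone's internals.

```python
# Illustrative pre-flight check: report *all* places the hierarchy is too
# deep for a chosen enforcement model, instead of just the first failure.
MAX_DEPTH = 2   # e.g. the strict two-level enforcement model

def find_too_deep(parents, max_depth=MAX_DEPTH):
    """parents maps project_id -> parent_id (None for roots)."""
    def depth(pid):
        d = 1
        while parents[pid] is not None:
            pid = parents[pid]
            d += 1
        return d
    return sorted(pid for pid in parents if depth(pid) > max_depth)

tree = {"A": None, "B": "A", "C": "B", "D": "C", "E": None}
violations = find_too_deep(tree)
if violations:
    print("project hierarchy is too deep at:", ", ".join(violations))
```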
18:55:33 error with clear indication on all the failures
18:55:36 not just the first one
18:55:49 project hierarchy is too deep at locations X, Y, Z
18:56:12 [i might make it a doctor command to check before swapping to an enforcement model]
18:56:27 so you don't have to play the "are we going to fail to start" game
18:56:32 yeah
18:56:48 but basically, "run this command, it will tell you if the quota enforcement model can work"
18:57:02 if not, you need to fix the places it won't work
19:04:30 ok
19:11:48 i'm going to re-parse the spec again
19:17:28 Lance Bragstad proposed openstack/keystone-specs master: Update blueprint link in default roles specification https://review.openstack.org/572528
19:18:44 hrybacki: ^
19:36:33 kmalloc: in your first comment here - https://review.openstack.org/#/c/540803/12/specs/keystone/rocky/strict-two-level-enforcement-model.rst
19:36:43 where are we writing the usage information?
19:48:48 that is data stored in the service layer [oslo.limit]
19:50:15 ack
19:50:43 if you look at the convo between me and melwitt it's a tough nut to crack
19:50:55 i just left a response to that
19:51:02 and my head hurts
19:52:23 well, fwiw I think if oslo.limit calls the per project count functions in parallel, maybe we don't really have a problem
19:53:16 er, or we should be more okay without doing the usage caching thing than I was originally thinking
19:56:00 yeah - that's a good point
19:56:19 another thing
19:56:30 CERN has pretty much been asking for this for a while
19:56:45 and they've done a good job of stretching the legs on other big initiatives
19:56:52 (cells v2.0?)
19:57:24 say we try this, we will likely get good feedback from them we can use to improve the system and refine it
19:57:25 melwitt: oh hi!
19:57:49 i really think we're going to see a wide tree issue more commonly than expected, wide enough that even parallel lookup is going to be painful
19:58:13 but, that said, i put that as a comment so we weren't holding anything up besides having a discussion
19:58:39 i didn't want to encode that behavior as part of the spec without some level of agreement on the approach.
19:59:08 also we can revisit if we hit performance issues, i seriously hope i'm wrong.
19:59:13 yeah, I definitely think there should be a plugin or choice that does not do the usage caching as a first cut
20:00:57 for what james was proposing [i keep forgetting his IRC nick], because multi-level was a thing, we def. need a report up to avoid "game the system" issues.
20:01:02 melwitt: ftr - you're talking about making requests to calculate usage in parallel, right?
20:01:04 but that is out of scope of this spec.
20:01:18 not requests to keystone?
20:01:25 right, in nova in this case
20:01:31 usage lookup not allowance lookup
20:01:48 lbragstad: yes, so oslo.limit would spin up some green threads to count usage for project A, B, C in parallel
20:01:56 ack
20:02:07 just making sure i was reading things right
20:02:48 factoring that in, i'm not sure where i would guess things to tip over without seeing the system working or pushing it
20:03:01 Merged openstack/keystone-specs master: Update blueprint link in default roles specification https://review.openstack.org/572528
20:03:19 the real tip over is just how many children there are, even in parallel.
20:03:29 and concurrency to nova making new vms.
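melwitt's suggestion above (call the per-project count functions in parallel rather than caching usage), sketched with a plain thread pool. This is not oslo.limit's real interface; `count_usage` is a stand-in for whatever callback a service like nova would register, and the project IDs are placeholders.

```python
# Illustration of counting usage for several projects in parallel.
# count_usage is a stand-in for a service-provided callback (e.g. nova
# counting instances); it is NOT a real oslo.limit interface.
from concurrent.futures import ThreadPoolExecutor

def count_usage(project_id, resource):
    # placeholder: a real service would run its own DB count()/sum() here
    return 0

def total_usage(project_ids, resource):
    with ThreadPoolExecutor(max_workers=8) as pool:
        counts = pool.map(lambda pid: count_usage(pid, resource), project_ids)
    return sum(counts)

# usage for a parent plus all of its children, gathered concurrently
over = total_usage(["A", "B", "C"], "instances") > 8
```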
20:04:06 or whatever service :)
20:04:23 * kmalloc goes back to flask stuffs i think we beat this spec into submission
20:04:24 there was a thread awhile back where someone did some testing in cinder with the "counting quotas" re-architecting we did in nova and the tip over was having a lot of resources to count in one project (in the absence of caching). example, 30k volumes in one project and things started slowing down to do the "enforce" call
20:04:51 interesting
20:04:53 and i'm happy to upgrade to a +2 as it sits as long as other folks weigh in.
20:05:02 I do wonder if, when hierarchy is available, people would be more likely to create smaller projects, that would help things a lot
20:05:11 melwitt: i think people will.
20:05:20 to a certain extent though
20:05:29 since we're still limiting to two levels
20:05:45 maybe?
20:06:50 this was the thread, though really hard to read in plain text because there are graphics http://lists.openstack.org/pipermail/openstack-dev/2018-March/128096.html
20:07:44 the resource count is a database query sum(column) over the rows for a project and as you get a lot more records, that slows down
20:08:29 well, in the cases where it has to be a sum() (like vcpus, ram). the count() are faster if that is possible (one resource per row)
20:08:41 interesting
20:09:22 so - there is the resource per project tip-over point and the total projects to calculate usage for tip-over point
20:10:29 nova could slow down calculating that there are 10000 instances in a single project, or that there are 100 instances in 100 projects, yeah?
20:11:53 instances would be fast because that's a count() in an index but if it were vcpus, that would be a sum() and the former (10k) would be slower than the latter (100 x 100) I think
20:12:08 oh - sure, good point
20:16:28 is this enough to classify what we have in the spec as an unsupportable design?
20:17:55 or do we keep things marked as experimental just until we get a better idea of where things fall over?
20:18:06 and then iterate from there
20:21:42 I'm writing up a response but I was wondering if we can't design this in a way such that the initial model is the simplest and could be potentially chosen via config option if we add a new model that writes upwards, for example
20:21:53 in the future
20:22:20 i think that is reasonable
20:22:29 and it still leaves the "flat" enforcement we have today
20:23:20 which lets system operators enforce whatever hierarchical model they want if they're willing to modify the limits manually
20:23:30 which doesn't help james as much :(
20:23:52 since he's looking for the ability for domain/project administrators to do a certain level of that on their own
20:24:56 if I understood correctly, they wanted to be able to delegate limit setting via hierarchy and that each project delegated to would have to manage their own limits manually
20:25:11 yeah
20:25:28 i'm not sure we'll be able to do that with flat enforcement
20:25:48 since the hierarchy isn't part of the limit validation process in keystone
20:25:48 I see
20:25:58 ah
20:26:09 if you have A and it has two children B and C as a single tree
20:26:23 then D is a separate tree with one child, D
20:26:26 E*
20:26:51 if you give project admin on D the ability to update limits, they'd be able to modify limits on A's tree
20:27:28 I see, yeah, so flat means flat on both sides, the limit setting and the enforcement
20:27:38 right - at least right now?
20:27:49 like, flat RBAC enforcement and flat limit validation
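The count()-vs-sum() point melwitt makes above, shown as the two query shapes involved (SQLAlchemy Core, against a made-up `instances` table): counting rows per project can be served from an index, while summing a column like vcpus has to visit every matching row, which is why 30k resources in one project slows the enforce call down.

```python
# Illustrative queries only; the table definition is invented for the example.
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        func, select)

metadata = MetaData()
instances = Table(
    "instances", metadata,
    Column("id", String(36), primary_key=True),
    Column("project_id", String(64), index=True),
    Column("vcpus", Integer),
)

# "how many instances" -- a count() over an indexed column, relatively cheap
count_q = (select(func.count(instances.c.id))
           .where(instances.c.project_id == "abc123"))

# "how many vcpus" -- a sum() that must visit every row for the project,
# which is where large per-project resource counts start to hurt
sum_q = (select(func.coalesce(func.sum(instances.c.vcpus), 0))
         .where(instances.c.project_id == "abc123"))
```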
20:27:54 which makes sense. but they were hoping for a hybrid
20:28:03 that might still be useful though
20:28:30 if you have hierarchical RBAC enforcement without the hierarchical limit validation, does that work?
20:28:55 yeah, I think it'd be useful, just a matter of whether it's too many knobs or other UX issue. to be able to choose hierarchical limit validation + flat RBAC enforcement
20:29:19 yeah - flat being only system administrator can modify limits
20:29:24 right?
20:29:34 I guess I didn't see why not but that could well be my keystone limit ignorance. I had been thinking of a toggle in oslo.limit, either it walks the hierarchy or it doesn't
20:29:49 also, keep in mind that hierarchical limit and flat enforcement needs write-upwards or the system can be gamed
20:30:00 or we have to do the entire depth search/validate anyway
20:30:24 add quota to child, spin up instances, remove quota, add removed quota to other child, spin up instances, rinse, repeat
20:31:24 I didn't think it would need to write upwards. if we're checking quota for project E, only call the count function for project E and then compare it to project E's limit and the parent limit but don't go and count all the siblings to compare with the parent limit. maybe I'm missing something
20:31:43 right but what if you take quota away from E and give it to F
20:31:50 and then spin up instances on F without turning down E
20:31:55 then do it for D and Q
20:32:06 doing this from the perspective of an administrator of D
20:32:07 you end up with as many instances on as many projects as you want.
20:32:31 so long as it's under the limit the system administrator gave your tree at D
20:32:35 as many instances = max quota for the ultimate parent
20:32:40 yeah, that is what you're trusting your delegated party with. if they do that, it would be over quota until resources are freed by users
20:33:07 i make it a habit to not write CVEs by design ;)
20:33:36 well, in this scenario these are trusted sub-admins but I see your point
20:33:36 but that feels like something that will be a CVE pretty quickly, because someone will not expect it... even if it's documented
20:34:07 in their case, they just want to alleviate the load of putting all the limit setting work on one admin
20:34:16 if it wasn't a trusted sub admin but a malicious customer
20:34:38 i still don't trust a sub-admin in the grand scheme of things [have to play the "have security hat, will critique"] - it may even be a malicious subadmin, you want to keep quota for a given tree under X
20:35:13 fair
20:35:14 the admin has users saying "OMG NEED MORE VMS" and they, with good intentions, give the quota knowing they can game the system
20:35:42 somewhat malicious compliance to the limits delegated to them
20:35:51 I know penick would be okay with that if he could get the flat enforcement but I definitely see your point that this would open up a lot of problems for other use cases, so maybe it's a nonstarter
20:35:53 i view this as outside of the scope of the spec proposed though
20:36:06 yeah, it's outside the scope for sure
20:36:17 penick would use the flat enforcement?
20:36:21 and i am happy to hammer that issue into the dead horse that it is when we start working on it
20:36:42 penick said flat enforcement, strict commit [no overcommit], hierarchy
20:36:51 hierarchical limits*
20:37:00 that's what he said. but maybe with parallel queries the hierarchical enforcement would be fine. would have to check with him
20:37:11 ++ we should check with him on that front.
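kmalloc's "rinse, repeat" exploit above, written out as a toy sequence: if the check only compares a child's own usage against its own (movable) limit, a sub-admin can shuffle the limit between children and end up with far more real resources under the tree than it was ever given. Illustration only; no real enforcement code is shown.

```python
# Toy illustration of gaming flat per-project checks with movable limits.
tree_cap = 10                       # what the system admin gave D's tree
child_limits = {"E": 10, "F": 0}    # D's sub-admin controls this split
child_usage = {"E": 0, "F": 0}

def naive_claim(child, amount):
    # only checks the child's own limit -- no aggregate / write-upwards
    if child_usage[child] + amount > child_limits[child]:
        raise RuntimeError("over quota")
    child_usage[child] += amount

naive_claim("E", 10)                            # E uses its full allowance
child_limits["E"], child_limits["F"] = 0, 10    # shuffle the limit to F
naive_claim("F", 10)                            # F "legitimately" uses it again

print(sum(child_usage.values()), ">", tree_cap)   # 20 > 10
```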
20:37:56 yeah - if he's will to manage requests :(
20:38:00 willing*
20:38:03 kmalloc: what do you think about the idea of the first model not caching usage, designed in a way to allow swapping in a different model (via config option) in the future? is that something that would be possible?
20:38:31 melwitt: i'm inclined to just make that a different model (maybe make it a model family, that can be swapped between)
20:38:55 so long as the project structure adheres to the requirements of the new model
20:39:03 that seems reasonable
20:39:19 I agree with you that we're going to hit performance issues but as lbragstad mentioned, things get a lot more complicated that way (claim releases services are responsible for) and out-of-sync possibilities
20:40:03 as far as trying to provide something that doesn't have a high barrier for entry
20:40:13 model_family(hierarchical_enforcement, hierarchical_limits) => [parallel_check_model, cached_check_model, cached_check_model_with_oob_timed_updates], model_family(flat_enforcement, flat_limits) => [flat_enforcer_model], model_family(flat_enforcement, heirarchical_limits) => [...]
20:40:30 and anything in a model_family is assumed to be compatible (with maybe a seed-the-cache command)
20:40:58 so to start we end up with flat_enforcement and heirarchical/hierarchical [ugh i can spell/type]
20:41:01 and we expand from there
20:41:27 flat is what is merged today, hierarchical/hierarchical without caching is the current spec
20:41:31 and we expand from there
20:41:49 sounds cool
20:42:04 if james ends up using flat or two-level-strict
20:42:24 and i can then add some etcd, endpoint-sync data storage bits for this as well
20:42:27 i'd be interested to hear his feedback, because i wouldn't be surprised if we could us it for a new model
20:42:43 use*
20:42:45 so multi-endpoint could have a single limit set that is enforced.
20:42:59 just develop for each family of limit/enforcer combinations
21:23:54 last unified limit question for the day
21:24:16 we currently support the ability to create multiple limits in a single POST request
21:24:41 do we want to expose that through the CLI somehow, or would that be weird?
21:26:10 or do we just leave CLI support as a single limit per request for the sake of not cluttering the command line or reading multiple limits from a file
21:26:11 like for multiple resources? or multiple projects? or both? I could see someone maybe wanting to do the former, but not sure tbh
21:26:47 we currently have this https://developer.openstack.org/api-ref/identity/v3/index.html#create-registered-limits
21:27:07 and https://developer.openstack.org/api-ref/identity/v3/index.html#create-limits
21:27:24 so you can create limits in batches
21:27:34 okay, so the request format can be repeated, I see
21:27:49 or it's a list rather
21:28:04 yeah - like "here are all the limits i want to create, go"
21:28:10 the only data point I have is the nova quotas let you do multiple resources for one project in one go https://developer.openstack.org/api-ref/compute/#update-quotas
21:28:47 not sure how many people do that, but it's there and it makes sense to want to set limits for several resources for a project
21:29:04 are users able to create multiple quotas in a single go from the CLI?
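A hedged sketch of the batch-create behavior being described, using plain `requests` against the create-limits endpoint linked above. The attribute names (`resource_name`, `resource_limit`, etc.) should be verified against the API reference rather than taken from this example, and the token, URL, and IDs are placeholders.

```python
# Sketch of batch limit creation against keystone's v3 API. Field names
# should be verified against the API reference linked above; the token,
# URL, and IDs below are placeholders.
import requests

KEYSTONE = "http://keystone.example.com/identity/v3"
HEADERS = {"X-Auth-Token": "<admin-token>"}

# several limits for one project in a single POST
body = {
    "limits": [
        {"service_id": "<nova-service-id>", "project_id": "<project-id>",
         "resource_name": "instances", "resource_limit": 10},
        {"service_id": "<nova-service-id>", "project_id": "<project-id>",
         "resource_name": "cores", "resource_limit": 20},
    ]
}
resp = requests.post(f"{KEYSTONE}/limits", json=body, headers=HEADERS)
resp.raise_for_status()
```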
21:29:36 I believe so. let me double check
21:30:35 osc does it https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/quota.html#quota-set
21:30:46 (per project)
21:31:26 oh - wow
21:32:41 so it is possible to string together settings per project for multiple projects in a single request
21:34:20 and this is the old novaclient ref https://docs.openstack.org/ocata/cli-reference/nova.html#nova-quota-update
21:35:19 so you could do openstack quota set $SETTINGS $PROJECT_1 $SETTINGS $PROJECT_2
21:35:48 I doubt it, I think it's one project at a time only
21:35:56 oh - got it
21:36:03 i was thinking you could do it for multiple projects at one
21:36:04 once*
21:36:13 nah
21:36:16 just multiple resources
21:36:27 I don't think multiple projects makes much sense but that's just MHO
21:36:36 from a CLI perspective you mean?
21:38:34 imo - if an operator wanted to do that i could see them having a large .json file with all their requests ready to go and just curling it to keystone directly
21:41:45 yeah from CLI perspective
21:42:51 makes sense
21:43:13 mayve I'm wrong too :)
21:43:20 *maybe
21:43:33 i'm not aware of any CLI commands that let you do that
21:43:43 or support batch creation
21:44:19 * melwitt nods
21:45:57 thanks for the help today melwitt
21:46:32 #endmeeting