18:01:16 <stevemar> #startmeeting keystone
18:01:17 <openstack> Meeting started Tue Aug 16 18:01:16 2016 UTC and is due to finish in 60 minutes. The chair is stevemar. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:01:18 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:01:21 <openstack> The meeting name has been set to 'keystone'
18:01:22 <henrynash> it's the beret & onions that do it
18:01:27 <stevemar> alright keystoners, let's get ready!
18:01:47 <shaleh> \o
18:01:47 <stevemar> (un)fortunately, no one added anything to the meeting agenda :O
18:01:52 <stevemar> #link https://etherpad.openstack.org/p/keystone-weekly-meeting
18:01:53 <shaleh> yay short meeting
18:02:02 <henrynash> stevemar: don't tempt us
18:02:31 <stevemar> short topics, just status, bugs, and then open discussion
18:02:37 <dstanek> o/
18:02:39 <dstanek> i showed up for no reason?
18:02:55 <stevemar> dstanek: tiny reasons
18:03:07 <stevemar> #topic
18:03:09 <topol> o/
18:03:10 <stevemar> rfail
18:03:12 <bknudson> hi
18:03:15 <dstanek> i have a goal to get a few patches up to address some of the caching issue
18:03:20 <stevemar> #topic release status
18:03:44 <stevemar> the following blueprints are not yet complete: https://launchpad.net/keystone/+milestone/newton-3
18:03:56 <stevemar> but are in pretty darn good shape
18:04:00 <stevemar> PCI is a few hours away from being done
18:04:12 <stevemar> ldap preprocess is pretty close
18:04:22 <stevemar> rolling upgrades is chugging along
18:04:36 <stevemar> and lbragstad picked up credential encryption (thanks!)
18:04:39 <ayoung> Wow, my name does not even appear on that page. Tripleo has taken over my life
18:04:54 <stevemar> ayoung: you're forever with us in spirit
18:04:54 <henrynash> rolling upgrades is good shape (one final issue to settle on, but time to make teh chanegs which ever way we go is only an hour)
18:05:07 <dstanek> henrynash: nice!
18:05:10 <ayoung> I'll be back...
18:05:17 <dolphm> stevemar: quick, create a high priority blueprint and assign it to ayoung
18:05:20 <rderose> henrynash: ++
18:05:21 <stevemar> lol
18:05:26 <bknudson> ayoung: I've been really impressed with your work in triple-o. This kind of cross-project work helps us do better.
18:05:36 <ayoung> bknudson, thanks
18:05:43 <henrynash> bknudson: ++
18:05:45 <stevemar> dolphm: can it be to say things from terminator movies?
18:05:52 <ayoung> I was able to get the Policy deployment to work the way I want it to in Tripleo
18:05:52 <topol> dolphm we could give him some music to transpose
18:06:00 <dolphm> stevemar: that seems actionable enough for a bp
18:06:04 <ayoung> http://adam.younglogic.com/2016/08/rbac-policy-update-tripleo/
18:06:05 <stevemar> a compliment of the highest order from bknudson!
18:06:11 <dolphm> stevemar: ++
18:06:13 <dolphm> bknudson: ++
18:06:34 <stevemar> for real dates:
18:06:37 <stevemar> Next week (Aug 22) is the last release of keystoneauth and keystonemiddleware
18:07:02 <stevemar> Next Next week (Aug 29) is the last release of keystoneclient and newton-3 driver / keystone feature freeze! (it's actually mid-week -- Sept 1)
18:07:44 <stevemar> so the blueprints and bugs should ideally be completed *before* the 29th, or have a +2 and very close
18:07:46 <dolphm> but if it's not gating by august 29, it probably won't make it
18:07:54 <dolphm> because the gate queue will explode in length
18:07:55 <stevemar> dolphm: bingo
18:07:56 <lbragstad> so - monday it is
18:08:04 <stevemar> dolphm: i was just gonna say...
18:08:06 <stevemar> lbragstad: basically
18:08:12 <dolphm> lbragstad: ergo, aim for EoD Friday
18:08:23 <lbragstad> friday it is
18:08:32 <lbragstad> ... so tomorrow? right?!
18:08:41 <stevemar> lbragstad: 1 more week :)
18:08:43 <dolphm> transient error rates will also sky rocket, so that gives us a couple days to recheck things
18:08:51 <stevemar> dolphm: yeeeep
18:09:02 <stevemar> dolphm: have you been through this before? i feel like you have
18:09:17 * topol this release will be different. I keep telling myself... re transient errors :-)
18:09:23 * dolphm is going on my 11th release
18:09:45 <stevemar> blueprints aside, we have a few nasty bugs
18:09:46 <samueldmq> dolphm: nice!
18:09:49 <stevemar> #topic bugs
18:10:07 <ayoung> dolphm, that can't be right...this is *my* 11th, and you were there before me, no?
18:10:14 <ayoung> Diablo
18:10:24 <stevemar> the biggest of which is the caching issues we've been seeing: caching woes: https://bugs.launchpad.net/bugs/1600394
18:10:24 <openstack> Launchpad bug 1600394 in OpenStack Identity (keystone) "memcache raising "too many values to unpack"" [Critical,Confirmed] - Assigned to David Stanek (dstanek)
18:10:33 <dolphm> i started immediately after the cactus release
18:10:41 <dolphm> ayoung: so, diablo was my first
18:10:47 <stevemar> dstanek has been heads down on this stuff
18:10:59 <dstanek> stevemar: i'm going to tackle that one after the current cache bugs i'm working on
18:10:59 <stevemar> dolphm & ayoung the camp fire is over at -dev :)
18:11:00 <ayoung> Ah....I think technically Essex was mine...
18:11:00 <lbragstad> thanks dstanek!
18:11:13 <dstanek> that one is very mystical though
18:11:43 <bknudson> I thought this might be due to a collision from the sha-1 hash.
18:11:46 <stevemar> dstanek: do what you can, let us know when it's ready for review, and most importantly -- if you need elp
18:11:47 <stevemar> help
18:11:56 <bknudson> but now I don't know.
18:12:00 <dstanek> i have found that we had a few tests that worked only because revocation list caching was messed up
18:12:04 <bknudson> maybe it's a threading issue?
18:12:11 <henrynash> I think https://bugs.launchpad.net/keystone/+bug/1604479 will be a Not A Bug
18:12:11 <openstack> Launchpad bug 1604479 in OpenStack Identity (keystone) "tenantId/default_project_id missing on Keystone service user in Mitaka" [Critical,In progress] - Assigned to Kam Nasim (knasim-wrs)
18:12:23 <stevemar> henrynash: why do you say that?
18:12:42 <bknudson> for some reason we're using the memcached pool and there's no need for it since we got rid of eventlet.
18:13:03 <dstanek> after this meeting i have to figure out why this tests raises two different exceptions http://paste.ubuntu.com/23062173/
18:13:17 <henrynash> stevemar: from everything I can see... it is simply that the puppet module was assuming v2 fucntinality for teh v3 create user API
18:13:31 <stevemar> henrynash: that would be good news
18:13:45 <stevemar> henrynash: i was going to take a second look at that one today, if we can bounce it back, that would be great
18:14:00 <henrynash> stevemar: there were assuming a user with a default project would be granted a role on that project (which we don't do in v3)
18:14:13 <stevemar> right
18:14:25 <stevemar> henrynash: okay, comment as such in the bug and review, i'll catch up
18:14:37 <stevemar> on that list in the agenda we also have https://bugs.launchpad.net/bugs/1592169
18:14:37 <openstack> Launchpad bug 1592169 in OpenStack Identity (keystone) newton "cached tokens break Liberty to Mitaka upgrade" [High,In progress] - Assigned to Colleen Murphy (krinkle)
18:14:38 <henrynash> stevemar: already done so
18:14:59 <crinkle> o/
18:15:08 <stevemar> but crinkle has a fix up and mfisch volunteered to verify it (since he is the originator too)
18:15:14 <stevemar> thanks again crinkle :)
18:15:39 <samueldmq> stevemar: crinkle: it's also a flag for reviewers
18:15:51 <samueldmq> I particularly had never hit that issue before
18:16:01 <samueldmq> but it's interesting
18:16:22 <stevemar> last on the list is https://bugs.launchpad.net/keystoneauth/+bug/1613498 -- jamie had a long chat with the bug originator last night and i think it's a no-op, but if you're interested, please weigh in
18:16:22 <openstack> Launchpad bug 1613498 in python-openstackclient "Token Auth does not work (not fetching catalog)" [Undecided,New]
18:16:24 <dstanek> did the token format change at all?
18:16:39 <bknudson> Not sure how we could have testing for these kinds of issues in general. Seems pretty much impossible.
18:17:12 <bknudson> maybe the best thing to do is to say to wipe out your memcache on upgrade
18:17:13 <dolphm> dstanek: something could be passing ?nocatalog in the token request
18:17:17 <stevemar> bknudson: the upgrade ones are definitely tricky
18:17:39 <ayoung> jamielennox not here...hmmm
18:17:48 <bknudson> when we do remember to put the code in to handle the case it's ugly.
18:18:00 <stevemar> bknudson: that is what dolphm recommended too
18:18:09 <bknudson> maybe have a version in the cache lines
18:18:16 <dolphm> bknudson: buuut.... when do you restart memcached when it's shared by multiple nodes that are undergoing a rolling upgrade?
18:18:43 <bknudson> yikes, that's scarier. versioning would be safer.
18:18:45 <dstanek> dolphm: turn it off during an upgrade?
18:18:57 <ayoung> that sounds like a spectacularly bad idea.
18:19:08 <dstanek> ayoung: which one?
18:19:18 <dolphm> ayoung: well, it would workaround the issue at least
18:19:18 <ayoung> cache shared across nodes...is not something that makes me feel good
18:19:36 <dstanek> that's the whole point of memcached though
18:19:44 <dolphm> ayoung: it's the best way to take full advantage of your memory footprint
18:19:53 <ayoung> dstanek, it violates the whole reason to use a Transactional database
18:20:18 <dolphm> ayoung: ? memcached is neither transactional nor a database
18:20:25 <knikolla> it's more of a kvs
18:20:26 <ayoung> dolphm, exactly. SQL is
18:20:42 <ayoung> and memcache violates pretty much any transaction benefit
18:20:42 <stevemar> #topic open discussion
18:20:54 <dolphm> i'm not following
18:21:00 <ayoung> anyway...I'm derailing
18:21:12 <stevemar> ayoung: for once, i don't mind your derailing :)
18:21:36 <ayoung> It seems to me that Keystone first off has to be accurate
18:21:42 <dolphm> ayoung: are you saying that things can be written to memcache before a larger transaction is complete?
18:21:44 <ayoung> and we do a lot of caching in the interest of scalability
18:21:56 <ayoung> and I wonder if we are being pennywise and pound foolish here
18:22:01 <stevemar> it's open discussion now, folks can drop off if they want, or ping me if they have topics they want to chat about
18:22:21 <stevemar> or propose it to the etherpad, i have it open
18:22:25 <ayoung> we also have issues with Galera. I wonder if all the errors add up to ...well, something nasty
18:22:32 <dolphm> stevemar: something something rolling upgrades
18:22:41 <henrynash> dolphm: ha!
18:22:43 <raildo> stevemar, should we get it touch with the TC to remove the support on v2 on Otaca release? :)
18:22:56 <bknudson> caching is disabled by default
18:22:58 <dstanek> ayoung: i would partially agree since our caching is broken in a few ways
18:23:18 <dstanek> but it can be good and safe. my goal is to make that happen
18:23:18 <stevemar> raildo: IIRC we deprecated v2 in M, and we said we would have it around for 4 releases, so not yet :)
18:23:38 <stevemar> raildo: i think the deprecation message says the Q release
18:23:45 <dstanek> we only cache data that is already in the DB so the acid guarantees are still there
18:23:46 <bknudson> dstanek: some people might have a different definition of safe.
18:23:46 <shaleh> bknudson: yes, but anyone with more than a few users enables caching so that is kind of a useless statement
18:23:47 <raildo> stevemar, hum... got it :)
18:23:52 <raildo> stevemar, yes
18:23:53 <henrynash> dolphm: (as per the comment I've added on the patch, if the weight of opinion of reviews is to go for 3 new repos, I'll have teh changes done and up by tonight)
18:24:08 <stevemar> #topic something something rolling upgrades
18:24:10 <dstanek> bknudson: you could always keep it off
18:24:46 <stevemar> dolphm henrynash rderose is there any impact in renaming the old repo?
18:25:36 <dolphm> stevemar: yes and no? renaming it won't break anything in and of itself
18:25:38 <dolphm> stevemar: but there's no reason to
18:25:46 <bknudson> https://en.wikipedia.org/wiki/CAP_theorem
18:25:51 <ayoung> dstanek, dolphm I do suspect this is why people go no-sql. We've essentially converted Keystone into a large, distributed, eventually-consistent store
18:26:01 <ayoung> bknudson, I took a course with Brewer
18:26:08 <henrynash> I can't think of a technical issue, I think it is more conceptual cleanness vs keep doing what we've been doing (for Newton) for the "expand" cycle
18:26:09 <dolphm> stevemar: and more importantly, repurposing the old repo to be the new expand repo prevents the 3 repos we're going to go forward with from having the same migration number at all times
18:26:28 <dolphm> stevemar: which is something that would be easy to enforce, and would help us prevent deployers from accidentally running migrations out of order
18:26:39 * topol why is bknudson referencing the CAP theorem?
18:26:49 * topol things getting intersting..
18:27:12 <ayoung> bknudson, that class is one of the reasons I realize just how hard Database stuff can be to get right
18:27:24 <bknudson> topol: because ayoung was saying that we could somehow use a consistent database. It can't happen.
18:27:31 <henrynash> dolphm: you thinking that we would check on, say, a --migrate, that the spand repo was not at a lower nummber?
18:27:37 <topol> ayoung the answer is eventual consistenc.. always .. except for banks :-)
18:27:47 <henrynash> (...that the expand repo....)
18:28:06 <topol> bknudson. agreed. dont go up against the CAP theorem
18:28:31 <ayoung> topol, what we've done is doubled our exposuie to its limitations
18:28:43 <ayoung> first at the Galera level, and then again at the memcache level
18:29:02 <topol> ayoung, uggh. really? How so
18:29:26 <ayoung> topol, SO, Gallera odes "the best it can" to be ACID compliant in a distributed sense
18:29:35 <henrynash> dolphm: if so, then I'm sold...since the one thing I liked about the "status based" code I wrote originally was that we stopped deployers doing things out of order
18:29:38 <ayoung> and if it fails, well, we kindof get notified
18:29:55 <ayoung> OTOH, memcache adds an additional layer of "distributed inconsistancy"
18:30:05 <dolphm> henrynash: yes, definitely to your question
18:30:46 <dolphm> henrynash: you can also do some other cool stuff like call migrate_repo(expand_version_number)
18:30:59 <dolphm> henrynash: contract_repo(migrate_version_number)
18:31:05 <henrynash> dolphm, rderose: OK, it's a fair cop (guvs) I'll come quietly (and make the changes today!)
18:31:09 <ayoung> We think of distributed memcache as a single system, but it is not, it is multiple nodes, and they are no-where-close to thinking about consistency across them
18:31:18 <rderose> henrynash: ++
18:31:19 <stevemar> henrynash: ++
18:31:21 <rderose> :)
18:31:26 <stevemar> rderose: hey thats my line!
18:31:35 <ayoung> they are just supposed to be a cache. As such, it is "yes I have it" or "no let me get it"
18:31:40 <rderose> stevemar: I was first
18:31:48 <dolphm> henrynash: i feel like you're making some british pop culture reference that just went over my head lol
18:31:54 <stevemar> *grumble*
18:32:12 <ayoung> but if changes are happening behind the cache, such as what we do when we hammer on our tests, memcache does not deal
18:32:29 <topol> dolphm, I think he has been arrested before
18:32:30 <ayoung> henrynash, I thought the line was "its a fair court?"
18:32:39 <dolphm> ayoung: if you're clustering memcache nodes, then there's nothing to keep consistent because you're sharding data across them
18:32:46 <dstanek> ayoung: there is no consistency across memcached nodes by design
18:32:53 <ayoung> dolphm, right. dstanek right
18:32:59 <stevemar> http://i1.kym-cdn.com/photos/images/newsfeed/000/992/401/e37.png
18:33:02 <dstanek> there isn't and shouldn't be - a give can can only be on a single node ever
18:33:08 <henrynash> ayoung: maybe, blended by folk law...
18:33:20 <henrynash> stevemar: ++
18:33:38 <ayoung> dstanek, and *we* kinda get that. But people outside of Keystone think of it as a single, database backed, transaction system. Hence revocations
18:33:54 <henrynash> http://dictionary.cambridge.org/dictionary/english/it-s-a-fair-cop
18:34:22 <stevemar> ayoung: i think the expectation is becoming that keystone scale globally and handle these more difficult scenarios
18:34:27 <ayoung> henrynash, all these years I've been misinterpreting the dead Bishop sketch
18:34:30 <rderose> thanks henrynash, that cleared it up
18:34:49 <dolphm> stevemar: http://s2.quickmeme.com/img/10/10e71fd6edca008ff7ab182e9c428ea756412398815dbeec35f25092e913ed2c.jpg
18:35:05 <knikolla> lol
18:35:06 <stevemar> lol
18:35:12 <ayoung> http://www.montypython.net/scripts/bishop.php
18:35:13 <topol> i dont get it
18:35:33 <stevemar> i think we're all sleep deprived and delirious
18:35:57 <stevemar> we need more open discussion? we're just arguing and posting memes at this point?
18:36:04 <lbragstad> leave it to the LA Kings
18:36:37 <shaleh> ayoung: so we are back to dropping revocations?
18:36:49 <henrynash> I like to think of us all as sleep deprived and delicious....
18:36:52 <rderose> lbragstad: that's better than the Fargo??? wait, what teams do you have?
18:37:09 <stevemar> :)
18:37:10 <lbragstad> rderose ... well played
18:37:30 <rderose> :)
18:37:34 <topol> henrynash I have no idea how offensive this really is
18:37:34 <lbragstad> rderose we settle for the Minnesota Wild ;)
18:37:44 <topol> https://www.youtube.com/watch?v=n5qn_kShlDI
18:37:50 <rderose> lbragstad: there you go
18:38:12 <rderose> topol: ++
18:38:21 <ayoung> "Well, I meet a lot of people and I'm convinced that the vast majority of wrongthinking people are right. "
18:38:46 <ayoung> shaleh, so, yeah, I think we can drop revocations if we go with the plan jamielennox and I were dreaming up at the midcycle
18:38:51 <stevemar> and topol is gonna have a talk with HR
18:39:02 <ayoung> well, most revocations. Explicit will still be needed
18:39:03 * topol took a risk
18:39:13 <stevemar> ayoung: i liked that plan FWIW
18:39:17 <shaleh> ayoung: if a) works and b) confuses people less that sounds good
18:39:19 <ayoung> But if we make tokens short lived...yeah, most go away
18:39:19 <ayoung> Maybe all
18:39:27 <henrynash> topol: :-)
18:39:32 <ayoung> stevemar, me, too.. Just need the time to work on it
18:39:33 <stevemar> ayoung: like 95+%
18:39:40 <stevemar> ayoung: just ditch tripleo :P
18:39:50 <stevemar> come back to us, you know you want to
18:40:01 <henrynash> ayoung: what;s the path to short lived tokens?
18:40:16 <ayoung> stevemar, I do want to, but it turns out that people can't actually use the crap we build if the installer disables it
18:40:27 <stevemar> henrynash: 2 options, one complex, one simple
18:40:31 <henrynash> ( and don't say fly to Boston and turn left)
18:40:40 <lbragstad> `[token] expiration` = 600
18:40:50 <dolphm> #shipit
18:40:54 <lbragstad> :)
18:41:07 <dstanek> lbragstad: ++
18:41:12 <stevemar> henrynash: we re-work it completely and use "reservations" by jamie
18:41:12 <ayoung> henrynash, bascially, this: we make tokens live for a really short time as authentication proxies, but honor the data that they represent for the life time of the workflows
18:41:21 <stevemar> henrynash: or we yeah, taht ^
18:41:44 <ayoung> stevemar, lets not call them reservations. That was a different mechanism, where the user goes to Cinder first, creates a reservation, and passes that to Nova
18:41:53 <henrynash> ayoung: right we discussed that at he midcycle...what's stopping us
18:42:03 <stevemar> so you have 5 minutes to use a token, but if you used it, it's valid for 24 hours
18:42:09 <stevemar> ayoung: i think i have that right?
18:42:17 <ayoung> henrynash, time
18:42:18 <ayoung> I need to get some team priorities done, and it is eating up my time
18:42:26 <ayoung> stevemar, you are right
18:42:37 <lbragstad> henrynash I think we need more ten-digit gnomes to do the neeful
18:42:38 <stevemar> ayoung: i like the solution, it's neat
18:42:40 <ayoung> and that is OK because we go back to Keystone and expand out the data each time
18:43:06 <stevemar> lbragstad: someone has their big book of british slang open
18:43:30 <stevemar> dstanek: i like what breton said here https://review.openstack.org/#/c/349704/5 :)
18:43:32 <shaleh> how do we settle that with the ops request for 4 hour tokens to limit security exposure
18:44:13 <stevemar> shaleh: the 24 hours part if a config option, could set it to 4 hrs or 1
18:44:14 <dstanek> stevemar: breton: woot! there is a transient failure though :-(
18:44:39 <dstanek> soon as we are done here i shall go fix
18:45:07 <stevemar> anyway, let's actually end this
18:45:12 <stevemar> thanks for coming everyone
18:45:20 <stevemar> thanks dolphm for covering me last week and next week
18:45:25 <rderose> well, bye
18:45:31 <dolphm> wat
18:45:31 <lbragstad> o/
18:45:36 <stevemar> #endmeeting
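
The short-lived token idea from the open discussion (a brief window in which a token must first be used, then a longer, configurable period during which the data it represents is honored) might look roughly like the sketch below. This is only an illustration under assumptions: `Token`, `use_token`, `USE_WINDOW` and `WORKFLOW_LIFETIME` are hypothetical names, the 5 minute and 24 hour values are simply the figures quoted in the meeting, and none of this is keystone code.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values taken from the discussion; both would be deployer-configurable.
USE_WINDOW = 5 * 60             # seconds a fresh token may wait before first use
WORKFLOW_LIFETIME = 24 * 3600   # seconds the token's data is honored after first use


@dataclass
class Token:
    """Hypothetical token record; times are seconds on any monotonic scale."""
    issued_at: float
    first_used_at: Optional[float] = None


def use_token(token: Token, now: float) -> bool:
    """Return True if the token is accepted at time `now`.

    An unused token must be presented within USE_WINDOW of issue; the first
    successful use is recorded. After that, the token is accepted until
    WORKFLOW_LIFETIME has passed since that first use.
    """
    if token.first_used_at is None:
        if now - token.issued_at > USE_WINDOW:
            return False
        token.first_used_at = now
        return True
    return now - token.first_used_at <= WORKFLOW_LIFETIME
```

Setting `WORKFLOW_LIFETIME` to 4 hours instead of 24 would correspond to the operator request raised by shaleh.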
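
The ordering check discussed under the rolling upgrades topic (refusing a migrate step while the expand repo is at a lower number, and likewise for contract) could be sketched as follows. It assumes the three repos are meant to share one version number, as dolphm describes; `PHASES` and `check_phase_allowed` are hypothetical names for illustration, not functions from keystone or sqlalchemy-migrate.

```python
# Phases in the order a deployer is expected to run them.
PHASES = ("expand", "migrate", "contract")


def check_phase_allowed(phase: str, target: int, versions: dict) -> None:
    """Raise RuntimeError if an earlier phase has not yet reached `target`.

    `versions` maps each phase name to the current version of its repo.
    """
    for earlier in PHASES[:PHASES.index(phase)]:
        if versions[earlier] < target:
            raise RuntimeError(
                f"cannot run {phase} to version {target}: "
                f"{earlier} repo is only at {versions[earlier]}"
            )
```

Running expand has no prerequisite here, migrate requires expand to have reached the target, and contract requires both earlier phases.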