18:01:16 <stevemar> #startmeeting keystone
18:01:17 <openstack> Meeting started Tue Aug 16 18:01:16 2016 UTC and is due to finish in 60 minutes.  The chair is stevemar. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:01:18 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:01:21 <openstack> The meeting name has been set to 'keystone'
18:01:22 <henrynash> it's the beret & onions that do it
18:01:27 <stevemar> alright keystoners, let's get ready!
18:01:47 <shaleh> \o
18:01:47 <stevemar> (un)fortunately, no one added anything to the meeting agenda :O
18:01:52 <stevemar> #link https://etherpad.openstack.org/p/keystone-weekly-meeting
18:01:53 <shaleh> yay short meeting
18:02:02 <henrynash> stevemar: don't tempt us
18:02:31 <stevemar> short topics, just status, bugs, and then open discussion
18:02:37 <dstanek> o/
18:02:39 <dstanek> i showed up for no reason?
18:02:55 <stevemar> dstanek: tiny reasons
18:03:07 <stevemar> #topic
18:03:09 <topol> o/
18:03:10 <stevemar> rfail
18:03:12 <bknudson> hi
18:03:15 <dstanek> i have a goal to get a few patches up to address some of the caching issues
18:03:20 <stevemar> #topic release status
18:03:44 <stevemar> the following blueprints are not yet complete: https://launchpad.net/keystone/+milestone/newton-3
18:03:56 <stevemar> but are in pretty darn good shape
18:04:00 <stevemar> PCI is a few hours away from being done
18:04:12 <stevemar> ldap preprocess is pretty close
18:04:22 <stevemar> rolling upgrades is chugging along
18:04:36 <stevemar> and lbragstad picked up credential encryption (thanks!)
18:04:39 <ayoung> Wow, my name does not even appear on that page.  Tripleo has taken over my life
18:04:54 <stevemar> ayoung: you're forever with us in spirit
18:04:54 <henrynash> rolling upgrades is in good shape (one final issue to settle on, but time to make the changes whichever way we go is only an hour)
18:05:07 <dstanek> henrynash: nice!
18:05:10 <ayoung> I'll be back...
18:05:17 <dolphm> stevemar: quick, create a high priority blueprint and assign it to ayoung
18:05:20 <rderose> henrynash: ++
18:05:21 <stevemar> lol
18:05:26 <bknudson> ayoung: I've been really impressed with your work in triple-o. This kind of cross-project work helps us do better.
18:05:36 <ayoung> bknudson, thanks
18:05:43 <henrynash> bknudson: ++
18:05:45 <stevemar> dolphm: can it be to say things from terminator movies?
18:05:52 <ayoung> I was able to get the Policy deployment to work the way I want it to in Tripleo
18:05:52 <topol> dolphm we could give him some music to transpose
18:06:00 <dolphm> stevemar: that seems actionable enough for a bp
18:06:04 <ayoung> http://adam.younglogic.com/2016/08/rbac-policy-update-tripleo/
18:06:05 <stevemar> a compliment of the highest order from bknudson!
18:06:11 <dolphm> stevemar: ++
18:06:13 <dolphm> bknudson: ++
18:06:34 <stevemar> for real dates:
18:06:37 <stevemar> Next week (Aug 22) is the last release of keystoneauth and keystonemiddleware
18:07:02 <stevemar> Next Next week (Aug 29) is the last release of keystoneclient and newton-3 driver / keystone feature freeze!  (it's actually mid-week -- Sept 1)
18:07:44 <stevemar> so the blueprints and bugs should ideally be completed *before* the 29th, or have a +2 and very close
18:07:46 <dolphm> but if it's not gating by august 29, it probably won't make it
18:07:54 <dolphm> because the gate queue will explode in length
18:07:55 <stevemar> dolphm: bingo
18:07:56 <lbragstad> so - monday it is
18:08:04 <stevemar> dolphm: i was just gonna say...
18:08:06 <stevemar> lbragstad: basically
18:08:12 <dolphm> lbragstad: ergo, aim for EoD Friday
18:08:23 <lbragstad> friday it is
18:08:32 <lbragstad> ... so tomorrow? right?!
18:08:41 <stevemar> lbragstad: 1 more week :)
18:08:43 <dolphm> transient error rates will also sky rocket, so that gives us a couple days to recheck things
18:08:51 <stevemar> dolphm: yeeeep
18:09:02 <stevemar> dolphm: have you been through this before? i feel like you have
18:09:17 * topol this release will be different. I keep telling myself... re transient errors :-)
18:09:23 * dolphm is going on my 11th release
18:09:45 <stevemar> blueprints aside, we have a few nasty bugs
18:09:46 <samueldmq> dolphm: nice!
18:09:49 <stevemar> #topic bugs
18:10:07 <ayoung> dolphm, that can't be right...this is *my* 11th, and you were there before me, no?
18:10:14 <ayoung> Diablo
18:10:24 <stevemar> the biggest of which is the caching issues we've been seeing: caching woes: https://bugs.launchpad.net/bugs/1600394
18:10:24 <openstack> Launchpad bug 1600394 in OpenStack Identity (keystone) "memcache raising "too many values to unpack"" [Critical,Confirmed] - Assigned to David Stanek (dstanek)
18:10:33 <dolphm> i started immediately after the cactus release
18:10:41 <dolphm> ayoung: so, diablo was my first
18:10:47 <stevemar> dstanek has been heads down on this stuff
18:10:59 <dstanek> stevemar: i'm going to tackle that one after the current cache bugs i'm working on
18:10:59 <stevemar> dolphm & ayoung the camp fire is over at -dev :)
18:11:00 <ayoung> Ah....I think technically Essex was mine...
18:11:00 <lbragstad> thanks dstanek!
18:11:13 <dstanek> that one is very mystical though
18:11:43 <bknudson> I thought this might be due to a collision from the sha-1 hash.
18:11:46 <stevemar> dstanek: do what you can, let us know when it's ready for review, and most importantly -- if you need elp
18:11:47 <stevemar> help
18:11:56 <bknudson> but now I don't know.
18:12:00 <dstanek> i have found that we had a few tests that worked only because revocation list caching was messed up
18:12:04 <bknudson> maybe it's a threading issue?
18:12:11 <henrynash> I think https://bugs.launchpad.net/keystone/+bug/1604479 will be a Not A Bug
18:12:11 <openstack> Launchpad bug 1604479 in OpenStack Identity (keystone) "tenantId/default_project_id missing on Keystone service user in Mitaka" [Critical,In progress] - Assigned to Kam Nasim (knasim-wrs)
18:12:23 <stevemar> henrynash: why do you say that?
18:12:42 <bknudson> for some reason we're using the memcached pool and there's no need for it since we got rid of eventlet.
18:13:03 <dstanek> after this meeting i have to figure out why this test raises two different exceptions http://paste.ubuntu.com/23062173/
18:13:17 <henrynash> stevemar: from everything I can see... it is simply that the puppet module was assuming v2 functionality for the v3 create user API
18:13:31 <stevemar> henrynash: that would be good news
18:13:45 <stevemar> henrynash: i was going to take a second look at that one today, if we can bounce it back, that would be great
18:14:00 <henrynash> stevemar: they were assuming a user with a default project would be granted a role on that project (which we don't do in v3)
18:14:13 <stevemar> right
18:14:25 <stevemar> henrynash: okay, comment as such in the bug and review, i'll catch up
18:14:37 <stevemar> on that list in the agenda we also have https://bugs.launchpad.net/bugs/1592169
18:14:37 <openstack> Launchpad bug 1592169 in OpenStack Identity (keystone) newton "cached tokens break Liberty to Mitaka upgrade" [High,In progress] - Assigned to Colleen Murphy (krinkle)
18:14:38 <henrynash> stevemar: already done so
18:14:59 <crinkle> o/
18:15:08 <stevemar> but crinkle has a fix up and mfisch volunteered to verify it (since he is the originator too)
18:15:14 <stevemar> thanks again crinkle :)
18:15:39 <samueldmq> stevemar: crinkle: it's also a flag for reviewers
18:15:51 <samueldmq> I particularly had never hit that issue before
18:16:01 <samueldmq> but it's interesting
18:16:22 <stevemar> last on the list is  https://bugs.launchpad.net/keystoneauth/+bug/1613498 -- jamie had a long chat with the bug originator last night and i think it's a no-op, but if you're interested, please weigh in
18:16:22 <openstack> Launchpad bug 1613498 in python-openstackclient "Token Auth does not work (not fetching catalog)" [Undecided,New]
18:16:24 <dstanek> did the token format change at all?
18:16:39 <bknudson> Not sure how we could have testing for these kinds of issues in general. Seems pretty much impossible.
18:17:12 <bknudson> maybe the best thing to do is to say to wipe out your memcache on upgrade
18:17:13 <dolphm> dstanek: something could be passing ?nocatalog in the token request
18:17:17 <stevemar> bknudson: the upgrade ones are definitely tricky
18:17:39 <ayoung> jamielennox not here...hmmm
18:17:48 <bknudson> when we do remember to put the code in to handle the case it's ugly.
18:18:00 <stevemar> bknudson: that is what dolphm recommended too
18:18:09 <bknudson> maybe have a version in the cache lines
18:18:16 <dolphm> bknudson: buuut.... when do you restart memcached when it's shared by multiple nodes that are undergoing a rolling upgrade?
18:18:43 <bknudson> yikes, that's scarier. versioning would be safer.
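The "version in the cache lines" idea can be sketched as below. This is a minimal illustration, not keystone's actual caching code: the names `CACHE_FORMAT_VERSION`, `versioned_key`, and `VersionedCache` are hypothetical, and a plain dict stands in for the shared memcached pool. The assumption is that each release bumps a format version, so old and new nodes in a rolling upgrade never read each other's entries and nothing has to be restarted or flushed.

```python
import hashlib

# Hypothetical: bumped whenever the layout of cached values changes.
CACHE_FORMAT_VERSION = 2


def versioned_key(raw_key, version=CACHE_FORMAT_VERSION):
    """Hash the key and prefix it with the cache format version."""
    digest = hashlib.sha1(raw_key.encode("utf-8")).hexdigest()
    return "v%d:%s" % (version, digest)


class VersionedCache(object):
    """Wraps a dict-like backend; the dict stands in for memcached."""

    def __init__(self, backend, version=CACHE_FORMAT_VERSION):
        self._backend = backend
        self._version = version

    def get(self, key):
        return self._backend.get(versioned_key(key, self._version))

    def set(self, key, value):
        self._backend[versioned_key(key, self._version)] = value


# Two nodes on different releases share one backend during an upgrade.
shared = {}
old_node = VersionedCache(shared, version=1)
new_node = VersionedCache(shared, version=2)
old_node.set("token-abc", {"format": "old"})
```

Here `new_node.get("token-abc")` misses and falls through to the database, while `old_node` still sees its own entry. The cost is a cold cache for each new version, which is the same cost as wiping memcached but without the coordination problem.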
18:18:45 <dstanek> dolphm: turn it off during an upgrade?
18:18:57 <ayoung> that sounds like a spectacularly bad idea.
18:19:08 <dstanek> ayoung: which one?
18:19:18 <dolphm> ayoung: well, it would workaround the issue at least
18:19:18 <ayoung> cache shared across nodes...is not something that makes me feel good
18:19:36 <dstanek> that's the whole point of memcached though
18:19:44 <dolphm> ayoung: it's the best way to take full advantage of your memory footprint
18:19:53 <ayoung> dstanek, it violates the whole reason to use a Transactional database
18:20:18 <dolphm> ayoung: ? memcached is neither transactional nor a database
18:20:25 <knikolla> it's more of a kvs
18:20:26 <ayoung> dolphm, exactly.  SQL is
18:20:42 <ayoung> and memcache violates pretty much any transaction benefit
18:20:42 <stevemar> #topic open discussion
18:20:54 <dolphm> i'm not following
18:21:00 <ayoung> anyway...I'm derailing
18:21:12 <stevemar> ayoung: for once, i don't mind your derailing :)
18:21:36 <ayoung> It seems to me that Keystone first off has to be accurate
18:21:42 <dolphm> ayoung: are you saying that things can be written to memcache before a larger transaction is complete?
18:21:44 <ayoung> and we do a lot of caching in the interest of scalability
18:21:56 <ayoung> and I wonder if we are being pennywise and pound foolish here
18:22:01 <stevemar> it's open discussion now, folks can drop off if they want, or ping me if they have topics they want to chat about
18:22:21 <stevemar> or propose it to the etherpad, i have it open
18:22:25 <ayoung> we also have issues with Galera.  I wonder if all the errors add up to ...well, something nasty
18:22:32 <dolphm> stevemar: something something rolling upgrades
18:22:41 <henrynash> dolphm: ha!
18:22:43 <raildo> stevemar, should we get in touch with the TC to remove support for v2 in the Ocata release? :)
18:22:56 <bknudson> caching is disabled by default
18:22:58 <dstanek> ayoung: i would partially agree since our caching is broken in a few ways
18:23:18 <dstanek> but it can be good and safe. my goal is to make that happen
18:23:18 <stevemar> raildo: IIRC we deprecated v2 in M, and we said we would have it around for 4 releases, so not yet :)
18:23:38 <stevemar> raildo: i think the deprecation message says the Q release
18:23:45 <dstanek> we only cache data that is already in the DB so the acid guarantees are still there
18:23:46 <bknudson> dstanek: some people might have a different definition of safe.
18:23:46 <shaleh> bknudson: yes, but anyone with more than a few users enables caching so that is kind of a useless statement
18:23:47 <raildo> stevemar, hum... got it :)
18:23:52 <raildo> stevemar, yes
18:23:53 <henrynash> dolphm: (as per the comment I've added on the patch, if the weight of opinion of reviews is to go for 3 new repos, I'll have the changes done and up by tonight)
18:24:08 <stevemar> #topic something something rolling upgrades
18:24:10 <dstanek> bknudson: you could always keep it off
18:24:46 <stevemar> dolphm henrynash rderose is there any impact in renaming the old repo?
18:25:36 <dolphm> stevemar: yes and no? renaming it won't break anything in and of itself
18:25:38 <dolphm> stevemar: but there's no reason to
18:25:46 <bknudson> https://en.wikipedia.org/wiki/CAP_theorem
18:25:51 <ayoung> dstanek, dolphm I do suspect this is why people go no-sql. We've essentially converted Keystone into a large, distributed, eventually-consistent store
18:26:01 <ayoung> bknudson, I took a course with Brewer
18:26:08 <henrynash> I can't think of a technical issue, I think it is more conceptual cleanness vs keep doing what we've been doing (for Newton) for the "expand" cycle
18:26:09 <dolphm> stevemar: and more importantly, repurposing the old repo to be the new expand repo prevents the 3 repos we're going to go forward with from having the same migration number at all times
18:26:28 <dolphm> stevemar: which is something that would be easy to enforce, and would help us prevent deployers from accidentally running migrations out of order
18:26:39 * topol why is bknudson referencing the CAP theorem?
18:26:49 * topol things getting interesting..
18:27:12 <ayoung> bknudson, that class is one of the reasons I realize just how hard Database stuff can be to get right
18:27:24 <bknudson> topol: because ayoung was saying that we could somehow use a consistent database. It can't happen.
18:27:31 <henrynash> dolphm: you thinking that we would check on, say, a --migrate, that the spand repo was not at a lower number?
18:27:37 <topol> ayoung the answer is eventual consistency.. always .. except for banks :-)
18:27:47 <henrynash> (...that the expand repo....)
18:28:06 <topol> bknudson. agreed. don't go up against the CAP theorem
18:28:31 <ayoung> topol, what we've done is doubled our exposure to its limitations
18:28:43 <ayoung> first at the Galera level, and then again at the memcache level
18:29:02 <topol> ayoung, uggh. really? How so
18:29:26 <ayoung> topol, so, Galera does "the best it can" to be ACID compliant in a distributed sense
18:29:35 <henrynash> dolphm: if so, then I'm sold...since the one thing I liked about the "status based" code I wrote originally was that we stopped deployers doing things out of order
18:29:38 <ayoung> and if it fails, well, we kind of get notified
18:29:55 <ayoung> OTOH, memcache adds an additional layer of "distributed inconsistency"
18:30:05 <dolphm> henrynash: yes, definitely to your question
18:30:46 <dolphm> henrynash: you can also do some other cool stuff like call migrate_repo(expand_version_number)
18:30:59 <dolphm> henrynash: contract_repo(migrate_version_number)
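The ordering rule discussed here (the migrate repo may never run ahead of the expand repo, nor the contract repo ahead of the migrate repo) can be sketched as below. This is an illustration under assumptions, not the patch under review: `RepoVersions`, `OutOfOrderMigration`, and the `run_*` methods are hypothetical names, and plain integers stand in for the version each migration repo has reached.

```python
class OutOfOrderMigration(Exception):
    """Raised when a deployer runs an upgrade phase too early."""


class RepoVersions(object):
    """Tracks the version reached by three hypothetical migration repos."""

    def __init__(self):
        self.expand = 0
        self.migrate = 0
        self.contract = 0

    def run_expand(self, target):
        # Expand (additive schema changes) is always allowed to run first.
        self.expand = max(self.expand, target)

    def run_migrate(self, target):
        # Data migration may catch up to expand, never pass it.
        if target > self.expand:
            raise OutOfOrderMigration(
                "expand is at %d, cannot migrate to %d" % (self.expand, target))
        self.migrate = max(self.migrate, target)

    def run_contract(self, target):
        # Contract (destructive changes) may catch up to migrate, never pass it.
        if target > self.migrate:
            raise OutOfOrderMigration(
                "migrate is at %d, cannot contract to %d" % (self.migrate, target))
        self.contract = max(self.contract, target)
```

With this shape, the calls mentioned above become `v.run_migrate(v.expand)` followed by `v.run_contract(v.migrate)`, and keeping the three repos on the same numbering is what makes the comparison meaningful.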
18:31:05 <henrynash> dolphm, rderose: OK, it's a fair cop (guvs) I'll come quietly (and make the changes today!)
18:31:09 <ayoung> We think of distributed memcache as a single system, but it is not, it is multiple nodes, and they are no-where-close to thinking about consistency across them
18:31:18 <rderose> henrynash: ++
18:31:19 <stevemar> henrynash: ++
18:31:21 <rderose> :)
18:31:26 <stevemar> rderose: hey that's my line!
18:31:35 <ayoung> they are just supposed to be a cache.  As such, it is "yes I have it" or "no let me get it"
18:31:40 <rderose> stevemar: I was first
18:31:48 <dolphm> henrynash: i feel like you're making some british pop culture reference that just went over my head lol
18:31:54 <stevemar> *grumble*
18:32:12 <ayoung> but if changes are happening behind the cache, such as what we do when we hammer on our tests, memcache does not deal
18:32:29 <topol> dolphm, I think he has been arrested before
18:32:30 <ayoung> henrynash, I thought the line was "its a fair court?"
18:32:39 <dolphm> ayoung: if you're clustering memcache nodes, then there's nothing to keep consistent because you're sharding data across them
18:32:46 <dstanek> ayoung: there is no consistency across memcached nodes by design
18:32:53 <ayoung> dolphm, right.  dstanek right
18:32:59 <stevemar> http://i1.kym-cdn.com/photos/images/newsfeed/000/992/401/e37.png
18:33:02 <dstanek> there isn't and shouldn't be - a given key can only be on a single node ever
18:33:08 <henrynash> ayoung: maybe, blended by folk law...
18:33:20 <henrynash> stevemar: ++
18:33:38 <ayoung> dstanek, and *we* kinda get that.  But people outside of Keystone think of it as a single, database backed, transaction system.  Hence revocations
18:33:54 <henrynash> http://dictionary.cambridge.org/dictionary/english/it-s-a-fair-cop
18:34:22 <stevemar> ayoung: i think the expectation is becoming that keystone scale globally and handle these more difficult scenarios
18:34:27 <ayoung> henrynash, all these years I've been misinterpreting the dead Bishop sketch
18:34:30 <rderose> thanks henrynash, that cleared it up
18:34:49 <dolphm> stevemar: http://s2.quickmeme.com/img/10/10e71fd6edca008ff7ab182e9c428ea756412398815dbeec35f25092e913ed2c.jpg
18:35:05 <knikolla> lol
18:35:06 <stevemar> lol
18:35:12 <ayoung> http://www.montypython.net/scripts/bishop.php
18:35:13 <topol> i dont get it
18:35:33 <stevemar> i think we're all sleep deprived and delirious
18:35:57 <stevemar> we need more open discussion? we're just arguing and posting memes at this point?
18:36:04 <lbragstad> leave it to the LA Kings
18:36:37 <shaleh> ayoung: so we are back to dropping revocations?
18:36:49 <henrynash> I like to think of us all as sleep deprived and delicious....
18:36:52 <rderose> lbragstad: that's better than the Fargo???  wait, what teams do you have?
18:37:09 <stevemar> :)
18:37:10 <lbragstad> rderose ... well played
18:37:30 <rderose> :)
18:37:34 <topol> henrynash I have no idea how offensive this really is
18:37:34 <lbragstad> rderose we settle for the Minnesota Wild ;)
18:37:44 <topol> https://www.youtube.com/watch?v=n5qn_kShlDI
18:37:50 <rderose> lbragstad: there you go
18:38:12 <rderose> topol: ++
18:38:21 <ayoung> "Well, I meet a lot of people and I'm convinced that the vast majority of wrongthinking people are right. "
18:38:46 <ayoung> shaleh, so, yeah, I think we can drop revocations if we go with the plan jamielennox and I were dreaming up at the midcycle
18:38:51 <stevemar> and topol is gonna have a talk with HR
18:39:02 <ayoung> well, most revocations. Explicit will still be needed
18:39:03 * topol took a risk
18:39:13 <stevemar> ayoung: i liked that plan FWIW
18:39:17 <shaleh> ayoung: if a) works and b) confuses people less that sounds good
18:39:19 <ayoung> But if we make tokens short lived...yeah, most go away
18:39:19 <ayoung> Maybe all
18:39:27 <henrynash> topol: :-)
18:39:32 <ayoung> stevemar, me, too.. Just need the time to work on it
18:39:33 <stevemar> ayoung: like 95+%
18:39:40 <stevemar> ayoung: just ditch tripleo :P
18:39:50 <stevemar> come back to us, you know you want to
18:40:01 <henrynash> ayoung: what's the path to short lived tokens?
18:40:16 <ayoung> stevemar, I do want to, but it turns out that people can't actually use the crap we build if the installer disables it
18:40:27 <stevemar> henrynash: 2 options, one complex, one simple
18:40:31 <henrynash> ( and don't say fly to Boston and turn left)
18:40:40 <lbragstad> `[token] expiration` = 600
18:40:50 <dolphm> #shipit
18:40:54 <lbragstad> :)
18:41:07 <dstanek> lbragstad: ++
18:41:12 <stevemar> henrynash: we re-work it completely and use "reservations" by jamie
18:41:12 <ayoung> henrynash, basically, this:  we make tokens live for a really short time as authentication proxies, but honor the data that they represent for the life time of the workflows
18:41:21 <stevemar> henrynash: or we yeah, that ^
18:41:44 <ayoung> stevemar, lets not call them reservations.  That was a different mechanism, where the user goes to Cinder first, creates a reservation, and passes that to Nova
18:41:53 <henrynash> ayoung: right we discussed that at the midcycle...what's stopping us?
18:42:03 <stevemar> so you have 5 minutes to use a token, but if you used it, it's valid for 24 hours
18:42:09 <stevemar> ayoung: i think i have that right?
18:42:17 <ayoung> henrynash, time
18:42:18 <ayoung> I need to get some team priorities done, and it is eating up my time
18:42:26 <ayoung> stevemar, you are right
18:42:37 <lbragstad> henrynash I think we need more ten-digit gnomes to do the needful
18:42:38 <stevemar> ayoung: i like the solution, it's neat
18:42:40 <ayoung> and that is OK because we go back to Keystone and expand out the data each time
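The two-window idea (a few minutes to start using a token, a longer period during which an already started workflow is still honored) can be sketched as below. This is only an illustration of the rule as stated in the discussion, not keystone or keystonemiddleware code: `token_accepted`, `AUTH_WINDOW`, and `WORKFLOW_WINDOW` are hypothetical names, and measuring both windows from the issue time is an assumption.

```python
from datetime import datetime, timedelta

# Hypothetical settings: the short window for a user presenting the token,
# and the longer, configurable window for work already in flight.
AUTH_WINDOW = timedelta(minutes=5)
WORKFLOW_WINDOW = timedelta(hours=24)


def token_accepted(issued_at, now, in_flight_workflow):
    """Decide whether a token is still honored.

    A fresh request from a user must arrive within AUTH_WINDOW of issue.
    A service continuing an existing workflow gets WORKFLOW_WINDOW instead.
    Both windows are measured from issue time (an assumption of this sketch).
    """
    age = now - issued_at
    if age < timedelta(0):
        return False  # issued in the future; reject
    limit = WORKFLOW_WINDOW if in_flight_workflow else AUTH_WINDOW
    return age <= limit


issued = datetime(2016, 8, 16, 18, 0, 0)
```

An operator who wants tighter exposure would shrink `WORKFLOW_WINDOW` (for example to 4 hours) without touching the short authentication window.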
18:43:06 <stevemar> lbragstad: someone has their big book of british slang open
18:43:30 <stevemar> dstanek: i like what breton said here https://review.openstack.org/#/c/349704/5 :)
18:43:32 <shaleh> how do we settle that with the ops request for 4 hour tokens to limit security exposure?
18:44:13 <stevemar> shaleh: the 24 hours part is a config option, could set it to 4 hrs or 1
18:44:14 <dstanek> stevemar: breton: woot! there is a transient failure though :-(
18:44:39 <dstanek> soon as we are done here i shall go fix
18:45:07 <stevemar> anyway, let's actually end this
18:45:12 <stevemar> thanks for coming everyone
18:45:20 <stevemar> thanks dolphm for covering me last week and next week
18:45:25 <rderose> well, bye
18:45:31 <dolphm> wat
18:45:31 <lbragstad> o/
18:45:36 <stevemar> #endmeeting