18:01:16 #startmeeting keystone
18:01:17 Meeting started Tue Aug 16 18:01:16 2016 UTC and is due to finish in 60 minutes. The chair is stevemar. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:01:18 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:01:21 The meeting name has been set to 'keystone'
18:01:22 it's the beret & onions that do it
18:01:27 alright keystoners, let's get ready!
18:01:47 \o
18:01:47 (un)fortunately, no one added anything to the meeting agenda :O
18:01:52 #link https://etherpad.openstack.org/p/keystone-weekly-meeting
18:01:53 yay short meeting
18:02:02 stevemar: don't tempt us
18:02:31 short topics, just status, bugs, and then open discussion
18:02:37 o/
18:02:39 i showed up for no reason?
18:02:55 dstanek: tiny reasons
18:03:07 #topic
18:03:09 o/
18:03:10 rfail
18:03:12 hi
18:03:15 i have a goal to get a few patches up to address some of the caching issues
18:03:20 #topic release status
18:03:44 the following blueprints are not yet complete: https://launchpad.net/keystone/+milestone/newton-3
18:03:56 but are in pretty darn good shape
18:04:00 PCI is a few hours away from being done
18:04:12 ldap preprocess is pretty close
18:04:22 rolling upgrades is chugging along
18:04:36 and lbragstad picked up credential encryption (thanks!)
18:04:39 Wow, my name does not even appear on that page. Tripleo has taken over my life
18:04:54 ayoung: you're forever with us in spirit
18:04:54 rolling upgrades is in good shape (one final issue to settle on, but time to make the changes whichever way we go is only an hour)
18:05:07 henrynash: nice!
18:05:10 I'll be back...
18:05:17 stevemar: quick, create a high priority blueprint and assign it to ayoung
18:05:20 henrynash: ++
18:05:21 lol
18:05:26 ayoung: I've been really impressed with your work in triple-o. This kind of cross-project work helps us do better.
18:05:36 bknudson, thanks
18:05:43 bknudson: ++
18:05:45 dolphm: can it be to say things from terminator movies?
18:05:52 I was able to get the Policy deployment to work the way I want it to in Tripleo
18:05:52 dolphm we could give him some music to transpose
18:06:00 stevemar: that seems actionable enough for a bp
18:06:04 http://adam.younglogic.com/2016/08/rbac-policy-update-tripleo/
18:06:05 a compliment of the highest order from bknudson!
18:06:11 stevemar: ++
18:06:13 bknudson: ++
18:06:34 for real dates:
18:06:37 Next week (Aug 22) is the last release of keystoneauth and keystonemiddleware
18:07:02 Next Next week (Aug 29) is the last release of keystoneclient and newton-3 driver / keystone feature freeze! (it's actually mid-week -- Sept 1)
18:07:44 so the blueprints and bugs should ideally be completed *before* the 29th, or have a +2 and very close
18:07:46 but if it's not gating by august 29, it probably won't make it
18:07:54 because the gate queue will explode in length
18:07:55 dolphm: bingo
18:07:56 so - monday it is
18:08:04 dolphm: i was just gonna say...
18:08:06 lbragstad: basically
18:08:12 lbragstad: ergo, aim for EoD Friday
18:08:23 friday it is
18:08:32 ... so tomorrow? right?!
18:08:41 lbragstad: 1 more week :)
18:08:43 transient error rates will also sky rocket, so that gives us a couple days to recheck things
18:08:51 dolphm: yeeeep
18:09:02 dolphm: have you been through this before? i feel like you have
18:09:17 * topol this release will be different. I keep telling myself... re transient errors :-)
18:09:23 * dolphm is going on my 11th release
18:09:45 blueprints aside, we have a few nasty bugs
18:09:46 dolphm: nice!
18:09:49 #topic bugs
18:10:07 dolphm, that can't be right...this is *my* 11th, and you were there before me, no?
18:10:14 Diablo
18:10:24 the biggest of which is the caching issues we've been seeing: caching woes: https://bugs.launchpad.net/bugs/1600394
18:10:24 Launchpad bug 1600394 in OpenStack Identity (keystone) "memcache raising "too many values to unpack"" [Critical,Confirmed] - Assigned to David Stanek (dstanek)
18:10:33 i started immediately after the cactus release
18:10:41 ayoung: so, diablo was my first
18:10:47 dstanek has been heads down on this stuff
18:10:59 stevemar: i'm going to tackle that one after the current cache bugs i'm working on
18:10:59 dolphm & ayoung the camp fire is over at -dev :)
18:11:00 Ah....I think technically Essex was mine...
18:11:00 thanks dstanek!
18:11:13 that one is very mystical though
18:11:43 I thought this might be due to a collision from the sha-1 hash.
18:11:46 dstanek: do what you can, let us know when it's ready for review, and most importantly -- if you need elp
18:11:47 help
18:11:56 but now I don't know.
18:12:00 i have found that we had a few tests that worked only because revocation list caching was messed up
18:12:04 maybe it's a threading issue?
18:12:11 I think https://bugs.launchpad.net/keystone/+bug/1604479 will be a Not A Bug
18:12:11 Launchpad bug 1604479 in OpenStack Identity (keystone) "tenantId/default_project_id missing on Keystone service user in Mitaka" [Critical,In progress] - Assigned to Kam Nasim (knasim-wrs)
18:12:23 henrynash: why do you say that?
18:12:42 for some reason we're using the memcached pool and there's no need for it since we got rid of eventlet.
18:13:03 after this meeting i have to figure out why this test raises two different exceptions http://paste.ubuntu.com/23062173/
18:13:17 stevemar: from everything I can see...
it is simply that the puppet module was assuming v2 functionality for the v3 create user API
18:13:31 henrynash: that would be good news
18:13:45 henrynash: i was going to take a second look at that one today, if we can bounce it back, that would be great
18:14:00 stevemar: they were assuming a user with a default project would be granted a role on that project (which we don't do in v3)
18:14:13 right
18:14:25 henrynash: okay, comment as such in the bug and review, i'll catch up
18:14:37 on that list in the agenda we also have https://bugs.launchpad.net/bugs/1592169
18:14:37 Launchpad bug 1592169 in OpenStack Identity (keystone) newton "cached tokens break Liberty to Mitaka upgrade" [High,In progress] - Assigned to Colleen Murphy (krinkle)
18:14:38 stevemar: already done so
18:14:59 o/
18:15:08 but crinkle has a fix up and mfisch volunteered to verify it (since he is the originator too)
18:15:14 thanks again crinkle :)
18:15:39 stevemar: crinkle: it's also a flag for reviewers
18:15:51 I particularly had never hit that issue before
18:16:01 but it's interesting
18:16:22 last on the list is https://bugs.launchpad.net/keystoneauth/+bug/1613498 -- jamie had a long chat with the bug originator last night and i think it's a no-op, but if you're interested, please weigh in
18:16:22 Launchpad bug 1613498 in python-openstackclient "Token Auth does not work (not fetching catalog)" [Undecided,New]
18:16:24 did the token format change at all?
18:16:39 Not sure how we could have testing for these kinds of issues in general. Seems pretty much impossible.
18:17:12 maybe the best thing to do is to say to wipe out your memcache on upgrade
18:17:13 dstanek: something could be passing ?nocatalog in the token request
18:17:17 bknudson: the upgrade ones are definitely tricky
18:17:39 jamielennox not here...hmmm
18:17:48 when we do remember to put the code in to handle the case it's ugly.
18:18:00 bknudson: that is what dolphm recommended too
18:18:09 maybe have a version in the cache lines
18:18:16 bknudson: buuut.... when do you restart memcached when it's shared by multiple nodes that are undergoing a rolling upgrade?
18:18:43 yikes, that's scarier. versioning would be safer.
18:18:45 dolphm: turn it off during an upgrade?
18:18:57 that sounds like a spectacularly bad idea.
18:19:08 ayoung: which one?
18:19:18 ayoung: well, it would work around the issue at least
18:19:18 cache shared across nodes...is not something that makes me feel good
18:19:36 that's the whole point of memcached though
18:19:44 ayoung: it's the best way to take full advantage of your memory footprint
18:19:53 dstanek, it violates the whole reason to use a Transactional database
18:20:18 ayoung: ? memcached is neither transactional nor a database
18:20:25 it's more of a kvs
18:20:26 dolphm, exactly. SQL is
18:20:42 and memcache violates pretty much any transaction benefit
18:20:42 #topic open discussion
18:20:54 i'm not following
18:21:00 anyway...I'm derailing
18:21:12 ayoung: for once, i don't mind your derailing :)
18:21:36 It seems to me that Keystone first off has to be accurate
18:21:42 ayoung: are you saying that things can be written to memcache before a larger transaction is complete?
18:21:44 and we do a lot of caching in the interest of scalability
18:21:56 and I wonder if we are being penny wise and pound foolish here
18:22:01 it's open discussion now, folks can drop off if they want, or ping me if they have topics they want to chat about
18:22:21 or propose it to the etherpad, i have it open
18:22:25 we also have issues with Galera. I wonder if all the errors add up to ...well, something nasty
18:22:32 stevemar: something something rolling upgrades
18:22:41 dolphm: ha!
18:22:43 stevemar, should we get in touch with the TC to remove the support for v2 in the Ocata release?
:)
18:22:56 caching is disabled by default
18:22:58 ayoung: i would partially agree since our caching is broken in a few ways
18:23:18 but it can be good and safe. my goal is to make that happen
18:23:18 raildo: IIRC we deprecated v2 in M, and we said we would have it around for 4 releases, so not yet :)
18:23:38 raildo: i think the deprecation message says the Q release
18:23:45 we only cache data that is already in the DB so the ACID guarantees are still there
18:23:46 dstanek: some people might have a different definition of safe.
18:23:46 bknudson: yes, but anyone with more than a few users enables caching so that is kind of a useless statement
18:23:47 stevemar, hum... got it :)
18:23:52 stevemar, yes
18:23:53 dolphm: (as per the comment I've added on the patch, if the weight of opinion of reviews is to go for 3 new repos, I'll have the changes done and up by tonight)
18:24:08 #topic something something rolling upgrades
18:24:10 bknudson: you could always keep it off
18:24:46 dolphm henrynash rderose is there any impact in renaming the old repo?
18:25:36 stevemar: yes and no? renaming it won't break anything in and of itself
18:25:38 stevemar: but there's no reason to
18:25:46 https://en.wikipedia.org/wiki/CAP_theorem
18:25:51 dstanek, dolphm I do suspect this is why people go no-sql.
We've essentially converted Keystone into a large, distributed, eventually-consistent store
18:26:01 bknudson, I took a course with Brewer
18:26:08 I can't think of a technical issue, I think it is more conceptual cleanness vs keep doing what we've been doing (for Newton) for the "expand" cycle
18:26:09 stevemar: and more importantly, repurposing the old repo to be the new expand repo prevents the 3 repos we're going to go forward with from having the same migration number at all times
18:26:28 stevemar: which is something that would be easy to enforce, and would help us prevent deployers from accidentally running migrations out of order
18:26:39 * topol why is bknudson referencing the CAP theorem?
18:26:49 * topol things getting interesting..
18:27:12 bknudson, that class is one of the reasons I realize just how hard Database stuff can be to get right
18:27:24 topol: because ayoung was saying that we could somehow use a consistent database. It can't happen.
18:27:31 dolphm: you thinking that we would check on, say, a --migrate, that the spand repo was not at a lower number?
18:27:37 ayoung the answer is eventual consistency.. always .. except for banks :-)
18:27:47 (...that the expand repo....)
18:28:06 bknudson. agreed. dont go up against the CAP theorem
18:28:31 topol, what we've done is doubled our exposure to its limitations
18:28:43 first at the Galera level, and then again at the memcache level
18:29:02 ayoung, uggh. really?
How so
18:29:26 topol, SO, Galera does "the best it can" to be ACID compliant in a distributed sense
18:29:35 dolphm: if so, then I'm sold...since the one thing I liked about the "status based" code I wrote originally was that we stopped deployers doing things out of order
18:29:38 and if it fails, well, we kindof get notified
18:29:55 OTOH, memcache adds an additional layer of "distributed inconsistency"
18:30:05 henrynash: yes, definitely to your question
18:30:46 henrynash: you can also do some other cool stuff like call migrate_repo(expand_version_number)
18:30:59 henrynash: contract_repo(migrate_version_number)
18:31:05 dolphm, rderose: OK, it's a fair cop (guvs) I'll come quietly (and make the changes today!)
18:31:09 We think of distributed memcache as a single system, but it is not, it is multiple nodes, and they are no-where-close to thinking about consistency across them
18:31:18 henrynash: ++
18:31:19 henrynash: ++
18:31:21 :)
18:31:26 rderose: hey thats my line!
18:31:35 they are just supposed to be a cache. As such, it is "yes I have it" or "no let me get it"
18:31:40 stevemar: I was first
18:31:48 henrynash: i feel like you're making some british pop culture reference that just went over my head lol
18:31:54 *grumble*
18:32:12 but if changes are happening behind the cache, such as what we do when we hammer on our tests, memcache does not deal
18:32:29 dolphm, I think he has been arrested before
18:32:30 henrynash, I thought the line was "its a fair court?"
18:32:39 ayoung: if you're clustering memcache nodes, then there's nothing to keep consistent because you're sharding data across them
18:32:46 ayoung: there is no consistency across memcached nodes by design
18:32:53 dolphm, right. dstanek right
18:32:59 http://i1.kym-cdn.com/photos/images/newsfeed/000/992/401/e37.png
18:33:02 there isn't and shouldn't be - a given key can only be on a single node ever
18:33:08 ayoung: maybe, blended by folk law...
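[editor's note] The ordering rule discussed above for the three migration repos (a --migrate step should be refused if the expand repo is at a lower number, and likewise contract should not get ahead of migrate) might be sketched roughly as follows. This is only an illustration under assumptions: the function name `can_run`, the `PHASES` tuple, and the `versions` dict are hypothetical and are not keystone's actual migration code or CLI.

```python
# Hypothetical sketch: names and data layout are assumptions for illustration only.
PHASES = ("expand", "migrate", "contract")


def can_run(phase: str, target: int, versions: dict) -> bool:
    """Allow `phase` to move to `target` only if the preceding phase
    has already reached at least that version number."""
    idx = PHASES.index(phase)
    if idx == 0:
        # Expand is the first phase, so nothing has to precede it.
        return True
    previous = PHASES[idx - 1]
    return versions[previous] >= target


# Example: expand is at 3, the other two repos are still at 2.
versions = {"expand": 3, "migrate": 2, "contract": 2}
print(can_run("migrate", 3, versions))   # migrate may catch up with expand
print(can_run("contract", 3, versions))  # contract must wait for migrate
```

Keeping one shared version number across the three repos is what makes a check this simple possible; with independent numbering the comparison would need a mapping between repos.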
18:33:20 stevemar: ++
18:33:38 dstanek, and *we* kinda get that. But people outside of Keystone think of it as a single, database backed, transaction system. Hence revocations
18:33:54 http://dictionary.cambridge.org/dictionary/english/it-s-a-fair-cop
18:34:22 ayoung: i think the expectation is becoming that keystone scale globally and handle these more difficult scenarios
18:34:27 henrynash, all these years I've been misinterpreting the dead Bishop sketch
18:34:30 thanks henrynash, that cleared it up
18:34:49 stevemar: http://s2.quickmeme.com/img/10/10e71fd6edca008ff7ab182e9c428ea756412398815dbeec35f25092e913ed2c.jpg
18:35:05 lol
18:35:06 lol
18:35:12 http://www.montypython.net/scripts/bishop.php
18:35:13 i dont get it
18:35:33 i think we're all sleep deprived and delirious
18:35:57 we need more open discussion? we're just arguing and posting memes at this point?
18:36:04 leave it to the LA Kings
18:36:37 ayoung: so we are back to dropping revocations?
18:36:49 I like to think of us all as sleep deprived and delicious....
18:36:52 lbragstad: that's better than the Fargo??? wait, what teams do you have?
18:37:09 :)
18:37:10 rderose ... well played
18:37:30 :)
18:37:34 henrynash I have no idea how offensive this really is
18:37:34 rderose we settle for the Minnesota Wild ;)
18:37:44 https://www.youtube.com/watch?v=n5qn_kShlDI
18:37:50 lbragstad: there you go
18:38:12 topol: ++
18:38:21 "Well, I meet a lot of people and I'm convinced that the vast majority of wrongthinking people are right. "
18:38:46 shaleh, so, yeah, I think we can drop revocations if we go with the plan jamielennox and I were dreaming up at the midcycle
18:38:51 and topol is gonna have a talk with HR
18:39:02 well, most revocations.
Explicit will still be needed
18:39:03 * topol took a risk
18:39:13 ayoung: i liked that plan FWIW
18:39:17 ayoung: if a) works and b) confuses people less that sounds good
18:39:19 But if we make tokens short lived...yeah, most go away
18:39:19 Maybe all
18:39:27 topol: :-)
18:39:32 stevemar, me, too.. Just need the time to work on it
18:39:33 ayoung: like 95+%
18:39:40 ayoung: just ditch tripleo :P
18:39:50 come back to us, you know you want to
18:40:01 ayoung: what's the path to short lived tokens?
18:40:16 stevemar, I do want to, but it turns out that people can't actually use the crap we build if the installer disables it
18:40:27 henrynash: 2 options, one complex, one simple
18:40:31 ( and don't say fly to Boston and turn left)
18:40:40 `[token] expiration` = 600
18:40:50 #shipit
18:40:54 :)
18:41:07 lbragstad: ++
18:41:12 henrynash: we re-work it completely and use "reservations" by jamie
18:41:12 henrynash, basically, this: we make tokens live for a really short time as authentication proxies, but honor the data that they represent for the life time of the workflows
18:41:21 henrynash: or we yeah, that ^
18:41:44 stevemar, lets not call them reservations. That was a different mechanism, where the user goes to Cinder first, creates a reservation, and passes that to Nova
18:41:53 ayoung: right we discussed that at the midcycle...what's stopping us
18:42:03 so you have 5 minutes to use a token, but if you used it, it's valid for 24 hours
18:42:09 ayoung: i think i have that right?
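[editor's note] The "5 minutes to use a token, but if you used it, it's valid for 24 hours" idea above could be modelled roughly like this. It is a minimal sketch, not keystone code: the `Token` class, `is_honored`, and both constants are hypothetical, and it assumes the 24 hour window is counted from first use, which the discussion does not spell out.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values, taken from the numbers quoted in the discussion.
FIRST_USE_WINDOW = 5 * 60        # seconds a fresh token may first be presented
WORKFLOW_WINDOW = 24 * 60 * 60   # seconds the token's data is honored after first use


@dataclass
class Token:
    issued_at: float
    first_used_at: Optional[float] = None


def is_honored(token: Token, now: float) -> bool:
    """Return True if the token should be accepted at time `now`."""
    if token.first_used_at is None:
        # Never used: it must be presented inside the short window.
        if now - token.issued_at > FIRST_USE_WINDOW:
            return False
        token.first_used_at = now
        return True
    # Already used once: honor it for the longer workflow window
    # (assumption: measured from the first use, not from issuance).
    return now - token.first_used_at <= WORKFLOW_WINDOW


token = Token(issued_at=0.0)
print(is_honored(token, 100.0))              # first use inside the 5 minute window
print(is_honored(token, 100.0 + 3600.0))     # an hour into the workflow
print(is_honored(Token(issued_at=0.0), 301.0))  # never used, short window elapsed
```

In this model the longer window is just a constant, which matches the later remark in the log that the 24 hour part would be a config option that could be lowered.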
18:42:17 henrynash, time
18:42:18 I need to get some team priorities done, and it is eating up my time
18:42:26 stevemar, you are right
18:42:37 henrynash I think we need more ten-digit gnomes to do the needful
18:42:38 ayoung: i like the solution, it's neat
18:42:40 and that is OK because we go back to Keystone and expand out the data each time
18:43:06 lbragstad: someone has their big book of british slang open
18:43:30 dstanek: i like what breton said here https://review.openstack.org/#/c/349704/5 :)
18:43:32 how do we settle that with the ops request for 4 hour tokens to limit security exposure
18:44:13 shaleh: the 24 hours part is a config option, could set it to 4 hrs or 1
18:44:14 stevemar: breton: woot! there is a transient failure though :-(
18:44:39 soon as we are done here i shall go fix
18:45:07 anyway, let's actually end this
18:45:12 thanks for coming everyone
18:45:20 thanks dolphm for covering me last week and next week
18:45:25 well, bye
18:45:31 wat
18:45:31 o/
18:45:36 #endmeeting