18:01:35 #startmeeting 18:01:36 Meeting started Tue Jul 24 18:01:35 2012 UTC. The chair is heckj. Information about MeetBot at http://wiki.debian.org/MeetBot. 18:01:37 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 18:01:53 Ola all! 18:02:28 The big topic for today is the PKI code review work/descriptions/etc from ayoung 18:02:37 Before we get there, is there any immediate issues folks have? 18:02:47 #topic anything immediate? 18:03:37 there's a new security bug that was alerted in today - received it, but haven't tested/verified against it. 18:03:38 ok 18:04:05 heckj, should we talking about that in a public chat room? 18:04:25 If it is OK to do so, please post the link 18:04:57 I don't think it's too butal of an exploit - https://bugs.launchpad.net/keystone/+bug/1028563 18:05:18 since it's a security vuln, you need to be explicitly subscribed to see it. I've added ayoung, termie, and dolph to it 18:05:46 content: "Identity authentication does not check if user is enabled" 18:06:21 heckj, ah, ok. yeah, LDAP enabled is going to need a little massaging, as that field doesn't exist in the base schema. The others are checked in the authenticate call...we should hit some of that in todays walk through 18:06:34 cool 18:06:37 I can chime in on the ticket 18:06:48 have at at your time 18:07:03 Let's get into the main topic then 18:07:09 #topic PKI code review 18:08:12 OK, lets start with authenticate then... 18:08:28 https://review.openstack.org/#/c/7754/9/keystone/service.py 18:08:57 line 269 18:09:00 #link https://review.openstack.org/#/c/7754/9/keystone/service.py 18:09:21 This is the call for either creating a new token, or validating an existing one 18:09:41 on line 290, you see the start of the big "if" that splits these two use cases 18:10:08 bascially, if you pass in the password credential, any existing token is ignored. This logic is maintained 18:10:51 that majority of the heavy lifting is done by self.identity_api.get_user_by_name 18:11:34 hmm... 18:12:05 no, the heavy lifting isin the common lines that are hidden, 1 sec 18:12:23 anyone know how to expand those? 18:12:44 the real heavy lifting is done by 18:12:47 auth_info = self.identity_api.authenticate(context=context, 18:12:47 user_id=user_id, 18:12:47 password=password, 18:12:47 tenant_id=tenant_id) 18:13:00 line 277 prior to the patch 18:13:02 yeah, not sure - was just mucking with the interface to see that 18:13:07 https://github.com/openstack/keystone/blob/master/keystone/service.py#L277 18:13:29 as a note, this call needs to be refactored, 18:13:51 identity api builds up the data that will be sent back as the authenticate response 18:13:56 click on "preferences", then change context to "whole file" and you'll see the whole thing side by side 18:14:40 heckj, thanks. You guys with me? 18:15:03 with you 18:15:21 so line 315 18:15:40 that is what A: checks that uid and password is correct and builds up the data for the token 18:15:50 there is a little postprocessing, but let skip ahead to 18:16:08 after the if statement 18:16:19 line 426 18:16:39 426? 18:16:44 heckj yes 18:16:53 I am skipping the token path for now 18:16:58 er 18:17:11 the path where an existing token is passed in 18:17:23 and instead followng the userid and password are passed in 18:17:35 gimme a sec, catching up 18:17:39 we'll go back to the else block in a second 18:18:04 okay - good 18:18:16 to sum up: line 315 creates the data for the token , line 426 signs it 18:18:43 what's "cms" stand for? 18:18:50 crypto message syntax 18:18:56 got it 18:18:56 it is the format of the signed document 18:19:04 it is what is used for SMIME among other things 18:19:15 and it maps to the following command line call: 18:19:41 openssl cms -sign -in auth_token.json -nosmimecap -signer cert.pem -inkey key.pem -outform DER -nodetach -nocerts -noattr -out auth_token.signed 18:20:06 that call, and the corresponding call to verify are in 18:20:25 https://review.openstack.org/#/c/7754/9/keystone/common/cms.py 18:20:55 note that they are done using popen. THis is the best supported parallelisation mechanism in Eventlet (AFAICT) 18:21:15 so a new process is forked off, then execs openssl ... 18:21:31 the output is read back into the parent process 18:21:49 yep, with ya 18:21:55 so cms.cms_sign_text(json.dumps(token_data), 18:22:09 gets signed. on line 54 18:22:32 cms_to_token (called line 73) does a little postprocessing 18:22:54 strips off the header, footer, and replaces / with - 18:22:58 ayoung, so line 375 (checking token's length) basically allows the old-fashioned token support if it's not a CMS token, correct? 18:23:01 slaps it all together into one big string 18:23:07 (service.py), sorry 18:23:07 chops line returns to 18:23:09 heckj, yes 18:23:18 liemmn, that is correct 18:23:29 thx 18:23:32 liemmn, as does the check to see if it is disabled 18:23:35 line... 18:23:43 422 18:23:52 got it 18:23:53 and for now default is to disable 18:24:21 ayoung: nice, looks good 18:24:35 OK, heckj so the big thing I would change, and will do so in the near future... 18:24:45 lets jump back to service.py 18:24:52 and look in that else block 18:25:21 line 340 18:25:34 we read the old token out of the Header 18:25:48 somewhere earlier... 18:26:12 and here we validate 18:26:49 first by checking to see if it is in the datastore....there are pros and cons to doing this, but this is the least change approach 18:27:12 if it is not in the backend, we can assume "disabled" or "invalid token" 18:27:17 note that this is in Keystone 18:27:28 a remote service does not go through this code path 18:27:56 up to line 374 is fairly close to the old logic 18:28:24 and then we hit 375 which liemmn pointed out before... 18:28:39 UUID tokens are shorter 18:29:14 there is also some interspersed logic here for expiry, which we maintain 18:30:00 the common lines, like the block at 394 18:30:11 is for building up the response to the verify call 18:30:31 and should be refactored into the identity api 18:30:48 thus, this code should be much simpler ideally 18:31:11 +1 for shorter methods :) 18:31:12 line 385 on is all really common code 18:31:35 OK..jump ahead to 430 18:31:56 regardless of password or token, if we issue a new token, once we sign it, we need to persist it 18:32:07 the big change is that the ID is no longer a uuid 18:32:26 431: token_ref = self.token_api.create_token( 18:32:58 so this code is per-backend. Lets look at the SQL one 18:33:15 https://review.openstack.org/#/c/7754/9/keystone/token/backends/sql.py 18:33:30 line 31 18:33:36 we have dropped the id column 18:33:42 because it is confusing 18:33:52 well, not dropped 18:33:58 but it is no longer the primary key 18:34:05 the id *is* the signed document 18:34:16 way too long to be indexable by MySQL etc 18:34:26 buy SQL Alchemy insists on a primary key 18:34:29 so that is 18:34:32 id_hash 18:34:45 camn anyone guess what that is? 18:34:54 hint on line 65 18:35:10 nice 18:35:11 quick md5 of the signed token, eh? 18:35:16 yep 18:35:22 rkukura gets props for the idea 18:35:42 It is short enough and unique enough for our purposes 18:35:59 unique enough? 18:36:10 what is the collission rate? 18:36:27 s34n, s34n on md5? QUite small 18:36:30 and for these 18:36:39 s34n: unique for a primary key in SQLAlchemy - it has the standard md5 hashing characteristics 18:36:40 because the docs are so similar, even smaller 18:37:44 MD5 is sensitive to small changes in the document. 2 dos that are similar are more likely to have different MD5s than wildly differen docs. 18:37:55 something is tickling my brain on that. Something I recently read on md5 collisions. Let me research. I'm probably wrong. 18:38:09 thatis, of course, a laymans understanding, and would proabably make most people that know stats annoyed 18:39:31 s34n, we should also be flushing the tokens after they are expired somehow. We are not doing that now, but once we do, collisions should be statistically sufficiently ignorable 18:39:48 s34n, note that UUIDs have the same problem 18:40:15 OK, before we move on to the auth_)token middlewar 18:40:16 e 18:40:24 I'd like to talk a bit about SQL migration 18:40:35 kk 18:40:37 note that I had to change https://review.openstack.org/#/c/7754/9/keystone/common/sql/migrate_repo/versions/001_add_initial_tables.py 18:41:05 that is because if you automate the table creation for tokens, they will be defined according to the new schema, with the id_hash column 18:41:22 and 001 needs to define them the way that they are today, with id as the pkey 18:41:41 hence line 38 defining the table explicitly 18:41:57 and dropping the import of the token into the migrate code 18:42:02 on line 26 18:42:24 then, because we are altering a table, it is really hard to do it right in sqlalchemy...maybe impossible 18:42:36 so upgrade and downgrade is done using goo-ole-SQL 18:42:54 got it 18:43:04 I have different files fdor mysdql and sqlite. I've been told that Postgres follows the sqlite 18:43:13 lets assume you are doign an on-the-fly upgrade 18:43:28 the old uuid token goes into the id and the id_hash columns...no harm there 18:43:49 and the old authenticate code kicks in (same thing that liemmn noted above) 18:43:59 only new tokens are signed and hashed for realz 18:44:09 on downgrade, we just dump all data 18:44:24 ayoung: sounds good 18:44:30 heckj, thanks 18:44:54 that is why for mysql we can get away with an altertable command. it changes the column name, anddrops the pkey, but maintains the data 18:45:01 for sqlite etc we do it more explicitly 18:45:34 OK, brief aside on config before auth_token 18:45:47 https://review.openstack.org/#/c/7754/9/keystone/config.py 18:46:05 line 128 is all we need 18:46:21 as the majoprity of the values we use we accepted in an earlier patch 18:46:49 opnce PKI is beat on somewhat, I'll submite a patch that flips line 129 to False 18:47:00 OK, any questions so far? 18:47:17 ayoung: lookin' good so far! 18:47:18 looks good... Is the default token validity 3650 days? 18:47:29 https://review.openstack.org/#/c/7754/9/keystone/middleware/auth_token.py 18:47:35 liemmn, um...let me see 18:47:42 * ayoung just closed that tab 18:47:43 137 18:48:07 liemmn: yep 18:48:13 That's a long time :) 18:48:26 yeah...that should be 1 18:48:35 that might be for the cert...let me check 18:48:51 that is not for token time out 18:48:57 that mechanism has not changed 18:49:01 oh, ok... makes sense 18:49:35 OK auth_token middleware, line 392 18:49:52 again, we gate on length 18:50:18 verify UUID token is the old path...on line validation 18:50:50 basically the red lines from line 350 to 392 18:51:06 moved to line 571 18:51:07 still wondering if there is value in caching these bigger cms tokens, since we are not incurring network cost anymore 18:51:20 liemmn, yes there is 18:51:32 if it is cached you don't have to do the fork/exec of openssl 18:52:05 ayoung: so as long as your local cache is good, you only take the decrypt hit once 18:52:11 heckj, that is right 18:52:15 ok 18:52:32 line 613 is where we verify signed tokens 18:52:57 again, using the code in keystone/common/cms.py 18:53:26 bascially adds back in the header, - to / and line breaks 18:53:33 then run through the openssl code 18:53:55 for now, I am just assuming one ca and one signing cert 18:54:00 they are fetched on demand 18:54:27 ayoung, you have a typo error on line 628 and 631... "this" -> "self" 18:54:33 see the exception blocks in line 627 18:54:42 ayoung: so a deployment expectation is that you'd likely drop in a ca and cert on every machine running the auth_token middleware, restricted directory, and they use it direclty as needed 18:54:53 heckj, yes 18:55:01 heckj, however 18:55:07 you might want to keep the fetch 18:55:12 especially for the signing cert 18:55:30 as that might expire. or you might have to deal with a security breach 18:55:33 breech 18:55:37 break in 18:55:40 heh 18:55:43 yeah, got it 18:56:23 heckj, in the future, I want auth_token to allow a list of keystone servers. We prime the pump with one 18:56:30 and the rest are fetched from the service catalog 18:56:48 then, in the signed token, it indicates "I was signed by foo" 18:56:53 ayoung: reasonable for the larger installations 18:56:54 and we fetch the cert for foo 18:57:16 heckj, it will also allow for federation, etc 18:57:33 we can specify that a given signing cert can only sign for a specific domain.... 18:57:40 heckj: just to get in under the wire, the keystone middleware was merged into swift. should be released next week in the next swift release 18:57:52 notmyname: thxusir 18:58:10 OK... the rest of the patch is commentary: tests and so forth 18:58:30 ;win 1 18:58:44 there is one thing I've found that makes me self-nack, but I did't want to speak up until after this walk through 18:58:56 I will buy a beer at the meetup if anyone can guess what it is 18:59:05 I will provide one hint: it is not in any of the files in this patch 18:59:35 any guesses? 18:59:42 going once.... 18:59:53 going twice.... 18:59:53 lack of docs 18:59:57 nope 19:00:00 :-) 19:00:14 Hi 19:00:16 heckj, I would be willing to checking withou tdocs 19:00:22 as the default behavior hasn;'t changed 19:00:39 no the missig feature is ec2 and s3 tokens in contrib 19:00:41 ayoung: yeah, I'm fine with it - but we'll want to describe how to use the new features very quickly 19:00:50 I think those will work as is by default 19:00:55 but not with PKI tokens 19:01:08 they only generate UUID tokens...which actually might be fine 19:01:21 Is this CI weekly meeting? 19:01:23 but they should be using common code for token generation 19:01:28 Daisy, not yet 19:01:30 still keystone 19:01:35 I'm waxing poetic 19:01:56 heckj, anyway...now that you know, I'll let you decide whether to nack on that...I think it is ok to do in a separate patch 19:02:31 very cool, ayoung... thanks for the walkthru! 19:02:38 My pleasure 19:02:44 now go forth and review! 19:02:48 :) 19:02:51 of course 19:02:53 * mtaylor taps foot patiently... 19:03:14 #endmeeting