18:01:35 <heckj> #startmeeting 18:01:36 <openstack> Meeting started Tue Jul 24 18:01:35 2012 UTC. The chair is heckj. Information about MeetBot at http://wiki.debian.org/MeetBot. 18:01:37 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 18:01:53 <heckj> Ola all! 18:02:28 <heckj> The big topic for today is the PKI code review work/descriptions/etc from ayoung 18:02:37 <heckj> Before we get there, is there any immediate issues folks have? 18:02:47 <heckj> #topic anything immediate? 18:03:37 <heckj> there's a new security bug that was alerted in today - received it, but haven't tested/verified against it. 18:03:38 <liemmn> ok 18:04:05 <ayoung> heckj, should we talking about that in a public chat room? 18:04:25 <ayoung> If it is OK to do so, please post the link 18:04:57 <heckj> I don't think it's too butal of an exploit - https://bugs.launchpad.net/keystone/+bug/1028563 18:05:18 <heckj> since it's a security vuln, you need to be explicitly subscribed to see it. I've added ayoung, termie, and dolph to it 18:05:46 <heckj> content: "Identity authentication does not check if user is enabled" 18:06:21 <ayoung> heckj, ah, ok. yeah, LDAP enabled is going to need a little massaging, as that field doesn't exist in the base schema. The others are checked in the authenticate call...we should hit some of that in todays walk through 18:06:34 <heckj> cool 18:06:37 <ayoung> I can chime in on the ticket 18:06:48 <heckj> have at at your time 18:07:03 <heckj> Let's get into the main topic then 18:07:09 <heckj> #topic PKI code review 18:08:12 <ayoung> OK, lets start with authenticate then... 18:08:28 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/service.py 18:08:57 <ayoung> line 269 18:09:00 <heckj> #link https://review.openstack.org/#/c/7754/9/keystone/service.py 18:09:21 <ayoung> This is the call for either creating a new token, or validating an existing one 18:09:41 <ayoung> on line 290, you see the start of the big "if" that splits these two use cases 18:10:08 <ayoung> bascially, if you pass in the password credential, any existing token is ignored. This logic is maintained 18:10:51 <ayoung> that majority of the heavy lifting is done by self.identity_api.get_user_by_name 18:11:34 <ayoung> hmm... 18:12:05 <ayoung> no, the heavy lifting isin the common lines that are hidden, 1 sec 18:12:23 <ayoung> anyone know how to expand those? 18:12:44 <ayoung> the real heavy lifting is done by 18:12:47 <ayoung> auth_info = self.identity_api.authenticate(context=context, 18:12:47 <ayoung> user_id=user_id, 18:12:47 <ayoung> password=password, 18:12:47 <ayoung> tenant_id=tenant_id) 18:13:00 <ayoung> line 277 prior to the patch 18:13:02 <heckj> yeah, not sure - was just mucking with the interface to see that 18:13:07 <ayoung> https://github.com/openstack/keystone/blob/master/keystone/service.py#L277 18:13:29 <ayoung> as a note, this call needs to be refactored, 18:13:51 <ayoung> identity api builds up the data that will be sent back as the authenticate response 18:13:56 <heckj> click on "preferences", then change context to "whole file" and you'll see the whole thing side by side 18:14:40 <ayoung> heckj, thanks. You guys with me? 18:15:03 <heckj> with you 18:15:21 <ayoung> so line 315 18:15:40 <ayoung> that is what A: checks that uid and password is correct and builds up the data for the token 18:15:50 <ayoung> there is a little postprocessing, but let skip ahead to 18:16:08 <ayoung> after the if statement 18:16:19 <ayoung> line 426 18:16:39 <heckj> 426? 18:16:44 <ayoung> heckj yes 18:16:53 <ayoung> I am skipping the token path for now 18:16:58 <ayoung> er 18:17:11 <ayoung> the path where an existing token is passed in 18:17:23 <ayoung> and instead followng the userid and password are passed in 18:17:35 <heckj> gimme a sec, catching up 18:17:39 <ayoung> we'll go back to the else block in a second 18:18:04 <heckj> okay - good 18:18:16 <ayoung> to sum up: line 315 creates the data for the token , line 426 signs it 18:18:43 <heckj> what's "cms" stand for? 18:18:50 <ayoung> crypto message syntax 18:18:56 <heckj> got it 18:18:56 <ayoung> it is the format of the signed document 18:19:04 <ayoung> it is what is used for SMIME among other things 18:19:15 <ayoung> and it maps to the following command line call: 18:19:41 <ayoung> openssl cms -sign -in auth_token.json -nosmimecap -signer cert.pem -inkey key.pem -outform DER -nodetach -nocerts -noattr -out auth_token.signed 18:20:06 <ayoung> that call, and the corresponding call to verify are in 18:20:25 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/common/cms.py 18:20:55 <ayoung> note that they are done using popen. THis is the best supported parallelisation mechanism in Eventlet (AFAICT) 18:21:15 <ayoung> so a new process is forked off, then execs openssl ... 18:21:31 <ayoung> the output is read back into the parent process 18:21:49 <heckj> yep, with ya 18:21:55 <ayoung> so cms.cms_sign_text(json.dumps(token_data), 18:22:09 <ayoung> gets signed. on line 54 18:22:32 <ayoung> cms_to_token (called line 73) does a little postprocessing 18:22:54 <ayoung> strips off the header, footer, and replaces / with - 18:22:58 <liemmn> ayoung, so line 375 (checking token's length) basically allows the old-fashioned token support if it's not a CMS token, correct? 18:23:01 <heckj> slaps it all together into one big string 18:23:07 <liemmn> (service.py), sorry 18:23:07 <ayoung> chops line returns to 18:23:09 <ayoung> heckj, yes 18:23:18 <ayoung> liemmn, that is correct 18:23:29 <liemmn> thx 18:23:32 <ayoung> liemmn, as does the check to see if it is disabled 18:23:35 <ayoung> line... 18:23:43 <ayoung> 422 18:23:52 <liemmn> got it 18:23:53 <ayoung> and for now default is to disable 18:24:21 <heckj> ayoung: nice, looks good 18:24:35 <ayoung> OK, heckj so the big thing I would change, and will do so in the near future... 18:24:45 <ayoung> lets jump back to service.py 18:24:52 <ayoung> and look in that else block 18:25:21 <ayoung> line 340 18:25:34 <ayoung> we read the old token out of the Header 18:25:48 <ayoung> somewhere earlier... 18:26:12 <ayoung> and here we validate 18:26:49 <ayoung> first by checking to see if it is in the datastore....there are pros and cons to doing this, but this is the least change approach 18:27:12 <ayoung> if it is not in the backend, we can assume "disabled" or "invalid token" 18:27:17 <ayoung> note that this is in Keystone 18:27:28 <ayoung> a remote service does not go through this code path 18:27:56 <ayoung> up to line 374 is fairly close to the old logic 18:28:24 <ayoung> and then we hit 375 which liemmn pointed out before... 18:28:39 <ayoung> UUID tokens are shorter 18:29:14 <ayoung> there is also some interspersed logic here for expiry, which we maintain 18:30:00 <ayoung> the common lines, like the block at 394 18:30:11 <ayoung> is for building up the response to the verify call 18:30:31 <ayoung> and should be refactored into the identity api 18:30:48 <ayoung> thus, this code should be much simpler ideally 18:31:11 <liemmn> +1 for shorter methods :) 18:31:12 <ayoung> line 385 on is all really common code 18:31:35 <ayoung> OK..jump ahead to 430 18:31:56 <ayoung> regardless of password or token, if we issue a new token, once we sign it, we need to persist it 18:32:07 <ayoung> the big change is that the ID is no longer a uuid 18:32:26 <ayoung> 431: token_ref = self.token_api.create_token( 18:32:58 <ayoung> so this code is per-backend. Lets look at the SQL one 18:33:15 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/token/backends/sql.py 18:33:30 <ayoung> line 31 18:33:36 <ayoung> we have dropped the id column 18:33:42 <ayoung> because it is confusing 18:33:52 <ayoung> well, not dropped 18:33:58 <ayoung> but it is no longer the primary key 18:34:05 <ayoung> the id *is* the signed document 18:34:16 <ayoung> way too long to be indexable by MySQL etc 18:34:26 <ayoung> buy SQL Alchemy insists on a primary key 18:34:29 <ayoung> so that is 18:34:32 <ayoung> id_hash 18:34:45 <ayoung> camn anyone guess what that is? 18:34:54 <ayoung> hint on line 65 18:35:10 <liemmn> nice 18:35:11 <heckj> quick md5 of the signed token, eh? 18:35:16 <ayoung> yep 18:35:22 <ayoung> rkukura gets props for the idea 18:35:42 <ayoung> It is short enough and unique enough for our purposes 18:35:59 <s34n> unique enough? 18:36:10 <s34n> what is the collission rate? 18:36:27 <ayoung> s34n, s34n on md5? QUite small 18:36:30 <ayoung> and for these 18:36:39 <heckj> s34n: unique for a primary key in SQLAlchemy - it has the standard md5 hashing characteristics 18:36:40 <ayoung> because the docs are so similar, even smaller 18:37:44 <ayoung> MD5 is sensitive to small changes in the document. 2 dos that are similar are more likely to have different MD5s than wildly differen docs. 18:37:55 <s34n> something is tickling my brain on that. Something I recently read on md5 collisions. Let me research. I'm probably wrong. 18:38:09 <ayoung> thatis, of course, a laymans understanding, and would proabably make most people that know stats annoyed 18:39:31 <ayoung> s34n, we should also be flushing the tokens after they are expired somehow. We are not doing that now, but once we do, collisions should be statistically sufficiently ignorable 18:39:48 <ayoung> s34n, note that UUIDs have the same problem 18:40:15 <ayoung> OK, before we move on to the auth_)token middlewar 18:40:16 <ayoung> e 18:40:24 <ayoung> I'd like to talk a bit about SQL migration 18:40:35 <heckj> kk 18:40:37 <ayoung> note that I had to change https://review.openstack.org/#/c/7754/9/keystone/common/sql/migrate_repo/versions/001_add_initial_tables.py 18:41:05 <ayoung> that is because if you automate the table creation for tokens, they will be defined according to the new schema, with the id_hash column 18:41:22 <ayoung> and 001 needs to define them the way that they are today, with id as the pkey 18:41:41 <ayoung> hence line 38 defining the table explicitly 18:41:57 <ayoung> and dropping the import of the token into the migrate code 18:42:02 <ayoung> on line 26 18:42:24 <ayoung> then, because we are altering a table, it is really hard to do it right in sqlalchemy...maybe impossible 18:42:36 <ayoung> so upgrade and downgrade is done using goo-ole-SQL 18:42:54 <heckj> got it 18:43:04 <ayoung> I have different files fdor mysdql and sqlite. I've been told that Postgres follows the sqlite 18:43:13 <ayoung> lets assume you are doign an on-the-fly upgrade 18:43:28 <ayoung> the old uuid token goes into the id and the id_hash columns...no harm there 18:43:49 <ayoung> and the old authenticate code kicks in (same thing that liemmn noted above) 18:43:59 <ayoung> only new tokens are signed and hashed for realz 18:44:09 <ayoung> on downgrade, we just dump all data 18:44:24 <heckj> ayoung: sounds good 18:44:30 <ayoung> heckj, thanks 18:44:54 <ayoung> that is why for mysql we can get away with an altertable command. it changes the column name, anddrops the pkey, but maintains the data 18:45:01 <ayoung> for sqlite etc we do it more explicitly 18:45:34 <ayoung> OK, brief aside on config before auth_token 18:45:47 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/config.py 18:46:05 <ayoung> line 128 is all we need 18:46:21 <ayoung> as the majoprity of the values we use we accepted in an earlier patch 18:46:49 <ayoung> opnce PKI is beat on somewhat, I'll submite a patch that flips line 129 to False 18:47:00 <ayoung> OK, any questions so far? 18:47:17 <heckj> ayoung: lookin' good so far! 18:47:18 <liemmn> looks good... Is the default token validity 3650 days? 18:47:29 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/middleware/auth_token.py 18:47:35 <ayoung> liemmn, um...let me see 18:47:42 * ayoung just closed that tab 18:47:43 <liemmn> 137 18:48:07 <heckj> liemmn: yep 18:48:13 <liemmn> That's a long time :) 18:48:26 <ayoung> yeah...that should be 1 18:48:35 <ayoung> that might be for the cert...let me check 18:48:51 <ayoung> that is not for token time out 18:48:57 <ayoung> that mechanism has not changed 18:49:01 <liemmn> oh, ok... makes sense 18:49:35 <ayoung> OK auth_token middleware, line 392 18:49:52 <ayoung> again, we gate on length 18:50:18 <ayoung> verify UUID token is the old path...on line validation 18:50:50 <ayoung> basically the red lines from line 350 to 392 18:51:06 <ayoung> moved to line 571 18:51:07 <liemmn> still wondering if there is value in caching these bigger cms tokens, since we are not incurring network cost anymore 18:51:20 <ayoung> liemmn, yes there is 18:51:32 <ayoung> if it is cached you don't have to do the fork/exec of openssl 18:52:05 <heckj> ayoung: so as long as your local cache is good, you only take the decrypt hit once 18:52:11 <ayoung> heckj, that is right 18:52:15 <liemmn> ok 18:52:32 <ayoung> line 613 is where we verify signed tokens 18:52:57 <ayoung> again, using the code in keystone/common/cms.py 18:53:26 <ayoung> bascially adds back in the header, - to / and line breaks 18:53:33 <ayoung> then run through the openssl code 18:53:55 <ayoung> for now, I am just assuming one ca and one signing cert 18:54:00 <ayoung> they are fetched on demand 18:54:27 <liemmn> ayoung, you have a typo error on line 628 and 631... "this" -> "self" 18:54:33 <ayoung> see the exception blocks in line 627 18:54:42 <heckj> ayoung: so a deployment expectation is that you'd likely drop in a ca and cert on every machine running the auth_token middleware, restricted directory, and they use it direclty as needed 18:54:53 <ayoung> heckj, yes 18:55:01 <ayoung> heckj, however 18:55:07 <ayoung> you might want to keep the fetch 18:55:12 <ayoung> especially for the signing cert 18:55:30 <ayoung> as that might expire. or you might have to deal with a security breach 18:55:33 <ayoung> breech 18:55:37 <ayoung> break in 18:55:40 <heckj> heh 18:55:43 <heckj> yeah, got it 18:56:23 <ayoung> heckj, in the future, I want auth_token to allow a list of keystone servers. We prime the pump with one 18:56:30 <ayoung> and the rest are fetched from the service catalog 18:56:48 <ayoung> then, in the signed token, it indicates "I was signed by foo" 18:56:53 <heckj> ayoung: reasonable for the larger installations 18:56:54 <ayoung> and we fetch the cert for foo 18:57:16 <ayoung> heckj, it will also allow for federation, etc 18:57:33 <ayoung> we can specify that a given signing cert can only sign for a specific domain.... 18:57:40 <notmyname> heckj: just to get in under the wire, the keystone middleware was merged into swift. should be released next week in the next swift release 18:57:52 <heckj> notmyname: thxusir 18:58:10 <ayoung> OK... the rest of the patch is commentary: tests and so forth 18:58:30 <termie> ;win 1 18:58:44 <ayoung> there is one thing I've found that makes me self-nack, but I did't want to speak up until after this walk through 18:58:56 <ayoung> I will buy a beer at the meetup if anyone can guess what it is 18:59:05 <ayoung> I will provide one hint: it is not in any of the files in this patch 18:59:35 <ayoung> any guesses? 18:59:42 <ayoung> going once.... 18:59:53 <ayoung> going twice.... 18:59:53 <heckj> lack of docs 18:59:57 <ayoung> nope 19:00:00 <heckj> :-) 19:00:14 <Daisy> Hi 19:00:16 <ayoung> heckj, I would be willing to checking withou tdocs 19:00:22 <ayoung> as the default behavior hasn;'t changed 19:00:39 <ayoung> no the missig feature is ec2 and s3 tokens in contrib 19:00:41 <heckj> ayoung: yeah, I'm fine with it - but we'll want to describe how to use the new features very quickly 19:00:50 <ayoung> I think those will work as is by default 19:00:55 <ayoung> but not with PKI tokens 19:01:08 <ayoung> they only generate UUID tokens...which actually might be fine 19:01:21 <Daisy> Is this CI weekly meeting? 19:01:23 <ayoung> but they should be using common code for token generation 19:01:28 <ayoung> Daisy, not yet 19:01:30 <ayoung> still keystone 19:01:35 <ayoung> I'm waxing poetic 19:01:56 <ayoung> heckj, anyway...now that you know, I'll let you decide whether to nack on that...I think it is ok to do in a separate patch 19:02:31 <liemmn> very cool, ayoung... thanks for the walkthru! 19:02:38 <ayoung> My pleasure 19:02:44 <ayoung> now go forth and review! 19:02:48 <liemmn> :) 19:02:51 <liemmn> of course 19:02:53 * mtaylor taps foot patiently... 19:03:14 <heckj> #endmeeting