Monday, October 29, 2012

A Four-legged OAuth for ABBI

I've been thinking about how to address the million registration problem that I identified last week (actually, it was Josh Mandel who first pointed it out).  I think I have a solution to the problem, and it revolves around the way the consumer key is exchanged.

In Twitter, the same client application on different consumers’ systems uses the same key and secret to connect to a single data holder (twitter.com).  In ABBI, an application will want to use the same key, yet a different secret for each data holder it connects to.  The reason to use the same key is so that the client application can be consistently identified in the same way across multiple data holders.  It also provides a way to avoid collisions between client applications.  The reason to use a different secret with each data holder is to ensure that the secret shared with one data holder cannot be used with another data holder.  This prevents someone from pretending to be a data holder in order to obtain all of an application's secrets.

The consumer key I propose to use is a URL identifying the application’s web page.  It’s not unreasonable to assume that a commercial application will have its own web site, and while hosting one is a little bit more challenging for the typical hacker, it is not a huge hurdle.  In fact, I run a secure web site for free.  It’s also not too hard to believe that someone would willingly host pages for “garage-apps” (things a developer like me would use to manage their own data).

The reason to use a different secret is so that the secret shared between a client application and a data holder can only be used with that data holder.  This prevents a rogue from creating a system pretending to be a data holder in order to gain access to a client application’s secret.  Surely they could obtain access to a secret, but it would do them no good with any other data holder, because the client application would just use a different secret for the other data holder.

It is important that the secret used by the client application with a given data holder always be the same regardless of which consumer system is accessing that data holder, because most (if not all) OAuth implementations expect each key to be associated with one secret.   Centralizing control of secrets isn’t something that you can expect the client application itself to manage.  In fact, a new secret would have to be generated for every new data holder that appears in the ecosystem.  This is another reason why the client application needs to be associated with a web site, because that becomes the single point of secret distribution for that client application.
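
The key-and-secret arrangement described above can be sketched as follows.  This is only a minimal illustration, assuming an in-memory store on the client application's web site; the class name `SecretRegistry`, the method `secret_for`, and the second data holder URL are hypothetical and not part of any ABBI or OAuth specification.

```python
import secrets


class SecretRegistry:
    """Hypothetical store kept by the client application's web site."""

    def __init__(self, consumer_key):
        # The same consumer key (the application's URL) is used everywhere
        self.consumer_key = consumer_key
        # One secret per data holder, keyed by the data holder's URL
        self._secrets = {}

    def secret_for(self, data_holder_url):
        # Generate a secret the first time a data holder is seen,
        # and return that same secret on every later request
        if data_holder_url not in self._secrets:
            self._secrets[data_holder_url] = secrets.token_urlsafe(32)
        return self._secrets[data_holder_url]


registry = SecretRegistry("https://MyAbbiApps.com/ABBI")
first = registry.secret_for("https://MyDataHolder.org")
again = registry.secret_for("https://MyDataHolder.org")
other = registry.secret_for("https://OtherDataHolder.example")
```

Here `first` and `again` are identical, so every consumer system presents the same secret to MyDataHolder.org, while `other` is different, so a secret leaked by one data holder is useless at another.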

Here is the workflow I’m presuming:
1)      Client Application MyABBI is installed on my personal device.
2)      During Device Configuration, the application is pointed to MyDataHolder.org as being one of the sources of data it needs to query.
3)      Client Application MyABBI connects to its home web-server (MyAbbiApps.com/MyABBI), and asks for the secret key to use with MyDataHolder.org, as it has never connected to that Data Holder before.  We don’t need to say much about how the client application communicates to the web site, nor do we need to say how the client application authenticates itself to that site, since both are under control of the client application developer.  However, we do need to say that the communication must occur over a secure channel, and that the Client Application must authenticate itself.  In fact, the client application could use the first step of the token request workflow to authenticate itself.
4)      The Client Application Web Site for MyABBI responds with a secret key to MyABBI on my personal device.  If the client application used the first step of the token request workflow, the response could be the same as how a server responds with a request token.  During this stage, one of two things happens:
a)      https://MyAbbiApps.com/ABBI has never encountered https://MyDataHolder.org before.  In this case, the site makes a determination whether to trust MyDataHolder.org or not.  If it chooses to trust MyDataHolder.org, the website behind MyAbbiApps.com/ABBI creates and stores a new secret associated with that data holder.
b)      https://MyAbbiApps.com/ABBI has encountered https://MyDataHolder.org.  If it is trusted, the web site returns the secret associated with that data holder, and if it isn’t trusted, it returns some sort of error response.
5)      Having obtained a consumer_secret, MyABBI can now begin the Authorization Workflow with https://MyDataHolder.org.  The first step is for the client application to obtain a request token from the Data Holder.
6)      About halfway through the process of trying to verify the request just made in the previous step, https://MyDataHolder.org will need to obtain the client secret associated with the client application.  If it already has that information, it could just reuse what it knows, since it usually will not have changed.  But if it doesn’t, it needs to follow these steps:
a)      Identify itself as https://MyDataHolder.org to https://MyAbbiApps.com/ABBI, and indicate that it is requesting the secret for the client application that identified itself to the Data Holder as being supported by that site.
b)      Reject the OAuth Request Token request with an error code.
7)      The https://MyAbbiApps.com/ABBI website will accept a request for a secret from a Data Holder.  If the data holder (at https://MyDataHolder.org) trusts the client application’s website certificate, communication continues; otherwise it stops at this point, and the Authorization workflow fails.
8)      That request must contain the base URL for the Data Holder’s ABBI API (e.g., https://MyDataHolder.org/ABBI/api) and must also contain a nonce associated with this request for the client application’s secret.
9)      The Client Application’s website (https://MyAbbiApps.com/ABBI) will send a request to the “Secret Exchange” end-point for the Data Holder (https://MyDataHolder.org).  That request will contain: the sending website URL (https://MyAbbiApps.com/ABBI), the nonce given in step 8, and the client application’s secret needed to access resources.
10)   The Data Holder will store the client secret associated with the client key if it recognizes the nonce, and return 200 OK.  If it doesn’t recognize the nonce, it sends back an error message.
11)   If https://MyAbbiApps.com gets back an error message, indicating that MyDataHolder.org didn’t recognize the nonce associated with the client secret, it should invalidate that secret.
12)   The client application will restart the Authorization workflow with the data holder, and this time, it will succeed.
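
The steps above can be sketched as a small in-memory simulation.  This is an illustration under stated assumptions rather than a real implementation: direct method calls stand in for the HTTPS requests, the `sign` helper is a simplified stand-in for an OAuth HMAC-SHA1 signature (it does not build a real signature base string), and the names `Registrar`, `DataHolder`, `Rejected`, `handle_secret_request`, and `secret_exchange` are all hypothetical.

```python
import hashlib
import hmac
import secrets


class Rejected(Exception):
    """Raised when the data holder rejects a request token request."""


def sign(secret, message):
    # Simplified stand-in for an OAuth HMAC-SHA1 signature (an assumption)
    return hmac.new(secret.encode(), message.encode(), hashlib.sha1).hexdigest()


class Registrar:
    """The client application's web site."""

    def __init__(self, url):
        self.url = url
        self.secrets = {}  # data holder URL -> secret

    def secret_for(self, holder_url):
        # Steps 3-4: create the secret on first contact, then reuse it
        if holder_url not in self.secrets:
            self.secrets[holder_url] = secrets.token_urlsafe(32)
        return self.secrets[holder_url]

    def handle_secret_request(self, holder, nonce):
        # Steps 7-9: deliver the secret to the holder's Secret Exchange end-point
        secret = self.secret_for(holder.url)
        if not holder.secret_exchange(self.url, nonce, secret):
            # Step 11: the nonce was not recognized, so invalidate the secret
            self.secrets.pop(holder.url, None)


class DataHolder:
    def __init__(self, url, registrars):
        self.url = url
        self.registrars = registrars  # consumer key URL -> Registrar
        self.client_secrets = {}      # consumer key URL -> secret
        self.pending = {}             # nonce -> consumer key URL

    def request_token(self, consumer_key, message, signature):
        secret = self.client_secrets.get(consumer_key)
        if secret is None:
            # Steps 6a and 8: ask the registrar for the secret, with a nonce
            nonce = secrets.token_urlsafe(16)
            self.pending[nonce] = consumer_key
            self.registrars[consumer_key].handle_secret_request(self, nonce)
            # Step 6b: reject this attempt; the client is expected to retry
            raise Rejected("client secret unknown; retry later")
        if not hmac.compare_digest(sign(secret, message), signature):
            raise Rejected("bad signature")
        return "request-token-" + secrets.token_urlsafe(8)

    def secret_exchange(self, sender_url, nonce, secret):
        # Step 10: store the secret only if the nonce was issued for this sender
        if self.pending.pop(nonce, None) != sender_url:
            return False
        self.client_secrets[sender_url] = secret
        return True


registrar = Registrar("https://MyAbbiApps.com/ABBI")
holder = DataHolder("https://MyDataHolder.org", {registrar.url: registrar})

secret = registrar.secret_for(holder.url)      # steps 3-4
signature = sign(secret, "request_token")      # step 5
try:
    holder.request_token(registrar.url, "request_token", signature)
    first_attempt = "accepted"
except Rejected:
    first_attempt = "rejected"                 # step 6b
token = holder.request_token(registrar.url, "request_token", signature)  # step 12
```

In this sketch the first attempt is rejected while the secret travels from the registrar to the data holder, and the retry succeeds.  A call to `secret_exchange` with a nonce the data holder never issued returns `False`, which is what lets the data holder ignore secrets it did not ask for.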

There are a couple of key points here that I’d like to explain further.  I’m assuming TLS without client authentication, because that is how just about every consumer-facing web site works (even those with patient portals).  Making either the data holder or the client application’s web site use TLS with client authentication would simplify the steps here, and enable them to establish mutual trust.  However, it would also make MyDataHolder.org have to provide a public certificate, which the web-site developer may not have access to, or vice-versa.  I’d be very challenged to get access to the certificate securing https://abbi-motorcycleguy.rhcloud.com.  It’s not mine.  It belongs to https://rhcloud.com, and so I can’t use it to verify a TLS connection as a client endpoint; it can only be used to verify the identity of the web server.

Without mutual authentication, the challenge in step 7 is when MyAbbiApps.com gets a request purporting to be from MyDataHolder.org asking for the secret associated with MyAbbiApps.com.  It has no way to determine that the requester is indeed MyDataHolder.org without client authentication.
To resolve that problem, steps 8 and 9 come into play.  Step 8 ensures that there is a way to synchronize the request with the response that comes asynchronously in step 9.

In Step 9, MyDataHolder.org finally gets the secret it asked for at step 7.

Note that we haven’t delayed the authorization request in step 6, we simply rejected it, expecting the client to retry in step 12.  It is certainly feasible to delay the authorization step, but synchronizing these two separate threads of activity can be challenging, and isn’t necessary.  It’s just necessary to get the client application to try again.  With a sufficient time delay (e.g., one minute), the client application will be able to complete the Authorization workflow just fine most of the time.
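
The reject-and-retry behavior on the client side could look something like the following sketch.  The function name `retry_after_rejection`, the `RequestRejected` exception, and the retry count of three are assumptions made for illustration; only the one-minute delay comes from the discussion above.

```python
import time


class RequestRejected(Exception):
    """Hypothetical error raised when the data holder rejects the request."""


def retry_after_rejection(attempt, retries=3, delay_seconds=60.0, sleep=time.sleep):
    """Call attempt() until it succeeds, waiting between rejected tries."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for try_number in range(retries):
        try:
            return attempt()
        except RequestRejected:
            # Give up after the last allowed try
            if try_number == retries - 1:
                raise
            # Give the data holder time to receive the secret (steps 7-10)
            sleep(delay_seconds)


# Usage with a stand-in request that is rejected once, then accepted
calls = []
waits = []


def request_token():
    calls.append("try")
    if len(calls) < 2:
        raise RequestRejected("secret not yet known")
    return "request-token"


# waits.append replaces time.sleep so the example does not actually pause
result = retry_after_rejection(request_token, sleep=waits.append)
```

Passing the `sleep` function in as a parameter is just a convenience for trying the sketch out without waiting a full minute.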

Steps 10 and 11 ensure that data holders respond appropriately to requests to send them a secret, and that the client web site revokes a secret it handed out that wasn’t accepted by a Data Holder.

The applications running behind MyAbbiApps.com and MyDataHolder.org need establish no mutual trust relationship initially.  It is only when they become aware of each other that each can separately make a trust determination about the other based upon policies.  MyDataHolder.org could refuse to trust a site whose certificate it didn’t like.  Similarly, MyAbbiApps.com could refuse to trust a Data Holder it didn’t like.  It’s completely up to them to configure the policy.  In fact, the trust relationships need not even be based on the certificates, but could be based on the site URL.  Blacklists could be used by either site as well to reject requests.
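
A URL-based policy of this kind might be as simple as the sketch below.  The function name `is_trusted`, the `require_https` option, and the example blacklist entry are hypothetical; a real site would choose its own policy.

```python
from urllib.parse import urlsplit


def is_trusted(site_url, blacklist=frozenset(), require_https=True):
    """Decide whether to deal with a site, based only on its URL."""
    parts = urlsplit(site_url)
    # Refuse anything that is not served over a secure channel
    if require_https and parts.scheme != "https":
        return False
    # urlsplit lowercases the host name, so blacklist entries should be lowercase
    host = parts.hostname or ""
    return bool(host) and host not in blacklist


blocked = frozenset({"rogue-holder.example"})
ok = is_trusted("https://MyDataHolder.org/ABBI/api", blocked)
rogue = is_trusted("https://Rogue-Holder.example/ABBI/api", blocked)
insecure = is_trusted("http://MyDataHolder.org/ABBI/api", blocked)
```

Either side could apply the same kind of check: the data holder to the client application's web site, and the web site to the data holder.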

Now that I’ve worked out the Authorization Workflow, what about the normal API Authorization case?  It shouldn’t be necessary for the data holder to get the consumer secret again for MyAbbiApps.com/ABBI, but if it were, it could simply start at step 6 with the request to obtain the secret.  In this way, applications don’t even need to maintain any sort of persistent record of the client secret, because they have a way to obtain it again later.

What I’ve just described is what I call 4-legged OAuth.  The three actors in 3-legged OAuth are
1.       Consumer (End User)
2.       Client Application
3.       Server (Data Holder)

We’ve just added the Client Registrar.


2 comments:

  1. Hi Keith,

    Again thanks for bringing attention to this issue -- but I'd strongly caution against inventing substantial new infrastructure here.

    Getting the details right is just very difficult. What do you see as the essential missing functionality, e.g., in the existing "dynamic client registration" draft for OpenID Connect [1]?

    Otherwise, two quick points:

    * A client app running on an end-user's device (phone, tablet, laptop, etc) probably shouldn't be trusted to keep a secret. Especially a secret on which the privacy of other people's data depends. (Imagine one rogue user extracting the app's consumer secret and running wild.) See discussion of public vs. confidential clients in the OAuth 2 spec [2].

    * As an example of the kind of subtlety involved here: the original OAuth spec was released and widely implemented when a session fixation vulnerability was discovered in its authorization process [3]. This was corrected in OAuth 1.0a by adding the `oauth_verifier` parameter.


    [1] http://openid.net/specs/openid-connect-registration-1_0.html

    [2] http://tools.ietf.org/html/draft-ietf-oauth-v2-31

    [3] http://hueniverse.com/2009/04/explaining-the-oauth-session-fixation-attack/

  2. A couple of responses:
    1. As I mentioned previously, Data Holders don't necessarily want to become identity providers. Having to register an application with each data holder quickly becomes a ton of registrations. That means that data holders need to provide support to application providers to deal with various application connectivity issues.

    With regard to your other points:
    1. We already TRUST client apps to keep their secrets. The apps I use today that use OAuth do so.
    2. I reused the OAuth transaction described in 2.1 of the OAuth RFC to deliver a key and secret to the client (in a way that is already done by OAuth).
    3. I reused the verifier pattern in the second communication between client web-site and data holder.

    I'll have to look more deeply at OpenID Connect specifications. Right now, it isn't clear how these could be used by a Data Holder without the Data Holder having to be an Identity Provider.

    Also, I'm trying to avoid the million registration problem. To put it simply, if there are 1000 apps and 1000 data holders, then in order for every one of those apps to work with every one of those data holders, there must be a million registrations performed.
