I've been thinking about how to address the million registration problem that I identified last week (actually, it was Josh Mandel who first pointed it out). I think I have a solution to the problem, and it revolves around the way the consumer key is exchanged.
In Twitter, the same client application on different consumers’ systems uses the same key and secret to connect to a single data holder (twitter.com). In ABBI, an application will want to use the same key, yet a different secret for each data holder it connects to. The reason to use the same key is so that the client application can be consistently identified in the same way across multiple data holders.
It also provides a way to avoid collisions between client applications. The reason to use a different secret with each data holder is to ensure that no data holder's secret can be used with another data holder. This prevents someone from pretending to be a data holder in order to obtain all of an application's secrets.
The consumer key I propose to use is a URL identifying the application’s web page. It’s not unreasonable to assume that a commercial application will have its own web site. Setting one up is a little more challenging for the typical hacker, but it is not a huge hurdle. In fact, I run a secure web site for free. It’s also not too hard to believe that someone would willingly host pages for “garage-apps” (things a developer like me would use to manage their own data).
The reason to use a different secret is so that the secret
shared between a client application and a data holder can only be used with
that data holder. This prevents a rogue from
creating a system pretending to be a data holder in order to gain access to a client
application’s secret. Surely they could
obtain access to a secret, but it would do them no good with any other data
holder, because the client application would just use a different secret for
the other data holder.
It is important that the secret used by the client application with the same data holder always be the same regardless of which consumer system is accessing that data holder, because most (if not all) OAuth implementations expect each key to be associated with one secret. Centralizing control of secrets isn’t something that you can expect the client application itself to manage. In fact, a new secret would have to be generated for every new data holder that appears in the ecosystem. This is another reason why the client application needs to be associated with a web site, because that becomes the single point of secret distribution for that client application.
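To make the "one key, one secret per data holder" idea concrete, here is a minimal sketch of what the web site's bookkeeping might look like. The class name `ClientRegistrar`, the method `secret_for`, and the second data holder URL are all made up for illustration; a real site would persist the secrets rather than hold them in memory.

```python
import secrets


class ClientRegistrar:
    """Hypothetical registrar: one consumer key, one secret per data holder."""

    def __init__(self, consumer_key):
        # The consumer key is the application's URL, the same for every data holder.
        self.consumer_key = consumer_key
        # Maps data holder URL -> secret (in memory only, for illustration).
        self._secrets = {}

    def secret_for(self, data_holder_url):
        # Generate a secret on first contact, then always return the same one.
        if data_holder_url not in self._secrets:
            self._secrets[data_holder_url] = secrets.token_urlsafe(32)
        return self._secrets[data_holder_url]


registrar = ClientRegistrar("https://MyAbbiApps.com/ABBI")
first = registrar.secret_for("https://MyDataHolder.org")
again = registrar.secret_for("https://MyDataHolder.org")
other = registrar.secret_for("https://OtherDataHolder.example")
```

Every copy of the application asking about the same data holder gets the same secret (`first` and `again`), while a different data holder gets a different one (`other`).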
Here is the workflow I’m presuming:
1) Client Application MyABBI is installed on my personal device.
2) During Device Configuration, the application is pointed to MyDataHolder.org as being one of the sources of data it needs to query.
3) Client Application MyABBI connects to its home web-server (MyAbbiApps.com/MyABBI), and asks for the secret key to use with MyDataHolder.org, as it has never connected to that Data Holder before. We don’t need to say much about how the client application communicates with the web site, nor do we need to say how the client application authenticates itself to that site, since both are under control of the client application developer. However, we do need to say that the communication must occur over a secure channel, and that the Client Application must authenticate itself. In fact, the client application could use the first step of the token request workflow to authenticate itself.
4) The Client Application Web Site for MyABBI responds with a secret key to MyABBI on my personal device. If the client application used the first step of the token request workflow, the response could be the same as how a server responds with a request token. During this stage, one of two things happens:
a) https://MyAbbiApps.com/ABBI has never encountered https://MyDataHolder.org before. In this case, the site makes a determination whether to trust MyDataHolder.org or not. If it chooses to trust MyDataHolder.org, the website behind MyAbbiApps.com/ABBI creates and stores a new secret associated with that data holder.
b) https://MyAbbiApps.com/ABBI has encountered https://MyDataHolder.org. If it is trusted, the web site returns the secret associated with that data holder, and if it isn’t trusted, it returns some sort of error response.
5) Having obtained a consumer_secret, MyABBI can now begin the Authorization Workflow with https://MyDataHolder.org. The first step is for the client application to obtain a request token from the Data Holder.
6) About halfway through the process of trying to verify the request made in the previous step, https://MyDataHolder.org will need to obtain the client secret associated with the client application. If it already has that information, it can just reuse what it knows, since it usually will not have changed. But if it doesn’t, it needs to follow these steps:
a) Identify itself as https://MyDataHolder.org to https://MyAbbiApps.com/ABBI, and indicate that it is requesting the secret for the client application that identified itself to the Data Holder as being supported by that site.
b) Reject the OAuth Request Token request with an error code.
7) The https://MyAbbiApps.com/ABBI website will accept a request for a secret from a Data Holder. If the data holder (at https://MyDataHolder.org) trusts the client application’s website certificate, communication continues; otherwise it stops at this point, and the Authorization workflow fails.
8) That request must contain the base URL for the Data Holder’s ABBI API (e.g., https://MyDataHolder.org/ABBI/api) and must also contain a nonce associated with this request for the client application’s key.
9) The Client Application’s website (https://MyAbbiApps.com/ABBI) will send a request to the “Secret Exchange” end-point for the Data Holder (https://MyDataHolder.org). That request will contain: the sending website URL (https://MyAbbiApps.com/ABBI), the nonce given in step 8, and the client application’s secret needed to access resources.
10) The Data Holder will store the client secret associated with the client key if it recognizes the nonce, and return 200 OK. If it doesn’t recognize the nonce, it sends back an error message.
11) If https://MyAbbiApps.com gets back an error message, indicating that MyDataHolder.org didn’t recognize the nonce associated with the client secret, it should invalidate that secret.
12) The client application will restart the Authorization workflow with the data holder, and this time, it will succeed.
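The steps above can be sketched as a small in-memory simulation. This is only an illustration under several assumptions: the class and method names (`Registrar`, `DataHolder`, `secret_exchange`, and so on) are invented, there is no network or TLS, and the secret is compared directly, whereas real OAuth 1.0 would use it to sign the request rather than send it.

```python
import secrets

APP_KEY = "https://MyAbbiApps.com/ABBI"   # consumer key: the application's URL
HOLDER_URL = "https://MyDataHolder.org"


class Registrar:
    """Stands in for the client application's web site (hypothetical API)."""

    def __init__(self):
        self.secrets = {}  # data holder URL -> secret

    def secret_for_client(self, holder_url):
        # Steps 3-4: create the secret on first contact, reuse it afterwards.
        if holder_url not in self.secrets:
            self.secrets[holder_url] = secrets.token_urlsafe(32)
        return self.secrets[holder_url]

    def handle_secret_request(self, holder, nonce):
        # Steps 7-9: answer by calling the holder's "Secret Exchange" end-point.
        secret = self.secret_for_client(holder.url)
        accepted = holder.secret_exchange(APP_KEY, nonce, secret)
        if not accepted:
            # Step 11: the nonce was not recognized, so invalidate the secret.
            self.secrets.pop(holder.url, None)
        return accepted


class DataHolder:
    """Stands in for the data holder (hypothetical API)."""

    def __init__(self, url, registrar):
        self.url = url
        self.registrar = registrar
        self.client_secrets = {}  # consumer key -> secret
        self.pending = {}         # nonce -> consumer key it was issued for

    def request_token(self, consumer_key, secret):
        if consumer_key not in self.client_secrets:
            # Step 6: ask the registrar for the secret, then reject this attempt.
            nonce = secrets.token_urlsafe(16)
            self.pending[nonce] = consumer_key
            self.registrar.handle_secret_request(self, nonce)  # step 8
            return "rejected"
        # Simplification: compare the secret instead of verifying a signature.
        if secrets.compare_digest(self.client_secrets[consumer_key], secret):
            return "request-token-issued"
        return "rejected"

    def secret_exchange(self, sender_url, nonce, secret):
        # Step 10: store the secret only if the nonce is one we issued.
        if self.pending.pop(nonce, None) != sender_url:
            return False
        self.client_secrets[sender_url] = secret
        return True


registrar = Registrar()
holder = DataHolder(HOLDER_URL, registrar)
secret = registrar.secret_for_client(HOLDER_URL)     # steps 3-4
first_try = holder.request_token(APP_KEY, secret)    # steps 5-11: rejected
second_try = holder.request_token(APP_KEY, secret)   # step 12: succeeds
```

In this sketch the first attempt is rejected while the data holder fetches the secret, and the retry succeeds because the secret is now on file.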
There are a couple of key points here that I’d like to explain further. I’m assuming TLS without client authentication, because that is how just about every consumer-facing web site works (even those with patient portals). Making either the data holder or the client application’s web site use TLS with client authentication would simplify the steps here, and enable them to establish mutual trust. However, it would also make MyDataHolder.org have to provide a public certificate, which the web-site developer may not have access to, or vice-versa. I’d be very challenged to get access to the certificate securing https://abbi-motorcycleguy.rhcloud.com. It’s not mine. It belongs to https://rhcloud.com, so I can’t use it to verify a TLS connection as a client endpoint; it can only be used to verify the identity of the web server.
Without mutual authentication, the challenge in step 7 is when MyAbbiApps.com gets a request purporting to be from MyDataHolder.org asking for the secret associated with MyAbbiApps.com. It has no way to determine that the requester is indeed MyDataHolder.org without client authentication.
To resolve that problem, steps 8 and 9 come into play. Step 8 ensures that there is a way to synchronize the request with the response that comes asynchronously in step 9.
In step 9, MyDataHolder.org finally gets the secret it asked for at step 7.
Note that we haven’t delayed the authorization request in step 6; we simply rejected it, expecting the client to retry in step 12. It is certainly feasible to delay the authorization step, but synchronizing these two separate threads of activity can be challenging, and isn’t necessary. It’s just necessary to get the client application to try again. With a sufficient time delay (e.g., one minute), the client application will be able to complete the authorization workflow just fine most of the time.
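The retry on the client side could be as simple as the following sketch. The function name `authorize_with_retry` and its parameters are my own invention for illustration; the one-minute default comes from the example delay above, and the `sleep` argument exists only so the example can run without actually waiting.

```python
import time


def authorize_with_retry(request_token, attempts=3, delay_seconds=60.0, sleep=time.sleep):
    """Call request_token() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if request_token():
            return True
        # Wait before the next attempt, but not after the last one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Example: the first attempt is rejected (step 6), the retry succeeds (step 12).
outcomes = iter([False, True])
waits = []  # records the delays instead of really sleeping
result = authorize_with_retry(lambda: next(outcomes), sleep=waits.append)
```

Here `result` ends up `True` after a single recorded wait of 60 seconds.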
Steps 10 and 11 ensure that data holders respond appropriately to requests to send them a secret, and that the client web site revokes a secret it handed out that wasn’t accepted by a Data Holder.
The applications running behind MyAbbiApps.com and MyDataHolder.org
need establish no mutual trust relationship initially. It is only when they become aware of each
other that each can separately make a trust determination about the other based
upon policies. MyDataHolder.org could
refuse to trust a site whose certificate it didn’t like. Similarly, MyAbbiApps.com could refuse to
trust a Data Holder it didn’t like. It’s
completely up to them to configure the policy.
In fact, the trust relationships need not even be based on the certificates, but could be based on the site URL. Blacklists could be used by either site as well to reject requests.
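A URL-based policy with a blacklist might look something like this sketch. The function `is_trusted`, the `require_https` option, and the host `rogue.example` are hypothetical; each site would decide for itself what its policy actually checks.

```python
from urllib.parse import urlsplit


def is_trusted(site_url, blacklist, require_https=True):
    """Decide whether to trust a site based only on its URL."""
    parts = urlsplit(site_url)
    # urlsplit lowercases the scheme and the hostname for us.
    host = parts.hostname or ""
    if require_https and parts.scheme != "https":
        return False
    # Reject URLs without a host and hosts that appear on the blacklist.
    return host != "" and host not in blacklist


blacklist = {"rogue.example"}
trusted = is_trusted("https://MyDataHolder.org/ABBI/api", blacklist)
rejected = is_trusted("https://rogue.example/ABBI/api", blacklist)
```

The same check works in both directions: MyAbbiApps.com can apply it to a Data Holder's URL, and MyDataHolder.org can apply it to a client application's URL.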
Now that I’ve worked out the Authorization Workflow, what
about the normal API Authorization case?
It shouldn’t be necessary for the data holder to get the consumer secret
again for MyAbbiApps.com/ABBI, but if it were, it could simply start at step 6
with the request to obtain the secret.
In this way, applications don’t even need to maintain any sort of
persistent record of the client secret, because they have a way to obtain it
again later.
What I’ve just described is what I call 4-legged OAuth. The three actors in 3-legged OAuth are:
1. Consumer (End User)
2. Client Application
3. Server (Data Holder)
We’ve just added the Client Registrar.
Hi Keith,
Again thanks for bringing attention to this issue -- but I'd strongly caution against inventing substantial new infrastructure here.
Getting the details right is just very difficult. What do you see as the essential missing functionality, e.g., in the existing "dynamic client registration" draft for OpenID Connect [1]?
Otherwise, two quick points:
* A client app running on an end-user's device (phone, tablet, laptop, etc) probably shouldn't be trusted to keep a secret. Especially a secret on which the privacy of other people's data depends. (Imagine one rogue user extracting the app's consumer secret and running wild.) See discussion of public vs. confidential clients in the OAuth 2 spec [2].
* As an example of the kind of subtlety involved here: the original OAuth spec had already been released and widely implemented when a session fixation vulnerability was discovered in its authorization process [3]. This was corrected in OAuth 1.0a by adding the `oauth_verifier` parameter.
[1] http://openid.net/specs/openid-connect-registration-1_0.html
[2] http://tools.ietf.org/html/draft-ietf-oauth-v2-31
[3] http://hueniverse.com/2009/04/explaining-the-oauth-session-fixation-attack/
A couple of responses:
1. As I mentioned previously, Data Holders don't necessarily want to become identity providers. Having to register an application with each data holder quickly becomes a ton of registrations. That means that data holders need to provide support to application providers to deal with various application connectivity issues.
With regard to your other points:
1. We already TRUST client apps to keep their secrets. The apps I use today that use OAuth do so.
2. I reused the OAuth transaction described in section 2.1 of the OAuth RFC to deliver a key and secret to the client (in a way that is already done by OAuth).
3. I reused the verifier pattern in the second communication between client web-site and data holder.
I'll have to look more deeply at OpenID Connect specifications. Right now, it isn't clear how these could be used by a Data Holder without the Data Holder having to be an Identity Provider.
Also, I'm trying to avoid the million registration problem. To put it simply, if there are 1000 apps and 1000 data holders, then in order for every one of those apps to work with every one of those data holders, there must be a million registrations performed.