The new draft spec at OAuth.xyz outlines a potential way to completely re-think OAuth from the ground up.
The XYZ spec, also nicknamed Transactional Authorization, effectively consolidates all of the different variations of OAuth that have evolved over the last decade, and plugs up a few security holes as well.
OAuth 2.0 has served us well over the last 10+ years, and has even been extensible enough to evolve to meet the new requirements of things like mobile apps and apps running on devices that don't have a browser or keyboard. It's hard to imagine that OAuth was originally created before the iPhone existed! That said, this extensibility has sometimes come at the cost of additional complexity, or at the cost of security. Many of the original assumptions that were made while designing OAuth 2.0 no longer apply, and the world of online security is very different today than it was 10 years ago.
The XYZ spec takes everything the industry has learned from OAuth 2.0 and OpenID Connect over the years and consolidates those lessons into a uniform pattern, reducing reliance on less-secure communications channels, and building in flexibility in a consistent way.
If you're interested in the details of XYZ, the website oauth.xyz does a great job of laying out all the use cases with examples.
This post is not an overview of what XYZ is. Instead, what I'd like to talk about today is a suggestion for extending XYZ even further to encompass some additional use cases that have emerged, specifically adding the user's identity information to the core protocol.
Identity has always been part of OAuth
OAuth was never intended to communicate information about the user who logged in to the app. This is a common point of confusion, usually wrapped up in the "authorization vs authentication" discussion. OAuth was originally created to give third-party apps access to a user's account so they could read or write data through the API.
Naturally, many apps that want access to a user's resources also want to know who that user is. What ended up happening is most OAuth APIs added an endpoint that returns profile information about the user when requested with the access token. You can see this in common APIs like GitHub, Instagram, and Twitter. These were all developed independently, so they all work slightly differently and return data in slightly different formats.
GitHub
https://api.github.com/user
Authorization: Bearer XXXX
{
"login": "aaronpk",
"id": 1234,
"node_id": "MDQ6VXNlcjE=",
"avatar_url": "https://avatars3.githubusercontent.com/u/113001?s=460&v=4",
"gravatar_id": "",
"url": "https://api.github.com/users/aaronpk",
...
}
Instagram
https://api.instagram.com/v1/users/self/?access_token=XXXX
{
"data": {
"id": "1574083",
"username": "aaronpk",
"full_name": "Aaron Parecki",
"profile_picture": "https://scontent-dfw5-2.cdninstagram.com/vp/a37666da4d6b670a849e7b77235e4d56/5DCD3710/t51.2885-19/s320x320/52837124_579812379154942_8897415153306304512_n.jpg?_nc_ht=scontent-dfw5-2.cdninstagram.com",
"bio": "OAuth hacker",
"website": "https://aaronparecki.com",
"is_business": false,
"counts": {
"media": 1572,
"follows": 285,
"followed_by": 505
}
}
}
Twitter
https://api.twitter.com/1.1/account/verify_credentials.json
[
{
"id": 14447132,
"id_str": "14447132",
"name": "Aaron Parecki",
"screen_name": "aaronpk",
"location": "30,000 feet",
...
}
]
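To make the inconsistency concrete, here is a minimal Python sketch of the glue code an app ends up writing to treat these three responses the same way. The `normalize_profile` function and the common `id`/`username` shape are hypothetical names chosen for illustration; the field names are taken from the sample responses above.

```python
from typing import Any, Dict


def normalize_profile(provider: str, payload: Any) -> Dict[str, str]:
    """Map a provider-specific profile response to one common shape."""
    if provider == "github":
        # GitHub calls the username "login" and returns a numeric id.
        return {"id": str(payload["id"]), "username": payload["login"]}
    if provider == "instagram":
        # Instagram nests the profile under a top-level "data" key.
        data = payload["data"]
        return {"id": str(data["id"]), "username": data["username"]}
    if provider == "twitter":
        # The Twitter sample above is wrapped in a list; accept either shape.
        user = payload[0] if isinstance(payload, list) else payload
        return {"id": user["id_str"], "username": user["screen_name"]}
    raise ValueError(f"unknown provider: {provider}")
```

Every new provider means another branch like these, which is exactly the kind of per-provider work a standard endpoint is meant to remove.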
OpenID Connect
OpenID Connect is built on top of OAuth 2.0 and provides a mechanism for applications to get the identity of the user that signs in. It does this in two ways.
The idea of returning profile information given an access token was standardized in OpenID Connect in the "UserInfo Endpoint". When a provider supports this endpoint, a client can send an access token and get back a representation of the authenticated end user.
OpenID Connect UserInfo Endpoint
https://authorization-server.com/userinfo
Authorization: Bearer XXXXX
{
"sub": "https://aaronparecki.com/",
"name": "Aaron Parecki",
"email": "aaron@parecki.com",
"preferred_username": "aaronpk",
...
}
This ends up looking very similar to the provider-specific endpoints that we looked at above.
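As a rough sketch of what the client side of this could look like in Python, using only the standard library: the function names are hypothetical, the `Accept` header is my own assumption rather than something the example above shows, and the URL is the placeholder from the example.

```python
import json
import urllib.request
from typing import Any, Dict


def build_userinfo_request(userinfo_url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the bearer token."""
    return urllib.request.Request(
        userinfo_url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",  # assumption: ask for a JSON response
        },
        method="GET",
    )


def fetch_userinfo(userinfo_url: str, access_token: str) -> Dict[str, Any]:
    """Send the request and parse the JSON body of the response."""
    request = build_userinfo_request(userinfo_url, access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_userinfo("https://authorization-server.com/userinfo", token)` would return the profile as a dictionary, assuming the server responds with JSON as in the example.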
OpenID Connect ID Tokens
The other way OpenID Connect deals with identity is by defining a new type of token, an "ID token". The ID token is a JSON Web Token (JWT) that encodes the user's profile information. It is returned in response to an OpenID Connect request, either in the redirect URL when using the implicit flow, or alongside the access token when using the authorization code flow.
In the case of the implicit flow, the application that receives the ID token needs to validate the JWT signature and all the JWT claims, because otherwise anyone could drop an ID token into an application to be signed in. This is a standard part of the plain OpenID Connect workflow.
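Here is a minimal sketch of the claim checks in Python. It assumes the JWT signature has already been verified with a proper JOSE library, which is not shown, and that `claims` is the decoded payload. The function name and parameters are hypothetical; `iss`, `aud`, `exp` and `nonce` are the standard ID token claims. A real client should follow the full validation rules in the OpenID Connect spec rather than this abbreviated list.

```python
import time
from typing import Any, Dict, Optional


def check_id_token_claims(
    claims: Dict[str, Any],
    issuer: str,
    client_id: str,
    nonce: str,
    now: Optional[float] = None,
) -> None:
    """Raise ValueError if the already signature-verified claims are not acceptable."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if client_id not in audiences:
        raise ValueError("token was not issued to this client")
    # A missing "exp" is treated as already expired.
    if claims.get("exp", 0) <= now:
        raise ValueError("token has expired")
    # The nonce ties the token to the request this client started.
    if claims.get("nonce") != nonce:
        raise ValueError("nonce mismatch")
```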
Applications that also need an access token in addition to the ID token should be using the authorization code flow to obtain the access token, otherwise they are at risk of the access token leaking.
So if an application needs both an access token and an ID token, the simplest way is to get both via the authorization code flow, so that both tokens are returned in the response when the application exchanges the authorization code at the token endpoint. The exchange looks like this.
POST /token
Content-type: application/x-www-form-urlencoded
grant_type=authorization_code
&code=XXXXXXXXX
&redirect_uri=https://example-app.com/redirect
&client_id={CLIENT_ID}
&client_secret={CLIENT_SECRET}
{
"access_token": "SlBV32hkMG",
"token_type": "Bearer",
"expires_in": 3600,
"refresh_token": "2xMBxBtEp3",
"id_token": "eyJhbGciOiJSUzI1..."
}
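For illustration, the form body of this request could be assembled in Python roughly as follows. The function name is hypothetical, and the parameter names and values mirror the example above.

```python
from urllib.parse import urlencode


def build_token_request_body(
    code: str, redirect_uri: str, client_id: str, client_secret: str
) -> str:
    """Form-encode the parameters for the authorization code exchange."""
    return urlencode(
        {
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": redirect_uri,
            "client_id": client_id,
            "client_secret": client_secret,
        }
    )
```

The resulting string is sent as the body of the POST to the token endpoint with the `application/x-www-form-urlencoded` content type shown above. `urlencode` takes care of percent-encoding the redirect URI.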
The interesting part is that in this case, the ID token validation step can be skipped completely, since validating it would provide no additional benefit. The application already knows it's talking to the right server, it knows the response is coming from that server, and the connection can't be tampered with since it is made over HTTPS. This means the application can just extract the data it needs from the ID token and use it directly.
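A small Python sketch of what "extract the data directly" can mean in practice: split the JWT and base64url-decode its middle segment. The function name is hypothetical, and this deliberately performs no signature check, so it is only appropriate for a token the application itself just received from the token endpoint over HTTPS, never for one that arrived through the browser.

```python
import base64
import json
from typing import Any, Dict


def read_id_token_claims(id_token: str) -> Dict[str, Any]:
    """Decode the payload of a compact JWT WITHOUT verifying its signature."""
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three segments")
    payload = parts[1]
    # JWTs strip base64 padding, so restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```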
The problem with this approach is that it essentially provides the application with a point-in-time snapshot of the user's profile information. If their profile information changes, there is no way for the application to get that update unless it either fetches the information from the userinfo endpoint or the user signs in again. While this may only be a minor annoyance for things like the user's name, it is potentially dangerous for other profile information such as what groups a user belongs to, or other things the application may be relying on to modify its behavior based on the logged-in user.
It also means that the entire profile information for the user is pushed to the application whether it needs it or not. And again, while this may not be a big deal if the only profile information sent is the user's name and email address, it can potentially be a lot of data such as groups, roles, or other provider-specific information that the authorization server may have about the user.
Identity in XYZ
While we're re-thinking OAuth from the ground up, we should also re-think how identity is handled within the system, while drawing from the experience of widely deployed systems. Rather than returning an ID token with potentially stale or large amounts of user information in the authorization response, let's return only the minimum necessary for the application to function, and let it request additional information that it needs when it needs it.
Transaction Request (Authorization Code Exchange)
In XYZ, the application takes the interact handle it got from the redirect and makes a POST request back to the authorization server to get an access token. (Described in Transaction Request.)
POST /authorize
Content-type: application/json
{
"handle": "80UPWY2NM33OMUKMKSKU",
"interact_handle": "CuD2MrpSXVKvvI6dN2awtNLx-HhZy46hJFDBicG4KoZaCmBofvqPxtm7CDMTsUFuvcmLwi_zUN70cCvalI6ENw"
}
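In Python, building this request might look something like the sketch below. The function name is hypothetical and the endpoint URL is whatever your authorization server uses; the two JSON properties are the ones from the example above.

```python
import json
import urllib.request


def build_transaction_request(
    endpoint: str, handle: str, interact_handle: str
) -> urllib.request.Request:
    """Build (but do not send) the JSON POST that continues the transaction."""
    body = json.dumps(
        {"handle": handle, "interact_handle": interact_handle}
    ).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the JSON body of the server's reply is the transaction response.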
We'll extend the response as described next.
Transaction Response (Access Token and User Identity)
The user's identifier can be returned in the same response as the access token.
In its simplest form, the transaction response includes just the unique user identifier along with the access token.
{
"access_token": {
"value": "UM1P9PMHKUR64TB8N6BW7OZB8CDFONP219RP1LT0",
"type": "bearer"
},
"user": {
"id": "5035678642"
}
}
The "id" property inside "user" is analogous to the "sub" property in an ID token. This value MUST uniquely identify this user within the system, and MUST be consistent when the same user logs in to the same application. The value MAY vary for the same user when logging in to different applications in order to preserve the user's privacy, preventing different applications from cross-correlating users between apps. (This is based on the deployed behavior of several APIs such as Facebook's "app-scoped IDs" and Apple's app-specific user ID and proxy email address.)
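One way an authorization server could satisfy both requirements is to derive the identifier from a server-side secret, the client ID, and its internal user ID. The sketch below is purely an assumption on my part for illustration; it is not something the XYZ draft specifies, and not a description of how Facebook or Apple actually generate their IDs. All names in it are hypothetical.

```python
import hashlib
import hmac


def app_scoped_user_id(server_secret: bytes, client_id: str, internal_user_id: str) -> str:
    """Derive an identifier that is stable per (app, user) pair but differs across apps."""
    # The NUL separator keeps ("ab", "c") and ("a", "bc") from colliding,
    # assuming neither value can itself contain a NUL byte.
    message = f"{client_id}\x00{internal_user_id}".encode("utf-8")
    return hmac.new(server_secret, message, hashlib.sha256).hexdigest()
```

The same user always gets the same value within one application, while two applications comparing notes see unrelated strings, as long as the server secret stays private.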
If the application needs to know profile information about the user, it can get that from the userinfo endpoint using the access token it just obtained.
We can go one step further and also return the userinfo endpoint at the same time as returning the user's unique identifier, avoiding the need for the app developer to hard-code this value in the app or find it in a separate discovery spec.
{
"access_token": {
"value": "UM1P9PMHKUR64TB8N6BW7OZB8CDFONP219RP1LT0",
"type": "bearer"
},
"user": {
"id": "5035678642",
"userinfo": "https://authorization-server.com/user/5035678642"
}
}
This has the additional advantage of letting the authorization server use a unique URL per user if it wants to (although that is by no means a requirement, since the access token will need to be included in the request too).
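On the client side, handling this response could be as simple as the following Python sketch. The `Session` type and function name are hypothetical, and the property names follow the proposed response shown above. Since the "user" object and its "userinfo" URL are part of this proposal rather than the current draft, the sketch treats both as optional.

```python
from typing import Any, Dict, NamedTuple, Optional


class Session(NamedTuple):
    access_token: str
    user_id: Optional[str]
    userinfo_url: Optional[str]


def parse_transaction_response(response: Dict[str, Any]) -> Session:
    """Pull the access token, user id, and userinfo URL out of the response."""
    token = response["access_token"]["value"]
    # "user" may be absent if the server does not return identity information.
    user = response.get("user") or {}
    return Session(token, user.get("id"), user.get("userinfo"))
```

The application stores only the identifier and the URL at sign-in, and nothing about the profile itself.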
If the application then wants to discover profile information about the user, it can make a request to the userinfo endpoint with the access token:
GET /user/5035678642
Authorization: Bearer UM1P9PMHKUR64TB8N6BW7OZB8CDFONP219RP1LT0
{
"id": "5035678642",
"name": "Aaron Parecki",
"nickname": "aaronpk",
"email": "aaron@parecki.com",
"photo": "https://aaronparecki.com/images/profile.jpg",
"zoneinfo": "America/Los_Angeles"
}
This ensures that the application knows that it can get the latest version of the user's profile information whenever it needs to by continuing to make requests to the userinfo endpoint, and prevents the temptation to treat the initial snapshot of the profile information as authoritative.
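As a sketch of how an application might act on that, here is a small Python cache that re-fetches the profile once it is older than some maximum age. The class name, the `fetch` callback (which stands in for a request to the userinfo endpoint), and the choice of a time-based refresh are all my own assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional


class ProfileCache:
    """Keep a profile in memory and refresh it once it is too old."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, Any]],
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # e.g. a function that calls the userinfo endpoint
        self._max_age = max_age_seconds
        self._clock = clock
        self._profile: Optional[Dict[str, Any]] = None
        self._fetched_at = 0.0

    def get(self) -> Dict[str, Any]:
        """Return the cached profile, fetching a fresh copy if it is missing or stale."""
        now = self._clock()
        if self._profile is None or now - self._fetched_at >= self._max_age:
            self._profile = self._fetch()
            self._fetched_at = now
        return self._profile
```

An application that relies on something sensitive like group membership would pick a short maximum age, or skip the cache entirely for those checks.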
If we don't address this need in XYZ, we risk ending up in the same situation we're in with OAuth 2.0, where providers implement their own snowflake APIs to let apps find the user's profile information anyway.