This year, the IndieWeb community has been making progress on iterating and evolving the IndieAuth protocol. IndieAuth is an extension of OAuth 2.0 that enables it to work with personal websites and in a decentralized environment.
There are already a good number of IndieAuth providers and apps, including a WordPress plugin, a Drupal module, and built-in support in Micro.blog, Known and Dobrado. Ideally, we'd have even more first-class support for IndieAuth in a variety of different blogging platforms and personal websites, and that's been the goal motivating the spec updates this year. We've been focusing on simplifying the protocol and bringing it more in line with OAuth 2.0 so that it's both easier to understand and also easier to adapt existing OAuth clients and servers to add IndieAuth support.
Most of the changes this year removed IndieAuth-specific bits in favor of reusing things from OAuth where possible, and cleaned up the text of the spec. These changes are also intended to be as backwards compatible as possible, so that existing clients and servers can upgrade independently.
This post describes the high-level changes to the protocol, and is meant to help implementers get an idea of what may need to be updated in existing implementations.
If you would like an introduction to IndieAuth and OAuth, here are a few resources:
The rest of this post details the specifics of the changes and what they mean to client and server developers. If you've written an IndieAuth client or server, you'll definitely want to read this post to know what you'll need to change for the latest updates.
- Response Type
- Indicating the User who is Logging In
- Adding PKCE Support
- Grant Type Parameters
- Providing "me" in the Token Request
- Removing Same-Domain Requirement
- Returning Profile Information
- Editorial Changes
- Dropped Features and Text
Response Type
The first thing an IndieAuth client does is discover the user's authorization endpoint and redirect the user to their server to authorize the client. A client might want to use IndieAuth in one of two ways: either to confirm the website of the user who just logged in, or to get an access token to be able to create posts on their website.
Previously, this distinction was made at this stage of the request by varying the response_type query string parameter. Instead, the response_type parameter is now always response_type=code, which brings it in line with the OAuth 2.0 specification. This makes sense because the response of this request is always an authorization code; it's only after the authorization code is used that the difference between these two uses becomes apparent.
Changes for clients: Always send response_type=code in the initial authorization request.
Changes for servers: Only accept response_type=code requests, and for backwards-compatible support, treat response_type=id requests as response_type=code requests.
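As a rough sketch of the server-side rule above, an authorization endpoint could normalize the incoming value before doing anything else. The function name and the dict of request parameters are assumptions made for illustration, and rejecting a missing response_type is a choice of this sketch rather than something this post specifies.

```python
def normalize_response_type(params: dict) -> str:
    """Return the effective response_type for an authorization request.

    `params` is assumed to be the parsed query string of the request.
    """
    response_type = params.get("response_type")
    if response_type == "code":
        return "code"
    if response_type == "id":
        # Legacy clients: treat the old value as a request for a code.
        return "code"
    # Anything else (including a missing value) is rejected in this sketch.
    raise ValueError(f"unsupported response_type: {response_type!r}")
```

With this in place the rest of the endpoint only ever has to deal with the single value "code".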
Indicating the User who is Logging In
In earlier versions of the specification, the authorization request was required to have the parameter me, the value of which was whatever URL the user entered into the client to start the flow. It turns out that this parameter isn't strictly necessary for the flow to succeed, although it can still help improve the user experience in some cases. As such, it has now been changed to an optional parameter.
This parameter is a way for the client to tell the IndieAuth server which user it expects will log in. For single-user sites this value is completely unnecessary, since there is only ever one me URL that will be returned in the end. It turns out that most single-user implementations were already ignoring this parameter anyway since it served no purpose.
For multi-user websites like a multi-author WordPress blog, this parameter also served little purpose. If a user was already logged in to their WordPress site, then tried to log in to an IndieAuth client, the server could just ignore this parameter anyway and return the logged-in user's me URL at the end of the flow.
For multi-user authorization endpoints like the (to-be deprecated) indieauth.com, this parameter served as a hint of who was trying to log in, so that the authorization server could provide a list of authentication options to the user. This is the only case in which this parameter really provides a user experience benefit, since without the parameter at this stage, the user would need to type in their website again, or be shown a list of authentication provider options such as "log in with Twitter".
There's yet another case where the user may enter just the domain of their website, even though their final me URL may be something more specific. For example, a user can enter micro.blog in an IndieAuth sign-in prompt, and eventually be logged in to that app as https://micro.blog/username. There is no requirement that the thing they type in to clients has to be an exact match of their actual profile URL, which allows a much nicer user experience so that users can type only the domain of their service provider which may provide profiles for multiple users. And in this case, the client providing the me URL to the server also doesn't serve any purpose.
The change to the spec makes the me parameter in the authorization request optional. In this sense, it's more of a hint from the client about who is trying to log in. Obviously the server can't trust that value in the request at this point, since the user hasn't logged in yet, so it really is more of a hint than anything else.
Changes for clients: Continue to include the me parameter in the request if you can, but if you are using an OAuth client that doesn't let you customize this request, it's okay to leave it out now.
Changes for servers: Most servers were already ignoring this parameter anyway, so if you fell into that category then no change is needed. If you were expecting this parameter to exist, change it to optional, because you probably don't actually need it. If it's present in a request, you can use it to influence the options you show for someone to authenticate if they are not yet logged in, or you could show an error message if the client provides a me URL that doesn't match the currently logged-in user.
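To make the client side of this concrete, here is a minimal sketch of building the authorization request with me as an optional hint. The function name, the argument names, and the example URLs are illustrative assumptions; the PKCE parameters are left out here only to keep the sketch focused on me.

```python
from typing import Optional
from urllib.parse import urlencode, urlsplit, parse_qs


def build_authorization_url(
    authorization_endpoint: str,
    client_id: str,
    redirect_uri: str,
    state: str,
    me: Optional[str] = None,
) -> str:
    """Build the URL the client redirects the user to.

    `me` is only a hint about who is expected to log in, so it is
    simply omitted when the client has nothing useful to send.
    """
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    }
    if me is not None:
        params["me"] = me
    # Respect a query string already present on the discovered endpoint.
    separator = "&" if urlsplit(authorization_endpoint).query else "?"
    return authorization_endpoint + separator + urlencode(params)
```

A generic OAuth client that cannot add custom parameters would produce the same URL as calling this without me.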
Adding PKCE Support
Probably the biggest change to the spec is the addition of the OAuth 2.0 PKCE (Proof Key for Code Exchange) mechanism. This is an extension to OAuth 2.0 that addresses a number of different vulnerabilities. It was originally designed to allow mobile apps to securely complete an OAuth flow without a client secret, but has since proven to be useful for JavaScript apps, and it even prevents a particular attack when the client does have a client secret.
Since IndieAuth clients are all considered "Public Clients" in OAuth terms, there are no preregistered client secrets at all, and PKCE becomes a very useful mechanism to secure the flow.
I won't go into the details of the particular attacks PKCE solves in this post, since I've talked about them a lot in other talks and videos. If you'd like to learn more about this, check out this sketch notes video where I talk about PKCE and my coworker draws sketchnotes on his iPad.
Suffice it to say, PKCE is a very useful mechanism, isn't terribly complicated to implement, and can be added independently by clients and servers since it's designed to be backwards compatible.
The change to the spec is that PKCE has been rolled into the core authorization flow. Incidentally, the OAuth spec itself is making the same change by rolling PKCE into the OAuth 2.1 update.
Changes for clients: Always include the PKCE parameters code_challenge and code_challenge_method in the authorization request.
Changes for servers: If a code_challenge is provided in an authorization request, don't allow the authorization code to be used unless the corresponding code_verifier is present in the request using the authorization code. For backwards compatibility, if no code_challenge is provided in the request, make sure the request to use the authorization code does not contain a code_verifier.
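A minimal sketch of both halves of PKCE might look like the following, assuming the S256 challenge method from RFC 7636 is the only one in use. The function names are made up for illustration, and how a server stores the challenge alongside the authorization code is left out entirely.

```python
import base64
import hashlib
import hmac
import secrets
from typing import Optional


def make_code_verifier() -> str:
    """Client: generate a random 43-character URL-safe code_verifier."""
    return secrets.token_urlsafe(32)


def make_code_challenge(code_verifier: str) -> str:
    """Client: derive the S256 code_challenge (unpadded base64url of SHA-256)."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verifier_is_acceptable(
    stored_challenge: Optional[str], code_verifier: Optional[str]
) -> bool:
    """Server: decide whether a code redemption request may proceed.

    `stored_challenge` is whatever code_challenge was saved with the
    authorization code, or None if the client did not send one.
    """
    if stored_challenge is None:
        # Legacy request without PKCE: a verifier must not appear now.
        return code_verifier is None
    if code_verifier is None:
        return False
    expected = make_code_challenge(code_verifier)
    return hmac.compare_digest(expected, stored_challenge)
```

The client sends the challenge with code_challenge_method=S256 in the authorization request and keeps the verifier until it redeems the code.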
Using an Authorization Code
Whew, okay, you've made it this far and you've sent the user off to their authorization endpoint to log in. Eventually the IndieAuth server will redirect the user back to your application. Now you're ready to use that authorization code to either get an access token or confirm their me URL.
There are two changes to this step of redeeming the authorization code.
Grant Type Parameters
The first change, while minor, brings IndieAuth in line with OAuth 2.0, since apparently this hadn't actually been specified before. This request must now contain the POST body parameter grant_type=authorization_code.
Changes for clients: Always send the parameter grant_type=authorization_code when redeeming an authorization code. Generic OAuth 2.0 clients will already be doing this.
Changes for servers: For backwards compatibility, treat the omission of this parameter the same as providing it with grant_type=authorization_code. For example, if you also accept requests with grant_type=refresh_token, the absence of this parameter means the client is doing an authorization code grant.
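On the server, the backwards-compatible default can be a one-liner. This is only a sketch; the function name and the dict of parsed POST body fields are assumptions.

```python
def effective_grant_type(form: dict) -> str:
    """Return the grant type for a request, defaulting for legacy clients.

    `form` is assumed to be the parsed POST body. A missing or empty
    grant_type is treated as an authorization code grant.
    """
    return form.get("grant_type") or "authorization_code"
```

A server that also supports refresh tokens would branch on the returned value after this step.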
Providing "me" in the Token Request
The request when using an authorization code, either to the token endpoint or authorization endpoint, previously required that the client send the me parameter as well. The change to the spec drops this parameter from this request, making it the same as an OAuth 2.0 request.
This has only some minor implications in very specific scenarios. We analyzed all the known IndieAuth implementations and found that the vast majority of them were already ignoring this parameter anyway. For single-user endpoints, the additional parameter provides no value, since the endpoint would be self-contained anyway, and already know how to validate authorization codes. Even multi-user endpoints like the WordPress plugin would know how to validate authorization codes because the authorization and token endpoints are part of the same software.
The only implementations that would break when this parameter is left out are separate implementations of authorization endpoints and token endpoints, where the user has no prior relationship with either. The biggest offender is actually my own implementation, which I am eventually going to retire: indieauth.com and tokens.indieauth.com. I initially wrote indieauth.com as just the authorization endpoint part, and later added tokens.indieauth.com as a completely separate implementation; it shares nothing in common with indieauth.com and is actually entirely stateless. Over the years, it turns out this pattern hasn't actually been particularly useful, since a website is either going to build both endpoints or delegate both to an external service. So in practice, the only people using tokens.indieauth.com were using it with the indieauth.com authorization endpoint.
Removing this parameter has no effect on most of the implementations. I did have to update my own implementation of tokens.indieauth.com to default to verifying authorization codes at indieauth.com if there was no me parameter, which so far has been successful.
Changes for clients: No need to send the me parameter when exchanging an authorization code. This makes the request the same as a generic OAuth 2.0 request.
Changes for servers: For servers that have an authorization endpoint and token endpoint as part of the same software, make sure your token endpoint knows how to look up authorization codes. Most of the time this is likely what you're already doing anyway, and you were probably ignoring the me parameter already. If you do want to provide a standalone token endpoint, you'll need to create your own encoding scheme to bake in the authorization endpoint or me value into the authorization code itself. But for the vast majority of people this will require no change.
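Putting the two changes to this step together, the body a client posts when redeeming a code could be built roughly like this. The function and argument names are assumptions for illustration; the field names are the usual OAuth 2.0 ones plus the PKCE code_verifier.

```python
from typing import Optional
from urllib.parse import urlencode, parse_qs


def build_code_redemption_body(
    code: str,
    client_id: str,
    redirect_uri: str,
    code_verifier: Optional[str] = None,
) -> str:
    """Build the form-encoded POST body for redeeming an authorization code.

    grant_type is always sent, and there is deliberately no `me` field.
    """
    fields = {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "redirect_uri": redirect_uri,
    }
    if code_verifier is not None:
        # Only present when the authorization request carried a code_challenge.
        fields["code_verifier"] = code_verifier
    return urlencode(fields)
```

The result is sent with a Content-Type of application/x-www-form-urlencoded, the same as any generic OAuth 2.0 client would do.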
Removing Same-Domain Requirement
One of the challenges of a decentralized protocol like this is knowing who to trust to make assertions about who. Just because someone's authorization server claims that a user identified as "https://aaronpk.com/" logged in doesn't mean I actually did log in. Only my authorization server should be trusted to assert that I logged in.
In the previous version of the spec, the way this was enforced was that clients had to check that the final me URL returned had the same domain as what the user initially entered, after following redirects. That means if I entered aaronpk.com into the client, and that redirected to https://aaronparecki.com/, the client would then expect the final profile URL returned at the end to also be on aaronparecki.com. This works, but it has a few challenges and limitations.
The biggest challenge for client developers was keeping track of the chain of redirects. There were actually separate rules for temporary vs permanent redirects, and the client would have to be aware of each step in the redirect chain if there was more than one. Then at the end, the client would have to parse the final profile URL to find the host component, then check if that matches, and it turns out that there are often some pretty low-level bugs with parsing URLs in a variety of languages that can lead to unexpected security flaws.
On top of the technical challenges for client developers, there was another problem in the specific case where a user may control only a subfolder of a domain. For example, in a shared hosting environment where users can upload arbitrary files to their user directory, https://example.com/~user, the same-domain restriction would still let /~user1 claim to be /~user2 on that domain. We didn't want to go down the route of adding more URL parsing rules like checking for substring matches, as that would likely have led to even more of a burden on client developers and more risk of security holes.
So instead, this entire restriction has been replaced with a new way of verifying that the final profile URL is legitimate. The new rule should drastically simplify the client code, at the slight cost of a possible additional HTTP request.
The new rule is that if the final profile URL returned by the authorization endpoint is not an exact match of the initially entered URL, the client has to go discover the authorization endpoint at the new URL and verify that it matches the authorization endpoint it used for the flow. This is described in a new section of the spec, Authorization Server Confirmation.
This change means clients no longer need to keep track of the full redirect chain (although they still can if they would like more opportunities to possibly skip that last HTTP request), and also ensures users on shared domains can't impersonate other users on that domain.
Changes for clients: Remove any code around parsing the initial and final URLs, and add a new step after receiving the user's final profile URL: If the final profile URL doesn't match exactly what was used to start the flow, then go fetch that URL and discover the authorization endpoint and confirm that the discovered authorization endpoint matches the one used at the beginning. Please read Authorization Server Confirmation for the full details.
Changes for servers: No change.
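The new client-side check can be sketched in a few lines. Everything here is illustrative: the function and argument names are assumptions, and discover stands in for whatever code the client already uses to fetch a URL and find its authorization endpoint. The spec's Authorization Server Confirmation section remains the reference for the full details.

```python
from typing import Callable


def confirm_authorization_server(
    initial_url: str,
    returned_me: str,
    endpoint_used: str,
    discover: Callable[[str], str],
) -> bool:
    """Decide whether the returned profile URL can be accepted.

    `discover` is a hypothetical callable that fetches a URL and returns
    the authorization endpoint it advertises.
    """
    if returned_me == initial_url:
        # Exact match: the endpoint was already discovered from this URL.
        return True
    # Otherwise the returned URL must advertise the same endpoint
    # that was used for the flow.
    return discover(returned_me) == endpoint_used
```

Note that there is no host parsing or redirect bookkeeping left in the client; the cost is one extra discovery request when the URLs differ.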
Returning Profile Information
If the application would like to know more about the user than just their confirmed profile URL, such as their name or photo, previously there was no easy or reliable way to find this information. It's possible the user's profile URL may have an h-card with their info, but that would only include public info and would require bringing in a Microformats parser and making another HTTP request to find this information.
In the latest version of the spec, we've added a new section to the response returned when redeeming an authorization code, so that the authorization server can return this profile data directly. To request this information, there are now two scopes defined in the spec, profile and email. When the client requests the profile scope, this indicates the client would like the server to return the user's name, photo and url. The email scope requests the user's email address.
The response when redeeming an authorization code that was issued with these scopes will now contain an additional property, profile, alongside the me URL and access token.
{
  "access_token": "XXXXXX",
  "token_type": "Bearer",
  "scope": "profile email create",
  "me": "https://user.example.net/",
  "profile": {
    "name": "Example User",
    "url": "https://user.example.net/",
    "photo": "https://user.example.net/photo.jpg",
    "email": "user@example.net"
  }
}
This comes with some caveats. As is always the case with OAuth, just because a client requests certain scopes does not guarantee the request will be granted. The user or the authorization server may decide to not honor the request and leave this information out. For example a user may choose to not share their email even if the app requests it.
Additionally, the information in this profile section is not guaranteed to be "real" or "verified" in any way. It is merely information that the user intends to share with the app. This could mean anything from the user sharing different email addresses with different apps to the URL in the profile being a completely different website. For example, a multi-author WordPress blog which provides me URLs on the WordPress site's domain, example.com, may return the author's own personal website in the url property of the profile information. The client is not allowed to treat this information as authoritative or make any policy decisions based on the profile information; it's for informational purposes only. Another common vulnerability in many existing OAuth clients is that they assume the provider has confirmed the email address returned and will use that to deduplicate accounts. The problem is that if a user can edit their email address and have it returned in an OAuth response without the server confirming it, the client may end up being tricked into thinking a different user logged in. Only the me URL can be trusted as the stable identifier of the user, and everything in the profile section should be treated as if it were hand-entered into the client.
Changes for clients: If you would like to find the user's profile information, include the profile or email scope in your authorization request. If you don't need this, then no changes are necessary.
Changes for servers: Authorization servers should be able to recognize the profile and email scopes in a request, and ask the user for permission to share their profile information with clients, then return that along with the final me URL and access token. It's also completely acceptable to not support this feature at all, as clients shouldn't be relying on the presence of this information in the response anyway.
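A client reading this response might look something like the sketch below, which keeps me as the only identifier and treats everything under profile as optional display data. The function name and the shape of the return value are assumptions for illustration.

```python
import json
from typing import Dict, Optional, Tuple


def read_redemption_response(body: str) -> Tuple[str, Dict[str, Optional[str]]]:
    """Split a code redemption response into the identifier and display data.

    Only `me` is used to identify the user. The profile values are
    unverified and may be missing entirely.
    """
    data = json.loads(body)
    me = data["me"]  # required; a KeyError here means an invalid response
    profile = data.get("profile")
    if not isinstance(profile, dict):
        profile = {}
    display = {key: profile.get(key) for key in ("name", "url", "photo", "email")}
    return me, display
```

Accounts should be keyed on the returned me value; the display data is only suitable for things like pre-filling a form the user can still edit.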
Editorial Changes
There was a good amount of work done to clean up the text of the spec without changing any of the actual requirements. These are known as editorial changes.
The term "domain" has been replaced with the more accurate term "host" in most places. This matches the URL spec more closely, and reduces the confusion around registrable domains like example.com or example.co.uk and subdomains. In all cases, there has been no need to use the public suffix list because we have always meant full hostname matches.
Language around the term "profile URL" was cleaned up to make sure only the final URL returned by the authorization server is referred to as the "profile URL". The user may enter lots of different things into the client that might not be their profile URL, anything from just a hostname (aaronpk.com) to a URL that redirects to their profile URL. This cleans up the language to better clarify what we mean by "profile URL".
With the change to use response_type=code for both versions of the flow, the authorization and authentication sections became almost entirely duplicate content. These have been consolidated into a single section, Authorization, and the only difference now is the response when the authorization code is redeemed.
Dropped Features and Text
Any time you can cut text from a spec and have it mean the same thing, that's a good thing. Thankfully we were able to cut a decent amount of text thanks to consolidating the two sections mentioned above. We also dropped an obscure feature that was extremely under-utilized. For the case where a token endpoint and authorization endpoint were not part of the same software, there was a section describing how those two could communicate so that the token endpoint could validate authorization codes issued by an arbitrary authorization endpoint. This serves no purpose if a single piece of software provides both endpoints, since it would be far more efficient to have the token endpoint look up the authorization code in the database or however you're storing them, so virtually nobody had even bothered to implement this.
The only known implementations of this feature were my own tokens.indieauth.com, and Martijn's mintoken project. We both agreed that if we did want to pursue this feature in the future, we could write it up as an extension. Personally I plan on shutting down indieauth.com and tokens.indieauth.com in the near-ish future, and the replacement that I build will contain both endpoints, so I don't really plan on revisiting this topic anyway.
Conclusion / Future Work
Well, if you've made it this far, congrats! I hope this post was helpful. This was definitely a good amount of changes, although hopefully they were all made for good reasons and will simplify the process of developing IndieAuth clients and servers in the future.
We didn't get to every open IndieAuth issue in this round of updates; there are still a few interesting ones open that I would like to see addressed. The next largest change that will affect implementations would be to continue to bring this in line with OAuth 2.0 and always redeem the authorization code at the token endpoint even if no access token is expected to be returned. That would also have the added benefit of simplifying the authorization endpoint implementation to only need to worry about providing the authorization UI, leaving all the JSON responses to the token endpoint. This still requires some discussion and a plan for upgrading to this new model, so feel free to chime in on the discussions!
I would like to give a huge thank-you to everyone who has participated in the discussions this year, both on GitHub and in our virtual meetings! All the feedback from everyone who is interested in the spec has been extremely valuable!
We'll likely schedule some more sessions to continue development on the spec, so keep an eye on events.indieweb.org for upcoming events tagged #indieauth!
If you have any questions, feel free to stop by the #indieweb-dev chat (or join from IRC or Slack) and say hi!