Federated Identity Candidate Architectures


Over the last year, I've been designing and implementing various candidate architectures for a federated identity management system. Quite honestly, when I started, I didn't know much about these types of systems. Not surprisingly, my thoughts have evolved as I've learned more about them. I think many other people are just starting to think about federated identity as well, so I thought I would explain the progression of my thinking in the hope that it will help shape yours more quickly.

Architecture 1 - A Single STS with Centrally Stored Claims

Initially, the candidate architecture that I came up with had a single security token service (STS). I imagined that this lone service would issue all types of claims for all of its relying parties (RPs). This meant that the STS would need a copy of every claim type and value that the RPs would need to authorize access. As a result, this first architecture included a claims service. This service would provide a way to register an RP with the STS, configure which claims it required, and let RPs publish their claim values into a claims store that the STS would then draw from when issuing security tokens.

As you can see in the first figure, this architecture would include an STS that would delegate authentication to an identity provider (IdP). The STS would get some identity-related claims from the IdP, but it would also get application-specific claims from the claims service. This solution has obvious flaws: massive storage requirements, synchronization issues between the RPs and the claims store, and so on. I abandoned this idea for a more decentralized approach.
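To make the shape of this first design concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: `ClaimsStore`, `register_rp`, `publish`, and `issue_token` are hypothetical names, not part of any real STS product, and a plain dict stands in for a signed security token.

```python
from dataclasses import dataclass, field


@dataclass
class ClaimsStore:
    """Hypothetical central claims store that the single STS draws from."""
    required: dict = field(default_factory=dict)   # rp -> list of claim types
    published: dict = field(default_factory=dict)  # (rp, user_id, claim_type) -> value

    def register_rp(self, rp, claim_types):
        # Record which claim types this RP needs in its tokens.
        self.required[rp] = list(claim_types)

    def publish(self, rp, user_id, claim_type, value):
        # An RP pushes a copy of one of its values into the central store.
        self.published[(rp, user_id, claim_type)] = value

    def claims_for(self, rp, user_id):
        # Return only the claims the RP registered and actually published.
        found = {}
        for claim_type in self.required.get(rp, []):
            key = (rp, user_id, claim_type)
            if key in self.published:
                found[claim_type] = self.published[key]
        return found


def issue_token(idp_claims, store, rp):
    """Merge identity claims from the IdP with the RP's centrally stored claims."""
    token = dict(idp_claims)
    token.update(store.claims_for(rp, idp_claims["user_id"]))
    return token


store = ClaimsStore()
store.register_rp("payroll", ["role"])
store.publish("payroll", "alice", "role", "approver")
token = issue_token({"user_id": "alice", "name": "Alice"}, store, "payroll")
```

Even in this toy version, every application value has to be copied into `published` and kept in step with the RP's own data, which is where the storage and synchronization problems come from.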

Architecture 2 - A Single STS that Queries RPs for Claims as Needed

Even with automated synchronization, the first architecture was fatally flawed. When the second beta of ADFS v. 2 (previously known as Geneva Server) came out, it included a little domain-specific language (DSL) that allowed attributes to be retrieved from various attribute stores, including AD DS, LDAP directories, SQL Server databases, and more. These attribute stores could be configured on an RP-by-RP basis. With this capability, I built a system that fetched identity-related claims from a central directory and queried the RPs' data stores just in time for additional application-specific values. These were burnt into the security token that ADFS issued and sent back to the RPs. It ended up being some sort of weird circle, as you can see in the second figure.

This design is a pretty clear progression from the first, but it still has some real problems. Chief among them is the odd back-channel thing. It just feels wrong to have the STS, which is supposed to provide centralized identity management, dipping into applications' databases for stuff that it's just going to turn around and give right back to them. With a centralized STS, identity information should not be as strewn around the enterprise as it often is today. The STS should be the single source for identities. With this candidate architecture, though, the authoritative sources of identity remain the various application databases.
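The circular flow can be sketched roughly as follows. This is not the ADFS claim rule language, just a Python stand-in under my own assumptions: `central_directory`, `payroll_db`, and `rp_attribute_stores` are hypothetical in-memory substitutes for the directory and the per-RP attribute stores.

```python
def issue_token(user_id, rp, directory, rp_attribute_stores):
    """Build a token from the central directory plus a just-in-time RP lookup."""
    # Identity-related claims come from the central directory.
    token = dict(directory[user_id])
    # Back channel: the STS reaches into the store configured for this RP.
    fetch = rp_attribute_stores[rp]
    token.update(fetch(user_id))
    return token


# Hypothetical data: one central directory and one RP-owned database.
central_directory = {"alice": {"user_id": "alice", "name": "Alice"}}
payroll_db = {"alice": {"role": "approver"}}

# Attribute stores are configured per RP, here as simple lookup functions.
rp_attribute_stores = {
    "payroll": lambda user_id: payroll_db.get(user_id, {}),
}

token = issue_token("alice", "payroll", central_directory, rp_attribute_stores)
```

The `role` value starts in `payroll_db`, travels through the STS, and lands back at the payroll RP inside the token, which is exactly the circle that bothered me.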

Architecture 3 - A Single STS that Issues Identity Claims Only

The third design I came up with was one that still had just one STS. This STS was different from the others though. It was an Identity Provider STS (IP-STS). This distinction meant it would only issue identity-related claims - global properties that described a user regardless of the application being accessed. This architecture meant the weird back-channel thing was no longer needed, and that the IP-STS was the single authority for identities; however, with only identity-related claims, the RPs would often not have enough information to make an access control decision. Enterprise applications today need thousands of permission-granting attributes to authorize access. With just the global attributes issued by the IP-STS - things such as user ID, first name, last name, address, etc. - many applications would not have enough to determine whether the subject should be allowed access. For this reason, the RPs would need to get additional attributes out of their own databases using the user ID as a key. After doing so, they would have enough information to make their authorization decisions.
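A small sketch of that division of labor, again in Python and again with hypothetical names (`ip_sts_issue`, `rp_authorize`, and a single `role` attribute standing in for an application's real permission model):

```python
def ip_sts_issue(user_id, directory):
    """IP-STS: issue identity-related claims only, the same for every RP."""
    return dict(directory[user_id])


def rp_authorize(identity_token, app_db, required_role):
    """RP: use the user ID claim as a key into its own database, then decide."""
    app_attrs = app_db.get(identity_token["user_id"], {})
    return app_attrs.get("role") == required_role


# Hypothetical data: the central directory and one RP's own database.
directory = {
    "alice": {"user_id": "alice", "first_name": "Alice", "last_name": "Smith"},
    "bob": {"user_id": "bob", "first_name": "Bob", "last_name": "Jones"},
}
payroll_db = {"alice": {"role": "approver"}}

alice_token = ip_sts_issue("alice", directory)
bob_token = ip_sts_issue("bob", directory)
alice_allowed = rp_authorize(alice_token, payroll_db, "approver")
bob_allowed = rp_authorize(bob_token, payroll_db, "approver")
```

Here the token never carries application data; the RP does the extra lookup itself on every authorization decision.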

This architecture is fine, and I imagine many companies will use it. There is one, though, that I think is better.

Architecture 4 - An IP-STS and an RP-STS

An RP-STS is a relying party STS or resource provider STS (depending on who you ask). It is a lightweight STS that is married to an RP. These two components often share a database, but they don't necessarily have to. The RP trusts the RP-STS and the RP-STS trusts the IP-STS. When an unauthenticated user accesses the RP, he is redirected to the RP-STS and again to the IP-STS. There the user logs in, gets an identity token, and sends it back to the RP-STS. This service uses the user ID claim in the token to look up application-specific values that it knows the RP is going to need to authorize access. (It will almost certainly copy the identity-related claims into the new token as well.) It then returns the new token to the subject, who sends it on to the RP. This expanded set of claims is all the RP needs. It doesn't have to query its database; it can authorize the user and return the resource if allowed.
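The token exchange and the trust chain can be sketched like this. It is a deliberately simplified Python illustration under my own assumptions: `Token`, `IpSts`, `RpSts`, and `RelyingParty` are hypothetical classes, and comparing issuer names stands in for the signature verification a real deployment would perform.

```python
from dataclasses import dataclass


@dataclass
class Token:
    issuer: str   # which STS issued the token
    claims: dict  # the claims it carries


class IpSts:
    name = "ip-sts"

    def __init__(self, directory):
        self.directory = directory

    def login(self, user_id):
        # Authenticate the user (omitted) and issue identity claims only.
        return Token(self.name, dict(self.directory[user_id]))


class RpSts:
    name = "rp-sts"

    def __init__(self, trusted_issuer, app_db):
        self.trusted_issuer = trusted_issuer
        self.app_db = app_db

    def exchange(self, identity_token):
        # The RP-STS only accepts tokens from the IP-STS it trusts.
        if identity_token.issuer != self.trusted_issuer:
            raise PermissionError("untrusted issuer")
        # Copy the identity claims, then add application-specific ones.
        claims = dict(identity_token.claims)
        claims.update(self.app_db.get(claims["user_id"], {}))
        return Token(self.name, claims)


class RelyingParty:
    def __init__(self, trusted_issuer):
        self.trusted_issuer = trusted_issuer

    def authorize(self, token, required_role):
        # The RP only accepts tokens from its own RP-STS; no database lookup.
        if token.issuer != self.trusted_issuer:
            raise PermissionError("untrusted issuer")
        return token.claims.get("role") == required_role


ip_sts = IpSts({"alice": {"user_id": "alice", "first_name": "Alice"}})
rp_sts = RpSts(trusted_issuer=IpSts.name, app_db={"alice": {"role": "approver"}})
rp = RelyingParty(trusted_issuer=RpSts.name)

identity_token = ip_sts.login("alice")
rp_token = rp_sts.exchange(identity_token)
allowed = rp.authorize(rp_token, "approver")
```

Note that the RP would reject `identity_token` if it were presented directly, because the RP trusts only the RP-STS; the exchange step is what carries trust from one hop to the next.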

This candidate architecture is common in the literature on this topic and obvious to my heroes - Hervey Wilson, Michele Bustamante, Vittorio Bertocci, Dominick Baier - and other experts in this field; however, it took me a long time to see it :-( Now that I have, I've also seen that it introduces a number of issues related to trust relationships, user provisioning, key management, and more. I'll try to find some time to blog about those in the future; if you're wondering about them before I do, feel free to get in touch with me.