Sunday, December 08, 2013

Media types for APIs

I have previously touched upon the concept of media types (see http://soabits.blogspot.no/2013/05/the-role-of-media-types-in-restful-web.html), but somehow it has always been difficult for me to really nail the concept down in a concise and useful article.

Now the latest discussion about the benefits of hypermedia (see http://soabits.blogspot.no/2013/12/selling-benefits-of-hypermedia.html) got me thinking about media types again - but this time from the perspective of unique service implementations with dedicated clients versus large scale ecosystems of mixed implementations.

As it turns out, media types don't mean sh*t on a small scale. That kind of explains why it has been so difficult to reach any kind of consensus about media types for APIs.


When the discussion touches upon media types the arguments usually follow these lines:

  • Completely generic media types like JSON and XML should be avoided since they do not include any kind of hypermedia elements.
  • One school of thought argues that we should have very few (generic) media types. This is to avoid the need for clients to understand too many media types.
  • Another school of thought argues that we should have many different domain specific media types. Otherwise the client wouldn't know what kind of resource it was looking at.

But, as I said, it really doesn't matter. Both schools are right. At least when you look at unique service implementations with dedicated clients - like for instance dedicated Twitter clients.

Let me give you a concrete example from the Twitter API (see https://dev.twitter.com/discussions/5662): the return value from their oauth/request_token "endpoint" is key/value pairs encoded as application/x-www-form-urlencoded - but the server says it is "text/html", which is clearly wrong. Does that break any client implementations? No. Why? Because all clients are dedicated to the Twitter API; they KNOW about this little peculiarity and have been hard-coded to work with it.
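Here is a minimal sketch in Python of what such a dedicated client effectively does. The function name and the sample token values are made up for illustration; the only thing taken from the Twitter discussion is that the body holds form-encoded key/value pairs while the header claims something else.

```python
from urllib.parse import parse_qs


def parse_request_token(body: str, content_type: str) -> dict:
    """Parse a request-token response the way a dedicated client would.

    content_type is deliberately ignored: the client is hard-coded to
    expect form-encoded key/value pairs, whatever the server claims.
    """
    pairs = parse_qs(body, strict_parsing=True)
    # parse_qs returns a list of values per key; keep the first one.
    return {key: values[0] for key, values in pairs.items()}


# Hypothetical response: the body is form-encoded, the header says text/html.
token = parse_request_token(
    "oauth_token=abc123&oauth_token_secret=s3cr3t",
    content_type="text/html",
)
print(token["oauth_token"])
```

The wrong header never reaches any decision in the code, which is exactly why nothing breaks.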

My point is:
Media types are irrelevant for unique service implementations with dedicated clients. In this world the client always knows exactly what it is doing and what kind of result to expect from the server (and it can safely ignore the media type).

Media types on a large scale

Let us broaden our view and look at the example of "Big corporation buys smaller companies and the result is a big unruly combination of customers, sales orders and other stuff living on different systems" which I introduced in my previous blog post (http://soabits.blogspot.no/2013/12/selling-benefits-of-hypermedia.html).

Now let's assume our fictional client is handed a link/URL to a customer resource in this heterogeneous mess of different company resources. The client can issue a GET on the URL and in return it will receive a stream of bytes. How does the client interpret those bytes? Obviously it will depend on the media type. But which kind of media type is useful for this purpose?

Let us assume the client understands a generic (hypermedia enabled) media type like HAL. Together with the GET request the client sends an Accept header "Accept: application/hal+json". Luckily the server knows how to serve the customer resource as HAL, so the client gets a HAL document in return.
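As a small sketch, this is roughly what the request looks like when built with Python's standard library. The customer URL is a placeholder of my own, and nothing is actually sent over the network here.

```python
from urllib.request import Request

# Hypothetical customer URL; the request is only constructed, not sent.
request = Request(
    "http://example.com/customers/1234",
    headers={"Accept": "application/hal+json"},
    method="GET",
)

print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```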

Now what? We have integrated customer resources from three different organizations and each of these has been encoding customer records in HAL - but in different ways.

For instance: Company X has these customer properties:

  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark"

while company Y uses these properties:

  ID: 1234,
  FirstName: "John",
  LastName: "Larsson",
  Address: "Marienborg 1",
  PostalCode: "2830",
  City: "Virum",
  Country: "Denmark"

With nothing but this information our client must either give up or do some guessing like "If FirstName is present then assume format of company Y". So apparently we need a bit more information than we already have.
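To make the guessing concrete, here is a sketch of such a heuristic in Python. The function name and the return labels are my own; the two records are the examples from company X and company Y.

```python
def guess_customer_format(customer: dict) -> str:
    """Guess which company produced a record from its property names."""
    if "FirstName" in customer and "LastName" in customer:
        return "company-y"
    if "Name" in customer:
        return "company-x"
    return "unknown"


company_x = {"ID": 1234, "Name": "John Larsson",
             "Address": "Marienborg 1, 2830 Virum, Denmark"}
company_y = {"ID": 1234, "FirstName": "John", "LastName": "Larsson",
             "Address": "Marienborg 1", "PostalCode": "2830",
             "City": "Virum", "Country": "Denmark"}

print(guess_customer_format(company_x))
print(guess_customer_format(company_y))
```

It works for these two records, but it is fragile: a third company that also happens to use FirstName would silently be treated as company Y.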

Now we can either choose to add some kind of profile to the representation - either as a header or in the payload - or we can use a domain specific media type.

1) A profile in the payload could be done like this:

  ID: 1234,
  profile: "http://company-x.com/profiles/customer-care",
  ... other properties ...

2) The profile could also be part of the media type, so we would get: application/hal+json;profile="http://company-x.com/profiles/customer-care" (the parameter value has to be quoted since it contains characters like ":" and "/").

3) A domain specific media type could be something like "application/company-x.customer-care.hal+json" or similar.
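Seen from the client, the three options could be handled roughly as in the sketch below. The function name is hypothetical, and I am assuming the payload is JSON with a top-level "profile" property as in option 1. The Content-Type header is parsed with Python's standard email.message module.

```python
import json
from email.message import Message


def describe_response(content_type: str, body: str) -> tuple:
    """Return (media type, profile or None) for a server response."""
    header = Message()
    header["Content-Type"] = content_type
    media_type = header.get_content_type()

    # Option 2: the profile is a parameter on the media type.
    profile = header.get_param("profile")

    # Option 1: the profile is a property in the (JSON) payload.
    if profile is None and media_type.endswith("+json"):
        profile = json.loads(body).get("profile")

    # Option 3: no profile at all; the media type itself is domain specific.
    return media_type, profile


PROFILE = "http://company-x.com/profiles/customer-care"

print(describe_response(f'application/hal+json;profile="{PROFILE}"', "{}"))
print(describe_response("application/hal+json",
                        json.dumps({"ID": 1234, "profile": PROFILE})))
print(describe_response("application/company-x.customer-care.hal+json",
                        '{"ID": 1234}'))
```

Note that with option 3 a generic client has nothing generic left to look at: it must recognize the complete media type name.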

But which method should we choose? Let's take a look at how the client processes the server response before we answer that.

Processing a server response

There are three things the client must know in order to process a server response correctly:

  1. How to decode the byte stream (generic knowledge).
  2. What the data represents (domain specific knowledge).
  3. How to locate hypermedia elements in the response (generic knowledge).

The media type is obviously the key to decoding the byte stream - it will tell the client whether it is looking at XML, PDF, HTML, HAL, Siren and so on.

The media type should also be the key to locating hypermedia elements in the response.
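The two generic steps can be sketched as a lookup keyed on the media type. This is only an illustration: it covers HAL ("_links" object) and Siren ("links" array) in their JSON forms, and the function names, the "orders" link relation and the example URLs are my own assumptions.

```python
import json


def hal_links(doc: dict) -> dict:
    """Collect rel -> [href, ...] from a HAL "_links" object."""
    links = {}
    for rel, value in doc.get("_links", {}).items():
        # A HAL rel maps to one link object or to a list of them.
        items = value if isinstance(value, list) else [value]
        links[rel] = [item["href"] for item in items]
    return links


def siren_links(doc: dict) -> dict:
    """Collect rel -> [href, ...] from a Siren "links" array."""
    links = {}
    for link in doc.get("links", []):
        for rel in link["rel"]:
            links.setdefault(rel, []).append(link["href"])
    return links


# The media type selects how hypermedia elements are located.
LINK_FINDERS = {
    "application/hal+json": hal_links,
    "application/vnd.siren+json": siren_links,
}


def decode(media_type: str, body: bytes) -> tuple:
    """Decode the byte stream and locate its links, based on media type."""
    finder = LINK_FINDERS.get(media_type)
    if finder is None:
        raise ValueError(f"unsupported media type: {media_type}")
    doc = json.loads(body.decode("utf-8"))
    return doc, finder(doc)


body = json.dumps({
    "ID": 1234,
    "_links": {"self": {"href": "/customers/1234"},
               "orders": {"href": "/customers/1234/orders"}},
}).encode("utf-8")

doc, links = decode("application/hal+json", body)
print(links)
```

Nothing in this code knows what a customer is. That part is the domain specific knowledge.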

But what about the domain specific knowledge - should we identify what a resource represents with a domain specific media type or with a profile? Both methods work, but there is one more thing to take into account: making the API explorable by client developers (see http://soabits.blogspot.no/2013/12/selling-benefits-of-hypermedia.html).

It is of course possible to implement a browser for any domain specific media type we can think of, but it would obviously be more practical if we could have one single API browser for all kinds of APIs. For this reason we should avoid domain specific media types. The domain knowledge can then be identified by a profile - either in the payload or in an HTTP header.

Wrapping it all up

As with the hypermedia problem: if you stick to unique service implementations with dedicated clients (like a dedicated Twitter client) then media types are utterly irrelevant. The client can safely assume that there will be one, and only one, representation of whatever kind of resource it is looking for.

But if you take a broader perspective and venture into a highly heterogeneous, loosely coupled, unorganized, incoherent and fragmented ecology (also called "The internet") - then you need more domain specific information about the resources - either through domain specific media types, or generic media types with profiles.

My recommendation is:

  1. Use generic media types that include hypermedia elements.
  2. Identify domain specific information through profiles.

The media type will tell the client HOW to decode the byte stream and HOW to interact with the resource. The profile will tell the client WHAT it is looking at.
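As a closing sketch of the WHAT part: once the generic decoding is done, the profile can select a domain specific reader. The company X profile URL is the one used earlier in this post; the company Y profile URL, the function names and the common customer shape are assumptions of mine.

```python
PROFILE_X = "http://company-x.com/profiles/customer-care"
PROFILE_Y = "http://company-y.example/profiles/customer"  # hypothetical


def from_company_x(doc: dict) -> dict:
    """Company X already has a single Name and a single Address line."""
    return {"id": doc["ID"], "name": doc["Name"], "address": doc["Address"]}


def from_company_y(doc: dict) -> dict:
    """Company Y splits name and address into separate properties."""
    return {
        "id": doc["ID"],
        "name": f"{doc['FirstName']} {doc['LastName']}",
        "address": f"{doc['Address']}, {doc['PostalCode']} "
                   f"{doc['City']}, {doc['Country']}",
    }


# The profile tells the client WHAT it is looking at.
CUSTOMER_READERS = {PROFILE_X: from_company_x, PROFILE_Y: from_company_y}


def read_customer(profile: str, doc: dict) -> dict:
    reader = CUSTOMER_READERS.get(profile)
    if reader is None:
        raise ValueError(f"unknown profile: {profile}")
    return reader(doc)


x = read_customer(PROFILE_X, {"ID": 1234, "Name": "John Larsson",
                              "Address": "Marienborg 1, 2830 Virum, Denmark"})
y = read_customer(PROFILE_Y, {"ID": 1234, "FirstName": "John",
                              "LastName": "Larsson", "Address": "Marienborg 1",
                              "PostalCode": "2830", "City": "Virum",
                              "Country": "Denmark"})
print(x == y)
```

There is no guessing on property names any more: a record with an unknown profile is rejected instead of being misread.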
