Sunday, December 08, 2013

Media types for APIs

I have previously touched upon the concept of media types (see http://soabits.blogspot.no/2013/05/the-role-of-media-types-in-restful-web.html), but somehow it has always been difficult for me to really nail the concept down in a concise and useful article.

Now the latest discussion about the benefits of hypermedia (see http://soabits.blogspot.no/2013/12/selling-benefits-of-hypermedia.html) got me thinking about media types again - but this time from the perspective of unique service implementations with dedicated clients versus large scale ecosystems of mixed implementations.

As it turns out, media types don't mean sh*t on a small scale. That kind of explains why it has been so difficult to reach any kind of consensus about media types for APIs.

Background


When the discussion touches upon media types the arguments usually follow these lines:

  • Completely generic media types like JSON and XML should be avoided since they do not include any kind of hypermedia elements.
  • One school of thought argues that we should have very few (generic) media types. This is to avoid the need for clients to understand too many media types.
  • Another school of thought argues that we should have many different domain specific media types. Otherwise the client wouldn't know what kind of resource it was looking at.

But, as I said, it really doesn't matter. Both schools are right. At least when you look at unique service implementations with dedicated clients - like for instance dedicated Twitter clients.

Let me give you a concrete example from the Twitter API (see https://dev.twitter.com/discussions/5662): the return value from their oauth/request_token "endpoint" is key/value pairs encoded as application/x-www-form-urlencoded - but the server says it is "text/html", which is clearly wrong. Does that break any client implementations? No. Why? Because all clients are dedicated to the Twitter API; they KNOW about this little peculiarity and have been hard coded to work with it.

My point is:
Media types are irrelevant for unique service implementations with dedicated clients. In this world the client always knows exactly what it is doing and what kind of result to expect from the server (and it can safely ignore the media type).

Media types on a large scale


Let us broaden our view and look at the example of "Big corporation buys smaller companies and the result is a big unruly combination of customers, sales orders and other stuff living on different systems" which I introduced in my previous blog post (http://soabits.blogspot.no/2013/12/selling-benefits-of-hypermedia.html).

Now let's assume our fictive client is handed a link/URL to a customer resource in this mess of a heterogeneous mix of different company resources. The client can issue a GET on the URL and in return it will receive a stream of bytes. How does the client interpret those bytes? Obviously it will depend on the media type. But what kind of media type is useful for this purpose?

Let us assume the client understands a generic (hypermedia enabled) media type like HAL. Together with the GET request the client sends an accept header "Accept: application/hal+json". Luckily the server knows how to serve the customer resource as HAL, so the client gets a HAL document in return.

Now what? We have integrated customer resources from three different organizations and each of them has been encoding customer records in HAL - but in different ways.

For instance: Company X has these customer properties:

{
  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark"
}

while company Y uses these properties:

{
  ID: 1234,
  FirstName: "John",
  LastName: "Larsson",
  Address:
  {
    Address: "Marienborg 1",
    PostalCode: "2830",
    City: "Virum",
    Country: "Denmark"
  }
}

With nothing but this information our client must either give up or do some guessing like "If FirstName is present then assume the format of company Y". So apparently we need a bit more information than we already have.
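
A minimal sketch of that guessing, in Python (the property names are the ones from the two examples above; the heuristic itself is of course fragile, which is exactly the point):

```python
def normalize_customer(doc):
    """Map either company's customer format to one common shape (a guess)."""
    if "FirstName" in doc:
        # Guess: company Y's format - split name, structured address.
        addr = doc["Address"]
        return {
            "id": doc["ID"],
            "name": doc["FirstName"] + " " + doc["LastName"],
            "address": "{0}, {1} {2}, {3}".format(
                addr["Address"], addr["PostalCode"], addr["City"], addr["Country"]),
        }
    # Otherwise guess company X's flat format.
    return {"id": doc["ID"], "name": doc["Name"], "address": doc["Address"]}
```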

Now we can either choose to add some kind of profile to the representation - either as a header or in the payload - or we can use a domain specific media type.

1) A profile in the payload could be done like this:

{
  ID: 1234,
  profile: "http://company-x.com/profiles/customer-care",
  ... other properties ...
}

2) The profile could also be part of the media type, so we would get "application/hal+json;profile=http://company-x.com/profiles/customer-care".

3) A domain specific media type could be something like "application/company-x.customer-care.hal+json" or similar.
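
Seen from the client side, the three options could be told apart roughly like this (a sketch; the profile URL and the domain specific media type name are the hypothetical ones from the examples):

```python
PROFILE = "http://company-x.com/profiles/customer-care"

def identify_profile(content_type, payload):
    """Return the profile of a response, whichever of the three options is used."""
    mime, _, params = content_type.partition(";")
    mime = mime.strip()
    # Option 3: a domain specific media type encodes the profile directly.
    if mime == "application/company-x.customer-care.hal+json":
        return PROFILE
    # Option 2: the profile is carried as a media type parameter.
    for param in params.split(";"):
        name, _, value = param.strip().partition("=")
        if name == "profile":
            return value
    # Option 1: the profile is carried in the payload itself.
    return payload.get("profile")
```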

But which method should we choose? Let's take a look at how the client processes the server response before we answer that.

Processing a server response


There are three things the client must know in order to process a server response correctly:

  1. How to decode the byte stream (generic knowledge).
  2. What the data represents (domain specific knowledge).
  3. How to locate hypermedia elements in the response (generic knowledge).

The media type is obviously the key to decoding the byte stream - it will tell the client whether it is looking at XML, PDF, HTML, HAL, Siren and so on.

The media type should also be the key to locating hypermedia elements in the response.

But what about the domain specific knowledge - should we identify what a resource represents with a domain specific media type or with a profile? Both methods work, but there is one more thing to take into account: making the API explorable by client developers (see http://soabits.blogspot.no/2013/12/selling-benefits-of-hypermedia.html).

It is of course possible to implement a browser for any domain specific media type we can think of, but it would obviously be more practical if we could have one single API browser for all kinds of APIs. For this reason we should avoid domain specific media types. The domain knowledge can then be identified by a profile - either in the payload or in an HTTP header.

Wrapping it all up


As with the hypermedia problem: if you stick to unique service implementations with dedicated clients (like a dedicated Twitter client) then media types are utterly irrelevant. The client can safely assume that there will be one, and only one, representation of whatever kind of resource it is looking for.

But if you take a broader perspective and venture into a highly heterogeneous, loosely coupled, unorganized, incoherent and fragmented ecosystem (also called "The internet") - then you need more domain specific information about the resources - either through domain specific media types, or generic media types with profiles.

My recommendation is:

  1. Use generic media types that include hypermedia elements.
  2. Identify domain specific information through profiles.

The media type will tell the client HOW to decode the byte stream and HOW to interact with the resource. The profile will tell the client WHAT it is looking at.
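
That division of labour can be sketched as two lookup tables: the media type selects a decoder, the profile selects the domain interpretation (the profile URL and the "customer" label are assumptions for illustration):

```python
import json

DECODERS = {"application/hal+json": json.loads}     # generic: HOW to decode
INTERPRETERS = {                                    # domain specific: WHAT it is
    "http://company-x.com/profiles/customer-care": "customer",
}

def process(media_type, profile, body):
    """Decode a response body and classify what it represents."""
    doc = DECODERS[media_type](body)    # HOW: byte stream -> document
    kind = INTERPRETERS[profile]        # WHAT: domain meaning of the document
    return kind, doc
```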

Friday, December 06, 2013

Selling the benefits of hypermedia in APIs

Once more I have found myself deeply engaged in a discussion about REST on the api-craft mailing list (https://groups.google.com/forum/#!topic/api-craft/ZxnLD6q6w7w). This time it started with the question "How do I sell the benefits of hypermedia?". It turned out to be harder to answer than one would expect, but after some time we came up with the list below. But before we get into that I'd better explain "hypermedia" in a few sentences.

The most common use of hypermedia is embedding of links in representations returned from some service on the web. As an example we can look at the representation of a customer record containing a customer ID, customer name, customer contact information and related sales orders. Encoded in JSON we can get something like this:

// Customer record
// URL template: http://company-x.com/customers/{customer-id}
{
  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark"
}

// List of sales orders
// URL template: http://company-x.com/customers/{customer-id}/sales-orders
{
  CustomerId: 1234,
  Orders:
  [
    {
      ID: 10,
      ItemNumber: 15,
      Quantity: 4
    }
  ]
}

These two resources can be found by expanding the customer ID into the URL templates. This requires the client to be hard coded with 1) the URL templates and 2) the knowledge of which values to use as parameters.
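
That hard coded knowledge could look roughly like this (a sketch; Python identifiers cannot contain "-", so the {customer-id} template parameter becomes customer_id here):

```python
# Both the URL templates and the knowledge of which value to expand
# into them live inside the client.
CUSTOMER_TEMPLATE = "http://company-x.com/customers/{customer_id}"
SALES_ORDERS_TEMPLATE = "http://company-x.com/customers/{customer_id}/sales-orders"

def sales_orders_url(customer):
    # The client must also know that the "ID" property is the template parameter.
    return SALES_ORDERS_TEMPLATE.format(customer_id=customer["ID"])
```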

Now, if we embed links in the responses then we can remove the hard coded knowledge of at least the sales orders URL template:

// Customer record
// URL template: http://company-x.com/customers/{customer-id}
{
  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark",
  _links:
  [
    {
      rel: "http://linkrels.company-x.com/sales-orders",
      href: "{link-to-sales-orders}",
      title: "Sales orders"
    }
  ]
}

// List of sales orders
// URL template unpublished
{
  CustomerId: 1234,
  Orders:
  [
    {
      ID: 10,
      ItemNumber: 15,
      Quantity: 4,
      _links:
      [
        {
          rel: "http://linkrels.company-x.com/order-details",
          href: "{link-to-sales-order-details}",
          title: "Sales order details"
        },
        {
          rel: "http://linkrels.company-x.com/item-details",
          href: "{link-to-item-details}",
          title: "Item details (catalog)"
        }
      ]
    }
  ]
}

Notice how links are encoded:

  • Links are always found in collections named _links
  • A single link consists of a link relation identifier "rel", the hypermedia reference "href" and a human readable description "title".
  • Link relations are identified by URLs.
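
Given these conventions, a client can locate a link with a handful of lines (a sketch over the document shape shown above):

```python
def find_link(doc, rel):
    """Return the href of the first link with the given rel, or None."""
    for link in doc.get("_links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None
```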

The question is now: what is gained by adding such links?

Short term effects


1. Explorable API

It may sound trivial but do not underestimate the power of an explorable API. The ability to browse around the data makes it a lot easier for the client developers to build a mental model of the API and its data structures.

Think of it like this: traditionally, as a client developer, you would have to read through a pile of documentation before sitting down to write some test programs more or less blindfolded. After that you run your test programs to see how the API behaves. Then you have to go back to the documentation and read some more - and then back to coding again. This exercise involves constant mental context switches, going back and forth between reading documentation, programming and trying out test programs.

With an explorable API you can simply try out the API and test your understanding of it without any programming. Any mental "what-if" hypothesis about the API can be tested right there, without any additional tools or programming. The data as well as the interaction tools are right there in front of you, reducing the mental hoops you have to jump through to understand the API.

The immediate benefits of an explorable API are perhaps more social than technical. But mind you - a lower barrier of entry means happier client developers, higher API adoption rates and less support, which in the end means fewer support calls to bug YOU at the most inconvenient times.

2. Inline documentation

Did you notice how link relations are identified by URLs? These URLs can point to online documentation where the API elements can be explained.

The immediate benefits of this are also social just like the explorability of the API. It will lower the barrier of entry to understanding the API and improve API adoption by client developers.

3. Simple client logic

A client that simply follows URLs instead of constructing them itself should be easier to implement and maintain. It won't need logic to figure out which values to substitute into what URL templates. All it has to do is identify links in the payload and extract the hypermedia reference URL.

Long term effects


4. The server takes ownership of URL structures

The use of hypermedia removes the client's hard coded knowledge of the URL structures used by the server. This means the server is free to change its URL structures over time when the API evolves without any need to upgrade the clients.

The benefit of this is obviously less coupling between the server and the client, removing the need to upgrade all clients in lock step with the server.

But, you may ask, why should the server change its URL structures? Once the server developers have decided that the URL is /customers/{customer-id}, why should they then suddenly decide to change it? Well, I cannot tell you what will change in your API, but here are two examples:

- A resource grows too big. Over time it has been necessary to add more and more features to a single resource and one day it simply becomes too big to handle. So it is decided to split it into multiple sub-resources with new URL structures.

- It turns out that some resources require bits and pieces of information from other resources when the client accesses them. It can for instance be an access token of some kind that needs to be generated in one place and passed to another resource. With a traditional API the client has to be upgraded with this kind of business logic. With a hypermedia API the client can ignore this complexity and leave it to the server to add the desired parameters to the links it generates.

5. Off-loading content to other services

Consider how APIs evolve: after some time you figure out that some of the content should be off-loaded to a Content Delivery Network (CDN). This means new URLs that point to completely different hosts all over the internet. The actual URL cannot be hard coded into the client since it may change over time or contain random pieces of server generated information for the CDN (like for instance some kind of access token). Now the server HAS to embed the URLs in the responses and the client HAS to follow them.

6. Versioning with links

With a hypermedia API it becomes trivial to implement new versions of the API resources without breaking existing clients: old clients will follow existing link relations to old-style resources whereas new clients will know how to follow new link relations to new resources - as long as the server response includes both the old as well as the new links.
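
A client that knows both link relations could negotiate versions roughly like this (a sketch; the v2 relation name is a made-up example, not something from the text above):

```python
OLD_REL = "http://linkrels.company-x.com/sales-orders"
NEW_REL = "http://linkrels.company-x.com/sales-orders-v2"   # hypothetical new relation

def pick_sales_orders_link(links):
    """Prefer the new relation when the server advertises it, else fall back."""
    by_rel = {link["rel"]: link["href"] for link in links}
    return by_rel.get(NEW_REL) or by_rel.get(OLD_REL)
```

Old clients simply never look for the new relation, so they keep working unchanged.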

Hypermedia also allows the server to re-implement an existing resource with a completely different technology stack, on a completely different server, without the client ever noticing it - given, of course, that the new implementation doesn't make any breaking changes.

If you want to read more about versioning then take a look at Mark Nottingham's "Web API versioning smackdown" at http://www.mnot.net/blog/2011/10/25/web_api_versioning_smackdown

Large scale effects


7. Multiple implementations of the same service

So far we have only looked at clients dedicated to a unique implementation of a single service. That could for instance be something like a dedicated Twitter client. But where hypermedia really excels is when we start to work with multiple independent implementations of the same service.

Let us try to broaden the scene: think of a big corporation that has engulfed and bought up a lot of smaller companies. All of these smaller companies have their own server setup with lists of inventory, sales orders, customers and so on ...

Now I give you the ID 4328 of customer John Burton ... how will you be able to find the resource that represents the contact information of this customer, when it can live on any one of a dozen servers?

Solution 1: We need a central indexing service that allows the client to search for the customer with ID 4328. But how can the indexing service tell the client where the resulting customer record resides? The answer is simple: the response has to contain a link to the customer record resource.

Solution 2: Don't use IDs like 4328 at all. Always refer to resources with their full URLs.

Either way, the client won't know anything about the URL it gets in return - all it has to do is to trust the search result and follow the link.

And now that we have some opaque, meaningless URL to our customer record information, how do we get to the sales orders placed by said customer? We could take the customer ID again and throw it into some other indexing service and get a new URL out of it - or we could follow a "sales-orders" link relation embedded in the customer information.

The point is:
When you transcend from unique one-off service implementations with dedicated clients to multiple independent service implementations with a variety of clients then you simply have to use hypermedia elements.
This also means that it can be difficult to sell hypermedia to startup APIs, since hypermedia won't add much benefit to one single API living on an isolated island with no requirement to co-exist and co-operate seamlessly with other similar services.

Other large scale effects

Hypermedia solves some of the problems related to large scale service implementations as I have just argued. But there are a few more issues to be solved in order to decouple clients completely from specific server implementations; one is related to how the client understands the result (media types) and one is related to error handling.

I have already written about error handling in http://soabits.blogspot.no/2013/05/error-handling-considerations-and-best.html where the last section discusses error handling on a larger scale.

Unfortunately I have yet to write an article explaining my current view on media types - until then you can either check this post in api-craft https://groups.google.com/d/msg/api-craft/5N5SS0JMAJw/b0diFRzopY0J or read my ramblings about media types and type systems http://soabits.blogspot.no/2013/05/the-role-of-media-types-in-restful-web.html.

UPDATE (December 8th 2013): I have just added a blog post about media types: http://soabits.blogspot.no/2013/12/media-types-for-apis.html

Acknowledgments

Thanks to Mike Kelly for his initial blog post on this topic: http://blog.stateless.co/post/68259564511/the-case-for-hyperlinks-in-apis - and his work on hypermedia linking in JSON with HAL (http://stateless.co/hal_specification.html).


Wednesday, October 02, 2013

URL structures and hypermedia for Web APIs and RESTful services

A recurring theme on various mailing lists is that of choosing the "right" URL structure for a specific kind of web API. In this post I will present my view on this issue, based on various input from for instance "API-craft" (https://groups.google.com/forum/#!forum/api-craft).

First of all, let me hammer it in: URL structuring has absolutely nothing to do with REST. Period. REST is not concerned about URL structures - in REST a URL is an opaque string of characters with no meaning beyond the fact that it is both an identifier and a resource locator. An on-line web service doesn't become a RESTful service just because it has a nice pretty looking URL structure. There is simply no such thing as "A RESTful URL".

What this means is that a URL like http://geo.com/countries/usa/states/nevada is just as "RESTful" (or non-RESTful) as http://geo.com/states/nevada, http://geo.com/states/321, http://geo.com/states?id=321 and http://geo.com/foo-bar-U7q. The URL structure simply doesn't matter in REST.

But from a human point of view it helps understanding if the API has some kind of meaningful URL structure. Computers may easily ignore URL structures, but as humans we tend to look at URLs and try to infer meaning from them. Thus, having pretty and well structured URLs helps us understand what is going on - not only as client developers but certainly also as server developers, who often have to navigate from URLs to source code.

In order to discuss URL structures we need a domain to model. I think geographical information with countries, states and cities should be easily understood by most, so let's try that. I will ignore the fact that many countries don't have states ;-)

URLs as identifiers for entities

Our geographical domain easily lends itself to some kind of hierarchical structure of countries / states / cities. So intuitively we reach out for a hierarchical URL structure as shown below (for the sake of clarity I will ignore the host name and only show the path element of the URL). Let us try to build the URL for the city of Las Vegas in Nevada, USA:

  /countries/USA/states/Nevada/cities/Las+Vegas

That would work, but think a bit about it: what if the state of Nevada had more than one city called Las Vegas? How would we be able to distinguish between the two cities? The problem here is that we confuse searching for a city named Las Vegas in Nevada, USA with the concept of identifying a specific city.

I believe that it is fair to assume that most geographical systems will have some kind of backend that assigns unique identifiers to all of its entities. These may be integers, GUIDs or strings with composite keys - but in the end it boils down to a sequence of characters that uniquely identifies the entity in the system.

So let us assume that the well known city of Las Vegas is identified by the integer 82137 which is a unique city number. It may happen to be the same number which is used for a country or a state, but in the context of cities it is unique.

The same goes for countries and states: USA has the ID 54 and Nevada is identified by 7334. Now we get the URL:

  /countries/54/states/7334/cities/82137

But what happens if some client decided to look up this URL with mismatching IDs:

  /countries/54/states/8112/cities/82137

Well, that should be considered a non-existing resource and the server should return HTTP code 404 Not Found.

But why bother at all with the overhead of checking state, country and city IDs when the city ID uniquely identifies the city? It would be easier for all parties if only the city ID was needed in the URL:

  /cities/82137

Now the server can do one single lookup by the ID to see if the referenced city exists. No need for any additional checking for matching state and country.
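
Server side, that single lookup is just one dictionary (or database) access deciding between 200 and 404 (a sketch with made-up data, reusing the Las Vegas ID from above):

```python
# Stand-in for the backend's city table; IDs are globally unique.
CITIES = {82137: {"Name": "Las Vegas", "StateId": 7334}}

def get_city(city_id):
    """One lookup by ID - no state/country consistency checking needed."""
    city = CITIES.get(city_id)
    if city is None:
        return 404, None
    return 200, city
```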

The same logic can be applied to states (countries are trivial), so we end up with the following canonical URL structures for countries, states and cities:

  /countries/{country-id}
  /states/{state-id}
  /cities/{city-id}

Should it happen that the server doesn't assign unique IDs to cities (or states), and really needs the state reference for a city, because two cities in different states may have the same (non-unique) ID, then we must include both in the URL:

  /states/123/cities/77 => Rome in Italy (assuming some state in Italy is identified by 123)
  /states/432/cities/77 => Rome in the state of New York

In the rest of this post I will assume that all cities and states have "globally" unique IDs.

Finding the right ID with UI dropdowns

But how does the client know what ID to use, you may ask? This depends on the application, but let's take the scenario where an end user needs to get information about the city of Las Vegas (while still assuming that Nevada may have two cities called Las Vegas).

The UI could be structured by three dropdowns: one for countries, one for states in the selected country and one for cities in the selected state. To present such a UI for the end user we first need to be able to get the list of all countries. The obvious choice for this resource is /countries. Then, for the selected country we need the list of states. The obvious choice here is /countries/{country-id}/states.

But what about the list of all cities for a specific state in a specific country? Let us avoid the trap of a hierarchical URL with multiple IDs and use the short /states/{state-id}/cities.

So now we have the following resources representing lists of geographical items:

  /countries
  /countries/{country-id}/states
  /states/{state-id}/cities

Each of these resources returns a JSON list as shown below and from this list the UI can easily build a dropdown element for selecting a city:

[
  { Name: "Item name A", ID: xxx },
  { Name: "Item name B", ID: yyy }
]


In this way the client gets the unique city ID by letting the end user select a city and its corresponding ID.

Query by text search

Another approach could be to use textual searching where the end user enters a query text like "Las Vegas, USA" (which is how Google Maps works). This would require a new query resource:

  /cities?query=Las+Vegas,+USA

The result would be a list of matching cities:

[
  { Title: "Las Vegas, County A, Nevada, USA", ID: 16352 },
  { Title: "Las Vegas, County B, Nevada, USA", ID: 82137 }
]

Now the end user can select one of the results and thus get the ID of the city.

Adding hyper media, getting closer to REST

The previously mentioned approaches require the client to create URLs by combining URL templates with IDs. This means the client has to be hard coded with the URL templates - and the consequence is a tight coupling to the URL structure of the web API.

But it is very easy to avoid this kind of URL coupling by using hypermedia elements in the returned representations. Take for instance the list of cities matching the text "Las Vegas, USA"; here we can include the actual city URLs in the response instead of requiring the clients to construct the URLs themselves:

[
  {
    Title: "Las Vegas, County A, Nevada, USA",
    ID: 16352,
    CityLink: "http://.../cities/16352"
  },
  {
    Title: "Las Vegas, County B, Nevada, USA",
    ID: 82137,
    CityLink: "http://.../cities/82137"
  }
]


Now we can start talking about a RESTful service instead of a static web API: by including hypermedia elements we allow the server to include links to other hosts that might be better suited to represent cities:

[
  {
    Title: "Las Vegas, County A, Nevada, USA",
    CityLink: "http://other-geo-service/jump.aspx?type=city&ID=16352"
  },
  {
    Title: "Las Vegas, County B, Nevada, USA",
    CityLink: "http://geo.com/cities/82137"
  }
]


By including links we have stopped worrying about URL structures and have come one step closer to a RESTful service.

The upside is looser coupling to server URL structures, simpler client logic and enabling the use of different services on different servers. The downside is a larger payload with bigger URLs than simple IDs.

Filtering

So far we have looked at hierarchical data with some obvious URL structures. But what if we need to get the list of cities with a population of more than 200000 citizens? And what if we only want cities from the state of Massachusetts?

There are many different ways to do this depending on the complexity of the filtering. But it may be fine to start out with simple queries like "All cities in (Massachusetts or New York)"; first we need to use state IDs and thus we get "All cities in states (2321, 2981)". Such simple integer IDs can be separated with commas, so one possible URL structure could be:

  /cities?states=2321,2981

It is also possible to encode an SQL like query language in one single parameter:

  /cities?where=state+in+(2321,2981)+and+population+greater-than+200000

The possibilities are endless, but it usually consists of a path like /cities that identifies the type of query together with some set of URL parameters encoding the query specification.

A common solution is to interpret "&" as AND and "," as OR when possible. So for instance /cities?states=2321,2981&size=large,huge would mean "All cities where state is either (Massachusetts OR New York) AND size is either (large OR huge)".
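
That convention is easy to implement on the server side; a sketch of parsing such a query string into an AND-of-ORs structure:

```python
from urllib.parse import parse_qs

def parse_filter(query_string):
    """Turn states=2321,2981&size=large,huge into AND-of-ORs:
    each parameter is an AND term, each comma separated value an OR branch."""
    parsed = parse_qs(query_string)   # splits on "&" only, not on ","
    return {name: values[0].split(",") for name, values in parsed.items()}
```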

And no discussion about filtering without mentioning OData's URL conventions: http://www.odata.org/documentation/odata-v3-documentation/url-conventions/

Handling large input filters

URLs for filtering may become rather large, so another recurring question is "How do I handle filter strings too large for a URL?". The recommended solution is to POST the filter to a query resource, for instance like this:

  POST /city-filters
  Content-Type: application/x-www-form-urlencoded

  where=state+in+(2321,2981)+and+population+greater-than+200000


The server then creates a temporary resource for this query and returns a redirect to it:

  201 Created
  Location: /city-filters/9638


The client can then GET /city-filters/9638 to get the result of the query.

A nice side effect of this is that the created filter resource can be cached to avoid re-calculating the potentially very slow query on the server.
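
The POST-then-GET flow can be sketched with an in-memory stand-in for the server's temporary filter resources (the first generated ID is chosen to match the example above):

```python
import itertools

class FilterStore:
    """Stands in for the server's temporary /city-filters resources."""
    def __init__(self):
        self._filters = {}
        self._ids = itertools.count(9638)   # first ID matches the example

    def post_filter(self, where):
        # POST /city-filters -> 201 Created + Location header.
        filter_id = next(self._ids)
        self._filters[filter_id] = where
        return 201, "/city-filters/{0}".format(filter_id)

    def get_filter(self, filter_id):
        # GET /city-filters/{id} -> the stored (and cacheable) query.
        return self._filters[filter_id]
```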

Natural keys, surrogate keys, URL aliases and resource duplication

A common question relates to the use of natural keys versus surrogate keys in URL construction. It is more or less the same discussion as we see with databases (see for instance http://www.agiledata.org/essays/keys.html). Examples of natural keys could be order numbers, e-mails, postal codes, social security numbers and phone numbers.

When choosing between natural keys versus surrogate keys you should consider the lifespan of the key; URLs are supposed to be stable over a very long period of time, so do not choose keys that vary over time. For instance, do not use phone numbers and e-mails to identify people, since people tend to change these during their lives.

You should also beware of natural keys which can be used by more than one entity. It is for instance (still) common for some members of a family to share an e-mail address, so e-mails are not good candidates for identifying persons. Even social security numbers may sometimes change. In Denmark for instance a person may get a new social security number if they change gender.

A valid natural key could be a sales order number since these are supposed to be both unique and stable.

But if we introduce natural keys, should we then only use natural keys? What if an entity has both a natural key and an internal surrogate key? You can use both but you should decide on one being the canonical ID and avoid duplicating resources by using HTTP redirects for the secondary keys.

Take for instance a sales order with the order number SK324-1 and internal surrogate key 887766 - if we consider the order number as the canonical ID then we can use these URL structures:

  /orders/SK324-1  =>  returns order representation
  /orders/id/887766 => redirects to /orders/SK324-1

Redirects should be done using the HTTP status code 303 See Other with a Location header containing the canonical URL.
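
Server side, the surrogate key lookup then reduces to a redirect (a sketch using the example order SK324-1 / 887766):

```python
# Maps internal surrogate keys to the canonical natural key (order number).
SURROGATE_TO_NATURAL = {887766: "SK324-1"}

def get_order_by_surrogate(surrogate_id):
    """Handle GET /orders/id/{surrogate-id}: 303 See Other to the canonical URL."""
    natural = SURROGATE_TO_NATURAL.get(surrogate_id)
    if natural is None:
        return 404, None
    # 303 See Other with the canonical URL in the Location header.
    return 303, "/orders/{0}".format(natural)
```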

As stated earlier: do not confuse searching with identity. You may want to search for a person with a specific e-mail, but the result should include the canonical URL of the found person.

See also http://www.w3.org/TR/webarch/#uri-aliases for a discussion of URL aliases and duplication.

Relations and back-references

What if we want back-references and other relations to other resources - does that influence the URL structure? For instance, now that we have links to states in a country, we might also want links to the country to which a state belongs. That might lead to something like this:

  /states/{state-id}/country

But, wait a minute, we already have links to countries, right? The canonical version, however, is /countries/{country-id}, so how do we get from /states/{state-id}/country to /countries/{country-id}? The obvious answer is to consider the /states/{state-id}/country URL as an alias for some country and use HTTP redirects to get to the canonical country URL.

But let's step back and take a broader look at relations in general; a back-reference is just one kind of relation from one resource to another - but we could have many other kinds of relations, like "neighbor states", "the country of a city", "statistical information about a state" and so on. The general solution to this concept is to include links in the payloads instead of creating a myriad of small alias resources that only redirect to canonical URLs.

So, instead of using /states/{state-id}/country for the country of a certain state, we include the canonical country link in the representation of the state:

  GET /states/4321

  returns =>

  {
    Name: "State X",
    CountryLink: "http://.../countries/1234",
    NeighborStatesLink: "http://.../states/4321/neighbors"
  }


Static data, volatile data and caching

Sometimes we end up with some sort of "hotspot" resource with tons of requests and very volatile content, making it impossible to cache the result and improve performance that way. A solution to this may be to split the resource into two (or more) different sub-resources: a cacheable resource and a volatile non-cacheable resource.

Take for instance our state resource at /states/{state-id} - it may contain some very static data like the name of the state, its area and the like, plus some volatile data like the number of Tweets tweeted from that state in the last ten minutes. The static information could easily be cached, but we have no way to do so since the complete resource also contains the number of Tweets.

The solution is straightforward: split the resource into two different resources:

  /states/{state-id} => static state information
  /states/{state-id}/tweet-stats => volatile tweet information

I'll admit that the above example is rather contrived, so let's try a more realistic example: a streaming music distribution network publishes information about its songs through an online web API. Each song has its own resource representation with details about the song. The title, lyrics, artist and the like won't change much (if ever), but the company also publishes the number of current listeners, which changes all the time. To improve caching characteristics the song data is split into (at least) two different resources:

  /songs/{song-id} => static song details (cacheable)
  /songs/{song-id}/usage => volatile usage information (non cacheable)
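The point of the split is that the two resources can now carry different caching instructions. A sketch (the one-hour lifetime is an arbitrary choice for illustration):

```python
# Illustrative Cache-Control headers for the two split resources.
def cache_headers(resource_path):
    if resource_path.endswith("/usage"):
        # Volatile usage statistics: tell caches never to store this
        return {"Cache-Control": "no-store"}
    # Static song details: safe to cache, here for one hour
    return {"Cache-Control": "public, max-age=3600"}
```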

But who says the song usage is published by the same API? Some time after the initial release of the web API the company off-loads some of the streaming to another content delivery network which will also deliver the usage statistics. Now suddenly not only the URL structure changes - even the host name changes:

  /songs/{song-id} => static song details (cacheable)
  http://cdn.com/acme/file-usage/{song-id} => volatile usage information (non cacheable)

This is a breaking change and all clients must now be upgraded. Had the API instead contained hyper links then the change would have been transparent to all clients.

Classic song representation:

{
  Id: 1234,
  Name: "My song"
}


Hyper media improved representation:

{
  Id: 1234,
  Name: "My song",
  UsageLink: "http://cdn.com/acme/file-usage/1234"
}


Once again we see how unimportant the actual URL structure is when we start using hyper media elements in the responses.
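A client that reads the usage URL from the representation, instead of constructing it from the song id, survives the host change untouched. A small sketch (the original API host name is a hypothetical example):

```python
def usage_url(song):
    """The client follows the link it was given - it never builds
    the usage URL from the song id itself."""
    return song["UsageLink"]

# Works the same whether usage lives on the original API host ...
song_v1 = {"Id": 1234, "Name": "My song",
           "UsageLink": "http://api.example.com/songs/1234/usage"}
# ... or has moved to the CDN - no client code changes needed.
song_v2 = {"Id": 1234, "Name": "My song",
           "UsageLink": "http://cdn.com/acme/file-usage/1234"}
```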

Formats and content types

If the same resource can be found in different formats (encoded with different media types) then we can ask ourselves: should URLs end in .json, .xml or similar extensions? On one side it makes it easy to explore the different representations using a standard web browser - on the other side it introduces different URL aliases for the same resource.

My recommendation is to implement the extensions as a convenience for the client developers, but avoid using them when interacting with the API "for real". If for instance our geographical API can return both JSON as well as XML and HTML then I would use these URLs for states:

  /states/{state-id} => canonical URL used in all returned hyper media elements
  /states/{state-id}.json => JSON representation of state
  /states/{state-id}.xml => XML representation of state
  /states/{state-id}.html => HTML representation of state

The canonical URL would also support standard HTTP content negotiation for JSON, XML and HTML representations of the exact same resource. The framework I use, OpenRasta, supports this dual type of "content negotiation" right out of the box with no implementation overhead.
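The dual scheme can be sketched as a small selection function: an extension wins when present, otherwise the Accept header is consulted. This is only an illustration of the idea; the fallback default is an arbitrary choice for the sketch:

```python
# Sketch: map convenience extensions to media types, falling back to
# standard content negotiation on the canonical (extension-less) URL.
EXTENSION_MEDIA_TYPES = {
    ".json": "application/json",
    ".xml": "application/xml",
    ".html": "text/html",
}

def select_media_type(url_path, accept_header):
    for ext, media_type in EXTENSION_MEDIA_TYPES.items():
        if url_path.endswith(ext):
            return media_type
    # Canonical URL: honour the Accept header (first supported type wins)
    for candidate in accept_header.split(","):
        candidate = candidate.split(";")[0].strip()
        if candidate in EXTENSION_MEDIA_TYPES.values():
            return candidate
    return "application/json"  # arbitrary default for this sketch
```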

If our resources have different variations then we can add them as "sub resources" of the primary resource (not that such a thing really exists, since URLs are opaque strings). Where I work we have resources for documents in a case management system. These resources contain meta data about the document (title, owner and so on) - and then we have various other (sub) resources for the documents themselves: the raw binary document (image, PowerPoint, PDF etc.), a PDF replica of the document and a PDF replica with an added front page containing the document meta data. Thus we get these URLs:

  /documents/{doc-id} => canonical document meta data URL
  /documents/{doc-id}/pdf => PDF replica
  /documents/{doc-id}/meta-pdf => PDF replica with meta data frontpage

Use NOUNS not VERBS

I think most people get this right nowadays: URLs should be NOUNS not VERBS. Avoid URLs like /getOrders and /updateCountry - use all of the HTTP verbs instead when interacting with the resources and use something like /orders/{order-id} and /countries/{country-id} for the URLs. If you run out of HTTP verbs then invent new resources.
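The noun-based scheme can be sketched as a single resource handler where the operation is selected by the HTTP verb rather than by the URL (handler name and response shapes are illustrative):

```python
# Sketch: one noun-style resource, all operations expressed through HTTP
# verbs instead of verb-URLs like /getOrders or /updateCountry.
def handle_order(method, order_id, body=None):
    if method == "GET":
        return 200, {"OrderId": order_id}          # read the order
    if method == "PUT":
        updated = {"OrderId": order_id}
        updated.update(body or {})
        return 200, updated                        # replace the order
    if method == "DELETE":
        return 204, None                           # delete - never via GET
    return 405, None                               # Method Not Allowed
```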

In this way you will avoid the trap of doing something horrible like this, where order number 1234 is deleted when you GET the resource:

  GET /orders/1234/delete

You also get the ability to identify all your resources and add caching, which is not possible with this sort of old school SOAP'ish look-up mechanism:

  POST /orders
  Body => { OrderId: 1234, Operation: "read" }


And you get a nice explorable and consistent API that your developers will love to use :-)

Versioning

Where should API version numbers go in the URL? Should it be /api/v1/countries, /countries-1 or maybe in the host name http://v1.api.geo.com/countries?

Well, API and URL versioning is a whole story in itself so I suggest you take a look at Mark Nottingham's excellent "API versioning smackdown" (http://www.mnot.net/blog/2011/10/25/web_api_versioning_smackdown) for a good discussion on this subject.


Have fun, hack some code and create beautiful APIs out there :-)

/Jørn

onsdag, juni 05, 2013

Ramone 1.2 released

This new version of Ramone adds a few utility methods to the existing library:
  • Introducing cache headers on Request object:
      Request.IfModifiedSince()
      Request.IfUnmodifiedSince()
      Request.IfMatch()
      Request.IfNoneMatch()
  • Introducing .NET cache policy on Session and Service:
      Session.CachePolicy
      Service.CachePolicy
  • Adding Request.OnHeadersReady() for working with the underlying HttpWebRequest object.
  • Adding Request.AddQueryParameters() as an alternative to binding with predefined URL templates.
  • Adding XML settings in XmlConfiguration.XmlReaderSettings. Used when deserializing XML documents. Default is to allow DTD processing.
Ramone is a C# client side library for easily consuming web APIs and REST services. It is available on GitHub: https://github.com/JornWildt/Ramone and as a NuGet package: https://nuget.org/packages/Ramone/1.2.056.

Have fun, Jørn

fredag, maj 17, 2013

The role of media types in RESTful web services

One of the never ending discussions in the REST community is that of custom and domain specific media types; should we, or should we not, create new media types - and if we should, for what reasons should it be done?

In this blog post I will discuss the role of media types in web services and illustrate it with an example media type. I will go through the requirements for this media type and from this I will build up the features it needs to support. Together with this I will show some example scenarios and sketch out the processing algorithm for the client side. Finally I compare this media type to other similar media types (HAL, Siren, JSON API).

My goals for this blog post are:
  1. To improve my own understanding of the role of media types in RESTful web services - and share that with others.
  2. To define a new media type for what I call systems integration - and show how it facilitates loose coupling between the integration components.

By systems integration I mean the kind of background processing that takes place behind the scenes in almost any IT enabled business today; shuffling data from one system to another in a safe and durable way without any human interaction.

REST seems like a good fit for systems integration. It has a strong focus on loosely coupled systems where servers and clients can evolve independently of each other; if we can leverage that, then the whole ecosystem of multiple servers and clients should be a lot easier to maintain, with much less downtime required for upgrading the various components.

There is an ongoing trend to include hyper media controls in newer web services; that is a good trend as it removes the client's dependency on specific URL structures. This in turn allows the server to evolve by adding new resources and linking to these - and it also facilitates the ability to use multiple servers without the clients ever noticing (since the client does not care about either URL path structures or host names).

But there is still a thing missing in the puzzle. In Roy Fielding's (in)famous rant "REST APIs must be hypertext-driven" he states:

... Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type

... From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations

Especially the last statement is interesting: "all application state transitions must be driven by client selection of server-provided choices". This means the client should not make any requests without first being instructed to do so (and how to do it). The client should not POST a new Tweet, bug report or similar without being instructed, on the fly, by some mechanism embedded in the server responses. Today's use of links in responses is on the right track, but links do not inform the client about what HTTP method to use (GET is assumed), and neither do they say anything about the possible payload.

With this blog post I will try to explain how a media type, with a sufficient number of hyper media controls, together with some intelligent client side code, can enable what Fielding is describing. The downside of this approach is that client implementations become more complex - the upside is that the whole client/server application becomes much more loosely coupled which, in the end, hopefully will help us reach a maintenance Nirvana of loosely coupled systems integration :-)

By the way, I am not comparing REST with SOAP/WSDL and EDA (event driven architectures) - that is not the purpose here even though these are often found in systems integration projects. I would rather just explore what benefits we can get from REST.

Media type requirements and constraints

The primary driver for this new media type is loose coupling, where the clients depend only on the media type and some out-of-band business specific data structures and identifiers. This means:

  • The client must not make any assumptions about URL structures.
  • The client must not make any assumptions about what concrete service implementation it is interacting with.
  • The client must not initiate any HTTP request without following instructions embedded in server responses (besides the initial request).
  • The client should not be given more than:
    • A root URL from which all other resources must be discovered at runtime.
    • A set of business specific data structures.
    • A set of well known identifiers for locating hyper media controls and business data.
The media type itself must be generic with respect to the business domain; it must not contain references to concepts like medical records, e-commerce and so on.

The media type must be rich enough in terms of hyper media affordances to enable all the operations needed for systems integration.

The media type does not need to include much, if anything, in terms of UI elements since it is intended for operations without human interaction. Neither is the media type intended for mobile use, where bandwidth and message size are a concern.

The media type will be based on JSON. It could just as well be based on XML but, in my experience, JSON is a lot simpler to work with, fits the data needs I have met, and has a simple and easy-to-work-with patch format (application/json-patch) which will come in handy later on.

Armed with these constraints and requirements we are ready to build up our new media type.

Example business domain "BugMe"

Throughout this blog post I will use the imaginary open standard "BugMe" for interacting with bug tracking systems through the new media type. BugMe supports adding new bug reports, attaching documents to reports, adding comments to reports and similar features shown later on.

BugMe is not a part of the media type specification - it is only used to illustrate how the media type facilitates interaction with BugMe servers anywhere on the web.

Neither is BugMe a vendor specific "standard", it is strictly defined in terms of the generic media type and a set of bug reporting specific data structures and identifiers (more on that later on).

Compare this to APIs like Twitter and others; these are always defined in terms of vendor specific resources and explicit URL structures and were never designed to be implemented on servers anywhere else on the web.

To highlight the difference between a standard like BugMe and an actual implementation I will assume that some clever guy named Joe, who studies computer science 101 at Example.edu, has set up a BugMe server for some local study project. He is using an implementation that uses a vocabulary slightly different from BugMe - it talks about "issues" where BugMe talks about "bug reports". This fact is illustrated through the concrete URLs used in the examples. The root URL is http://example.edu/~joe/track.

Example 1 - Creating a bug report

The first thing we will try is to create a new bug report with BugMe. To do so we must supply our client with a few details about the operation:
  • The root URL: http://example.edu/~joe/track/index.
  • A "create bug report" identifier (as defined by BugMe): "http://bugme.org/names/create-bug-report".
  • Bug reporting data (as defined by BugMe)
    • Title: "Something bad happened",
    • Description: "I pressed ctrl-alt-del and all went black",
    • Severity: 5
We must also have an identifier for the media type. Let's call it "application/razor+json" for no specific reason.
Now we are ready to set our client loose and make it create the bug report. It will do so in the same manner as a human working with a web based UI: get a resource representation, look for well known identifiers that labels data and hyper media controls, fill out data and activate hyper media controls.

This interaction pattern, getting a resource representation and following instructions on the fly, has a price: it requires more complex client side logic than "normal RPC" patterns with design time binding of methods, and it results in higher bandwidth usage due to the embedded hyper media controls. The upside is a much looser coupling between clients and servers. But all of this is of course already discussed in Fielding's thesis on REST ;-)

GET initial resource

At the very beginning our client has nothing to do but GET the root URL in hope of finding something useful there:

Request
GET /~joe/track/index
Accept: application/razor+json

Response
200 OK
Content-Type: application/razor+json

{
  curies:
  [
    { prefix: "bug", reference: "http://bugme.org/names/" }
  ],
  controls:
  [
    ...,
    {
      type: "link",
      name: "bug:create-bug-report",
      href: "http://example.edu/~joe/track/add-issue",
      title: "Add issue to issue tracker"
    },
    ...
  ]
}

The returned JSON data contains two top level properties defined by the media type: curies and controls. "curies" defines short names for URLs used as identifiers in the other elements (see http://www.w3.org/TR/curie/) and "controls" contains various hyper media controls. The use of curies should be optional - but it helps reading the responses in posts like this.
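Curie expansion itself is mechanical. A minimal sketch of how a client could expand a prefixed name into its full URL identifier:

```python
# Sketch of curie expansion as defined by the media type's "curies" element.
def expand_curie(name, curies):
    """Expand e.g. "bug:create-bug-report" to its full URL identifier."""
    if ":" in name:
        prefix, local = name.split(":", 1)
        for curie in curies:
            if curie["prefix"] == prefix:
                return curie["reference"] + local
    return name  # already a full URL, or an unknown prefix

curies = [{"prefix": "bug", "reference": "http://bugme.org/names/"}]
```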

Now the client scans the "controls" element looking for the identifier "bug:create-bug-report". In this case it finds a "link" control, which is equivalent to an ATOM link. Since our client understands all the features of the media type it will know that a link should be "followed" by issuing an HTTP GET on the "href" value.

This little "algorithm" is equivalent to what a human would do: open up a webpage, look for instructions on how to perform the task at hand and then follow them.

You may have noticed the dots "..." in the example. Those are there for a reason: they illustrate how the client only cares about stuff that is relevant to its current task. Anything else in the response is ignored. The consequence is that the server is free to evolve the content of the resource over time without breaking any clients - as long as it only adds new stuff. Neither does the client care if the content is supposed to be a "link page", a service index, a medical record or have any other specific "type" - as long as it contains elements that will help the client getting closer to its goal.
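The "scan for a known name and ignore everything else" behavior can be sketched in a few lines; the sample response mirrors the one above, with an extra control the client does not understand:

```python
# Sketch of the tolerant lookup over the "controls" list.
def find_control(response, name):
    """Return the first control carrying the wanted identifier; unknown
    control types and extra properties are simply ignored."""
    for control in response.get("controls", []):
        if control.get("name") == name:
            return control
    return None

response = {
    "curies": [{"prefix": "bug", "reference": "http://bugme.org/names/"}],
    "controls": [
        {"type": "something-we-do-not-understand"},   # ignored
        {"type": "link", "name": "bug:create-bug-report",
         "href": "http://example.edu/~joe/track/add-issue"},
    ],
}
```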

Follow link

Here we have the next operation:

Request
GET /~joe/track/add-issue
Accept: application/razor+json

Response
200 OK
Content-Type: application/razor+json

{
  curies: ...,
  controls:
  [
    {
      type: "poe-factory",
      name: "bug:create-bug-report",
      href: "http://example.edu/~joe/track/add-issue",
      title: "Create new idempotent POE resource"
    }
  ]
}

Bingo! This time the client finds a "poe-factory" control with the right name "bug:create-bug-report" and now it's time to create the bug report. The control type "poe-factory" means "Post Once Exactly factory" and is a special action element that enables idempotent POST operations. If you do not know what "idempotent" means then take a look at this page: http://www.infoq.com/news/2013/04/idempotent.

The good thing about idempotent operations is that they can safely be repeated if anything goes wrong on the network. If an operation times out the client can simply retry it again without the risk of creating the same entry multiple times. And since this new media type is for safe and durable "behind the scenes" work I find it rather important to include a mechanism for idempotent POST operations.

The implementation chosen here requires the client to do an empty POST first. This will create a new POE resource (thus the name "poe-factory") and redirect the client to it. The client can then POST to the new resource as many times as it needs until the operation succeeds. The server returns "201 Created" the first time it completes the operation, whereas it returns "303 See Other" on subsequent requests. In either case the server includes a "Location" header pointing to the new POE resource.

Subbu Allamaraju has a nice blog post on post once exactly techniques.

I chose this approach for the following reasons:
  • It has the simplest possible client side logic - at the cost of an extra round trip to the server. A similar solution could have required the client to create a GUID (message ID) and include it in the payload somehow, but that would make the protocol slightly more prone to client side errors.
  • It requires no special headers.
  • It adds no extra information to the payload.
  • URLs are opaque and the server gets to choose how the POE/message ID is encoded.
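Put together, the two-step POE interaction can be sketched as client side logic. Here "transport" stands in for a real HTTP client: it takes (method, url, body) and returns (status, headers); everything else is illustrative:

```python
# Sketch of the two-step Post Once Exactly interaction.
def create_via_poe(transport, factory_url, payload, max_attempts=3):
    # Step 1: an empty POST creates the POE resource ...
    status, headers = transport("POST", factory_url, None)
    poe_url = headers["Location"]
    # Step 2: ... which can then safely be POSTed to until it succeeds.
    for _ in range(max_attempts):
        status, headers = transport("POST", poe_url, payload)
        if status in (201, 303):  # created now, or already created earlier
            return headers["Location"]
    raise RuntimeError("POE operation did not complete")
```

Because repeated POSTs to the POE resource are idempotent, the retry loop is safe even if an earlier attempt actually succeeded but its response was lost on the network.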

Create POE resource

In order to complete its task the client first issues an empty POST operation to the URL of the "href" attribute:

Request
POST /~joe/track/add-issue
Content-length: 0

Response
201 Created
Location: http://example.edu/~joe/track/add-issue/bd925-ye174h

GET POE resource

It should be rather obvious now that the client has no choice but to follow the response:

Request
GET /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json

Response
200 OK
Content-Type: application/razor+json

{
  curies: ...,
  controls:
  [
    {
      type: "poe-action",
      name: "bug:create-bug-report",
      documentation: ... some URL ...,
      method: "POST",
      href: "http://example.edu/~joe/track/add-issue/bd925-ye174h",
      target: "application/json",
      scaffold: ... any JSON object ...,
      title: "Add issue"
    }
  ]
}

Now the client gets a response with a "poe-action" control. This tells the client that it can safely POST as many times as it needs to the "href" URL. The actual payload is given by the BugMe specification (Title, Description, Severity).

Some comments on the above response:
  1. The payload is encoded in application/json as a trivial JSON object. Other formats may be included in the media type spec later on.
  2. This format is NOT intended for automatic creation of UIs and thus it contains no UI related list of field definitions or similar.
  3. It is NOT necessary to embed any kind of schema information - that sort of thing is given by the name of the control element.
  4. The optional "scaffold" value is the JSON payload equivalent of a URL template: it supplies default values to some properties and adds additional "hidden" properties the client can ignore (as long as they are sent back).
  5. POE-actions are not restricted to POST - a PATCH with json/patch would work as well (but then perhaps we need to change the action type name).

Create bug report

Then the client issues a new request:

Request
POST /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json
Content-Type: application/json

{
  Title: "Something bad happened",
  Description: "I pressed ctrl-alt-del and all went black",
  Severity: 5
}

Response
201 Created
Location: http://example.edu/~joe/track/issues/32

GET created bug report

Now we are done unless we want to see the actual created bug report by following the Location header:

Request
GET /~joe/track/issues/32
Accept: application/razor+json

Response
200 OK
Content-Type: application/razor+json

{
  curies: ...,
  controls: ...,
  payloads:
  [
    ...,
    {
      name: "bug:bug-report",
      data:
      {
        Id: 32,
        Title: "Something bad happened",
        Description: "I pressed ctrl-alt-del and all went black",
        Severity: 5,
        Created: "2012-04-23T18:25:43Z"
      }
    },
    ...
  ]
}

Now that the client can see the actual bug report it wanted to create, it knows that the task is completed. Everyone smiles and puts on their happy face :-)

Other hyper media controls

There are of course more scenarios to cover than this single "Create stuff" scenario and these scenarios will call for other kinds of hyper media controls, for instance URL templates, PATCH actions, binary file upload and more (I should cover these in some future blog posts ...)

Error handling

If the client receives a 4xx or 5xx status code it can inspect the JSON payload and look for a property named "error" together with the other "payloads" and "controls" properties. The "error" property should contain data according to my previous blog post on error handling.

Here is an example:

Request
POST /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json
Content-Type: application/json

{
  Title: "Something bad happened",
  Description: "I pressed ctrl-alt-del and all went black",
  Severity: 5
}

Response
503 Service Unavailable
Content-Type: application/razor+json

{
  error:
  {
    message: "Could not create new bug report; server is down for maintenance",
    ...
  }
}

In addition to this the client can try to use content negotiation to receive error information in the format of application/api-problem+json.

Client side processing algorithm

Here is a simplified view of how the client should process the content:
  1. GET initial root resource.
  2. [LOOP:] Look for hyper media controls with appropriate names.
  3. Check the type of the found control element:
    1. If it is a "link" then follow that link and restart from [LOOP].
    2. If it is a "poe-factory" then issue an empty POST to the href value and restart from [LOOP].
    3. If it is a "poe-action" then issue a request with the specified method and data encoded according to the "target" media type. Then restart from [LOOP].
  4. Look for a payload with the appropriate name: if it exists then the task is complete - otherwise it has failed (actually I don't like this last step, but that is the only kind of acknowledgement I can see the server responding with).
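The steps above can be sketched in code. As before, "transport" stands in for a real HTTP client, here returning (parsed JSON body, Location header); control and payload names are whatever the service specification defines:

```python
# Sketch of the client side processing loop.
def run_task(transport, root_url, control_name, payload, payload_name):
    body, location = transport("GET", root_url, None)       # step 1
    while True:
        control = next((c for c in body.get("controls", [])  # step 2
                        if c.get("name") == control_name), None)
        if control is None:
            # Step 4: no control to follow - look for the result payload
            for p in body.get("payloads", []):
                if p.get("name") == payload_name:
                    return p["data"]
            raise RuntimeError("task failed - expected payload not found")
        if control["type"] == "link":                        # step 3.1
            body, location = transport("GET", control["href"], None)
        elif control["type"] == "poe-factory":               # step 3.2
            _, location = transport("POST", control["href"], None)
            body, location = transport("GET", location, None)
        elif control["type"] == "poe-action":                # step 3.3
            _, location = transport(control["method"], control["href"], payload)
            body, location = transport("GET", location, None)
```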
A consequence of this approach is that the service specification (BugMe in my example) should state nothing about how to find and update data, since that is up to the server's actual implementation. The service specification should only consider what kind of data to look for or modify. The "how"-part is contained entirely in the returned hyper media controls.

As the media type evolves and more types of hyper media controls are added, the client(s) will grow more and more complex. This is one of the trade-offs that has to be accepted in order to keep clients and servers as loosely coupled as possible.

If the media type gets popular one could even expect to see the same scenario we see with today's web browsers: there will be multiple implementations of the client libraries, and some will implement more of the final specification than others.

No profile needed

It may be tempting to allow for a "profile" parameter with the media type ID. But typically that would be used to ask for a specific "type" of resource, like for instance "application/razor+json;profile=user". As can be seen in the client side processing algorithm above there is no need for such a thing, so let's not introduce it.

Related work

Quite a few other people are trying to create new media types to reach similar goals, but none of them includes features such as POE semantics. The related media types that I am aware of are HAL, Siren and JSON API.
And then there is Jim Webber's fantastic "How to GET a cup of coffee" which has been a big inspiration for me over the years.

Reasons for creating a new media type

How many media types should we invent? Well, as many as needed, I would say. The media type described here includes some features not found in other media types (POE semantics for instance) and that should be sufficient argument for creating a new one.

I don't see anything wrong with creating many media types - eventually a few of them will be good enough and gain enough traction to become ubiquitous standards. That's called evolution.

Summary

In this blog post I have tried to explain one way of understanding the role of media types in RESTful web services and illustrated it by building up (parts of) a media type for systems integration. I have also touched upon the issue of "typed" resources and how to avoid it (by not assuming anything about the resource type and instead looking for certain identifiers in the response) ... there could be one more blog post to come on this issue.

So what do you think? Was this useful, understandable, totally overkill, outright naive or simply a pile of, well, rubbish? Feel free to add a comment, Tweet me or send me an e-mail. I would love to get some feedback.

Happy hacking, Jørn

UPDATE 2014-02-24: I have actually put much of this into a media type called Mason. See http://soabits.blogspot.dk/2014/02/implementing-hypermedia-apis-and-rest.html.

onsdag, maj 15, 2013

Error handling considerations and best practices

A recurring topic in REST and Web API discussions is that of error handling (see for instance https://groups.google.com/d/topic/api-craft/GLz_nNbK-6U/discussion or http://stackoverflow.com/questions/942951/rest-api-error-return-good-practices); what information should be included in error responses, how should HTTP status codes be used and what media type should the response be encoded in? In this blog post I will try to address these issues and give some guidelines based on my own experience and existing solutions.

Existing solutions

Let us first take a look at some existing solutions to get started:
  • The Twitter API uses a list of descriptive error messages and error codes. Twitter has both JSON and XML representations with property names: "errors", "error", "code".
  • The Facebook Graph API has a single descriptive error message, an error code and even a sub-code. Facebook uses a JSON representation with property names: "error", "message", "type", "code" and "error_subcode".
  • The Github API has a top level descriptive error message and an optional list of additional error elements. The items in the error list refer to resources, fields and codes. Github uses a JSON representation with property names: "message", "errors", "resource", "field", "code".
  • The US White House has a set of guidelines for its APIs on GitHub. The error message used here contains the HTTP status code, a developer message, a user message, an error code and links to further information.
  • Ben Longden has proposed a media type for error reporting. This specification includes a "logref" identifier that somehow refers to a log entry on the server side - such a feature can help debugging server errors later on.
  • Mark Nottingham has introduced "Problem Details for HTTP APIs" as an IETF draft. This proposal makes use of URIs for identifying errors and is as such meant as a general and extensible format for "problem reporting".
All of these response formats share some similar content: one or more descriptive messages, status codes and links to further information. But as can be seen there is a wide variety in the actual implementation and wire format.

Considerations and guidelines

So, what should you do with your web API? Well, here are some considerations and guidelines you can base your error reporting format on ...

Target audience

Remember that your audience includes both the end user, the client developer, the client application and your frontline support (which may just happen to be you). Your error responses should include information that caters for all of these parties:
  • The end user needs a short descriptive message.
  • The client developer needs as much detailed information as possible to debug the application.
  • The client application needs error codes (HTTP status codes) for error recovery actions.
  • The frontline support people need detailed information and/or keywords to look for in their knowledge database.

Use the HTTP status codes correctly

The HTTP status codes are standardized all over the web and your clients will know immediately how to handle them. Make sure to use them correctly:
  • Do NOT just return HTTP status code 200 (OK) regardless of success or failure.
  • Use 2xx when a request succeeds.
  • Use 4xx when a request fails and the client should be able to fix it by modifying its own request.
  • Use 5xx when a request fails due to some internal server error.

Use descriptive error messages

Be descriptive in your error messages and include as much context as possible. Failure to do so will cost you dearly in support later on: if your client developers cannot figure out why their request went wrong, they will look for help - and eventually it will be you who spends time tracking down client errors instead of coding new and exciting features for your service.

If it is a validation error, be sure to include why it failed, where it failed and what part of it failed. A message like "Invalid input" is horrible and client developers will bug you for it over and over again, wasting your precious development time. Be descriptive and include context: "Could not place order: the field 'Quantity' should be an integer between 0 and 99 (got 127)".

You may want to include both a short version for end users and a more verbose version for the client developer.

Localization

Error messages for end users should be localized (translated into other languages) if your service is already a multi language service. Personally I don't think developer messages should be localized: it is difficult to translate technical terms correctly and it will make it more difficult to search online for more information.

When localization is introduced it may also be necessary to include language codes and maybe even allow for a list of different translations to be returned in the error response.

Allow for more than one message

Make it possible to include more than one message in the error response. Then try to collect all possible errors on the server side and return the complete list in a single response. This is not always possible - and requires some more coding on the server side (compared to simply throwing an exception first time some invalid input is detected).
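Collecting all errors typically means replacing "throw on first failure" with an error list that is only inspected once all checks have run. A sketch, with made-up field names and rules:

```python
def validate_order(order):
    """Collect every validation error instead of stopping at the first one.
    The fields 'Quantity' and 'Title' and their rules are illustrative."""
    errors = []
    quantity = order.get("Quantity")
    if not isinstance(quantity, int) or not (0 <= quantity <= 99):
        errors.append(
            "The field 'Quantity' should be an integer between 0 and 99 (got %r)"
            % (quantity,))
    if not order.get("Title"):
        errors.append("The field 'Title' must have a value.")
    return errors  # an empty list means the order is valid
```

The server would then return the whole list in one error response instead of forcing the client through a fix-one-resubmit-repeat cycle.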

Additional status codes

If your business domain calls for more detailed information than can be found in the normal HTTP status codes then include a business specific status code in the response. Make sure all of the codes are documented.

You may be tempted to include more technical error codes, but consider who your audience is: it won't help your end user. It may help your client application recover from errors - but probably not in any way that was not already covered by the HTTP status codes. Your client developer may have some need for it - but why make them look up error codes in online documentation when you can include descriptive error text and links that refer directly to the documentation? It may help your support - but if the client developer has enough information in the error response they won't need to call your support anyway - right?

Use letters for status codes

I often find myself searching for online resources that can help me when I get some error while interacting with third party APIs. Usually I search for a combination of the API name, error messages and codes. If you include additional error codes in your response then you might want to use letters instead of digits: it is simply more likely to get a relevant hit for something like "OAUTH_AUTHSERVER_UNAVAILABLE" than "1625".

Include links to online resources

Include links to online help and other resources that will either clarify what went wrong or in some other way help the client developer to solve the problem.

Support multiple media types

If you have a RESTful service that allows both client applications and developers to explore it then you might want to support a human readable media type for your error responses. HTML is perfect for this as it allows the client developers to view the error information right in their browsers without installing any additional plugins. A fallback to plain text could also be useful (but probably overkill).

Include a timestamp or log-reference

It can help support and bug hunting if the error report contains a timestamp (server timezone or UTC). This may help locate the right logfile entries later on.

Another possibility is to include some other kind of information that refers back to the logfiles such that server developers and support people can track what happened.
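Both ideas can be combined: stamp the response with a UTC timestamp and a correlation id that is also written to the server log. A sketch (the key names are made up for illustration):

```python
import uuid
from datetime import datetime, timezone

def error_metadata():
    """Produce a UTC timestamp plus a correlation id; the same id would
    also be written to the server log so support can find the entries."""
    return {
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "logReference": str(uuid.uuid4()),
    }
```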

Field-by-field messages

In some cases it makes sense to be explicit about the fields in the input that caused the errors and include field names in separate elements of the error response. For instance something like this JSON response:

{
  message: "One or more inputs were not entered correctly",
  errors:
  [
    { field: "Weight", message: "The value of 'Weight' exceeds 100 - the value should be between 0 and 100" },
    { field: "Height", message: "A value must be entered for 'Height'" }
  ]
}


This would make it possible for the client to highlight those fields in the UI and draw the end user's attention to them. It is, however, difficult to keep clients and servers in sync, and it requires a lot of coding on both sides to get it to work. Usually field-by-field information is handled by client-side validation logic anyway, so a clear error message like "The value of 'Weight' exceeded 100 - the value should be between 0 and 100" should be enough for most applications.

Include the HTTP status code

This may sound a bit odd, but according to people on api-craft there are some client-side environments where the application code does not have access to the HTTP headers and status codes. To cater for these clients it may be necessary to include the HTTP status code in the error message payload.

Do not include stack traces

It may be tempting to include a stack trace for easier support when something goes wrong. Don't do it! This kind of information is too valuable for hackers and should be avoided.

Implementation

Now that we have our "requirements" ready we should be able to design a useful solution. Let's first try to define the response without considering an actual wire format:

  • message (string): the primary descriptive error message - either in the primary language of the server or translated into a language negotiated via the HTTP header "Accept-Language".
  • messages (List of string): an optional list of descriptive error messages (with the same language rules as above).
  • details (string): an optional descriptive text targeted at the client developer. This text should always be in the primary language of the expected developer community (that would be English in my case).
  • errorCode (string): an optional error code.
  • httpStatusCode (integer): an optional copy of the HTTP status code.
  • time (date-time): an optional timestamp of when the error occurred.
  • additional (any data): a placeholder for any kind of business specific data.
  • links (List of <string,string,string>): an optional list of links to other resources that can be helpful for debugging (but should probably not be shown to the end user). Each link consists of <href, rel, title> just like an ATOM link element.

I have ignored the possibility of having multiple translations of the messages. Neither does this implementation include any field-by-field validation since I expect that to be performed by the client. That doesn't mean the server shouldn't do the validation - it just doesn't have to return the detailed field information in a format readable by the client application.
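The field list above can be captured as a small data model. A sketch in Python; the class names are my own, and only 'message' is mandatory:

```python
from dataclasses import dataclass
from typing import Any, List, Optional

@dataclass
class Link:
    """One link element: <href, rel, title>, just like an ATOM link."""
    href: str
    rel: str
    title: str

@dataclass
class ErrorResponse:
    """The error response fields listed above; all but 'message' optional."""
    message: str
    messages: Optional[List[str]] = None
    details: Optional[str] = None
    errorCode: Optional[str] = None
    httpStatusCode: Optional[int] = None
    time: Optional[str] = None
    additional: Any = None
    links: Optional[List[Link]] = None
```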

JSON format example

Now it is time to select a wire format for the error information. I will choose JSON since that is a widespread and well-known format that can be handled by just about any piece of infrastructure nowadays. The format is straightforward and is probably best illustrated with a few examples:

Example 1 - the simplest possible instantiation

{
  message: "The field 'StartDate' did not contain a valid date (the value provided was '2013-20-23'). Dates should be formatted as YYYY-MM-DD."
}


Example 2 - handling multiple validation errors

{
  message: "There was something wrong with the input (see below)",
  messages:
  [
    "The field 'StartDate' did not contain a valid date (the value provided was '2013-20-23'). Dates should be formatted as YYYY-MM-DD.",
    "The field 'Title' must have a value."
  ]
}


Example 3 - using most of the features

{
  message: "Could not authorize user due to an internal problem - please try again later.",
  details: "The OAuth2 service is down for maintenance.",
  errorCode: "O2SERUNAV",
  httpStatusCode: 503,
  time: "2013-04-30T10:27:12",
  links:
  [
    {
      href: "http://example.com/oauth2status.html",
      rel: "help",
      title: "Service status information"
    }
  ]
}

Client implementation and media types - a matter of perspective

The client implementation should, at a suitably high level, be straightforward:
  1. Client makes an HTTP request.
  2. Request fails for some reason, server returns HTTP status code 4xx or 5xx and includes error information in the HTTP body.
  3. Client checks HTTP status code, sees that it is 4xx or 5xx and decodes the error information.
  4. Client tries to recover from error - either showing the error message to the end user, write the error to a log, give up or maybe retry the request - all depending on the error and the client's own capabilities.
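The four steps above can be sketched as one small function. How the error body is decoded is deliberately left to a callable here, since that is exactly the open question discussed next; 'send', 'decode_error' and 'recover' are hypothetical callables supplied by the client:

```python
def perform_request(send, decode_error, recover):
    """The four-step client flow above, as a sketch."""
    status, body = send()                 # steps 1-2: request and response
    if 400 <= status < 600:               # step 3: inspect the status code
        error = decode_error(body)
        return recover(status, error)     # step 4: show, log, retry or give up
    return body
```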

But, hey, wait a minute ... how does the client know how to decode the payload? I mean, perhaps the client asked for a resource representation containing medical records, but then it got an HTTP status code 400 - how is it supposed to know the format of the error information?

If the client is working with a vendor-specific service, like Twitter or GitHub, then chances are that the client is hard-wired to extract the error information based on the vendor-specific service documentation. My guess is that this is how most clients are implemented.

But what if the client is working with a more, shall we say, RESTful service? That is, the client doesn't know what actual implementation it is interacting with. This could for instance be the case for clients consuming an ATOM feed (application/atom+xml). How would the client know how to decode the error response payload? Actually this seems to be an unanswered question for ATOM, since the spec is rather vague on this point (see for instance http://stackoverflow.com/questions/9874319/how-to-represent-error-messages-in-atom-feeds).

A RESTful service specification may call for a media type dedicated to error reporting; let's call such a media type "application/error+json". When the client receives a 4xx or 5xx HTTP status it can then look at the Content-Type header: if it matches "application/error+json" then the client knows exactly what to look for in the HTTP body.
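Dispatching on the Content-Type header could look like this ("application/error+json" is the hypothetical media type from above, not a registered one):

```python
import json

def parse_error_body(content_type, body):
    """Decode the error body based on the (hypothetical) error media type."""
    media_type = content_type.split(";")[0].strip()  # drop charset parameters
    if media_type == "application/error+json":
        return json.loads(body)       # known structure: decode it fully
    return {"message": body}          # unknown type: keep the raw text
```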

It could also be that the base media type includes a detailed specification of error payloads.

I would prefer one of the last two options: either specify error handling in the base media type of the service - or use an existing standard media type. The latter is actually what Mark Nottingham has done with https://tools.ietf.org/html/draft-nottingham-http-problem-03.

So it is a matter of perspective: vendor specific "one-of-a-kind" services tend to invent their own error formats whereas RESTful services (like ATOM) should standardize error reporting via media types for everyone to reuse all over the web.

Have fun, Jørn