Wednesday, July 01, 2015
Introducing the Mason Cook Book
I have started an online book of recipes for building hypermedia APIs with the Mason (hyper)media type. It is very much a work in progress, but hopefully I can add one or two recipes a week on my daily train commute to and from work.
The book source is hosted on GitHub and is made available online at GitBook in various formats. Later on I may host it on another server, depending on how GitBook turns out to work - but so far it has been a nice experience.
You are most welcome to add issues, fork it and send me pull requests for improvements.
Thursday, March 27, 2014
Modelling a Shipment example as a hypermedia service with Mason
Yesterday I attended the "RAML" workshop at the API Strategy conference. In this workshop the speakers introduced a very simple little Web API: it allowed a customer to GET a (shipment) quote and afterwards create (POST) an actual shipment request based on the quote.
I decided that it could be fun to hypermedia-ize the example and show how it could be represented using the media type Mason. So here we go :-)
The first step is to ask for a quote: the customer has a package of some sort and needs to ship it from A to B, so he accesses the shipment service and asks for a quote (a price) given the size, weight, origin and destination of the package.
In the original example you could ask for a quote by issuing a GET request to /quote. But I believe that asking for a quote would result in a concrete quote being created and stored in the system as a separate resource to access later on, either by the customer or by the customer service of the shipment company. So I would rather go for a POST of a quote request followed by a redirect to the newly created quote.
At this point we could either document how to POST such a quote - or we could tell the client how to do it using hypermedia controls - and obviously I would go for the latter. So let's ask the service for instructions and issue a GET /quote request. The output is a Mason document with suitable hypermedia controls embedded in it:
{
  "@namespaces":
  {
    "myth":
    {
      "name": "http://mythological-shipment.com/api/rel-types"
    }
  },
  "@actions":
  {
    "myth:quote":
    {
      "type": "json",
      "method": "POST",
      "href": "http://mythological-shipment.com/api/quote",
      "title": "Ask for a quote",
      "description": "Ask for a quote by posting package details. Weight is in kilograms, volume in cubic decimeters, origin and destination must be known identifiers for airports.",
      "schemaUrl": "... URL to JSON schema describing the request ..."
    }
  }
}
The client reads this action specification, encodes the package details in JSON and POSTs them to the URL in the "href" property. As a result the service creates a new quote resource and directs the client to it:
Request:
  POST http://mythological-shipment.com/api/quote HTTP/1.1
  content-type: application/json

  {
    "weight": 2.3,
    "volume": 4,
    "origin": "CPH",
    "destination": "AMS"
  }
Response:
  HTTP/1.1 201 Created
  Location: http://mythological-shipment.com/api/quotes/myqo-129-gyh
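To make the client side of this step a bit more concrete, here is a minimal Python sketch of how a client might turn such an action into an HTTP request. It is not taken from any Mason library: the function name build_request, the CONTENT_TYPES table and the fallback to POST when "method" is absent are all assumptions made for illustration.

```python
import json

# Hypothetical mapping from the action "type" to a Content-Type header.
# Only the "json" type used in the example above is covered.
CONTENT_TYPES = {"json": "application/json"}


def build_request(action, payload):
    """Turn an action description plus client data into the parts of a request."""
    method = action.get("method", "POST")  # assumption: POST when no method is given
    headers = {"Content-Type": CONTENT_TYPES[action["type"]]}
    body = json.dumps(payload)
    return method, action["href"], headers, body


# The relevant parts of the "myth:quote" action shown earlier.
quote_action = {
    "type": "json",
    "method": "POST",
    "href": "http://mythological-shipment.com/api/quote",
}
package = {"weight": 2.3, "volume": 4, "origin": "CPH", "destination": "AMS"}

method, url, headers, body = build_request(quote_action, package)
print(method, url)
print(headers)
print(body)
```

The point of the sketch is that the client hard codes neither the URL nor the HTTP method; both are read from the action at runtime.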
Now the client can GET the newly created quote to get further instructions on how to accept it. The result is again a Mason representation of the quote itself plus hypermedia controls for accepting the quote:
{
  "id": "myqo-129-gyh",
  "weight": 2.3,
  "volume": 4,
  "origin": "CPH",
  "destination": "AMS",
  "price": 12,
  "@links":
  {
    "self":
    {
      "href": "http://mythological-shipment.com/api/quotes/myqo-129-gyh"
    }
  },
  "@actions":
  {
    "myth:accept-quote":
    {
      "type": "json",
      "method": "POST",
      "href": "http://mythological-shipment.com/api/quotes/myqo-129-gyh/state",
      "template":
      {
        "accepted": "yes"
      }
    }
  }
}
As you can see, the quote has a "self" link identifying the location of the quote resource. It also has a "myth:accept-quote" action that instructs the client about how to accept the quote. In this case all the client has to do is to POST a predefined JSON value to http://mythological-shipment.com/api/quotes/myqo-129-gyh/state.
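As a rough illustration of "POST a predefined JSON value", the following Python sketch builds the request for an action that carries a "template". The helper name request_from_template is hypothetical, and treating the template as the complete request body is an assumption of this sketch rather than something stated by the Mason specification.

```python
import json


def request_from_template(action):
    """Build (method, url, body) for an action whose body is fully predefined."""
    method = action.get("method", "POST")  # assumption: POST when no method is given
    body = json.dumps(action.get("template", {}))
    return method, action["href"], body


# The relevant parts of the "myth:accept-quote" action from the quote above.
accept_action = {
    "method": "POST",
    "href": "http://mythological-shipment.com/api/quotes/myqo-129-gyh/state",
    "template": {"accepted": "yes"},
}

method, url, body = request_from_template(accept_action)
print(method, url)
print(body)
```

The client does not need to know what "accepted" means; it only needs to recognize the "myth:accept-quote" name and send back what the server handed it.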
The result of accepting a quote is that it is converted to a sales order (for lack of better domain understanding - there's probably a better word for it). So the accept-quote operation results in a redirect to the newly created sales order, which the client can GET:
{
  "id": "myqo-129-gyh",
  "weight": 2.3,
  "volume": 4,
  "origin": "CPH",
  "destination": "AMS",
  "price": 12,
  "@links":
  {
    "self":
    {
      "href": "http://mythological-shipment.com/api/orders/myqo-129-gyh"
    },
    "myth:quote":
    {
      "href": "http://mythological-shipment.com/api/quotes/myqo-129-gyh"
    },
    "myth:shipment-label":
    {
      "href": "http://mythological-shipment.com/api/quotes/myqo-129-gyh/label",
      "type": "application/pdf"
    }
  }
}
Finally the customer needs a shipment label to print out and stick onto the package. All the client has to do is to follow the "myth:shipment-label" link and GET the PDF. That's it.
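Following a named link can be sketched in a few lines of Python. The helper find_link is a hypothetical name, and the sketch assumes the client matches on the short "myth:shipment-label" name exactly as it appears in the document, without expanding the namespace prefix.

```python
def find_link(document, rel):
    """Return the link object stored under `rel` in "@links", or None if absent."""
    return document.get("@links", {}).get(rel)


# The link section of the sales order shown above.
order = {
    "id": "myqo-129-gyh",
    "@links": {
        "self": {"href": "http://mythological-shipment.com/api/orders/myqo-129-gyh"},
        "myth:shipment-label": {
            "href": "http://mythological-shipment.com/api/quotes/myqo-129-gyh/label",
            "type": "application/pdf",
        },
    },
}

label = find_link(order, "myth:shipment-label")
if label is not None:
    # A real client would now issue a GET for label["href"].
    print(label["href"], label.get("type"))
```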
/Jørn
Thursday, February 06, 2014
Implementing hypermedia APIs and REST services with Mason
I am happy to announce that I have taken all the lessons learned during the last few years and stuffed them into a new JSON based media type for hypermedia APIs and REST services. The media type is application/vnd.mason+json or simply "Mason". There is an IANA registration for it pending.
With Mason you get hypermedia elements for linking and modifying data, features for communicating to client developers and standardized error handling. Mason is built on JSON, reads JSON, writes JSON and generally fits well into a JSON based eco-system.
Here is a simple example illustrating how a single issue from a fictional issue tracker could be represented in Mason. It contains the basic API data like issue Title, Description and Severity, and then it adds hypermedia elements for linking to other related resources and actions for writing stuff back to the issue tracker.
{
  // Classic API data
  "ID": 1,
  "Title": "Program crashes when pressing ctrl-p",
  "Description": "I pressed ctrl-p and, boom, it crashed.",
  "Severity": 5,
  "Attachments": [
    {
      "Id": 1,
      "Title": "Error report",
      // Hypermedia linking to attachment
      "@links": {
        "self": {
          "href": "http://issue-tracker.org/attachments/1"
        }
      }
    }
  ],
  // Additional hypermedia links
  "@links": {
    // Hypermedia linking to self
    "self": {
      "href": "http://issue-tracker.org/issues/1"
    },
    // Hypermedia linking to containing project
    "up": {
      "href": "http://issue-tracker.org/projects/1",
      "title": "Containing project"
    }
  },
  // Hypermedia "action" element for creating a new project
  "@actions": {
    "is:project-create": {
      "type": "json",
      "href": "http://issue-tracker.org/mason-demo/projects",
      "title": "Create new project",
      "schemaUrl": "http://issue-tracker.org/mason-demo/schemas/create-project"
    }
  }
}
Those who are familiar with HAL may recognize some parts of the format. That is expected as Mason builds on the ideas from HAL. HAL was never intended to have hypermedia elements for writing stuff, so I decided to go for it and design a new format based on HAL.
The Mason specification, online example and stand-alone API explorer are available from https://github.com/JornWildt/Mason.
Design goals
My design goals with Mason are:
1. It should be easy to adopt in existing JSON based solutions and have a low barrier of entry for new developers.
2. It should contain hypermedia elements sufficient for both reading and writing data without any out-of-band information.
3. It should contain elements for information directed to client developers for the purpose of improving "API developer experience".
4. It should contain error elements sufficient for most kinds of applications.
5. It should work with JSON when both reading and writing.
Let me dig into each of those design goals one by one.
1. Easy to adopt
A classic JSON payload makes the raw API data directly accessible as JSON object properties. I believe it should be the same when working with hypermedia enabled APIs. So Mason merges hypermedia elements into existing JSON structures. To avoid name collisions, Mason property names are prefixed with an '@'.
Mason can be adopted gradually:
Step 1: Change content type to application/vnd.mason+json instead of application/json.
Step 2: Add a @meta property with additional information targeted at client developers.
Step 3: Use links to remove client knowledge of server defined URLs.
Step 4: Use Mason's error format.
Step 5: Use actions to truly decouple client and server implementations.
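The first three steps can be sketched in Python as follows. This is only an illustration of merging Mason properties into an existing payload: the helper to_mason and its parameters are hypothetical, and the "@description" property name inside "@meta" is an assumption of this sketch.

```python
MASON_CONTENT_TYPE = "application/vnd.mason+json"  # Step 1: the new content type


def to_mason(data, self_url=None, developer_note=None):
    """Merge Mason properties into plain API data without touching its own properties."""
    clashes = [key for key in data if key.startswith("@")]
    if clashes:
        raise ValueError(f"'@' is reserved for Mason properties: {clashes}")
    doc = dict(data)  # shallow copy, the caller's dict is left unchanged
    if developer_note is not None:
        # Step 2: the property name inside "@meta" is assumed for this sketch.
        doc["@meta"] = {"@description": developer_note}
    if self_url is not None:
        # Step 3: links replace client knowledge of server URLs.
        doc["@links"] = {"self": {"href": self_url}}
    return doc


issue = {"ID": 1, "Title": "Program crashes when pressing ctrl-p", "Severity": 5}
doc = to_mason(
    issue,
    self_url="http://issue-tracker.org/issues/1",
    developer_note="A single issue from the issue tracker.",
)
print(MASON_CONTENT_TYPE)
print(doc)
```

Existing clients that only read "ID", "Title" and "Severity" keep working, since the original properties stay where they were.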
2. Hypermedia for both reading and writing
Hypermedia has a lot of benefits as I wrote in http://soabits.blogspot.dk/2013/12/selling-benefits-of-hypermedia.html. Among these is the ability to remove a client's dependency on server URL structures using links.
But links are only good for, well, linking resources together - they don't say anything about how to change and modify API data. So Mason adds "actions" for writing API data.
An action defines the target URL, the HTTP method and the action type (payload encoding). With this information being discoverable at runtime it is no longer necessary to hard code clients with information about the HTTP method and how to encode the payload. This means client and server only have to agree on WHICH data to send - not HOW to send it.
3. Information targeted at client developers
One of the great things about hypermedia enabled APIs is the ability to explore the API using a browser of some kind. As I wrote in http://soabits.blogspot.dk/2013/12/selling-benefits-of-hypermedia.html: do not underestimate the power of an explorable API. The ability to browse around the data makes it a lot easier for the client developers to build a mental model of the API and its data structures.
And if client developers are browsing the API why not also be able to communicate directly with them? Mason adds a few meta data elements for sending messages directly to the client developers. An API browser should highlight these such that devs can instantly read some documentation and comments about the resource they are currently looking at.
At the same time Mason defines a technique for removing this client developer information from the payload in production.
4. Error handling
By standardizing error handling Mason makes it possible for clients to interact with unknown services and still be able to communicate error conditions clearly to end users.
I have previously discussed error handling in http://soabits.blogspot.dk/2013/05/error-handling-considerations-and-best.html and apparently that article hit a nerve somewhere because it keeps attracting a lot of attention (for an amateur blogger like me).
5. JSON read/write
One of the things that annoy me about the traditional key/value forms based on application/x-www-form-urlencoded is that there are no standards for encoding complex data structures. Neither does it define any standard for encoding booleans, integers and other basic data types. The consequence is that client and server need to agree on these things before they can start talking about business data - and different servers are surely going to implement different encoding schemes - all in all making life miserable for developers who just want to get stuff done.
By using JSON Mason ensures interoperability on some of the lower levels. JSON defines more types than simple string based key/value formats and handles structures like objects and arrays.
Restricting implementations of Mason to handle JSON only reduces design choices and variations and thus improves the chances of things working out of the box (compared to simple string based key/value formats).
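The difference is easy to see with Python's standard library. The sketch below sends the same values through a form encoding round trip and a JSON round trip; the sample data is made up for illustration.

```python
import json
from urllib.parse import parse_qs, urlencode

data = {"accepted": True, "weight": 2.3, "volume": 4}

# Form encoding: every value is flattened to a string on the wire.
form = urlencode(data)
from_form = parse_qs(form)

# JSON: booleans and numbers survive the round trip with their types intact.
from_json = json.loads(json.dumps(data))

print(form)       # accepted=True&weight=2.3&volume=4
print(from_form)  # {'accepted': ['True'], 'weight': ['2.3'], 'volume': ['4']}
print(from_json)  # {'accepted': True, 'weight': 2.3, 'volume': 4}
```

After the form round trip the receiver has to guess whether 'True' is a boolean or a string and whether '4' is a number, and that is exactly the kind of agreement JSON makes unnecessary.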
Transcending from web APIs to REST services
Most web APIs today are defined in terms of a single server implementation around which developers build dedicated clients (think "Twitter" or "Facebook"). In such a world clients have a strong coupling to server URL structures, HTTP methods, error formats and other quirks of the API. These APIs were never designed to be implemented by more than one organization (and are for this reason also called "snowflake APIs").
True REST services on the other hand are defined without reference to any specific server implementation. The best known example of this is the ATOM format which enables clients to interact with any ATOM enabled service on the web - no matter who implemented it, where it is hosted or what URL structures it is implemented with. The enabling factor for this is the ATOM media type specification.
But ATOM is restricted to feed-like data and does not fit well with other applications. So other media types are needed, and Mason is an attempt to fill this space. Mason attempts to facilitate complete decoupling from technical implementation details such that clients can discover HOW to interact with a service at runtime.
One of my earlier blog posts discussed this problem in more detail: http://soabits.blogspot.no/2013/05/the-role-of-media-types-in-restful-web.html
Data profiles
Mason itself does not prescribe any business specific details. Clients and servers still have to agree on WHAT data to interchange - but Mason does remove the technical coupling on HOW to interchange the data.
Mason depends on profiles to enable clients to know WHAT data they are looking at. You can find an in-depth discussion about it here: http://soabits.blogspot.no/2013/12/media-types-for-apis.html.
At the time of writing I haven't put profiles into the specification yet.
Further reading
Mason homepage: https://github.com/JornWildt/Mason
Generic Mason browser (API explorer): https://github.com/JornWildt/Mason/wiki/Generic-Mason-browser
Online live example of fictional issue tracker using Mason: https://github.com/JornWildt/Mason/wiki/Example-service%3A-issue-tracker
/Jørn
Friday, December 06, 2013
Selling the benefits of hypermedia in APIs
Once more I have found myself deeply engaged in a discussion about REST on the api-craft mailing list (https://groups.google.com/forum/#!topic/api-craft/ZxnLD6q6w7w). This time it started with the question "How do I sell the benefits of hypermedia?". It turned out to be harder to answer than one would expect, but after some time we came up with the list below. But before we get into that I had better explain "hypermedia" in a few sentences.
The most common use of hypermedia is embedding of links in representations returned from some service on the web. As an example we can look at the representation of a customer record containing a customer ID, customer name, customer contact information and related sales orders. Encoded in JSON we can get something like this:
// Customer record
// URL template: http://company-x.com/customers/{customer-id}
{
  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark"
}
// List of sales orders
// URL template: http://company-x.com/customers/{customer-id}/sales-orders
{
  CustomerId: 1234,
  Orders:
  [
    {
      ID: 10,
      ItemNumber: 15,
      Quantity: 4
    }
  ]
}
These two resources can be found by expanding the customer ID into the URL templates. This requires the client to be hard coded with 1) the URL templates and 2) the knowledge of which values to use as parameters.
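A client written this way typically ends up with code like the following Python sketch. The constant and function names are hypothetical, and the template parameter is written as customer_id instead of customer-id only because Python's str.format requires identifier-like names.

```python
# Hard coded knowledge: these strings must match the server's URL structure exactly.
CUSTOMER_TEMPLATE = "http://company-x.com/customers/{customer_id}"
SALES_ORDERS_TEMPLATE = "http://company-x.com/customers/{customer_id}/sales-orders"


def customer_url(customer_id):
    """Expand the customer ID into the customer record URL."""
    return CUSTOMER_TEMPLATE.format(customer_id=customer_id)


def sales_orders_url(customer_id):
    """Expand the customer ID into the sales orders URL."""
    return SALES_ORDERS_TEMPLATE.format(customer_id=customer_id)


print(customer_url(1234))
print(sales_orders_url(1234))
```

Every template in this file is a place where the client breaks if the server reorganizes its URLs.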
Now, if we embed links in the responses then we can remove the hard coded knowledge of at least the sales orders URL template:
// Customer record
// URL template: http://company-x.com/customers/{customer-id}
{
  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark",
  _links:
  [
    {
      rel: "http://linkrels.company-x.com/sales-orders",
      href: "{link-to-sales-orders}",
      title: "Sales orders"
    }
  ]
}
// List of sales orders
// URL template unpublished
{
  CustomerId: 1234,
  Orders:
  [
    {
      ID: 10,
      ItemNumber: 15,
      Quantity: 4,
      _links:
      [
        {
          rel: "http://linkrels.company-x.com/order-details",
          href: "{link-to-sales-order-details}",
          title: "Sales order details"
        },
        {
          rel: "http://linkrels.company-x.com/item-details",
          href: "{link-to-item-details}",
          title: "Item order details (catalog)"
        }
      ]
    }
  ]
}
Notice how links are encoded:
The question is now, what is gained by adding such links?
Think of it like this; traditionally, as a client developer, you would have to read through a pile of documentation before you sit down and write some test programs more or less blindfolded. After that you run your test program to see how the API behaves. Then you have to go back to the documentation and read some more - and then back to coding again. This exercise has three different mental context switches going back and forth between reading documentation, programming and trying out test programs.
With an explorable API you can simply try out the API and test your understanding of it without any programming. Any mental "what-if" hypothesis testing of the API can be carried out right there without any additional tools or programming. The data as well as the interaction tools are right there in front of you, reducing the mental hoops you have to go through to understand the API.
The immediate benefits of an explorable API are perhaps more social than technical. But mind you - a lower barrier of entry means happier client developers, higher API adoption rates and less support, which in the end means fewer support calls to bug YOU at the most annoying times of your work.
The immediate benefits of this are also social just like the explorability of the API. It will lower the barrier of entry to understanding the API and improve API adoption by client developers.
The benefit of this is obviously less coupling between the server and the client, removing the need to upgrade all clients in lock step with the server.
But, you may ask, why should the server change its URL structures? Once the server developers have decided that the URL is /customers/{customer-id}, why should they then suddenly decide to change it? Well, I cannot tell you what will change in your API, but here are two examples:
- A resource grows too big. Over time it has been necessary to add more and more features to a single resource and one day it simply becomes too big to handle. So it is decided to split it into multiple sub-resources with new URL structures.
- It turns out that some resources require bits and pieces of information from other resources when the client accesses them. It can for instance be an access token of some kind that needs to be generated in one place and passed to another resource. With a traditional API the client has to be upgraded with this kind of business logic. With a hypermedia API the client can ignore this complexity and leave it to the server to add the desired parameters to the links it generates.
Hypermedia also allows the server to re-implement an existing resource with a completely different technology stack, on a completely different server, without the client ever noticing it - given, of course, that the new implementation doesn't make any breaking changes.
If you want to read more about versioning then take a look at Mark Nottingham's "Web API versioning smackdown" at http://www.mnot.net/blog/2011/10/25/web_api_versioning_smackdown
Let us try to broaden the scene: think of a big corporation that has engulfed and bought up a lot of smaller companies. All of these smaller companies have their own server setup with lists of inventory, sales orders, customers and so on ...
Now I give you the ID 4328 of customer John Burton ... how will you be able to find the resource that represents the contact information of this customer, when it can live on any one of a dozen servers?
Solution 1: We need a central indexing service that allows the client to search for customer with ID 4328. But how can the indexing service tell the client where the resulting customer record resides? The answer is simple; the response has to contain a link to the customer record resource.
Solution 2: Don't use IDs like 4328 at all. Always refer to resources with their full URLs.
Either way, the client won't know anything about the URL it gets in return - all it has to do is to trust the search result and follow the link.
And now that we have some opaque, meaningless, URL to our customer record information, how do we get to the sales orders placed by said customer? We could take the customer ID again and throw it into some other indexing service and get a new URL out of it - or we could follow a "sales-orders" link-relation embedded in the customer information.
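Following such a link relation might look like this in Python. The helper follow_rel is a hypothetical name, the "_links" array layout is the one used in the examples earlier in this post, and the href value is a made-up opaque URL that the client never tries to interpret.

```python
SALES_ORDERS_REL = "http://linkrels.company-x.com/sales-orders"


def follow_rel(representation, rel):
    """Return the href of the first link with the given relation, or None."""
    for link in representation.get("_links", []):
        if link.get("rel") == rel:
            return link["href"]
    return None


# A customer record as it could come back from any one of the dozen servers.
customer = {
    "ID": 4328,
    "Name": "John Burton",
    "_links": [
        {
            "rel": SALES_ORDERS_REL,
            "href": "http://server-7.example.com/x/9f3a/orders",  # made-up, opaque URL
            "title": "Sales orders",
        }
    ],
}

orders_url = follow_rel(customer, SALES_ORDERS_REL)
print(orders_url)
```

The only thing the client is hard coded with is the link relation name; where the sales orders actually live is entirely up to the server.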
The point is:
I have already written about error handling in http://soabits.blogspot.no/2013/05/error-handling-considerations-and-best.html where the last section discuss error handling on a larger scale.
Unfortunately I have yet to write an article explaining my current view on media types - until then you can either check this post in api-craft https://groups.google.com/d/msg/api-craft/5N5SS0JMAJw/b0diFRzopY0J or read my ramblings about media types and type systems http://soabits.blogspot.no/2013/05/the-role-of-media-types-in-restful-web.html.
UPDATE (December 8th 2013): I have just added a blog post about media types: http://soabits.blogspot.no/2013/12/media-types-for-apis.html
The most common use of hypermedia is embedding of links in representations returned from some service on the web. As an example we can look at the representation of a customer record containing a customer ID, customer name, customer contact information and related sales orders. Encoded in JSON we can get something like this:
// Customer record
// URL template: http://company-x.com/customers/{customer-id}
{
  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark"
}
// List of sales orders
// URL template: http://company-x.com/customers/{customer-id}/sales-orders
{
  CustomerId: 1234,
  Orders:
  [
    {
      ID: 10,
      ItemNumber: 15,
      Quantity: 4
    }
  ]
}
These two resources can be found by expanding the customer ID into the URL templates. This requires the client to be hard coded with 1) the URL templates and 2) the knowledge of which values to use as parameters.
Now, if we embed links in the responses then we can remove the hard coded knowledge of at least the sales orders URL template:
// Customer record
// URL template: http://company-x.com/customers/{customer-id}
{
  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark",
  _links:
  [
    {
      rel: "http://linkrels.company-x.com/sales-orders",
      href: "{link-to-sales-orders}",
      title: "Sales orders"
    }
  ]
}
// List of sales orders
// URL template unpublished
{
  CustomerId: 1234,
  Orders:
  [
    {
      ID: 10,
      ItemNumber: 15,
      Quantity: 4,
      _links:
      [
        {
          rel: "http://linkrels.company-x.com/order-details",
          href: "{link-to-sales-order-details}",
          title: "Sales order details"
        },
        {
          rel: "http://linkrels.company-x.com/item-details",
          href: "{link-to-item-details}",
          title: "Item order details (catalog)"
        }
      ]
    }
  ]
}
Notice how links are encoded:
- Links are always found in collections named _links
- A single link consists of a link relation identifier "rel", the hypermedia reference "href" and a human readable description "title".
- Link relations are identified by URLs.
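As a minimal sketch of how a client might consume this structure, the Python snippet below looks up a link by its relation in a parsed customer record. The helper name find_link and the concrete href value are assumptions made for illustration; the example above only shows the placeholder {link-to-sales-orders}.

```python
from typing import Optional


def find_link(resource: dict, rel: str) -> Optional[str]:
    """Return the href of the first link with the given relation, or None."""
    for link in resource.get("_links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


# Parsed customer record; the href is a made-up example value.
customer = {
    "ID": 1234,
    "Name": "John Larsson",
    "_links": [
        {
            "rel": "http://linkrels.company-x.com/sales-orders",
            "href": "http://company-x.com/customers/1234/sales-orders",
            "title": "Sales orders",
        }
    ],
}

# The client only needs to know the link relation, not the URL structure.
sales_orders_url = find_link(customer, "http://linkrels.company-x.com/sales-orders")
```

The only thing hard coded in the client is the link relation identifier; the URL itself is whatever the server decides to put in href.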
The question is now: what is gained by adding such links?
Short term effects
1. Explorable API
It may sound trivial but do not underestimate the power of an explorable API. The ability to browse around the data makes it a lot easier for the client developers to build a mental model of the API and its data structures.
Think of it like this: traditionally, as a client developer, you would have to read through a pile of documentation before you sit down and write some test programs more or less blindfolded. After that you run your test program to see how the API behaves. Then you have to go back to the documentation and read some more - and then back to coding again. This exercise involves three different mental contexts, switching back and forth between reading documentation, programming and trying out test programs.
With an explorable API you can simply try out the API and test your understanding of it without any programming. Any mental "what-if" hypothesis testing of the API can be carried out right there without any additional tools or programming. The data as well as the interaction tools are right there in front of you, reducing the mental hoops you have to go through to understand the API.
The immediate benefits of an explorable API are perhaps more social than technical. But mind you - a lower barrier of entry means happier client developers, higher API adoption rates and less support, which in the end means fewer support calls to bug YOU at the most annoying times of your work.
2. Inline documentation
Did you notice how link relations are identified by URLs? These URLs can point to online documentation where the API elements can be explained.
The immediate benefits of this are also social, just like the explorability of the API. It will lower the barrier of entry to understanding the API and improve API adoption by client developers.
3. Simple client logic
A client that simply follows URLs instead of constructing them itself should be easier to implement and maintain. It won't need logic to figure out which values to substitute into what URL templates. All it has to do is to identify links in the payload and extract the hypermedia reference URL.
Long term effects
4. The server takes ownership of URL structures
The use of hypermedia removes the client's hard coded knowledge of the URL structures used by the server. This means the server is free to change its URL structures over time as the API evolves, without any need to upgrade the clients.
The benefit of this is obviously less coupling between the server and the client, removing the need to upgrade all clients in lock step with the server.
But, you may ask, why should the server change its URL structures? Once the server developers have decided that the URL is /customers/{customer-id}, why should they then suddenly decide to change it? Well, I cannot tell you what will change in your API, but here are two examples:
- A resource grows too big. Over time it has been necessary to add more and more features to a single resource and one day it simply becomes too big to handle. So it is decided to split it into multiple sub-resources with new URL structures.
- It turns out that some resources require bits and pieces of information from other resources when the client accesses them. It can for instance be an access token of some kind that needs to be generated in one place and passed to another resource. With a traditional API the client has to be upgraded with this kind of business logic. With a hypermedia API the client can ignore this complexity and leave it to the server to add the desired parameters to the links it generates.
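The second example can be sketched from the server's point of view. In the Python snippet below the server builds the sales orders link itself and appends an access token as a query parameter. The function name, the parameter name "token" and the token value are all hypothetical; the point is only that the client receives a finished href and never needs to know the parameter exists.

```python
from urllib.parse import urlencode


def build_sales_orders_link(customer_id: int, access_token: str) -> dict:
    """Server side: build a complete link, including any extra parameters."""
    # "token" is a made-up parameter name used for illustration only.
    query = urlencode({"token": access_token})
    return {
        "rel": "http://linkrels.company-x.com/sales-orders",
        "href": f"http://company-x.com/customers/{customer_id}/sales-orders?{query}",
        "title": "Sales orders",
    }


link = build_sales_orders_link(1234, "abc 123")
```

If the server later renames the parameter or moves the resource, only this function changes; clients that follow href keep working.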
5. Off loading content to other services
Consider how APIs evolve: after some time you figure out that some of the content should be off-loaded to a Content Delivery Network (CDN). This means new URLs that point to completely different hosts all over the internet. The actual URL cannot be hard coded into the client since it may change over time or contain random pieces of server generated information for the CDN (like for instance some kind of access token). Now the server HAS to embed the URLs in the responses and the client HAS to follow them.
6. Versioning with links
With a hypermedia API it becomes trivial to implement new versions of the API resources without breaking existing clients: old clients will follow existing link relations to old-style resources whereas new clients will know how to follow new link relations to new resources - as long as the server response includes both the old as well as the new links.
Hypermedia also allows the server to re-implement an existing resource with a completely different technology stack, on a completely different server, without the client ever noticing it - given, of course, that the new implementation doesn't make any breaking changes.
If you want to read more about versioning then take a look at Mark Nottingham's "Web API versioning smackdown" at http://www.mnot.net/blog/2011/10/25/web_api_versioning_smackdown
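A rough sketch of versioning with links, assuming the server publishes both an old and a new link relation in the same response. The relation name ending in "sales-orders-v2", the href values and the helper pick_link are invented for this illustration.

```python
from typing import List, Optional

OLD_REL = "http://linkrels.company-x.com/sales-orders"
NEW_REL = "http://linkrels.company-x.com/sales-orders-v2"  # hypothetical new relation

# The server includes both the old and the new link in the same response.
customer = {
    "ID": 1234,
    "_links": [
        {"rel": OLD_REL, "href": "http://company-x.com/customers/1234/sales-orders"},
        {"rel": NEW_REL, "href": "http://company-x.com/v2/orders?customer=1234"},
    ],
}


def pick_link(resource: dict, known_rels: List[str]) -> Optional[str]:
    """Return the href of the first relation the client understands, in preference order."""
    links = {link["rel"]: link["href"] for link in resource.get("_links", [])}
    for rel in known_rels:
        if rel in links:
            return links[rel]
    return None


# An old client only knows the old relation; a new client prefers the new one.
old_client_url = pick_link(customer, [OLD_REL])
new_client_url = pick_link(customer, [NEW_REL, OLD_REL])
```

Both clients work against the same response, so the server can introduce the new resource without a coordinated client upgrade.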
Large scale effects
7. Multiple implementations of the same service
So far we have only looked at clients dedicated to a unique implementation of a single service. That could for instance be something like a dedicated Twitter client. But where hypermedia really excels is when we start to work with multiple independent implementations of the same service.
Let us try to broaden the scene: think of a big corporation that has engulfed and bought up a lot of smaller companies. All of these smaller companies have their own server setup with lists of inventory, sales orders, customers and so on ...
Now I give you the ID 4328 of customer John Burton ... how will you be able to find the resource that represents the contact information of this customer, when it can live on any one of a dozen servers?
Solution 1: We need a central indexing service that allows the client to search for customer with ID 4328. But how can the indexing service tell the client where the resulting customer record resides? The answer is simple; the response has to contain a link to the customer record resource.
Solution 2: Don't use IDs like 4328 at all. Always refer to resources with their full URLs.
Either way, the client won't know anything about the URL it gets in return - all it has to do is to trust the search result and follow the link.
And now that we have some opaque, meaningless, URL to our customer record information, how do we get to the sales orders placed by said customer? We could take the customer ID again and throw it into some other indexing service and get a new URL out of it - or we could follow a "sales-orders" link-relation embedded in the customer information.
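The flow from index search to customer record to sales orders might look like the Python sketch below. To keep it self-contained, HTTP is replaced by a dictionary lookup; every host name, URL and the short relation name "customer" are invented, and a real client would issue GET requests instead.

```python
# Stand-in for the web: maps a URL to the parsed JSON a GET would return.
FAKE_WEB = {
    "http://index.example/customers?id=4328": {
        "_links": [{"rel": "customer", "href": "http://server-7.example/c/x9f2"}],
    },
    "http://server-7.example/c/x9f2": {
        "Name": "John Burton",
        "_links": [{"rel": "sales-orders", "href": "http://server-7.example/so?c=x9f2"}],
    },
    "http://server-7.example/so?c=x9f2": {"Orders": [{"ID": 10}]},
}


def get(url: str) -> dict:
    """Pretend to GET a URL (a real client would make an HTTP request here)."""
    return FAKE_WEB[url]


def follow(resource: dict, rel: str) -> dict:
    """Follow the first link with the given relation and return the target resource."""
    for link in resource.get("_links", []):
        if link["rel"] == rel:
            return get(link["href"])
    raise KeyError(f"no link with relation {rel!r}")


search_result = get("http://index.example/customers?id=4328")
customer = follow(search_result, "customer")
orders = follow(customer, "sales-orders")
```

Apart from the index entry point, the client never constructs or interprets a URL; it only trusts the links it is given.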
The point is:
When you move from unique one-off service implementations with dedicated clients to multiple independent service implementations with a variety of clients, then you simply have to use hypermedia elements.
This also means that it can be difficult to sell hypermedia to startup APIs, since hypermedia won't add much benefit to one single API living on an isolated island without any requirement to co-exist and co-work seamlessly with other similar services.
Other large scale effects
Hypermedia solves some of the problems related to large scale service implementations as I have just argued. But there are a few more issues to be solved in order to decouple clients completely from specific server implementations; one is related to how the client understands the result (media types) and one is related to error handling.
I have already written about error handling in http://soabits.blogspot.no/2013/05/error-handling-considerations-and-best.html where the last section discusses error handling on a larger scale.
Unfortunately I have yet to write an article explaining my current view on media types - until then you can either check this post in api-craft https://groups.google.com/d/msg/api-craft/5N5SS0JMAJw/b0diFRzopY0J or read my ramblings about media types and type systems http://soabits.blogspot.no/2013/05/the-role-of-media-types-in-restful-web.html.
UPDATE (December 8th 2013): I have just added a blog post about media types: http://soabits.blogspot.no/2013/12/media-types-for-apis.html
Acknowledgments
Thanks to Mike Kelly for his initial blog post on this topic: http://blog.stateless.co/post/68259564511/the-case-for-hyperlinks-in-apis - and his work on hypermedia linking in JSON with HAL (http://stateless.co/hal_specification.html).
Wednesday, October 02, 2013
URL structures and hyper media for Web APIs and RESTful services
A recurring theme on various mailing lists is that of choosing the "right" URL structure for a specific kind of web API. In this post I will present my view on this issue, based on various input from for instance "API-craft" (https://groups.google.com/forum/#!forum/api-craft).
First of all, let me hammer it in: URL structuring has absolutely nothing to do with REST. Period. REST is not concerned about URL structures - in REST a URL is an opaque string of characters with no meaning beyond the fact that it is both an identifier and a resource locator. An on-line web service doesn't become a RESTful service just because it has a nice pretty looking URL structure. There is simply no such thing as "A RESTful URL".
What this means is that a URL like http://geo.com/countries/usa/states/nevada is just as "RESTful" (or non-RESTful) as http://geo.com/states/nevada, http://geo.com/states/321, http://geo.com/states?id=321 and http://geo.com/foo-bar-U7q. The URL structure simply doesn't matter in REST.
But from a human point of view it helps understanding if the API has some kind of meaningful URL structure. Computers may easily ignore URL structures but as humans we tend to look at URLs and try to infer meaning from that. Thus, having pretty and well structured URLs helps us understand what is going on - not only as client developers but certainly also as server developers who often have to navigate from URLs to source code - and having a well defined URL structure helps us with that process.
In order to discuss URL structures we need a domain to model. I think geographical information with countries, states and cities should be easily understood by most, so let's try that. I will ignore the fact that many countries don't have states ;-)
URLs as identifiers for entities
Our geographical domain easily lends itself to some kind of hierarchical structure of countries / states / cities. So intuitively we reach out for a hierarchical URL structure as shown below (for the sake of clarity I will ignore the host name and only show the path element of the URL). Let us try to build the URL for the city of Las Vegas in Nevada, USA:
/countries/USA/states/Nevada/cities/Las+Vegas
That would work, but think a bit about it; what if the state of Nevada had more than one city called Las Vegas? How would we be able to distinguish between the two cities? The problem here is that we confuse searching for a city named Las Vegas in Nevada, USA with the concept of identifying a specific city.
I believe that it is fair to assume that most geographical systems will have some kind of backend that assigns unique identifiers to all of its entities. These may be integers, GUIDs or strings with composite keys - but in the end it boils down to a sequence of characters that uniquely identifies the entity in the system.
So let us assume that the well known city of Las Vegas is identified by the integer 82137 which is a unique city number. It may happen to be the same number which is used for a country or a state, but in the context of cities it is unique.
The same goes for countries and states: USA has the ID 54 and Nevada is identified by 7334. Now we get the URL:
/countries/54/states/7334/cities/82137
But what happens if some client decides to look up this URL with mismatching IDs:
/countries/54/states/8112/cities/82137
Well, that should be considered a non-existent resource and the server should return HTTP code 404 Not Found.
But why bother at all with the overhead of checking both state, country and city IDs when the city ID uniquely identifies the city? It would be easier for all parties if only the city ID was needed in the URL:
/cities/82137
Now the server can do one single lookup by the ID to see if the referenced city exists. No need for any additional checking for matching state and country.
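The difference between the two lookups can be sketched in Python. The in-memory CITIES table and both function names are assumptions for illustration; returning None stands in for responding with 404 Not Found.

```python
from typing import Optional

# Hypothetical data store keyed by the unique city ID.
CITIES = {82137: {"name": "Las Vegas", "state": 7334, "country": 54}}


def get_city_hierarchical(country_id: int, state_id: int, city_id: int) -> Optional[dict]:
    """/countries/{c}/states/{s}/cities/{id}: every ID in the path must match."""
    city = CITIES.get(city_id)
    if city is None or city["state"] != state_id or city["country"] != country_id:
        return None  # the server would answer 404 Not Found
    return city


def get_city(city_id: int) -> Optional[dict]:
    """/cities/{id}: one single lookup, no cross-checking needed."""
    return CITIES.get(city_id)


matching = get_city_hierarchical(54, 7334, 82137)
mismatching = get_city_hierarchical(54, 8112, 82137)
direct = get_city(82137)
```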
The same logic can be applied to states (and countries are trivial), so we end up with the following canonical URL structures for countries, states and cities:
/countries/{country-id}
/states/{state-id}
/cities/{city-id}
Should it happen that the server doesn't assign unique IDs to cities (or states), and really needs the state reference for a city, because two cities in different states may have the same (non-unique) ID, then we must include both in the URL:
/states/123/cities/77 => Rome in Italy (assuming some state in Italy is identified by 123)
/states/432/cities/77 => Rome in the state of New York
In the rest of this post I will assume that all cities and states have "globally" unique IDs.
Finding the right ID with UI dropdowns
But how does the client know what ID to use, you may ask? This depends on the application, but let's take the scenario where an end user needs to get information about the city of Las Vegas (while still assuming that Nevada may have two cities named Las Vegas).
The UI could be structured as three dropdowns: one for countries, one for states in the selected country and one for cities in the selected state. To present such a UI for the end user we first need to be able to get the list of all countries. The obvious choice for this resource is /countries. Then, for the selected country we need the list of states. The obvious choice here is /countries/{country-id}/states.
But what about the list of all cities for a specific state in a specific country? Let us avoid the trap of a hierarchical URL with multiple IDs and use the short /states/{state-id}/cities.
So now we have the following resources representing lists of geographical items:
/countries
/countries/{country-id}/states
/states/{state-id}/cities
Each of these resources returns a JSON list as shown below and from this list the UI can easily build a dropdown element for selecting a city:
[
{ Name: "Item name A", ID: xxx },
{ Name: "Item name B", ID: yyy }
]
In this way the client gets the unique city ID by letting the end user select a city and its corresponding ID.
Query by text search
Another approach could be to use textual searching where the end user enters a query text like "Las Vegas, USA" (which is how Google Maps works). This would require a new query resource:
/cities?query=Las+Vegas,+USA
The result would be a list of matching cities:
[
{ Title: "Las Vegas, County A, Nevada, USA", ID: 16352 },
{ Title: "Las Vegas, County B, Nevada, USA", ID: 82137 }
]
Now the end user can select one of the results and thus get the ID of the city.
Adding hyper media, getting closer to REST
The previously mentioned approaches require the client to create URLs by combining URL templates with IDs. This means the client has to be hard coded with the URL templates - and the consequence is a tight coupling to the URL structure of the web API.
But it is very easy to avoid this kind of URL coupling by using hyper media elements in the returned representations. Take for instance the list of cities matching the text "Las Vegas, USA"; here we can include the actual city URLs in the response instead of requiring the client to construct the URLs itself:
[
{
Title: "Las Vegas, County A, Nevada, USA",
ID: 16352,
CityLink: "http://.../cities/16352"
},
{
Title: "Las Vegas, County B, Nevada, USA",
ID: 82137,
CityLink: "http://.../cities/82137"
}
]
Now we can start talking about a RESTful service instead of a static web API: by including hyper media elements we allow the server to include links to other hosts that might be better suited to represent cities:
[
{
Title: "Las Vegas, County A, Nevada, USA",
CityLink: "http://other-geo-service/jump.aspx?type=city&ID=16352"
},
{
Title: "Las Vegas, County B, Nevada, USA",
CityLink: "http://geo.com/cities/82137"
}
]
By including links we have stopped worrying about URL structures and have come one step closer to a RESTful service.
The upside is looser coupling to server URL structures, simpler client logic and enabling the use of different services on different servers. The downside is a larger payload with bigger URLs than simple IDs.
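As a rough sketch of what "simpler client logic" means here, the client below picks a city from a search result and uses the server-provided CityLink as-is. The response text is the example above rewritten as strict JSON, and the helper name city_url is made up for this illustration.

```python
import json

# Hypothetical search response: the example above, written as strict JSON.
SEARCH_RESPONSE = """
[
  {"Title": "Las Vegas, County A, Nevada, USA",
   "CityLink": "http://other-geo-service/jump.aspx?type=city&ID=16352"},
  {"Title": "Las Vegas, County B, Nevada, USA",
   "CityLink": "http://geo.com/cities/82137"}
]
"""


def city_url(results, title):
    """Return the link the server supplied for the city with this title."""
    for item in results:
        if item["Title"] == title:
            # The link is used unchanged; the client never rebuilds it from an ID.
            return item["CityLink"]
    return None


results = json.loads(SEARCH_RESPONSE)
url = city_url(results, "Las Vegas, County B, Nevada, USA")
```

Note that the client contains no URL template at all, so it does not care that the two cities live on different hosts.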
Filtering
So far we have looked at hierarchical data with some obvious URL structures. But what if we need to get the list of cities with a population of more than 200000 citizens? And what if we only want cities from the state of Massachusetts?
There are many different ways to do this depending on the complexity of the filtering. But it may be fine to start out with simple queries like "All cities in (Massachusetts or New York)"; first we need to use state IDs and thus we get "All cities in states (2321, 2981)". Such simple integer IDs can be separated with commas, so one possible URL structure could be:
/cities?states=2321,2981
It is also possible to encode an SQL like query language in one single parameter:
/cities?where=state+in+(2321,2981)+and+population+greater-than+200000
The possibilities are endless, but a filter URL usually consists of a path like /cities that identifies the type of query together with some set of URL parameters encoding the query specification.
A common solution is to interpret "&" as AND and "," as OR when possible. So for instance /cities?states=2321,2981&size=large,huge would mean "All cities where state is either (Massachusetts OR New York) AND size is either (large OR huge)".
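A minimal sketch of that "&" means AND, "," means OR convention could look like this on the server side. The function names and the idea that a city record uses the same keys as the URL parameters are assumptions made for the example only.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_filter(url):
    """Split the query string into {parameter: [allowed values]}."""
    query = urlsplit(url).query
    # "&" separates parameters (AND); "," separates alternatives (OR).
    return {name: value.split(",") for name, value in parse_qsl(query)}


def matches(city, criteria):
    """True when the city satisfies every parameter with at least one value."""
    return all(str(city.get(name)) in allowed
               for name, allowed in criteria.items())


criteria = parse_filter("/cities?states=2321,2981&size=large,huge")
boston = {"states": 2321, "size": "large"}    # hypothetical city records
smallville = {"states": 2321, "size": "tiny"}
```

Here matches(boston, criteria) holds while matches(smallville, criteria) does not, since "tiny" is not among the allowed sizes.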
And no discussion about filtering without mentioning OData's URL conventions: http://www.odata.org/documentation/odata-v3-documentation/url-conventions/
Handling large input filters
URLs for filtering may become rather large, so another recurring question is "How do I handle filter strings too large for a URL?" The recommended solution is to POST the filter to a query resource, for instance like this:
POST /city-filters
Content-Type: application/x-www-form-urlencoded
where=state+in+(2321,2981)+and+population+greater-than+200000
The server then creates a temporary resource for this query and points the client to it with a Location header:
201 Created
Location: /city-filters/9638
The client can then GET /city-filters/9638 to get the result of the query.
A nice side effect of this is that the created filter resource can be cached to avoid re-calculating the potentially very slow query on the server.
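The exchange can be sketched with a small in-memory store. This is only an illustration under assumptions: the class FilterStore, its method names and the run_query callback are invented here, and a real server would also have to decide when to expire the temporary resources.

```python
import itertools


class FilterStore:
    """Keeps POSTed filters and serves their results under /city-filters/{id}."""

    def __init__(self, run_query, first_id=1):
        self._run_query = run_query        # callable evaluating a filter string
        self._ids = itertools.count(first_id)
        self._filters = {}                 # filter id -> filter string
        self._cache = {}                   # filter id -> cached result

    def post(self, where):
        """Handle POST /city-filters: store the filter and return its location."""
        filter_id = next(self._ids)
        self._filters[filter_id] = where
        return 201, {"Location": f"/city-filters/{filter_id}"}

    def get(self, path):
        """Handle GET /city-filters/{id}: run the query once, then reuse it."""
        last_segment = path.rsplit("/", 1)[-1]
        if not last_segment.isdigit() or int(last_segment) not in self._filters:
            return 404, None
        filter_id = int(last_segment)
        if filter_id not in self._cache:
            self._cache[filter_id] = self._run_query(self._filters[filter_id])
        return 200, self._cache[filter_id]


calls = []

def fake_query(where):
    # Stand-in for the potentially slow query; records how often it runs.
    calls.append(where)
    return ["city matching: " + where]

store = FilterStore(fake_query, first_id=9638)
status, headers = store.post("state in (2321,2981) and population greater-than 200000")
first = store.get(headers["Location"])
second = store.get(headers["Location"])
```

The second GET is answered from the cache, which is the side effect mentioned above.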
Natural keys, surrogate keys, URL aliases and resource duplication
A common question relates to the use of natural keys versus surrogate keys in URL construction. It is more or less the same discussion as we see with databases (see for instance http://www.agiledata.org/essays/keys.html). Examples of natural keys could be order numbers, e-mails, postal codes, social security numbers and phone numbers.
When choosing between natural keys and surrogate keys you should consider the lifespan of the key; URLs are supposed to be stable over a very long period of time, so do not choose keys that vary over time. For instance, do not use phone numbers and e-mails to identify people since people tend to change these during their life.
You should also beware of natural keys which can be used by more than one entity. It is for instance (still) common for some members of a family to share a common e-mail, so e-mails are not good candidates for identifying persons. Even social security numbers may sometimes change. In Denmark for instance a person may get a new social security number if they change gender.
A valid natural key could be a sales order number since these are supposed to be both unique and stable.
But if we introduce natural keys, should we then only use natural keys? What if an entity has both a natural key and an internal surrogate key? You can use both but you should decide on one being the canonical ID and avoid duplicating resources by using HTTP redirects for the secondary keys.
Take for instance a sales order with the order number SK324-1 and internal surrogate key 887766 - if we consider the order number as the canonical ID then we can use these URL structures:
/orders/SK324-1 => returns order representation
/orders/id/887766 => redirects to /orders/SK324-1
Redirects should be done using the HTTP status code 303 See Other with a Location header containing the canonical URL.
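A small sketch of such an alias, assuming an in-memory lookup table: the dictionaries, the handle_get function and the (status, headers, body) return shape are all made up for the illustration, while the order number and surrogate key are the ones from the example.

```python
# Hypothetical data: one order known by its number and by a surrogate key.
ORDERS = {"SK324-1": {"OrderNumber": "SK324-1"}}
ORDER_NUMBER_BY_SURROGATE = {887766: "SK324-1"}


def handle_get(path):
    """Return (status, headers, body) for a GET on an order URL."""
    parts = path.strip("/").split("/")
    # /orders/id/{surrogate-key} is only an alias: redirect to the canonical URL.
    if len(parts) == 3 and parts[:2] == ["orders", "id"]:
        key = int(parts[2]) if parts[2].isdigit() else None
        number = ORDER_NUMBER_BY_SURROGATE.get(key)
        if number is None:
            return 404, {}, None
        return 303, {"Location": f"/orders/{number}"}, None
    # /orders/{order-number} is the canonical resource.
    if len(parts) == 2 and parts[0] == "orders" and parts[1] in ORDERS:
        return 200, {}, ORDERS[parts[1]]
    return 404, {}, None
```

Only the canonical URL ever returns a representation, so caches and links all converge on one address per order.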
As stated earlier on: do not confuse searching with identity. You may want to search for a person with a specific e-mail, but the result should include the canonical URL of the found person.
See also http://www.w3.org/TR/webarch/#uri-aliases for a discussion of URL aliases and duplication.
Relations and back-references
What if we want back-references and other relations to other resources, does that influence the URL structure? For instance, now that we have links to states in a country, we might also want links to the country to which a state belongs. That might lead to something like this:
/states/{state-id}/country
But, wait a minute, we already have links to countries, right? The canonical version, though, is /countries/{country-id}, so how do we get from /states/{state-id}/country to /countries/{country-id}? The obvious answer is to consider the /states/{state-id}/country URL as an alias for some country and use HTTP redirects to get to the canonical country URL.
But let's step back and take a broader look at relations in general; a back-reference is just one kind of relation from one resource to another - but we could have many other kinds of relations, like "neighbor states", "the country of a city", "statistical information about a state" and so on. The general solution to this concept is to include links in the payloads instead of creating a myriad of small alias resources that only redirect to canonical URLs.
So, instead of using /states/{state-id}/country for the country of a certain state, we include the canonical country link in the representation of the state:
GET /states/4321
returns =>
{
Name: "State X",
CountryLink: "http://.../countries/1234",
NeighborStatesLink: "http://.../states/4321/neighbors"
}
Static data, volatile data and caching
Sometimes we end up with some sort of "hotspot" resource with tons of requests and very volatile content, making it impossible to cache the result and improve performance in that way. A solution to this may be to split the resource into two (or more) different sub-resources; a cacheable resource and a volatile non-cacheable resource.
Take for instance our state resource at /states/{state-id} - it may contain some very static data like the name of the state, its area and such like plus some volatile data like for instance the number of Tweets tweeted from that state in the last ten minutes. The static information could easily be cached, but we have no way to do it since the complete resource also contains the number of Tweets.
The solution is straightforward: split the resource into two different resources:
/states/{state-id} => static state information
/states/{state-id}/tweet-stats => volatile tweet information
I'll admit that the above example is rather contrived, so let's try a more realistic example: a streaming music distribution network publishes information about its songs through an online web API. Each song has its own resource representation with details about the song. The title, lyrics, artist and such like won't change much (if ever), but the company also publishes the number of current listeners which changes all the time. To improve caching characteristics the song data is split into (at least) two different resources:
/songs/{song-id} => static song details (cacheable)
/songs/{song-id}/usage => volatile usage information (non cacheable)
But who says the song usage is published by the same API? Some time after the initial release of the web API the company off-loads some of the streaming to another content delivery network which will also deliver the usage statistics. Now suddenly not only the URL structure changes but even the host name changes:
/songs/{song-id} => static song details (cacheable)
http://cdn.com/acme/file-usage/{song-id} => volatile usage information (non cacheable)
This is a breaking change and all clients must now be upgraded. Had the API instead contained hyper links then the change would have been transparent to all clients.
Classic song representation:
{
Id: 1234,
Name: "My song"
}
Hyper media improved representation:
{
Id: 1234,
Name: "My song",
UsageLink: "http://cdn.com/acme/file-usage/1234"
}
Once again we see how unimportant the actual URL structure is when we start using hyper media elements in the responses.
Formats and content types
If the same resource can be found in different formats (encoded with different media types) then we can ask ourselves: should URLs end in .json, .xml or similar extensions? On the one hand it makes it easy to explore the different representations using a standard web browser - on the other hand it introduces different URL aliases for the same resource.
My recommendation is to implement the extensions as a convenience for the client developers, but avoid using them when interacting with the API "for real". If for instance our geographical API can return both JSON as well as XML and HTML then I would use these URLs for states:
/states/{state-id} => canonical URL used in all returned hyper media elements
/states/{state-id}.json => JSON representation of state
/states/{state-id}.xml => XML representation of state
/states/{state-id}.html => HTML representation of state
The canonical URL would also support standard HTTP content negotiation for JSON, XML and HTML representations of the exact same resource. The framework I use, OpenRasta, supports this dual type of "content negotiation" right out of the box with no implementation overhead.
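To make the dual scheme concrete, here is a simplified sketch of how a server might pick the representation. It is not how OpenRasta does it; the function resolve and the table MEDIA_TYPES are invented for the example, q-values in the Accept header are ignored, and falling back to JSON is an arbitrary assumption.

```python
# Extensions offered as a convenience, mapped to the media types they select.
MEDIA_TYPES = {
    "json": "application/json",
    "xml": "application/xml",
    "html": "text/html",
}


def resolve(path, accept="*/*"):
    """Return (canonical path, media type) for a request path and Accept header."""
    base, dot, extension = path.rpartition(".")
    # A known extension wins and is stripped to recover the canonical URL.
    if dot and extension in MEDIA_TYPES:
        return base, MEDIA_TYPES[extension]
    # Otherwise take the first supported type listed in Accept (q-values ignored).
    supported = set(MEDIA_TYPES.values())
    for entry in accept.split(","):
        media_type = entry.split(";")[0].strip()
        if media_type in supported:
            return path, media_type
    return path, MEDIA_TYPES["json"]  # assumed default
```

Both /states/4321.xml and a request for /states/4321 with "Accept: application/xml" end up at the same canonical resource with the same representation.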
If our resources have different variations then we can add them as "sub resources" of the primary resource (not that such a thing really exists since URLs are opaque strings). Where I work we have resources for documents in a case management system. These resources contain meta data about the document (title, owner and so on) - and then we have various other (sub) resources for the documents themselves - the raw binary document (image, PowerPoint, PDF etc.), a PDF replica of the document and a PDF replica with an added front page containing the document meta data. Thus we get these URLs:
/documents/{doc-id} => canonical document meta data URL
/documents/{doc-id}/pdf => PDF replica
/documents/{doc-id}/meta-pdf => PDF replica with meta data frontpage
Use NOUNS not VERBS
I think most people get this right nowadays: URLs should be NOUNS not VERBS. Avoid URLs like /getOrders and /updateCountry - use all of the HTTP verbs instead when interacting with the resources and use something like /orders/{order-id} and /countries/{country-id} for the URLs. If you run out of HTTP verbs then invent new resources.
In this way you will avoid the trap of doing something horrible like this, which I would expect to delete order number 1234 when you GET the resource:
GET /orders/1234/delete
You also get the ability to identify all your resources and add caching, which is not possible with this sort of old school SOAP'ish look-up mechanism:
POST /orders
Body => { OrderId: 1234, Operation: "read" }
And you get a nice explorable and consistent API that your developers will love to use :-)
Versioning
Where should API version numbers go in the URL? Should it be /api/v1/countries, /countries-1 or maybe in the host name http://v1.api.geo.com/countries?
Well, API and URL versioning is a whole story in itself so I suggest you take a look at Mark Nottingham's excellent "API versioning smackdown" (http://www.mnot.net/blog/2011/10/25/web_api_versioning_smackdown) for a good discussion on this subject.
Have fun, hack some code and create beautiful APIs out there :-)
/Jørn
Friday, May 17, 2013
The role of media types in RESTful web services
One of the never ending discussions in the REST community is that of custom and domain specific media types; should we, or should we not, create new media types - and if we should, for what reasons should it be done?
In this blog post I will discuss the role of media types in web services and illustrate it with an example media type. I will go through the requirements for this media type and from this I will build up the features it needs to support. Together with this I will show some example scenarios and sketch out the processing algorithm for the client side. Finally I compare this media type to other similar media types (HAL, Siren, JSON-API).
My goals for this blog post are:
By systems integration I mean the kind of background processing that takes place behind the scenes in almost any IT enabled business today; shuffling data from one system to another in a safe and durable way without any human interaction.
REST seems like a good fit for systems integration. It has a strong focus on loosely coupled systems where servers and clients can evolve independently of each other; if we can leverage that then the whole ecosystem of multiple servers and clients should be a lot easier to maintain and with much less downtime required for upgrading the various components.
There is an ongoing trend to include hyper media controls in newer web services; that is a good trend as it removes the client's dependency on specific URL structures. This in turn allows the server to evolve by adding new resources and linking to these - and it also facilitates the ability to use multiple servers without the clients ever noticing (since the client does not care about either URL path structures or host names).
But there is still a thing missing in the puzzle. In Roy Fielding's (in)famous rant "REST APIs must be hypertext-driven" he states:
Especially the last statement is interesting: "all application state transitions must be driven by client selection of server-provided choices". This means the client should not make any requests without first being instructed to do so (and how to do it). The client should not POST a new Tweet, bug report or similar without being instructed, on the fly, by some mechanism embedded in the server responses. Today's use of links in responses is on the right track, but links do not inform the client about what HTTP method to use (GET is assumed) and neither do they say anything about the possible payload.
With this blog post I will try to explain how a media type, with a sufficient number of hyper media controls, together with some intelligent client side code, can enable what Fielding is describing. The downside of this approach is that client implementations become more complex - the upside is that the whole client/server application becomes much more loosely coupled which, in the end, hopefully will help us reach a maintenance Nirvana of loosely coupled systems integration :-)
By the way, I am not comparing REST with SOAP/WSDL and EDA (event driven architectures) - that is not the purpose here even though these are often found in systems integration projects. I would rather just explore what benefits we can get from REST.
The media type must be rich enough in terms of hyper media affordances to enable all the operations needed for systems integration.
The media type does not need to include much, if anything, in terms of UI elements since it is intended for operations without human interaction. Neither is the media type intended for mobile use where bandwidth and message size are a concern.
The media type will be based on JSON. It could just as well be based on XML but, in my experience, JSON is a lot simpler to work with, fits the data needs I have met, and has a simple and easy-to-work-with patch format (application/json-patch) which will come in handy later on.
Armed with these constraints and requirements we are ready to build up our new media type.
BugMe is not a part of the media type specification - it is only used to illustrate how the media type facilitates interaction with BugMe servers anywhere on the web.
Neither is BugMe a vendor specific "standard", it is strictly defined in terms of the generic media type and a set of bug reporting specific data structures and identifiers (more on that later on).
Compare this to APIs like Twitter and others; these are always defined in terms of vendor specific resources and explicit URL structures and were never designed to be implemented on servers anywhere else on the web.
To highlight the difference between a standard like BugMe and an actual implementation I will assume that some clever guy named Joe, who studies computer science 101 at Example.edu, has set up a BugMe server for some local study project. He is using an implementation that uses a vocabulary slightly different from BugMe - it talks about "issues" where BugMe talks about "bug reports". This fact is illustrated through the concrete URLs used in the examples. The root URL is http://example.edu/~joe/track.
Now we are ready to set our client loose and make it create the bug report. It will do so in the same manner as a human working with a web based UI: get a resource representation, look for well known identifiers that label data and hyper media controls, fill out data and activate hyper media controls.
This interaction pattern, getting a resource representation and following instructions on the fly, has a price: it requires more complex client side logic than "normal RPC" patterns with design time binding of methods and it results in higher bandwidth due to the embedded hyper media controls. The upside is a much looser coupling between clients and servers. But all of this is of course already discussed in Fielding's thesis on REST ;-)
Request
GET /~joe/track/index
Accept: application/razor+json
Response
Content-Type: application/razor+json
{
curies:
[
{ prefix: "bug", reference: "http://bugme.org/names/" }
],
controls:
[
...,
{
type: "link",
name: "bug:create-bug-report",
href: "http://example.edu/~joe/track/add-issue",
title: "Add issue to issue tracker"
},
...
]
}
The returned JSON data contains two top level properties defined by the media type: curies and controls. "curies" define short names for URLs used as identifiers in the other elements (see http://www.w3.org/TR/curie/) and "controls" contains various hyper media controls. The use of curies should be optional - but it helps when reading the responses in posts like this.
Now the client scans the "controls" element looking for the identifier "bug:create-bug-report". In this case it finds a "link" control which is equivalent to an ATOM link. Since our client understands all the features of the media type it will know that a link should be "followed" by issuing an HTTP GET on the "href" value.
This little "algorithm" is equivalent to what a human would do: open up a webpage, look for instructions on how to perform the task at hand and then follow them.
You may have noticed the dots "..." in the example. Those are there for a reason: they illustrate how the client only cares about stuff that is relevant to its current task. Anything else in the response is ignored. The consequence is that the server is free to evolve the content of the resource over time without breaking any clients - as long as it only adds new stuff. Neither does the client care if the content is supposed to be a "link page", a service index, a medical record or have any other specific "type" - as long as it contains elements that will help the client get closer to its goal.
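The scanning step can be sketched roughly like this. The functions expand and find_control are names invented for the illustration, and the document is the response above written as a Python dictionary with the elided "..." parts simply left out; the only thing taken from the media type is the curies/controls layout.

```python
def expand(name, curies):
    """Expand a prefixed name like 'bug:xyz' using the document's curies."""
    prefix, separator, local = name.partition(":")
    for curie in curies:
        if separator and curie["prefix"] == prefix:
            return curie["reference"] + local
    return name  # no matching prefix: treat the name as already complete


def find_control(document, identifier):
    """Return the first control whose expanded name equals the identifier."""
    curies = document.get("curies", [])
    for control in document.get("controls", []):
        if expand(control.get("name", ""), curies) == identifier:
            return control
    return None  # everything else in the document is simply ignored


index = {
    "curies": [{"prefix": "bug", "reference": "http://bugme.org/names/"}],
    "controls": [
        {"type": "link",
         "name": "bug:create-bug-report",
         "href": "http://example.edu/~joe/track/add-issue",
         "title": "Add issue to issue tracker"},
    ],
}

control = find_control(index, "http://bugme.org/names/create-bug-report")
```

The client would then look at control["type"] to decide what to do next, which for a "link" means a GET on control["href"].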
Request
GET /~joe/track/add-issue
Accept: application/razor+json
Response
200 OK
Content-Type: application/razor+json
{
curies: ...,
controls:
[
{
type: "poe-factory",
name: "bug:create-bug-report",
href: "http://example.edu/~joe/track/add-issue",
title: "Create new idempotent POE resource"
}
]
}
Bingo! This time the client finds a "poe-factory" control with the right name "bug:create-bug-report" and now it's time to create the bug report. The control type "poe-factory" means "Post Once Exactly factory" and is a special action element that enables idempotent POST operations. If you do not know what "idempotent" means then take a look at this page: http://www.infoq.com/news/2013/04/idempotent.
The good thing about idempotent operations is that they can safely be repeated if anything goes wrong on the network. If an operation times out the client can simply retry it again without the risk of creating the same entry multiple times. And since this new media type is for safe and durable "behind the scenes" work I find it rather important to include a mechanism for idempotent POST operations.
The implementation chosen here requires the client to do an empty POST first. This will create a new POE resource (thus the name "poe-factory") and redirect the client to it. The client can then POST to the new resource as many times as it needs until the operation succeeds. The server returns "201 Created" the first time it completes the operation whereas it returns "303 See Other" on subsequent requests. In either case the server includes a "Location" header pointing to the new POE resource.
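Seen from the server side, the mechanism could be sketched as below. This is only one possible reading: the class PoeFactory and its method names are invented, the state is kept in memory, and the Location of the second step is assumed to be whatever URL the supplied create callback returns for the finished resource.

```python
import uuid


class PoeFactory:
    """In-memory sketch of a Post Once Exactly factory."""

    def __init__(self, create, base_url="/add-issue"):
        self._create = create      # callable: payload -> URL of created resource
        self._base_url = base_url  # hypothetical path of the factory
        self._results = {}         # POE id -> created URL, or None if unused

    def post_factory(self):
        """Empty POST on the factory: hand out a fresh one-shot POE resource."""
        poe_id = uuid.uuid4().hex
        self._results[poe_id] = None
        return 201, {"Location": f"{self._base_url}/{poe_id}"}

    def post_poe(self, poe_id, payload):
        """POST on a POE resource: create once, answer repeats with 303."""
        if poe_id not in self._results:
            return 404, {}
        if self._results[poe_id] is None:
            # First successful POST: do the real work and remember the outcome.
            self._results[poe_id] = self._create(payload)
            return 201, {"Location": self._results[poe_id]}
        # Retries are harmless: nothing new is created.
        return 303, {"Location": self._results[poe_id]}


created = []

def create_issue(payload):
    # Stand-in for storing the bug report; returns its (hypothetical) URL.
    created.append(payload)
    return f"/issues/{len(created)}"

factory = PoeFactory(create_issue)
_, headers = factory.post_factory()
poe_id = headers["Location"].rsplit("/", 1)[-1]
first = factory.post_poe(poe_id, {"Title": "Something bad happened"})
retry = factory.post_poe(poe_id, {"Title": "Something bad happened"})
```

If create_issue raises, nothing is recorded for the POE resource, so a later retry still gets to create the report.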
Subbu Allamaraju has a nice blog post on post once exactly techniques.
I chose this approach for the following reasons:
Request
POST /~joe/track/add-issue
Content-length: 0
Response
201 Created
Location: http://example.edu/~joe/track/add-issue/bd925-ye174h
Request
GET /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json
Response
200 OK
Content-Type: application/razor+json
{
curies: ...,
controls:
[
{
type: "poe-action",
name: "bug:create-bug-report",
documentation: ... some URL ...,
method: "POST",
href: "http://example.edu/~joe/track/add-issue/bd925-ye174h",
type: "application/json",
scaffold: ... any JSON object ...,
title: "Add issue"
}
]
}
Now the client gets a response with a "poe-action" control. This tells the client that it can safely POST as many times as it needs to the "href" URL. The actual payload is given by the BugMe specification (Title, Description, Severity).
Some comments on the above response:
Request
POST /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json
Content-Type: application/json
{
Title: "Something bad happened",
Description: "I pressed ctrl-alt-del and all went black",
Severity: 5
}
Response
201 Created
Location: http://example.edu/~joe/track/issues/32
Request
GET /~joe/track/issues/32
Accept: application/razor+json
Response
Content-Type: application/razor+json
{
curies: ...,
controls: ...,
payloads:
[
...,
{
name: "bug:bug-report",
data:
{
Id: 32,
Title: "Something bad happened",
Description: "I pressed ctrl-alt-del and all went black",
Severity: 5,
Created: "2012-04-23T18:25:43Z"
}
},
...
]
}
Now that the client can see the actual bug report it wanted to create it knows that the task is completed. Everyone is smiling and puts on their happy face :-)
Here is an example:
Request
POST /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json
Content-Type: application/json
{
Title: "Something bad happened",
Description: "I pressed ctrl-alt-del and all went black",
Severity: 5
}
Response
503 Service Unavailable
Content-Type: application/razor+json
{
error:
{
message: "Could not create new bug report; server is down for maintenance",
...
}
}
In addition to this the client can try to use content negotiation to receive error information in the format of application/api-problem+json.
As the media type evolves and more types of hyper media controls are added the client(s) will grow more and more complex. This is one of the trade-offs that have to be accepted in order to keep clients and servers as loosely coupled as possible.
If the media type gets popular one could even expect to see the same scenario we see with today's web browsers: there will be multiple implementations of the client libraries and some will implement more than others of the final specification.
And then there is Jim Webber's fantastic "How to GET a cup of coffee" which has been a big inspiration for me over the years.
I don't see anything wrong with creating many media types - eventually a few of them will be good enough and gain enough traction to become ubiquitous standards. That's called evolution.
So what do you think? Was this useful, understandable, totally overkill, outright naive or simply a pile of, well, rubbish? Feel free to add a comment, Tweet me or send me an e-mail. I would love to get some feedback.
Happy hacking, Jørn
UPDATE 2014-02-24: I have actually put much of this into a media type called Mason. See http://soabits.blogspot.dk/2014/02/implementing-hypermedia-apis-and-rest.html.
In this blog post I will discuss the role of media types in web services and illustrate it with an example media type. I will go through the requirements for this media type and from this I will build up the features it needs to support. Together with this I will show some example scenarios and sketch out the processing algorithm for the client side. Finally I compare this media type to other similar media types (HAL, Siren, JSON-API).
My goals for this blog post are:
- To improve my own understanding of the role of media types in RESTful web services - and share that with others.
- To define a new media type for what I call systems integration - and show how it facilitates loose coupling between the integration components.
By systems integration I mean the kind of background processing that takes place behind the scenes in almost any IT enabled business today; shuffling data from one system to another in a safe and durable way without any human interaction.
REST seems like a good fit for systems integration. It has a strong focus on loosely coupled systems where servers and clients can evolve independently of each other; if we can leverage that then the whole ecosystem of multiple servers and clients should be a lot easier to maintain and with much less downtime required for upgrading the various components.
There is an ongoing trend to include hyper media controls in newer web services; that is a good trend as it removes the client's dependency on specific URL structures. This in turn allows the server to evolve by adding new resources and linking to these - and it also facilitates the ability to use multiple servers without the clients ever noticing (since the client does not care about either URL path structures or host names).
But there is still a thing missing in the puzzle. In Roy Fielding's (in)famous rant "REST APIs must be hypertext-driven" he states:
... Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type
... From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations
Especially the last statement is interesting: "all application state transitions must be driven by client selection of server-provided choices". This means the client should not make any requests without first being instructed to do so (and how to do it). The client should not POST a new Tweet, bug report or similar without being instructed, on the fly, by some mechanism embedded in the server responses. Today's use of links in responses is on the right track, but links do not inform the client about what HTTP method to use (GET is assumed) and neither do they say anything about the possible payload.
With this blog post I will try to explain how a media type, with a sufficient number of hyper media controls, together with some intelligent client side code, can enable what Fielding is describing. The downside of this approach is that client implementations become more complex - the upside is that the whole client/server application becomes much more loosely coupled which, in the end, hopefully will help us reach a maintenance Nirvana of loosely coupled systems integration :-)
By the way, I am not comparing REST with SOAP/WSDL and EDA (event driven architectures) - that is not the purpose here even though these are often found in systems integration projects. I would rather just explore what benefits we can get from REST.
Media type requirements and constraints
The primary driver for this new media type is loose coupling where the clients only depend on the media type and some out-of-band business specific data structures and identifiers. This means:
- The client must not make any assumptions about URL structures.
- The client must not make any assumptions about what concrete service implementation it is interacting with.
- The client must not initiate any HTTP request without following instructions embedded in server responses (besides the initial request).
- The client should not be given more than:
  - A root URL from which all other resources must be discovered at runtime.
  - A set of business specific data structures.
  - A set of well known identifiers for locating hyper media controls and business data.
The media type must be rich enough in terms of hyper media affordances to enable all the operations needed for systems integration.
The media type does not need to include much, if anything, in terms of UI elements since it is intended for operations without human interaction. Neither is the media type intended for mobile use where bandwidth and message size are a concern.
The media type will be based on JSON. It could just as well be based on XML but, in my experience, JSON is a lot simpler to work with, fits the data needs I have met, and has a simple and easy-to-work-with patch format (application/json-patch) which will come in handy later on.
Armed with these constraints and requirements we are ready to build up our new media type.
Example business domain "BugMe"
Throughout this blog post I will use the imaginary open standard "BugMe" for interacting with bug tracking systems through the new media type. BugMe supports adding of new bug reports, attaching documents to reports, adding comments to reports and similar features shown later on.
BugMe is not a part of the media type specification - it is only used to illustrate how the media type facilitates interaction with BugMe servers anywhere on the web.
Neither is BugMe a vendor specific "standard", it is strictly defined in terms of the generic media type and a set of bug reporting specific data structures and identifiers (more on that later on).
Compare this to APIs like Twitter and others; these are always defined in terms of vendor specific resources and explicit URL structures and was never designed to be implemented on servers anywhere else on the web.
To highlight the difference between a standard like BugMe and an actual implementation I will assume that some clever guy named Joe, who studies computer science 101 at Example.edu, has set up a BugMe server for some local study project. He is using an implementation that uses a vocabulary slightly different from BugMe - it talks about "issues" where BugMe talks about "bug reports". This fact is illustrated through the concrete URLs used in the examples . The root URL is http://example.edu/~joe/track.
Example 1 - Creating a bug report
The first thing we will try is to create a new bug report with BugMe. To do so we must supply our client with a few details about the operation:
- The root URL: http://example.edu/~joe/track/index.
- A "create bug report" identifier (as defined by BugMe): "http://bugme.org/names/create-bug-report".
- Bug reporting data (as defined by BugMe)
  - Title: "Something bad happened",
  - Description: "I pressed ctrl-alt-del and all went black",
  - Severity: 5
Now we are ready to set our client loose and make it create the bug report. It will do so in the same manner as a human working with a web based UI: get a resource representation, look for well known identifiers that label data and hyper media controls, fill out data and activate hyper media controls.
This interaction pattern, getting a resource representation and following instructions on the fly, has a price: it requires more complex client side logic than "normal RPC" patterns with design time binding of methods, and it uses more bandwidth due to the embedded hyper media controls. The upside is a much looser coupling between clients and servers. But all of this is of course already discussed in Fielding's thesis on REST ;-)
GET initial resource
At the very beginning our client has nothing to do but GET the root URL in hope of finding something useful there:
Request
GET /~joe/track/index
Accept: application/razor+json
Response
Content-Type: application/razor+json
{
  curies:
  [
    { prefix: "bug", reference: "http://bugme.org/names/" }
  ],
  controls:
  [
    ...,
    {
      type: "link",
      name: "bug:create-bug-report",
      href: "http://example.edu/~joe/track/add-issue",
      title: "Add issue to issue tracker"
    },
    ...
  ]
}
The returned JSON data contains two top level properties defined by the media type: curies and controls. "curies" define short names for URLs used as identifiers in the other elements (see http://www.w3.org/TR/curie/) and "controls" contains various hyper media controls. The use of curies should be optional - but it helps reading the responses in posts like this.
Now the client scans the "controls" element looking for the identifier "bug:create-bug-report". In this case it finds a "link" control which is equivalent to an Atom link. Since our client understands all the features of the media type it will know that a link should be "followed" by issuing an HTTP GET on the "href" value.
This little "algorithm" is equivalent to what a human would do: open up a webpage, look for instructions on how to perform the task at hand and then follow them.
You may have noticed the dots "..." in the example. Those are there for a reason: they illustrate how the client only cares about stuff that is relevant to its current task. Anything else in the response is ignored. The consequence is that the server is free to evolve the content of the resource over time without breaking any clients - as long as it only adds new stuff. Neither does the client care if the content is supposed to be a "link page", a service index, a medical record or has any other specific "type" - as long as it contains elements that will help the client get closer to its goal.
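The lookup described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names expand_curie and find_control are made up here and are not part of the media type, and the extra "other:thing" control and "something-else" property merely stand in for the "..." parts of the response.

```python
def expand_curie(name, curies):
    """Expand 'prefix:suffix' into a full identifier using the curies list."""
    prefix, sep, suffix = name.partition(":")
    for curie in curies:
        if sep and curie.get("prefix") == prefix:
            return curie["reference"] + suffix
    return name  # no matching prefix: treat the name as already complete


def find_control(document, identifier):
    """Return the first control whose expanded name equals identifier, else None."""
    curies = document.get("curies", [])
    for control in document.get("controls", []):
        if expand_curie(control.get("name", ""), curies) == identifier:
            return control
    return None


root = {
    "curies": [{"prefix": "bug", "reference": "http://bugme.org/names/"}],
    "controls": [
        # Unknown controls and properties are simply skipped by the client.
        {"type": "link", "name": "other:thing", "href": "http://example.edu/~joe/track/other"},
        {
            "type": "link",
            "name": "bug:create-bug-report",
            "href": "http://example.edu/~joe/track/add-issue",
            "title": "Add issue to issue tracker",
        },
    ],
    "something-else": "ignored by this client",
}

control = find_control(root, "http://bugme.org/names/create-bug-report")
print(control["type"], control["href"])
```

The client compares fully expanded identifiers, so it keeps working whether the server uses the "bug" prefix, another prefix or no curies at all.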
Follow link
Here we have the next operation:
Request
GET /~joe/track/add-issue
Accept: application/razor+json
Response
200 OK
Content-Type: application/razor+json
{
  curies: ...,
  controls:
  [
    {
      type: "poe-factory",
      name: "bug:create-bug-report",
      href: "http://example.edu/~joe/track/add-issue",
      title: "Create new idempotent POE resource"
    }
  ]
}
Bingo! This time the client finds a "poe-factory" control with the right name "bug:create-bug-report" and now it's time to create the bug report. The control type "poe-factory" means "Post Once Exactly factory" and is a special action element that enables idempotent POST operations. If you do not know what "idempotent" means then take a look at this page: http://www.infoq.com/news/2013/04/idempotent.
The good thing about idempotent operations is that they can safely be repeated if anything goes wrong on the network. If an operation times out the client can simply retry it without the risk of creating the same entry multiple times. And since this new media type is for safe and durable "behind the scenes" work I find it rather important to include a mechanism for idempotent POST operations.
The implementation chosen here requires the client to do an empty POST first. This will create a new POE resource (thus the name "poe-factory") and redirect the client to it. The client can then POST to the new resource as many times as it needs until the operation succeeds. The server returns "201 Created" the first time it completes the operation whereas it returns "303 See Other" on following requests. In either case the server includes a "Location" header pointing to the newly created resource.
Subbu Allamaraju has a nice blog post on post once exactly techniques.
I chose this approach for the following reasons:
- It has the simplest possible client side logic - at the cost of an extra round trip to the server. A similar solution could have required the client to create a GUID (message ID) and include it in the payload somehow, but that would make the protocol slightly more prone to client side errors.
- It requires no special headers.
- It adds no extra information to the payload.
- URLs are opaque and the server gets to choose how the POE/message ID is encoded.
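To make the POE exchange a bit more concrete, here is a minimal in-memory sketch in Python of how a server could behave. It is not an actual implementation of anything: the class PoeServer, its method names and the way IDs are generated are assumptions made for illustration, and the 404 for unknown POE IDs is a choice of this sketch. The "Location" header on a completed POST points at the created report, as in the example later in this post.

```python
import uuid


class PoeServer:
    """In-memory stand-in for the server side of the POE exchange."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.pending = {}   # POE id -> URL of the created report, or None if unused
        self.reports = []

    def post_factory(self):
        """Empty POST to the factory: create a fresh POE resource."""
        poe_id = uuid.uuid4().hex   # the server alone decides how the ID looks
        self.pending[poe_id] = None
        return 201, {"Location": f"{self.base_url}/add-issue/{poe_id}"}

    def post_poe(self, poe_id, payload):
        """POST to a POE resource: create once, answer repeats with 303."""
        if poe_id not in self.pending:
            return 404, {}
        existing = self.pending[poe_id]
        if existing is not None:
            return 303, {"Location": existing}   # already done, nothing new is created
        self.reports.append(payload)
        location = f"{self.base_url}/issues/{len(self.reports)}"
        self.pending[poe_id] = location
        return 201, {"Location": location}


server = PoeServer("http://example.edu/~joe/track")
status, headers = server.post_factory()
poe_id = headers["Location"].rsplit("/", 1)[1]

report = {"Title": "Something bad happened", "Severity": 5}
first = server.post_poe(poe_id, report)
second = server.post_poe(poe_id, report)   # e.g. a retry after a timeout
print(first[0], second[0], len(server.reports))   # prints: 201 303 1
```

The repeated POST does not create a second report, and both answers carry the same "Location" value.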
Create POE resource
In order to complete its task the client first issues an empty POST operation to the URL of the "href" attribute:
Request
POST /~joe/track/add-issue
Content-length: 0
Response
201 Created
Location: http://example.edu/~joe/track/add-issue/bd925-ye174h
GET POE resource
It should be rather obvious now that the client has no choice but to follow the Location header of the response:
Request
GET /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json
Response
200 OK
Content-Type: application/razor+json
{
  curies: ...,
  controls:
  [
    {
      type: "poe-action",
      name: "bug:create-bug-report",
      documentation: ... some URL ...,
      method: "POST",
      href: "http://example.edu/~joe/track/add-issue/bd925-ye174h",
      type: "application/json",
      scaffold: ... any JSON object ...,
      title: "Add issue"
    }
  ]
}
Now the client gets a response with a "poe-action" control. This tells the client that it can safely POST as many times as it needs to the "href" URL. The actual payload is given by the BugMe specification (Title, Description, Severity).
Some comments on the above response:
- The payload is encoded in application/json as a trivial JSON object. Other formats may be included in the media type spec later on.
- This format is NOT intended for automatic creation of UIs and thus it contains no UI related list of field definitions or similar.
- It is NOT necessary to embed any kind of schema information - that sort of thing is given by the name of the control element.
- The optional "scaffold" value is the JSON payload equivalent of a URL template: it supplies default values to some properties and adds additional "hidden" properties the client can ignore (as long as they are sent back).
- POE-actions are not restricted to POST - a PATCH with a JSON Patch document would work as well (but then perhaps we need to change the action type name).
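The "scaffold" idea can be illustrated with a small Python sketch. Everything specific in it is assumed: the helper name build_payload, the default Severity of 3 and the hidden "_token" property are invented for the example, and the merge is a simple top-level overlay.

```python
import copy


def build_payload(scaffold, business_data):
    """Start from the server's scaffold and overlay the client's own values."""
    payload = copy.deepcopy(scaffold) if scaffold else {}
    payload.update(business_data)   # client values win over scaffold defaults
    return payload


# Hypothetical scaffold: one default value and one "hidden" property.
scaffold = {"Severity": 3, "_token": "k29x"}
report = {
    "Title": "Something bad happened",
    "Description": "I pressed ctrl-alt-del and all went black",
    "Severity": 5,
}

payload = build_payload(scaffold, report)
print(payload)
```

The client never needs to understand "_token"; it only has to send it back unchanged together with its own data.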
Create bug report
Then the client issues a new request:
Request
POST /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json
Content-Type: application/json
{
  Title: "Something bad happened",
  Description: "I pressed ctrl-alt-del and all went black",
  Severity: 5
}
Response
201 Created
Location: http://example.edu/~joe/track/issues/32
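Since the POE resource makes this POST safe to repeat, a client can wrap it in a simple retry loop. The following Python sketch shows one way to do that; the function post_until_done, the send callable and the FlakyTransport test double are all assumptions for illustration, and a real client would probably also wait between attempts.

```python
def post_until_done(send, url, payload, max_attempts=5):
    """Repeat a POE POST until the server answers 201 or 303.

    `send` is any callable taking (url, payload) and returning (status, headers);
    it may raise TimeoutError when the network fails.
    """
    for _ in range(max_attempts):
        try:
            status, headers = send(url, payload)
        except TimeoutError:
            continue   # safe to repeat: the POE resource makes the POST idempotent
        if status in (201, 303):
            return headers["Location"]
        if 400 <= status < 500:
            raise RuntimeError(f"request rejected with status {status}")
        # Any other status (e.g. 5xx): fall through and try again.
    raise RuntimeError(f"gave up after {max_attempts} attempts")


class FlakyTransport:
    """Test double: the first response is lost after the server did the work."""

    def __init__(self):
        self.calls = 0
        self.created = 0

    def __call__(self, url, payload):
        self.calls += 1
        if self.calls == 1:
            self.created += 1                      # the server created the report ...
            raise TimeoutError("response lost")    # ... but the client never heard back
        return 303, {"Location": "http://example.edu/~joe/track/issues/32"}


transport = FlakyTransport()
location = post_until_done(
    transport,
    "http://example.edu/~joe/track/add-issue/bd925-ye174h",
    {"Title": "Something bad happened", "Severity": 5},
)
print(location, transport.calls, transport.created)
```

Even though the first answer was lost, the client ends up with the URL of the one and only created report.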
GET created bug report
Now we are done unless we want to see the actual created bug report by following the Location header:
Request
GET /~joe/track/issues/32
Accept: application/razor+json
Response
Content-Type: application/razor+json
{
  curies: ...,
  controls: ...,
  payloads:
  [
    ...,
    {
      name: "bug:bug-report",
      data:
      {
        Id: 32,
        Title: "Something bad happened",
        Description: "I pressed ctrl-alt-del and all went black",
        Severity: 5,
        Created: "2012-04-23T18:25:43Z"
      }
    },
    ...
  ]
}
Now that the client can see the actual bug report it wanted to create it knows that the task is completed. Everyone is smiling and puts on their happy face :-)
Other hyper media controls
There are of course more scenarios to cover than this single "Create stuff" scenario and these scenarios will call for other kinds of hyper media controls, for instance URL templates, PATCH actions, binary file upload and more (I should cover these in some future blog posts ...)
Error handling
If the client receives a 4xx or 5xx status code it can inspect the JSON payload and look for a property named "error" together with the other "payloads" and "controls" properties. The "error" property should contain data according to my previous blog post on error handling.
Here is an example:
Request
POST /~joe/track/add-issue/bd925-ye174h
Accept: application/razor+json
Content-Type: application/json
{
  Title: "Something bad happened",
  Description: "I pressed ctrl-alt-del and all went black",
  Severity: 5
}
Response
503 Service Unavailable
Content-Type: application/razor+json
{
  error:
  {
    message: "Could not create new bug report; server is down for maintenance",
    ...
  }
}
In addition to this the client can try to use content negotiation to receive error information in the format of application/api-problem+json.
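On the client side the check for an "error" property could look roughly like this Python sketch. The function name describe_failure and the fallback text are assumptions; only the "error" and "message" property names come from the example above.

```python
import json


def describe_failure(status, body_text):
    """Return a readable message for a 4xx/5xx response, or None on success."""
    if status < 400:
        return None
    try:
        body = json.loads(body_text)
    except ValueError:   # not JSON at all, e.g. an HTML error page from a proxy
        body = None
    error = body.get("error") if isinstance(body, dict) else None
    if isinstance(error, dict) and "message" in error:
        return f"{status}: {error['message']}"
    return f"{status}: no error details in response"


body = '{"error": {"message": "Could not create new bug report; server is down for maintenance"}}'
print(describe_failure(503, body))
print(describe_failure(500, "<html>Bad gateway</html>"))
```

The client must be prepared for error responses that do not follow the media type, since intermediaries may answer on behalf of the server.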
Client side processing algorithm
Here is a simplified view of how the client should process the content:
- GET initial root resource.
- [LOOP:] Look for hyper media controls with appropriate names.
- Check the type of the found control element:
  - If it is a "link" then follow that link and restart from [LOOP].
  - If it is a "poe-factory" then issue an empty POST to the href value and restart from [LOOP].
  - If it is a "poe-action" then issue a request with the specified method and data encoded according to the "target" media type. Then restart from [LOOP].
- Look for a payload with the appropriate name: If it exists then the task is complete - otherwise it has failed (actually I don't like this last step, but that is the only kind of "acknowledge" I can see the server responding with).
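The steps above can be sketched as a small Python loop. This is a simplification under several assumptions: run_task, find_named and fake_request are invented names, names are compared literally instead of expanding curies, and after each POST the client is assumed to GET the "Location" header before looking for controls again (as in the walkthrough). The fake_request function returns canned responses mirroring the example in this post, so the sketch runs without a network.

```python
def find_named(items, name):
    """Return the first dict in items whose "name" equals name, else None."""
    return next((item for item in items if item.get("name") == name), None)


def run_task(request, root_url, control_name, payload_name, data, max_steps=10):
    """Follow controls named control_name until a payload named payload_name appears.

    `request(method, url, body)` must return (headers, document).
    """
    headers, doc = request("GET", root_url, None)
    for _ in range(max_steps):
        control = find_named(doc.get("controls", []), control_name)
        if control is None:
            payload = find_named(doc.get("payloads", []), payload_name)
            if payload is None:
                raise RuntimeError("task failed: neither control nor payload found")
            return payload["data"]   # task complete
        if control["type"] == "link":
            headers, doc = request("GET", control["href"], None)
        elif control["type"] == "poe-factory":
            headers, _ = request("POST", control["href"], None)
            headers, doc = request("GET", headers["Location"], None)
        elif control["type"] == "poe-action":
            headers, _ = request(control["method"], control["href"], data)
            headers, doc = request("GET", headers["Location"], None)
        else:
            raise RuntimeError(f"unknown control type {control['type']!r}")
    raise RuntimeError("too many steps")


BASE = "http://example.edu/~joe/track"
NAME = "bug:create-bug-report"
stored = {}


def fake_request(method, url, body):
    """Canned responses that mirror the walkthrough in this post."""
    if (method, url) == ("GET", BASE + "/index"):
        return {}, {"controls": [{"type": "link", "name": NAME, "href": BASE + "/add-issue"}]}
    if (method, url) == ("GET", BASE + "/add-issue"):
        return {}, {"controls": [{"type": "poe-factory", "name": NAME, "href": BASE + "/add-issue"}]}
    if (method, url) == ("POST", BASE + "/add-issue"):
        return {"Location": BASE + "/add-issue/bd925-ye174h"}, {}
    if (method, url) == ("GET", BASE + "/add-issue/bd925-ye174h"):
        return {}, {"controls": [{"type": "poe-action", "name": NAME, "method": "POST",
                                  "href": BASE + "/add-issue/bd925-ye174h"}]}
    if (method, url) == ("POST", BASE + "/add-issue/bd925-ye174h"):
        stored.update(body, Id=32)
        return {"Location": BASE + "/issues/32"}, {}
    if (method, url) == ("GET", BASE + "/issues/32"):
        return {}, {"payloads": [{"name": "bug:bug-report", "data": dict(stored)}]}
    raise RuntimeError(f"unexpected request {method} {url}")


result = run_task(fake_request, BASE + "/index", NAME, "bug:bug-report",
                  {"Title": "Something bad happened", "Severity": 5})
print(result)
```

Note that the loop knows nothing about bug reports or URL structures; only the two names and the data come from BugMe.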
As the media type evolves and more types of hyper media controls are added the client(s) will grow more and more complex. This is one of the trade-offs that have to be accepted in order to keep clients and servers as loosely coupled as possible.
If the media type gets popular one could even expect to see the same scenario we see with today's web browsers: there will be multiple implementations of the client libraries and some will implement more than others of the final specification.
No profile needed
It may be tempting to allow for a "profile" parameter with the media type ID. But typically that would be used to ask for a specific "type" of a resource like for instance "application/razor+json;profile=user". As can be seen in the client side processing algorithm above there is no need for such a thing, so let's not introduce it.
Related work
Quite a few other people are trying to create new media types to reach similar goals, but none of them includes features such as POE semantics. Here is the list of related media types that I am aware of:
And then there is Jim Webber's fantastic "How to GET a cup of coffee" which has been a big inspiration for me over the years.
Reasons for creating a new media type
How many media types should we invent? Well, as many as needed, I would say. The media type described here includes some features not found in other media types (POE semantics for instance) and that should be sufficient argument for creating a new one.
I don't see anything wrong with creating many media types - eventually a few of them will be good enough and gain enough traction to become ubiquitous standards. That's called evolution.
Summary
In this blog post I have tried to explain one way of understanding the role of media types in RESTful web services and illustrated it by building up (parts of) a media type for systems integration. I have also touched upon the issue of "typed" resources and how to avoid it (by not assuming anything about the resource type and instead looking for certain identifiers in the response) ... there could be one more blog post to come on this issue.
So what do you think? Was this useful, understandable, totally overkill, outright naive or simply a pile of, well, rubbish? Feel free to add a comment, Tweet me or send me an e-mail. I would love to get some feedback.
Happy hacking, Jørn
UPDATE 2014-02-24: I have actually put much of this into a media type called Mason. See http://soabits.blogspot.dk/2014/02/implementing-hypermedia-apis-and-rest.html.
Thursday, April 19, 2012
Ramone: Media types and codecs
One of the basic building blocks of REST is the concept of a media type - the file format used to represent a resource on the web. Media types come in many different flavours - images, PDF, vCard, XML, JSON, spreadsheets and so on, each of them having their own specific formats and capabilities. If you haven't done it already then take a look at my previous post where I go deeper into details about media types.
Media types are considered first class citizens of Ramone, my C# library for consuming web APIs and RESTful services on the web - just like the uniform interface (GET/POST/PUT/...) and resource identifiers (URLs) - and in this post I will show how to work with different kinds of media types.
Codecs
A codec is a class that translates to and from the file format on the wire and some kind of internal representation in C#. To do so it must first implement either IMediaTypeWriter, IMediaTypeReader or both and then register with the current codec manager such that Ramone will be able to find it.
The codec interfaces are rather simple:
public interface IMediaTypeCodec
{
  object CodecArgument { get; set; }
}
public interface IMediaTypeWriter : IMediaTypeCodec
{
  void WriteTo(WriterContext context);
}
public interface IMediaTypeReader : IMediaTypeCodec
{
  object ReadFrom(ReaderContext context);
}
The context parameter contains references to the current session, the data stream, HTTPRequest, HTTPResponse and others that are available for the codec.
Decoding an HTML micro format
One example of a codec is the BlogCodec from Ramone's test library. This codec demonstrates how to decode a (non-standard) micro format from an HTML page that shows a blog listing (see https://gist.github.com/2305777 for the actual HTML).
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using HtmlAgilityPack;
// Plus the Ramone namespaces that provide TextCodecBase, ReaderContext, WriterContext, Anchor and ILink.

public class BaseCodec_Html : TextCodecBase<Resources.Blog>
{
  // This method is from TextCodecBase which has wrapped the binary input stream in a TextReader
  // using the charset encoding stated by the client's request headers.
  protected override Resources.Blog ReadFrom(TextReader reader, ReaderContext context)
  {
    // Using HtmlDocument from HtmlAgilityPack
    HtmlDocument doc = new HtmlDocument();
    doc.Load(reader);
    return ReadFromHtml(doc, context);
  }

  protected Resources.Blog ReadFromHtml(HtmlDocument html, ReaderContext context)
  {
    HtmlNode doc = html.DocumentNode;
    List<Resources.Blog.Post> posts = new List<Resources.Blog.Post>();

    // Scan through HTML and look for "class" attributes identifying values
    foreach (HtmlNode postNode in doc.SelectNodes(@"//div[@class=""post""]"))
    {
      HtmlNode title = postNode.SelectNodes(@".//*[@class=""post-title""]").First();
      HtmlNode content = postNode.SelectNodes(@".//*[@class=""post-content""]").First();
      List<Anchor> links = new List<Anchor>(postNode.Anchors(context.Response.ResponseUri));
      posts.Add(new Resources.Blog.Post
      {
        Title = title.InnerText,
        Text = content.InnerText,
        Links = links
      });
    }

    // Extract all HTML anchors together with <head> links and store them as ILink instances
    List<ILink> blogLinks = new List<ILink>(doc.Anchors(context.Response.ResponseUri).Cast<ILink>().Union(doc.Links(context.Response.ResponseUri)));

    // Create and return an object that represents the data extracted from the HTML
    Resources.Blog blog = new Resources.Blog()
    {
      Title = doc.SelectNodes(@".//*[@class=""blog-title""]").First().InnerText,
      Posts = posts,
      Links = blogLinks
    };
    return blog;
  }

  // This method is also from TextCodecBase, but is not used (since we do not write HTML).
  // The item type is the codec's own entity type, Resources.Blog.
  protected override void WriteTo(Resources.Blog item, TextWriter writer, WriterContext context)
  {
    throw new NotImplementedException();
  }
}
You can read a bit more about using hyper media links in another of my earlier posts.
Other codec examples
Other examples of codecs could be:
- Decoding cooking recipe data from XML or JSON.
- Decoding binary image data.
- Decoding CSV into tabular data.
- Decoding and writing vCard information.
Built-in generic codecs
All of the previous codecs have been "typed" in the sense that they decode response data into a typed object with the specific properties needed. But Ramone also has built-in support for various generic formats such as XML, JSON and HTML.
Here is an example use of the XML codec which decodes into C#'s XML DOM class XmlDocument:
Request req = Session.Bind("... some url ...");
XmlDocument doc = req.Get<XmlDocument>().Body;
The JSON codec can do some nifty stuff with C# dynamics:
Request req = Session.Bind("... some URL for cat data ...");
dynamic cat = req.Accept("application/json").Get().Body;
Assert.IsNotNull(cat);
Assert.AreEqual("Ramstein", cat.Name);
Advantages of typed codecs versus generic codecs
By working with typed codecs you gain a few advantages over the generic ones:
- The application code is completely decoupled from the wire format. This gives you the ability to work with different wire formats without changing the application code, e.g., decoding JSON, XML, and vCard into the same internal representation.
- It results in more readable application code.
- It makes the parsing code reusable across different pieces of application code.
Update (20/04/2012): Codec Manager
I forgot to show how codecs are registered with the current service (more on services at a later time). Codecs must be registered with Ramone, otherwise they will simply be ignored. To do so you first grab a reference to ICodecManager and then call AddCodec(...):
ICodecManager cm = MyService.CodecManager;
cm.AddCodec<MyClass, MyCodec>(MyMediaType);
Here MyClass is the type of object returned or written by the codec, MyCodec is the type of the codec and MyMediaType is the media type id string, e.g., "application/vnd.mytype+xml".
Ramone can be downloaded from https://github.com/JornWildt/Ramone