Wednesday, January 02, 2013

HTTP PUT, PATCH or POST - Partial updates or full replacement?

I have recently been working on the write side of a REST service for managing case files. During this work I have gone through a lot of discussions about partial updates versus full updates of resources in a RESTful service - and whether to perform these updates using the HTTP verbs PUT, PATCH or POST.

In the end we settled on using PATCH for partial updates (with the media type application/json-patch) and PUT for complete updates. That is not so surprising in itself; it is how we got to these decisions that I would like to share.

Example resource model

For demonstration purposes I will use the representation of a bug report from some fictitious bug reporting system. A GET of such a bug report would return the following XML, which should be rather self-explanatory:

<BugReport>
  <Id>15</Id>
  <Title>Program crashes when hitting ctrl-P</Title>
  <Status>Open</Status>
  <Responsible id="22">
    <Name>John Smith</Name>
    <Link rel="self" href="{url-to-person}" title="John Smith"/>
  </Responsible>
  <Link rel="self" href="{url-to-bug-report}" title="This bug report"/>
</BugReport>


Complete updates with PUT or POST

At first we wanted to use PUT with URL encoded key-value pairs for complete updates. We wanted to make it clear that such updates were idempotent and thus PUT seemed to be a perfect match - it signals full idempotent replacement of the target resource (see for instance http://www.emergentone.com/blog/http-methods-and-idempotence/ for an explanation of "idempotent").
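
As a small illustration of what "idempotent" means here, the following Python sketch compares a PUT-style replacement with a POST-style append. The in-memory dictionaries and the function names put_replace and post_append are made up for this example; they are not part of our service.

```python
def put_replace(store, key, representation):
    """PUT-style update: replace whatever is stored under key."""
    store[key] = dict(representation)
    return store


def post_append(store, key, representation):
    """POST-style update to a collection: every call adds a new entry."""
    store.setdefault(key, []).append(dict(representation))
    return store


bug = {"Title": "New title", "Status": "Open"}

# Sending the same PUT twice leaves the store in the same state as sending it once.
puts = {}
put_replace(puts, "bug-15", bug)
state_after_one_put = dict(puts)
put_replace(puts, "bug-15", bug)
put_is_idempotent = puts == state_after_one_put

# Sending the same POST twice to a collection creates two entries.
posts = {}
post_append(posts, "comments", {"Text": "Me too"})
post_append(posts, "comments", {"Text": "Me too"})
post_created = len(posts["comments"])
```

Repeating the PUT changes nothing, whereas repeating the POST does, which is why a client can safely retry a PUT after a timeout.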

There is, however, no reason to PUT all the server-generated stuff like the links and bug report ID when performing an update. But that meant we were not doing a complete update any more - and so we entered the foggy zone of unclear semantics for PUT: is it, or is it not, allowed by the HTTP specification to do such a full-but-somehow-also-partial update using PUT? Some people say yes, other people say no ...

Due to this uncertainty we gave up using PUT and turned to POST instead. This would not signal idempotency as we wanted, but it would certainly be a legal and standard use of POST.

Partial updates with POST

In the end it turned out that we could have saved ourselves the trouble of discussing these "quite-but-not-entirely-unlike-partial" updates using PUT, because it soon became apparent that "real" partial updates were a much better fit for our use case. The business case for implementing updates in the system was an integration service that transferred changes from one external system into our system - with heavy focus on only transferring changes from A to B.

We chose the intuitive solution of interpreting missing values (or, rather, null values) as "do not change this property". That would allow the client to POST an update to the Title property only as:

  POST /bug-report-url
  Content-Type: application/x-www-form-urlencoded

  Title=A%20new%20title


That worked well for the Title property, but how about the Responsible property? It had to be possible to either 1) ignore the Responsible, 2) set it to something, or 3) clear it ... But if a null value meant "ignore" what should then be used for "clear"? We decided to add a new (boolean) field "NoResponsible", so we could clear the responsible with a POST like this:

  POST /bug-report-url
  Content-Type: application/x-www-form-urlencoded

  NoResponsible=true


Somehow that seemed like quite a hack - what other sorts of artificial key-value fields would we need to implement the full solution?
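
To make the "missing means do not change" rule concrete, here is a minimal Python sketch of how a server could interpret such a form-encoded POST. It is not our actual implementation: the function name apply_form_update, the flat dictionary model and a "Responsible" form field carrying a person id are all assumptions made for the example.

```python
from urllib.parse import parse_qs


def apply_form_update(bug, body):
    """Return a copy of bug with the form-encoded fields in body applied.

    Fields that are absent from body are left unchanged.
    """
    # parse_qs returns a list per key and drops blank values by default.
    fields = {name: values[0] for name, values in parse_qs(body).items()}
    updated = dict(bug)
    if "Title" in fields:
        updated["Title"] = fields["Title"]
    if "Status" in fields:
        updated["Status"] = fields["Status"]
    # The artificial flag is the only way to say "clear this property".
    if fields.get("NoResponsible") == "true":
        updated["Responsible"] = None
    elif "Responsible" in fields:
        updated["Responsible"] = int(fields["Responsible"])  # hypothetical field
    return updated


bug = {
    "Id": 15,
    "Title": "Program crashes when hitting ctrl-P",
    "Status": "Open",
    "Responsible": 22,
}
retitled = apply_form_update(bug, "Title=A%20new%20title")
cleared = apply_form_update(bug, "NoResponsible=true")
```

Every nullable property would need its own special case like the NoResponsible branch, which is where this approach starts to fall apart.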

The move to PATCH

This was surely not going in the right direction, so we started researching partial updates again and looked at the HTTP verb PATCH with the media type application/json-patch (see http://tools.ietf.org/html/draft-ietf-appsawg-json-patch-08). This turned out to be a perfect match for our use case where the integration component could build up a patch document, based on the changes that need to be transferred, and then apply that to the bug report resource.

A json-patch document is basically a list of operations to be applied to the target resource. Here is an example of how such a patch document can be used to update the title of our fictitious bug report and remove the responsible person at the same time:

  PATCH /bug-report-url
  Content-Type: application/json-patch

  [
    { "op": "replace", "path": "/Title", "value": "New title" },
    { "op": "remove", "path": "/Responsible" }
  ]
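
The patch document above can be applied with very little code. The Python sketch below is a deliberately reduced illustration, not a complete json-patch implementation: the function apply_patch is hypothetical, it only handles "add", "replace" and "remove" on top-level members, and it ignores JSON Pointer escaping and the remaining operations. The bug report is modelled as a plain dictionary for the example.

```python
import copy
import json


def apply_patch(document, patch):
    """Apply a list of patch operations to a copy of document.

    Only "add", "replace" and "remove" on top-level members are handled.
    """
    # Work on a copy so that a failing operation leaves the original untouched.
    result = copy.deepcopy(document)
    for operation in patch:
        op, path = operation["op"], operation["path"]
        if not path.startswith("/") or "/" in path[1:]:
            raise ValueError("only top-level paths are supported: " + path)
        key = path[1:]
        if op == "add":
            result[key] = operation["value"]
        elif op == "replace":
            if key not in result:
                raise KeyError(key)  # "replace" requires an existing target
            result[key] = operation["value"]
        elif op == "remove":
            del result[key]  # raises KeyError if the member does not exist
        else:
            raise ValueError("unsupported operation: " + op)
    return result


bug = {
    "Id": 15,
    "Title": "Program crashes when hitting ctrl-P",
    "Status": "Open",
    "Responsible": {"id": 22, "Name": "John Smith"},
}
patch = json.loads("""[
  { "op": "replace", "path": "/Title", "value": "New title" },
  { "op": "remove", "path": "/Responsible" }
]""")
patched = apply_patch(bug, patch)
```

Because all operations are applied to a copy, either the whole patch succeeds or nothing changes, so several properties can be updated atomically in one request. Note that the draft was later published as RFC 6902, which registers the media type as application/json-patch+json.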


Back to complete updates using PUT

PATCH turned out to be perfect for our integration project. So far so good. But another relevant use case could soon be to support some kind of human facing application that works like this:

1) User opens an existing bug report.

2) User modifies the bug report.

3) User saves the new version of the bug report.

In this case tracking changes for a patch document may not be desirable, so instead we should use the well known pattern of GET resource - modify local representation - PUT result back, with the relevant use of ETag headers etc. to avoid lost updates (see for instance http://www.w3.org/1999/04/Editing/).

My point here is that the fact that we discarded PUT for partial updates earlier on does not mean that it cannot be done - but you need to do it right with GET+PUT and ETag headers - and that is quite a bit more complex than using PATCH as described above.
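
The GET - modify - PUT cycle with ETags can be sketched as follows. This is an in-memory stand-in written in Python, not a real HTTP server: the class BugStore, the URL /bugs/15 and the way the ETag is derived (a hash of the JSON representation) are assumptions made for the example. The if_match argument plays the role of the If-Match request header.

```python
import hashlib
import json


class BugStore:
    """In-memory stand-in for the server side of GET and conditional PUT."""

    def __init__(self):
        self._bugs = {}

    @staticmethod
    def _etag(representation):
        # Derive the ETag from the content; any change yields a new tag.
        canonical = json.dumps(representation, sort_keys=True).encode("utf-8")
        return '"' + hashlib.sha256(canonical).hexdigest()[:16] + '"'

    def create(self, url, representation):
        self._bugs[url] = dict(representation)

    def get(self, url):
        bug = self._bugs[url]
        return 200, self._etag(bug), dict(bug)

    def put(self, url, if_match, representation):
        current = self._bugs[url]
        if if_match != self._etag(current):
            return 412  # Precondition Failed: someone else changed the resource
        self._bugs[url] = dict(representation)
        return 204


store = BugStore()
store.create("/bugs/15", {"Title": "Program crashes when hitting ctrl-P", "Status": "Open"})

# Two users open the same bug report.
_, etag_a, copy_a = store.get("/bugs/15")
_, etag_b, copy_b = store.get("/bugs/15")

# The first user saves; the ETag still matches.
copy_a["Status"] = "Closed"
first = store.put("/bugs/15", etag_a, copy_a)

# The second user saves a stale copy; the ETag no longer matches.
copy_b["Title"] = "Crash on ctrl-P"
second = store.put("/bugs/15", etag_b, copy_b)
```

Here the first PUT is accepted and the second is rejected with 412, so the second user has to GET the resource again and redo the change instead of silently overwriting the first user's update.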

json-patch with Ramone

Since we use Ramone internally for our client applications, it was natural to add support for json-patch here - which, not surprisingly, is the topic for my next blog post.

Other approaches (which we found less useful)


Doing partial updates using PUT with sub-resources

Another common way of doing idempotent partial updates using PUT is to represent certain properties as sub-resources that contain a subset of the main resource. We could for instance move the Responsible property to a resource of its own with only that value and then link to it in the main resource.

Such a solution would work, but 1) it would require lots of boilerplate code to implement, and, 2) more importantly, it would not support atomic and transactional updates to multiple properties in one request.

Encoding modified property names in the URL

Some people have suggested putting the names of the properties to modify in the URL of the request, thus resulting in a new URL which can be updated using a complete PUT. For instance, to update only the Title in our example we could make a PUT like this:

  PUT /bug-report-url;Title
  Content-Type: application/x-www-form-urlencoded

  Title=New%20title

But this requires the client to build URLs with the assumption of a given URL structure - something to be avoided in hypermedia-based APIs (since this would couple the client to the server's URL structure).

Furthermore, I cannot foresee whether it will work for more complex structures, so why not use a standardized approach like json-patch?

Using the XML patch framework

We might as well have used the XML patch framework (see http://tools.ietf.org/html/rfc5261) instead of json-patch for the patch document. That would certainly be more in line with the XML representation we use already. The main reason for not doing this is very simple and pragmatic - we discovered the XML patch framework too late.

If I had to choose between the two patch formats today, however, I would probably choose json-patch again for the simple reason that it matches our data better - XML is only a thin wrapper around the public data model and JSON might in fact have been a better choice. The XML patch framework is quite over-engineered for the kind of data we serve.

UPDATE (January 3, 2013): Is this RPC?

Someone on Stack Overflow commented that this "sending commands" (via patch) is more RPC style than REST. Let me try to cover that issue ...

If you define RPC as sending commands to a server then any and all HTTP operations are RPC calls by definition - whether you GET a resource, PUT a new representation or DELETE it again - each of them consists of sending a command (verb) GET/PUT/DELETE etc. and an optional payload. It just happens that the HTTP working group (or whoever it is) has introduced a new verb PATCH which allows clients to do partial updates to a resource.

If anything other than sending the complete representation to the server is considered RPC style, then, by definition, partial updates cannot be RESTful. One can choose to have this point of view, but the people behind the web infrastructure say differently - and have thus defined a new verb for this purpose.

RPC is more about tunneling method calls through HTTP in a way that is invisible to intermediaries on the web - for instance using SOAP to wrap method names and parameters. These operations are "invisible" since there are no standards defining the methods and parameters inside the payload.

Compare this to PATCH with the media type application/json-patch - the intention of the operation is clearly visible to any intermediary on the web since the verb PATCH has a well-defined meaning and the payload is encoded in another well-defined, publicly available format owned by a common authority on the web (IETF). The net result is full visibility for everybody and no application-specific secret semantics.

REST is also about "serendipitous reuse" which is exactly what PATCH with application/json-patch is - reusing an existing standard instead of inventing application specific protocols that do more or less the same.
