Thursday, April 19, 2012

Ramone: Media types and codecs

One of the basic building blocks of REST is the concept of a media type - the file format used to represent a resource on the web. Media types come in many different flavours - images, PDF, vCard, XML, JSON, spreadsheets and so on - each with its own specific format and capabilities. If you haven't already, take a look at my previous post, where I go into more detail about media types.

Media types are considered first-class citizens of Ramone, my C# library for consuming web APIs and RESTful services, just like the uniform interface (GET/POST/PUT/...) and resource identifiers (URLs). In this post I will show how to work with different kinds of media types.

Codecs

A codec is a class that translates between the file format on the wire and some kind of internal representation in C#. To do so, it must implement IMediaTypeWriter, IMediaTypeReader or both, and then be registered with the current codec manager so that Ramone can find it.

The codec interfaces are rather simple:

  public interface IMediaTypeCodec
  {
    object CodecArgument { get; set; }
  }

  public interface IMediaTypeWriter : IMediaTypeCodec
  {
    void WriteTo(WriterContext context);
  }

  public interface IMediaTypeReader : IMediaTypeCodec
  {
    object ReadFrom(ReaderContext context);
  }

The context parameter gives the codec access to the current session, the data stream, the HTTP request and response, and more.
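As a rough sketch of what implementing the reader interface involves, the example below decodes a plain-text body into a string. To keep it self-contained it declares its own stand-ins for IMediaTypeCodec, IMediaTypeReader and ReaderContext; the HttpStream property name and the PlainTextReaderCodec class are assumptions made for illustration, not necessarily what Ramone ships.

```csharp
using System;
using System.IO;
using System.Text;

// Stand-ins for illustration only: the real types live in Ramone
// and ReaderContext carries more than a stream.
public class ReaderContext
{
    public Stream HttpStream { get; set; }   // assumed name for the response data stream
}

public interface IMediaTypeCodec
{
    object CodecArgument { get; set; }
}

public interface IMediaTypeReader : IMediaTypeCodec
{
    object ReadFrom(ReaderContext context);
}

// Hypothetical codec: reads the whole body as UTF-8 text.
public class PlainTextReaderCodec : IMediaTypeReader
{
    public object CodecArgument { get; set; }

    public object ReadFrom(ReaderContext context)
    {
        using (var reader = new StreamReader(context.HttpStream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}

public static class PlainTextDemo
{
    public static void Main()
    {
        // Simulate a response body with an in-memory stream.
        var body = new MemoryStream(Encoding.UTF8.GetBytes("Hello codec"));
        var codec = new PlainTextReaderCodec();
        object result = codec.ReadFrom(new ReaderContext { HttpStream = body });
        Console.WriteLine(result);   // prints "Hello codec"
    }
}
```

A writer codec would mirror this: take the object to send and serialize it onto the stream found in the writer context.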

Decoding an HTML micro format

One example of a codec is the BlogCodec from Ramone's test library. This codec demonstrates how to decode a (non-standard) microformat from an HTML page that shows a blog listing (see https://gist.github.com/2305777 for the actual HTML).

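  // Namespaces needed by this snippet (Ramone's own namespaces are omitted here)
  using System;
  using System.Collections.Generic;
  using System.IO;
  using System.Linq;
  using HtmlAgilityPack;

  public class BlogCodec_Html : TextCodecBase<Resources.Blog>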
  {
    // This method is from TextCodecBase, which has wrapped the binary input stream in a TextReader
    // using the charset encoding stated in the response's Content-Type header.
    protected override Resources.Blog ReadFrom(TextReader reader, ReaderContext context)
    {
      // Using HtmlDocument from HtmlAgilityPack
      HtmlDocument doc = new HtmlDocument();
      doc.Load(reader);
      return ReadFromHtml(doc, context);
    }

    protected Resources.Blog ReadFromHtml(HtmlDocument html, ReaderContext context)
    {
      HtmlNode doc = html.DocumentNode;

      List<Resources.Blog.Post> posts = new List<Resources.Blog.Post>();

      // Scan through HTML and look for "class" attributes identifying values
      foreach (HtmlNode postNode in doc.SelectNodes(@"//div[@class=""post""]"))
      {
        HtmlNode title = postNode.SelectNodes(@".//*[@class=""post-title""]").First();
        HtmlNode content = postNode.SelectNodes(@".//*[@class=""post-content""]").First();
        List<Anchor> links = new List<Anchor>(postNode.Anchors(context.Response.ResponseUri));

        posts.Add(new Resources.Blog.Post
        {
          Title = title.InnerText,
          Text = content.InnerText,
          Links = links
        });
      }

      // Extract all HTML anchors together with <head> links and store them as ILink instances
      List<ILink> blogLinks = new List<ILink>(doc.Anchors(context.Response.ResponseUri).Cast<ILink>().Union(doc.Links(context.Response.ResponseUri)));

      // Create and return an object that represents the data extracted from the HTML
      Resources.Blog blog = new Resources.Blog()
      {
        Title = doc.SelectNodes(@".//*[@class=""blog-title""]").First().InnerText,
        Posts = posts,
        Links = blogLinks
      };

      return blog;
    }

    // This method is also from TextCodecBase, but is not used (since we do not write HTML)
    protected override void WriteTo(Resources.Blog item, System.IO.TextWriter writer, WriterContext context)
    {
      throw new NotImplementedException();
    }
  }

You can read a bit more about using hypermedia links in another of my earlier posts.

Other codec examples

Other examples of codecs could be:
  • Decoding cooking recipe data from XML or JSON.
  • Decoding binary image data.
  • Decoding CSV into tabular data.
  • Decoding and writing vCard information.
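
To make the CSV case a little more concrete, here is a minimal sketch of the parsing step such a codec could delegate to. It uses only the .NET base class library; the CsvSketch class and ReadRows method are hypothetical names, and the parsing is deliberately naive (no quoted fields or embedded commas).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvSketch
{
    // Reads comma-separated lines into rows of cells, skipping empty lines.
    // Simplification: quoted fields and embedded commas are not handled.
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0)
                continue;
            rows.Add(line.Split(','));
        }
        return rows;
    }
}

public static class CsvDemo
{
    public static void Main()
    {
        var input = new StringReader("name,age\nRamstein,7\n");
        List<string[]> rows = CsvSketch.ReadRows(input);
        Console.WriteLine(rows.Count);   // prints 2
        Console.WriteLine(rows[1][0]);   // prints "Ramstein"
    }
}
```

Since TextCodecBase hands the codec a TextReader, a method like this could be called directly from a ReadFrom override.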

Built-in generic codecs

All of the previous codecs have been "typed" in the sense that they decode response data into a typed object with the specific properties needed. But Ramone also has built-in support for various generic formats such as XML, JSON and HTML.

Here is an example use of the XML codec which decodes into C#'s XML DOM class XmlDocument:

  Request req = Session.Bind("... some url ...");

  XmlDocument doc = req.Get<XmlDocument>().Body;

The JSON codec can do some nifty stuff with C# dynamics:

  Request req = Session.Bind("... some URL for cat data ...");

  dynamic cat = req.Accept("application/json").Get().Body;

  Assert.IsNotNull(cat);
  Assert.AreEqual("Ramstein", cat.Name);


Advantages of typed codecs versus generic codecs

By working with typed codecs you gain a few advantages over the generic ones:
  1. The application code is completely decoupled from the wire format. This gives you the ability to work with different wire formats without changing the application code, e.g., decoding JSON, XML and vCard into the same internal representation.
  2. It results in more readable application code.
  3. It makes the parsing code reusable across different pieces of application code.
The downside of typed codecs is that you have to write a few more pieces of boilerplate code to create and register them. You also have to implement the client-side representation of the resource as a specific class. But in the end it all pays off, in my opinion, and yields more readable and maintainable code.
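
The first advantage can be illustrated without Ramone at all. In the sketch below, two decoders for two different wire formats produce the same Cat class, so the application code never needs to know which format was on the wire. Cat, CatDecoders and the "name=..." text format are all made up for this example; with Ramone, each decoder would be a codec registered for its own media type.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

// Hypothetical client-side representation shared by all wire formats.
public class Cat
{
    public string Name { get; set; }
}

public static class CatDecoders
{
    // Decodes <cat><name>...</name></cat>
    public static Cat FromXml(TextReader reader)
    {
        XDocument doc = XDocument.Load(reader);
        return new Cat { Name = (string)doc.Root.Element("name") };
    }

    // Decodes a single "name=..." line (a made-up text format)
    public static Cat FromKeyValue(TextReader reader)
    {
        string line = reader.ReadLine() ?? "";
        string[] parts = line.Split(new[] { '=' }, 2);
        return new Cat { Name = parts.Length == 2 ? parts[1] : null };
    }
}

public static class DecouplingDemo
{
    // Application code only sees Cat, never the wire format.
    static string Greet(Cat cat)
    {
        return "Hello, " + cat.Name;
    }

    public static void Main()
    {
        Cat fromXml = CatDecoders.FromXml(new StringReader("<cat><name>Ramstein</name></cat>"));
        Cat fromText = CatDecoders.FromKeyValue(new StringReader("name=Ramstein"));
        Console.WriteLine(Greet(fromXml));    // prints "Hello, Ramstein"
        Console.WriteLine(Greet(fromText));   // prints "Hello, Ramstein"
    }
}
```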

Update (20/04/2012): Codec Manager

I forgot to show how codecs are registered with the current service (more on services at a later time). Codecs must be registered with Ramone, otherwise they will simply be ignored. To do so, you first grab a reference to ICodecManager and then call AddCodec(...):

  ICodecManager cm = MyService.CodecManager;
  cm.AddCodec<MyClass, MyCodec>(MyMediaType);

Here MyClass is the type of object returned or written by the codec, MyCodec is the type of the codec and MyMediaType is the media type id string, e.g., "application/vnd.mytype+xml".


Ramone can be downloaded from https://github.com/JornWildt/Ramone

