71°F

Aaron Parecki

  • Articles
  • Notes
  • Photos
  • Providing APIs for content-driven websites

    October 7, 2012

    This morning, @tommorris struke a nerve with the Hackernews community with his post Dear "API providers," I don't want a relationship.

    Paraphrasing a bit, his main point was:

    I just want the data on your website in a machine readable format. XML, JSON, RDF, CSV, YAML.

    I just want to take the URL of the thing I’m looking at, send you a content negotiation header or tack a little .xml or .json or whatever on the end and get the damn data.

    This post was #1 on Hackernews for a while, garnering a wide variety of comments, including Tom being called arrogant, arguing about the purpose of URLs, and a tangential discussion of post URL formats.

    I think most people missed the point of the article, and took it too literally. Perhaps @tommorris should have clarified what he meant (assuming I am right in interpreting this rant). I don't think he was referring to APIs like Twitter or Github, where you obviously need to register API keys since you can use the APIs to manipulate user data.

    My assumption was that he was talking specifically about APIs for content-driven websites. What I mean is that websites like Nasa's raw image feed from the Curiosity Rover or Portland's Rose Quarter event calendar are data-rich websites, and since the content is already available as HTML, there should be alternative machine-readable formats easily available without needing to first register for an API key.

    One way to do this, as Tom suggested, would be to have alternate versions of the content available by appending ".json" or ".xml" to the URL, or to accept alternate "Accept" HTTP headers specifying the content type.

    Enter Microformats

    A potentially easier solution for the content providers is to mark up the HTML content with Microformats to give it a machine-readable structure. There are several Microformats parsers available in a variety of languages. Most will accept the HTML of a page as input and output a structured version of the page. A goal of Microformats 2 is to be able to convert any HTML page into a JSON representation by parsing the Microformats on the page.

    For example,

    <a class="h-card" href="http://benward.me">Ben Ward</a>
    

    when parsed, is converted to the following JSON representation:

    {
      "items": [{ 
        "type": ["h-card"],
        "properties": {
          "name": ["Ben Ward"],
          "url": ["http://benward.me"]
        }
      }]
    }
    

    If all content-driven websites marked up content with the appropriate Microformats, you wouldn't even need a separate API to access a machine-readable version of the content.

    So if you're creating or maintaining a website with content that people will potentially want to access programmatically, consider marking up the content with Microformats. There are currently several types of content described, including:

    • Addresses
    • Contact Cards
    • Post Entries
    • Events
    • Geo
    • Resumes
    Sun, Oct 7, 2012 11:27am -07:00 #microformats
    1 mention

    Other Mentions

    • Written By: Matt Hubbard @matthewrhubbard
      What We Read This Weekend – October 8 Written By: Matt Hubbard ...
      Tue, Oct 21, 2014 8:39am -07:00
Posted in /articles

Hi, I'm Aaron Parecki, Director of Identity Standards at Okta, and co-founder of IndieWebCamp. I maintain oauth.net, write and consult about OAuth, and participate in the OAuth Working Group at the IETF. I also help people learn about video production and livestreaming. (detailed bio)

I've been tracking my location since 2008 and I wrote 100 songs in 100 days. I've spoken at conferences around the world about owning your data, OAuth, quantified self, and explained why R is a vowel. Read more.

  • Director of Identity Standards at Okta
  • IndieWebCamp Founder
  • OAuth WG Editor
  • OpenID Board Member

  • 🎥 YouTube Tutorials and Reviews
  • 🏠 We're building a triplex!
  • ⭐️ Life Stack
  • ⚙️ Home Automation
  • All
  • Articles
  • Bookmarks
  • Notes
  • Photos
  • Replies
  • Reviews
  • Trips
  • Videos
  • Contact
© 1999-2025 by Aaron Parecki. Powered by p3k. This site supports Webmention.
Except where otherwise noted, text content on this site is licensed under a Creative Commons Attribution 3.0 License.
IndieWebCamp Microformats Webmention W3C HTML5 Creative Commons
WeChat ID
aaronpk_tv