Proposal: JSON for scripting-ready page metadata

I'm proposing a new standard: JSON used to enrich regular web pages with scripting-ready metadata. No more theme-dependent HTML parsing -- just grab the needed data out of a namespaced global object. Read on for details.

Background

Tools like coComment (which lets users track their online conversations) need to parse data from the web page a user is visiting in order to enrich or modify the page, or to report data back to a central site. This parsing assumes that the page uses a popular template or theme, and looks for known element classes and ids. Parsing doesn't always work, usually because the author has customized their template beyond recognition. CoComment suggests that site authors integrate their blog with the service by adding certain global JavaScript variables, making parsing unnecessary. (An alternative approach would be to add classes to the targeted elements.) As the number of coComment-like add-on services increases, the number of redundant metadata blocks increases too -- each service likely wants the site's name, for example.

Proposed standard

The site author optionally creates a singleton JavaScript object called "MetaData" that at minimum contains a defined field "version". The different types of data are scoped into the fields "page", "site", and "proprietary", among others. Here's an example:

var MetaData =
{
	"version":"0",
	"page":
	{
		"title":"Proposal: JSON for scripting-ready page metadata",
		"url":"http://www.brainonfire.net/2006/08/04/json-script-ready-metadata/",
		"author":
		{
			"name":"Tim McCormack",
			"url":"http://www.brainonfire.net/about/tim-mccormack/"
		},
		"creationDate":"2006-08-04 15:10:39",
		"lastEditDate":"2006-08-06 07:23:06"
	},
	"site":
	{
		"title":"Brain on Fire",
		"url":"http://www.brainonfire.net/"
	},
	"proprietary":
	{
		"cocomment":
		{
			"blogTool":"WordPress",
			"commentTextFieldName":"messageTextArea",
			"commentButtonName":"SubmitButton",
			"commentAuthorLoggedIn":true,
			"commentFormName":"commentForm"
		}
	}
};
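
A bookmarklet or add-on service could then read the object along these lines. This is only a rough sketch under my own assumptions: readMetaData and getField are hypothetical helper names, not part of the proposal, and fakeWindow is a stand-in for the browser's window so the example runs outside a page.

```javascript
// Hypothetical helper: returns the page's MetaData object, or null when the
// page does not provide one with the required "version" field.
function readMetaData(scope) {
	var meta = scope && scope.MetaData;
	if (!meta || typeof meta !== "object" || typeof meta.version === "undefined") {
		return null;
	}
	return meta;
}

// Hypothetical helper: looks up a dotted path such as "page.author.name",
// returning undefined instead of throwing when any step is missing.
function getField(meta, path) {
	var parts = path.split(".");
	var node = meta;
	for (var i = 0; i < parts.length; i++) {
		if (node === null || typeof node !== "object" ||
				!Object.prototype.hasOwnProperty.call(node, parts[i])) {
			return undefined;
		}
		node = node[parts[i]];
	}
	return node;
}

// Stand-in for the browser's global object (assumption for this sketch).
var fakeWindow = {
	MetaData: {
		"version": "0",
		"page": { "title": "Example post", "author": { "name": "Tim McCormack" } },
		"site": { "title": "Brain on Fire" }
	}
};

var meta = readMetaData(fakeWindow);
var siteTitle = meta ? getField(meta, "site.title") : undefined;
// This page declares no "proprietary" block, so the lookup yields undefined.
var blogTool = meta ? getField(meta, "proprietary.cocomment.blogTool") : undefined;
```

In a real page the call would be readMetaData(window). A service that finds no object, or no field it needs, can fall back to its existing theme-based parsing.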

Adoption

I'm going to approach coComment, del.icio.us, and Digg about this proposed standard and see if they'll adopt it.

Remaining questions

  • How will versioning work?
  • Could this be placed in a JSON header?
  • What about encoding it as XML instead and serving it using <link rel="alternate metadata" />?
  • Would moving the data to an XML request or JSON header raise the implementation threshold too high?

Responses: 6 so far

  1. Stephanie Booth says:

    Any possible synergies with hAtom?

    http://microformats.org/wiki/hatom

  2. Tim McCormack says:

    Well, I think I'd like to keep this inside the page as a JavaScript object literal -- that's a more friendly format for bookmarklets and other enhancements. I certainly have no aversion to syncing the data schema to an existing standard, though.

  3. l.m.orchard says:

    Personally, I'd rather see this stuff in HTML Meta tags in the header. That is what they're for, basically. If various in-page things need page-level metadata in a JS context, meta tags are fairly easy to access from the DOM. On the other hand, in-page JS would be a pain in the butt to reliably get at from a service like del.icio.us.

  4. Tim McCormack says:

    That's a good point, but there still needs to be some sort of organization for the data. Perhaps name="page.title", name="page.author", etc.?

  5. l.m.orchard says:

    Yeah, check out how the Dublin Core guys proposed to embed their metadata elements in HTML Head tags:

    http://dublincore.org/documents/dcq-html/

    In fact, I wouldn't be surprised if the majority of what you want to see expressed as page metadata can be done with the Dublin Core metadata elements.

  6. Tim McCormack says:

    @l.m.orchard: Not a bad idea. You've inspired me to write a WordPress plugin to implement part of the Dublin Core.

Commenting is not yet reimplemented after the WordPress migration, sorry! For now, you can email me and I can manually add comments.