Data’s from Mars; Content’s from Venus

Facts are simple and facts are straight
Facts are lazy and facts are late
Facts all come with points of view
Facts don’t do what I want them to
- Crosseyed and Painless, Talking Heads

What’s the difference between data and content?

If content, in the digital age, includes all the text and associated assets plus metadata, does that mean that all content is also data?

Is all data also content, or are there forms of structured data that we would not include in a content strategy?

I can think of a few distinctions people traditionally have made between content and data:

  • Content is for humans to consume (across devices); Data is for machines to consume
  • Content is loosely structured and creative; Data is very tightly structured and modeled
  • Content belongs to the CMO; Data belongs to the CIO

Applications have highly structured data systems – specified down to the field level with high degrees of precision. (Such data is often exchanged or exchangeable via APIs or structured feeds of XML/JSON/etc. – it has to be readable by machines not humans).
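To make "specified down to the field level" concrete, here is a minimal sketch in Python. The record type, its field names, and the `to_feed` helper are all hypothetical, invented for illustration; they do not come from any particular application or CMS.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ProductRecord:
    # Hypothetical fields: each one has a fixed name and type.
    sku: str
    price_cents: int
    in_stock: bool


def to_feed(record: ProductRecord) -> str:
    """Serialize the record to JSON so another program can read it."""
    return json.dumps(asdict(record), sort_keys=True)


record = ProductRecord(sku="A-100", price_cents=1999, in_stock=True)
feed = to_feed(record)
```

Nothing in `feed` is written for a human reader; its value lies in another program being able to rely on exactly those fields being present.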

Content, on the other hand, needs to resonate with end users. Content resists structure. Like people, it’s highly variable and resists being forced into templates. (Note how easy it is to personify / anthropomorphize content, which you can’t easily do with data).

Arguably, content management systems are essentially interfaces wrapped around databases, to make it easier for humans to manage data.

Does this ultimately mean that data is from Mars and content is from Venus?

I want to tread carefully here. I was a graduate student in literary theory in the 90s as John Gray’s work was becoming popular. I found it horribly essentializing - glossing over the complex social construction of gender with simple platitudes and metaphors.

But there is a real historical gendered context here – data has been the realm of IT, content has been the realm of marketing. It should not be surprising then that we see far more women as visible leaders of the content strategy community – and that the developer community behind CMSs has been more broadly male. (Of course there are great exceptions in both cases).

What difference does it make?

Perhaps the most important challenge in 2013 is the goal of structured content, future-friendly content, adaptive content, and content everywhere.

Does this mean ultimately learning to treat content more like data?

In Karen McGrane’s keynote at DrupalCon, she argued that our job is to create new tools and interfaces that reflect new mental models. We need to make CMS platforms that create a different tradeoff. Rather than providing perceived short-term ease-of-use via things like WYSIWYG editors and editing-in-place, which reinforce the assumptions of content creators carried over from earlier platforms, we need to create interfaces which highlight the structured, data-like nature of content.

The web is not print; digital creates the possibility of content separate from its manifestation in a specific format. Content can finally become data in the fullest sense.
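As a rough sketch of what "content separate from its manifestation" can look like in practice, assume a hypothetical `Article` structure with named fields, and two equally hypothetical renderers that turn the same content into different formats:

```python
from dataclasses import dataclass
from html import escape
from typing import List


@dataclass
class Article:
    # Hypothetical structure: the content is stored as fields, not as a page.
    title: str
    summary: str
    body_paragraphs: List[str]


def render_html(article: Article) -> str:
    """One manifestation: a full HTML article."""
    paragraphs = "".join(f"<p>{escape(p)}</p>" for p in article.body_paragraphs)
    return f"<article><h1>{escape(article.title)}</h1>{paragraphs}</article>"


def render_teaser(article: Article) -> str:
    """Another manifestation: a one-line plain-text teaser."""
    return f"{article.title}: {article.summary}"


post = Article(
    title="Data & Content",
    summary="Two views of one thing.",
    body_paragraphs=["First paragraph."],
)
html_version = render_html(post)
teaser_version = render_teaser(post)
```

The author writes the article once; which fields appear, and in what markup, is decided later by whichever renderer runs. A single WYSIWYG blob could not be split this way, because the title, summary, and body would not exist as separate things.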

The real challenge, I believe, is how do we avoid losing what made it content in the first place – the human, narrative, contextualized, lumpy, unstructured part of what makes content not data.

Or is the very existence of something about content that isn’t data itself a residual, essentialist concept we need to abandon?

About the Author

Formerly the Managing Director of the Connective DX Boston office, John has a passion for technology and the role of CMS that is clear in his point of view.

More articles from John Eckman


5 responses

  1. John Eckman says:

    Another way of thinking about it perhaps – more inside baseball. As the purchasing decision for CMS platforms has been shifting (in recent years) from the CIO to the CMO, is that what is leading platforms to go after features like “in-place editing” and more blob-like interfaces, which trade structure for apparent ease-of-use for authors, instead of more structured, data-like approaches that present more complexity upfront but provide more flexibility down the road?

    Still looking for that CIO/CMO hybrid who gets both – who knows how to understand and appreciate the narrative, human, emotional, impactful side of content but also gets how to structure it like data.

  2. John, you open by asking if “all content is also data? Is all data also content” and then resolve the hypothesis:

    Content can finally become data in the fullest sense.
    The real challenge, I believe, is how do we avoid losing what made it content in the first place – the human, narrative, contextualized, lumpy, unstructured part of what makes content not data.

    I think that’s the crux of this discussion. To quote Rahel Anne Bailie, “content is contextualized data,” but we shouldn’t confuse design and context. Content retains internal context even when it’s divorced from presentation in a specific device, the principle guiding adaptive content. It’s that context that makes data useful. Though we might want to impose more granular structure on our content, that’s never to debase it to the level of mere data.

    Consider the “data, information, knowledge, wisdom” continuum popularized by Russell Ackoff in the ’80s. It’s the brainchild of T. S. Eliot, who had explored it 50 years earlier in The Rock.

    In the hierarchy, it’s context that separates data from information. 65? 65 is nothing. Retirement? Temperature? A failing grade? 65 degrees in the lecture hall… that’s context, and that’s something.

    Relevance–what’s it mean to me?–separates information from knowledge. 65 degrees in the lecture hall, and I know that feels chilly to me? I’ll have to remember a sweater.

    Application separates knowledge from wisdom. Application is behavioral change, committing to action, making a purchase. In this case, maybe it’s saying “ooh, last year at the conference I was always cold. I’m going to remember to pack a couple cardigans this year.”

    I’d argue the strength of that content lies first with context, that mere data just doesn’t have. Not the context of the device on which it appears or the layout, but the internal context in copy or a graphic that gives it meaning–meaning that is human, useful, and usable.

    • John Eckman says:

      So where does “content” sit on a continuum from data->information->knowledge->wisdom?

      Is content just data until context makes it information, relevance makes it knowledge, and application makes it wisdom?

      I think the historical cognitive gap here is that we think of content as something inherently or essentially different than the other kinds of data we use computers (databases) to manage – that’s why CMS platforms aren’t purely database CRUD engines.

      But maybe the difference between data and content isn’t inherent to the stuff itself but external – use/context/lens is what separates content from mere data?

      (Like the Velveteen Rabbit, content becomes data because someone loves it?)

  3. That makes sense to me–except that data becomes content because someone loves it… enough to give it a name (tags), context, and application. Then and only then can it run free, with all the other structured content! Oh man… that’s always the part of the story that makes me tear up. Our lesson: love your data enough to set it free!

  4. Jake DiMare says:

    Just don’t tell Dave. Content is a dirty word to the CMO…They own messaging, not content. #wholeanotherlevel
