From Blobs to Chunks: Structured Content in WordPress

Photo by Michael Holley Swtpc6800 (Own work) [Public domain], via Wikimedia Commons

As the folks at CERN recently celebrated, it was twenty years ago that the core technologies and standards of the World Wide Web (including code for a web server and a line-mode client) were officially placed in the public domain. Tim Berners-Lee’s invention, designed to enable researchers to share research documents across multiple computing platforms and formats, would quickly outgrow these academic beginnings to become a global force for business and social interaction.

It helps to remember this history, though, as we still struggle with one of the fundamental assumptions of early HTML (and its predecessor SGML):

Content has its own internal structure separate from the specific presentations which might be made of it.

This core notion of separating content from presentation has been a challenge ever since. We just can’t seem to come to grips with the notion that the web is different from print, and that rather than trying to control the output across device types, contexts, and users, we ought to aim for flexibility. (In the 10 years between John Allsopp’s A Dao of Web Design and Ethan Marcotte’s Responsive Web Design, the majority of the industry – with some notable exceptions – largely fell back into a pattern of fixed page designs for the desktop browser.)

Enter Content Strategy

While approaches like progressive enhancement, adaptive web design, and responsive web design have improved the situation significantly by helping flexible presentations render reasonably across devices, form factors, and contexts, they only address content presentation.

Content strategists, most notably Karen McGrane (Content Strategy for Mobile) and Sara Wachter-Boettcher (Content Everywhere), have forced us all to recognize that without structured content – without forcing the content management systems and strategies we’re dependent upon to recognize, capture, and make use of structured content – we can’t ever truly be prepared for a world of adaptive content. The apparent flexibility of the WYSIWYG blob in fact prevents us from realizing the best presentation for each context; if we want true flexibility, we need structured content.

Content Blobs

WordPress has often been the poster child for – or represented the anti-pattern of – unstructured content. McGrane writes:

If your organization is using a blogging platform like WordPress as its CMS, you know what this looks like. Content creators get one big field for the body of their content, and it’s like their own personal playground.

WordPress’ focus on a WYSIWYG authoring experience and the legacy of posts/pages has been loved by authors and content creators, because it allows a great degree of flexibility, or what authors see as flexibility. In fact, one of Matt Mullenweg’s favorite features in WordPress is “distraction-free writing mode,” in which everything but that “one big field” disappears into the background.

That apparent flexibility, however, comes at a longer-term cost because the content it captures is not structured. What WordPress stores in the database is a messy melange of plain text, HTML markup (headings, sub-headings, paragraphs), references to images and other assets like files, and “shortcodes”: specific snippets of text designed to be understood by plugins and transformed on display into consumable HTML.
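To make the shortcode idea concrete, here is a minimal, language-neutral sketch in Python of how a snippet of text inside the blob might be swapped for HTML at display time. It is not WordPress's actual parser: the simplified pattern, the `render_gallery` handler, and the placeholder markup it emits are all assumptions made for illustration.

```python
import re
from html import escape

# Simplified pattern for [name key="value" ...]. WordPress's real shortcode
# grammar is richer (closing tags, unquoted attributes); this is an assumption.
SHORTCODE = re.compile(r'\[(\w+)((?:\s+\w+="[^"]*")*)\s*\]')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def render_gallery(attrs):
    """Hypothetical handler: turn a gallery shortcode into placeholder HTML."""
    ids = escape(attrs.get("ids", ""), quote=True)
    return '<div class="gallery" data-ids="%s"></div>' % ids


def expand_shortcodes(blob, handlers):
    """Replace each known shortcode in the blob with its handler's HTML."""
    def replace(match):
        name, raw_attrs = match.group(1), match.group(2)
        handler = handlers.get(name)
        if handler is None:
            return match.group(0)  # unknown shortcode: leave the text untouched
        return handler(dict(ATTRIBUTE.findall(raw_attrs)))
    return SHORTCODE.sub(replace, blob)


# One stored body field: markup, prose, and shortcodes all in a single string.
blob = (
    "<h2>Our Trip</h2>\n"
    "<p>Some <em>plain</em> text.</p>\n"
    '[gallery ids="1,2,3"]\n'
    "[unknown]"
)

html = expand_shortcodes(blob, {"gallery": render_gallery})
print(html)
```

Note that the only way to find the gallery is to pattern-match the body text. Nothing in the stored string says "this entry has a gallery," which is exactly the problem with blobs.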

WordPress plugins, themes, and templates can and will impact the presentation layer, adding styling onto that markup and processing shortcodes, but their ability to have structured, regularized, programmatic access to specific parts of the content (in order to execute rules) is very limited, because of the flexibility allowed to the person inputting the content.

Making WordPress Chunky

It doesn’t, however, have to be this way. If you have a content model, WordPress can be made to respect that model and provide interfaces for content creators which encourage or require structured content and rich metadata.

Some basic structure and metadata are already built in, of course: title, body, excerpt, author, categories, tags, featured image, and (depending on what plugins you’ve installed) SEO-related metadata. Setting a featured image, for example, stores a specific relationship between the post or page and the media asset that can be presented in different ways in different contexts: often, the featured image is included with the excerpt on list-style pages in a smaller size, and becomes a hero image on “single article” type pages.

WordPress has also long supported custom meta data, which requires some content modeling and some development, but enables users to add specific fields to content entries representing specific parts of the structure. For example, it’s a trivial exercise to add a subhead in addition to the headline, or to add a “pull quote” field to be treated visually in different presentation contexts. (Although it’s called custom meta data, and stored in a post-meta table, such information doesn’t have to be “metadata” in the purist sense; it can also be part of the content itself.)
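As a rough illustration of why separate fields matter, the Python sketch below models an entry whose subhead and pull quote live in their own fields rather than inside the body. The `Entry` class, the `meta` dictionary (a stand-in for post-meta key/value pairs), and both render functions are hypothetical. In WordPress itself such values would typically be read with `get_post_meta()` inside a theme template.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Entry:
    title: str
    body_html: str                             # the free-form editor field
    meta: dict = field(default_factory=dict)   # stand-in for post-meta key/value pairs


def render_single(entry):
    """Full article view: subhead under the headline, pull quote as an aside."""
    parts = ["<h1>%s</h1>" % escape(entry.title)]
    if entry.meta.get("subhead"):
        parts.append('<p class="subhead">%s</p>' % escape(entry.meta["subhead"]))
    parts.append(entry.body_html)
    if entry.meta.get("pull_quote"):
        parts.append('<aside class="pull-quote">%s</aside>' % escape(entry.meta["pull_quote"]))
    return "\n".join(parts)


def render_teaser(entry):
    """List view: headline plus subhead only; body and pull quote are skipped."""
    parts = ["<h2>%s</h2>" % escape(entry.title)]
    if entry.meta.get("subhead"):
        parts.append("<p>%s</p>" % escape(entry.meta["subhead"]))
    return "\n".join(parts)


entry = Entry(
    title="From Blobs to Chunks",
    body_html="<p>Body copy goes here.</p>",
    meta={
        "subhead": "Structured content in practice",
        "pull_quote": "Chunks travel further than blobs.",
    },
)

print(render_single(entry))
print(render_teaser(entry))
```

Because each chunk is addressable by name, each context can decide what to show and how. With the pull quote buried in the body markup, the teaser view would have no reliable way to leave it out.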

For $100 off a three-day conference registration at CMS Expo use code CMSX54417

Beyond Posts and Pages

This is where one of the talks I’m giving next week at CMS Expo in Chicago comes in: Beyond Posts & Pages – Structured Content and Content Types in WordPress.

Since WordPress 2.9 (which is to say since late 2009), WordPress can not only be made to think about posts and pages in a chunkier fashion and to gather richer metadata about the content it manages, but can also create and manage other types of content. Developers can create custom content types (which WordPress refers to as custom post types, often abbreviated CPT). Sites managed in WordPress can leverage a fairly rich content model, including multiple custom content types, different custom taxonomies for each of those content types, specific meta data or content chunks for those content types, and even (via plugins) relationships between those content types.
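A content model of this kind can be sketched independently of any platform. The Python below is only a conceptual model, not WordPress code: the `event` type, its fields, and the `topic` taxonomy are hypothetical names chosen for illustration. In WordPress the corresponding declarations would be made with `register_post_type()` and `register_taxonomy()`, with fields stored as post meta.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    name: str
    required: bool = False
    choices: tuple = ()        # non-empty means a controlled vocabulary


@dataclass
class ContentType:
    name: str
    fields: list = field(default_factory=list)
    taxonomies: list = field(default_factory=list)   # declared here, not validated below


def validate(content_type, entry):
    """Return a list of problems found in an entry dict for this content type."""
    problems = []
    for f in content_type.fields:
        value = entry.get(f.name)
        if value in (None, ""):
            if f.required:
                problems.append("missing required field: %s" % f.name)
            continue
        if f.choices and value not in f.choices:
            problems.append("%s must be one of %s" % (f.name, ", ".join(f.choices)))
    return problems


# Hypothetical content type; every name here is illustrative only.
event = ContentType(
    name="event",
    fields=[
        Field("location", required=True),
        Field("audience", choices=("youth", "educators", "general")),
        Field("summary"),  # free-form, optional
    ],
    taxonomies=["topic"],
)

print(validate(event, {"location": "Town Hall", "audience": "youth"}))
print(validate(event, {"audience": "everyone"}))
```

The point of the sketch is that once the model is declared, the system can enforce it, requiring some fields and constraining others to a controlled vocabulary, instead of trusting whatever ends up in one big body field.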

In the talk I will walk through a specific example – a site built during New England GiveCamp a few weekends back by a small team for a non-profit focused on encouraging youth understanding of and participation in civic action. We’ll cover:

  • Registering (which means “creating”) a custom post type via theme or plugin
  • Template hierarchy and styling custom post types
  • Custom taxonomies and permissions
  • Custom fields and meta boxes for adding/editing them – controlled vocabulary or free form
  • Secondary HTML content areas

I will also discuss some of the limitations and challenges of building complex content types in WordPress: search, alternative outputs, and relationships. It’s definitely a WordPress-specific talk, but many of the core concepts we’ll be discussing will be helpful for those modeling content in other platforms as well.

About the Author

Formerly the Managing Director of the Boston Connective DX office, John has a clear passion for technology and the role of CMS.

More articles from John Eckman


3 responses

  1. There is a fantastic WordPress plugin called Advanced Custom Fields, which really helps us get away from the “one giant box” input issue. It provides a great UI for the person inputting content, and powerful markup output systems for developers.
