Avoiding content migration train wrecks

We’re back again with David Hobbs, author of the Website Migration Handbook.

The first part of our conversation covered the fundamentals of website migration, but we pressed David to give us some of the gory details on migrations run amok.

Here’s the second half of our talk.

OK David, let’s say that someone didn’t read your ebook in time and now has to migrate a large website in a very short amount of time. Any tips on how to make this process less catastrophic so they can avoid that photo of a train wreck you have on page 65?

In that case I would recommend two things: 1) restate the vision, and 2) estimate manual effort. By restating why you are even attempting the project, you can help focus everyone. If the goal is browsing by richer metadata, then everyone can focus on that. If the goal is to provide an archive of older content, then everyone knows that quality isn’t as high a priority. Also, by estimating manual effort you can make more judgment calls, including simply accepting lower quality. If you blindly push forward when the launch is imminent, you can end up with even bigger train wrecks when more manual effort turns out to be required. That said, I intentionally made the handbook short and direct with lots of checklists, so hopefully people will take the time to read it!
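As a rough illustration of what estimating manual effort can look like, here is a minimal sketch. It is an editorial example, not something taken from the handbook: the content types, page counts, minutes per page, and team capacity are all hypothetical placeholders you would replace with numbers from your own content inventory.

```python
from dataclasses import dataclass


@dataclass
class ContentType:
    name: str
    pages: int
    minutes_per_page: float  # assumed manual handling time, not a measured value


def estimate_hours(content_types):
    """Return the total manual effort in person-hours."""
    total_minutes = sum(ct.pages * ct.minutes_per_page for ct in content_types)
    return total_minutes / 60


def fits_deadline(total_hours, people, working_days, hours_per_day=6.0):
    """Return True if the team's capacity covers the estimated effort."""
    capacity = people * working_days * hours_per_day
    return total_hours <= capacity


# Hypothetical inventory: every figure here is made up for illustration.
inventory = [
    ContentType("press releases", pages=1200, minutes_per_page=5),
    ContentType("product pages", pages=300, minutes_per_page=20),
    ContentType("archive articles", pages=4000, minutes_per_page=2),
]

total = estimate_hours(inventory)
print(f"Estimated manual effort: {total:.0f} person-hours")
print("Fits with 4 people in 10 days:", fits_deadline(total, people=4, working_days=10))
```

If the estimate does not fit, that is the moment to make a judgment call: add people, move the date, or accept lower quality for the lowest-priority content types.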

Staying on worst-case scenarios for a moment: aside from not launching on time, what’s going to happen if an organization really botches a website migration?

Sometimes failures are spectacular and other times a bit more grinding. The more prominent types are when organizations start a large replatforming project, spend a lot of money, and then stop before the project launches. These can really burn teams and put everyone off the needed work for years. Other more obvious cases are when internal stakeholders openly rebel against the new system (which is one of the reasons to be clear from the start about how to have a productive engagement). Then there’s the slow failure, where perhaps there’s a launch party with everyone declaring victory, but the site was built in such a slipshod manner that it really is unmanageable. I’m thinking of creating an anonymous hotline where people can report failures so that others can learn from them.

We get a lot of clients who want to automate the content migration effort to avoid the manual copy and paste effort. When does an automated approach make sense?

One way to look at it is that you should automate whenever you can! The problem is that people often narrow their question to the easy aspect of the migration (copying and pasting), completely overlooking the harder issues. Folks should always consider all the steps of handling content (Sort, Place, Edit, Move, Enhance, and QA), for example to make sure that human editing is accounted for (and here I’m not talking about HTML editing, but editing the words). Sometimes rules can be defined and applied automatically to each step, and sometimes they cannot (and sometimes steps can be skipped entirely).
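One way to make that step-by-step thinking concrete is to write down, for each type of content, how every step will be handled. The sketch below is an editorial illustration only: the step names come from David’s list, but the `Handling` categories, the `plan_for` and `manual_steps` helpers, and the press release example are hypothetical.

```python
from enum import Enum

# Step names as listed in the interview.
STEPS = ["Sort", "Place", "Edit", "Move", "Enhance", "QA"]


class Handling(Enum):
    AUTOMATED = "automated"  # a rule can be defined and applied automatically
    MANUAL = "manual"        # a person has to do it
    SKIPPED = "skipped"      # the step is not needed for this content


def plan_for(rules):
    """Return the handling for every step, defaulting to MANUAL when no rule is given."""
    unknown = set(rules) - set(STEPS)
    if unknown:
        raise ValueError(f"Unknown steps: {sorted(unknown)}")
    return {step: rules.get(step, Handling.MANUAL) for step in STEPS}


def manual_steps(plan):
    """List the steps that still need human effort, in step order."""
    return [step for step in STEPS if plan[step] is Handling.MANUAL]


# Hypothetical example: press releases can be copied over by script,
# but nobody has defined a rule for editing the words or for QA.
press_release_plan = plan_for({
    "Sort": Handling.AUTOMATED,
    "Place": Handling.AUTOMATED,
    "Move": Handling.AUTOMATED,
    "Enhance": Handling.SKIPPED,
})

print(manual_steps(press_release_plan))
```

Defaulting unlisted steps to manual is a deliberate choice in this sketch: it keeps the harder steps visible instead of letting them disappear behind the copy-and-paste automation.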

I liked the part where you warned organizations against making the taxonomy more complex than is needed to support site functionality. But to play devil’s advocate, a more detailed taxonomy can provide greater flexibility for future functionality and search. If you’re going through a big content migration effort, isn’t it better to err on the side of better-defined data?

Great question. In theory, I would agree with you. The problem is that in practice that’s not how it works! Especially during a migration, people are only doing the bare minimum required to get content in (whether automated or manual). So the tagging just isn’t going to be high quality unless there’s a feedback loop to show the quality of that tagging. The best feedback (and perhaps the only kind that matters) is to have metadata-driven functionality on the site that people care about. So if there’s no strong feedback loop, then the tagging quality will be low. Some may point to automated concept extraction, and that may work if you have either a standard news-driven site using terms for which tools already have the right rules, or a very large site where you are committed to maintaining the engine. But the bottom line, in my opinion, is that you want to hit the metadata sweet spot.

Aim for the metadata sweet spot

Content migration for large sites can be extremely tedious. Aside from tequila, any advice for how to keep teams energized and happy while keying in thousands of fields of metadata for a widget catalog?

Hopefully, things can be set up in the first place to reduce monotony and/or speed things up so they take less time. An obvious thing to do there is to automate as much as possible. Anyway, if the migration is in full swing I would recommend three things: 1) keep in mind the vision of why the whole project is occurring, 2) track and broadcast progress, and, especially for large migrations, 3) instill a sense of competition between teams, which can often spur them to work faster.

Thanks for all the great tips, David, and for publishing your Website Migration Handbook. Happy migrating, folks!

About the Author
Jeff Cram

Jeff Cram is Chief Strategy Officer and co-founder of Connective DX (formerly ISITE Design), a digital agency based in Portland, OR and Boston, MA. As the Managing Editor of the CMS Myth, Jeff is passionate about all topics related to content management, digital strategy and experience design.


One response

  1. Whitney says:

    “The best (and perhaps only one that matters) feedback is to have metadata-driven functionality on the site that people care about. So if there’s no strong feedback loop, then the tagging quality will be low.”

    I have been screaming this at the top of my little lungs during our last TWO CMS do-overs, but no one ever listened. We are now in the midst of yet another taxonomy nightmare. Thank you, I am not crazy after all.
