Drupal Data Migration Made Simple with Feeds

Make Drupal data migrations less painful with the Feeds module.
by Vincent Massaro on 8 July 2013

Transcript of Drupal Data Migration Made Simple with Feeds

Data Migration Made Simple with Feeds
What is Feeds?
Feeds allows us to import or aggregate data as nodes, users, taxonomy terms or simple database records.
One-off imports and periodic aggregation of content
Import or aggregate RSS/Atom feeds
Import or aggregate CSV files
Granular mapping of input elements to Drupal content elements
and much more!
...but let's focus on CSV files
...the heck is a CSV?
Comma-separated values
Stores tabular data in plain-text form
A simple example of some car data:
Header row:
Year,Make,Model,Length
Data rows:
1997,Ford,E350,2.34
2000,Mercury,Cougar,2.38
Feeds allows us to import content into Drupal in this format!
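To make the header row and data rows idea concrete, here is a minimal Python sketch that reads the car data above. It only illustrates the CSV format itself; it is not how Feeds parses files, and the names CAR_CSV and read_rows are made up for this example.

```python
import csv
import io

# The car data from the slide, held in memory so the sketch is self-contained.
CAR_CSV = """Year,Make,Model,Length
1997,Ford,E350,2.34
2000,Mercury,Cougar,2.38
"""

def read_rows(text):
    """Return each data row as a dict keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

rows = read_rows(CAR_CSV)
for row in rows:
    # Every value arrives as a string; the header names are the keys.
    print(row["Year"], row["Make"], row["Model"], row["Length"])
```

The header row becomes the set of keys, which is the same idea Feeds relies on when you map source columns to fields.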
Feeds monster by Saman Bemel Benrud
drupal.org/project/feeds
drupal.org/project/feeds_tamper
http://regexr.com
http://en.wikipedia.org/wiki/Comma-separated_values
http://office.microsoft.com/en-us/excel-help/top-ten-ways-to-clean-your-data-HA010221840.aspx
Case study: YaleNews
Migrate university news site (Yale Daily Bulletin) into Drupal and rebrand
Custom ASP.NET news publishing system
Microsoft SQL Server database
7000+ news articles to be migrated
Data migration? I'm no expert in data migration.
There are 100 ways to do anything in Drupal.
This is just one.
This approach involved familiar tools
and did not require me to write code.
Step one:
Learn MS SQL

Just kidding.
Start with a clean database dump.
I was provided a complex MS SQL query to extract the data I needed.

You may not be so lucky. The good news is there are many resources online that can help you.
Use 'Save results as...' to export a UTF-8 CSV.
You don't want to lose foreign or special characters.
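One way to confirm the export really is UTF-8 before going further is a quick check like the sketch below. It is an illustration only: find_non_utf8_lines and the sample strings are hypothetical, and the cp1252 sample simply stands in for a file saved with a Windows code page by mistake.

```python
def find_non_utf8_lines(data):
    """Return 1-based numbers of the lines in `data` (bytes) that are not valid UTF-8."""
    bad_lines = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad_lines.append(number)
    return bad_lines

# Hypothetical sample content, encoded two different ways.
sample_utf8 = "Title\nCafé society\n".encode("utf-8")
sample_cp1252 = "Title\nCafé society\n".encode("cp1252")

print(find_non_utf8_lines(sample_utf8))    # no problem lines expected
print(find_non_utf8_lines(sample_cp1252))  # the line with the accented character is flagged
```

With a real export you would read the file in binary mode and pass its bytes to the function.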
Step two:
Build the content type

Analyze your CSV and create matching fields to map the data to.
Fields like these, and more, can be imported by Feeds:
Title
Lead copy (body)
Teaser
Press Contact
Date
Image
Article ID
Topics
We end up with something like this:
Core node fields
CCK Text
CCK Date
CCK Filefield
CCK Integer
Taxonomy
Step four:
Massage that CSV

Analyze your data with Excel or OpenOffice Calc and clean it up.
Get really close to it because you're going to be spending a lot of quality time together.
This is a trial and error process based on the data, and every project throws in its own unforeseen wrench, but here are some useful tips for any project.
Use the clean() function around your data to remove nonprintable control characters and any invisible Microsoft Word copy and paste garbage.
=Clean(A1) would return "hi there"
=Clean(A2) would return "this is a test"
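If you would rather clean the file outside a spreadsheet, the same idea can be sketched in Python. This is a rough stand-in, assuming the goal is to strip the ASCII control codes 0 through 31; the inputs are invented, since the slides do not show what cells A1 and A2 contained.

```python
import re

# ASCII control codes 0 through 31, the nonprinting range this sketch targets.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f]")

def clean(text):
    """Remove nonprintable ASCII control characters from a string."""
    return _CONTROL_CHARS.sub("", text)

# Hypothetical dirty values with a bell, a vertical tab and a NUL mixed in.
print(clean("hi\x07 there"))
print(clean("this is\x0b a test\x00"))
```

Note that this range includes tabs and line breaks, so apply it per cell rather than to a whole file at once.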
Vincent Massaro
A sample of the query result:
Use the Feeds Tamper add-on module. It allows you to modify data before Feeds saves it. One example is to convert date fields to a UNIX timestamp on import.
Working with dates can be tricky and this makes date imports 'just work'.
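For a feel of what that date conversion does, here is a small Python sketch. It is not Feeds Tamper code. It assumes dates arrive as YYYYMMDD strings, like the 20110109 value that appears in these slides, and that midnight UTC is an acceptable reading; both helper names are made up.

```python
from datetime import datetime, timezone

def to_unix_timestamp(raw, fmt="%Y%m%d"):
    """Parse a date string and return seconds since the epoch, treating it as UTC."""
    parsed = datetime.strptime(raw, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

def to_iso(raw, fmt="%Y%m%d"):
    """Parse a date string and return it in ISO 8601 form."""
    return datetime.strptime(raw, fmt).isoformat()

print(to_unix_timestamp("20110109"))  # 1294531200
print(to_iso("20110109"))             # 2011-01-09T00:00:00
```

If your site uses a different time zone, the timestamp shifts accordingly, so check one imported node by hand.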
Check links and image paths in your body content and make them relative or absolute as needed. Regular expression find/replace is helpful here.
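A regular expression find/replace of that kind might look like the Python sketch below. It assumes the body copy holds img tags with double-quoted src attributes and that every relative image lives under the base URL seen in these slides; absolutize_image_paths is a hypothetical helper, and real content will likely need a more forgiving pattern.

```python
import re

# Base URL taken from the image example in the slides.
BASE_URL = "http://opa.yale.edu/images/articles/"

# Matches src="..." values that have no http(s) scheme and no leading slash.
_RELATIVE_SRC = re.compile(r'src="(?!https?://|/)([^"]+)"')

def absolutize_image_paths(html, base_url=BASE_URL):
    """Prefix relative src values with base_url; leave absolute ones alone."""
    return _RELATIVE_SRC.sub(
        lambda match: 'src="%s%s"' % (base_url, match.group(1)), html
    )

print(absolutize_image_paths('<img src="8753-54954998.jpg">'))
```

Running it twice is harmless, because paths that are already absolute no longer match.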
If you map an absolute file URL to a CCK Filefield, Feeds will download the file automatically. Use CCK Imagefields for images, and CCK Filefields for files.
Step three:
Build a Feeds importer

Date example: 20110109 becomes 2011-01-09T00:00:00
Image path example: 8753-54954998.jpg becomes http://opa.yale.edu/images/articles/8753-54954998.jpg
Map the header row source fields to their matching target fields in your content type. They must match as written in your CSV (case-sensitive).
Taxonomy and some multi-value CCK fields support being mapped to multiple times.
Always set a Globally Unique Identifier (GUID)! The GUID tells Feeds what is unique so it doesn't process things more than once. Without a GUID on a CSV import, Feeds may loop endlessly.
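Before handing a file to the importer, it can help to check these two points yourself: that each mapped source name matches the header exactly, and that the GUID column is filled in and unique. The Python sketch below is one way to do that outside Drupal. It does not touch Feeds; check_import_file, the sample rows and the use of Article ID as the GUID are all assumptions for illustration.

```python
import csv
import io
from collections import Counter

def check_import_file(text, sources, guid_column):
    """Report mapping names missing from the header, plus empty or repeated GUIDs."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    # Comparison is exact, so "date" does not match a "Date" column.
    missing = [name for name in sources if name not in header]
    guids = [(row.get(guid_column) or "").strip() for row in reader]
    counts = Counter(guid for guid in guids if guid)
    return {
        "missing_sources": missing,
        # Line numbers assume the header is line 1 and no field spans lines.
        "empty_guid_rows": [n for n, guid in enumerate(guids, start=2) if not guid],
        "duplicate_guids": sorted(g for g, n in counts.items() if n > 1),
    }

# Made-up rows using field names from the example content type.
SAMPLE = """Article ID,Title,Date
8753,First story,20110109
8754,Second story,20110110
8753,First story again,20110111
,Untitled,20110112
"""

report = check_import_file(SAMPLE, ["Article ID", "Title", "date"], "Article ID")
print(report)
```

Any non-empty list in the report is worth fixing in the CSV or the mapping before you import.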
Step five:
Test the import

Start small
Begin with a few rows that are an inclusive, representative sampling of the data.
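One quick way to carve out a small test file is to pull a handful of rows at random and keep the header, as in the Python sketch below. The helper sample_csv and the demo data are hypothetical, and a random pick is only a starting point: add any known awkward rows (long bodies, missing images, odd characters) by hand so the sample stays representative.

```python
import csv
import io
import random

def sample_csv(text, size, seed=0):
    """Return CSV text with the header plus `size` randomly chosen data rows."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    count = min(size, len(data))
    # Sort the chosen positions so the sample keeps the original row order.
    chosen = sorted(random.Random(seed).sample(range(len(data)), count))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(data[i] for i in chosen)
    return out.getvalue()

# Made-up data: a header and ten numbered rows.
demo = "Article ID,Title\n" + "".join("%d,Story %d\n" % (i, i) for i in range(1, 11))
small = sample_csv(demo, 3)
print(small)
```

A fixed seed means you get the same sample on every run, which makes the import, analyze, repeat loop easier to compare.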
Import
Analyze
Rinse
Repeat
References
Feeds allows you to delete imported items. Use this when testing to rapidly remove and reimport.
When the nodes generated by your sample imports look correct, hand your full CSV file to Feeds and let it churn. It may take some time, but you can finally enjoy the fruits of your labor.
It's how we went from this:
To this:
Thanks!
Questions?
Some example fields:
Some additional work up front for automation is always better than manual copy & paste, but always weigh the cost vs. benefit first.
Magic!
The trial & error process will take some time, and you will need to progressively tweak your CSV file and importer.
Office of Public Affairs & Communications
vincent.massaro@gmail.com