Whether you are moving from HubSpot, Blogger, TypePad, Drupal, HTML, Movable Type or a custom CMS into WordPress, it's handy to know how much work is involved (how many pages or posts), even before you have usernames and passwords.
This is the second in a series of posts where you can learn how to use the SEO Spider and Excel to see the differences between HubSpot, Blogger and Drupal URL structures.
We tried many tools and settled on the Screaming Frog SEO Spider for our pre-move assessments. We ponied up for the Pro version and removing the 500 URL limit is well worth it. Let’s look at how to use the SEO Spider before a migration project begins.
First, launch the SEO Spider, crawl the site, and then export your results as a .csv or Excel spreadsheet.
Second, in Excel click on the Data tab and then click on Filter. You will now see the little drop down menu for each column. Here is where it gets specific and powerful.
If you are migrating from Blogger to WordPress:
Let’s see how many blog posts we have. The difference between a Blogger move and other CMSs is in the URL structure filtering.
I like to expand the column width of Column A (Address) so that I can read most URLs before I begin.
- Click on the Content Filter drop down, in Column B. It is usually found on Row 2.
- Click on Select All to uncheck all boxes.
- Select the boxes that start with Text/html; charset=UTF-8. Generally the other two options visible relate to XML feeds.
- Click OK.
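If you would rather script this step than click through Excel, here is a minimal sketch of the same content filter in Python. It assumes the export has columns named Address and Content, as in the spreadsheet described above; the three sample rows and the feed's content type are made up for illustration, and a real export will have many more columns.

```python
import csv
import io

# Hypothetical sample of an exported crawl (illustration only).
SAMPLE_EXPORT = """Address,Content
http://blog.domain-name.com/2008/05/post-name.html,text/html; charset=UTF-8
http://blog.domain-name.com/feeds/1058267882026467620/comments/default,application/atom+xml; charset=UTF-8
http://blog.domain-name.com/2007_11_01_archive.html,text/html; charset=UTF-8
"""


def html_rows(rows):
    """Keep rows whose Content value starts with text/html (case-insensitive)."""
    return [
        row for row in rows
        if row["Content"].strip().lower().startswith("text/html")
    ]


# Read the sample as a list of dicts keyed by column name.
rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
pages = html_rows(rows)
print(len(pages), "of", len(rows), "URLs are HTML pages")
```

To run it against a real export, swap the `io.StringIO(SAMPLE_EXPORT)` for an opened file handle.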
In Blogger, a post and its comments share the same /year/month/ URL structure:
Post – http://blog.domain-name.com/2008/05/post-name.html
Comment – http://blog.domain-name.com/2008/05/post-name.html?showComment=1210259340000
Comment – http://blog.domain-name.com/2008/05/post-name.html?showComment=1211221980000
Comment – http://blog.domain-name.com/2008/05/post-name.html?showComment=1211988840000
Thankfully, the archive pages use a different string to show the date.
http://blog.domain-name.com/2007_11_01_archive.html
http://blog.domain-name.com/2008_12_01_archive.html
Feed URLs are distinctive.
http://blog.domain-name.com/feeds/1058267882026467620/comments/default
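The four URL shapes above (post, comment, archive, feed) can also be told apart in code. This is a rough sketch based only on the example URLs in this post; the function name `classify` and the regular expressions are my own assumptions, and a real Blogger site may have URL shapes not covered here.

```python
import re
from urllib.parse import urlsplit

# Patterns inferred from the example URLs in this post (assumptions).
POST_PATH = re.compile(r"^/\d{4}/\d{2}/[^/]+\.html$")
ARCHIVE_PATH = re.compile(r"^/\d{4}_\d{2}_\d{2}_archive\.html$")


def classify(url):
    """Label a Blogger URL as feed, archive, comment, post or other."""
    parts = urlsplit(url)
    if parts.path.startswith("/feeds/"):
        return "feed"
    if ARCHIVE_PATH.match(parts.path):
        return "archive"
    if POST_PATH.match(parts.path):
        # A comment link is the post URL plus a showComment query string.
        return "comment" if "showComment=" in parts.query else "post"
    return "other"


for example in (
    "http://blog.domain-name.com/2008/05/post-name.html",
    "http://blog.domain-name.com/2008/05/post-name.html?showComment=1210259340000",
    "http://blog.domain-name.com/2007_11_01_archive.html",
    "http://blog.domain-name.com/feeds/1058267882026467620/comments/default",
):
    print(classify(example), example)
```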
For Blogger, we need to filter for /year/month/ and exclude Comment. Here is how to do that.
It may already be obvious that we can’t filter on a single year or month, because that would exclude all the others. We avoid that issue by using the wildcard character *.
- Click on the Address Filter drop down, in Column A. It is usually found on Row 2.
- Click on Text Filters and then choose Custom Filter.
- In the drop down, select “Contains” and type in /20**/**/
- In the second drop down, select “does not contain” and type in Comment
- Click OK.
Now you can see just the blog posts in Blogger.
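For anyone who wants to double-check the count outside Excel, the custom filter can be approximated in Python. This is only a sketch: the helper name `is_blog_post` is hypothetical, and the pattern uses `?` (exactly one character) as a slightly stricter stand-in for the `**` typed into the filter box. The URL is lowercased so that "Comment" is excluded whatever its capitalization.

```python
from fnmatch import fnmatchcase


def is_blog_post(url):
    """Approximate 'contains /20??/??/' AND 'does not contain Comment'."""
    # The leading and trailing * let the date path match anywhere in the URL.
    has_date_path = fnmatchcase(url, "*/20??/??/*")
    no_comment = "comment" not in url.lower()
    return has_date_path and no_comment


urls = [
    "http://blog.domain-name.com/2008/05/post-name.html",
    "http://blog.domain-name.com/2008/05/post-name.html?showComment=1210259340000",
    "http://blog.domain-name.com/2007_11_01_archive.html",
    "http://blog.domain-name.com/feeds/1058267882026467620/comments/default",
]

posts = [u for u in urls if is_blog_post(u)]
print(len(posts), "blog post(s) found")
```

Only the first URL survives both conditions, which matches what the filtered spreadsheet shows for these examples.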
Be sure to read the two other related posts. Each one has unique lessons that will still apply regardless of the CMS you are using. The first in the series discusses moving from HubSpot to WordPress.
The other post reveals a few more tips and looks at migrating from Drupal to WordPress.