The Old Is New

Tonight was a colossal excursion into the dark side of rewrite rules for Apache. And it’s all for you, the reader of my little blog.

Basically, the deal was this. When I converted from MovableType to WordPress, all the old links to material went missing, replaced by WP’s. This was primarily of concern for individual entries and monthly archives. This also meant that all the indexing that Yahoo had done was now busted. I’d waited a long time for one of the search engines to pick me up, so I didn’t want to break everything that was out there!

I started searching for ways to make WP see the old MT entry numbers. I tried changing the entry numbers in MySQL, and that didn’t work very well: I kept getting PHP errors in the scripts. After doing a lot of looking, I decided a RewriteMap was probably my best bet. I only had about 275 entries that needed rewriting, so the map file couldn’t be too hard to generate. And it wasn’t.

The challenge was with the RewriteRule. I’d never played with these before, and I understand them much better now than I did, but I was surprised at the dearth of information on using a RewriteMap from inside one. Everyone mentioned the RewriteMap directive, but integrating the map lookup into the RewriteRule wasn’t well documented. Even my oracle of knowledge, O’Reilly books (in this case, Apache: The Definitive Guide), failed me, containing only about two pages of info. Wainwright’s Professional Apache didn’t help much either. However, between those two books from my bookshelf and a bunch of searching for examples on Yahoo, I finally ended up with the correct syntax. Here’s what it all looks like.

For the VirtualHost for the blog in Apache’s httpd.conf, I added:

RewriteMap mt-to-wp txt:/www/apache/current/htdocs/wordpress/mt-to-wp-map.txt

This points to my map file, which cross-references the old MT entry IDs to those used by WP after its import. Part of that file looks like this:

000001 002 # Dinner
000002 003 # What would you do with $300M?
000003 004 # Road Trip!
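
To make the file format concrete, here is a minimal Python sketch of how a text map like this can be read: the first field on each line is the key, the second is the value, and anything after that (the trailing comment) is ignored. This only illustrates the format. It is not Apache's own code, and the function name parse_rewrite_map is made up for the example.

```python
def parse_rewrite_map(text):
    """Parse 'key value # comment' lines into a dict (illustration only)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and lines that are entirely a comment.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # no value on this line, nothing to map
        # Only the first two fields matter; the trailing comment is dropped.
        mapping[fields[0]] = fields[1]
    return mapping


SAMPLE = """\
000001 002 # Dinner
000002 003 # What would you do with $300M?
000003 004 # Road Trip!
"""

if __name__ == "__main__":
    for mt_id, wp_id in parse_rewrite_map(SAMPLE).items():
        print(mt_id, "->", wp_id)
```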

The comments for each entry are unnecessary, and were left over from the way I created the original list of entries. I started by grabbing the archives.html file from the old MT installation. This had a list of all the entries I had at the point I cut over to WP. I stripped the HTML out of it, did some creative search and replaces, and wound up with a file that had the old MT entry number, a pound sign, and the original entry title.

Then the tedious part. I used mysqlcc to examine the database table, and typed in the number that WP had assigned to each entry. About 20 minutes of work, which isn’t bad. I’m sure there are better (read: programmatic) ways of matching up the entry titles between the flat file and the MySQL database, but there were so few lines to fool with, I figured I could type faster than code it. 🙂
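
For anyone with more entries than patience, the programmatic route might look something like the Python sketch below. It assumes you have already pulled (post ID, title) pairs out of the WP database by some means; the wp_posts list here is a hypothetical stand-in for that export, and build_map_lines is an invented name. Titles are matched case-insensitively, and anything that doesn't match exactly one post is set aside to be fixed by hand.

```python
def build_map_lines(mt_lines, wp_posts):
    """Join 'NNNNNN # Title' lines to (post_id, title) pairs by title.

    mt_lines: lines from the stripped-down MT archive list.
    wp_posts: hypothetical export of (post_id, title) tuples from WP.
    Returns (mapped_lines, unmatched_lines).
    """
    ids_by_title = {}
    for post_id, title in wp_posts:
        ids_by_title.setdefault(title.strip().lower(), []).append(post_id)

    mapped, unmatched = [], []
    for line in mt_lines:
        # Split on the first pound sign: MT number before, title after.
        mt_id, _, title = line.partition("#")
        mt_id, title = mt_id.strip(), title.strip()
        candidates = ids_by_title.get(title.lower(), [])
        if len(candidates) == 1:
            mapped.append(f"{mt_id} {candidates[0]:03d} # {title}")
        else:
            # Missing or duplicated title: leave it for manual review.
            unmatched.append(line)
    return mapped, unmatched


if __name__ == "__main__":
    mt = ["000001 # Dinner", "000003 # Road Trip!"]
    wp = [(2, "Dinner"), (4, "Road Trip!")]
    lines, leftovers = build_map_lines(mt, wp)
    print("\n".join(lines))
```

Duplicate titles are deliberately not guessed at, since a wrong mapping would quietly send readers to the wrong entry.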

The last piece was the RewriteRule to make it go. In the .htaccess file in the WP directory, I added:

RewriteRule ^archives/([0-9]{6})\.html$ /index.php?p=${mt-to-wp:$1} [R,L]

This rule forces anything that looks like an old MT URL to be translated to a WP-centric URL.
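
If the rule looks cryptic, this small Python sketch mimics what it does: match a six-digit archive path, look the captured number up in the map, and build the WP URL. It is a simulation for illustration, not how Apache implements it. The function name rewrite_entry is made up, the mapping is passed in as a plain dict, and the dot before html is escaped so it only matches a literal period. A failed lookup substitutes an empty string, which is what mod_rewrite does when no default is given.

```python
import re

# In .htaccess context the pattern sees the path without the leading slash.
OLD_ENTRY = re.compile(r"^archives/([0-9]{6})\.html$")


def rewrite_entry(path, mapping):
    """Return the redirect target for an old MT entry path, or None."""
    m = OLD_ENTRY.match(path)
    if m is None:
        return None  # not an old MT entry URL; the rule does not apply
    # Equivalent of ${mt-to-wp:$1}: a missing key yields an empty string.
    wp_id = mapping.get(m.group(1), "")
    return f"/index.php?p={wp_id}"


if __name__ == "__main__":
    demo_map = {"000001": "002"}
    print(rewrite_entry("archives/000001.html", demo_map))
```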

That got me past the individual entries. What about the monthly archives? After all, they’re out in the web indexes, too. That chore was easier.

I went to Options/Permalinks as the admin, and told it I wanted links that looked like the old MT archives:

/archives/%year%_%month%.html

The interface will then create the RewriteRules necessary to handle MT-oriented URLs that come in, and make them WP-centric. Don’t save the new permalink structure, however! Simply adding the generated rules to the same .htaccess file I mentioned above is sufficient to let the WP pages work correctly, and to allow the old indexed URLs from MT to work as well.
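
As a rough picture of what those generated rules accomplish, here is a Python sketch that turns an MT-style monthly archive path into a WP query URL. The exact rules WP writes out aren't reproduced in this post, so treat the pattern and the year and monthnum query parameters as assumptions for illustration; rewrite_monthly is an invented name.

```python
import re

# Assumed pattern for /archives/%year%_%month%.html, as seen from .htaccess.
MONTHLY = re.compile(r"^archives/([0-9]{4})_([0-9]{1,2})\.html$")


def rewrite_monthly(path):
    """Map an MT-style monthly archive path to a WP-style query URL."""
    m = MONTHLY.match(path)
    if m is None:
        return None  # six-digit entry URLs and anything else fall through
    year, month = m.groups()
    # 'year' and 'monthnum' are assumed parameter names, not taken from the post.
    return f"/index.php?year={year}&monthnum={month}"


if __name__ == "__main__":
    print(rewrite_monthly("archives/2004_05.html"))
```

Because the monthly pattern requires an underscore, it can't collide with the six-digit entry URLs handled by the RewriteMap rule.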

Cool, huh?

So, I think it all works now, and I can turn my eye toward design and enhancements. Watch this space!

(BTW, it still appears that Google refuses to index my sites, despite repeatedly submitting them. I guess they are very concerned that since few — if any! — other sites have linked to mine, the information here must not be worthy of indexing. Yahoo, OTOH, has taken the high road, and indexes me all the time. It’s fun to watch. 🙂 )