New Method for Redirecting to www
January 23rd 2007
During my server setup I discovered an interesting new way to solve the age-old http://domain.com vs. http://www.domain.com problem, one that I think is less resource-intensive and more logical.
The problem, for those unfamiliar, is that Google sees those two addresses as separate sites, assigns them separate PageRanks, and can even treat them as duplicate content. That’s not good! To combat this, Google’s webmaster tools page lets you select which address you prefer, but that’s been unreliable for me. Zenphoto.org still has split PageRanks even though I told Google to always use the www subdomain. For that reason (and others, like consistency of the URL users see) I think it’s best taken care of server-side.
Before I set up my shiny new VPS, I had my domains set up (like most do, I’m sure) so everything simply worked through either the http://trisweb.com/ or http://www.trisweb.com addresses using a simple Apache ServerAlias directive, like so:
<VirtualHost *>
    ServerName www.trisweb.com
    ServerAlias trisweb.com
    DocumentRoot /var/www/trisweb
    ...
</VirtualHost>
In Google’s eyes, that gives the split site, with both the www and non-www addresses allowed. The second method is to use Apache’s mod_rewrite module to redirect to the correct URL, like so (found here):
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^mydomain.com [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,NC]
But this method also has problems, because in my experience mod_rewrite has bugs. It doesn’t correctly rewrite a few characters in URLs, which I discovered in my rewrite-hacking for Zenphoto and rediscovered after trying mod_rewrite for this task. Some characters in my Zenphoto URLs weren’t being redirected correctly: going to http://www.trisweb.com/photos/index.php?a=album%20one would work, but http://trisweb.com/photos/index.php?a=album%20one would cause the URL to be mangled in the rewrite process, making Zenphoto choke, unable to find the album. In addition, you’ve got the extra (admittedly small) overhead of running the rewrite engine for every single request to your site, just to check whether the host is missing the www in front.
The third method, and the one I’m currently using and testing, is something I’ve never actually seen done before. I got the idea from Matt in this post about Wildcard DNS and redirecting to the non-www version of your domain. Though I’m doing the opposite, the idea is essentially the same. Here’s the setup:
# Redirect to www:
<VirtualHost *>
    ServerName trisweb.com
    DocumentRoot /var/www/redirect
    RedirectPermanent / http://www.trisweb.com/
</VirtualHost>

<VirtualHost *>
    ServerName www.trisweb.com
    DocumentRoot /var/www/trisweb
    ...
</VirtualHost>
As you can see, the configuration is split into two virtual hosts, one handling plain trisweb.com and one handling www.trisweb.com. As in Matt’s redirect, the RedirectPermanent line does the actual 301 redirect, but it’s only invoked when a request arrives without the www subdomain. All subsequent requests are handled by the main virtual host, which is normal except for the removal of the non-www ServerAlias.
This makes for a few improvements. First, no mod_rewrite. RedirectPermanent is part of mod_alias, which is included in a default Apache build. It’s also much less buggy in my experience, and it preserves the URL very well, so it should work with every legal URL on your site. Second, almost no overhead. There’s one hit for the redirection, and after that nothing else is even checked, because you’re on the right virtual host already. Third, this is very customizable. If you decide you don’t want www anymore, switch it around: give the first virtual host ServerName www.domain.com (plus ServerAlias *.domain.com if you want to catch every subdomain, since wildcards only work in ServerAlias) and redirect it to the other virtual host, which serves only domain.com.
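For illustration, the switched-around version described above might look something like the following. This is only a sketch, not something I’ve run in production: it reuses the trisweb.com names and paths from the earlier example, and the ServerAlias *.trisweb.com line is an optional assumption that only makes sense if wildcard DNS already points every subdomain at this server.

```apache
# Redirect www (and, optionally, any other subdomain) to the bare domain:
<VirtualHost *>
    ServerName www.trisweb.com
    # Optional: wildcards are accepted in ServerAlias, not in ServerName
    ServerAlias *.trisweb.com
    DocumentRoot /var/www/redirect
    RedirectPermanent / http://trisweb.com/
</VirtualHost>

# The real site, now answering only to the bare domain:
<VirtualHost *>
    ServerName trisweb.com
    DocumentRoot /var/www/trisweb
    ...
</VirtualHost>
```

Because the redirect host is listed first here, it also becomes the default for any request whose Host header matches neither block, so stray hostnames would be redirected to trisweb.com as well. If that isn’t what you want, list the main virtual host first.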
It’s not that big a difference I suppose, but I think this way should be more robust and compatible, especially if you have a lot of “interesting” script URLs in your site. I’ll see how it works in the long run and keep this updated. Also, if anyone sees any reason this is a bad idea, please comment and let me know.

On second thought, that may have been how Matt meant it to be done in the first place. Ah well, this is a decent explanation.
The first virtual host is the default, so put the main one first:
<VirtualHost *>
    ServerName www.trisweb.com
    DocumentRoot /var/www/trisweb
    …
</VirtualHost>
Then your redirects.