Consider the many ways you could arrive at my blog:
All of them are valid, but only the last URL is canonical. A canonical URL is the "standard" or "authoritative" pointer to a piece of content. So what if several URLs reference the same thing? Well, there are two issues:
I use Apache's mod_rewrite module to rewrite and redirect incoming requests to the canonical URL. I thought I'd share the two rules I use to construct my canonical URLs.
I redirect all requests for www.ericw.ca to ericw.ca.
RewriteCond %{HTTP_HOST} !^ericw\.ca [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^/+(.*) http://ericw.ca/$1 [R=301,L]
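The first condition matches any request whose hostname does not start with "ericw.ca", compared case-insensitively because of the NC flag (in practice, requests with a subdomain such as www). The second condition skips requests that send an empty Host header. The rule then rebuilds the URL on the canonical hostname and uses a 301 Moved Permanently redirect to point to the correct location.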
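To check which requests this rule catches, the same logic can be mimicked outside Apache. The following is only a rough Python sketch, not how mod_rewrite works internally: the function name host_redirect is made up for illustration, and it assumes the rule sees a path that starts with a slash (as it does in the server configuration, as opposed to a .htaccess file).

```python
import re
from typing import Optional

CANONICAL_HOST = "ericw.ca"


def host_redirect(host: str, path: str) -> Optional[str]:
    """Return the 301 target URL, or None when the rule does not apply.

    Hypothetical helper that imitates the two conditions and the rule above.
    """
    # RewriteCond %{HTTP_HOST} !^ericw\.ca [NC]
    # Hosts that already start with the canonical name are left alone.
    if re.search(r"^ericw\.ca", host, re.IGNORECASE):
        return None
    # RewriteCond %{HTTP_HOST} !^$
    # An empty Host header is left alone as well.
    if host == "":
        return None
    # RewriteRule ^/+(.*) http://ericw.ca/$1 [R=301,L]
    # Strip the leading slashes and rebuild the URL on the canonical host.
    match = re.search(r"^/+(.*)", path)
    if match is None:
        return None
    return f"http://{CANONICAL_HOST}/{match.group(1)}"


if __name__ == "__main__":
    for host in ("www.ericw.ca", "ericw.ca", ""):
        print(repr(host), "->", host_redirect(host, "/blog/"))
```

Only the www.ericw.ca request produces a redirect target here; the bare ericw.ca host and the empty host both fall through untouched.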
I redirect every request for a "non-file" to the same URL with a trailing slash.
RewriteCond %{REQUEST_URI} !/[^/]+\.[^/]+$
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^/(.*) http://ericw.ca/$1/ [R=301]
The first condition skips requests for files, that is, URLs that roughly match /.../file.extension, because these should not get a slash. The second condition skips requests that already end in a trailing slash.
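The "roughly" matters: the file test is purely textual, so any last path segment containing a dot counts as a file. A small Python sketch makes this easy to try. The function name trailing_slash_redirect and the sample paths are invented for illustration, and the sketch ignores query strings.

```python
import re
from typing import Optional


def trailing_slash_redirect(uri: str) -> Optional[str]:
    """Return the 301 target URL, or None when the rule does not apply.

    Hypothetical helper that imitates the two conditions and the rule above.
    """
    # RewriteCond %{REQUEST_URI} !/[^/]+\.[^/]+$
    # Skip anything whose last segment looks like name.extension.
    if re.search(r"/[^/]+\.[^/]+$", uri):
        return None
    # RewriteCond %{REQUEST_URI} !(.*)/$
    # Skip URIs that already end in a slash.
    if re.search(r"(.*)/$", uri):
        return None
    # RewriteRule ^/(.*) http://ericw.ca/$1/ [R=301]
    # Append the trailing slash and redirect.
    match = re.search(r"^/(.*)", uri)
    if match is None:
        return None
    return f"http://ericw.ca/{match.group(1)}/"


if __name__ == "__main__":
    for uri in ("/blog/archive", "/blog/archive/", "/images/logo.png", "/blog/v1.2"):
        print(uri, "->", trailing_slash_redirect(uri))
```

Of the four sample paths, only /blog/archive is redirected. Note that /blog/v1.2 is treated as a file because of the dot in its last segment, so it never gets a slash.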
Are your URLs canonical?