SUSE Conversations


Using Apache mod_rewrite for URL Redirection

By: mfaris01

April 3, 2008 7:36 am

Apache Web Server offers many modules for your various needs, from helping integrate eDirectory authentication to PHP and MySQL support. There is one module that is widely misunderstood and underutilized. That’s the mod_rewrite module.

The mod_rewrite module lets you redirect browsers from one URL to another and replace cryptic back-end paths with more meaningful, search-engine-friendly addresses that the end user sees in the address bar. You can redirect users to a new domain while still allowing them to enter the old domain name in their browsers. You can increase web site security by presenting alternative URL paths to the browser while still reaching the proper back-end paths. The possibilities are numerous. I will present several of the more popular uses here and walk through how and why, because if you don’t understand what you did, you’ll never be able to do it again on your own.

Let’s start off by explaining what mod_rewrite is not. It does not create URLs that do not exist. It does not create site content that does not exist.

Basically, mod_rewrite allows you, the site administrator, to control your site and its content, including who can visit your site, what pages they are allowed to see when they visit, and what URLs a search engine or visitor is allowed to see.

The module mod_rewrite is already included in SUSE Linux Enterprise Server, so no additional installations or Apache Web Server compiles are necessary. You simply turn it on in the configuration file, which we will cover in a little bit.

Whether or not you are new to mod_rewrite, if you would like an excellent resource dedicated strictly to it, please visit http://forum.modrewrite.com/

Before we jump in and look at what mod_rewrite can do, I will add a caveat about mod_rewrite that should be stated. Mod_rewrite, although very useful, can send your Apache web server into an infinite loop, even to the point of requiring a reboot of your server. The point is to think about and map your settings for mod_rewrite on a test system, BEFORE you implement these changes into a production environment.

Simple Redirection

Let’s say that you are using the SLES Apache defaults, where your DocumentRoot is /srv/www/htdocs, and you have a directory under that called /widgets. That is where your default web page directs customers to view the different widgets your company sells.

The URL would look something like http://www.yourcompany.com/widgets/widget1.html

Simple enough. It is easy to find with the leading search engines, and your site appears in the top five results for users wanting to find a widget. Now, your marketing guy comes to you and tells you that you will be carrying another type of widget, and he wants to change the /widgets directory to /allwidgets. You could simply change the code in your default web page to reflect the new path. The problem is that you will break the link for any users who have it bookmarked, and search engines will still return the old link in their results until they re-scan your site. That’s where mod_rewrite can make everyone happy.

Create a file in your default directory, /srv/www/htdocs, and call it .htaccess. That’s (DOT)htaccess. Edit this file with your favorite text editor.

The first thing we have to do is enable mod_rewrite. Enter the following in this file:

Options +FollowSymLinks
RewriteEngine On

This tells Apache to enable mod_rewrite operations.
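One thing worth checking at this point: Apache only honors directives in an .htaccess file when the enclosing directory allows overrides, and the module itself has to be loaded (on SLES this is normally handled through the APACHE_MODULES setting or the a2enmod helper). The following is a minimal sketch of the server-side configuration, assuming the /srv/www/htdocs DocumentRoot used in this article. Which configuration file holds this block on your system, and whether overrides are already allowed there, are assumptions you should verify before changing anything.

```apache
# Sketch only: this belongs in the main server configuration, not in .htaccess.
<Directory "/srv/www/htdocs">
    # FileInfo permits RewriteEngine and RewriteRule in .htaccess;
    # Options permits the "Options +FollowSymLinks" line.
    AllowOverride FileInfo Options
</Directory>
```

Apache has to be reloaded after a change to the main configuration; changes to .htaccess itself take effect on the next request.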

Move all of the files in /widgets directory to the /allwidgets directory.
Now we have to tell mod_rewrite, “Send anyone requesting content from /widgets/ to the /allwidgets directory and display the new URL in the browser address line.”

Add the following line to your .htaccess file after the RewriteEngine On directive:

RewriteRule ^widgets/([a-z0-9]+)\.html$ /allwidgets/$1.html [R=301,NC,L]

Let’s break this down.

RewriteRule tells the server to implement the following rule. The syntax is

RewriteRule Pattern Substitution [Flags]

Pattern is a regular expression that is matched against the requested URL path. In an .htaccess file, the path is relative to the directory that holds the file, so it has no leading slash.

Substitution is the path or URL that the request is rewritten or redirected to.

^ defines the beginning of the path, or starting anchor. Remember that ^ also means NOT when it is the first character inside square brackets, as in [^0-9].

widgets/
If the request starts with widgets/

([a-z0-9]+)
AND is followed by one or more letters or numbers, store them as a variable. The + means “one or more”; without it, the brackets would match only a single character.

\.
Match a literal dot. The backslash escapes the dot, which would otherwise match any character.

html
Match that this request ends in html

$
Anchor the match to the end of the path.

/allwidgets/
Redirect to /allwidgets/

$1
The variable we matched above

.html
Add the file extension

[R=301,NC,L]

R – Redirect the browser
=301 – Tell the browser and search engine that this is a permanent move.
NC – The pattern match is case insensitive.
L – Stop processing further rules for this request.

So, with our example rule, if a user enters the URL of

http://www.yourcompany.com/widgets/coolwidget9.html

they will be redirected to

http://www.yourcompany.com/allwidgets/coolwidget9.html
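Putting the pieces together, the whole .htaccess file for this example might look like the sketch below. It assumes the file lives in /srv/www/htdocs and that page names contain only letters and digits; the pattern uses a + after the character class so that names longer than one character, such as coolwidget9, can match.

```apache
# /srv/www/htdocs/.htaccess (sketch for the widgets example)
Options +FollowSymLinks
RewriteEngine On

# /widgets/<name>.html  ->  /allwidgets/<name>.html, as a permanent redirect.
# The + lets the capture group match names longer than one character.
RewriteRule ^widgets/([a-z0-9]+)\.html$ /allwidgets/$1.html [R=301,NC,L]
```

With this in place, a request for widgets/coolwidget9.html matches. A name such as cool-widget.html would not, because the hyphen is not in the character class; if your file names use hyphens or underscores, a class like [a-z0-9_-] is one way to allow them.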

That is your basic rewrite redirection rule. Let’s look at regular expressions and PHP.

Regular expressions are a more complicated part of mod_rewrite that tend to cause hair loss and permanent facial creases, but they show the power of mod_rewrite.
Using regular expressions, you can have your rules match a whole set of URLs at a time and mass-redirect them to their actual pages.

Take a look at this rule:
RewriteRule ^widgets/([0-9][0-9])/$ /widgetinfo.php?WidID=$1

This will match any URL path that starts with widgets/, followed by any two digits, followed by a closing forward slash.

This rule will match a URL like widgets/34/ and rewrite it to the PHP page.

([0-9][0-9])
Match any two-digit number and store it in a variable.

Because this rule has no R flag, the rewrite happens inside the server and no redirect is sent to the browser. The user’s address bar still shows

http://www.yourcompany.com/widgets/34/

while Apache actually serves

http://www.yourcompany.com/widgetinfo.php?WidID=34

This looks much better to the user and it hides what’s really happening in the back end.
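To make the difference between an internal rewrite and a browser redirect concrete, here is a small sketch built on the same rule. The L flag and the commented-out variant are additions for illustration, and the variant assumes widget IDs can have any number of digits, which the article does not state.

```apache
RewriteEngine On

# Internal rewrite: the browser keeps showing /widgets/34/
# while widgetinfo.php?WidID=34 produces the page.
RewriteRule ^widgets/([0-9][0-9])/$ /widgetinfo.php?WidID=$1 [L]

# Variant (assumption: IDs may have one or more digits, not exactly two).
# RewriteRule ^widgets/([0-9]+)/$ /widgetinfo.php?WidID=$1 [L]
```

Adding R to the flags would turn this into a redirect, and the PHP URL with its query string would then appear in the address bar, which defeats the purpose of the friendly URL.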

Now that you’ve had a taste of what mod_rewrite can do, let’s look at an example of an infinite loop, which is BAD.

Remember that when a request is redirected, it is then sent back to the configuration file as a new request.

Look at this rule for simple redirection:

RewriteRule ^([a-z]+)\.html$ /index.html [R,NC,L]

Say you want any request for an .html page to be redirected to http://www.yourcompany.com/index.html

So, you type in a request for http://www.yourcompany.com/

The rule processes your request and redirects you to

http://www.yourcompany.com/index.html

Now this URL is considered a new request, so the rule is applied again and redirects you to

http://www.yourcompany.com/index.html

And this URL is considered a new request …and so on and so on.

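It doesn’t work because it’s stuck in an infinite loop. Actually, because this rule uses the R flag, the browser would eventually give up and report too many redirects. A looping internal rewrite (one without the R flag) would instead end in a 500 Error once Apache’s internal redirect limit is exceeded, but you get the point.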
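One common way out of a loop like this is to put a RewriteCond in front of the rule so that it no longer matches its own target. The sketch below is one possible fix, not the only one. It assumes the rule lives in the .htaccess file in the DocumentRoot, so that the page being excluded is reachable as /index.html, and it writes the redirect target with a forward slash.

```apache
RewriteEngine On

# Skip the rule when the request is already for /index.html,
# so the redirect target never matches again.
RewriteCond %{REQUEST_URI} !^/index\.html$ [NC]
RewriteRule ^([a-z]+)\.html$ /index.html [R,NC,L]
```

A RewriteCond applies only to the RewriteRule that immediately follows it, so other rules in the file are not affected by this condition.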

Conclusion

You can see the power mod_rewrite provides you as an Apache administrator. There are endless scenarios for this module. Now go and have fun, and don’t be afraid to ask for help with this one. It’s not one you can learn in a day. I’m still learning what it can do.

Categories: SUSE Linux Enterprise Server, Technical Solutions

Disclaimer: As with everything else at SUSE Conversations, this content is definitely not supported by SUSE (so don't even think of calling Support if you try something and it blows up).  It was contributed by a community member and is published "as is." It seems to have worked for at least one person, and might work for you. But please be sure to test, test, test before you do anything drastic with it.
