Thank you for all this info!

------------------------
Keith Smith

--- On Fri, 10/22/10, Lisa Kachold wrote:

From: Lisa Kachold
Subject: Re: performance when using a .htaccess
To: "Main PLUG discussion list"
Date: Friday, October 22, 2010, 8:18 PM

On Fri, Oct 22, 2010 at 4:00 PM, Lisa Kachold wrote:

On Fri, Oct 22, 2010 at 2:18 PM, keith smith wrote:

Hi,

I have a question about performance when using a .htaccess file. I have read that having multiple .htaccess files, meaning a .htaccess file in each directory, can slow Apache.

We have moved a ton of content, upwards of 900 pages. About 600 of those have been moved from our blog, which was located in the directory /blog. It was suggested that we break the .htaccess into files that reflect the content moved. For example, put a .htaccess file in the /blog directory that covers all the content from the blog, instead of one big .htaccess file in the doc root directory that would contain 900 redirects.

Well, that's better than FollowSymlinks?

The reason that managing multiple .htaccess files can be slow and difficult is that Apache2 has to look for a .htaccess file in every directory along the path on each request, and the settings are inherited down the directory hierarchy.

A rewrite might actually be able to do exactly what you need. Have you considered that? Rewrite overhead is not huge, especially if you are caching for this /blog URL.

You simply enable mod_rewrite in Apache2 (the procedure varies depending on your distro/version). A mod_rewrite solution is ONE line entry in your configuration file for that VirtualHost, for instance:

1) Here's a simple rewrite (provided your directory BLOG containing all of the 600 files can be trivially redirected to something like "newblog"):

    RewriteEngine on
    RewriteRule ^/blog/(.*)$ /newblog/$1 [R=301,L]

This redirects every URL under /blog/ to the same path under /newblog/ with a permanent (R=301) redirect.
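For comparison, here is a minimal sketch of the same redirect written for a per-directory .htaccess file in /blog, which is the setup the original question asked about. It assumes mod_rewrite is enabled, that AllowOverride permits FileInfo for that directory, and that FollowSymLinks (or SymLinksIfOwnerMatch) is in effect there; "newblog" is still only a placeholder target. In per-directory context mod_rewrite strips the directory prefix before matching, so the pattern does not start with /blog/:

```apache
# /blog/.htaccess (assumed location; "newblog" is a placeholder target)
RewriteEngine on

# $1 is the path below /blog/; redirect it to the same path under /newblog/
RewriteRule ^(.*)$ /newblog/$1 [R=301,L]
```

The VirtualHost form is parsed once when the server starts, while a .htaccess file is read again on every request that touches its directory, so the server-config version is normally the cheaper of the two.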
2) Use a RewriteMap, which is loaded ONCE by Apache:

http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#rewritemap

The RewriteMap directive defines a Rewriting Map which can be used inside rule substitution strings by the mapping-functions to insert/substitute fields through a key lookup. The source of this lookup can be of various types.

The MapName is the name of the map and will be used to specify a mapping-function for the substitution strings of a rewriting rule via one of the following constructs:

    ${ MapName : LookupKey }
    ${ MapName : LookupKey | DefaultValue }

When such a construct occurs, the map MapName is consulted and the key LookupKey is looked up. If the key is found, the map-function construct is substituted by SubstValue. If the key is not found, then it is substituted by DefaultValue, or by the empty string if no DefaultValue was specified.

For example, you might define a RewriteMap as:

    RewriteMap examplemap txt:/path/to/file/map.txt

You would then be able to use this map in a RewriteRule as follows:

    RewriteRule ^/ex/(.*) ${examplemap:$1}

3) Advanced Rewrites

Filesystem Reorganization

Description: This really is a hardcore example: a killer application which heavily uses per-directory RewriteRules to get a smooth look and feel on the Web while its data structure is never touched or adjusted.
    drwxrwxr-x   2 netsw  users    512 Aug  3 18:39 Audio/
    drwxrwxr-x   2 netsw  users    512 Jul  9 14:37 Benchmark/
    drwxrwxr-x  12 netsw  users    512 Jul  9 00:34 Crypto/
    drwxrwxr-x   5 netsw  users    512 Jul  9 00:41 Database/
    drwxrwxr-x   4 netsw  users    512 Jul 30 19:25 Dicts/
    drwxrwxr-x  10 netsw  users    512 Jul  9 01:54 Graphic/
    drwxrwxr-x   5 netsw  users    512 Jul  9 01:58 Hackers/
    drwxrwxr-x   8 netsw  users    512 Jul  9 03:19 InfoSys/
    drwxrwxr-x   3 netsw  users    512 Jul  9 03:21 Math/
    drwxrwxr-x   3 netsw  users    512 Jul  9 03:24 Misc/
    drwxrwxr-x   9 netsw  users    512 Aug  1 16:33 Network/
    drwxrwxr-x   2 netsw  users    512 Jul  9 05:53 Office/
    drwxrwxr-x   7 netsw  users    512 Jul  9 09:24 SoftEng/
    drwxrwxr-x   7 netsw  users    512 Jul  9 12:17 System/
    drwxrwxr-x  12 netsw  users    512 Aug  3 20:15 Typesetting/
    drwxrwxr-x  10 netsw  users    512 Jul  9 14:08 X11/

Solution: The solution has two parts. The first is a set of CGI scripts which create all the pages at all directory levels on-the-fly. I put them under /e/netsw/.www/ as follows:

    -rw-r--r--   1 netsw  users    1318 Aug  1 18:10 .wwwacl
    drwxr-xr-x  18 netsw  users     512 Aug  5 15:51 DATA/
    -rw-rw-rw-   1 netsw  users  372982 Aug  5 16:35 LOGFILE
    -rw-r--r--   1 netsw  users     659 Aug  4 09:27 TODO
    -rw-r--r--   1 netsw  users    5697 Aug  1 18:01 netsw-about.html
    -rwxr-xr-x   1 netsw  users     579 Aug  2 10:33 netsw-access.pl
    -rwxr-xr-x   1 netsw  users    1532 Aug  1 17:35 netsw-changes.cgi
    -rwxr-xr-x   1 netsw  users    2866 Aug  5 14:49 netsw-home.cgi
    drwxr-xr-x   2 netsw  users     512 Jul  8 23:47 netsw-img/
    -rwxr-xr-x   1 netsw  users   24050 Aug  5 15:49 netsw-lsdir.cgi
    -rwxr-xr-x   1 netsw  users    1589 Aug  3 18:43 netsw-search.cgi
    -rwxr-xr-x   1 netsw  users    1885 Aug  1 17:41 netsw-tree.cgi
    -rw-r--r--   1 netsw  users     234 Jul 30 16:35 netsw-unlimit.lst

The DATA/ subdirectory holds the above directory structure, i.e. the real net.sw stuff, and gets automatically updated via rdist from time to time. The second part of the problem remains: how to link these two structures together into one smooth-looking URL tree?
We want to hide the DATA/ directory from the user while running the appropriate CGI scripts for the various URLs. Here is the solution: first I put the following into the per-directory configuration file in the DocumentRoot of the server to rewrite the announced URL /net.sw/ to the internal path /e/netsw:

    RewriteRule  ^net.sw$       net.sw/        [R]
    RewriteRule  ^net.sw/(.*)$  e/netsw/$1

The first rule is for requests which miss the trailing slash! The second rule does the real thing. And then comes the killer configuration which stays in the per-directory config file /e/netsw/.www/.wwwacl:

    Options       ExecCGI FollowSymLinks Includes MultiViews

    RewriteEngine on

    #  we are reached via /net.sw/ prefix
    RewriteBase   /net.sw/

    #  first we rewrite the root dir to
    #  the handling cgi script
    RewriteRule   ^$                       netsw-home.cgi     [L]
    RewriteRule   ^index\.html$            netsw-home.cgi     [L]

    #  strip out the subdirs when
    #  the browser requests us from perdir pages
    RewriteRule   ^.+/(netsw-[^/]+/.+)$    $1                 [L]

    #  and now break the rewriting for local files
    RewriteRule   ^netsw-home\.cgi.*       -                  [L]
    RewriteRule   ^netsw-changes\.cgi.*    -                  [L]
    RewriteRule   ^netsw-search\.cgi.*     -                  [L]
    RewriteRule   ^netsw-tree\.cgi$        -                  [L]
    RewriteRule   ^netsw-about\.html$      -                  [L]
    RewriteRule   ^netsw-img/.*$           -                  [L]

    #  anything else is a subdir which gets handled
    #  by another cgi script
    RewriteRule   !^netsw-lsdir\.cgi.*     -                  [C]
    RewriteRule   (.*)                     netsw-lsdir.cgi/$1

Some hints for interpretation:

- Notice the L (last) flag and no substitution field ('-') in the fourth part
- Notice the ! (not) character and the C (chain) flag at the first rule in the last part
- Notice the catch-all pattern in the last rule

Reference: http://httpd.apache.org/docs/2.0/misc/rewriteguide.html (SEE also the excellent sections on blocking annoying robots, and other tricks).

4) I would consider reorganizing your blog files into some kind of structure, say a new alphabetical file layout, where wildcard rewrites will reduce your total number of rewrites.
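To make option 4 a little more concrete, here is a rough sketch for the VirtualHost config that pairs one wildcard rule with the RewriteMap from option 2 for pages that fit no pattern. Everything specific in it is an assumption for illustration only: the /blog/<year>/... layout, the /articles/ target, the map name "blogmoves", and the map file path.

```apache
RewriteEngine on

# Wildcard: a single rule covers every post that kept its year and file name
# (the /blog/<year>/ layout and the /articles/ target are hypothetical)
RewriteRule ^/blog/([0-9]+)/(.+)$ /articles/$1/$2 [R=301,L]

# One-off moves: key = old path below /blog/, value = new URL-path
# (RewriteMap is only allowed in server or VirtualHost config, not in .htaccess)
RewriteMap blogmoves txt:/etc/apache2/maps/blog-moves.txt

# Redirect only when the map really has an entry for the requested page
RewriteCond ${blogmoves:$1} !=""
RewriteRule ^/blog/(.+)$ ${blogmoves:$1} [R=301,L]
```

The map file itself would be plain text with one whitespace-separated pair per line, for example (made-up entries):

```
# old path below /blog/     new URL-path
about-us.html               /company/about.html
contact.html                /company/contact.html
```

With this split, the number of rules stays constant as pages are added; only the map file grows.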
With a large number of rewrites, especially where a permanent (R) redirect is used, I would ALWAYS USE HARD /etc/apache2 configuration files, pulled in with an Include statement. They are easier to back up, manage, grep through, and troubleshoot after a graceful restart has loaded new changes.

Thank you for your feedback.

------------------------
Keith Smith

---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss

--
Skype: 6022393392
ATT:   5037544452
GV:    6923073392
Phoenix Linux Security Team
PLUG.PHOENIX.AZ.US
http://www.it-clowns.com
"Great things are not done by impulse but a series of small things brought together." -Van Gogh