Using htaccess to send bots and users to different locations

I'm trying to set up a conditional .htaccess file that sends Google and Facebook bots to a server-side rendered version of my site. For regular users, all requests should be rewritten to index.html, because I have a JavaScript-based router that reads the URL and renders the matching view.

Here's what I have:

<IfModule mod_rewrite.c>
  RewriteEngine on

  RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
  RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
  RewriteRule ^ - [L]
  RewriteRule ^ /index.html [L]

  RewriteCond %{HTTP_USER_AGENT} facebookexternalhit|Facebot|Googlebot [NC,OR]
  RewriteRule .* /sharehandler/index.php [L]
</IfModule>

Currently everything is redirected to index.html, including Googlebot and Facebot.

If I move the bot lines to the top:

RewriteCond %{HTTP_USER_AGENT} facebookexternalhit|Facebot|Googlebot [NC,OR]
RewriteRule .* /sharehandler/index.php [L]

Then everything is redirected to /sharehandler/index.php, including all the regular users. It seems the RewriteConds aren't evaluated and the server simply triggers the first RewriteRule it sees, no matter what.

Answers


Your non-bot section does not exclude hits from bots, so if it is first, it will match everything. Your bot section includes some errors which I suspect make it match everything, so putting it first as-is will also match everything.

If you put the (working) bot section first, only visitors not matching those conditions will ever reach the next section.

So, first for the bot section:

  • You only have one RewriteCond, so [OR] is unnecessary. It is possibly also why the rule matches everything when placed at the top: [OR] chains a condition to the next one, and with no next condition the rule seems to fire whether or not the user agent matches.

  • The RewriteCond docs include an example of how to redirect based on user agent. They wrap the alternation in parentheses and quote the pattern. Neither is strictly required for a plain a|b|c pattern, but it makes the grouping explicit and easier to read.
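
To illustrate where [OR] does belong, here is a minimal sketch that splits the same user-agent test into one RewriteCond per bot. It is only an alternative spelling of the single alternation pattern, not something the original poster used. The flag goes on every condition except the last, because it joins a condition to the one that follows it.

```apache
# Sketch: one condition per bot, joined with [OR].
# [OR] links a RewriteCond to the NEXT RewriteCond, so the last one must not carry it.
RewriteCond %{HTTP_USER_AGENT} facebookexternalhit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Facebot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule .* /sharehandler/index.php [L]
```

Without any [OR] flags, consecutive conditions are combined with AND, which is what the non-bot section relies on.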

For the non-bot section:

  • Don't use [OR] here. You want the rule to match only when the request is neither an existing file nor an existing directory, so negate both tests (!-f and !-d) and let them combine with the default AND.

  • You have two RewriteRules in this section; one is enough.

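  • The second RewriteRule in this section (^ /index.html) has no RewriteCond of its own. Conditions apply only to the rule immediately after them, and ^ matches any request, so this rule rewrites everything that reaches it, bots included.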

Here's an updated version with the above fixes applied:

<IfModule mod_rewrite.c>
  RewriteEngine on

  RewriteCond "%{HTTP_USER_AGENT}" "(facebookexternalhit|Facebot|Googlebot)" [NC]
  RewriteRule .* /sharehandler/index.php [L]

  RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
  RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
  RewriteRule .* /index.html [L]
</IfModule>

Thank you for this! It helped me along far enough that I could finish it myself.

I had to add RewriteBase / to get it to redirect normal users properly, and I also had to use %{REQUEST_FILENAME} instead of %{DOCUMENT_ROOT}%{REQUEST_URI}, so the final result looks like this:

<IfModule mod_rewrite.c>
  RewriteEngine on

  RewriteCond "%{HTTP_USER_AGENT}" "(facebookexternalhit|Facebot|Googlebot)" [NC]
  RewriteRule .* /sharehandler/index.php [L]

  RewriteBase /
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule .* /index.html [L]
</IfModule>
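
One thing worth keeping in mind with the result above: rules in .htaccess are evaluated again after an internal rewrite, so a bot's request to /sharehandler/index.php passes through the bot rule a second time. The following is a hedged sketch, not part of the poster's final file, that makes the bot rule skip requests already under /sharehandler/. It assumes the share handler lives at that path in the document root, as in the rules above.

```apache
<IfModule mod_rewrite.c>
  RewriteEngine on
  RewriteBase /

  # Bots: rewrite to the share handler, unless the request is already
  # under /sharehandler/ (assumed location, taken from the rules above).
  RewriteCond "%{HTTP_USER_AGENT}" "(facebookexternalhit|Facebot|Googlebot)" [NC]
  RewriteCond %{REQUEST_URI} !^/sharehandler/
  RewriteRule .* /sharehandler/index.php [L]

  # Everyone else: serve existing files and directories as-is,
  # and send all other paths to the JavaScript router.
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule .* /index.html [L]
</IfModule>
```

Whether the extra condition is needed depends on the setup. It mainly documents the intent and avoids relying on the server noticing that the rewritten URL has not changed.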
