Based on the code from here: remove multiple trailing slashes mod_rewrite
I have the following htaccess:
Options +FollowSymLinks
DirectorySlash Off
RewriteEngine on
RewriteOptions inherit
RewriteBase /
#
# remove multiple slashes from url
#
RewriteCond %{HTTP_HOST} !=""
RewriteCond %{THE_REQUEST} ^[A-Z]+\s//+(.*)\sHTTP/[0-9.]+$ [OR]
RewriteCond %{THE_REQUEST} ^[A-Z]+\s(.*/)/+\sHTTP/[0-9.]+$
RewriteRule .* http://%{HTTP_HOST}/%1 [R=301,L]
#
# Remove multiple slashes anywhere in URL
#
RewriteCond %{THE_REQUEST} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]
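However, I found that G-Bot crawled this URL: http://www.example.com/aaa/bbb/////////bbb-ccc/bbb-ddd.htm
(aaa, bbb, ccc and ddd stand in for keywords in the URL and should not be taken literally; I am only showing the pattern of the URL.)
Testing the above URL against the live server, I found that the slash removal does not work.
Can anyone offer any hints or improvements to the existing code? Thanks.
EDIT 1
@Sylwester offered the following code: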
# if match set environment variable and start over
RewriteRule ^(.*?)//+(.*)$ $1/$2 [E=REDIR:1,N]
# if done at least one. redirect with 301
RewriteCond %{ENV:REDIR} 1
RewriteRule ^/(.*) /$1 [R=301,L]
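It does not work either. I still see ////// in the URL. I placed this set of rules at the top of my htaccess file, right after "RewriteBase /", so that no other rules could interfere, but... nothing. Any other suggestions?
Answer 0 (score: 3)
Per-directory rules and .htaccess are tricky, because Apache has actually already removed the redundant slashes for us by the time the RewriteRule pattern is applied, so a pattern such as //+ no longer matches. Instead we check %{REQUEST_URI}, since it still holds the original URI, and let the RewriteRule itself match anything: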
# NB: Only works for per directory and .htaccess
# Needs "AllowOverride All" in global config for .htaccess
RewriteEngine On
RewriteBase "/"
Options +FollowSymlinks
# Check if the REQUEST_URI has redundant slashes
# and redirect to self if it has (which apache has cleaned up already)
RewriteCond %{REQUEST_URI} //+
RewriteRule ^(.*) $1 [R=301,L]
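As a hedged variant of the rule above: on some setups %{REQUEST_URI} may itself already have the duplicate slashes merged (newer Apache 2.4 releases have a MergeSlashes directive that, as far as I know, is on by default), in which case the condition above would never fire. A minimal sketch, assuming the rules live in the .htaccess at the document root, is to test %{THE_REQUEST} instead, which holds the raw request line as the client sent it. The exact condition pattern below is my own assumption, not part of the original answer:

```apache
# Sketch only; assumes this .htaccess sits in the document root.
RewriteEngine On

# THE_REQUEST is the raw request line, e.g.
#   GET /aaa/bbb/////bbb-ccc/bbb-ddd.htm HTTP/1.1
# Look for "//" in the path part only: the path must start with "/"
# and the match must occur before any "?" or whitespace.
RewriteCond %{THE_REQUEST} \s(/[^?\s]*)?//

# In per-directory context the pattern sees the path with the
# duplicate slashes already collapsed, so redirecting to the
# captured path drops the extra slashes.
RewriteRule ^(.*)$ /$1 [R=301,L]
```

With this in place, a request for /aaa/bbb/////////bbb-ccc/bbb-ddd.htm should be answered with a 301 to /aaa/bbb/bbb-ccc/bbb-ddd.htm, and the query string, if any, is carried over by mod_rewrite. Because the condition only inspects the part before "?", a "//" inside the query string does not trigger a redirect.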
If you are able to add to the global configuration, I would prefer to use this in the virtual host instead:
RewriteEngine On
# if match set environment variable and start over
RewriteRule ^(.*?)//+(.*)$ $1/$2 [E=REDIR:1,N]
# if done at least one. redirect with 301
RewriteCond %{ENV:REDIR} 1
RewriteRule ^/(.*) /$1 [R=301,L]