HAProxy 1.4: removing multiple slashes

Date: 2015-06-09 18:55:36

Tags: regex haproxy

I want HAProxy to collapse repeated slashes in a request such as:

$ curl -I "http://www.host.com//some_path/sub_path/slug/"

and have it send a 301 response, for SEO purposes:

HTTP/1.1 301 Moved Permanently
Cache-Control: no-cache
Content-length: 0
Location: /some_path/sub_path/slug/

First approach: create an ACL -> replace the slashes -> redirect

acl multiple-bars path_reg /{2,}
reqrep ^([^\ :]*)\ /{2,}(.*)     \1\ /\2 if multiple-bars
...
redirect prefix / code 301 if multiple-bars

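reqrep rewrites the URL, so the multiple-bars ACL is no longer true by the time the redirect rule is evaluated. As a result, I get no 301 redirect: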

$ curl -I "http://www.host.com//some_path/sub_path/slug/"
HTTP/1.1 200 OK
domain=www.host.com
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Content-Type: text/html; charset=UTF-8
Date: Tue, 09 Jun 2015 18:16:36 GMT

Second approach: add a custom header and redirect when that header is found

acl multiple-bars path_reg /{2,}
reqadd X-HadMultipleBars if multiple-bars
reqrep ^([^\ :]*)\ /{2,}(.*)     \1\ /\2 if multiple-bars
...
redirect prefix / code 301 if { hdr_cnt(X-HadMultipleBars) 1 }

But HAProxy processes reqrep rules before reqadd rules, whatever their order in the configuration, as the startup warnings show:

[ec2-user@haproxy-stage-m3m-a haproxy]$ sudo service haproxy restart
Stopping haproxy:                                          [  OK  ]
Starting haproxy: [WARNING] 159/181957 (11430) : parsing [/etc/haproxy/haproxy.cfg:87] : a 'reqrep' rule placed after a 'reqadd' rule will still be processed before.
[WARNING] 159/181957 (11430) : parsing [/etc/haproxy/haproxy.cfg:103] : a 'reqidel' rule placed after a 'reqadd' rule will still be processed before.
[WARNING] 159/181957 (11430) : parsing [/etc/haproxy/haproxy.cfg:104] : a 'reqrep' rule placed after a 'reqadd' rule will still be processed before.
                                                           [  OK  ]

So it does not redirect:

$ curl -I "http://www.host.com//some_path/sub_path/slug/"
HTTP/1.1 200 OK
domain=www.host.com
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Content-Type: text/html; charset=UTF-8
Date: Tue, 09 Jun 2015 18:16:36 GMT

Third approach:

acl multiple-bars path_reg /{2,}
reqrep ^([^\ :]*)\ /{2,}(.*)    \1\ /\2\nX-HadMultipleBars: if multiple-bars
...
redirect prefix / code 301 if { hdr_cnt(X-HadMultipleBars) 1 }

I don't like this hack; it looks dirty, but it does work: hdr_cnt evaluates to 1.
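
To make the hack easier to follow, here is a rough Python model of what the replacement string does to the request line. It is only a sketch: the function name rewrite_with_marker is made up for illustration, and Python's re module stands in for HAProxy's regex engine (the escaped space "\ " becomes a plain space).

```python
import re

# Python rendering of the reqrep pattern from the third approach.
REQUEST_LINE = re.compile(r"^([^ :]*) /{2,}(.*)")

def rewrite_with_marker(request_line: str) -> str:
    """Collapse the leading slashes and append a marker header line.

    Mirrors the replacement "\\1 /\\2\\nX-HadMultipleBars:" used in the hack.
    """
    return REQUEST_LINE.sub(
        lambda m: f"{m.group(1)} /{m.group(2)}\nX-HadMultipleBars:",
        request_line,
        count=1,
    )

rewritten = rewrite_with_marker("GET //some_path/sub_path/slug/ HTTP/1.1")
for line in rewritten.split("\n"):
    print(line)
```

The newline in the replacement is what turns a single request line into a request line followed by an extra header line, which is why hdr_cnt(X-HadMultipleBars) then sees one occurrence.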

It works for multiple leading slashes:

$ curl -I "http://www.host.com/////some_path/sub_path/slug/"
HTTP/1.1 301 Moved Permanently
Cache-Control: no-cache
Content-length: 0
Location: /some_path/sub_path/slug/

But it does not work for multiple slashes inside the sub-paths:

$ curl -I "http://www.host.com//////some_path///sub_path////slug/////"
HTTP/1.1 301 Moved Permanently
Cache-Control: no-cache
Content-length: 0
Location: /some_path///sub_path////slug/////
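
The reason only the leading slashes are collapsed can be reproduced outside HAProxy. The sketch below uses Python's re module as a stand-in for HAProxy's regex engine (an assumption; collapse_leading is a hypothetical helper name). The pattern is anchored at the start of the request line, so /{2,} can only match the run of slashes right after the method, and everything else falls into the (.*) group untouched.

```python
import re

# Same pattern as the reqrep rule, with "\ " written as a plain space.
REQUEST_LINE = re.compile(r"^([^ :]*) /{2,}(.*)")

def collapse_leading(request_line: str) -> str:
    """Apply the reqrep-style substitution once to a request line."""
    return REQUEST_LINE.sub(r"\1 /\2", request_line, count=1)

before = "GET //////some_path///sub_path////slug///// HTTP/1.1"
after = collapse_leading(before)
print(after)
# Only the first run of slashes is reduced; later runs are kept as-is
# because they were captured by (.*) and copied back verbatim.
```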

I have been struggling to make this regex behave recursively, but my attempts only remove some seemingly "random" slashes on each pass:

http://www.host.com//////some_path///sub_path////slug/////
301 -> Location: /some_path///sub_path////slug/////
301 -> Location: /some_path//sub_path//slug///
301 -> Location: /some_path/sub_path/slug//
301 -> Location: /some_path/sub_path/slug/
200!

I tried using this regex, which works on http://www.regexr.com/, in place of the capture group ([^\ :]*):

#Parses Valid URL characters:
([!#$&-.0-;=?-[\]_a-z~]|%[0-9a-fA-F]{2})+

But it doesn't work in HAProxy 1.4 :( Can anyone give me some help here?

Thanks!

1 answer:

Answer 0 (score: 1)

On HAProxy 1.6 or later:

acl has_multiple_slash  path_reg /{2,}

http-request redirect code 301 location http://%[hdr(host)]%[url,regsub(/+,/,g)] if has_multiple_slash
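
For comparison with the anchored reqrep pattern in the question, the effect of regsub(/+,/,g) can be sketched in Python. This is an illustration only: re.sub stands in for the converter, and redirect_location and its host argument are hypothetical names. The g flag corresponds to replacing every run of slashes, not just the first.

```python
import re

def collapse_slashes(url: str) -> str:
    """Replace every run of one or more slashes with a single slash."""
    return re.sub(r"/+", "/", url)

def redirect_location(host: str, url: str) -> str:
    """Build the Location value the way the redirect rule assembles it."""
    return "http://" + host + collapse_slashes(url)

print(redirect_location("www.host.com", "//////some_path///sub_path////slug/////"))
```

Because the substitution runs over the whole url sample, any repeated slashes in a query string would be collapsed as well, which may or may not be what you want.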