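For testing I need to re-fetch a website. Unfortunately, when using Perl LWP, the "Connection" header appears before the "Host" header. As a result, the request is filtered by the web application firewall. I just need to remove the Connection line or move it further down in the headers. When I make the request with my script: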
use warnings;
use IO::Socket;
use LWP::UserAgent;
use LWP::Protocol::http;
use HTTP::Request;
my $ua = LWP::UserAgent->new();
push(@LWP::Protocol::http::EXTRA_SOCK_OPTS, SendTE => 0, PeerHTTPVersion => "1.1");
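# 'User-Agent' must be quoted: => only auto-quotes a plain identifier, so a bare User-Agent parses as User - Agent
$ua->default_header(Cookie => 'XXX', 'User-Agent' => 'whateva');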
my $request = $ua->get('https://www.test.com/test.html?...');
....
the headers look like this:
GET /test.html?... HTTP/1.1
Connection: close
Host: www.test.com
User-Agent: whateva
Cookie: XXXX
But to work, it should look like this (Connection after Host):
GET /test.html?... HTTP/1.1
Host: www.test.com
Connection: close
User-Agent: whateva
Cookie: XXXX
How do I get rid of the Connection line in LWP? I just need to reorder it; it does not need to be removed completely. I'm happy to add it back later as
# $userAgent->default_header ("Connection" => "keep-alive");..
Thanks a lot in advance!
Answer 0 (score: 3)
To work around the bug* in your firewall, change
return _bytes(join($CRLF, "$method $uri HTTP/$ver", @h2, @h, "", $content));
in Net/HTTP.pm (in recent Net-HTTP releases this line is in format_request, in Net/HTTP/Methods.pm)
to
my @h3 = ( @h2, @h );
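# Match against the header text at each index; a bare /^Host:/ would be tested against the index numbers and never match
if (my ($idx) = grep { $h3[$_] =~ /^Host:/i } 0..$#h3) {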
unshift(@h3, splice(@h3, $idx, 1));
}
return _bytes(join($CRLF, "$method $uri HTTP/$ver", @h3, "", $content));
* - According to the HTTP/1.1 specification, RFC 2616, "The order in which header fields with differing field names are received is not significant."
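
The reordering step can be tried on its own before touching the installed module. The following is a minimal sketch, assuming the headers are already formatted as "Name: value" strings, which is how they are joined in the return statement quoted above. The helper name host_first is hypothetical and not part of Net::HTTP; the sample header lines are taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::HTTP): move the first "Host:" line
# to the front of a list of already formatted "Name: value" header lines.
sub host_first {
    my @lines = @_;
    # Test the header text stored at each index, not the index itself.
    my ($idx) = grep { $lines[$_] =~ /^Host:/i } 0 .. $#lines;
    # Leave the list untouched when no Host line is present.
    unshift @lines, splice(@lines, $idx, 1) if defined $idx;
    return @lines;
}

# Header lines in the order LWP produced them in the question.
my @reordered = host_first(
    'Connection: close',
    'Host: www.test.com',
    'User-Agent: whateva',
    'Cookie: XXXX',
);
print "$_\n" for @reordered;
```

Run as a standalone script, this prints the Host line first, followed by Connection, User-Agent and Cookie, which is the order the firewall expects. Only the position of Host changes; the relative order of the other headers is preserved.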