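HTTPS Proxy and LWP::UserAgent

Date: 2012-08-24 20:33:28

Tags: perl https proxy lwp lwp-useragent

I have read a lot of threads on many sites, but I still cannot get this to work.

I have a client machine (OS X) running Perl 5.12.4 with OpenSSL 0.9.8r, LWP 6.0.4, and updated Crypt::SSLeay, Net::SSL, etc. I am trying to connect to an HTTPS site (https://github.com in the example) through a WinGate proxy that I have running on a Windows VM. Note that my actual application talks to an SSL web service that I have no control over.

From Firefox, pointed at the proxy, everything is copacetic. The page loads successfully and I see the connections in the proxy software's activity monitor. I would be very happy if I could get this working in Perl. I started with the code from this Stack Overflow question: How do I force LWP to use Crypt::SSLeay for HTTPS requests? and added some debugging and additional output. Here is where I stand now: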

#!/usr/bin/perl

use strict;
use warnings;
use Net::SSL (); # From Crypt-SSLeay

BEGIN {
  $Net::HTTPS::HTTPS_SSL_SOCKET_CLASS = "Net::SSL"; # Force use of Net::SSL
  $ENV{HTTPS_PROXY} = 'https://192.168.1.11:80';
#  $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
  $ENV{HTTPS_DEBUG} = 1;  #Add debug output
}

use LWP::UserAgent;
my $ua = LWP::UserAgent->new();
my $req = HTTP::Request->new('GET','https://github.com/');
my $response = $ua->request($req);

print "--\n";
print "$_\n" for grep { $_ =~ /SSL/ } keys %INC;
print "--\n";

if ($response->is_success) {
     print $response->decoded_content;  # or whatever
     exit(0);
}
else {
 print "\nFail:\n";
     print $response->status_line ."\n";
     exit(1);
}

Here is the output of this code:

--
Crypt/SSLeay.pm
Crypt/SSLeay/X509.pm
Net/SSL.pm
--

Fail:
500 Can't connect to github.com:443 (Crypt-SSLeay can't verify hostnames)

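If I then uncomment $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;, I do see a connection to github.com:443 on the proxy, and then nothing. (Note that it works fine in a web browser through the proxy.) After hanging for a long time, I get the following output from the script: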

SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:before/connect initialization
SSL_connect:SSLv3 write client hello A
SSL_connect:failed in SSLv3 read server hello A
SSL_connect:before/connect initialization
SSL_connect:SSLv2 write client hello A
SSL_connect:failed in SSLv2 read server hello A
--
Crypt/SSLeay.pm
Crypt/SSLeay/X509.pm
Net/SSL.pm
Crypt/SSLeay/CTX.pm
Crypt/SSLeay/MainContext.pm
--

Fail:
500 SSL negotiation failed: 

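If anyone can offer some direction here, I would greatly appreciate it!

8 answers:

Answer 0: (score: 5)

I have just uploaded the LWP::Protocol::connect module to CPAN. This module adds the missing HTTP/CONNECT method support to LWP.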

  use LWP::UserAgent;

  $ua = LWP::UserAgent->new(); 
  $ua->proxy('https', 'connect://proxyhost.domain:3128/');

  $ua->get('https://www.somesslsite.com');

With this module you can use the regular IO::Socket::SSL implementation with LWP >= 6.00.
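
As a rough sketch, the snippet above might be wrapped with basic error reporting as follows. It assumes LWP::Protocol::connect is installed; the proxy host is the placeholder from the answer, and the target URL is read from the command line, so nothing is contacted unless you pass one.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Placeholder proxy address from the answer above; replace with your own.
my $proxy_url = 'connect://proxyhost.domain:3128/';

my $ua = LWP::UserAgent->new(timeout => 30);
$ua->proxy('https', $proxy_url);

# Called without a URL argument, proxy() returns the current setting.
print "https proxy: ", $ua->proxy('https'), "\n";

# Only touch the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $response = $ua->get($url);
    if ($response->is_success) {
        print $response->code, "\n";
    }
    else {
        # status_line carries the transport error text as well as HTTP errors.
        die "Request failed: ", $response->status_line, "\n";
    }
}
```

Printing status_line rather than only the numeric code keeps messages such as "500 SSL negotiation failed" visible when the proxy step is what breaks.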

Answer 1: (score: 3)

Why do you want to "force use of Net::SSL"? Try

#!/usr/bin/perl    
use strict;
use warnings;
use LWP::UserAgent;

BEGIN {
  $ENV{HTTPS_PROXY} = 'https://192.168.1.11:80';
#  $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
  $ENV{HTTPS_DEBUG} = 1;  #Add debug output
}

my $ua = LWP::UserAgent->new();
my $req = HTTP::Request->new('GET','https://github.com/');
my $response = $ua->request($req);
print $response->code ."\n";

An output of 200 should mean that there were no errors.

The following sample code works flawlessly:

#!/usr/bin/perl
use warnings;
use LWP::UserAgent;

BEGIN {
  $ENV{HTTPS_PROXY} = 'https://176.9.209.113:8080'; #Valid HTTPS proxy taken from http://hidemyass.com/proxy-list/
  $ENV{HTTPS_DEBUG} = 1;
}

my $ua = new LWP::UserAgent;
my $req = new HTTP::Request('GET', 'https://www.nodeworks.com');
my $res = $ua->request($req);
print $res->code, "\n";

Output -

200
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:SSLv3 read server hello A
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server key exchange A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:SSLv3 read server hello A
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server key exchange A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A

Tool completed successfully

With https://github.com/ the output is -

200
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:SSLv3 read server hello A
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A

Tool completed successfully

So, having said all that, your version of the code (below) should work fine -

use warnings;
use LWP::UserAgent;

BEGIN {
  $ENV{HTTPS_PROXY} = 'https://176.9.209.113:8080';
  $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0; #works even with this
  $ENV{HTTPS_DEBUG} = 1;  #Add debug output
}

my $ua = new LWP::UserAgent;
my $req = new HTTP::Request('GET', 'https://github.com/');
my $res = $ua->request($req);
print $res->code, "\n";

if ($res->is_success) {
     print $res->decoded_content;  # or whatever
     exit(0);
}
else {
 print "\nFail:\n";
     print $res->status_line ."\n";
     exit(1);
}
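
The PERL_LWP_SSL_VERIFY_HOSTNAME variable used above has a per-object counterpart in LWP 6, the ssl_opts constructor argument. The following is a minimal sketch, meant for diagnosis only because it turns off hostname verification for that one agent; it makes no network request.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Diagnostic use only: hostname checks are disabled for this agent,
# not for every LWP::UserAgent in the process.
my $ua = LWP::UserAgent->new(
    ssl_opts => { verify_hostname => 0 },
);

# ssl_opts($key) reads back the option that was set.
print "verify_hostname: ", $ua->ssl_opts('verify_hostname'), "\n";
```

Setting the option on the object keeps the relaxed check local to one agent, which is easier to remove again than an environment variable set in a BEGIN block.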

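Answer 2: (score: 1)

I ran into almost the same problem. Here are the things that fixed it for me:

Answer 3: (score: 1)

Instead of using Net::SSL, which does not provide much host verification (and has no SNI), you can use Net::SSLGlue::LWP. It monkey-patches LWP so that https_proxy can be used with the default SSL backend, IO::Socket::SSL: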

use Net::SSLGlue::LWP; # do this first
use LWP::Simple;
... continue with normal LWP stuff..

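Answer 4: (score: 0)

I know this may be a dead question, but in case anyone else hits it, I have another angle. I cannot promise an answer, but we have faced a long-standing problem in this area at work, although with Squid proxies, and perhaps specific to the use of X509 client certificates.

Use of the Net::SSL override is part of the solution, but I fear WinGate could be the problem (and not something I can help with), although in our case we contact the proxy over http (I am not sure how LWP handles proxy + https).

For the record, here is an example of the precise form of code we use: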

use Net::SSL;
$ENV{PERL_NET_HTTPS_SSL_SOCKET_CLASS}="Net::SSL";
use LWP::UserAgent;
use LWP::Protocol::https;
my $ua = LWP::UserAgent->new;
$ENV{HTTPS_PROXY} = 'http://cache.local.employer.co.uk:80';
$ua->get("https://example.com/");

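This is with Perl 5.8.8 and a relatively recent CPAN install (hence the separate L:P:https), so we have a fresh Net::HTTP.

I was going to mention that certain versions of Net::HTTP are broken, but I just realised that was my CPAN bug in Martin's reply :)

Sorry if this doesn't add anything.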

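Answer 5: (score: 0)

I sent a pull request to the libwww-perl repository to fix (or maybe work around...) the issue.

The comment on this PR shows a simple program that connects to github.com over https through a proxy. With this patch, there is no need to mess with %ENV in your program.

Another advantage is that you can reuse the usual https_proxy setting.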
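
To illustrate what reusing the usual https_proxy setting could look like, here is a hedged sketch built on LWP::UserAgent's env_proxy method. It assumes the installed libwww-perl already contains the fix from the pull request. The proxy address is the one from the question and is set inside the script only to keep the example self-contained; normally the shell would export it. No request is sent.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Start from a known state: drop the upper-case variant so the two
# spellings cannot disagree, then set the conventional lower-case one.
delete $ENV{HTTPS_PROXY};
$ENV{https_proxy} = 'http://192.168.1.11:80';

my $ua = LWP::UserAgent->new;
$ua->env_proxy;    # load *_proxy settings from the environment

# env_proxy may skip a scheme that has no protocol handler installed,
# so the value can be undefined if LWP::Protocol::https is missing.
my $proxy = $ua->proxy('https');
print "https proxy: ", (defined $proxy ? $proxy : '(not set)'), "\n";
```

After this, a plain $ua->get('https://...') would be routed through the proxy without the script assigning proxy settings per request.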

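Answer 6: (score: 0)

There was a bug in Perl 5.8 and some other modules where the HTTP_PROXY environment variable did not set up the proxy connection properly.

Your case matches the reported bug described at https://bugzilla.redhat.com/show_bug.cgi?id=1094440

A better approach is to skip the environment variable and configure the proxy through LWP::UserAgent: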

  use LWP::UserAgent;
  $ua = LWP::UserAgent->new();
  $ua->proxy('https', 'connect://proxyhost.domain:3128/');

Answer 7: (score: -1)



#!/usr/bin/env perl 
# 
# mimvp.com
# 2017-03-28

use CGI;
use strict;
use LWP::UserAgent;


our %proxy_https = ("https", "connect://173.233.55.118:443");
our $mimvp_url = "https://proxy.mimvp.com/exist.php";

## https
## 1. download LWP-Protocol-connect (wget http://search.cpan.org/CPAN/authors/id/B/BE/BENNING/LWP-Protocol-connect-6.09.tar.gz)
## 2. tar zxvf LWP-Protocol-connect-6.09.tar.gz 
##    cd LWP-Protocol-connect-6.09
##    perl Makefile.PL
##    make
##    sudo make install
sub test_connect {
	my ($url, %proxy) = @_;
	
	print "proxy  : $proxy{'http'}\n";
	print "https  : $proxy{'https'}\n";
	print "socks4 : $proxy{'socks4'}\n";
	print "socks5 : $proxy{'socks5'}\n";
	print "url : $url\n";
	
	my $browser = LWP::UserAgent->new();
	$browser->env_proxy();
	
	# Apply the proxy settings (scheme => proxy URL)
	$browser->proxy(%proxy);
	$browser->timeout(30);
	$browser->agent('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36');
	
# 	my $req = new HTTP::Request('GET', $url);
# 	my $response = $browser->request($req);
	my $response = $browser->get($url);  				# URL to fetch
	my $is_success = $response->is_success();			# 1
	my $content_type = $response->content_type();		# text/html
	my $content = $response->content();					# page body
	my $content_length = length($content);				# length of the page body
	
	print "$is_success\n";
	print "$content_type\n";
	print "$content_length\n";
	print "$content\n";
}

test_connect($mimvp_url, %proxy_https);		# https

## perl mimvp-proxy-perl.pl