Question

use strict;
use LWP::UserAgent;

my $UserAgent = LWP::UserAgent->new;

my $response = $UserAgent->get("https://scholar.google.co.in/scholar_lookup?author=N.+R.+Alpert&author=S.+A.+Mohiddin&author=D.+Tripodi&author=J.+Jacobson-Hatzell&author=K.+Vaughn-Whitley&author=C.+Brosseau+&publication_year=2005&title=Molecular+and+phenotypic+effects+of+heterozygous,+homozygous,+and+compound+heterozygote+myosin+heavy-chain+mutations&journal=Am.+J.+Physiol.+Heart+Circ.+Physiol.&volume=288&pages=H1097-H1102");

if ($response->is_success)
{
    # Print the captured title only when the pattern actually matched
    if ($response->content =~ /<title>(.*?) - Google Scholar<\/title>/)
    {
        print $1;
    }
}
else
{
    die $response->status_line;
}

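When I run this script, I get the following error.

403 Forbidden at D:\Getelement.pl line 52.

I pasted the URL into the browser's address bar and it redirected to the site, but the same URL does not work when the script runs.

Can you help me resolve this?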

Answer 1

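Google's Terms of Service disallow automated searches. They are detecting that you are sending this request from a script, because your headers are very different from your browser's standard headers, which you can analyze if you like.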
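One way to do the header analysis mentioned above is to look at what LWP::UserAgent would send, without contacting any server. This is a minimal sketch, assuming a reasonably recent libwww-perl; the example.com URL is a placeholder and is not from the question.

```perl
use strict;
use warnings;

use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;

# The agent string LWP sends by default, of the form "libwww-perl/<version>".
my $default_agent = $ua->agent;
print "Default User-Agent: $default_agent\n";

# Build a request and let LWP add its default headers, without sending it.
# The URL below is a placeholder chosen for illustration.
my $request = HTTP::Request->new(GET => 'https://example.com/');
$ua->prepare_request($request);

# Dump the request line and headers exactly as LWP has prepared them.
print $request->as_string;
```

Comparing this output with the request headers shown in a browser's developer tools makes the difference visible: the script announces itself as libwww-perl and sends far fewer headers than a browser does.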

In the past they had a SOAP API that you could use through modules like WWW::Search::Google, but that is no longer the case, because the API has been deprecated.

Alternatives have been discussed in the following StackOverflow question:

What are the alternatives now that the Google web search API has been deprecated?

Answer 2

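Google has blacklisted LWP::UserAgent. They have blacklisted either the UserAgent string or some other part of the request (whichever it is).

I suggest you use Mojo::UserAgent. By default, its requests look more like a browser's. You only have to write at least 1 line of code.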

use Mojo::UserAgent;
use strict;
use warnings;

print Mojo::UserAgent->new->get('https://scholar.google.co.in/scholar_lookup?author=N.+R.+Alpert&author=S.+A.+Mohiddin&author=D.+Tripodi&author=J.+Jacobson-Hatzell&author=K.+Vaughn-Whitley&author=C.+Brosseau+&publication_year=2005&title=Molecular+and+phenotypic+effects+of+heterozygous,+homozygous,+and+compound+heterozygote+myosin+heavy-chain+mutations&journal=Am.+J.+Physiol.+Heart+Circ.+Physiol.&volume=288&pages=H1097-H1102')->res->dom->at('title')->text;

# Prints Molecular and phenotypic effects of heterozygous, homozygous, and      
# compound heterozygote myosin heavy-chain mutations - Google Scholar

Terms

The code does not accept any terms, and no extra lines were added to bypass security checks. This is absolutely fine.

Answer 3

If you add a user agent string to identify yourself to the web server, you can get the content:

...
my $UserAgent = LWP::UserAgent->new;
$UserAgent->agent('Mozilla/5.0'); #...add this...
...
print $1;
...

This prints: "Molecular and phenotypic effects of heterozygous, homozygous, and compound heterozygote myosin heavy-chain mutations"
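Because the snippet above is elided, here is one way the pieces might fit together. It is a sketch rather than the answerer's exact script: the helper names extract_title and fetch_title are hypothetical, and the network request runs only when a URL is passed on the command line. It also guards the regex match, so nothing is printed from a stale or undefined $1.

```perl
use strict;
use warnings;

# Pull the page title out of an HTML string; returns undef when nothing matches.
# (extract_title is a hypothetical helper name, not from the original answer.)
sub extract_title {
    my ($html) = @_;
    return $html =~ m{<title>(.*?) - Google Scholar</title>}s ? $1 : undef;
}

# Fetch a URL with the agent string suggested in the answer and return its title.
# (fetch_title is likewise a hypothetical helper name.)
sub fetch_title {
    my ($url) = @_;
    require LWP::UserAgent;    # loaded lazily, only when a fetch is requested
    my $ua = LWP::UserAgent->new;
    $ua->agent('Mozilla/5.0');
    my $response = $ua->get($url);
    die $response->status_line unless $response->is_success;
    return extract_title($response->decoded_content // '');
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $title = fetch_title($ARGV[0]);
    print defined $title ? "$title\n" : "No title found\n";
}
```

Keeping the extraction in its own function lets the pattern be checked against a saved HTML string without sending a request each time.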
