Scraping website content behind a secure login

Date: 2012-06-23 17:55:05

Tags: php curl web-scraping

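I'm trying to scrape the content of a website that sits behind a secure login, but I can't get it to work. The site's login form has three fields: username, password, and passcode. This is the code I'm using: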

<?php

// HTTP authentication

$url = "http://aftabcurrency.com/login_script.php";

$ch = curl_init();    

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 

curl_setopt($ch, CURLOPT_URL, $url); 
$cookie = 'cookies.txt';
$timeout = 30;
curl_setopt($curl, CURLOPT_TIMEOUT,         10); 
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT,  $timeout );
curl_setopt($curl, CURLOPT_COOKIEJAR,       $cookie);
curl_setopt($curl, CURLOPT_COOKIEFILE,      $cookie);

curl_setopt ($ch, CURLOPT_POST, 1); 
curl_setopt ($ch,CURLOPT_POSTFIELDS,"user_name=user&user_password=pass&passcode=code");             

$result = curl_exec($ch); 

curl_close($ch); 

echo $result;

?>

2 answers:

Answer 0 (score: 7)

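You need to POST to http://aftabcurrency.com/login_script.php. Your cURL handle also needs to accept cookies. After authentication the script redirects you, so you also need to set CURLOPT_FOLLOWLOCATION.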

Here is an edited version of your script. I couldn't test it against http://aftabcurrency.com/, but hopefully it works:

$url = "http://aftabcurrency.com/login_script.php";

$ch = curl_init();    
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 

curl_setopt($ch, CURLOPT_URL, $url); 
$cookie = 'cookies.txt';
$timeout = 30;

curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_TIMEOUT,         10); 
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT,  $timeout );
curl_setopt($ch, CURLOPT_COOKIEJAR,       $cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE,      $cookie);

curl_setopt ($ch, CURLOPT_POST, 1); 
curl_setopt ($ch,CURLOPT_POSTFIELDS,"user_name=user&user_password=pass&passcode=code");     

$result = curl_exec($ch);

/* //OPTIONAL - Redirect to another page after login
$url = "http://aftabcurrency.com/some_other_page";
curl_setopt ($ch, CURLOPT_POST, 0); 
curl_setopt($ch, CURLOPT_URL, $url);
$result = curl_exec($ch);
 */ //end OPTIONAL 

curl_close($ch); 
echo $result;
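
Since the script above could not be tested against the real site, it may help to surface transport errors instead of echoing an empty result. The following is only a minimal sketch: `exec_or_fail` is a hypothetical helper name (not part of the original answer), and it assumes CURLOPT_RETURNTRANSFER has already been set on the handle so that curl_exec() returns the response body as a string.

```php
<?php
// Hypothetical helper, not from the original answer: runs a prepared cURL
// handle and throws on failure instead of silently returning false.
// Assumes CURLOPT_RETURNTRANSFER was set, so curl_exec() returns a string.
function exec_or_fail($ch): string
{
    $result = curl_exec($ch);
    if ($result === false) {
        // curl_errno() and curl_error() describe the transport-level failure
        // (DNS, connection refused, timeout, ...).
        throw new RuntimeException(
            'cURL error ' . curl_errno($ch) . ': ' . curl_error($ch)
        );
    }
    // Guard against a non-string result if RETURNTRANSFER was not set.
    return is_string($result) ? $result : '';
}
```

In the script above, `$result = curl_exec($ch);` could then become `$result = exec_or_fail($ch);`. Note that this only catches transport failures; a rejected login will usually still come back as an ordinary HTML page, so the returned body has to be inspected separately.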

Answer 1 (score: 0)

You need to POST your username, password, and passcode to that page. What you are doing now is HTTP authentication. So instead of this:

curl_setopt($ch, CURLOPT_USERPWD, "demo:demopass:demopasscode"); 

you need this:

curl_setopt ($ch, CURLOPT_POST, 1); 
curl_setopt ($ch, CURLOPT_POSTFIELDS, "user_name=xxxxx&user_password=xxxxxx&passcode=xxxxx"); 
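
One caveat with a hand-written POST string like the one above: if any credential contains characters such as `&` or `=`, the body will be parsed incorrectly by the server. A minimal sketch of building the body with http_build_query() follows; the field names are taken from the question, while `build_login_body` and the sample values are made up for illustration.

```php
<?php
// Hypothetical helper: builds a URL-encoded login body using the field
// names from the question. Values are percent-encoded automatically.
function build_login_body(string $user, string $password, string $passcode): string
{
    return http_build_query(
        [
            'user_name'     => $user,
            'user_password' => $password,
            'passcode'      => $passcode,
        ],
        '',  // no numeric prefix needed
        '&'  // explicit separator, independent of php.ini settings
    );
}

// Sample (made-up) credentials containing characters that need escaping.
$body = build_login_body('user', 'p&ss=word', 'code');
// $body is "user_name=user&user_password=p%26ss%3Dword&passcode=code"
// and can be passed on as: curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
```

Passing an array directly to CURLOPT_POSTFIELDS would also work, but cURL then sends the request as multipart/form-data, which some login scripts may not expect; the encoded string keeps it as application/x-www-form-urlencoded.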