Scraping the cart process on musicmagpie with curl, PHP and simple_html_dom

Time: 2013-07-08 12:53:07

Tags: php curl screen-scraping

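I wrote a script that works with our site's cart system, but Music Magpie is proving difficult. It does not seem to let me get any results back.

I assume they are trying to block this kind of scraping, but as far as I can tell I am doing it correctly. Any suggestions?

My code is: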

require('simple_html_dom.php');

// Fetch $url with cURL, POSTing $fields when given, and return the parsed
// simple_html_dom object, or null if the request failed or the body is too short.
function grab_via_proxy_combination($url, $fields = array(), $header = false)
{
    // http_build_query() URL-encodes every value and joins the pairs with '&'.
    $fields_string = http_build_query($fields);

    $ch = curl_init(); // the handle has to be created before any curl_setopt() call
    curl_setopt($ch, CURLOPT_URL, $url);
    // CURLOPT_TIMEOUT_MS set after CURLOPT_TIMEOUT overrides it (2400 ms = 2.4 s),
    // so only the timeout in seconds is set here.
    curl_setopt($ch, CURLOPT_TIMEOUT, 2400);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2400);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0 FirePHP/0.6');
    //curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxyauth);
    $cookie_file = "cookies/magpie.txt";
    curl_setopt($ch, CURLOPT_COOKIESESSION, true);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, $header);
    if ($fields) {
        curl_setopt($ch, CURLOPT_POST, true); // a boolean flag, not a field count
        curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string);
    }

    $curl_scraped_page = curl_exec($ch); // false on failure
    curl_close($ch);

    // Treat a failed request or a near-empty body as "no result".
    if ($curl_scraped_page === false || strlen($curl_scraped_page) <= 100) {
        return null;
    }

    $html = new simple_html_dom();
    $html->load($curl_scraped_page, true, false);
    return $html;
}

// http_build_query() does the encoding, so the value is passed raw.
$fields = array(
    'barcode' => '711719274551'
);

$html = grab_via_proxy_combination('http://www.musicmagpie.co.uk/Sellit.asp?result=nq&barcode=0711719274551&loc=103', $fields);

// A simple_html_dom object prints as its HTML; null prints nothing.
echo $html;
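
To narrow down where the request fails, a minimal diagnostic sketch might look like the following. It separates a transport error (timeout, DNS, refused connection) from an HTTP-level rejection and from a suspiciously short body. The helper names summarize_curl_result and diagnose_request are hypothetical, the 30-second timeout is an arbitrary choice, and the 100-byte threshold simply mirrors the length check in the script above.

```php
<?php
// Hypothetical helper (not part of the original script): turns the raw
// outcome of a cURL request into a one-line diagnosis.
function summarize_curl_result($body, $errno, $error, $status)
{
    // curl_exec() returns false and sets an error number on transport failure.
    if ($body === false || $errno !== 0) {
        return "cURL error $errno: $error";
    }
    // The server answered, but with a client or server error status.
    if ($status >= 400) {
        return "HTTP $status (request rejected or page missing)";
    }
    // Same threshold as the length check in the original script.
    if (strlen($body) <= 100) {
        return "HTTP $status but only " . strlen($body) . " bytes returned";
    }
    return "HTTP $status, " . strlen($body) . " bytes";
}

// Hypothetical wrapper: performs a plain GET and reports what happened.
function diagnose_request($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30); // assumed value, in seconds

    $body   = curl_exec($ch);
    $errno  = curl_errno($ch);
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    return summarize_curl_result($body, $errno, $error, $status);
}
```

Calling echo diagnose_request('http://www.musicmagpie.co.uk/Sellit.asp?result=nq&barcode=0711719274551&loc=103'); would then show whether the problem is in the connection, the HTTP status, or the returned page itself. This only reports what cURL observed; it does not reveal why the site responds the way it does.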

0 answers:

No answers