Cannot get DOM to work (using cURL)

Time: 2016-01-14 01:28:15

Tags: php curl cookies

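I'm trying to replace a working Simple HTML DOM Parser snippet with another one so that I can send a cookie. The problem is that the new version just messes up my page.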

I'm trying to get the description text of a game, which sits inside a div with the class game_description_snippet. That particular div contains only text and no other tags.

Not working:

//Not working at all
$url = "http://store.steampowered.com/app/100";

$ch = curl_init();

curl_setopt($ch, CURLOPT_COOKIE, "birthtime=28801; path=/; domain=store.steampowered.com");
curl_setopt($ch, CURLOPT_TIMEOUT, 5); 
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$result = curl_exec($ch);

$dom = new domDocument;
libxml_use_internal_errors(true);
$dom->loadHTMLFile($result);

$descrs = $dom->getElementsByClassName('game_description_snippet');
foreach ($descrs as $descr) {
    $spanx = $descr->textContent;
    echo $spanx;
}

I'm trying to use it to replace the following code and add that cookie:

//Working, but slowly.
$url = "http://store.steampowered.com/app/100";
$html = file_get_html($url);

foreach($html->find('div.game_description_snippet') as $element) {

    if(empty($element)) {
        $descr = "This link will take you to the full game.";
    } else {
        $unformatted = $element->plaintext;
        $formatted = trim($unformatted);
        $descr = str_replace("'", "", $formatted);
        if($descr == "<br>") {
            $descr = "";
        }
    }
}

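Any guidance would be much appreciated. I know I'm a beginner, but I've been googling for hours and I don't know how to proceed. I did write a similar function, but it didn't include the cURL part and just used $dom->loadHTMLFile($url); instead.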

Edit:

foreach ($nodes as $node) {
    $span = $node->childNodes;
    $knas = $span->item(0)->nodeValue;
    echo $knas;
}

This works for me, but the result does not contain only text (as it does in my current case); it contains something like <h1>Some text</h1>.
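If the goal is only the text inside the matched element, a minimal sketch follows, assuming a hypothetical inline HTML fragment (the markup and variable names are illustrative and not taken from the Steam page). It contrasts reading the first child node with reading textContent on the element itself, which concatenates the text of all descendants and contains no tags.

```php
<?php
// Hypothetical markup for illustration only.
$html = '<div class="game_description_snippet"><h1>Some text</h1> and more text</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);            // loadHTML takes markup; loadHTMLFile takes a path or URL
libxml_clear_errors();

$div = $dom->getElementsByTagName('div')->item(0);

// First child only: here that is the <h1> element, so the trailing text is missed.
$firstChildText = $div->childNodes->item(0)->nodeValue;

// textContent on the element: text of every descendant, without any tags.
$allText = trim($div->textContent);

echo $firstChildText, PHP_EOL;
echo $allText, PHP_EOL;
```

Reading textContent from the element, rather than from one of its children, avoids depending on how the page happens to nest its markup.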

1 answer:

Answer 0: (score: 1)

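As we discussed in chat, you need to have PHP request a cookie itself rather than send one copied from the browser. The server validates the cookie, and if it does not come from the same session it is not a valid cookie.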

Here is the code that I got working.

// cURL helper modified to accept POST parameters and a cookie jar file.
// A cookie jar is needed because the cookie is not static: it lets the server
// change the cookie as needed, and the updated cookie is sent back on the next request.
function request($url, $params=array(), $cookiejar="") {
    $ch = curl_init();
    $curlOpts = array(
        CURLOPT_URL => $url,
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_SSL_VERIFYHOST => false,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true
    );
    if (!empty($params)) {
        // If POST values are given, send them as well
        $curlOpts[CURLOPT_POST] = true;
        $curlOpts[CURLOPT_POSTFIELDS] = $params;
    }
    if (!empty($cookiejar)) {

        if(!file_exists($cookiejar)){
            echo 'Cookie file missing. please create it first.'; exit;
        }else if(!is_writable($cookiejar)){
            echo 'Cookie file not writable. chmod it to 777.'; exit;
        }

        curl_setopt($ch, CURLOPT_COOKIEJAR, realpath($cookiejar));
        curl_setopt($ch, CURLOPT_COOKIEFILE, realpath($cookiejar));
    }
    curl_setopt_array($ch, $curlOpts);
    $answer = curl_exec($ch);
    // If there was an error, show it
    if (curl_error($ch)) die(curl_error($ch));
    curl_close($ch);
    return $answer;
}

// The url you're trying to access
$url = "http://store.steampowered.com/app/223470";

// The url that gives us the cookie
$cookie_url = "http://store.steampowered.com/agecheck/app/223470/";

// The POST parameters that are sent to the server when requesting a cookie
$params = array("ageDay"=>1, "ageMonth"=>"January", "ageYear"=>"1915", "snr"=>"1_agecheck_agecheck__age-gate");

// This request just gives us a cookie.
// The "cookiejar" file must already exist and be writable, or request() exits.
request($cookie_url, $params, "cookiejar");

// And this request fetches the page you need
$html = request($url, array(), "cookiejar");

// Domdoc stuff that we already discussed
$dom = new domDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);

$classname="game_description_snippet";
$finder = new DomXPath($dom);
$spaner = $finder->query("//*[contains(@class, '$classname')]");

foreach ($spaner as $spane) { 
    $mo = $spane->nodeValue;
    echo $mo;
}
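
The contains(@class, ...) test above also matches any class name that merely contains the string (for example a hypothetical game_description_snippet_old). A minimal sketch of a stricter, whole-token match follows, assuming an inline HTML string in place of the cURL response; the sample markup and the helper name textByClass are illustrative and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the trimmed text of every element whose
// class attribute contains $class as a whole, space-separated token.
// Assumes $class is a plain class name with no quote characters.
function textByClass(string $html, string $class): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Pad the class list with spaces so that only complete tokens match.
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Illustrative markup only, standing in for the fetched page.
$sample = '<div class="game_description_snippet">A short description.</div>'
        . '<div class="game_description_snippet_old">Should not match.</div>';

print_r(textByClass($sample, 'game_description_snippet'));
```

Returning an array instead of echoing inside the loop makes it easy to fall back to a default description when nothing matches.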