Using DOM XPath to get content from multiple web pages

Date: 2015-07-15 01:01:36

Tags: php dom xpath

When I run this code, only the details from the second iteration are displayed. How can I loop over both pages, page 1 and page 2?

I have the following code, where the img elements are inside a div:

$ratings = array();
for ($pageNum = 1; $pageNum < 3; $pageNum++) {

    $html = file_get_contents("http://www.example.com/store/abc/page/$pageNum");
    // loadHTML() is an instance method; @ suppresses warnings from malformed markup
    $dom = new DOMDocument();
    @$dom->loadHTML($html);

    // Init the XPath object
    $xpath = new DOMXPath($dom);

    // Query the DOM for the img elements inside the rating div
    $rating = $xpath->query('//div[contains(@class, "rating fl")]//img');

    // Collect the title attribute of every matched img
    foreach ($rating as $link) {
        //echo  $link->getAttribute('title'),'<br>';
        $ratings[] = $link->getAttribute('title');
        if (sizeof($ratings) == 15) {
            //  var_dump($ratings);
        }
    }
}

1 answer:

Answer 0 (score: 0)

You are resetting $ratings on every iteration, so the $ratings array only holds the values from the last pass.

Simplified version:

for($pageNum=1; $pageNum<3;$pageNum++){
    $ratings = array();
    $rating = array(0,1,2);
    foreach($rating as $link){
        $ratings[]  = $pageNum;
        echo $pageNum;
     }
}
print_r($ratings);

Output:

111222Array
(
    [0] => 2
    [1] => 2
    [2] => 2
)

If you comment out the initialization of $ratings, or move it outside the loop, it should work as expected.

for($pageNum=1; $pageNum<3;$pageNum++){
    //$ratings = array();
    $rating = array(0,1,2);
    foreach($rating as $link){
        $ratings[]  = $pageNum;
        echo $pageNum;
     }
}
print_r($ratings);

Output:

111222Array
(
    [0] => 1
    [1] => 1
    [2] => 1
    [3] => 2
    [4] => 2
    [5] => 2
)
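
Applied to the DOM/XPath case from the question, the same idea might look like the sketch below. It is only an illustration: the $pages array and its inline HTML strings are hypothetical stand-ins for what file_get_contents() would return for page 1 and page 2, and the title values are made up. The point is that $ratings is created once, before the loop, so titles from every page accumulate in it.

```php
<?php
// Hypothetical stand-ins for the two fetched pages; in the question these
// strings would come from file_get_contents() on the page URLs.
$pages = [
    1 => '<div class="rating fl"><img title="4 stars"><img title="5 stars"></div>',
    2 => '<div class="rating fl"><img title="3 stars"></div>',
];

// Initialize the accumulator once, before the loop, so it is never reset.
$ratings = [];

foreach ($pages as $pageNum => $html) {
    $dom = new DOMDocument();
    // Suppress parser warnings from imperfect markup, as the question does with @.
    @$dom->loadHTML($html);

    $xpath = new DOMXPath($dom);
    $images = $xpath->query('//div[contains(@class, "rating fl")]//img');

    // Append every title found on this page to the shared array.
    foreach ($images as $img) {
        $ratings[] = $img->getAttribute('title');
    }
}

print_r($ratings);
```

After the loop, $ratings should contain the titles from both pages, in page order. Moving the $ratings = []; line inside the foreach would reproduce the problem described above, leaving only the titles from the last page.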