Question

I am trying to grab every row of the table at this URL:

http://www.gosugamers.net/counterstrike/news/archive

I used xpath-helper to create the following path:

//div[class='content']/table[@class='simple gamelist medium']/tbody/tr

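This should print every row in the tbody, but when I try it in Simple HTML DOM it returns the thead row with the title, date, and comments headers. Why doesn't it return the tbody rows the way xpath-helper does?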

include('simple_html_dom.php');

function getHTML($url, $timeout)
{
    $ch = curl_init($url); // initialize curl with the given URL
    curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]); // set the user agent
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, if any
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // max. seconds to wait while connecting
    curl_setopt($ch, CURLOPT_FAILONERROR, 1); // fail on HTTP status codes >= 400
    return @curl_exec($ch);
}

$html = str_get_html(getHTML("http://www.gosugamers.net/counterstrike/news/archive", 10));

$table = $html->find("//div[class='content']/table[@class='simple gamelist medium']/tbody/tr", 0);

echo $table;

Answer 1

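Update:

The simplehtmldom library does not appear to support positional predicates in XPath. To get a specific row, pass a 0-based index as the second argument to find().

To get the first non-header row (the second table row):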

$table = $html->find("//div[class='content']/table[@class='simple gamelist medium']/tbody/tr", 1);

working phpfiddle

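Original answer (struck out): Your XPath expression selects every `tr` element. If you want the whole `tbody` element, remove `/tr` from the end of the expression. If you only want a table cell (`td`), append `/td[1]`. If you only want the title, append `/td[1]/a/string()`.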
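
If real XPath is needed, including positional predicates such as `tr[1]` and functions such as `string()`, one alternative is PHP's built-in DOM extension (`DOMDocument` and `DOMXPath`) instead of Simple HTML DOM. The following is only a minimal sketch of that alternative, not something from the original answer. The inline markup is a hypothetical stand-in for the real page, whose actual structure may differ, and the expression uses `@class` on the `div`, which is an assumption about what the question intended.

```php
<?php
// Hypothetical stand-in markup; the real page's structure may differ.
$markup = <<<HTML
<div class="content">
  <table class="simple gamelist medium">
    <thead><tr><th>Title</th><th>Date</th><th>Comments</th></tr></thead>
    <tbody>
      <tr><td><a href="/news/1">First headline</a></td><td>Jan 1</td><td>3</td></tr>
      <tr><td><a href="/news/2">Second headline</a></td><td>Jan 2</td><td>0</td></tr>
    </tbody>
  </table>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Every body row. Browser tools evaluate XPath against the browser's DOM,
// which contains a tbody even when the raw HTML omits it; if the fetched
// markup has no tbody, drop that step from the expression.
$rows = $xpath->query(
    "//div[@class='content']/table[@class='simple gamelist medium']/tbody/tr"
);

// Title text of each row, evaluated relative to the row node.
$titles = [];
foreach ($rows as $row) {
    $titles[] = $xpath->evaluate('string(td[1]/a)', $row);
}

// Positional predicates are 1-based in XPath: tr[1] is the first body row.
$firstTitle = $xpath->evaluate(
    "string(//table[@class='simple gamelist medium']/tbody/tr[1]/td[1]/a)"
);

print_r($titles);
echo $firstTitle, PHP_EOL;
```

Note the indexing difference: the second argument to Simple HTML DOM's find() is 0-based, while XPath positions start at 1.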
