PHP XPath evaluate returns duplicated data, only the first row

Time: 2017-08-16 11:39:08

Tags: php xpath evaluate

This is my PHP code:

<?php

error_reporting(E_ALL);
ini_set("display_errors",1);

ini_set('max_execution_time', 36000); // 36000 seconds = 10 hours

$url = 'http://www.sportstats.com/soccer/matches/20170815/';

libxml_use_internal_errors(true); 
$doc = new DOMDocument();
$doc->loadHTMLFile($url);
$xpath = new DOMXpath($doc);


$data = array(

'HomeTeam' => $xpath->evaluate('string(//td[@class="table-home"]/a)'),
'AwayTeam' => $xpath->evaluate('string(//td[contains(@class, "table-away")]/a)'),
'FtScore' => $xpath->evaluate('string(normalize-space(translate(//td[@class="result-neutral"]," " ,"")))'),
'HomeTeamid' => $xpath->evaluate('substring-before(substring-after(substring-after(//td[@class="table-home"]/a/@href, "/soccer/"),"-"),"/")'),
'AwayTeamid' => $xpath->evaluate('substring-before(substring-after(substring-after(//td[@class="table-away"]/a/@href, "/soccer/"),"-"),"/")')

);

foreach ($data as $key) {

echo $data['HomeTeamid'].",";
echo $data['HomeTeam'].",";
echo $data['FtScore'].",";
echo $data['AwayTeam'].",";
echo $data['AwayTeamid']."<br/>";

}

?>

But the script produces duplicated results:

n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4
n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4
n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4
n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4
n3QdnjFB,Santos,0-0,Fluminense,EV9L3kU4

But I want it to look like this:

 HTeamid,Santos,0-0,Fluminense,ATeamid
 HTeamid,Cartagena,1-0,Llaneros,ATeamid
 HTeamid,Cerro Porteno,1-1,Libertad Asuncion,ATeamid
 HTeamid,Operario,2-1,Maranhao,ATeamid
 HTeamid,Emelec,2-0,Fuerza,ATeamid
 ...
 ..
 .

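[Image: matches list]

I looked at other questions on the site and did not find an answer. How can I get the data for all the other teams using echo? (I don't want to use var_dump.) Thanks.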

2 answers:

Answer 0 (score: 0)

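There are two mistakes here. First, you use //td in the location path, which makes the path relative to the document rather than to the current row. Second, the string function always returns the text content of the first node in a list. Together these mean you always get the first match.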
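A minimal sketch of that behaviour, assuming a hypothetical two-row sample (the markup of the real page may differ): the location path matches two links, yet string() only ever reports the first one.

```php
<?php
// Hypothetical two-row sample; only the class name is taken from the question.
$html = '<table>
  <tr><td class="table-home"><a href="#">Santos</a></td></tr>
  <tr><td class="table-home"><a href="#">Cartagena</a></td></tr>
</table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// count() shows how many nodes the location path actually matches.
$count = $xpath->evaluate('count(//td[@class="table-home"]/a)');

// string() converts a node set to the string value of its FIRST node only.
$first = $xpath->evaluate('string(//td[@class="table-home"]/a)');

echo 'Matched nodes: ', $count, "\n";
echo 'string() result: ', $first, "\n";
```

Looping over an array that holds such a single string therefore just repeats the same value once per array key.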

The typical structure for fetching list data is:

foreach($xpath->evaluate($exprForItems) as $item) {
  $detail = $xpath->evaluate($exprForDetail, $item);
}

A more concrete example:

$document = new DOMDocument();
$document->loadHtml($html);
$xpath = new DOMXpath($document);

$expressions = new stdClass();
// this is the expression for items - it returns a node list
$expressions->games = '//div[@id = "LS_todayMatchesContent"]/table/tbody/tr';
// these are the detail expressions - they return a string
$expressions->home = 'string(td[@class = "table-home"]/a)';
$expressions->homeId = 'substring-before(substring-after(substring-after(td[@class="table-home"]/a/@href, "/soccer/"),"-"),"/")';
$expressions->away = 'string(td[@class = "table-away"]/a)';

foreach ($xpath->evaluate($expressions->games) as $game) {
  var_dump(
    [
      $xpath->evaluate($expressions->home, $game),
      $xpath->evaluate($expressions->homeId, $game),
      $xpath->evaluate($expressions->away, $game)
    ]
  );
}

Output:

array(3) {
  [0]=>
  string(6) "Santos"
  [1]=>
  string(8) "n3QdnjFB"
  [2]=>
  string(10) "Fluminense"
}
array(3) {
  [0]=>
  string(9) "Cartagena"
  [1]=>
  string(8) "6eofBSjQ"
  [2]=>
  string(8) "Llaneros"
}
//...

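So only the detail expressions use the string functions, and they always need the item node as the context (the second argument). You have to be careful with the context.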
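Since the question asks for echo rather than var_dump, the same pattern could be sketched as follows. The sample HTML, the row expression `//tr[td[@class="table-home"]]`, the href format and the id `AwayId02` are assumptions made for illustration; the class names and the id expression come from the question.

```php
<?php
// Hypothetical sample rows; the real page structure may differ.
$html = '<table>
  <tr>
    <td class="table-home"><a href="/soccer/santos-n3QdnjFB/">Santos</a></td>
    <td class="result-neutral">0-0</td>
    <td class="table-away"><a href="/soccer/fluminense-EV9L3kU4/">Fluminense</a></td>
  </tr>
  <tr>
    <td class="table-home"><a href="/soccer/cartagena-6eofBSjQ/">Cartagena</a></td>
    <td class="result-neutral">1-0</td>
    <td class="table-away"><a href="/soccer/llaneros-AwayId02/">Llaneros</a></td>
  </tr>
</table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Item expression: one node per match row (assumed selector).
$rowExpr = '//tr[td[@class="table-home"]]';

// Detail expressions: relative paths (no leading //), each returns a string.
$detail = [
    'HomeTeamid' => 'substring-before(substring-after(substring-after(td[@class="table-home"]/a/@href, "/soccer/"), "-"), "/")',
    'HomeTeam'   => 'string(td[@class="table-home"]/a)',
    'FtScore'    => 'normalize-space(td[@class="result-neutral"])',
    'AwayTeam'   => 'string(td[contains(@class, "table-away")]/a)',
    'AwayTeamid' => 'substring-before(substring-after(substring-after(td[contains(@class, "table-away")]/a/@href, "/soccer/"), "-"), "/")',
];

$lines = [];
foreach ($xpath->evaluate($rowExpr) as $row) {
    $fields = [];
    foreach ($detail as $expr) {
        // The second argument makes $row the context node for the expression.
        $fields[] = $xpath->evaluate($expr, $row);
    }
    $lines[] = implode(',', $fields);
}

// One comma-separated line per match, as requested in the question.
foreach ($lines as $line) {
    echo $line, "<br/>\n";
}
```

Keeping the detail expressions in one array keeps the column order in a single place, so adding a field only means adding one entry.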

Answer 1 (score: -1)

Try editing the XPath array like this:

'HomeTeam' => $xpath->query('//td[@class="table-home"]/a'),
'AwayTeam' => $xpath->query('//td[contains(@class, "table-away")]/a'),
'FtScore' => $xpath->query('//td[@class="result-neutral"]'),
...

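That is, use query instead of evaluate, and drop the string() wrapper from the paths so that each entry holds a node list.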

Then you can echo your results like this:

foreach ($data as $dataKey => $dataValue) {
    foreach ($dataValue as $key => $element) {
        $nodes = $element->childNodes;  
        foreach ($nodes as $node) { 
            $tag = $node->nodeValue;
            echo $dataKey.' - '.$key.' - '.$tag.'<br>';  //$dataKey and $key are just informative
        }
    }
    echo '<br>';
}

For me, it lists:

HomeTeam - 0 - Santos
HomeTeam - 1 - Cartagena
HomeTeam - 2 - Cerro Porteno
HomeTeam - 3 - Operario
HomeTeam - 4 - Boca Juniors
HomeTeam - 5 - Emelec
....
AwayTeam - 0 - Fluminense
AwayTeam - 1 - Llaneros
AwayTeam - 2 - Libertad Asuncion
AwayTeam - 3 - Maranhao
AwayTeam - 4 - Gimnasia y Tiro
AwayTeam - 5 - Fuerza A.
....

Of course, if you want the data printed in a meaningful way, you need to collect it in arrays....
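One way to do that collecting, sketched under the assumption that every match row contains all three cells so the node lists line up by index (the sample HTML is hypothetical):

```php
<?php
// Hypothetical sample rows using the class names from the question.
$html = '<table>
  <tr>
    <td class="table-home"><a href="#">Santos</a></td>
    <td class="result-neutral">0-0</td>
    <td class="table-away"><a href="#">Fluminense</a></td>
  </tr>
  <tr>
    <td class="table-home"><a href="#">Cartagena</a></td>
    <td class="result-neutral">1-0</td>
    <td class="table-away"><a href="#">Llaneros</a></td>
  </tr>
</table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Three parallel node lists, one per column.
$home  = $xpath->query('//td[@class="table-home"]/a');
$score = $xpath->query('//td[@class="result-neutral"]');
$away  = $xpath->query('//td[contains(@class, "table-away")]/a');

$rows = [];
for ($i = 0; $i < $home->length; $i++) {
    // item() returns null past the end of a list, so guard each lookup.
    $scoreNode = $score->item($i);
    $awayNode  = $away->item($i);
    $rows[] = [
        'HomeTeam' => trim($home->item($i)->textContent),
        'FtScore'  => $scoreNode ? trim($scoreNode->textContent) : '',
        'AwayTeam' => $awayNode ? trim($awayNode->textContent) : '',
    ];
}

foreach ($rows as $row) {
    echo implode(',', $row), "<br/>\n";
}
```

Pairing by index silently misaligns if any row lacks one of the cells, which is why evaluating relative expressions against each row node is usually the safer choice.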

Hope this is the answer you were looking for :) Have a nice day!