Manipulating the PHP strings obtained from an HTML page with XPath

Asked: 2017-01-23 17:21:35

Tags: php xpath

Using this script:

<?php

$html = file_get_contents('http://www.sports-reference.com/olympics/summer/');

error_reporting(E_ERROR | E_PARSE);

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXpath($doc);
$result = $xpath->query('//div[contains(@id, "div_Summer")]//tbody//tr//td[position() >= 1 and position() <= 2]');

#foreach ($result as $i => $tag){
 #   echo $i, ': ', var_dump($tag->nodeValue), ' HTML: ', $doc->saveHTML($tag), "\n";
#}


$links = [];


  foreach($result as $item) { // DOMElement Object

  #var_dump($item->nodeValue);

    $links[] = [
      'city' => $item->nodeValue,
      'year' => $item->nodeValue,
    ];
  }

print_r ($links);

#echo $link = 'http://www.sports-reference.com'.$links[27][href];

?>

In the end I get this output:

string(4) "2012"
string(6) "London"
string(4) "2008"
string(7) "Beijing"
string(4) "2004"
string(6) "Athina"
string(4) "2000"
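What I want is for the array to hold only the city under 'city' and only the year under 'year'. The way I wrote the script obviously does not work, because both keys receive the same node value. How can I get the result I want?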

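1 answer:

Answer 0 (score: 0)

You can try this. The query returns the matched cells as one flat list, so alternate between the two keys as you iterate: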

<?php
$html = file_get_contents('http://www.sports-reference.com/olympics/summer/');
error_reporting(E_ERROR | E_PARSE); // hide the warnings loadHTML() raises on imperfect markup
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$result = $xpath->query('//div[contains(@id, "div_Summer")]//tbody//tr//td[position() >= 1 and position() <= 2]');
$links = [];
$i = 0;
foreach ($result as $item) { // DOMElement Object
    // Cells arrive in document order; per the output in the question,
    // the year comes first and the city second.
    if ($i % 2 == 0) {
        $links['year'][] = $item->nodeValue;
    } else {
        $links['city'][] = $item->nodeValue;
    }
    $i++;
}
echo "<pre>";
print_r($links);
echo "</pre>";
?>
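
If one entry per row is preferred over two parallel lists, a possible variation is to select the rows first and then read the cells relative to each row. The sketch below is only an illustration. The function name extractGames and the sample markup are hypothetical, and the sample assumes the real page puts the year in the first cell and the city in the second, as the output in the question suggests. It uses libxml_use_internal_errors() instead of changing error_reporting, which is a swapped-in way to silence parser warnings.

```php
<?php
// Hypothetical helper: returns one ['year' => ..., 'city' => ...] entry per table row.
function extractGames(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $games = [];

    // Select rows rather than cells, then query the cells relative to each row.
    foreach ($xpath->query('//div[contains(@id, "div_Summer")]//tbody/tr') as $row) {
        $cells = $xpath->query('./td', $row);
        if ($cells->length < 2) {
            continue; // skip rows that do not have both cells
        }
        $games[] = [
            'year' => trim($cells->item(0)->nodeValue),
            'city' => trim($cells->item(1)->nodeValue),
        ];
    }

    return $games;
}

// Sample markup (an assumption, not copied from the real page).
$sample = <<<HTML
<div id="div_Summer"><table><tbody>
<tr><td>2012</td><td>London</td><td>x</td></tr>
<tr><td>2008</td><td>Beijing</td><td>y</td></tr>
</tbody></table></div>
HTML;

print_r(extractGames($sample));
```

Pairing by row keeps each year and city together even if a row is missing a cell, whereas the alternating counter would shift every later value by one position.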
