Getting the XPath of an HTML element from an .asp web page

Date: 2017-08-15 14:27:50

Tags: xpath web-scraping firebug

I need to scrape this HTML page...

http://www.asl1.liguria.it/templateProntoSoccorso.asp


...using PHP and XPath, to get the value 2 in

Codice bianco: 2

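(Note: if you browse to the page you may see different values. That doesn't matter; the values change dynamically.)

I can't get the XPath with Mozilla Firebug the way I usually do. Any suggestions?

Thanks in advance!

Update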

<?php
    ini_set('display_errors', 1);

    $url = 'http://www.asl1.liguria.it/templateProntoSoccorso.asp';

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_PROXY, '');
    $data = curl_exec($ch);
    curl_close($ch);

    $dom = new DOMDocument();
    @$dom->loadHTML($data);

    $xpath = new DOMXPath($dom);

    $Number = $xpath->query('/html/body/table/tbody/tr/td[2]/table[2]/tbody/tr/td[3]/table/tbody/tr[2]/td[1]/table/tbody/tr/td/div[1]/div[3]/div[2]');

    foreach( $Number as $node )
    {
      echo "Number: " .$node->nodeValue;
      echo '<br>';
      echo '<br>';
    }    
?>

2 answers:

Answer 0 (score: 1)

This should work:

  1. Value of the first element:

    substring-after(//div[@class="datiOspedaleCodici"]/div[1]/text(), ":")
    
  2. For the second one:

    substring-after(//div[@class="datiOspedaleCodici"]/div[2]/text(), ":")
    

    ...and so on

  3. Just increment the index in /div[x] to get the next value
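
One PHP detail worth noting: `substring-after(...)` returns a string rather than a node set, so it has to be run with `DOMXPath::evaluate()`; `DOMXPath::query()` only returns node lists. Below is a minimal sketch of the expressions above in a loop. The inline sample HTML (including the "Codice verde" row) and the variable names are made up for illustration, and the real page's markup may differ.

```php
<?php
// Hypothetical sample markup, assumed to resemble the target page.
$html = '<html><body><div class="datiOspedaleCodici">'
      . '<div>Codice bianco: 2</div>'
      . '<div>Codice verde: 5</div>'
      . '</div></body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$base  = '//div[@class="datiOspedaleCodici"]';

// count() yields a number, so evaluate() is needed here as well.
$count = (int) $xpath->evaluate("count({$base}/div)");

$values = [];
for ($i = 1; $i <= $count; $i++) {
    // substring-after() uses the first matching text node and returns
    // everything after the first colon; trim() drops the leading space.
    $values[] = trim($xpath->evaluate("substring-after({$base}/div[{$i}]/text(), ':')"));
}

print_r($values);
```

If the page contains several `datiOspedaleCodici` blocks, `substring-after()` only sees the first matching text node, so the path may need narrowing to the block of interest.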

Answer 1 (score: 0)

I've solved it. Here is the working code:

<?php
    ini_set('display_errors', 1);

    $url = 'http://www.asl1.liguria.it/templateProntoSoccorso.asp';

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_PROXY, '');
    $data = curl_exec($ch);
    curl_close($ch);

    $dom = new DOMDocument();
    @$dom->loadHTML($data);

    $xpath = new DOMXPath($dom);

    $Number = $xpath->query('(//div[@class="datiOspedaleCodici"]/div[1]/text())[1]');

    foreach( $Number as $node )
    {
      echo "Number: " .$node->nodeValue;
      echo '<br>';
      echo '<br>';
    }    
?>
which prints:

Codice bianco:2
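
The query above returns the whole text node, label included. If only the number is wanted, it can be split off in PHP afterwards. This is a minimal sketch, assuming the text always has the form "label:number"; the helper name `extractCount` is hypothetical, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the integer that follows the last colon
// at the end of the text, or null when no such number is present.
function extractCount(string $text): ?int
{
    if (preg_match('/:\s*(\d+)\s*$/', $text, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

echo extractCount('Codice bianco:2'), "\n";
var_dump(extractCount('no number here'));
```

In the answer's loop, this would be called as `extractCount($node->nodeValue)`.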