I want to get the text of a div (school) from a URL.
<div id='listBox'>
<div class='list'>
<span class='listID'>01101602</span>school
</div>
<div class='department'></div>
<div class='nop'></div>
</div>
I have already tried several approaches:
1. file_get_html
with query('//div[@class="list"]');
2. file_get_contents
$first = explode( '<div class="list">',$content );
and $second = explode("</div>" , $first[0] );
followed by echo $second[0];
I can't get it to work...
Answer 0: (score: 0)
This is a dirty solution, but the code you pasted is not valid XML/HTML, so an ordinary XML/HTML parser may not parse it reliably.
<?php
$text = file_get_contents("http://page.com/file.htm");
$explode1 = explode('</span>', $text);
$explode2 = explode('</div>', $explode1[1]);
$schoolText = trim($explode2[0]);
This part is invalid HTML (the closing div tag is missing its leading <):
<div class='department'>/div>
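This text (school) is a bare text node with no element of its own, so element-based queries in all or almost all HTML parsers will skip it: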
<span class='listID'>01101602</span>school
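The explode approach above can be wrapped with a couple of guards so that a missing marker does not raise an undefined-index notice. This is only a minimal sketch: the function name textAfterSpan is hypothetical, and the inline $sample string stands in for the remote page.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text between
// the first '</span>' and the next '</div>', or null if either marker is missing.
function textAfterSpan(string $html): ?string
{
    $afterSpan = explode('</span>', $html, 2);
    if (count($afterSpan) < 2) {
        return null; // no closing span in the input
    }

    $beforeDiv = explode('</div>', $afterSpan[1], 2);
    if (count($beforeDiv) < 2) {
        return null; // no closing div after the span
    }

    return trim($beforeDiv[0]);
}

// Sample markup copied from the question, including the malformed tag.
$sample = "<div id='listBox'>
<div class='list'>
<span class='listID'>01101602</span>school
</div>
<div class='department'>/div>
<div class='nop'></div>
</div>";

echo textAfterSpan($sample), PHP_EOL; // school
```

Like the original, this depends on the first span in the page being the one you want, so it will break if the page layout changes.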
Answer 1: (score: 0)
Use DOMDocument with XPath. It handles this markup without trouble:
$html = "<div id='listBox'>
<div class='list'>
<span class='listID'>01101602</span>school
</div>
<div class='department'>/div>
<div class='nop'></div>
</div>";
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Get innerHTML of the div
foreach($xpath->query('//div[@class="list"]')->item(0)->childNodes as $x) {
echo $dom->saveHTML($x);
}
// <span class="listID">01101602</span>school
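The loop above prints the whole inner HTML of the div, span included. If only the bare text (school) is wanted, one option is to keep just the direct text-node children. The sketch below assumes the same sample markup and silences libxml warnings, which the original snippet does not do.

```php
<?php
$html = "<div id='listBox'>
<div class='list'>
<span class='listID'>01101602</span>school
</div>
<div class='department'>/div>
<div class='nop'></div>
</div>";

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$list = $xpath->query('//div[@class="list"]')->item(0);

$text = '';
if ($list !== null) {
    foreach ($list->childNodes as $child) {
        // Keep only direct text nodes; this skips the <span> element.
        if ($child->nodeType === XML_TEXT_NODE) {
            $text .= $child->nodeValue;
        }
    }
}
$text = trim($text);

echo $text, PHP_EOL; // school
```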
Answer 2: (score: 0)
You should be able to load the remote page directly into a new instance of DOMDocument and use an XPath query to find the nodes you want:
$dom=new DOMDocument;
$dom->loadHTMLFile( $url );
$xp=new DOMXPath($dom);
$query='//*[@id="listBox"]/div[@class="list"]/span[@class="listID"]';
$col=$xp->query($query);
if( !empty( $col ) && $col->length > 0 ){
foreach( $col as $node )echo $node->nodeValue;
}
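Note that the query above selects the listID span, so nodeValue yields the ID (01101602) rather than the text that follows it. A possible variant, sketched here under the assumption that the wanted text is always the first text node after that span, uses the following-sibling axis. The inline $html string stands in for the remote page.

```php
<?php
// Inline sample markup taken from the question; a real page would be
// loaded with loadHTMLFile($url) instead.
$html = "<div id='listBox'>
<div class='list'>
<span class='listID'>01101602</span>school
</div>
<div class='department'></div>
<div class='nop'></div>
</div>";

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xp = new DOMXPath($dom);

// string(...) makes evaluate() return a plain string
// (an empty string when nothing matches).
$query = 'string(//*[@id="listBox"]/div[@class="list"]'
       . '/span[@class="listID"]/following-sibling::text()[1])';
$school = trim($xp->evaluate($query));

echo $school, PHP_EOL; // school
```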
Depending on how valid the HTML found on the remote site is, you may need to use some of the libxml error-handling functions, for example:
/* try to prevent errors */
libxml_use_internal_errors( true );
$dom=new DOMDocument;
$dom->validateOnParse=false;
$dom->standalone=true;
$dom->strictErrorChecking=false;
$dom->recover=true;
$dom->formatOutput=false;
$dom->loadHTMLFile( $url );
/* clear errors */
libxml_clear_errors();
$xp=new DOMXPath($dom);
$query='//*[@id="listBox"]/div[@class="list"]/span[@class="listID"]';
$col=$xp->query($query);
if( !empty( $col ) && $col->length > 0 ){
foreach( $col as $node )echo $node->nodeValue;
}
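If you would rather see what the parser complained about than discard it silently, the collected errors can be read with libxml_get_errors() before they are cleared. This is a sketch only: it parses the sample markup from the question rather than a remote URL, and whether any errors are reported for it depends on the libxml version.

```php
<?php
$html = "<div id='listBox'>
<div class='list'>
<span class='listID'>01101602</span>school
</div>
<div class='department'>/div>
<div class='nop'></div>
</div>";

// Remember the previous setting so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Each entry is a LibXMLError with level, code, line, column and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

libxml_clear_errors();
libxml_use_internal_errors($previous);
```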