I need some help extracting the date from the HTML below. (This is only a snippet of what I want to extract; the source is a complete HTML page.)
.... <span class="glyphicon glyphicon-comment" style="color:#ccc;"> </span>
<span style="font-family:'Open Sans', arial;font-size:11px!important;color:#ccc;">0</span>
<span class="glyphicon glyphicon-time" style="color:#ccc;"></span>
<span style="font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;">December 6, 2014</span>
<span style="font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;">2:00 am</span>
<span style="font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;">Hits(6)</span>....
I tried to find it with the following code, using PHP's DOMDocument and XPath, but it fails: the result has a length of zero. Why?
//libxml_use_internal_errors(true);
$dom_document = new DOMDocument(); // CREATE A NEW DOCUMENT
$dom_document->loadHTML(
mb_convert_encoding($row['html'], 'HTML-ENTITIES', 'UTF-8')
); // LOAD THE STRING INTO THE DOCUMENT
$classname = "font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;";
$xpath = new DOMXPath($dom_document);
$results = $xpath->query("//*[@span=\"" . $classname . "\"]");
var_dump($results);
if ($results->length > 0) {
$date = $results->item(0)->nodeValue;
}
//libxml_use_internal_errors(false);
Answer 0 (score: 1)
Your $classname variable is misleadingly named: it does not hold a class name from the sample markup, but a CSS style declaration.
$classname = "font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;";
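Your query tests @span, an attribute named "span", which none of the elements have, so nothing matches. You should instead search for nodes whose style attribute has that value: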
$results = $xpath->query("//*[@style=\"" . $classname . "\"]");
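To check this end to end, here is a minimal, self-contained sketch. It assumes the markup looks exactly like the snippet in the question (the wrapping div and the variable names $html, $doc, $byStyle and $byIcon are mine, for illustration only). It also shows a second, optional approach that does not come from the answer above: selecting the span that follows the glyphicon-time icon. This may hold up better if the inline style string changes, but it relies on the date always being the first span after that icon.

```php
<?php
// Sample markup copied from the question, wrapped in a <div> for illustration.
// On the real page this string would come from $row['html'].
$html = <<<'HTML'
<div>
<span class="glyphicon glyphicon-comment" style="color:#ccc;"> </span>
<span style="font-family:'Open Sans', arial;font-size:11px!important;color:#ccc;">0</span>
<span class="glyphicon glyphicon-time" style="color:#ccc;"></span>
<span style="font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;">December 6, 2014</span>
<span style="font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;">2:00 am</span>
<span style="font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;">Hits(6)</span>
</div>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Approach 1 (from the answer): exact match on the style attribute.
// The comparison is character for character, so the first span, whose
// style has a space after "'Open Sans',", is not matched.
$style = "font-family:'Open Sans',arial;font-size:11px!important;color:#ccc;";
$byStyle = $xpath->query('//span[@style="' . $style . '"]');

$date = null;
if ($byStyle !== false && $byStyle->length > 0) {
    $date = trim($byStyle->item(0)->nodeValue);
}

// Approach 2 (an assumption, not part of the answer): take the first span
// that follows the clock icon, regardless of its inline style.
$byIcon = $xpath->query(
    "//span[contains(concat(' ', normalize-space(@class), ' '), ' glyphicon-time ')]"
    . "/following-sibling::span[1]"
);

$dateFromIcon = null;
if ($byIcon !== false && $byIcon->length > 0) {
    $dateFromIcon = trim($byIcon->item(0)->nodeValue);
}

var_dump($date, $dateFromIcon);
```

Because the style comparison is an exact string match, any difference in whitespace or property order in the real page will make the first query return nothing; in that case a contains(@style, ...) test or the icon-based query may be the more forgiving choice.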