I need to parse some HTML that looks like this:
<div id="item-1" class="genclass" itemscope itemtype="http://schema.org/Book">
    <meta itemprop="isbn" content="XXXXXXXXXXXX" />
    <meta itemprop="name" content="Neverending Story" />
    <meta itemprop="author" content="Michael Ender" />
    <meta itemprop="publisher" content="MyBooks" />
    <meta itemprop="datePublished" content="1991" />
    <h2 itemprop="offers" itemscope itemtype="http://schema.org/Offer">
        <meta itemprop="price" content="6.6" />
        <meta itemprop="priceCurrency" content="USD" />
    </h2>
</div>
So this is what I am trying:
libxml_use_internal_errors(TRUE);
$dom->loadHTMLFile($url);
libxml_clear_errors();
foreach($dom->find("meta[itemprop='isbn']") as $books){
    switch ($books->itemprop) {
        case 'isbn':
            $line['isbn'] = $books->content;
            break;
        case 'name':
            $line['name'] = $books->content;
            break;
        case 'author':
            $line['author'] = $books->content;
            break;
        case 'publisher':
            $line['publisher'] = $books->content;
            break;
        case 'datePublished':
            $line['datePublished'] = $books->content;
            break;
        case 'price':
            $line['price'] = $books->content;
            break;
        default:
            break;
    }
}
print_r($books);
But the result is always blank. What exactly am I doing wrong? I have also tried get_meta_tags and other approaches...
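
For reference, here is a minimal sketch of what the loop above seems to be aiming for, written against PHP's built-in DOMDocument and DOMXPath only. It rests on a few assumptions: that `$dom` is meant to be a DOMDocument (which has no `find()` method; that method comes from the separate Simple HTML DOM library), that the selector should match every `meta` with an `itemprop` rather than only `isbn`, and that attributes are read with `getAttribute()`. The `$wanted` list and the inline `$html` sample are illustrative names; a real page would be loaded with `loadHTMLFile($url)` instead.

```php
<?php
// Sample markup copied from the question, used here instead of a remote URL.
$html = <<<'HTML'
<div id="item-1" class="genclass" itemscope itemtype="http://schema.org/Book">
    <meta itemprop="isbn" content="XXXXXXXXXXXX" />
    <meta itemprop="name" content="Neverending Story" />
    <meta itemprop="author" content="Michael Ender" />
    <meta itemprop="publisher" content="MyBooks" />
    <meta itemprop="datePublished" content="1991" />
    <h2 itemprop="offers" itemscope itemtype="http://schema.org/Offer">
        <meta itemprop="price" content="6.6" />
        <meta itemprop="priceCurrency" content="USD" />
    </h2>
</div>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Hypothetical whitelist mirroring the cases of the original switch.
$wanted = ['isbn', 'name', 'author', 'publisher', 'datePublished', 'price'];
$line = [];

// Select every <meta> element that carries an itemprop attribute.
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//meta[@itemprop]') as $meta) {
    $prop = $meta->getAttribute('itemprop');
    if (in_array($prop, $wanted, true)) {
        $line[$prop] = $meta->getAttribute('content');
    }
}

// Print the collected values rather than the loop variable.
print_r($line);
```

Note that the sketch prints `$line`, the array being filled, whereas the original prints `$books`, which only ever holds the last element visited by the loop.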