我正在尝试检索包含在SimpleXMLElement中的一些元数据。我正在使用XPATH,我很难获得我感兴趣的价值。
以下是网页标题的摘录(来自:http://www.wayfair.de/CleverFurn-Couchtisch-Abby-69318X2-MFE2223.html)
您知道如何检索包含以下内容的数组中的所有xmlns数据:
1)og:类型 2)og:url 3)og:图像 .... x)og:upc
<meta xmlns:og="http://opengraphprotocol.org/schema/" property="og:title" content="CleverFurn Couchtisch "Abby"" />
这是我的PHP代码
<?php
$html = file_get_contents("http://www.wayfair.de/CleverFurn-Couchtisch-Abby-69318X2-MFE2223.html");
$doc = new DOMDocument();
$doc->strictErrorChecking = false;
$doc->recover=true;
@$doc->loadHTML("<html><body>".$html."</body></html>");
$xpath = new DOMXpath($doc);
$elements = $xpath->query("//*/meta[@property='og:url']");
if (!is_null($elements)) {
foreach ($elements as $element) {
echo "<br/>[". $element->nodeName. "]";
var_dump($element);
$nodes = $element->childNodes;
foreach ($nodes as $node) {
echo $node->nodeValue. "\n";
}
}
}
?>
答案 0 :(得分:1)
刚刚找到答案:
How to get Open Graph Protocol of a webpage by php?
<?php
$html = file_get_contents("http://www.wayfair.de/CleverFurn-Couchtisch-Abby-69318X2-MFE2223.html");
libxml_use_internal_errors(true); // Yeah if you are so worried about using @ with warnings
$doc = new DomDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query = '//*/meta[starts-with(@property, \'og:\')]';
$metas = $xpath->query($query);
foreach ($metas as $meta) {
$property = $meta->getAttribute('property');
$content = $meta->getAttribute('content');
$rmetas[$property] = $content;
}
var_dump($rmetas);
?>