Getting meta data with Simple HTML DOM 1.5 (2014) in PHP

Date: 2014-05-21 18:51:19

Tags: php simple-html-dom

I am trying to get the meta description/keywords of google.com, but I end up with an empty array.

<?php
include "simple_html_dom.php";
$url = isset($_POST['url']) ? $_POST['url'] : ''; // this would be http://www.google.com
if(!empty($url) && @file_get_contents($url) == true) {
    $html = new simple_html_dom();
    $html->load_file($url); //put url or filename in place of xxx
    $title = $html->find('title', 0)->plaintext;
    //echo $title;

    $descr = $html->find("meta[name='description']", 0);
    var_dump($descr); // NULL

}
?>

$title comes back fine, but the description is the problem, and I don't understand why. I also tried:

$descr = $html->find("meta[name='description']", 0)->content;

which results in Notice: Trying to get property of non-object

$descr = $html->find("meta[name='description']", 0)->attr('content');

which results in Fatal error: Call to a member function attr() on a non-object

$descr = $html->find("meta[name='description']", 0)->getAttribute('content');

which results in Fatal error: Call to a member function getAttribute() on a non-object

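I think all of these errors occur because the meta description cannot be found, even though, if you view the source of Google.com, it is the first thing you see after the head tag. Please help me with this; I am a newbie with Simple HTML DOM. Thank you very much.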

2 answers:

Answer 0: (score: 1)

You can get the keywords like this:

// $remote_html is assumed to hold the page's HTML as a string, e.g. from file_get_contents($url)
$oHTML = str_get_html( $remote_html );
$arElements = $oHTML->find( "meta[name=keywords]" );
echo $arElements[0]->content;
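
If only the name/content pairs are needed, PHP's built-in get_meta_tags() may be worth a try as an alternative to a parsing library. This is a different technique from the Simple HTML DOM call above, shown here only as a sketch. The sample HTML and the temporary file are made up so the example runs offline; with a real page you would pass the URL or file path directly.

```php
<?php
// Made-up sample page, written to a temporary file so the sketch runs offline.
$sample = '<html><head><title>Example</title>'
        . '<meta name="description" content="A sample page">'
        . '<meta name="keywords" content="php, html, meta">'
        . '</head><body></body></html>';
$path = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($path, $sample);

// get_meta_tags() returns an array keyed by the lowercased name attribute.
$tags = get_meta_tags($path);
unlink($path);

// Use isset() because a page may not define a given tag at all.
echo isset($tags['keywords']) ? $tags['keywords'] : 'no keywords found';
echo "\n";
echo isset($tags['description']) ? $tags['description'] : 'no description found';
echo "\n";
```

Note that get_meta_tags() only looks at tags in the head section, so it will not see meta tags that a page emits elsewhere.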

Answer 1: (score: 1)

This should give you what you want:

<?php
include "simple_html_dom.php";
$url = isset($_POST['url']) ? $_POST['url'] : ''; // this would be http://www.google.com
if(!empty($url)) {
    $html = file_get_html($url);
    $title = $html->find('title', 0)->plaintext;
    echo $title . "\n";

    $descr = $html->find("meta[name='description']", 0);
    echo $descr . "\n";

}
?>

Output:

Google
<meta content="Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking for." name="description">
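
The output above is the whole meta element rather than its text, and the errors in the question all come from calling a property or method on the null that find() returns when nothing matches. The sketch below shows the same idea (check for null, then read the content attribute) using PHP's built-in DOM extension instead of Simple HTML DOM, so that it runs without the library. The helper name meta_content and the sample HTML are hypothetical and not part of the thread.

```php
<?php
// Hypothetical helper (not from the thread): returns the content attribute of
// <meta name="..."> in an HTML string, or null when no such tag exists.
function meta_content(string $html, string $name): ?string
{
    $doc = new DOMDocument();
    // Suppress the warnings that real-world, non-validating HTML tends to trigger.
    @$doc->loadHTML($html);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Compare the name attribute case-insensitively.
        if (strcasecmp($meta->getAttribute('name'), $name) === 0) {
            return $meta->getAttribute('content');
        }
    }
    return null; // no match: the caller has to handle this case
}

// Made-up sample page with a description but no keywords tag.
$sample = '<html><head><title>Example</title>'
        . '<meta name="description" content="A sample page">'
        . '</head><body></body></html>';

$descr = meta_content($sample, 'description');
echo ($descr !== null ? $descr : 'no description found') . "\n";

$keywords = meta_content($sample, 'keywords');
echo ($keywords !== null ? $keywords : 'no keywords found') . "\n";
```

With Simple HTML DOM the equivalent guard would be to store the result of find("meta[name='description']", 0) in a variable and read ->content only when that variable is not null.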