使用PHP从URL获取“property og”元标记

时间:2012-02-11 13:21:43

标签: php url fetch opengraph meta-tags

我想创建一个类似于Facebook使用的发布功能(您将链接粘贴到文本框中,点击帖子并发布标题,描述和图像)。我意识到最好提取具有og属性的元标记,例如“og:title”和“og:image”,因为如果我使用普通标记,有时它们会有换行符和其他类似的东西,它会出错

有没有办法使用PHP获取这些标签的内容,但没有AJAX或其他自定义解析器?起点是:

<?php

$url = $_POST['link'];

?>

我们通过POST方法获取上一页的URL,但是如何完成剩下的工作?

5 个答案:

答案 0 :(得分:10)

解决方案是:

libxml_use_internal_errors(true);
$c = file_get_contents("http://url/here");
$d = new DomDocument();
$d->loadHTML($c);
$xp = new domxpath($d);
foreach ($xp->query("//meta[@property='og:title']") as $el) {
    echo $el->getAttribute("content");
}
foreach ($xp->query("//meta[@property='og:description']") as $el) {
    echo $el->getAttribute("content");
}

答案 1 :(得分:5)

使用类似下面的内容:

libxml_use_internal_errors(true); // Yeah if you are so worried about using @ with warnings
$doc = new DomDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query = '//*/meta[starts-with(@property, \'og:\')]';
$metas = $xpath->query($query);
foreach ($metas as $meta) {
    $property = $meta->getAttribute('property');
    $content = $meta->getAttribute('content');
    $rmetas[$property] = $content;
}
var_dump($rmetas);

How to get Open Graph Protocol of a webpage by php?上找到了这一点 - 搜索很有帮助,Google也是如此!

http://www.google.co.uk/search?q=meta+property+og+tags

答案 2 :(得分:3)

使用此:https://github.com/baj84/MetaData

简单而有效。

$metaData = MetaData::fetch($url);
var_dump($metaData->tags());

答案 3 :(得分:1)

我们使用 Apache Tika 通过php(命令行实用程序)和-j为json:

http://tika.apache.org/

<?php
    shell_exec( 'java -jar tika-app-1.4.jar -j http://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying' );
?>

这是来自随机监护文章的示例输出

{
   "Content-Encoding":"UTF-8",
   "Content-Length":205599,
   "Content-Type":"text/html; charset\u003dUTF-8",
   "DC.date.issued":"2013-07-21",
   "X-UA-Compatible":"IE\u003dEdge,chrome\u003d1",
   "application-name":"The Guardian",
   "article:author":"http://www.guardian.co.uk/profile/nicholaswatt",
   "article:modified_time":"2013-07-21T22:42:21+01:00",
   "article:published_time":"2013-07-21T22:00:03+01:00",
   "article:section":"Politics",
   "article:tag":[
      "Lynton Crosby",
      "Health policy",
      "NHS",
      "Health",
      "Healthcare industry",
      "Society",
      "Public services policy",
      "Lobbying",
      "Conservatives",
      "David Cameron",
      "Politics",
      "UK news",
      "Business"
   ],
   "content-id":"/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "dc:title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "description":"Exclusive: Firm he founded, Crosby Textor, advised private healthcare providers how to exploit NHS \u0027failings\u0027",
   "fb:app_id":180444840287,
   "keywords":"Lynton Crosby,Health policy,NHS,Health,Healthcare industry,Society,Public services policy,Lobbying,Conservatives,David Cameron,Politics,UK news,Business,Politics",
   "msapplication-TileColor":"#004983",
   "msapplication-TileImage":"http://static.guim.co.uk/static/a314d63c616d4a06f5ec28ab4fa878a11a692a2a/common/images/favicons/windows_tile_144_b.png",
   "news_keywords":"Lynton Crosby,Health policy,NHS,Health,Healthcare industry,Society,Public services policy,Lobbying,Conservatives,David Cameron,Politics,UK news,Business,Politics",
   "og:description":"Exclusive: Firm he founded, Crosby Textor, advised private healthcare providers how to exploit NHS \u0027failings\u0027",
   "og:image":"https://static-secure.guim.co.uk/sys-images/Guardian/Pix/pixies/2013/7/21/1374433351329/Lynton-Crosby-008.jpg",
   "og:site_name":"the Guardian",
   "og:title":"Tory strategist Lynton Crosby in new lobbying row",
   "og:type":"article",
   "og:url":"http://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "resourceName":"tory-strategist-lynton-crosby-lobbying",
   "title":"Tory strategist Lynton Crosby in new lobbying row | Politics | The Guardian",
   "twitter:app:id:googleplay":"com.guardian",
   "twitter:app:id:iphone":409128287,
   "twitter:app:name:googleplay":"The Guardian",
   "twitter:app:name:iphone":"The Guardian",
   "twitter:app:url:googleplay":"guardian://www.guardian.co.uk/politics/2013/jul/21/tory-strategist-lynton-crosby-lobbying",
   "twitter:card":"summary_large_image",
   "twitter:site":"@guardian"
}

答案 4 :(得分:0)

试试这个.. 它对我有用..

@interface SubPerson : Person

- (void)someMethod;

@end