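When you post a link on Facebook, it scrapes the article's title, description and a relevant image. Most major sites have the required OG tags, so this information is easy to get, but FB can also handle sites that don't have them (you can try it here).
Clearly they have built a system that can scrape this information without OG tags. Does anyone know whether there is an open-source version? I think it would need (in order of priority for each part):
Title:
Description:
Image:
Thanks a lot!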
Answer 0 (score: 0)
The htmlcarve module for Node.js (https://github.com/Anonyfox/node-htmlcarve) does most of what you're after. Here is the code that generates the output from this page:
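// Load the module (install it with npm first).
const htmlcarve = require('htmlcarve');

// Fetch the page and print the extracted metadata as formatted JSON.
htmlcarve.fromUrl('https://scotch.io/tutorials/using-mongoosejs-in-node-js-and-mongodb-applications', function (error, data) {
  if (error) { console.error(error); return; }
  console.log(JSON.stringify(data, null, 2));
});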
This produces:
{
  "source": {
    "html_meta": {
      "title": "Easily Develop Node.js and MongoDB Apps with Mongoose ⥠Scotch",
      "summary": "",
      "image": "/wp-content/themes/thirty/img/scotch-logo.png",
      "language": "en-US",
      "feed": "https://scotch.io/feed",
      "favicon": "https://scotch.io/wp-content/themes/thirty/img/icons/favicon-57.png",
      "author": "Chris Sevilleja"
    },
    "open_graph": {
      "title": "Easily Develop Node.js and MongoDB Apps with Mongoose",
      "summary": "",
      "image": "https://scotch.io/wp-content/uploads/2014/11/mongoosejs-node-mongodb-applications.png"
    },
    "twitter_card": {
      "title": "Easily Develop Node.js and MongoDB Apps with Mongoose",
      "summary": "",
      "author": "sevilayha"
    }
  },
  "result": {
    "title": "Easily Develop Node.js and MongoDB Apps with Mongoose",
    "summary": "",
    "image": "https://scotch.io/wp-content/uploads/2014/11/mongoosejs-node-mongodb-applications.png",
    "author": "sevilayha",
    "language": "en-US",
    "feed": "https://scotch.io/feed",
    "favicon": "https://scotch.io/wp-content/themes/thirty/img/icons/favicon-57.png"
  },
  "links": {
    "deep": "https://scotch.io/tutorials/using-mongoosejs-in-node-js-and-mongodb-applications",
    "shallow": "https://scotch.io/tutorials/using-mongoosejs-in-node-js-and-mongodb-applications",
    "base": "https://scotch.io"
  }
}
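In the output above, "result" looks like a merge of the three "source" blocks in which the first non-empty value for each field wins. The following is only a rough sketch of that idea, not htmlcarve's actual code: the function name mergeByPriority, the PRIORITY order, and the sample data are all assumptions made for illustration. The order was chosen because it is consistent with the sample output, but it has not been checked against the module's source.

```javascript
// Hypothetical helper, NOT part of htmlcarve: merge metadata sources by priority.
// The order below is an assumption that happens to match the sample output.
const PRIORITY = ['open_graph', 'twitter_card', 'html_meta'];

function mergeByPriority(source, base) {
  const result = {};
  for (const name of PRIORITY) {
    const fields = source[name] || {};
    for (const [key, value] of Object.entries(fields)) {
      // Keep the first non-empty value seen for each field.
      if (value && !(key in result)) {
        result[key] = value;
      }
    }
  }
  // Resolve a relative image path against the site's base URL.
  if (result.image && base) {
    result.image = new URL(result.image, base).href;
  }
  return result;
}

// Made-up sample data, shaped like the "source" block above.
const source = {
  html_meta: { title: 'Example Page | Example Site', summary: '', image: '/img/logo.png', author: 'Jane Doe' },
  open_graph: { title: 'Example Page', summary: '' },
  twitter_card: { title: 'Example Page', summary: '', author: 'janedoe' },
};

console.log(mergeByPriority(source, 'https://example.com'));
```

With this sample data the title comes from the Open Graph block, the author from the Twitter card, and the image falls back to the plain HTML metadata, with its relative path resolved against the base URL. The summary is left out because every source has it empty.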
If you have Node.js installed, install it with npm i -g htmlcarve and you can run it directly from the command line.