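I am trying to extract basic information, such as the title and description, from various websites. I can successfully get a response object, but what I get is the entire HTML of the site. I have been experimenting with the options of the HTTP.call method, but I can't figure out how to return only the content I want from the response object. These are the two elements I am after: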
<meta property="og:description" content="Our unique teaching style lets students develop their creative potential while learning solid computing skills.">
and
<meta property="og:site_name" content="Goldsmiths, University of London">
I can easily get the title by searching the result for <title></title>, but there must be a better way using params or data in the call method's options.
Meteor.methods({
  getInfo: function (url) {
    HTTP.call('GET', url, {}, function (error, result) {
      if (!error) {
        //console.log(result);
        // Lower-case the HTML once so the tag search is case-insensitive.
        var lower = result.content.toLowerCase(),
            titleStart = lower.indexOf('<title>'),
            titleEnd = lower.indexOf('</title>'),
            titleText = result.content.substring(titleStart + '<title>'.length, titleEnd);
      }
    });
  }
});
Answer 0 (score: 1)
Take a look at the meteor-scrape package.
https://github.com/Anonyfox/meteor-scrape
Here is an example of how you can use it:
# scrape any website
websiteData = Scrape.website "http://example.com/article"
Result:
{
  title: 'The Avengers (2012 film)'
  lang: 'en'
  descriptions: [ '2012 superhero film produced by Marvel Studios' ]
  tags: [ 'avengers' ]
  url: 'http://en.wikipedia.org/wiki/The_Avengers_(2012_film)'
  summary: '<p><i><b>Marvel\'s The Avengers</b></i> (classified under the name <i><b>Marvel Avengers Assemble</b></i> in the United Kingdom and Ireland), or simply <i><b>The Avengers</b></i>, is a 2012 American superhero film based on the Marvel Comics superhero team of the same name, produced by Marvel Studios and distributed by Walt Disney Studios Motion Pictures.<sup class="reference plainlinks nourlexpansion" id="ref_1">1</sup> It is the sixth installment in the Marvel Cinematic Universe. The film was written [...]'
  meta:
    caption: 'Theatrical release poster'
    director: '[Joss Whedon](http://en.wikipedia.org/wiki/Joss_Whedon)'
    producer: '[Kevin Feige](http://en.wikipedia.org/wiki/Kevin_Feige)'
    screenplay: 'Joss Whedon'
    based: '[The Avengers](http://en.wikipedia.org/wiki/Avengers_(comics))'
    music: '[Alan Silvestri](http://en.wikipedia.org/wiki/Alan_Silvestri)'
    cinematography: '[Seamus McGarvey](http://en.wikipedia.org/wiki/Seamus_McGarvey)'
    studio: '[Marvel Studios](http://en.wikipedia.org/wiki/Marvel_Studios)'
    runtime: '143 minutes'
    country: 'United States'
    language: 'English'
    budget: '$220 million'
    gross: '$1.518 billion'
}
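If you would rather not add a package, note that the params and data options of HTTP.call only shape the request that is sent; they do not filter the response, so the HTML still has to be parsed on your side. Below is a minimal sketch of pulling the og: tags from the question out of result.content. The helper name extractMetaProperties is hypothetical (it is not part of Meteor or meteor-scrape), and the regular expressions assume quoted attribute values with no '>' character inside them, so treat it as a starting point, not a full HTML parser.

```javascript
// Hypothetical helper: collects <meta property="..." content="..."> pairs
// from an HTML string into a plain object keyed by the property name.
// Assumption: attribute values are quoted and contain no '>' character.
function extractMetaProperties(html) {
  var found = {};
  // Grab every <meta ...> tag, regardless of letter case.
  var tags = html.match(/<meta\b[^>]*>/gi) || [];

  tags.forEach(function (tag) {
    var attrs = {};
    // Match name="value" or name='value' pairs inside the tag.
    var attrPattern = /([a-zA-Z:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
    var match;
    while ((match = attrPattern.exec(tag)) !== null) {
      var value = match[2] !== undefined ? match[2] : match[3];
      attrs[match[1].toLowerCase()] = value;
    }
    // Keep only tags that carry both a property and a content attribute.
    if (attrs.property !== undefined && attrs.content !== undefined) {
      found[attrs.property] = attrs.content;
    }
  });

  return found;
}

// Sample input built from the two tags quoted in the question.
var sampleHtml =
  '<head><title>Computing</title>' +
  '<meta property="og:description" content="Our unique teaching style lets students develop their creative potential while learning solid computing skills.">' +
  '<meta property="og:site_name" content="Goldsmiths, University of London">' +
  '</head>';

var info = extractMetaProperties(sampleHtml);
console.log(info['og:site_name']);
console.log(info['og:description']);
```

Inside the getInfo method from the question, the same helper could be called as extractMetaProperties(result.content) once the request succeeds. For pages with unusual markup, a real HTML parser or the package above is likely to be more reliable than regular expressions.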