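I'm building a news scraper app in JS that collects articles and their descriptions.
Below is the scraper. However, when it pulls in the description, it also picks up the "Read More" text from the link. I want to keep the excerpt but remove "Read More" at the end: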
app.get("/scraper", function(req, res) {
  // Grabs the body of the html with request
  request("https://techcrunch.com/", function(error, response, html) {
    // Load into cheerio with $ as a shorthand selector
    var $ = cheerio.load(html);
    // Grabs the title, description, and link within the block-content class.
    $(".block-content").each(function(i, element) {
      // Save an empty result object
      var result = {};
      // Saves them as properties of the result object
      result.title = $(this).find(".post-title").children("a").text();
      result.link = $(this).find("a").children(".excerpt").attr("href");
      result.description = $(this).find(".excerpt").text();
      console.log(result);
      if (result.title && result.link && result.description) {
        // Creates a new entry using the article model
        var entry = new Article(result);
        // Saves that entry to the db
        entry.save(function(err, doc) {
          // Log any errors
          if (err) {
            console.log(err);
          }
        });
      }
    });
  });
});
Here is the resulting object; you can see "Read More" at the end of the description.
{ title: 'Snapcart raises $10M to shed light on consumer spending in emerging markets',
link: 'https://techcrunch.com/2017/10/25/snapcart-raises-10m/',
description: 'Taking on a giant like the $15 billion research firm Nielsen is no easy task. But tucked away in Southeast Asia, Snapcart is a two-year old company that is making progress by shining light on the black box that is consumer spending in emerging markets. Read More' }
Answer 0 (score: 0)
result.description =$(this).find(".excerpt").text().replace(' Read More', '');
Calling replace with a string pattern removes the first occurrence of " Read More" from the description.
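Because a string pattern matches the first occurrence anywhere in the text, an excerpt that happens to contain " Read More" earlier in its body would lose the wrong piece. A minimal sketch of a stricter variant, assuming the label is always the literal text "Read More" at the very end of the excerpt; stripReadMore is a hypothetical helper name, not part of the original code:

```javascript
// Hypothetical helper (not from the original post): removes a trailing
// "Read More" label and any whitespace around it, and nothing else.
function stripReadMore(text) {
  // The $ anchor ties the match to the end of the string, so an earlier
  // "Read More" inside the excerpt body is left untouched.
  return text.replace(/\s*Read More\s*$/, "").trim();
}

var raw = "Snapcart is a two-year old company. Read More";
console.log(stripReadMore(raw)); // prints "Snapcart is a two-year old company."
```

In the scraper this would be used as result.description = stripReadMore($(this).find(".excerpt").text());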