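I'm trying to get the links to all the posts on a blog (https://www.mrmoneymustache.com) so that I can compile them into a PDF, but I'm a complete noob at JavaScript. Someone on reddit told me to use this code, which should do what I want: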
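const fs = require('fs');
const EventEmitter = require('events').EventEmitter;
const fetch = require('node-fetch'); // require() needs node-fetch v2; v3 is ESM-only
const cheerio = require('cheerio');

const e = new EventEmitter();

// Fetch one post, save its content to disk, then schedule the next post.
e.on('fetchPage', link => {
    fetch(link).then(r => r.text()).then(cheerio.load).then($ => {
        const nextLink = $(".next_post a").attr('href');
        const postTitle = $(".headline").text();
        const postContent = $(".post_content").html();
        console.log(postTitle);
        // Save before checking for the last page so the final post is written too.
        fs.writeFileSync(postTitle + ".html", postContent);
        if (nextLink === undefined) return; // end on final page
        // Wait 5 seconds between requests to go easy on the server.
        setTimeout(() => e.emit('fetchPage', nextLink), 5000);
    }).catch(err => console.error('Failed to fetch ' + link + ':', err));
});

// Start the chain; replace this placeholder with the URL of the first post.
e.emit('fetchPage', 'https://whatever/post1');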
But I really don't understand how I'm supposed to run this program. Can anyone please help?
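
A possible pitfall in the code above: the post title is used directly as a file name, so a title containing characters such as `/`, `:` or `?` could make `fs.writeFileSync` fail or write to an unexpected path. Below is a minimal sketch of a sanitizing helper. The name `safeFileName` and the `'untitled'` fallback are hypothetical and not part of the code from reddit, and the set of replaced characters is an assumption that may need adjusting for your system.

```javascript
// Hypothetical helper (not part of the original script): turn a post title
// into a string that is safer to use as a file name.
function safeFileName(title) {
    const cleaned = title
        .replace(/[\\/:*?"<>|]/g, '-') // replace characters commonly rejected in file names
        .replace(/\s+/g, ' ')          // collapse runs of whitespace into one space
        .trim();                       // drop leading and trailing whitespace
    // Fall back to a fixed name if nothing usable is left.
    return cleaned.length > 0 ? cleaned : 'untitled';
}

console.log(safeFileName('Rent vs. Buy: Which?'));
```

In the script, this would mean writing to `safeFileName(postTitle) + ".html"` instead of `postTitle + ".html"`.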