Question

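I'm trying to scrape an entire page with PJScrape and save it to a JSON file. When I run the code below, I can see the whole DOM on standard output, but no scrape_output.json file appears in the current directory.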

pjs.addSuite({
    // single URL or array
    url: 'http://en.wikipedia.org/wiki/List_of_towns_in_Vermont',
    // single function or array, evaluated in the client
    scraper: function() {
        return $(document).text();
    },

    // options: 'json' or 'csv'
    format: 'json',
    // options: 'stdout' or 'file' (set in config.outFile)
    writer: 'file',
    outFile: 'scrape_output.json'
});

Answer 1

I figured it out. Logging and the output settings (format, writer, outFile) are configured in pjs.config, not in the suite:

pjs.addSuite({
    // single URL or array
    url: 'http://en.wikipedia.org/wiki/List_of_towns_in_Vermont',
    // single function or array, evaluated in the client
    scraper: function() {
        return $(document).text();
    }
});

pjs.config({
    // options: 'stdout', 'file' (set in config.logFile) or 'none'
    log: 'stdout',
    // options: 'json' or 'csv'
    format: 'json',
    // options: 'stdout' or 'file' (set in config.outFile)
    writer: 'file',
    outFile: 'scrape_output.json'
});
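
To make the difference between the two snippets concrete, here is a minimal sketch of the idea in plain JavaScript. It uses a hypothetical stand-in object (`makeFakePjs` and `outputTarget` are made up for illustration and are not part of pjscrape), and it assumes that the writer only looks at the global settings, which is what the behaviour above suggests.

```javascript
// Hypothetical stand-in for pjscrape's global `pjs` object, for illustration only.
// This is NOT pjscrape's real implementation.
function makeFakePjs() {
    // assumed defaults: print to stdout unless told otherwise
    var settings = { writer: 'stdout', format: 'json', outFile: null };
    var suites = [];
    return {
        // global settings: merged into one shared object
        config: function(opts) {
            for (var key in opts) {
                if (Object.prototype.hasOwnProperty.call(opts, key)) {
                    settings[key] = opts[key];
                }
            }
        },
        // per-suite settings: stored with the suite, never merged into `settings`
        addSuite: function(suite) {
            suites.push(suite);
        },
        // where the output would go, decided from the global settings only
        outputTarget: function() {
            return settings.writer === 'file' ? settings.outFile : 'stdout';
        }
    };
}

var pjs = makeFakePjs();

// writer/outFile given only on the suite, as in the question
pjs.addSuite({ url: 'http://example.com/', writer: 'file', outFile: 'scrape_output.json' });
console.log(pjs.outputTarget()); // prints "stdout"

// writer/outFile given globally, as in the answer
pjs.config({ writer: 'file', outFile: 'scrape_output.json' });
console.log(pjs.outputTarget()); // prints "scrape_output.json"
```

In this model the keys placed on the suite are simply carried along and never consulted when choosing the output target, which would explain why the question's version kept printing to stdout.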
