Question

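I just ran a Scrapy spider that took about 2 hours to crawl (see the screenshot below), but I forgot to use the command-line option --logfile FILE (see the documentation linked below) to save the log output to a file. I would still like to do so, in order to track down some ERRORs that occurred during the crawl.

Is there any way to do this 'retroactively', without crawling for another 2 hours?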

https://doc.scrapy.org/en/latest/topics/logging.html#command-line-options

Answer 1

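There are a few options, but there is no way to retrieve the output after the process has finished, because bash (and other shells) do not record it.
You could try to copy it from your terminal, but you will only get the last few lines, since by default Unix terminals keep a limited scrollback history. There are ways to increase the scrollback, but that is generally not recommended; see the related question: https://askubuntu.com/questions/385901/how-to-see-more-lines-in-the-terminal
Instead, make sure to enable logging explicitly for future runs: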

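Linux output redirection:

    # stdout only (Scrapy writes its log to stderr by default, so this alone misses the log)
    scrapy crawl spider > output.log
    # both stdout and stderr (bash shorthand)
    scrapy crawl spider &> output.log
    # to a file and to the terminal
    scrapy crawl spider 2>&1 | tee output.log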
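To see how the three redirection forms differ without waiting on a real crawl, here is a minimal sketch. It assumes a hypothetical stand-in function, `fake_crawl`, in place of `scrapy crawl spider`; the function and the file names are for illustration only.

```shell
# Hypothetical stand-in for `scrapy crawl spider`:
# writes one line to stdout and one line to stderr.
fake_crawl() {
    echo "item line (stdout)"
    echo "log line (stderr)" >&2
}

# stdout only: the stderr line still goes to the terminal, not the file
fake_crawl > stdout_only.log

# stdout and stderr into the same file (portable spelling of bash's &>)
fake_crawl > both.log 2>&1

# stdout and stderr to a file and to the terminal at the same time
fake_crawl 2>&1 | tee tee.log
```

After running this, stdout_only.log holds only the stdout line, while both.log and tee.log also contain the stderr line. Since Scrapy logs to stderr unless configured otherwise, the second or third form is probably the one you want.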

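Scrapy way:

    scrapy crawl spider -s LOG_FILE=output.log
    scrapy crawl spider --logfile output.log

Scrapy supports the -s flag for overriding settings, so the LOG_FILE setting can be used here. It can also be set in the project itself (e.g. in the settings.py file) so that output always goes to a log file.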
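Once the log is in a file, the ERROR entries the question asks about can be pulled out with grep. This is a rough sketch, assuming Scrapy's default log layout, where the level appears as ` ERROR: ` after the logger name. The file name sample_output.log and the log lines written into it are made up for illustration and are not real crawl output.

```shell
# Made-up sample lines in the "<time> [<logger>] <LEVEL>: <message>" layout.
cat > sample_output.log <<'EOF'
2024-01-01 10:00:00 [scrapy.core.engine] INFO: Spider opened
2024-01-01 10:00:05 [scrapy.core.scraper] ERROR: Spider error processing <GET https://example.com/a>
2024-01-01 10:00:09 [scrapy.core.engine] INFO: Closing spider (finished)
EOF

# Count the ERROR entries in the log.
grep -c ' ERROR: ' sample_output.log

# Print each ERROR entry with its line number in the file.
grep -n ' ERROR: ' sample_output.log
```

For a real crawl, point grep at the file passed to --logfile or LOG_FILE. Tracebacks usually span several lines after the ERROR line, so adding `-A 20` to grep shows that context as well.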
