Question

I am trying to start a Scrapy spider from my script, as shown here.

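import logging

from scrapy.crawler import CrawlerProcess
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings

logging.basicConfig(
    filename='log.txt',
    format='%(levelname)s: %(message)s',
    level=logging.CRITICAL
)
configure_logging(install_root_handler=False)
process = CrawlerProcess(get_project_settings())

process.crawl('1740')
process.start() # the script will block here until the crawling is finished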

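I want to configure the logging level for my spider, but even though I do not install the root logger handler and set up my own basic configuration with logging.basicConfig, the level I specified is not respected.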

INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware']
INFO: Enabled item pipelines: ['collector.pipelines.CollectorPipeline']
INFO: Spider opened
INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)

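The output follows the format and filename set in basicConfig, but not the logging level. I do not set the logging level anywhere other than this place.

Note: there is nowhere else in my code where I import logging or change the logging level.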

Answer 1

For Scrapy itself, you should define the logging settings in settings.py, as described in the docs.

So in settings.py you can set:

LOG_LEVEL = 'ERROR'  # to only display errors
LOG_FORMAT = '%(levelname)s: %(message)s'
LOG_FILE = 'log.txt'
