I have the following code:
#import necessary packages
import os
from scrapy.selector import Selector
from scrapy.contrib.exporter import CsvItemExporter
from scrapy.item import Item, Field
from scrapy.settings import Settings
from scrapy.settings import default_settings
from selenium import webdriver
from urlparse import urlparse
import csv
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy import log
#set maximum DEPTH_LIMIT to 3
default_settings.DEPTH_LIMIT = 3
.....
.....
.....
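The spider works beautifully, but for some reason it crawls pages deeper than depth 3. How can I limit the depth so that the spider never goes past depth 3? As shown above, I tried to control the depth by assigning to default_settings.DEPTH_LIMIT, but that has no effect.... Thanks.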
Answer 0 (score: 2)
For newer versions of Scrapy, set the limit on the spider itself:
import scrapy

class MySpider(scrapy.Spider):
    name = 'myspider'
    # Per-spider settings; these override the project-wide values.
    custom_settings = {
        'DEPTH_LIMIT': 3,
    }
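To make it clearer what the limit actually cuts off, here is a minimal pure-Python sketch of a depth-limited crawl. It is not Scrapy's implementation; the LINKS graph and the crawl function are hypothetical names made up for illustration. It only assumes the documented behaviour that start pages sit at depth 0, each followed link adds 1, and a limit of 0 means no limit.

```python
from collections import deque

# Hypothetical link graph standing in for a website: page -> pages it links to.
LINKS = {
    "start": ["a"],
    "a": ["b"],
    "b": ["c"],
    "c": ["d"],
    "d": [],
}


def crawl(start, depth_limit):
    """Breadth-first crawl that drops links deeper than depth_limit."""
    seen = {start}
    queue = deque([(start, 0)])  # the start page is at depth 0
    visited = []
    while queue:
        page, depth = queue.popleft()
        visited.append((page, depth))
        for link in LINKS.get(page, []):
            next_depth = depth + 1
            # A limit of 0 is treated as "no limit" in this sketch.
            if depth_limit and next_depth > depth_limit:
                continue
            if link not in seen:
                seen.add(link)
                queue.append((link, next_depth))
    return visited


print(crawl("start", 3))
```

With a limit of 3 the page "d" (depth 4) is never visited, while pages at depths 0 through 3 still are. In a real spider the current depth should be readable in a callback as response.meta['depth'], which is a handy way to check that the setting took effect.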
Answer 1 (score: 1)