I've started working on a new Scrapy project. So far, I have:
class ContactSpider(Spider):
    name = "contact"
    allowed_domains = ["http://www.domain.com/"]
    start_urls = [
        "http://web.domain.com/DECORATION"
    ]

    def start_requests(self, response):
        l = response.selector.xpath('//*[@id="ListingResults"]/text()').extract()
        print(l)
When I run it, I get:
Unhandled error in Deferred:
2016-08-17 12:37:16 [twisted] CRITICAL: Unhandled error in Deferred:
Traceback (most recent call last):
File "Hlib\site-packages\scrapy\commands\crawl.py", line 57, in run
self.crawler_process.crawl(spname, **opts.spargs)
File "C:\lib\site-packages\scrapy\crawler.py", line 163, in crawl
return self._crawl(crawler, *args, **kwargs)
File "C:\lib\site-packages\scrapy\crawler.py", line 167, in _crawl
d = crawler.crawl(*args, **kwargs)
File "C:\lib\site-packages\twisted\internet\defer.py", line 1274, in unwindGenerator
return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
File "C:\lib\site-packages\twisted\internet\defer.py", line 1128, in _inlineCallbacks
result = g.send(result)
File "C:\lib\site-packages\scrapy\crawler.py", line 90, in crawl
six.reraise(*exc_info)
File "C:\lib\site-packages\scrapy\crawler.py", line 73, in crawl
start_requests = iter(self.spider.start_requests())
exceptions.TypeError: start_requests() takes exactly 2 arguments (1 given)
2016-08-17 12:37:16 [twisted] CRITICAL:
Traceback (most recent call last):
File "C:\lib\site-packages\twisted\internet\defer.py", line 1128, in _inlineCallbacks
result = g.send(result)
File "C:\lib\site-packages\scrapy\crawler.py", line 90, in crawl
six.reraise(*exc_info)
File "C:\lib\site-packages\scrapy\crawler.py", line 73, in crawl
start_requests = iter(self.spider.start_requests())
TypeError: start_requests() takes exactly 2 arguments (1 given)
Unhandled error in Deferred:
2016-08-17 12:37:16 [twisted] CRITICAL: Unhandled error in Deferred:
What am I doing wrong?
Answer 0 (score: 3)
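start_requests is a method of scrapy.Spider, and it takes no arguments other than self. Scrapy calls it with no response (as the traceback shows: self.spider.start_requests()), because its job is to create the initial Requests. It should therefore yield Request objects (or return a list of Requests) rather than process a response.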
def start_requests(self,response):
should be:
def start_requests(self):