How to fix: TypeError: can't pickle Selector objects

Date: 2019-01-09 06:35:47

Tags: python pickle scrapy-spider

Hi, I want to change my CrawlSpider into a RedisCrawlSpider, but then I get this error:

2019-01-09 10:28:54 [twisted] CRITICAL: Unhandled error in Deferred:
2019-01-09 10:28:54 [twisted] CRITICAL:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\twisted\internet\task.py", line 517, in _oneWorkUnit
result = next(self._iterator)
File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\utils\defer.py", line 63, in 
work = (callable(elem, *args, **named) for elem in iterable)
File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\core\scraper.py", line 183, in _process_spidermw_output
self.crawler.engine.crawl(request=output, spider=spider)
File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\core\engine.py", line 210, in crawl
self.schedule(request, spider)
File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\core\engine.py", line 216, in schedule
if not self.slot.scheduler.enqueue_request(request):
File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy_redis\scheduler.py", line 167, in enqueue_request
self.queue.push(request)
File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy_redis\queue.py", line 99, in push
data = self._encode_request(request)
File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy_redis\queue.py", line 43, in _encode_request
return self.serializer.dumps(obj)
File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy_redis\picklecompat.py", line 14, in dumps
return pickle.dumps(obj, protocol=-1)
File "C:\ProgramData\Anaconda3\lib\site-packages\parsel\selector.py", line 204, in getstate
raise TypeError("can't pickle Selector objects")
TypeError: can't pickle Selector objects

pip list:

lxml 4.1.0
parsel 1.5.0
pywin32 221
redis 2.10.6
requests 2.18.4
Scrapy 1.5.1
scrapy-redis 0.6.8
scrapyd 1.2.0
testpath 0.3.1
Twisted 17.9.0
urllib3 1.22
python 3.6.7

I know what the error means, but I only want to use RedisCrawlSpider for its dupefilter, so why does a Selector show up at all? Also, if I remove the rule's callback it works fine, but then the spider is pointless:

Rule(LinkExtractor(restrict_xpaths='//li[@class="pj-box-li"]', allow=r'/project/.+'), callback='parse_detail')
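
For context, a minimal sketch of the RedisCrawlSpider setup this implies. Only the Rule is taken from the question; the class name, redis_key, and the parse_detail body are placeholder assumptions.

from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import Rule
from scrapy_redis.spiders import RedisCrawlSpider

class ProjectSpider(RedisCrawlSpider):
    # Placeholder names; only the Rule below comes from the question.
    name = "project"
    redis_key = "project:start_urls"

    rules = (
        Rule(LinkExtractor(restrict_xpaths='//li[@class="pj-box-li"]', allow=r'/project/.+'),
             callback='parse_detail'),
    )

    def parse_detail(self, response):
        # The callback itself is fine; the pickling error comes from what is
        # attached to the requests that get scheduled (see the answer below).
        pass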

I tried checking the pywin module, but it works fine.

I expect the spider to run normally; so why does this error appear? I only want to extract the URLs.

1 Answer:

Answer 0 (score: 0)

I found the mistake! I was putting a loader into meta, so when the request gets pickled the loader is pickled along with it, and that is where the error comes from!
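
The answer does not include code, so the following is only a sketch of what that bug and one possible fix can look like: stop carrying the loader (and the Selector it holds) in meta, and pass plain, picklable data instead. The class body, callback names, field names, and URLs are illustrative assumptions.

from scrapy import Request
from scrapy.loader import ItemLoader
from scrapy_redis.spiders import RedisCrawlSpider

class ProjectSpider(RedisCrawlSpider):
    name = "project"
    redis_key = "project:start_urls"

    def parse_list(self, response):
        loader = ItemLoader(response=response)  # an ItemLoader keeps a Selector internally

        # Problematic pattern (the cause of the error): putting the loader into
        # meta makes scrapy-redis pickle it together with the request, and the
        # Selector it holds raises "can't pickle Selector objects".
        # yield Request(next_url, meta={'loader': loader}, callback=self.parse_detail)

        # Picklable alternative: extract plain values first and pass only those.
        item = {'title': loader.get_xpath('//h1/text()')}
        yield Request(
            response.urljoin('/project/1'),  # illustrative URL
            meta={'item': item},
            callback=self.parse_detail,
        )

    def parse_detail(self, response):
        item = response.meta['item']
        item['detail'] = response.xpath('//div[@class="detail"]/text()').extract_first()
        yield item

If a loader is still needed in the detail callback, it can be re-created there from the new response (for example ItemLoader(item=item, response=response)) instead of being serialized through Redis.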