Scraping a public FTP site with authentication data in Scrapy, getting an FTP error

Date: 2015-04-03 07:34:43

Tags: python python-2.7 ftp scrapy scrapy-spider

I am writing a spider for a public FTP site that requires authentication.

I supplied a username and password for the FTP server, but Scrapy does not process the request and fails with an 'ftp_user' error.

# all import statements
class my_xml(BaseSpider):
    name = 'my_xml'

    def start_requests(self):
        yield Request(
            url='url',
            meta={'ftp_user': self.ftp_user, 'ftp_password': self.ftp_password}
        )

    def parse(self, response):
        print response.body

This is the error I get:

 2015-04-03 12:46:08+0530 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
 2015-04-03 12:46:08+0530 [scrapy] DEBUG: Web service listening on 127.0.0.1:6080
 2015-04-03 12:46:08+0530 [-] ERROR: Unhandled error in Deferred:
 2015-04-03 12:46:08+0530 [-] Unhandled Error
    Traceback (most recent call last):
      File "C:\Python27\lib\site-packages\scrapy\core\downloader\middleware.py", line 38, in process_request
        return download_func(request=request, spider=spider)
      File "C:\Python27\lib\site-packages\scrapy\core\downloader\__init__.py", line 123, in _enqueue_request
        self._process_queue(spider, slot)
      File "C:\Python27\lib\site-packages\scrapy\core\downloader\__init__.py", line 143, in _process_queue
        dfd = self._download(slot, request, spider)
      File "C:\Python27\lib\site-packages\scrapy\core\downloader\__init__.py", line 154, in _download
        dfd = mustbe_deferred(self.handlers.download_request, request, spider)
    --- <exception caught here> ---
      File "C:\Python27\lib\site-packages\scrapy\utils\defer.py", line 39, in mustbe_deferred
        result = f(*args, **kw)
      File "C:\Python27\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 40, in download_request
        return handler(request, spider)
      File "C:\Python27\lib\site-packages\scrapy\core\downloader\handlers\ftp.py", line 72, in download_request
        creator = ClientCreator(reactor, FTPClient, request.meta["ftp_user"],
    exceptions.KeyError: 'ftp_user'

Can anyone provide a solution for this error? If I have written the program incorrectly, please suggest the correct approach. How should spiders like this be handled? Note: the URL, ftp_user, and ftp_password are all correct; the site opens in a browser with these credentials.

1 answer:

Answer 0 (score: 0)

Try it this way. The spider in the question passes `self.ftp_user` and `self.ftp_password` into `meta`, but never defines those attributes on the spider class, so the credentials never reach the request and Scrapy's FTP handler raises `KeyError: 'ftp_user'`. Defining them as class attributes fixes this:

# -*- coding: utf-8 -*-
import scrapy
from scrapy.http import Request

class my_xml(scrapy.Spider):
    name = 'my_xml'
    ftp_host = 'ftp://127.0.0.1'
    ftp_user = 'your_username'
    ftp_password = 'your_password'

    def start_requests(self):
        yield Request(
            url=self.ftp_host,
            meta={'ftp_user': self.ftp_user, 'ftp_password': self.ftp_password}
        )

    def parse(self, response):
        print response.body
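To see why the keys must be present, note that the traceback ends inside `scrapy/core/downloader/handlers/ftp.py` at `request.meta["ftp_user"]`: the FTP download handler indexes `meta` directly rather than using a default, so any FTP request without both keys fails exactly as shown. A minimal sketch of that lookup (the `get_ftp_user` helper is hypothetical, written only to mirror the handler's dict access):

```python
# Sketch: Scrapy's FTP handler does request.meta["ftp_user"], a plain
# dict lookup, so a request whose meta lacks the key raises KeyError.
meta_without_credentials = {}  # what the handler sees if the spider never set them
meta_with_credentials = {'ftp_user': 'anonymous', 'ftp_password': 'guest'}

def get_ftp_user(meta):
    # mirrors the handler's direct indexing of request.meta
    return meta['ftp_user']

try:
    get_ftp_user(meta_without_credentials)
except KeyError as exc:
    print('KeyError:', exc)  # same exception as in the traceback

print(get_ftp_user(meta_with_credentials))  # credentials found, request proceeds
```

This is why the accepted fix is simply to define `ftp_user` and `ftp_password` on the spider (or hard-code them into `meta`) so the lookup succeeds.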