How do I pass form data from the command line with Scrapy?

Time: 2014-01-13 20:55:36

Tags: python screen-scraping scrapy scrapyd

How can I pass the username and password from the command line? Thanks!

from scrapy.http import FormRequest
from scrapy.spiders import Spider


class LoginSpider(Spider):
    name = 'example.com'
    start_urls = ['http://www.example.com/users/login.php']

    def parse(self, response):
        # Fill in and submit the login form found on the page
        return [FormRequest.from_response(response,
                    formdata={'username': 'john', 'password': 'secret'},
                    callback=self.after_login)]

    def after_login(self, response):
        # Check that the login succeeded before going on
        # (response.body is bytes, so compare against a bytes literal)
        if b"authentication failed" in response.body:
            self.logger.error("Login failed")
            return

        # continue scraping with authenticated session...

2 Answers:

Answer 0 (score: 6)

Open a terminal and make sure Scrapy is installed.

  1. scrapy shell

  2. from scrapy.http import FormRequest

  3. request = FormRequest(url='http://www.example.com/users/login.php', formdata={'username': 'john', 'password': 'secret'})

  4. Info:

    • Scrapy 1.0.0
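
Note that the steps above only construct the request. Inside the Scrapy shell, calling fetch(request) would actually send it. A FormRequest built with formdata is sent as a POST whose body is the URL-encoded form fields. The sketch below uses only the Python standard library to illustrate what that body looks like for the same fields; it is an illustration of the encoding, not Scrapy's own code.

```python
from urllib.parse import parse_qs, urlencode

# Same fields as in the FormRequest from step 3
formdata = {'username': 'john', 'password': 'secret'}

# URL-encode the fields the way an HTML form POST does
body = urlencode(formdata)
print(body)  # username=john&password=secret

# Decoding the body recovers the original fields (values come back as lists)
assert parse_qs(body) == {'username': ['john'], 'password': ['secret']}
```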

Answer 1 (score: 5)

You can pass them as spider arguments with -a (replace spidername with your spider's name):

scrapy crawl spidername -a username="john" -a password="secret"

Then read them as attributes on the spider:

from scrapy.http import FormRequest
from scrapy.spiders import Spider


class LoginSpider(Spider):
    name = 'example.com'
    start_urls = ['http://www.example.com/users/login.php']

    def parse(self, response):
        # self.username and self.password are set from the -a arguments
        return [FormRequest.from_response(response,
                    formdata={'username': self.username, 'password': self.password},
                    callback=self.after_login)]

    def after_login(self, response):
        # Check that the login succeeded before going on
        # (response.body is bytes, so compare against a bytes literal)
        if b"authentication failed" in response.body:
            self.logger.error("Login failed")
            return

        # continue scraping with authenticated session...
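
If an -a argument is omitted, self.username or self.password will not exist and parse() fails with an AttributeError. One possible way to handle that is a small helper that validates the values and falls back to environment variables, which also keeps the password out of the shell history. This is only a sketch: build_formdata and the LOGIN_USERNAME / LOGIN_PASSWORD variable names are hypothetical and not part of Scrapy.

```python
import os


def build_formdata(username=None, password=None, env=None):
    """Return the login form fields, preferring explicit values over the environment.

    LOGIN_USERNAME and LOGIN_PASSWORD are hypothetical variable names chosen
    for this sketch; Scrapy itself does not read them.
    """
    env = os.environ if env is None else env
    username = username or env.get("LOGIN_USERNAME")
    password = password or env.get("LOGIN_PASSWORD")

    # Report every missing field at once instead of failing on the first one
    missing = [name for name, value in (("username", username),
                                        ("password", password)) if not value]
    if missing:
        raise ValueError("missing credentials: " + ", ".join(missing))
    return {"username": username, "password": password}


# Explicit values, as they would arrive from the -a arguments
print(build_formdata("john", "secret", env={}))
```

In the spider, the formdata argument could then be written as build_formdata(getattr(self, 'username', None), getattr(self, 'password', None)).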