Scrapy not logging in to a web page using FormRequest.from_response

Time: 2018-04-12 11:07:01

Tags: python web-scraping scrapy web-crawler

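I have the code below to log in and scrape a given URL, but it never attempts to log in. It just moves back and forth between the login and forgot-password screens. I tried passing the login cookies, with no luck. I'm not sure whether FormRequest.from_response works for anyone. Please help.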

import scrapy
from scrapy.selector import Selector


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    handle_httpstatus_list = [401]
    start_urls = ['https://xyz/login']    

    def parse(self, response):

        return scrapy.FormRequest.from_response(
            response,
            formdata={'username': 'user', 'password': 'pwd='},
            callback=self.after_login
        )

    def after_login(self, response):
        # Check that login succeeded before going on.
        # response.text is a str; response.body is bytes on Python 3.
        if "authentication failed" in response.text:
            self.logger.error("Login failed")
            return


        for quote in response.xpath('//select'):
            yield {
                # response.url is already a plain string
                'url': response.url,
                # XPath uses text(), not the CSS-style ::text
                'text': quote.xpath('option/text()').extract(),
            }

        for next_page in response.xpath('//a/@href').extract():
            if next_page is not None:
                yield response.follow(next_page, self.after_login)

0 answers:

No answers