Scrapy FormRequest after authentication

Time: 2017-09-25 11:40:06

Tags: python scrapy

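I am trying to scrape a site that requires authentication, using the code below. The login succeeds, but when I send another FormRequest I am redirected back to the login page. It looks as if Scrapy is not keeping the session/cookies?

In the Scrapy docs, is the session not kept if I send another request? If so, what does the comment # continue scraping with authenticated session... mean?

Any ideas? Thank you.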

import scrapy
from scrapy.utils.response import open_in_browser


class LoginSpider(scrapy.Spider):
    name = 'login_spider'
    start_urls = ['https://example.com/login']

    def parse(self, response):
        # Fill in and submit the login form found on the login page.
        yield scrapy.FormRequest.from_response(
            response,
            formdata={'username': 'username', 'password': 'password'},
            callback=self.after_login
        )

    def after_login(self, response):
        if "Authenticated" in response.text:
            # continue scraping with authenticated session...
            url = 'https://example.com/search'
            yield scrapy.FormRequest(
                url,
                formdata={'from': '09/24/2017', 'to': '09/25/2017'},
                callback=self.parse_something
            )
        else:
            self.logger.error("Login failed")

    def parse_something(self, response):
        # Open the response in a browser and log its body for inspection.
        open_in_browser(response)
        self.logger.error(response.body)

0 answers:

No answers