POST request returns 405

Time: 2019-08-06 21:19:11

Tags: python-2.7 http scrapy

I need to send a POST request, but I get a 405 error.

On this site -> http://177.66.89.34:8079/Transparencia/# I need to iterate over the options of the two dropdown menus at the top of the page.

First, I POST the option selected next to "Escolha o Exercício:". Then I POST the option selected next to "Escolha Entidade:".

With the code below, I send the POST for "Escolha o Exercício":

# -*- coding: utf-8 -*-
import scrapy

class ScpiSpider(scrapy.Spider):  # abstract class
    start_urls = ['http://177.66.89.34:8079/Transparencia']

    def parse(self, response):
        # Each <td> in the dropdown's list table holds one fiscal year ("exercício").
        anos_exercicios = response.xpath("//table[@id='cmbExercicio_DDD_L_LBT']//td/text()").extract()

        for ano in anos_exercicios:
            # ASP.NET partial-postback payload: the hidden state fields are copied from the page.
            formadata = {
                "Scriptmanager1": "UpdatePanel1|cmbExercicio",
                "cmbExercicio_VI": ano,
                "cmbExercicio": ano,
                "__EVENTTARGET": "cmbExercicio",
                "__VIEWSTATE": response.xpath("//input[@id='__VIEWSTATE']/@value").get(),
                "__VIEWSTATEGENERATOR": response.xpath("//input[@id='__VIEWSTATEGENERATOR']/@value").get(),
                "__EVENTVALIDATION": response.xpath("//input[@id='__EVENTVALIDATION']/@value").get(),
                "__ASYNCPOST": "true",
            }
            headers = {
                'origin': "http://177.125.200.195:8079",
                'x-requested-with': "XMLHttpRequest",
                'cache-control': "no-cache",
                'x-microsoftajax': "Delta=true",
                'user-agent': "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36",
                'accept': "*/*"
            }
            yield scrapy.FormRequest(url=self.start_urls[0], formdata=formadata, callback=self.parse_entidade,
                                     dont_filter=True, headers=headers)

    def parse_entidade(self, response):
        print(response)

I expected the code to reach parse_entidade, but instead I get: [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <405 http://177.66.89.34:8079/Transparencia>: HTTP status code is not handled or not allowed
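For readers unfamiliar with the log line above, here is a minimal sketch that pulls the status code and URL out of it and looks up the standard reason phrase. The helper name `describe_ignored_response` and the regular expression are illustrative assumptions, not part of Scrapy; only the standard library is used.

```python
import re

try:
    from http.client import responses  # Python 3
except ImportError:
    from httplib import responses  # Python 2.7

# The log line quoted in the question.
LOG_LINE = (
    "[scrapy.spidermiddlewares.httperror] INFO: Ignoring response "
    "<405 http://177.66.89.34:8079/Transparencia>: "
    "HTTP status code is not handled or not allowed"
)


def describe_ignored_response(line):
    """Return (status, reason phrase, url) for an 'Ignoring response' log line.

    Hypothetical helper: returns None if the line does not match.
    """
    match = re.search(r"Ignoring response <(\d{3}) ([^>\s]+)>", line)
    if match is None:
        return None
    status = int(match.group(1))
    return status, responses.get(status, "Unknown"), match.group(2)


print(describe_ignored_response(LOG_LINE))
```

Status 405 is "Method Not Allowed": the server refused the POST method for that exact URL, which here is the one without a trailing slash.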

1 Answer:

Answer 0: (score: 0)

Simply adding a trailing "/" to the URL in start_urls seems to solve the problem:

start_urls = ['http://177.66.89.34:8079/Transparencia/']

With that change, the output I get when running the spider is:

2019-08-07 09:44:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://177.66.89.34:8079/Transparencia/> (referer: None)
2019-08-07 09:44:43 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://177.66.89.34:8079/Transparencia/> (referer: http://177.66.89.34:8079/Transparencia/)
2019-08-07 09:44:43 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://177.66.89.34:8079/Transparencia/> (referer: http://177.66.89.34:8079/Transparencia/)
<200 http://177.66.89.34:8079/Transparencia/>
<200 http://177.66.89.34:8079/Transparencia/>
2019-08-07 09:44:44 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://177.66.89.34:8079/Transparencia/> (referer: http://177.66.89.34:8079/Transparencia/)
<200 http://177.66.89.34:8079/Transparencia/>
2019-08-07 09:44:44 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://177.66.89.34:8079/Transparencia/> (referer: http://177.66.89.34:8079/Transparencia/)
<200 http://177.66.89.34:8079/Transparencia/>
2019-08-07 09:44:45 [scrapy.core.engine] INFO: Closing spider (finished)
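As a small sketch of the same fix applied programmatically, the helper below appends the trailing slash to a URL's path when it is missing. The name `ensure_trailing_slash` is hypothetical, and the approach assumes the URL points at a directory-style endpoint like this one (it would not be appropriate for a path ending in a file name).

```python
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2.7


def ensure_trailing_slash(url):
    """Return url with a '/' appended to its path if the path lacks one.

    Hypothetical helper; query string and fragment are preserved.
    """
    parts = urlsplit(url)
    path = parts.path
    if not path.endswith("/"):
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(ensure_trailing_slash("http://177.66.89.34:8079/Transparencia"))
```

In the spider, the result could be used both in start_urls and as the url passed to scrapy.FormRequest, so that the GET and the POST target the same address.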