Scrapy HTTPS tutorial

Date: 2017-06-30 16:11:38

Tags: python python-3.x web-scraping scrapy

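Hi everyone!

I'm new to the Scrapy framework and need to parse wisemapping.com. I started by reading the official Scrapy tutorial and then tried to fetch one of the "wisemaps", but got the following errors: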

[scrapy.core.engine] DEBUG: Crawled (404) <GET https://app.wisemapping.com/robots.txt> (referer: None)

[scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying
<GET https://app.wisemapping.com/c/maps/576786/public> (failed 3 times): 500 Internal Server Error

[scrapy.core.engine] DEBUG: Crawled (500) <GET https://app.wisemapping.com/c/maps/576786/public> (referer: None)

[scrapy.spidermiddlewares.httperror] INFO: Ignoring response <500 https://app.wisemapping.com/c/maps/576786/public>: HTTP status code is not handled or not allowed

Please suggest how to fix the problem with the following code:

import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        urls = [
            'https://app.wisemapping.com/c/maps/576786/public',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        # Save the raw response body to a local file.
        filename = 'wisemape.html'
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log('Saved file %s' % filename)

1 answer:

Answer 0 (score: 0)

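Navigating to https://app.wisemapping.com/c/maps/576786/public in a browser gives the error "Outch!! This map is no longer available. You do not have sufficient permissions to view this map. This map has been changed to private or deleted."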

Does this map exist? If so, try making it public.

If you know that the map you are trying to access exists, verify that the URL you are using is correct.
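
A quick way to check a URL outside of Scrapy is to request it with the standard library and look at the status code. This is a rough sketch, not part of the original answer: the helper name `fetch_status` and the `User-Agent` string are invented for illustration.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code of a GET request to url.

    Hypothetical helper for illustration. Connection-level failures
    (DNS errors, refused connections) still raise urllib.error.URLError.
    """
    request = urllib.request.Request(url, headers={"User-Agent": "url-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx; the code is what we want.
        return error.code


# Example call (requires network access):
# fetch_status("https://app.wisemapping.com/c/maps/576786/public")
```

If this returns the same 500 that Scrapy reports, the spider itself is probably fine and the map or its URL is the thing to look at.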