Question

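I want to start a simple project. It is a Python project in Visual Studio, and VS is running in administrator mode. Unfortunately, parse(...) is never called, although it should be.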

Edit: my output:

import scrapy
from scrapy.crawler import CrawlerProcess
import logging

class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://blog.scrapinghub.com']

    def parse(self, response):
        for title in response.css('.post-header>h2'):
            yield {'title': title.css('a ::text').extract_first()}

        for next_page in response.css('div.prev-post > a'):
            yield response.follow(next_page, self.parse)
        logging.error("this should be printed")

process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(BlogSpider)
process.start()
print("ready")

Please note: Twisted from https://www.lfd.uci.edu/~gohlke/pythonlibs/ is used.

Answer 1

This looks like purely an indentation problem. Once I fixed the indentation, it started working:

2018-09-22 11:35:47 [root] ERROR: this should be printed

My version of the same code:

import scrapy
from scrapy.crawler import CrawlerProcess
import logging

class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://blog.scrapinghub.com']

    def parse(self, response):
        logging.error("this should be printed")
        for title in response.css('.post-header>h2'):
            yield {'title': title.css('a ::text').extract_first()}
        for next_page in response.css('div.prev-post > a'):
            yield response.follow(next_page, self.parse)
        logging.error("this should be printed")

process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})

process.crawl(BlogSpider)
process.start()
print("ready")

A Pastebin paste is attached: https://pastebin.com/pDu8kW27
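
As a rough illustration of how indentation alone can keep a callback from being found, here is a minimal sketch that does not use Scrapy at all. `Base`, `BrokenSpider`, and `FixedSpider` are hypothetical stand-ins, and the sketch assumes the broken version had `parse` dedented out of the class body; the question as posted does not show the exact indentation that failed.

```python
class Base:
    """Hypothetical stand-in for a spider base class (not Scrapy's)."""

    def parse(self, response):
        raise NotImplementedError("parse callback is not defined")


class BrokenSpider(Base):
    name = 'broken'

# Dedented by mistake: this defines a module-level function,
# so BrokenSpider never gets its own parse method.
def parse(self, response):
    return 'parsed ' + response


class FixedSpider(Base):
    name = 'fixed'

    # Indented inside the class body: this is a real method.
    def parse(self, response):
        return 'parsed ' + response


print('parse' in vars(BrokenSpider))  # False
print('parse' in vars(FixedSpider))   # True
```

Checking `vars(SomeSpider)` like this is only a quick way to confirm that a method really belongs to the class; it says nothing about Scrapy's own scheduling.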

Answer 2

I installed Anaconda and then ran conda install -c conda-forge scrapy (some errors occurred).

Now everything works.

Installation guide
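
Since the fix here was a reinstall, it may help to check which package versions the interpreter actually sees. The following is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is made up for this example, and the package names "Scrapy" and "Twisted" are assumed to be the distribution names in your environment.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report what the current interpreter can see.
for package in ("Scrapy", "Twisted"):
    print(package, installed_version(package) or "not installed")
```

Run it with the same interpreter that Visual Studio uses for the project, since a different environment may have a different set of packages.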

Scrapy: Simple Project
