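I want to get a simple project started. It is a Python project in Visual Studio, and VS is running in administrator mode. Unfortunately, parse(...) is never called, although it should be.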
Edit: my output:
import scrapy
from scrapy.crawler import CrawlerProcess
import logging


class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://blog.scrapinghub.com']

    def parse(self, response):
        for title in response.css('.post-header>h2'):
            yield {'title': title.css('a ::text').extract_first()}
        for next_page in response.css('div.prev-post > a'):
            yield response.follow(next_page, self.parse)
        logging.error("this should be printed")


process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(BlogSpider)
process.start()
print("ready")
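Note: I am using the Twisted build from https://www.lfd.uci.edu/~gohlke/pythonlibs/.
Answer 0 (score: 0)
This looks like purely an indentation problem: it started working as soon as I fixed the indentation.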
2018-09-22 11:35:47 [root] ERROR: this should be printed
My code snippet is the same as yours:
import scrapy
from scrapy.crawler import CrawlerProcess
import logging


class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://blog.scrapinghub.com']

    def parse(self, response):
        logging.error("this should be printed")
        for title in response.css('.post-header>h2'):
            yield {'title': title.css('a ::text').extract_first()}
        for next_page in response.css('div.prev-post > a'):
            yield response.follow(next_page, self.parse)
        logging.error("this should be printed")


process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(BlogSpider)
process.start()
print("ready")
I have also attached it as a Pastebin paste: https://pastebin.com/pDu8kW27
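To illustrate why indentation can decide whether parse is ever called, here is a minimal sketch that runs without Scrapy. The `Spider` base class, the two subclasses, and the `defines_parse` helper are all hypothetical stand-ins invented for this example; they are not part of Scrapy, and the exact indentation mistake in the question may have been a different one. The sketch only shows one plausible way the problem arises: a dedented `def parse` becomes a module-level function, so the class silently falls back to the base-class callback.

```python
class Spider:
    """Hypothetical stand-in for scrapy.Spider so the sketch runs without Scrapy."""

    def parse(self, response):
        # Assumed default behaviour for this sketch: no callback defined.
        raise NotImplementedError("parse callback is not defined")


class GoodSpider(Spider):
    name = "good"

    def parse(self, response):
        # Indented under the class: this is a method of GoodSpider.
        return "parsed " + response


class BadSpider(Spider):
    name = "bad"

def parse(self, response):
    # Dedented by mistake: this is a plain module-level function,
    # so BadSpider never gets its own parse method.
    return "parsed " + response


def defines_parse(spider_cls):
    """Return True if the class itself (not a base class) defines parse."""
    return "parse" in vars(spider_cls)


if __name__ == "__main__":
    print("GoodSpider defines parse:", defines_parse(GoodSpider))
    print("BadSpider defines parse:", defines_parse(BadSpider))
```

A check such as `defines_parse(BlogSpider)` just before `process.crawl(BlogSpider)` is one quick way to confirm that the method really belongs to the spider class.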
Answer 1 (score: 0)