I am new to Scrapy and I am trying to scrape the title from this page: https://www.mdcalc.com/heart-score-major-cardiac-events
I have looked through all the posts on this topic, but I still get an OpenSSL error.
Here is my code. settings.py:
DOWNLOADER_CLIENTCONTEXTFACTORY = 'scrapy.core.downloader.contextfactory.ScrapyClientContextFactory'
And here is the code for my spider:
import scrapy

from skitter.items import SkitterItem


class mdcalc(scrapy.Spider):
    name = "mdcalc"
    allowed_domains = ["mdcalc.com"]
    start_urls = ['https://www.mdcalc.com/heart-score-major-cardiac-events']

    def parse(self, response):
        item = SkitterItem()
        item['title'] = response.xpath('//h1//text()').extract()[0]
        yield item
When I run
curl localhost:6800/schedule.json -d project=skitter -d spider=mdcalc
this is the error I get:
{'downloader/exception_type_count/twisted.web._newclient.ResponseNeverReceived': 6,
 'downloader/request_bytes': 1614,
 'downloader/request_count': 6,
 'downloader/request_method_count/GET': 6,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2017, 9, 27, 2, 2, 52, 62313),
 'log_count/DEBUG': 8,
 'log_count/ERROR': 3,
 'log_count/INFO': 7,
 'scheduler/dequeued': 3,
 'scheduler/dequeued/memory': 3,
 'scheduler/enqueued': 3,
 'scheduler/enqueued/memory': 3,
 'start_time': datetime.datetime(2017, 9, 27, 2, 2, 23, 380740)}
2017-09-27 02:02:52+0000 [mdcalc] INFO: Spider closed (finished)
Thanks in advance for your help.
Answer 0 (score: 0)
This happens because the Python version that Scrapy Cloud (Scrapinghub) runs by default is 2.7. To fix it, you have to tell the platform that the spider must run under Python 3; this link explains how: https://support.scrapinghub.com/support/solutions/articles/22000200387-deploying-python-3-spiders-to-scrapy-cloud
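In practice the switch is made in the scrapinghub.yml file at the project root, which shub reads when you deploy. A minimal sketch, assuming you deploy with shub; the project ID below is a placeholder and the exact Python 3 stack name should be taken from the linked article:

projects:
  default: 12345            # placeholder: your Scrapy Cloud project ID
stacks:
  default: scrapy:1.3-py3   # example Python 3 stack name; see the article for the stacks currently offered

After adding this, redeploying the project with shub deploy should run the spider under Python 3 instead of 2.7.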