Python and Linux beginner trying to get Scrapy up and running. I followed the instructions and code at https://doc.scrapy.org/en/latest/intro/tutorial.html and got the warning: UserWarning: You do not have a working installation of the service_identity module: 'cannot import name 'opentype''
I downloaded service_identity and tried to install it, but got Requirement already satisfied for every part of the installation. I tried pip3 and also downloaded the .whl file from the PyPI URL and installed it, as shown below.
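As a hedged diagnostic sketch (assumption based only on the error text: Twisted emits this warning when service_identity cannot be imported cleanly, and 'cannot import name 'opentype'' suggests an outdated pyasn1, since opentype lives in pyasn1.type), you can check what is actually importable:

```python
import importlib.util

# Hedged check: returns True only if both service_identity and the
# pyasn1.type.opentype submodule can be located. An old pyasn1 that
# lacks opentype would make this return False even though
# service_identity itself is installed.
def service_identity_usable():
    try:
        for mod in ("service_identity", "pyasn1.type.opentype"):
            if importlib.util.find_spec(mod) is None:
                return False
        return True
    except ModuleNotFoundError:
        # Parent package (e.g. pyasn1) is missing entirely.
        return False

print(service_identity_usable())
```

If this prints False, upgrading pyasn1 and pyasn1-modules alongside service_identity may be what the installation is missing.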
Python 3.5.3
mat@mat-VirtualBox:~$ scrapy startproject tutorial2
:0: UserWarning: You do not have a working installation of the service_identity module: 'cannot import name 'opentype''. Please install it from <https://pypi.python.org/pypi/service_identity> and make sure all of its dependencies are satisfied. Without the service_identity module, Twisted can perform only rudimentary TLS client hostname verification. Many valid certificate/hostname mappings may be rejected.
New Scrapy project 'tutorial2', using template directory '/usr/local/lib/python3.5/dist-packages/scrapy/templates/project', created in:
/home/mat/tutorial2
You can start your first spider with:
cd tutorial2
scrapy genspider example example.com
mat@mat-VirtualBox:~$ pip3 install Downloads/
geckodriver-v0.19.1-linux64.tar.gz
NOOBS_lite_v2_4.zip
npm-debug.log
phantomjs-2.1.1-linux-x86_64/
phantomjs-2.1.1-linux-x86_64.tar.bz2
reveal.js-master.zip
service_identity/
service_identity-17.0.0.dist-info/
service_identity-17.0.0-py2.py3-none-any.whl
mat@mat-VirtualBox:~$ pip3 install Downloads/service_identity-17.0.0
Invalid requirement: 'Downloads/service_identity-17.0.0'
It looks like a path. Does it exist ?
mat@mat-VirtualBox:~$ pip3 install Downloads/service_identity-17.0.0-py2.py3-none-any.whl
Requirement already satisfied: service-identity==17.0.0 from file:///home/mat/Downloads/service_identity-17.0.0-py2.py3-none-any.whl in /usr/local/lib/python3.5/dist-packages
Requirement already satisfied: pyopenssl>=0.12 in /usr/local/lib/python3.5/dist-packages (from service-identity==17.0.0)
Requirement already satisfied: attrs in /usr/local/lib/python3.5/dist-packages (from service-identity==17.0.0)
Requirement already satisfied: pyasn1-modules in /usr/local/lib/python3.5/dist-packages (from service-identity==17.0.0)
Requirement already satisfied: pyasn1 in /usr/lib/python3/dist-packages (from service-identity==17.0.0)
Requirement already satisfied: six>=1.5.2 in /usr/lib/python3/dist-packages (from pyopenssl>=0.12->service-identity==17.0.0)
Requirement already satisfied: cryptography>=2.1.4 in /usr/local/lib/python3.5/dist-packages (from pyopenssl>=0.12->service-identity==17.0.0)
Requirement already satisfied: asn1crypto>=0.21.0 in /usr/local/lib/python3.5/dist-packages (from cryptography>=2.1.4->pyopenssl>=0.12->service-identity==17.0.0)
Requirement already satisfied: cffi>=1.7; platform_python_implementation != "PyPy" in /usr/local/lib/python3.5/dist-packages (from cryptography>=2.1.4->pyopenssl>=0.12->service-identity==17.0.0)
Requirement already satisfied: idna>=2.1 in /usr/lib/python3/dist-packages (from cryptography>=2.1.4->pyopenssl>=0.12->service-identity==17.0.0)
Requirement already satisfied: pycparser in /usr/local/lib/python3.5/dist-packages (from cffi>=1.7; platform_python_implementation != "PyPy"->cryptography>=2.1.4->pyopenssl>=0.12->service-identity==17.0.0)
mat@mat-VirtualBox:~$
Tried running it anyway to see what would happen:
mat@mat-VirtualBox:~/tutorial2$ scrapy crawl qoutes
:0: UserWarning: You do not have a working installation of the service_identity module: 'cannot import name 'opentype''. Please install it from <https://pypi.python.org/pypi/service_identity> and make sure all of its dependencies are satisfied. Without the service_identity module, Twisted can perform only rudimentary TLS client hostname verification. Many valid certificate/hostname mappings may be rejected.
2017-12-06 19:35:52 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: tutorial2)
2017-12-06 19:35:52 [scrapy.utils.log] INFO: Overridden settings: {'BOT_NAME': 'tutorial2', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['tutorial2.spiders'], 'NEWSPIDER_MODULE': 'tutorial2.spiders'}
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/scrapy/spiderloader.py", line 69, in load
return self._spiders[spider_name]
KeyError: 'qoutes'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 11, in <module>
sys.exit(execute())
File "/usr/local/lib/python3.5/dist-packages/scrapy/cmdline.py", line 149, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/usr/local/lib/python3.5/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
func(*a, **kw)
File "/usr/local/lib/python3.5/dist-packages/scrapy/cmdline.py", line 156, in _run_command
cmd.run(args, opts)
File "/usr/local/lib/python3.5/dist-packages/scrapy/commands/crawl.py", line 57, in run
self.crawler_process.crawl(spname, **opts.spargs)
File "/usr/local/lib/python3.5/dist-packages/scrapy/crawler.py", line 167, in crawl
crawler = self.create_crawler(crawler_or_spidercls)
File "/usr/local/lib/python3.5/dist-packages/scrapy/crawler.py", line 195, in create_crawler
return self._create_crawler(crawler_or_spidercls)
File "/usr/local/lib/python3.5/dist-packages/scrapy/crawler.py", line 199, in _create_crawler
spidercls = self.spider_loader.load(spidercls)
File "/usr/local/lib/python3.5/dist-packages/scrapy/spiderloader.py", line 71, in load
raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: qoutes'
mat@mat-VirtualBox:~/tutorial2$
The code in my /home/mat/tutorial2/tutorial2/spiders/qoutes_spider.py:
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
            'http://quotes.toscrape.com/page/2/',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = 'quotes-%s.html' % page
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log('Saved file %s' % filename)
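As a standalone sanity check, the filename logic in parse() can be exercised on its own (same split as the spider above, on one of its start URLs):

```python
# parse() derives the page number by splitting the URL on "/": because
# the URL ends with a trailing slash, the page number is the
# second-to-last element of the split.
url = 'http://quotes.toscrape.com/page/1/'
page = url.split("/")[-2]
filename = 'quotes-%s.html' % page
print(filename)  # quotes-1.html
```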
Answer 0 (score: 12)
Your problem seems to be a typo: you are running qoutes but the spider's name is quotes. Swap the o and u. As for service_identity, that is only a warning. If you want to install it anyway, try:
pip3 install service_identity --force --upgrade
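The KeyError in the traceback is just Scrapy's spider loader failing a dictionary lookup keyed by each spider's name attribute. A toy sketch of that lookup (hypothetical registry, not Scrapy's actual internals):

```python
# Toy model of Scrapy's spider-name lookup: spiders are registered
# under their `name` attribute, so `scrapy crawl qoutes` cannot find
# a spider whose name is "quotes". (Hypothetical one-entry registry;
# the real loader builds it from the project's spiders package.)
spiders = {"quotes": "QuotesSpider"}

def load(spider_name):
    try:
        return spiders[spider_name]
    except KeyError:
        raise KeyError("Spider not found: {}".format(spider_name))

print(load("quotes"))
try:
    load("qoutes")
except KeyError as e:
    print(e)
```

This reproduces the shape of the error in the question: the misspelled name raises, the correct one resolves.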
Answer 1 (score: 0)
I had the same problem as the author of this question.
This command fixed it:
pip install service_identity --force --upg
Thanks