My project is named NOTHS. Below are the spider.py and items.py scripts I am using.
spider.py:
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from NOTHS.items import NOTHS

class MySpider(BaseSpider):
    name = "main"
    allowed_domains = ["notonthehighstreet.com"]
    start_urls = ["http://www.notonthehighstreet.com"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        titles = hxs.select("//a[@class='title']")
        items = []
        for titles in titles:
            item = NOTHS()
            item ["title"] = titles.select("a/text()").extract()
            item ["link"] = titles.select("a/@href").extract()
            items.append(item)
        return items
items.py:
from scrapy.item import Item, Field

class CraigslistSampleItem(Item):
    title = Field()
    link = Field()
Running the spider produces the following error:
C:\Users\ACER\Documents\works\source code\NOTHS>scrapy crawl main
:0: UserWarning: You do not have a working installation of the service_identity
module: 'No module named service_identity'. Please install it from <https://pyp
i.python.org/pypi/service_identity> and make sure all of its dependencies are sa
tisfied. Without the service_identity module and a recent enough pyOpenSSL to s
upport it, Twisted can perform only rudimentary TLS client hostname verification
. Many valid certificate/hostname mappings may be rejected.
Traceback (most recent call last):
File "C:\Python27\Scripts\scrapy-script.py", line 9, in <module>
load_entry_point('scrapy==0.24.4', 'console_scripts', 'scrapy')()
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py"
, line 143, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py"
, line 89, in _run_print_help
func(*a, **kw)
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py"
, line 150, in _run_command
cmd.run(args, opts)
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\commands\cr
awl.py", line 57, in run
crawler = self.crawler_process.create_crawler()
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\crawler.py"
, line 87, in create_crawler
self.crawlers[name] = Crawler(self.settings)
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\crawler.py"
, line 25, in __init__
self.spiders = spman_cls.from_crawler(self)
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\spidermanag
er.py", line 35, in from_crawler
sm = cls.from_settings(crawler.settings)
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\spidermanag
er.py", line 31, in from_settings
return cls(settings.getlist('SPIDER_MODULES'))
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\spidermanag
er.py", line 22, in __init__
for module in walk_modules(name):
File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\utils\misc.
py", line 68, in walk_modules
submod = import_module(fullpath)
File "C:\Python27\lib\importlib\__init__.py", line 37, in import_module
__import__(name)
File "C:\Users\ACER\Documents\works\source code\NOTHS\NOTHS\spiders\main.py",
line 3, in <module>
from NOTHS.items import NOTHS
ImportError: cannot import name NOTHS
What am I doing wrong?
Answer 0 (score: 2)
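In spider.py, change
from NOTHS.items import NOTHS
to
from NOTHS.items import CraigslistSampleItem
items.py defines a class named CraigslistSampleItem, not NOTHS, so there is no NOTHS name to import. The line item = NOTHS() in parse must be changed to item = CraigslistSampleItem() as well.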
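The same failure can be reproduced without Scrapy. The following is a minimal sketch, not the project's real code: demo_items is a hypothetical in-memory module standing in for NOTHS/items.py, and its CraigslistSampleItem is a plain dict subclass rather than a scrapy Item. It shows that a from-import fails when the module does not define the requested name, and succeeds when the name matches the class that is actually defined.

```python
import sys
import types

# Hypothetical stand-in for NOTHS/items.py (assumption: not the real module).
# Like the real file, it defines CraigslistSampleItem and nothing called NOTHS.
demo_items = types.ModuleType("demo_items")
demo_items.CraigslistSampleItem = type("CraigslistSampleItem", (dict,), {})
sys.modules["demo_items"] = demo_items

# Importing a name the module does not define raises ImportError,
# which is the error at the bottom of the traceback in the question.
try:
    from demo_items import NOTHS
    error = None
except ImportError as exc:
    error = str(exc)

print(error)

# Importing the name that is actually defined works.
from demo_items import CraigslistSampleItem

item = CraigslistSampleItem()
item["title"] = ["example title"]  # placeholder value for illustration
print(item)
```

An alternative that should work equally well is to keep the spider unchanged and rename the class in items.py from CraigslistSampleItem to NOTHS; either way, the imported name and the class name have to match.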