我想要抓取网站this link 但是当我使用命令
进行爬行时scrapy crawl metacritic -o metacritic.json -t json
它说ImportError: No module named items
有人可以帮忙吗?
metacritic_spider.py来源:
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from metacritic.items import MetacriticItem
class MetacriticSpider(BaseSpider):
class MetacriticSpider(BaseSpider):
name = "metacritic" # Name of the spider, to be used when crawling
allowed_domains = ["metacritic.com"] # Where the spider is allowed to go
start_urls = ["http://www.metacritic.com/browse/games/title/pc?page=0"]
def parse(self, response):
hxs = HtmlXPathSelector(response) # The XPath selector
sites = hxs.select('//li[contains(@class, "product game_product")]/div[@class="product_wrap"]')
items = []
for site in sites:
item = MetacriticItem()
item['title'] = site.select('div[@class="basic_stat product_title"]/a/text()').extract()
item['link'] = site.select('div[@class="basic_stat product_title"]/a/@href').extract()
item['cscore'] = site.select('div[@class="basic_stat product_score brief_metascore"]/div/div/span[contains(@class, "data metascore score")]/text()').extract()
item['uscore'] = site.select('div[@class="more_stats condensed_stats"]/ul/li/span[contains(@class, "data textscore textscore")]/text()').extract()
item['date'] = site.select('div[@class="more_stats condensed_stats"]/ul/li/span[@class="data"]/text()').extract()
items.append(item)
return items
source item.py:
from scrapy.item import Item, Field
class MetacriticItem(Item):
# Here are the fields that will be crawled and stored
title = Field() # Game title
link = Field() # Link to individual game page
cscore = Field() # Critic score
uscore = Field() # User score
date = Field() # Release date
我的目录结构:structure
答案 0 :(得分:0)
将您的item.py
移至第二个metacritic
文件夹。