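I am trying to learn how to use Scrapy and am attempting what I thought would be a simple project. I want to extract two pieces of data from a single web page; no other links need to be crawled. However, my code seems to return zero results. I have tested the XPaths in the Scrapy shell, and both return the expected results.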
My item.py is:
import scrapy
class StockItem(scrapy.Item):
    quote = scrapy.Field()
    time = scrapy.Field()
My spider, named stockscrapy.py, is:
import scrapy
class StockSpider(scrapy.Spider):
    name = "ugaz"
    allowed_domains = ["nasdaq.com"]
    start_urls = ["http://www.nasdaq.com/symbol/ugaz/"]
def parse(self, response):
    stock = StockItem()
    stock['quote'] = response.xpath('//*[@id="qwidget_lastsale"]/text()').extract()
    stock['time'] = response.xpath('//*[@id="qwidget_markettime"]/text()').extract()
    return stock
To run the script, I use this command line:
scrapy crawl ugaz -o stocks.csv
Any and all help is greatly appreciated.
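Answer 0 (score: 1)
You need to indent the parse block so that it becomes a method of the StockSpider class instead of a standalone module-level function.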
import scrapy
# Note: StockItem must also be importable here, e.g. imported from the project's items module.
class StockSpider(scrapy.Spider):
    name = "ugaz"
    allowed_domains = ["nasdaq.com"]
    start_urls = ["http://www.nasdaq.com/symbol/ugaz/"]
    # Indent this block
    def parse(self, response):
        stock = StockItem()
        stock['quote'] = response.xpath('//*[@id="qwidget_lastsale"]/text()').extract()
        stock['time'] = response.xpath('//*[@id="qwidget_markettime"]/text()').extract()
        return stock
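To see why the indentation matters, here is a minimal sketch that runs without Scrapy. BaseSpider is a hypothetical stand-in for scrapy.Spider, and it assumes the real base class behaves similarly, in that its default parse raises NotImplementedError. The class names, the try_parse helper, and the sample value "42.10" are all made up for illustration.

```python
class BaseSpider:
    """Hypothetical stand-in for scrapy.Spider (an assumption for this demo)."""

    def parse(self, response):
        raise NotImplementedError("parse callback is not defined")


class UnindentedSpider(BaseSpider):
    name = "ugaz"


# Defined at module level: a plain function, NOT a method of UnindentedSpider.
def parse(self, response):
    return {"quote": response}


class IndentedSpider(BaseSpider):
    name = "ugaz"

    # Indented under the class: this overrides BaseSpider.parse.
    def parse(self, response):
        return {"quote": response}


def try_parse(spider, response):
    """Return the parse result, or the name of the exception raised."""
    try:
        return spider.parse(response)
    except NotImplementedError as exc:
        return type(exc).__name__


unindented_result = try_parse(UnindentedSpider(), "42.10")
indented_result = try_parse(IndentedSpider(), "42.10")
print(unindented_result)
print(indented_result)
```

In this sketch the unindented version never reaches the custom parse function, because the class only has the inherited default. That is consistent with the spider in the question producing no items even though the XPaths work in the shell.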