Question

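I'm learning how to use Scrapy and am attempting what I thought would be a simple project. I'm trying to extract two pieces of data from a single web page; no other links need to be crawled. However, my code seems to return zero results. I've tested both XPath expressions in the Scrapy shell, and each returns the expected result.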

My item.py is:

import scrapy

class StockItem(scrapy.Item):
    quote = scrapy.Field()
    time = scrapy.Field()

My spider, named stockscrapy.py, is:

import scrapy

class StockSpider(scrapy.Spider):
    name = "ugaz"
    allowed_domains = ["nasdaq.com"]
    start_urls = ["http://www.nasdaq.com/symbol/ugaz/"]

def parse(self, response):
    stock = StockItem()
    stock['quote'] = response.xpath('//*[@id="qwidget_lastsale"]/text()').extract()
    stock['time'] = response.xpath('//*[@id="qwidget_markettime"]/text()').extract()
    return stock

To run the script, I use this command:

scrapy crawl ugaz -o stocks.csv

Any and all help is greatly appreciated.

Answer 1

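You need to indent the parse block so that it becomes a method of StockSpider. As posted, parse is defined at module level, so the spider class has no parse callback of its own, which would explain why no items are written.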

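import scrapy
# StockItem must also be imported from wherever it is defined in your project,
# e.g. (hypothetical path): from myproject.items import StockItem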

class StockSpider(scrapy.Spider):
    name = "ugaz"
    allowed_domains = ["nasdaq.com"]
    start_urls = ["http://www.nasdaq.com/symbol/ugaz/"]

    # Indent this block
    def parse(self, response):
        stock = StockItem()
        stock['quote'] = response.xpath('//*[@id="qwidget_lastsale"]/text()').extract()
        stock['time'] = response.xpath('//*[@id="qwidget_markettime"]/text()').extract()
        return stock
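
To see why the indentation matters, here is a minimal sketch in plain Python that does not use Scrapy at all. BaseSpider is a hypothetical stand-in for scrapy.Spider, and BrokenSpider, FixedSpider and has_own_parse are names made up for this illustration. It only shows that a def written at column zero after the class body is a module-level function, not a method of the class.

```python
class BaseSpider:
    """Hypothetical stand-in for scrapy.Spider (not the real class)."""

    def parse(self, response):
        raise NotImplementedError("parse callback is not defined")


class BrokenSpider(BaseSpider):
    name = "broken"


# Not indented: this is a module-level function, not a BrokenSpider method.
def parse(self, response):
    return {"quote": response}


class FixedSpider(BaseSpider):
    name = "fixed"

    # Indented: this overrides BaseSpider.parse.
    def parse(self, response):
        return {"quote": response}


def has_own_parse(spider_cls):
    """Return True if the class itself defines parse."""
    return "parse" in vars(spider_cls)
```

Here has_own_parse(FixedSpider) is true while has_own_parse(BrokenSpider) is false, so calling parse on a BrokenSpider instance falls through to the base class instead of running the code you wrote.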

Scrapy returns zero results