Error while scraping: instancemethod has no attribute '__getitem__'

Time: 2015-01-27 06:26:45

Tags: python web-scraping scrapy web-crawler scrapy-spider

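I can't figure out why I'm getting this error: 'instancemethod' object has no attribute '__getitem__'. I'm just trying to scrape this site to extract the department names.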

import scrapy
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.selector import Selector
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from urlparse import urljoin
from amazon.items import AmazonItem

class delhiveryspider(CrawlSpider):
    name = "amazon"
    allowed_domains = ["amazon.in"]
    start_urls = ["http://www.amazon.in"]


    def parse(self,response):
        sites = response.xpath('//div[@id="nav_browse_flyout"]')
        items = []

        for site in sites:
            item = AmazonItem()
            item['main_title'] = site.xpath('.//li[@id="nav_cat_0"]/text()').extract[0]
            items.append(item)
        return items

1 answer:

Answer 0 (score: 1)

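Without parentheses, extract is the method object itself, and a method cannot be indexed. You need to call extract() and then take the first item: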

item['main_title'] = site.xpath('.//li[@id="nav_cat_0"]/text()').extract()[0]
#                                                                  HERE ^
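
To see why the missing parentheses produce this error, here is a minimal sketch that reproduces it without Scrapy. FakeSelectorList is a hypothetical stand-in for whatever site.xpath(...) returns, not Scrapy's real class; only the extract() method name is taken from the question. On Python 2 the message is the one quoted in the question, while Python 3 words it as "'method' object is not subscriptable"; both are a TypeError.

```python
class FakeSelectorList(object):
    """Hypothetical stand-in for the object returned by site.xpath(...)."""

    def __init__(self, values):
        self._values = list(values)

    def extract(self):
        # Return the matched strings as a plain list.
        return list(self._values)


result = FakeSelectorList(["Books", "Electronics"])

# Indexing the bound method itself (no parentheses) raises TypeError.
try:
    result.extract[0]
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__

# Calling the method first gives a list, which can be indexed.
first_title = result.extract()[0]
print(error_type, first_title)
```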

If you want a separate item for each category, iterate over the results:

for title in site.xpath('.//li[starts-with(@id, "nav_cat_")]/text()').extract():
    item = AmazonItem()
    item['main_title'] = title
    items.append(item)
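
One thing to watch with a loop like this: text() nodes often carry surrounding whitespace, and some may be whitespace only. The following is a rough sketch of one way to clean the extracted strings before building items. The build_items helper and the sample data are assumptions for illustration, and plain dicts stand in for AmazonItem so the snippet runs without a Scrapy project.

```python
def build_items(raw_titles):
    """Return one dict per non-empty title.

    Plain dicts stand in for AmazonItem here (an assumption, so the
    sketch runs without the amazon.items module).
    """
    items = []
    for raw in raw_titles:
        title = raw.strip()
        if not title:
            continue  # skip whitespace-only text nodes
        items.append({"main_title": title})
    return items


# Hypothetical sample of what .extract() might return.
sample = ["\n  Books ", "   ", "Electronics"]
items = build_items(sample)
print(items)
```

In the spider, the same loop body would create an AmazonItem() instead of a dict.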