Question

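I don't know what I'm doing wrong. I am trying to extract text and store it in a list. In Firebug and FirePath, when I enter the XPath, it shows exactly the right text. But when I use the same path in my spider, it returns an empty list. I am trying to scrape www.insider.in/mumbai. The spider visits every link and scrapes the event title, address, and other information. Here is my newly edited code: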

public void

Edited output:

ViewModel

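Even the if condition fails, and it prints RSVP. I can't work out what I'm doing wrong. I have been stuck on this part for 3 days. Please help.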

Answer 1

I removed things like webdriver and ended up with basic code that works:

import scrapy
from scrapy.http import Request

class insiderSpider(scrapy.Spider):
    name = 'insider'
    allowed_domains = ["insider.in"]
    start_urls = ["http://www.insider.in/mumbai/"]
    event_details = list()  # Changed: event_details is now a class attribute, shared by all callbacks

    def parse(self, response):
        # Collect every link inside the event cards on the listing page
        alllinks = response.xpath('//div[@class="bottom-details-right"]//a/@href').extract()
        print(alllinks)
        for single_event in alllinks:
            # Follow only the links that point at an event page
            if "https://insider.in/event" in single_event:
                yield Request(url=single_event, callback=self.parse_event)
            else:
                print('Other part')

    def parse_event(self, response):
        title = response.xpath('//div[@class = "cell-title in-headerTitle"]/h1//text()').extract()
        print(title)
        temp = response.xpath('//div[@class = "cell-caption centered in-header"]//h3//text()').extract()
        print(temp)
        # Fall back to "RSVP" when the page has no price block
        Price = response.xpath('//div[@class = "bold-caption price"]//text()').extract()
        if not Price:
            Price = "RSVP"
        print(Price)
        Venue_name = response.xpath('normalize-space(//div[@class = "address"]//div[@class = "section-title"]//text())').extract()
        print(Venue_name)
        Venue_address = response.xpath('normalize-space(//div[@class ="address"]//div//text()[preceding-sibling::br])').extract()
        print(Venue_address)
        description = response.xpath('normalize-space(//div[@class="cell-caption accordion-padding"]//text())').extract()
        print(description)
        # event_details is a class attribute, so it is accessed through self
        self.event_details.append([title, temp, Price, Venue_name, Venue_address, description])
        print(self.event_details)
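
As a follow-up to the code above, here is a minimal sketch of how the lists returned by .extract() could be collapsed into one flat record per event. The helper names first_text and build_event and the dictionary keys are hypothetical (they are not part of Scrapy or of the original spider), and treating temp as a header caption is an assumption. The "RSVP" fallback mirrors the if/else in parse_event.

```python
def first_text(values, default=""):
    """Return the first non-blank string in `values`, stripped, else `default`.

    `values` is expected to be the list that .extract() returns; a bare
    string (such as the "RSVP" fallback) is accepted as well.
    """
    if isinstance(values, str):
        values = [values]
    for value in values:
        text = value.strip()
        if text:
            return text
    return default


def build_event(title, caption, price, venue_name, venue_address, description):
    """Collapse the extracted lists into one flat dict (hypothetical keys)."""
    return {
        "title": first_text(title),
        "caption": first_text(caption),
        # No usable price text on the page: treat the event as RSVP
        "price": first_text(price, default="RSVP"),
        "venue_name": first_text(venue_name),
        "venue_address": first_text(venue_address),
        "description": first_text(description),
    }
```

In parse_event one could then yield build_event(title, temp, Price, Venue_name, Venue_address, description) instead of appending to self.event_details, which would let Scrapy's feed exports collect the records. Whether that suits your setup is a judgment call.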

Scrapy XPath returns an empty list in Python
