Question

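I managed to isolate the attribute I want using the debugging spider, but I'm not sure I have incorporated it into my spider correctly. I don't get a clear error message when the spider runs, so I think I have just entered the selector wrong.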

The website I'm scraping is "http://www.smiling-moose.com/events/index.php". The path command I entered into the debugging spider is "response.xpath('//div[@class="show_sec_button"]/text()')", which returned exactly the response I was looking for.

Here is my spider:

import scrapy

from smiling_moose.items import SMItem

class Smspider (scrapy.Spider):
    name = "smspider"
    allowed_domains = ["http://www.smiling-moose.com/index.php"]
    start_urls = [
         "http://www.smiling-moose.com/events/index.php",
    ]

def parse(self, response):
    for sel in response.xpath('//div'):
        item = SMItem()
        item['desc'] = response.xpath('//*[@class="show_sec_band"]/text()').extract()

Here is my Items.py:

import scrapy


class SMItem(scrapy.Item):
    desc = scrapy.Field()

Is there anything in the spider that needs to change? I can post the command prompt output if needed.

Thanks

Answer 1

First, change allowed_domains so that it holds a bare domain name rather than a full URL:

allowed_domains = ["smiling-moose.com"]
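
To illustrate why the entry should be a bare domain: as far as I know, Scrapy's offsite filtering compares the hostname of each follow-up request against allowed_domains, accepting the domain itself and its subdomains. The sketch below imitates that kind of check with the standard library only. `is_allowed` is a hypothetical helper written for this answer, not Scrapy's actual code.

```python
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Hypothetical helper: rough imitation of a hostname-vs-domain check."""
    host = urlparse(url).hostname or ""
    # Accept the domain itself or any subdomain of it.
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


url = "http://www.smiling-moose.com/events/index.php"

# A full URL in allowed_domains can never equal a hostname.
print(is_allowed(url, ["http://www.smiling-moose.com/index.php"]))
# A bare domain matches www.smiling-moose.com as a subdomain.
print(is_allowed(url, ["smiling-moose.com"]))
```

Under this simplified check the first call prints False and the second prints True, which is why the scheme and path should be dropped from the entry.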

Second, yield the item from parse:

item['desc'] = response.xpath('//*[@class="show_sec_band"]/text()').extract()
yield item
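
This probably also explains why the original spider ran without a clear error yet produced nothing: a parse method that fills an item but never yields or returns it hands `None` back to its caller. Here is a plain-Python sketch of the difference, with no Scrapy required. The names `parse_without_yield` and `parse_with_yield` and the sample text are made up for illustration.

```python
def parse_without_yield(texts):
    # Builds the item, then drops it: the function returns None.
    item = {"desc": texts}


def parse_with_yield(texts):
    # Yielding turns the function into a generator the caller can consume.
    item = {"desc": texts}
    yield item


sample = ["Example Band"]  # made-up stand-in for the extracted text

print(parse_without_yield(sample))     # None
print(list(parse_with_yield(sample)))  # [{'desc': ['Example Band']}]
```

If the dedented `def parse` in the question reflects the real file rather than a paste artifact, it also needs to be indented into the Smspider class so that Scrapy can call it as a method.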
