Question

I have a website from which I want to save the values of two span elements.

Here is the relevant part of my HTML:

<div class="box-search-product-filter-row">
    <span class="result-numbers" sth-bind="model.navigationSettings.showFilter">
        <span class="number" sth-bind="span1"></span>
        <span class="result" sth-bind="span2"></span>
    </span>
</div>

I created a spider:

from scrapy.spiders import Spider
from scrapy.selector import Selector

class MySpdier(Spider):

    name = "list"
    allowed_domains = ["example.com"]
    start_urls = [
        "https://www.example.com"]

    def parse(self, response):
        sel = Selector(response)
        divs = sel.xpath("//div[@class='box-search-product-filter-row']")


        for div in divs:
            sth = div.xpath("/span[class='result']/text()").extract()

            print sth

When I run the spider, it only prints:

[]

Can anyone help me get the values from my two span elements (class number and class result)?

Answer 1

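You forgot the @ in your XPath "/span[class='result']/text()": to test an attribute, class has to be written as @class. Also, the span you are looking for is not a direct child of the div, so you need to use .// instead of /. See: http://www.w3schools.com/xsl/xpath_syntax.asp

The complete and correct XPath would be ".//span[@class='result']", with '/text()' appended if you only want to select the text. However, the nodes in your example contain no text, so that will not return anything here.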
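The two fixes above can be checked without running a spider. The following is only a minimal sketch: it assumes the standard library's xml.etree.ElementTree as a stand-in for Scrapy's selector (it supports a subset of XPath, and the question's markup happens to be well-formed XML), and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The markup from the question; both inner spans are empty.
HTML = """
<div class="box-search-product-filter-row">
    <span class="result-numbers" sth-bind="model.navigationSettings.showFilter">
        <span class="number" sth-bind="span1"></span>
        <span class="result" sth-bind="span2"></span>
    </span>
</div>
""".strip()

div = ET.fromstring(HTML)

# Without "@", class is read as a child element name, and without ".//"
# only direct children of the div are searched, so nothing matches.
wrong = div.findall("span[class='result']")

# "@class" tests the attribute; ".//" searches all descendants of the div.
right = div.findall(".//span[@class='result']")

print(wrong)          # []
print(len(right))     # 1
print(right[0].text)  # None, because the span in the question has no text
```

The last line illustrates the caveat in this answer: the corrected expression finds the span, but there is no text in it to extract.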

Answer 2

This will work for you.

Edit:

from scrapy.spiders import Spider
from scrapy.selector import Selector

class MySpdier(Spider):

    name = "list"
    allowed_domains = ["example.com"]
    start_urls = [
        "https://www.example.com"]

    def parse(self, response):
        sel = Selector(response)
        divs = sel.xpath("//div[@class='box-search-product-filter-row']")

        for div in divs:
            # ".//" searches all descendants of the div; "@class" tests the attribute
            sth = div.xpath(".//span[@class='result']/text()").extract()

            print(sth)
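The question asks for both spans (number and result), while the code above reads only result. As a rough sketch of reading both, the snippet below again assumes xml.etree.ElementTree in place of Scrapy's selector; the helper span_texts and the text inside the spans ("42", "results") are made up for illustration and do not come from the original page.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the question's markup with made-up text in the spans.
SAMPLE = """
<div class="box-search-product-filter-row">
    <span class="result-numbers">
        <span class="number">42</span>
        <span class="result">results</span>
    </span>
</div>
""".strip()


def span_texts(div, class_names):
    """Map each class name to the text of the first matching descendant span."""
    values = {}
    for name in class_names:
        span = div.find(".//span[@class='%s']" % name)
        # Missing spans and empty spans both end up as None.
        values[name] = span.text if span is not None else None
    return values


div = ET.fromstring(SAMPLE)
values = span_texts(div, ["number", "result"])
print(values)
```

Inside the spider, the same relative expressions would be passed to div.xpath, for example div.xpath(".//span[@class='number']/text()").extract().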

Scrapy XPath cannot get the value
