Question

Starting from this Deutsche Börse web page, under the table heading Issuer, I want to get the string 'db X-trackers' from the cell next to Name.

Using my web browser, I inspected that table area and copied the code, which I have pasted into this XML tree so that I can test my XPath.

<root>
    <div class="row">
        <div class="col-lg-12">
            <h2>Issuer</h2>
        </div>
    </div>
    <div class="table-responsive">
        <table class="table">
            <tbody>
                <tr>
                    <td>Name</td>
                    <td class="text-right">db X-trackers</td>
                </tr>
            </tbody>
        </table>
    </div>
</root>

According to FreeFormatter.com, my XPath successfully retrieves the correct element (Text='db X-trackers'):

my_xpath = "//h2['Issuer']/ancestor::div[@class='row']/following-sibling::div//td['Name']/following-sibling::td[1]/text()"

Note: it first goes to <h2>Issuer</h2> to identify the right place to start working from.

However, when I run it against the actual web page with Selenium WebDriver, None is returned.

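import re
from selenium.common.exceptions import NoSuchElementException


def get_sibling(driver, my_xpath):
    try:
        find_value = driver.find_element_by_xpath(my_xpath).text
    except NoSuchElementException:
        return None
    else:
        # Keep the first non-empty line of the element's text
        value = re.search(r"(.+)", find_value).group()
        return value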

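I don't believe there is anything wrong with the function itself, so either the XPath must be wrong, or something in the actual page source is throwing it off.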

When I look at the actual source in Chrome, it looks messier than what I see in the Inspector, which is what I used to create the small XML tree above:

<div class="box">
                    <div class="row">
                <div class="col-lg-12">
                        <h2>Issuer</h2>
                </div>
            </div>
    <div class="table-responsive">
            <table class="table">
                    <tbody>
            <tr>
                    <td   >
                        Name
                    </td>
                    <td class="text-right"  >
                        db X-trackers
                    </td>
            </tr>
            <tr>
                    <td   >
                        Product Family
                    </td>
                    <td class="text-right"  >
                        db X-trackers
                    </td>
            </tr>
            <tr>
                    <td   >
                        Homepage
                    </td>
                    <td class="text-right"  >
                        <a target="_blank" href="http://www.etf.db.com">www.etf.db.com</a>
                    </td>
            </tr>
    </tbody>

            </table>
    </div>

Is there something special about the source above, or is my XPath (or function) wrong?

Answer 1

I would use the following and following-sibling axes:

//h2[. = "Issuer"]/following::table//td[. = "Name"]/following-sibling::td

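First, we locate the h2 element whose text is Issuer, then get the table element that follows it. Inside that table, we look for the td element with the text Name, and then get its following td sibling.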
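
Two things are worth checking here. A predicate such as h2['Issuer'] or td['Name'] contains only a non-empty string, which XPath treats as true, so it does not filter anything. Also, in the page source pasted in the question the cell text is surrounded by whitespace, so an exact comparison like . = "Name" may not match on the live page; in XPath 1.0, normalize-space(.) = "Name" is the usual way to tolerate that. Below is a rough sketch of the whitespace effect using only Python's standard library (not Selenium, and not run against the live page). The COMPACT and PADDED strings and the helper names exact_matches and value_for are made up for illustration, and the [.='text'] predicate assumes Python 3.7 or later.

```python
import xml.etree.ElementTree as ET

# Compact markup, like the hand-made tree in the question.
COMPACT = """
<root>
  <div class="row"><div class="col-lg-12"><h2>Issuer</h2></div></div>
  <div class="table-responsive">
    <table class="table"><tbody>
      <tr><td>Name</td><td class="text-right">db X-trackers</td></tr>
    </tbody></table>
  </div>
</root>
"""

# Padded markup, like the real page source: cell text is wrapped in whitespace.
PADDED = """
<root>
  <div class="row"><div class="col-lg-12"><h2>Issuer</h2></div></div>
  <div class="table-responsive">
    <table class="table"><tbody>
      <tr>
        <td>
            Name
        </td>
        <td class="text-right">
            db X-trackers
        </td>
      </tr>
    </tbody></table>
  </div>
</root>
"""


def exact_matches(markup, label):
    """Count <td> cells whose complete text equals `label` exactly."""
    root = ET.fromstring(markup)
    return len(root.findall(f".//td[.='{label}']"))


def value_for(markup, heading, label):
    """Return the stripped text of the cell next to `label`,
    looking in the first table after the <h2> named `heading`."""
    elements = list(ET.fromstring(markup).iter())  # document order
    # Index of the <h2> whose (stripped) text is the heading.
    start = next(i for i, el in enumerate(elements)
                 if el.tag == "h2" and (el.text or "").strip() == heading)
    # First <table> after that heading in document order.
    table = next(el for el in elements[start + 1:] if el.tag == "table")
    for row in table.iter("tr"):
        cells = row.findall("td")
        # Compare whitespace-stripped text, so padding does not matter.
        if len(cells) >= 2 and "".join(cells[0].itertext()).strip() == label:
            return "".join(cells[1].itertext()).strip()
    return None


print(exact_matches(COMPACT, "Name"))
print(exact_matches(PADDED, "Name"))
print(value_for(PADDED, "Issuer", "Name"))
```

The exact comparison finds the Name cell in the compact markup but not in the padded one, while the stripped comparison works for both. With Selenium, the equivalent idea would be to swap . = "Name" for normalize-space(.) = "Name" in the expression above and read the element's .text.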
