Question

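I previously used XPath from Python, and it extracted data from the web page without problems. Now I need to use YQL on the same page, but it is not robust enough.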

What I want to get is 1. Last (AUD) 2. Change 3. Change (%) 4. Cumulative Volume from https://www.shareinvestor.com/fundamental/factsheet.html?counter=TPM.AX The XPath expressions I used in Python are as follows:

xpath('//td[contains(., "Last")]/strong/text()')
xpath('//td[contains(., "Change")]/strong/text()')[0]
xpath('//td[contains(., "Change (%)")]/strong/text()')
xpath('//td[contains(., "Cumulative Volume")]/following-sibling::td[1]/text()')

Part of the HTML is here:

<tr>
                <td rowspan="2" class="sic_lastdone">Last (AUD): <strong>6.750</strong></td>
                <td class="sic_change">Change: <strong>-0.080</strong></td>
                <td>High: <strong>6.920</strong></td>
                <td rowspan="2" class="sic_remarks">
                  Remarks: <strong>-</strong>
                </td>
              </tr>
              <tr>
                <td class="sic_change">Change (%): <strong>-1.17</strong></td>
                <td>Low: <strong>6.700</strong></td>
              </tr>
              <tr>

<tr>
                <td>Cumulative Volume (share)</td>
                <td class='sic_volume'>3,100,209</td>
                <td>Cumulative Value</td>
                <td class='sic_value'></td>
              </tr>

But when I try to apply these expressions in YQL, they do not work. The only query that works is

select * from html where
url="https://www.shareinvestor.com/fundamental/factsheet.html?counter=TPM.AX"
and xpath="//td/strong"

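This returns a lot of data. I want specific values, and the query needs to be robust so that it still works when the web page changes. How can I write a robust XPath for YQL?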

Answer 1

You should avoid building XPath expressions based on visible text.

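I always build XPath expressions based on tag attributes, because they usually do not change. This makes the XPath result unique and keeps it unaffected by changes to the visible text in the HTML.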

For example, the XPath for the "Last (AUD):" value is: //td[@class="sic_lastdone"]/strong/text()
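As a rough illustration of the attribute-based approach, here is a minimal Python sketch that applies class-based paths to the HTML fragment from the question. It assumes the fragment is well-formed enough to parse as XML (it is wrapped in a `<table>` element here, and the unrelated Remarks cell is left out), and it uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, so `.text` stands in for `text()`. The function name `extract_quote` and the dictionary keys are made up for this example. A real page would most likely need a proper HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the question, wrapped in <table> so it parses as XML.
# Assumption: the live page may differ; this is only the quoted sample.
SNIPPET = """
<table>
  <tr>
    <td rowspan="2" class="sic_lastdone">Last (AUD): <strong>6.750</strong></td>
    <td class="sic_change">Change: <strong>-0.080</strong></td>
    <td>High: <strong>6.920</strong></td>
  </tr>
  <tr>
    <td class="sic_change">Change (%): <strong>-1.17</strong></td>
    <td>Low: <strong>6.700</strong></td>
  </tr>
  <tr>
    <td>Cumulative Volume (share)</td>
    <td class="sic_volume">3,100,209</td>
    <td>Cumulative Value</td>
    <td class="sic_value"></td>
  </tr>
</table>
"""


def extract_quote(markup):
    """Pick the four values by class attribute instead of visible text."""
    root = ET.fromstring(markup.strip())

    # The class sic_lastdone occurs once in the sample.
    last = root.find(".//td[@class='sic_lastdone']/strong").text

    # The class sic_change is shared by "Change" and "Change (%)",
    # so the two cells are told apart by document order.
    changes = [s.text for s in root.findall(".//td[@class='sic_change']/strong")]

    # The volume sits directly in the cell, with no <strong> wrapper.
    volume = root.find(".//td[@class='sic_volume']").text

    return {
        "last": last,
        "change": changes[0],
        "change_pct": changes[1],
        "volume": volume,
    }


quote = extract_quote(SNIPPET)
print(quote)
```

Note that in this sample the class sic_change appears on two cells, so the class alone does not make that result unique and the position still matters. The same class-based paths, for instance //td[@class="sic_lastdone"]/strong, could presumably be placed in the xpath clause of the YQL query from the question, though that has not been verified here.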

YQL XPath is not robust enough
