Question

I need to extract the number (0.04) from the last "td" tag in this HTML page.

      <div class="boxContentInner">
         <table class="values non-zebra">
   <thead>
   <tr>
      <th>Apertura</th>
      <th>Max</th>
      <th>Min</th>
      <th>Variazione giornaliera</th>
      <th class="last">Variazione %</th>
   </tr>
   </thead>
   <tbody>
   <tr>
      <td id="open" class="quaternary-header">2708.46</td>
      <td id="high" class="quaternary-header">2710.20</td>
      <td id="low" class="quaternary-header">2705.66</td>
      <td id="change" class="quaternary-header changeUp">0.99</td>
      <td id="percentageChange" class="quaternary-header last changeUp">0.04</td>
   </tr>
   </tbody>
</table>

      </div>

I tried this code with BeautifulSoup on Python 2:


from bs4 import BeautifulSoup
import requests

page = requests.get('https://www.ig.com/au/indices/markets-indices/us-spx-500').text
soup = BeautifulSoup(page, 'lxml')

percent = soup.find('td', {'id': 'percentageChange'})
percent2 = percent.text

print percent2

The result is None.

Where is the mistake?

Answer 1

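I looked at https://www.ig.com/au/indices/markets-indices/us-spx-500, and it seems that percent = soup.find('td', {'id': 'percentageChange'}) is not searching for the right element: the HTML returned for that URL does not appear to contain a td with that id.

The actual value is in <span data-field="CPC">VALUE</span>

You can retrieve it like this: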

percent = soup.find("span", {'data-field': 'CPC'})
print(percent.text.strip())
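
To see why a lookup can come back as None, here is a minimal offline sketch. The SAMPLE_HTML string and the extract_field helper are made up for illustration; the live page's markup may differ. It uses the built-in "html.parser" so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet standing in for the live page; the real markup may differ.
SAMPLE_HTML = """
<div class="price-ticket">
  <span data-field="CPC"> 0.04 </span>
</div>
"""


def extract_field(html, field):
    """Return the stripped text of the first <span data-field=field>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("span", {"data-field": field})
    # find() returns None when nothing matches, so check before reading the text.
    return tag.get_text(strip=True) if tag is not None else None


print(extract_field(SAMPLE_HTML, "CPC"))      # matching span is present
print(extract_field(SAMPLE_HTML, "missing"))  # no such span, so None
```

Checking for None before touching .text makes a wrong selector show up as a clear "not found" rather than an AttributeError.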

Answer 2

This works for me.

percents = soup.find_all("span", {'data-field': 'CPC'})
for percent in percents:
  print(percent.text.strip())
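
If the values are needed as numbers rather than strings, the same find_all call can feed a list comprehension. This is only a sketch against made-up markup (SAMPLE_HTML and the "OTHER" field are hypothetical), and it assumes every matching span holds plain numeric text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup in which the same field appears twice; not copied from the live page.
SAMPLE_HTML = """
<span data-field="CPC">0.04</span>
<span data-field="CPC">0.04</span>
<span data-field="OTHER">2708.46</span>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all() returns every matching tag (an empty list when there is none).
# float() will raise ValueError if a span holds non-numeric text such as "-".
values = [
    float(span.get_text(strip=True))
    for span in soup.find_all("span", {"data-field": "CPC"})
]
print(values)
```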
