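BeautifulSoup find() returns odd data

Time: 2018-03-10 01:42:45

Tags: python beautifulsoup

I am using BeautifulSoup to scrape data from a website. I can find the data I want, but when I print it, it shows up as "-1", even though the value in that field is 32.27. Here is the code I am using: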

import requests
from BeautifulSoup import BeautifulSoup
import csv

symbols = {'451020'}

with open('industry_pe.csv', "ab") as csv_file:
    writer = csv.writer(csv_file, delimiter=',')
    writer.writerow(['Industry','PE'])
    for s in symbols:
        try:
            url = 'https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/industries.jhtml?tab=learn&industry='
            full = url + s
            response = requests.get(full)
            html = response.content
            soup = BeautifulSoup(html)

            for PE in soup.find("div", {"class": "sec-fundamentals"}):
                print PE
                #IndPE = PE.find("td")
                #print IndPE

When I print PE, it returns:

    <h2>
                        Industry Fundamentals
                      <span>AS OF 03/08/2018</span>
</h2>


<table summary="" class="data-tbl">
<colgroup>
<col class="col1" />
<col class="col2" />
</colgroup>
<thead>
<tr>
<th scope="col"></th>
<th scope="col"></th>
</tr>
</thead>
<tbody>
<tr>
<th scope="row" class="align-left"><a href="javascript:void(0);" onclick="javasc
ript:openPopup('https://www.fidelity.com//webcontent/ap010098-etf-content/18.01/
help/research/learn_er_glossary_3.shtml#priceearningsratio',420,450);return fals
e;">P/E (Last Year GAAP Actual)</a></th>
<td>



                        32.27


           </td>
</tr>
<tr>
<th scope="row" class="align-left"><a href="javascript:void(0);" onclick="javasc
ript:openPopup('https://www.fidelity.com//webcontent/ap010098-etf-content/18.01/
help/research/learn_er_glossary_3.shtml#priceearningsratio',420,450);return fals
e;">P/E (This Year's Estimate)</a>.....

I expect to get the value 32.27 from the 'td', but when I use the commented-out code to find and print the 'td', it gives me this:

-1
None
-1
<td>



                       32.27


           </td>
-1     

Any ideas?

1 Answer:

Answer 0 (score: 0):

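find() returns only the first matching tag. Iterating over that tag then yields its direct children one at a time, and those children include the whitespace strings between elements as well as the tags themselves. Calling find("td") on a string invokes the ordinary string method, which returns -1 when the substring is absent; calling it on a tag that contains no <td>, such as the <h2>, returns None. That is why your output mixes -1, None, and the one real <td>.

So, to get the <td> tags in the table, first find the enclosing block and store it in a variable. Then use find_all('td') to iterate over all of the td tags. Note that find_all is the bs4 spelling; the older BeautifulSoup 3 package imported in the question calls it findAll.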
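
To make the source of the -1 and None values concrete, here is a minimal sketch. It assumes the newer bs4 package rather than the BeautifulSoup 3 import used in the question, and it uses a made-up, cut-down HTML snippet in place of the real page. Looping over the div visits whitespace strings as well as tags, and find("td") behaves differently on each.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical, cut-down stand-in for the page's markup (not the real page).
html = """<div class="sec-fundamentals">
<h2>Industry Fundamentals</h2>
<table><tbody><tr><th>P/E</th><td> 32.27 </td></tr></tbody></table>
</div>"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", {"class": "sec-fundamentals"})

results = []
for child in div:                  # iterates over the div's direct children
    # On a string this is the plain string method (-1 if "td" is absent);
    # on a tag it is BeautifulSoup's search (None if no <td> is inside).
    found = child.find("td")
    kind = "tag" if isinstance(child, Tag) else "string"
    results.append((kind, found))
    print(kind, repr(found))
```

The printed sequence alternates between string children and tag children, which mirrors the -1 / None / -1 / <td> / -1 pattern in the question.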

table = soup.find("div", {"class": "sec-fundamentals"})
for row in table.find_all('td'):
    print(row.text.strip())

Partial output:

32.27
34.80
$122.24B
$3.41
14.14%
15.88%
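
If the labels are needed alongside the values (for example, to fill the CSV columns in the question), one possible approach is to walk the rows and pair each row's th with its td. This is only a sketch under assumptions: it uses bs4, a made-up two-row sample instead of the live page, and a hypothetical dictionary name, fundamentals.

```python
from bs4 import BeautifulSoup

# Hypothetical, cut-down stand-in for the page's markup (not the real page).
html = """<div class="sec-fundamentals"><table><tbody>
<tr><th scope="row"><a href="#">P/E (Last Year GAAP Actual)</a></th><td>
    32.27
</td></tr>
<tr><th scope="row"><a href="#">P/E (This Year's Estimate)</a></th><td>
    34.80
</td></tr>
</tbody></table></div>"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("div", {"class": "sec-fundamentals"})

fundamentals = {}
for row in table.find_all("tr"):
    label = row.find("th")
    value = row.find("td")
    if label is None or value is None:
        continue                   # skip rows that lack a label or a value
    fundamentals[label.get_text(strip=True)] = value.get_text(strip=True)

print(fundamentals)
```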

If you only want the first value, you can use:

table = soup.find("div", {"class": "sec-fundamentals"})
value = table.find('td').text.strip()
print(value)
# 32.27
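
The one-liner above raises AttributeError if the div or the td is missing, because find() returns None in that case. A more defensive variant might look like the sketch below; first_pe is a hypothetical helper name, the sample HTML is made up, and bs4 is assumed.

```python
from bs4 import BeautifulSoup

def first_pe(html):
    """Return the first <td> value in the fundamentals block as a float, or None."""
    soup = BeautifulSoup(html, "html.parser")
    block = soup.find("div", {"class": "sec-fundamentals"})
    if block is None:              # the page layout changed or the request failed
        return None
    cell = block.find("td")
    if cell is None:               # the block exists but holds no data cell
        return None
    try:
        return float(cell.get_text(strip=True))
    except ValueError:             # the cell holds something non-numeric
        return None

# Hypothetical sample input (not the real page).
sample = ('<div class="sec-fundamentals"><table><tr><th>P/E</th>'
          '<td>\n  32.27\n</td></tr></table></div>')
print(first_pe(sample))
print(first_pe("<p>no data</p>"))
```

Returning None rather than raising lets the calling loop skip a symbol and carry on with the next one.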