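I'm using BeautifulSoup to scrape data from a website. I can locate the data I want, but when I print it, it shows up as "-1". The value in that field is 32.27. Here is the code I'm using: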
import requests
from BeautifulSoup import BeautifulSoup
import csv

symbols = {'451020'}
with open('industry_pe.csv', "ab") as csv_file:
    writer = csv.writer(csv_file, delimiter=',')
    writer.writerow(['Industry','PE'])
    for s in symbols:
        try:
            url = 'https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/industries.jhtml?tab=learn&industry='
            full = url + s
            response = requests.get(full)
            html = response.content
            soup = BeautifulSoup(html)
            for PE in soup.find("div", {"class": "sec-fundamentals"}):
                print PE
                #IndPE = PE.find("td")
                #print IndPE
        except Exception as e:
            # placeholder: the except clause is not shown in the post
            print e
When I print PE, it returns...
<h2>
Industry Fundamentals
<span>AS OF 03/08/2018</span>
</h2>
<table summary="" class="data-tbl">
<colgroup>
<col class="col1" />
<col class="col2" />
</colgroup>
<thead>
<tr>
<th scope="col"></th>
<th scope="col"></th>
</tr>
</thead>
<tbody>
<tr>
<th scope="row" class="align-left"><a href="javascript:void(0);" onclick="javascript:openPopup('https://www.fidelity.com//webcontent/ap010098-etf-content/18.01/help/research/learn_er_glossary_3.shtml#priceearningsratio',420,450);return false;">P/E (Last Year GAAP Actual)</a></th>
<td>
32.27
</td>
</tr>
<tr>
<th scope="row" class="align-left"><a href="javascript:void(0);" onclick="javascript:openPopup('https://www.fidelity.com//webcontent/ap010098-etf-content/18.01/help/research/learn_er_glossary_3.shtml#priceearningsratio',420,450);return false;">P/E (This Year's Estimate)</a>.....
I want to get the value 32.27 from the 'td', but when I use the commented-out code to find and print the 'td', it gives me this:
-1
None
-1
<td>
32.27
</td>
-1
Any ideas?
Answer 0 (score: 0)
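The find() method returns the first matching tag. Iterating over that tag then gives you each of its direct children one by one, and those children include the whitespace text between tags as well as the <h2> and <table> elements. That is most likely where the odd output comes from: calling find("td") on a text child falls through to the ordinary string method, which returns -1 when the substring is absent, and calling it on the <h2> returns None because it contains no <td>.

So, to get the <td> tags in the table, first find the container and store it in a variable. Then use find_all('td') to iterate over all the td tags.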
table = soup.find("div", {"class": "sec-fundamentals"})
for row in table.find_all('td'):
    print(row.text.strip())
Partial output:
32.27
34.80
$122.24B
$3.41
14.14%
15.88%
If you only want the first value, you can use:
table = soup.find("div", {"class": "sec-fundamentals"})
value = table.find('td').text.strip()
print(value)
# 32.27
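
To see the difference between iterating over the div and calling find_all('td') without hitting the live site, here is a minimal sketch. It assumes the bs4 package (BeautifulSoup 4) and uses a trimmed-down stand-in for the markup in the question; the HTML string and the names kinds, values, first and by_label are made up for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical, shortened copy of the markup from the question.
HTML = """
<div class="sec-fundamentals">
<h2>Industry Fundamentals</h2>
<table class="data-tbl">
<tbody>
<tr><th scope="row">P/E (Last Year GAAP Actual)</th><td>
32.27
</td></tr>
<tr><th scope="row">P/E (This Year's Estimate)</th><td>
34.80
</td></tr>
</tbody>
</table>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")
div = soup.find("div", {"class": "sec-fundamentals"})

# Iterating over a tag yields its direct children, text nodes included.
kinds = ["text" if isinstance(child, NavigableString) else child.name
         for child in div]
print(kinds)  # ['text', 'h2', 'text', 'table', 'text']

# find_all searches all descendants, so it reaches the cells directly.
values = [td.get_text(strip=True) for td in div.find_all("td")]
print(values)  # ['32.27', '34.80']

# find returns only the first match.
first = div.find("td").get_text(strip=True)

# Pair each row label with its value (every row here has one th and one td).
by_label = {tr.th.get_text(strip=True): tr.td.get_text(strip=True)
            for tr in div.find_all("tr")}
print(by_label["P/E (Last Year GAAP Actual)"])  # 32.27
```

The first printed list shows that three of the five children are plain text, which matches the three -1 lines in the question's output. Keying the values by their row label is optional, but it avoids depending on the order of the cells if the page layout changes.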