How do I access the second of two elements with the same tag?

Time: 2013-07-29 13:46:16

Tags: python html parsing beautifulsoup

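I am trying to access the value inside a div with the class "lead-value". This is the second div on the page with that class, so I am trying to reach this particular instance by narrowing the search to the parent element that contains the second 'lead-value'.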

Here is the HTML:

<td title="College Readiness is based on the percentages of 12th graders who were tested and passed AP&#174; exams. The maximum college readiness index value is 100.0." class="column-last column-even table-column-last table-column-even  g_school_in_country_college_readiness_index_stacked  cluetip">

                    <div>
    <p><div class="lead-value">100.0</div>

So I would like to use the monster class name "column-last column-even table-column-last table-column-even g_school_in_country_college_readiness_index_stacked cluetip" to get the value '100.0'.

How can I do this with BeautifulSoup?

1 answer:

Answer 0 (score: 3)

For example, suppose the original example.html file looks like this:

<div class="lead-value">80.0</div>
<div class="lead-value">100.0</div>
<div class="lead-value">120.0</div>
<div class="lead-value">140.0</div>

The Python code is:

>>> from bs4 import BeautifulSoup
>>> inf = open("example.html")
>>> content = inf.read()
>>> inf.close()
>>> soup = BeautifulSoup(content, "html.parser")
>>> # find_all returns every matching tag, in document order
>>> soup.find_all('div', {'class': 'lead-value'})
[<div class="lead-value">80.0</div>, <div class="lead-value">100.0</div>, <div class="lead-value">120.0</div>, <div class="lead-value">140.0</div>]
>>> blocks = soup.find_all('div', {'class': 'lead-value'})
>>> # index 1 is the second occurrence
>>> print(blocks[1].string)
100.0
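
The question also asks about selecting by the parent cell's class rather than by position. Below is a minimal sketch of that approach, assuming the bs4 package and a simplified version of the question's markup. The surrounding table and the first cell are made up for illustration, and the long class list is shortened. Since `class` is a multi-valued attribute, matching on one distinctive class name should be enough to find the cell.

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the page in the question (hypothetical markup).
html = """
<table><tr>
<td class="column-first"><div class="lead-value">80.0</div></td>
<td class="column-last column-even g_school_in_country_college_readiness_index_stacked cluetip">
  <div><div class="lead-value">100.0</div></div>
</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Locate the parent cell by one of its classes, then search only inside it.
cell = soup.find("td", class_="g_school_in_country_college_readiness_index_stacked")
value = cell.find("div", class_="lead-value").get_text(strip=True)
print(value)
```

Searching inside the cell does not depend on how many other "lead-value" divs appear earlier on the page, which an index such as blocks[1] does.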