How do I scrape a table's height and width using BeautifulSoup?

Time: 2011-02-10 13:15:28

Tags: python beautifulsoup

<table id="t_id" cellspacing="0" border="0" align="center" height="700" width="600" cellpadding="0">
<tbody>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
</tbody>
</table>

2 answers:

Answer 0 (score: 3)

People tend to prefer lxml over BeautifulSoup these days. Look how easy it is:

from lxml import etree
data = """<table id="t_id" cellspacing="0" border="0" align="center" height="700" width="600" cellpadding="0">
<tbody>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
</tbody>
</table>
"""
# etree.fromstring parses XML, so the markup must be well-formed
tree = etree.fromstring(data)
# xpath() returns a list of matching elements; take the first one
table_element = tree.xpath("/table")[0]
print(table_element.attrib['height'] + " and " + table_element.attrib['width'])
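
Real pages are often not well-formed XML, and `etree.fromstring` raises an error on them. A minimal sketch of the same lookup with lxml's lenient HTML parser follows; the helper name `table_height_width` is hypothetical, and selecting the table by its `id` is an assumption about how you want to locate it.

```python
from lxml import html

DATA = """<table id="t_id" cellspacing="0" border="0" align="center" height="700" width="600" cellpadding="0">
<tbody>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
</tbody>
</table>
"""


def table_height_width(markup, table_id):
    """Return (height, width) attribute strings, or None if no such table exists."""
    doc = html.fromstring(markup)  # tolerant HTML parser, unlike etree.fromstring
    # "//table" searches the whole tree, so the table may sit at any depth
    matches = doc.xpath("//table[@id=$tid]", tid=table_id)
    if not matches:
        return None
    table = matches[0]
    # .get() returns None for a missing attribute instead of raising KeyError
    return table.get("height"), table.get("width")


print(table_height_width(DATA, "t_id"))
```

The values come back as strings, exactly as written in the markup, so convert them yourself if you need numbers.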

Answer 1 (score: 1)

If this is your entire HTML, then this is sufficient:

from bs4 import BeautifulSoup  # BeautifulSoup 4 (the bs4 package)
soup = BeautifulSoup("...your HTML...", "html.parser")
print(soup.table['width'], soup.table['height'])
# prints: 600 700

If you need to search for the table first, it is not much more complicated either:

table = soup.find('table', id='t_id')
print(table['width'], table['height'])
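
Putting the pieces together, here is a self-contained sketch that runs against the table from the question. It assumes BeautifulSoup 4 (`bs4`) with the built-in `html.parser`; the helper name `table_size` and the choice to return `None` for a missing or non-numeric attribute (such as `width="100%"`) are illustrative assumptions, not part of the library.

```python
from bs4 import BeautifulSoup

HTML = """<table id="t_id" cellspacing="0" border="0" align="center" height="700" width="600" cellpadding="0">
<tbody>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
<tr><td> ..test... </td></tr>
</tbody>
</table>
"""


def to_int(value):
    """Convert an attribute string to int; None if it is missing or not purely digits."""
    if value is not None and value.isdigit():
        return int(value)
    return None


def table_size(markup, table_id):
    """Return (width, height) for the table with the given id, or None if not found."""
    soup = BeautifulSoup(markup, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return None
    # Tag.get() returns None for a missing attribute, whereas table['width'] raises KeyError
    return to_int(table.get("width")), to_int(table.get("height"))


print(table_size(HTML, "t_id"))  # prints: (600, 700)
```

Using `.get()` rather than indexing keeps the lookup from failing on tables that omit one of the two attributes.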