Getting a tag's style with BeautifulSoup

Posted: 2016-06-15 19:11:38

Tags: python beautifulsoup python-requests

I am scraping a page and getting all the <tr> elements from a table on that page, like this:

import requests
from bs4 import BeautifulSoup

r = requests.get("http://lol.esportswikis.com/wiki/G2_Esports/Match_History")
s = BeautifulSoup(r.content, "lxml")
tr = s.find_all("table", class_="wikitable sortable")[0].find_all("tr")[3:]

print(tr[0])

Output:

<tr style="background-color:#C6EFCE"><td>...</td> ... <td>...</td></tr>

Now I am trying to get the style of the <tr> tag, but I can't figure out how. If I do this:

for item in tr[0]:
    print(item)

it obviously just prints the <td> ... </td> children. I figured there might be something like tr[0].something, for example tr[0].tag, but nothing I have tried so far gives me what I want.

1 answer:

Answer 0 (score: 1)

Just use tag["attribute"] to access the attribute:

In [28]: soup = BeautifulSoup('<tr style="pretty"></tr>', 'html.parser')

In [29]: print(soup.find("tr")["style"])
pretty
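
As a small sketch of the same idea: a tag behaves much like a dictionary of its attributes, so `.get()` and `.attrs` are available alongside indexing. The markup and variable names below are invented for illustration, and `html.parser` is used only so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one row with a style attribute, one without.
html = (
    '<table>'
    '<tr style="background-color:#C6EFCE"><td>a</td></tr>'
    '<tr><td>b</td></tr>'
    '</table>'
)
soup = BeautifulSoup(html, "html.parser")

styled, plain = soup.find_all("tr")

# Indexing returns the attribute value (and raises KeyError if it is missing).
style_value = styled["style"]

# .get() returns None, or a default you supply, when the attribute is missing.
missing_value = plain.get("style")
fallback_value = plain.get("style", "")

# .attrs is the dict of all attributes on the tag.
all_attrs = styled.attrs

print(style_value)
print(missing_value)
print(all_attrs)
```

Iterating over a tag, as in the question, walks its children; the attributes live on the tag itself, which is why indexing is the way to reach them.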

If you only want the tr tags that have a style attribute, filter for them:

trs = s.find("table", class_="example-table").find_all("tr", style=True)

for tr in trs:
    print(tr["style"])

Or use a CSS selector:

trs = s.select("table.example-table tr[style]")

for tr in trs:
    print(tr["style"])
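
To check that the two approaches select the same rows, here is a self-contained sketch that runs both against an inline snippet. The table contents are made up; only the `example-table` class name comes from the placeholder above.

```python
from bs4 import BeautifulSoup

# Hypothetical table: a header row without a style, two styled rows.
html = """
<table class="example-table">
  <tr><th>header</th></tr>
  <tr style="background-color:#C6EFCE"><td>1</td></tr>
  <tr style="background-color:#FFC7CE"><td>2</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Keyword filter: style=True matches any <tr> that has a style attribute.
table = soup.find("table", class_="example-table")
via_find_all = [tr["style"] for tr in table.find_all("tr", style=True)]

# CSS selector: tr[style] is the attribute-presence selector.
via_select = [tr["style"] for tr in soup.select("table.example-table tr[style]")]

print(via_find_all)
print(via_select)
```

Both lists should contain the two styled rows and skip the header row. One difference worth keeping in mind: a multi-word `class_` string such as "wikitable sortable" is compared against the exact attribute value, whereas a selector such as `table.wikitable.sortable` matches the classes in any order.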

Using your actual URL:

In [41]: r = requests.get("http://lol.esportswikis.com/wiki/G2_Esports/Match_History")

In [42]: s = BeautifulSoup(r.content, "lxml")

In [43]: trs = s.select("table.wikitable.sortable tr[style]")

In [44]: for tr in trs:
   ....:     print(tr["style"])
   ....:
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#FFC7CE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#FFC7CE
background-color:#FFC7CE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#FFC7CE
background-color:#C6EFCE
background-color:#FFC7CE
background-color:#FFC7CE
background-color:#C6EFCE
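
If you need the individual CSS properties rather than the raw string, one option is to split the value yourself. This is only a minimal sketch: `parse_style` is a hypothetical helper, not part of BeautifulSoup, and the naive split on ";" and ":" assumes simple declarations like the ones printed above (it would break on values that themselves contain a semicolon).

```python
from collections import Counter


def parse_style(style):
    """Split an inline style string into a {property: value} dict."""
    declarations = {}
    for part in style.split(";"):
        if ":" not in part:
            continue  # skip empty pieces such as the one after a trailing ';'
        name, value = part.split(":", 1)
        declarations[name.strip().lower()] = value.strip()
    return declarations


# A few values copied from the output above.
styles = [
    "background-color:#C6EFCE",
    "background-color:#C6EFCE",
    "background-color:#FFC7CE",
]

# Tally how often each background colour occurs.
color_counts = Counter(parse_style(s).get("background-color") for s in styles)
print(color_counts)
```

In the real script you would build `styles` from `[tr["style"] for tr in trs]` instead of the hard-coded list.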