I'm trying to extract the "totalvotes" value from this XML:
<poll title="User Suggested Number of Players" totalvotes="0" name="suggested_numplayers">
<results numplayers="3+"> </results>
</poll>
I've tried many different combinations of the following code, but none of them work.
soup.find_all('poll',{'title':'User Suggested Number of Players'})[0].find_all('totalvotes')
In this case, I'm just trying to retrieve the value 0. How can I do that?
Thanks.
Answer 0 (score: 1)
There are several ways to get the element; one is to use CSS selectors:
data = '''<poll title="User Suggested Number of Players" totalvotes="0" name="suggested_numplayers">
<results numplayers="3+"> </results>
</poll>'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'html.parser')
# method 1 (select the first <poll> that has a "totalvotes" attribute)
print(soup.select_one('poll[totalvotes]')['totalvotes'])
# method 2 (more specific: select the <poll> whose title attribute is "User Suggested Number of Players")
print(soup.select_one('poll[title="User Suggested Number of Players"][totalvotes]')['totalvotes'])
# method 3 (select the <poll> that has a <results> tag inside it)
print(soup.select_one('poll:has(results)[totalvotes]')['totalvotes'])
Prints:
0
0
0
Answer 1 (score: 0)
To extract from the first element:
soup.find('poll').get('totalvotes')
To extract from all elements:
for poll in soup.find_all('poll'):
    print(poll.get('totalvotes'))
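
For context on why the attempt in the question finds nothing: find_all('totalvotes') looks for child tags named <totalvotes>, but totalvotes is an attribute of the <poll> tag itself. Below is a minimal sketch using the XML from the question; the variable names (no_tags, votes, missing) and the attribute name 'no_such_attribute' are made up for illustration.

```python
from bs4 import BeautifulSoup

data = '''<poll title="User Suggested Number of Players" totalvotes="0" name="suggested_numplayers">
<results numplayers="3+"> </results>
</poll>'''

soup = BeautifulSoup(data, 'html.parser')
poll = soup.find('poll', {'title': 'User Suggested Number of Players'})

# find_all() searches for descendant tags; there is no <totalvotes> tag,
# so this returns an empty result list.
no_tags = poll.find_all('totalvotes')

# Attributes are read from the tag itself. Values come back as strings,
# so convert explicitly if a number is needed.
votes = int(poll['totalvotes'])

# .get() returns None for a missing attribute, while poll['...'] would
# raise KeyError. 'no_such_attribute' is a hypothetical name.
missing = poll.get('no_such_attribute')

print(no_tags, votes, missing)
```

Using .get() is probably the safer choice when some <poll> elements might lack the attribute; indexing with [] is fine when it is known to be present.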