Cannot get <span> tag data from p class="info" with BeautifulSoup

Time: 2018-05-30 04:06:10

Tags: python beautifulsoup

I can't get the tag data: BeautifulSoup does not return the <span> contents of the p class="info" tag. Thanks!

from bs4 import BeautifulSoup 
import re

html = """"
<p class="info">
<span>Kranji Mile Day simulcast races, 
Kranji Racecourse, SIN</span>
<span>Class 3 Handicap   -  1200M TURF</span>
<span>Saturday, 26 May 2018</span>
<span>Race 1, 5:15 PM</span>
</p>
"""
soup = BeautifulSoup(html, "html.parser")
table = soup.find('p', attrs={class:'info'})
rows = table.findAll("span")

print rows

Expected output, comma-separated:

Kranji Mile Day simulcast races, Kranji Racecourse, SIN , Class 3, Handicap, 1200M, TURF, Saturday, 26 May 2018, Race 1, 5:15PM

3 answers:

Answer 0 (score: 0)

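As a keyword argument it is class_, because class is a reserved keyword in Python. Inside the attrs dict, the key must be the quoted string 'class'. Either of these works: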

table = soup.find('p', attrs={'class':'info'})

table = soup.find('p', class_='info')
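
As a quick check that the two spellings above are interchangeable, here is a minimal sketch. The tiny HTML snippet is made up for illustration, and the select_one call with a CSS selector is an extra option not mentioned in the original answer.

```python
from bs4 import BeautifulSoup

# Made-up snippet for illustration: one matching and one non-matching <p>.
html = '<p class="info"><span>Race 1, 5:15 PM</span></p><p class="other">x</p>'
soup = BeautifulSoup(html, "html.parser")

by_attrs = soup.find("p", attrs={"class": "info"})  # dict key is the quoted string 'class'
by_keyword = soup.find("p", class_="info")          # keyword form needs the trailing underscore
by_css = soup.select_one("p.info")                  # CSS selector alternative

# All three lookups return the same Tag object from the parse tree.
print(by_attrs is by_keyword is by_css)
```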
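
Use the text attribute to join all of the text inside a tag. The string attribute does not work if the tag contains another tag; in that case it returns None.
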
print(', '.join(i.text for i in rows))  # join the text of every span
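
To make the difference between the two attributes concrete, here is a small sketch. The snippet with a nested <b> tag is invented for the example; the spans in the question contain no nested tags, so both attributes would work there.

```python
from bs4 import BeautifulSoup

# Invented snippet: the first span contains a nested <b>, the second does not.
html = '<p class="info"><span>Race 1, <b>5:15 PM</b></span><span>1200M TURF</span></p>'
soup = BeautifulSoup(html, "html.parser")
nested, plain = soup.find("p", class_="info").find_all("span")

print(nested.string)  # None, because the span has more than one child
print(nested.text)    # Race 1, 5:15 PM
print(plain.string)   # 1200M TURF
```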

Answer 1 (score: 0)

Once the class problem is fixed, as described in the other answers, you still need to extract the strings from the tags:

result = ', '.join(r.string for r in rows)
print(result)
#Kranji Mile Day simulcast races, 
# Kranji Racecourse, SIN, Class 3 Handicap   -  1200M TURF, Saturday, 26 May 2018, Race 1, 5:15 PM
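
The output above still carries the line break and the repeated spaces from the source HTML. One possible way to tidy it, sketched here with a hypothetical helper named clean, is to collapse all whitespace runs before joining. Note that this does not split "Class 3 Handicap - 1200M TURF" into separate fields as the expected output in the question does; that would need extra parsing.

```python
from bs4 import BeautifulSoup

html = """
<p class="info">
<span>Kranji Mile Day simulcast races, 
Kranji Racecourse, SIN</span>
<span>Class 3 Handicap   -  1200M TURF</span>
<span>Saturday, 26 May 2018</span>
<span>Race 1, 5:15 PM</span>
</p>
"""
soup = BeautifulSoup(html, "html.parser")
rows = soup.find("p", class_="info").find_all("span")


def clean(tag):
    """Hypothetical helper: collapse newlines and repeated spaces into single spaces."""
    return " ".join(tag.get_text().split())


result = ", ".join(clean(r) for r in rows)
print(result)
```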

Answer 2 (score: 0)

Hmm, in Python 3 this works fine for me if you just put quotes around "class" in this line:

table = soup.find('p', attrs={'class':'info'})
                          ^

The output will be the <span> elements, though, not just the text. Do you want the elements or only the text?
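
For comparison, a short sketch of what each choice looks like. The two-span snippet is a shortened version of the HTML in the question.

```python
from bs4 import BeautifulSoup

# Shortened version of the question's HTML.
html = '<p class="info"><span>Saturday, 26 May 2018</span><span>Race 1, 5:15 PM</span></p>'
soup = BeautifulSoup(html, "html.parser")
rows = soup.find("p", attrs={"class": "info"}).find_all("span")

elements = [str(r) for r in rows]     # markup, tags included
texts = [r.get_text() for r in rows]  # text only

print(elements)  # ['<span>Saturday, 26 May 2018</span>', '<span>Race 1, 5:15 PM</span>']
print(texts)     # ['Saturday, 26 May 2018', 'Race 1, 5:15 PM']
```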