BeautifulSoup4 does not treat a class attribute containing spaces as a single string

Time: 2015-12-19 18:04:38

Tags: python beautifulsoup

>>> soup = BeautifulSoup('<div class="class1 class2 class3">...</div>','lxml')
>>> soup.find('div')['class']
['class1', 'class2', 'class3']

How can I force BS4 to treat the class attribute as a single string?

1 answer:

Answer 0 (score: 1)

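You can use "xml" as the parser (this requires lxml to be installed). The XML builder does not split multi-valued attributes, so the value stays a single string: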

soup = BeautifulSoup('<div class="class1 class2 class3">...</div>',"xml")
print(soup.find('div')['class'])
class1 class2 class3
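If switching parsers is not desirable (the "xml" parser treats the markup as XML rather than HTML), another option is to keep the HTML parser and join the list back into a string. The following is a minimal sketch; `class_string` is a hypothetical helper name, not part of Beautiful Soup, and joining with a single space normalizes any extra whitespace in the original attribute.

```python
from bs4 import BeautifulSoup

html = '<div class="class1 class2 class3">...</div>'
soup = BeautifulSoup(html, "html.parser")


def class_string(tag):
    """Return the tag's class attribute as one space-separated string."""
    value = tag.get("class", [])
    # HTML builders return a list here; the XML parser returns a plain
    # string, so both cases are handled.
    return value if isinstance(value, str) else " ".join(value)


print(class_string(soup.find("div")))
```

This leaves the parser's behaviour untouched, so other code that expects `tag['class']` to be a list keeps working.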

Or you can remove 'class' from builder.cdata_list_attributes['*'], the list of attributes that are split on whitespace for every tag:

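from bs4 import BeautifulSoup

# Remove 'class' by name rather than by index. This mutates a structure
# shared by the HTML builders, so it can affect soups created afterwards.
BeautifulSoup('', 'lxml').builder.cdata_list_attributes['*'].remove('class')

soup = BeautifulSoup('<div class="class1 class2 class3">...</div>', 'lxml')
print(soup.find('div')['class'])
class1 class2 class3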
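Newer versions of Beautiful Soup (4.8.0 and later, to my knowledge) accept a `multi_valued_attributes` argument in the constructor, which avoids modifying shared builder state. A short sketch, assuming such a version is installed and using the built-in "html.parser" so that lxml is not needed:

```python
from bs4 import BeautifulSoup

html = '<div class="class1 class2 class3">...</div>'

# multi_valued_attributes=None asks the tree builder not to split any
# attribute value on whitespace; it applies only to this soup object.
soup = BeautifulSoup(html, "html.parser", multi_valued_attributes=None)

class_value = soup.find("div")["class"]
print(class_value)
```

Because the setting is passed per call, other soups in the same process still get the default list behaviour.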