Using BeautifulSoup4, I can isolate this element:
<a href="#" data-nutrition='{
"serving-name":"Milk, 2%",
"serving-size":"16 FL OZ",
"calories":"267"}'>
Milk, 2%
<i class="icon-leaf icon-hidden-text">Meatless</i>
</a>
by running:
for i in soup('a', attrs={'data-nutrition': True}):
    sample = i
    break
print(sample)
I need to create this dictionary from it:
my_dict = {
    'serving-name': 'Milk, 2%',
    'serving-size': '16 FL OZ',
    'calories': '267'
}
How can I do this in Python using BeautifulSoup4?
Answer 0 (score: 1)
Find the element and use json.loads() to load the data-nutrition attribute value into a Python dictionary:
import json
from bs4 import BeautifulSoup
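# The attribute value is wrapped in single quotes so that the double quotes
# inside the JSON do not end the attribute early.
data = """
<a href="#" data-nutrition='{
"serving-name":"Milk, 2%",
"serving-size":"16 FL OZ",
"calories":"267"}'>
Milk, 2%
<i class="icon-leaf icon-hidden-text">Meatless</i>
</a>"""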
soup = BeautifulSoup(data, "html.parser")
a = soup.select_one("a[data-nutrition]")
nutrition = json.loads(a["data-nutrition"])
print(nutrition)
This prints:
{'serving-name': 'Milk, 2%', 'serving-size': '16 FL OZ', 'calories': '267'}
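If the page has several such links, the same idea can be applied to each of them. Below is a minimal sketch, not a definitive implementation: the helper name extract_nutrition and the sample markup (including the deliberately broken second link) are made up for illustration, and it assumes that links whose data-nutrition value is not valid JSON should simply be skipped.

```python
import json
from bs4 import BeautifulSoup

# Hypothetical sample markup: one valid link, one with broken JSON,
# and one without the attribute at all.
html = """
<a href="#" data-nutrition='{"serving-name":"Milk, 2%","serving-size":"16 FL OZ","calories":"267"}'>Milk, 2%</a>
<a href="#" data-nutrition='not valid json'>Broken</a>
<a href="#">No nutrition data</a>
"""

def extract_nutrition(markup):
    """Return one dict per <a> tag whose data-nutrition attribute holds valid JSON."""
    soup = BeautifulSoup(markup, "html.parser")
    results = []
    # The CSS selector matches only <a> tags that carry the attribute.
    for a in soup.select("a[data-nutrition]"):
        try:
            results.append(json.loads(a["data-nutrition"]))
        except json.JSONDecodeError:
            # Assumption: malformed values are skipped rather than raised.
            continue
    return results

nutrition_list = extract_nutrition(html)
print(nutrition_list)
```

Note that all values remain strings (for example 'calories' is '267'), because that is how they are stored in the JSON; convert them with int() yourself if you need numbers.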