I have this result:
soup.find('span', {'class':'js-date-picker btn--secondary btn--secondary--no-spacing'})
<span class="js-date-picker btn--secondary btn--secondary--no-spacing" data-clear="/h/?type=ln&search=ethereum&lang=en&searchheadlines=1" data-date='{"sel":false,"latest":1599889680,"now":1599897600}' data-href="/h/?type=ln&search=ethereum&lang=en&searchheadlines=1&d=" href="javascript://">
<span class="btn--secondary__icon"><i class="far fa-calendar-alt"></i></span>
<span class="btn--secondary__label">
<span class="dtctxt"><span class="d">12 Sep</span><span class="t"> 06:48</span></span></span>
</span>
Now I want to extract {"sel":false,"latest":1599889680,"now":1599897600} from this text.
How can I do that?
Answer 0 (score: 1)
Try this:
import ast
from bs4 import BeautifulSoup
html = """
<span class="js-date-picker btn--secondary btn--secondary--no-spacing" data-clear="/h/?type=ln&search=ethereum&lang=en&searchheadlines=1" data-date='{"sel":false,"latest":1599889680,"now":1599897600}' data-href="/h/?type=ln&search=ethereum&lang=en&searchheadlines=1&d=" href="javascript://">
<span class="btn--secondary__icon"><i class="far fa-calendar-alt"></i></span>
<span class="btn--secondary__label">
<span class="dtctxt"><span class="d">12 Sep</span><span class="t"> 06:48</span></span></span>
</span>
"""
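soup = BeautifulSoup(html, 'html.parser')
# Find the date-picker span by its full class string, then read its data-date attribute.
span = soup.find("span", {"class": "js-date-picker btn--secondary btn--secondary--no-spacing"})
result = span.get("data-date")
print(result)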
Output:
{"sel":false,"latest":1599889680,"now":1599897600}
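Passing the full class string to find() only matches when the class attribute has exactly those classes in exactly that order. If the page may add or reorder classes, a CSS selector on a single class is usually less brittle. The following is a minimal sketch, assuming that js-date-picker alone is enough to identify the element (the shortened HTML and the names tag and raw are illustrative only):

```python
from bs4 import BeautifulSoup

# Shortened copy of the span from the question, with the classes reordered on purpose.
html = """<span class="btn--secondary js-date-picker" data-date='{"sel":false,"latest":1599889680,"now":1599897600}'></span>"""

soup = BeautifulSoup(html, "html.parser")

# Match any <span> that carries the js-date-picker class and has a data-date attribute.
tag = soup.select_one("span.js-date-picker[data-date]")

# select_one returns None when nothing matches, so guard before indexing.
raw = tag["data-date"] if tag is not None else None
print(raw)
```

Checking for None first avoids an exception if the page layout changes and the element is no longer there.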
If needed, you can convert the result to a dict object, for example:
data_date = ast.literal_eval(result.replace("false", "False"))
print(data_date['now'])
Output: 1599897600
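Since the attribute value is valid JSON, json.loads is probably a safer way to parse it than replacing "false" by hand, because it also handles true and null. Here is a small sketch. It starts from the string extracted above and assumes the numbers are Unix timestamps in seconds, which the page does not confirm:

```python
import json
from datetime import datetime, timezone

# The data-date attribute value extracted earlier.
result = '{"sel":false,"latest":1599889680,"now":1599897600}'

# json.loads maps JSON false/true/null to Python False/True/None.
data_date = json.loads(result)
print(data_date["now"])   # 1599897600
print(data_date["sel"])   # False

# Assumption: the values are Unix timestamps in seconds.
now_utc = datetime.fromtimestamp(data_date["now"], tz=timezone.utc)
print(now_utc.isoformat())  # 2020-09-12T08:00:00+00:00
```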