我有一段代码来自产品的价格(分期付款的价格和报价),我试图用python来获取价格(649)。
<span style="color: #404040; font-size: 12px;"> from </span>
<span class="money-int">649</span>
<sup class="money-decimal">99</sup>
<span class="money-currency">$</span>
<br />
<span style="color: #404040; font-size: 12px;">from
<b>
<span class="money-int">37</span>
<sup class="money-decimal">35</sup>
<span class="money-currency">$</span>/month
</b>
</span>
我尝试使用re.findall
这样的
match = re.findall('\"money-int\"\>(\d*)\<\/span\>\<sup class=\"money-decimal\"\>(\d*)',content)
问题是我得到两个价格的列表,649和37,我只需要649。
答案 0 :(得分:0)
re.findall(r"<span[^>]*class=\"money-int\"[^>]*>([^<]*)</span>[^<]*<sup[^>]*class=\"money-decimal\"[^>]*>([^<]*)</sup>", YOUR_STRING)
答案 1 :(得分:0)
考虑使用xml解析器来完成这项工作,以避免未来的麻烦:
#!/usr/bin/python
from bs4 import BeautifulSoup as BS
html = '''
<span style="color: #404040; font-size: 12px;"> from </span>
<span class="money-int">649</span>
<sup class="money-decimal">99</sup>
<span class="money-currency">$</span>
<br />
<span style="color: #404040; font-size: 12px;">from
<b>
<span class="money-int">37</span>
<sup class="money-decimal">35</sup>
<span class="money-currency">$</span>/month
</b>
</span>
'''
soup = BS(html, 'lxml')
print soup.find_all("span", attrs={"class": "money-int"})[0].get_text()
上的在线演示