I have the following HTML code:
<div class="test">
"Test"
<br>
<script type="text/javascript"></script>
<a href="mailto:asdf@adsf.com">asdf@adsf.com</a>
" "
</div>
How can I extract the email address from this code using lxml?
Answer 0 (score: 4)
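import lxml.html as LH

text = '''\
<div class="test">
"Test"
<br>
<script type="text/javascript"></script>
<a href="mailto:asdf@adsf.com">asdf@adsf.com</a>
" "
</div>
'''
# Parse the HTML fragment into an element tree.
doc = LH.fromstring(text)
# Select the text of every <a> whose href starts with "mailto:" and take the first match.
# Note: [0] raises IndexError if the document contains no such link.
print(doc.xpath('//a[starts-with(@href,"mailto:")]/text()')[0])
# asdf@adsf.com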
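
The answer above reads the link text, which works here because the text happens to be the address itself. If the link text could differ from the address (for example "Contact us"), one possible variation is to read the `href` attribute instead and strip the `mailto:` scheme. The following is only a sketch under that assumption; the helper name `extract_mailto_addresses` is hypothetical and not part of the original answer or of lxml.

```python
import lxml.html as LH
from urllib.parse import unquote, urlsplit


def extract_mailto_addresses(html):
    """Hypothetical helper: collect addresses from mailto: links via their href."""
    doc = LH.fromstring(html)
    addresses = []
    # /@href returns the attribute values themselves rather than the link text.
    for href in doc.xpath('//a[starts-with(@href, "mailto:")]/@href'):
        # urlsplit puts the address in .path and any ?subject=... part in .query.
        address = unquote(urlsplit(href).path)
        if address:
            addresses.append(address)
    return addresses


sample = '''\
<div class="test">
"Test"
<br>
<script type="text/javascript"></script>
<a href="mailto:asdf@adsf.com">asdf@adsf.com</a>
" "
</div>
'''
print(extract_mailto_addresses(sample))
```

Returning a list instead of indexing with `[0]` means a page without any `mailto:` link yields an empty list rather than an `IndexError`.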