In MS Excel / LibreOffice Calc, I have text like this:
<h3><strong>Ways to stretch your budget</strong>
<p>passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
Why do we use it?
</p>
<p>passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
Why do we use it?</p>
<ul>
<li><strong>Instrument Rentals</strong> passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
Why do we use it?</li>
<li><strong>passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
Why do we use it?</li>
</ul>
How can I remove the text (whitespace and line breaks) between the HTML tags?
Example of the desired result:
<p>content<p><ul><li>content></li></ul>
Answer 0 (score: 1)
Just use a regular expression:
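import re
# `text` holds the HTML string; drop any whitespace (including newlines) between '>' and '<'.
# A raw string avoids the invalid-escape warning for \s; re.M is not needed without ^ or $.
result = re.sub(r'>\s*<', '><', text)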
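For a self-contained illustration, here is a minimal sketch of the same substitution applied to a small sample string. The helper names `strip_whitespace_between_tags` and `flatten_html` and the `sample` value are made up for this example. The second helper is an assumption about what the asker may also want (line breaks inside the text turned into spaces) and is not part of the original answer.

```python
import re


def strip_whitespace_between_tags(html: str) -> str:
    """Remove whitespace-only runs between a '>' and the next '<'."""
    return re.sub(r'>\s+<', '><', html)


def flatten_html(html: str) -> str:
    """Hypothetical extra step: also turn line breaks inside text into single spaces."""
    compact = strip_whitespace_between_tags(html)
    return re.sub(r'\s*\n\s*', ' ', compact)


# Made-up sample input, shaped like the markup in the question.
sample = "<p>content</p>\n<ul>\n  <li>content</li>\n</ul>"

print(strip_whitespace_between_tags(sample))
# prints: <p>content</p><ul><li>content</li></ul>
```

The first helper only touches whitespace that sits directly between two tags, so a line break in the middle of a paragraph's text stays in place. `flatten_html` handles that case as well. Regular expressions work for simple cleanup like this but do not parse HTML, so markup where whitespace is significant (for example inside `<pre>`) would need more care.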