Hi everyone, I've run into a specific problem. I'm using Python regular expressions to convert a markup source into HTML.
Markup source:
[
# sometextsometextsometextsometextsometextsometext. #
# sometextsometextsometextsometextsometextsometextsometextsometext
sometextsometextsometextsometextsometextsometext. #
]
[
hello i am a normal paragraph.
]
Desired output:
<ol>
<li> sometextsometextsometextsometextsometextsometext. </li>
<li> sometextsometextsometextsometextsometextsometextsometextsometext
sometextsometextsometextsometextsometextsometext. </li>
</ol>
<p>
hello i am a normal paragraph.
</p>
Answer 0 (score: 1)
import re

with open('mk.txt') as f:
    with open('newmk.txt', 'w+') as g:
        text = f.read()
        # Lazily match each [...] block, newlines included.
        SquareGroups = re.findall(r'\[(?:.|\n)+?\]', text)
        for group in SquareGroups:
            if '#' in group:  # must be ol
                group = group.replace('[', '<ol>')
                group = group.replace(']', '</ol>')
                # A '#' followed by text opens an item; a '#' preceded by text closes it.
                group = re.sub(r'#(?= ?\w)', '<li>', group)
                group = re.sub(r'(?<=[\w ])#', '</li>', group)
            else:
                group = group.replace('[', '<p>')
                group = group.replace(']', '</p>')
            g.write(group)
            g.write('\n')  # optional, just makes the output look 'nicer'
This converts the input in mk.txt into the following text in newmk.txt:
<ol>
<li> sometextsometextsometextsometextsometextsometext. </li>
<li> sometextsometextsometextsometextsometextsometextsometextsometext
sometextsometextsometextsometextsometextsometext. </li>
</ol>
<p>
hello i am a normal paragraph.
</p>
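If it helps to try the idea without touching files, the same conversion can be sketched as a function on a string. This is only one possible variant, not the answer's original code: the names `convert_markup` and `convert_block` are made up for illustration, it uses `re.sub` with a callback instead of `findall` plus `replace`, and it assumes the text itself never contains `#` or `]`.

```python
import re


def convert_markup(text):
    """Convert [ ... ] blocks to HTML (hypothetical helper for illustration)."""

    def convert_block(match):
        body = match.group(1)
        if '#' in body:
            # Assumption: every list item is wrapped as "# item text #".
            items = re.findall(r'#([^#]*)#', body)
            lis = '\n'.join('<li>{}</li>'.format(item) for item in items)
            return '<ol>\n{}\n</ol>'.format(lis)
        # No '#' in the block, so treat it as a normal paragraph.
        return '<p>{}</p>'.format(body)

    # [^\]]* also matches newlines, so no DOTALL flag is needed.
    return re.sub(r'\[([^\]]*)\]', convert_block, text)


if __name__ == '__main__':
    sample = "[\n# first item. #\n# second\nitem. #\n]\n[\nhello i am a normal paragraph.\n]\n"
    print(convert_markup(sample))
```

Because the function takes and returns a string, it can still be combined with the file handling above by passing `f.read()` in and writing the result out.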