I have some text:
\frac{A}{B}
I need to convert this text to:
<mfrac>
<mrow>
A
</mrow>
<mrow>
B
</mrow>
</mfrac>
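I have to use Python and regular expressions. A and B can themselves contain further fractions, so the function must be recursive. For example, the text: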
\frac{1+x}{1+\frac{1}{x}}
must become:
<mfrac>
<mrow>
1+x
</mrow>
<mrow>
1+
<mfrac>
<mrow>
1
</mrow>
<mrow>
x
</mrow>
</mfrac>
</mrow>
</mfrac>
Please help with the regular expression :)
Answer 0 (score: 1)
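If you need to match recursive patterns with Python's default re module, you can do what I did for nested comments in a CSS preprocessor I built recently.
The general idea is to use re only to split the text into tokens, then loop over the tokens with a nesting-level variable to find each complete construct. Here is my code: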
import re

# Only the comment boundaries are matched; nesting is resolved in the loop below.
COMMENTsRe = re.compile( r"""
    //  |
    \n  |
    /\* |
    \*/
""", re.X )

def rm_comments( cut ):
    nocomment = 0   # not inside a comment
    c = 1           # C-like comments, but nested
    cpp = 2         # C++-like comments
    mode = nocomment
    clevel = 0      # nesting level of C-like comments
    matchesidx = []
    # A plain RE cannot find nested structures,
    # so we only find all the boundaries and parse them here.
    matches = COMMENTsRe.finditer( str(cut) )
    start = 0
    for i in matches:
        m = i.group()
        if mode == cpp:
            if m == "\n":
                matchesidx.append( ( start, i.end()-1 ) )  # -1 to keep the \n
                mode = nocomment
        elif mode == c:
            if m == "/*":
                clevel += 1
            elif m == "*/":
                clevel -= 1
                if clevel == 0:
                    # the outermost comment is closed
                    matchesidx.append( ( start, i.end() ) )
                    mode = nocomment
        else:
            if m == "//":
                start = i.start()
                mode = cpp
            elif m == "/*":
                start = i.start()
                mode = c
                clevel += 1
    # cut is the preprocessor's own text object; rm_and_save is not shown here
    cut.rm_and_save( matchesidx )
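
The same idea can be carried over to the \frac question. The following is only a minimal sketch, not a tested converter: re is used just to locate \frac and the braces, a nesting counter finds the matching closing brace, and the function calls itself on the numerator and denominator. The names FRAC_RE, BRACE_RE, read_group and frac_to_mathml are made up for this illustration. It assumes every \frac is followed by two balanced brace groups, does not handle escaped braces such as \{, and emits the MathML on one line rather than one tag per line.

```python
import re

FRAC_RE = re.compile(r"\\frac\s*")   # start of a fraction
BRACE_RE = re.compile(r"[{}]")       # the only tokens that affect nesting


def read_group(text, pos):
    """Return (content, end) of the balanced {...} group that starts at pos."""
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at position %d" % pos)
    level = 0
    for m in BRACE_RE.finditer(text, pos):
        if m.group() == "{":
            level += 1
        else:
            level -= 1
            if level == 0:
                # the matching closing brace was found
                return text[pos + 1:m.start()], m.end()
    raise ValueError("unbalanced braces")


def frac_to_mathml(text):
    """Replace every \\frac{A}{B} in text, recursing into A and B."""
    out = []
    pos = 0
    while True:
        m = FRAC_RE.search(text, pos)
        if m is None:
            out.append(text[pos:])       # no more fractions: copy the rest
            break
        out.append(text[pos:m.start()])  # text before the fraction
        num, after_num = read_group(text, m.end())
        den, after_den = read_group(text, after_num)
        out.append("<mfrac><mrow>%s</mrow><mrow>%s</mrow></mfrac>"
                   % (frac_to_mathml(num), frac_to_mathml(den)))
        pos = after_den
    return "".join(out)


if __name__ == "__main__":
    print(frac_to_mathml(r"\frac{1+x}{1+\frac{1}{x}}"))
```

If you want the one-tag-per-line layout from the question, you can change the format string to put "\n" between the tags; the nesting logic stays the same.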