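The recursive function is parseMML. I want it to parse a MathML expression into a Python expression. The simple example mmlinput should produce the fraction 3/5, but it produces: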
['(', '(', '3', ')', '/', '(', '5', ')', '(', '3', ')', '(', '5', ')', ')']
instead of:
['(', '(', '3', ')', '/', '(', '5', ')', ')']
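because I don't know how to get rid of the elements that the recursive call has already consumed. Any ideas on how to skip them?
Thanks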
mmlinput='''<?xml version="1.0"?> <math xmlns="http://www.w3.org/1998/Math/MathML" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1998/Math/MathML http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"> <mrow> <mfrac> <mrow> <mn>3</mn> </mrow> <mrow> <mn>5</mn> </mrow> </mfrac> </mrow> </math>'''
def parseMML(mmlinput):
    from lxml import etree
    from StringIO import *
    from lxml import objectify
    exppy=[]
    events = ("start", "end")
    context = etree.iterparse(StringIO(mmlinput),events=events)
    for action, elem in context:
        if (action=='start') and (elem.tag=='mrow'):
            exppy+='('
        if (action=='end') and (elem.tag=='mrow'):
            exppy+=')'
        if (action=='start') and (elem.tag=='mfrac'):
            mmlaux=etree.tostring(elem[0])
            exppy+=parseMML(mmlaux)
            exppy+='/'
            mmlaux=etree.tostring(elem[1])
            exppy+=parseMML(mmlaux)
        if action=='start' and elem.tag=='mn': #this is a number
            exppy+=elem.text
    return (exppy)
Answer 0 (score: 0)
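The problem is that the subtree inside the mfrac tag gets parsed twice: once by the recursive call, and once more when the outer iterparse loop goes on to visit the same child elements. A quick fix is to count the recursion level and skip every event that arrives while you are inside an mfrac: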
mmlinput = "<math> <mrow> <mfrac> <mrow> <mn>3</mn> </mrow> <mrow> <mn>5</mn> </mrow> </mfrac> </mrow> </math>"
def parseMML(mmlinput):
    from lxml import etree
    from StringIO import *  # Python 2 module
    from lxml import objectify
    exppy=[]
    events = ("start", "end")
    level = 0  # non-zero while the loop is inside an mfrac element
    context = etree.iterparse(StringIO(mmlinput),events=events)
    for action, elem in context:
        if (action=='start') and (elem.tag=='mfrac'):
            level += 1
            # numerator and denominator are handled by the recursive calls
            mmlaux=etree.tostring(elem[0])
            exppy+=parseMML(mmlaux)
            exppy+='/'
            mmlaux=etree.tostring(elem[1])
            exppy+=parseMML(mmlaux)
        if (action=='end') and (elem.tag=='mfrac'):
            level -= 1
        if level:
            # skip the events for elements already consumed by the recursion
            continue
        if (action=='start') and (elem.tag=='mrow'):
            exppy+='('
        if (action=='end') and (elem.tag=='mrow'):
            exppy+=')'
        if action=='start' and elem.tag=='mn': #this is a number
            exppy+=elem.text
    return (exppy)
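As an alternative to counting levels, you could parse the document once and recurse over the element tree itself, so that no subtree is serialised and parsed a second time. The following is only a sketch under some assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml, the helper names local_name and to_tokens are made up for illustration, and only mrow, mfrac and mn are handled. It also strips the namespace prefix from the tag, so the namespaced input works unchanged.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a leading '{namespace}' from a tag name, if there is one."""
    return tag.rsplit('}', 1)[-1]


def to_tokens(elem):
    """Return the token list for the subtree rooted at elem.

    Hypothetical helper: only mrow, mfrac and mn are handled.
    """
    name = local_name(elem.tag)
    if name == 'mn':
        # Keep the whole number as a single token.
        return [(elem.text or '').strip()]
    if name == 'mfrac':
        # Assumes exactly two children: numerator and denominator.
        numerator, denominator = list(elem)
        return to_tokens(numerator) + ['/'] + to_tokens(denominator)
    # Any other element contributes the tokens of its children in order.
    tokens = []
    for child in elem:
        tokens.extend(to_tokens(child))
    if name == 'mrow':
        # An mrow additionally wraps its content in parentheses.
        return ['('] + tokens + [')']
    return tokens


mml = ('<math xmlns="http://www.w3.org/1998/Math/MathML">'
       '<mrow><mfrac><mrow><mn>3</mn></mrow><mrow><mn>5</mn></mrow></mfrac></mrow>'
       '</math>')
tokens = to_tokens(ET.fromstring(mml))
print(tokens)           # ['(', '(', '3', ')', '/', '(', '5', ')', ')']
print(''.join(tokens))  # ((3)/(5))
```

Because each element is visited exactly once, there is nothing to skip afterwards.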
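Note: I had to remove the namespace to get this to work, because elem.tag returned the fully qualified tag name for me. You are also using += to add strings to a list. That may work for single-character strings, but += on a list behaves like calling extend, so: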
>>> lst = []
>>> lst += 'spam'
>>> lst
['s', 'p', 'a', 'm']
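For the parser this starts to matter as soon as an mn element holds more than one digit. A small sketch of the difference, using a made-up value '35' as the element text:

```python
number_text = '35'  # hypothetical text of an <mn> element

split_tokens = []
split_tokens += number_text        # same effect as split_tokens.extend(number_text)

whole_tokens = []
whole_tokens.append(number_text)   # adds the string as a single item

print(split_tokens)  # ['3', '5']
print(whole_tokens)  # ['35']
```

Using append (or += [elem.text]) keeps each number together as one token.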