Converting a list of tokens to XML output

Time: 2015-12-30 17:55:07

Tags: python pyparsing

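I have a list of tokens generated by pyparsing. I need to operate on individual tokens in the list based on the tokens around them. At the moment I am just using a for loop. Is there a better mechanism for doing this?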

For example, a simple case would be [1, "+", 2], which should become:

<block s="reportSum">
    <l>1</l>
    <l>2</l>
</block>

Edit: I have been reading the pyparsing documentation and learned about operatorPrecedence and setParseAction. Ultimately, I am trying to convert one language into another.

For example, say("hi") becomes <block s="bubble"><l>Hello!</l></block>. I currently parse say("hi") into ["say", "hi"], and I would like to know how to convert that into the XML above.

1 answer:

Answer 0: (score: 2)

With infixNotation (formerly operatorPrecedence), you can attach a parse action to each subexpression that is found. See below:

from pyparsing import *

opfunc = {
    '+': 'reportSum',
    '-': 'reportDifference',
    '*': 'reportProduct',
    '/': 'reportDivision',
    }
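def makeXML(a, op, b):
    # wrap both operands in <l> elements inside the block for this operator
    return '<block s="%s"><l>%s</l><l>%s</l></block>' % (opfunc[op], a, b)

def outputBinary(tokens):
    # tokens[0] is a flat sequence: operand, op, operand, op, operand, ...
    t = tokens[0].asList()
    ret = makeXML(t.pop(0), t.pop(0), t.pop(0))
    # fold any remaining (op, operand) pairs in from the left
    while t:
        ret = makeXML(ret, t.pop(0), t.pop(0))
    return ret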



integer = Word(nums)
# expand this to include other supported operands, like floats, variables, etc.
operand = integer

arithExpr = infixNotation(operand, 
    [
    (oneOf('* /'), 2, opAssoc.LEFT, outputBinary),
    (oneOf('+ -'), 2, opAssoc.LEFT, outputBinary),
    ])

tests = """\
    1+2
    1+2*5
    1+2*6/3
    1/4+3*4/2""".splitlines()

for t in tests:
    t = t.strip()
    print(t)
    print(arithExpr.parseString(t)[0])
    print()

which gives:

1+2
<block s="reportSum"><l>1</l><l>2</l></block>

1+2*5
<block s="reportSum"><l>1</l><l><block s="reportProduct"><l>2</l><l>5</l></block></l></block>

1+2*6/3
<block s="reportSum"><l>1</l><l><block s="reportDivision"><l><block s="reportProduct"><l>2</l><l>6</l></block></l><l>3</l></block></l></block>

1/4+3*4/2
<block s="reportSum"><l><block s="reportDivision"><l>1</l><l>4</l></block></l><l><block s="reportDivision"><l><block s="reportProduct"><l>3</l><l>4</l></block></l><l>2</l></block></l></block>

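Note that parsing '1 + 2 + 3' does not give the traditional nested list [['1','+','2'],'+','3'], but the flat sequence ['1','+','2','+','3']. That is why outputBinary has to iterate over the whole list rather than just the first 3 elements.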
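The left-to-right folding described above can be shown without pyparsing at all. The following is only a minimal sketch: `fold_left`, `make_xml` and `OPFUNC` are illustrative names (not part of pyparsing), and the input is assumed to be a flat list that alternates operands and operators.

```python
# Hypothetical operator table, mirroring the opfunc dict in the answer.
OPFUNC = {
    "+": "reportSum",
    "-": "reportDifference",
    "*": "reportProduct",
    "/": "reportDivision",
}

def make_xml(a, op, b):
    # Wrap both operands in <l> elements inside the operator's block.
    return '<block s="%s"><l>%s</l><l>%s</l></block>' % (OPFUNC[op], a, b)

def fold_left(tokens):
    """Fold a flat [operand, op, operand, op, operand, ...] list from the left."""
    result = tokens[0]
    # Walk the remaining tokens as (operator, operand) pairs.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        result = make_xml(result, op, operand)
    return result

print(fold_left(["1", "+", "2", "+", "3"]))
```

Unlike the `pop(0)` loop, this version leaves the input list unmodified, which may be convenient if the tokens are needed again later.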

As for your say("hi") example, the following should help:

LPAR,RPAR = map(Suppress,"()")
say_command = Keyword("say")('cmd') + LPAR + delimitedList(QuotedString('"'))('args') + RPAR
ask_command = Keyword("ask")('cmd') + LPAR + delimitedList(QuotedString('"'))('args') + RPAR
cmd_func = {
    'say': 'bubble',
    'ask': 'prompt',
    }
def emitAsXML(tokens):
    func = cmd_func[tokens.cmd]
    args = ''.join('<l>%s</l>' % arg for arg in tokens.args)
    return """<block s="%s">%s</block>""" % (func, args)
cmd = (say_command | ask_command).setParseAction(emitAsXML)

tests = """\
    say("hi")
    say("hi","handsome")
    ask("what is your name?")""".splitlines()

for t in tests:
    t = t.strip()
    print(t)
    print(cmd.parseString(t)[0])
    print()

which gives:

say("hi")
<block s="bubble"><l>hi</l></block>

say("hi","handsome")
<block s="bubble"><l>hi</l><l>handsome</l></block>

ask("what is your name?")
<block s="prompt"><l>what is your name?</l></block>
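One caveat with the string formatting in emitAsXML is that arguments are inserted verbatim, so an argument containing & or < would produce malformed XML. As a possible alternative (a sketch only, assuming the standard library's xml.etree.ElementTree is acceptable for building the output; `emit_command` and `CMD_FUNC` are illustrative names), the elements can be built as objects and serialized at the end:

```python
import xml.etree.ElementTree as ET

# Hypothetical command table, mirroring the cmd_func dict in the answer.
CMD_FUNC = {"say": "bubble", "ask": "prompt"}

def emit_command(cmd, args):
    """Build <block s="..."><l>arg</l>...</block> and serialize it to a string."""
    block = ET.Element("block", {"s": CMD_FUNC[cmd]})
    for arg in args:
        # Element text is escaped during serialization, so '&' and '<' are safe.
        ET.SubElement(block, "l").text = arg
    return ET.tostring(block, encoding="unicode")

print(emit_command("say", ["hi", "handsome"]))
print(emit_command("ask", ["is 1 < 2 & 2 < 3?"]))
```

Inside a parse action, this function could be called as emit_command(tokens.cmd, tokens.args).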

If you need wider context to create some output, just attach the parse action to a higher-level expression in your parser.
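As a rough illustration of attaching a parse action at a higher level, the sketch below wraps a sequence of commands in one enclosing element. It assumes pyparsing is installed; the `<script>` wrapper and the names `emit_command`, `emit_script` and `script` are hypothetical and not taken from the question. The arguments are wrapped in Group here so that `tokens.args` is always a list.

```python
from pyparsing import Group, Keyword, OneOrMore, QuotedString, Suppress, delimitedList

LPAR, RPAR = map(Suppress, "()")
args = Group(delimitedList(QuotedString('"')))("args")
command = (Keyword("say") | Keyword("ask"))("cmd") + LPAR + args + RPAR

CMD_FUNC = {"say": "bubble", "ask": "prompt"}

def emit_command(tokens):
    # Runs once per command and replaces its tokens with an XML string.
    body = "".join("<l>%s</l>" % arg for arg in tokens.args)
    return '<block s="%s">%s</block>' % (CMD_FUNC[tokens.cmd], body)

def emit_script(tokens):
    # Runs once for the whole sequence; each token is already an XML string.
    return "<script>%s</script>" % "".join(tokens)

command.setParseAction(emit_command)
script = OneOrMore(command).setParseAction(emit_script)

result = script.parseString('say("hi") ask("what is your name?")')[0]
print(result)
```

The inner action only sees one command at a time, while the outer action sees the results of all of them, which is where any cross-command logic would go.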