I have developed a pyparsing grammar that must insert a new token into its output. This token does not come from the original input.
Example:
Input:
'/* foo bar*/'
Output:
['comment', '/* foo bar*/']
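How can I add elements to the parser output if those elements are not present in the original expression?
Answer 0: (score: 2)
Reading pyparsing's API, I found a function with the suggestive name replaceWith. Using this function together with addParseAction, I was able to solve the problem.
The following code solves the problem: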
from pyparsing import *
crazyVariable = Empty().addParseAction(replaceWith('comment')) + cStyleComment
print(crazyVariable.parseString('/* foo bar*/' ))
Output:
['comment', '/* foo bar*/']
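A related variation, not part of the original answer: a parse action that returns a list replaces the matched tokens, so the label can be prepended directly to the comment. This is a minimal sketch; the names tag_as_comment and labeled_comment are made up for illustration.

```python
from pyparsing import cStyleComment

def tag_as_comment(tokens):
    # Returning a list from a parse action replaces the matched tokens,
    # so this puts the label in front of whatever was matched.
    return ['comment'] + tokens.asList()

# copy() so that pyparsing's module-level cStyleComment is left unmodified
labeled_comment = cStyleComment.copy().addParseAction(tag_as_comment)

result = labeled_comment.parseString('/* foo bar*/').asList()
print(result)
```

Compared with the Empty() approach, this keeps the label and the comment in a single expression, which may be easier to reuse inside a larger grammar.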
Answer 1: (score: 2)
Another way to achieve the same result, and perhaps a more expressive one, is to use named expressions. For example:
from pyparsing import *
grammar = cStyleComment("comment")
s = '/* foo bar*/'
sol = grammar.parseString(s)
print(sol.asDict())
>>> {'comment': '/* foo bar*/'}
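You will notice that you do not get the list you asked for, but this approach lets you access the results directly by name once they become more complex. Let's see it in action: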
code = Word(alphanums+'(){},.<>"; ')
grammar = OneOrMore(code("code") | cStyleComment("comment"))
s = 'cout << "foobar"; /* foo bar*/'
sol = grammar.parseString(s)
print("code:", sol["code"])
print("comment:", sol["comment"])
>>> code: cout << "foobar";
>>> comment: /* foo bar*/
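One caveat worth sketching: by default a results name keeps only the last match, so with several comments in the input sol["comment"] would return just the final one. Assuming you want all of them, setResultsName accepts listAllMatches=True. The input string and the simplified word expression below are made up for illustration.

```python
from pyparsing import OneOrMore, Word, alphas, cStyleComment

# simplified stand-in for the "code" expression: a run of letters
word = Word(alphas)

grammar = OneOrMore(
    word.setResultsName("code", listAllMatches=True)
    | cStyleComment.setResultsName("comment", listAllMatches=True)
)

sol = grammar.parseString('foo /* one */ bar /* two */')

# with listAllMatches=True each name collects every match, not only the last
codes = sol["code"].asList()
comments = sol["comment"].asList()
print("code:", codes)
print("comment:", comments)
```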
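Answer 2: (score: 1)
This alternative solution is not exactly an answer to the question in the title, but rather an answer to the more general problem the question was trying to solve:
How to build a syntax tree out of node objects instantiated from classes: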
# -*- coding: utf-8 -*-
from pyparsing import *

def uncommentCStyleComment(t):
    '''remove /* and */ from a comment'''
    return t[0][2:-2]

# Classes that replace functions as arguments to setParseAction or addParseAction.
# Each class is used to build a node in a syntax tree.
# The t argument of the constructor is the list of child nodes of the node.

class Foo(object):
    def __init__(self, t):
        self.value = t[0]  # t = ['foo']
    def __str__(self):
        return self.value  # return 'foo'

class Bar(object):
    def __init__(self, t):
        # per-instance list of foos and comments
        # (a class-level list would be shared by every Bar object)
        self.members = list(t)
    def __str__(self):
        _str = 'Bar:\n'
        for member in self.members:
            _str = _str + '\t' + str(member) + '\n'
        return _str

class Comment(object):
    def __init__(self, t):
        self.value = t[0]  # t = [' Some comment '], delimiters already removed
    def __str__(self):
        return '/*' + str(self.value) + '*/'  # return '/* Some comment */'

# return an object of type Foo instead of a token
foo = Combine('foo').setParseAction(Foo)
# strip the delimiters and return an object of type Comment instead of a token
# (copy() keeps these parse actions off pyparsing's shared cStyleComment)
comment = cStyleComment.copy().setParseAction(uncommentCStyleComment, Comment)
# return an object of type Bar instead of a token
bar = OneOrMore(comment | foo)('ast').setParseAction(Bar)
# parse the input string
tokens = bar.parseString('foo\n/* data bar*/\nfoo\nfoo')
# print the object named ast in the parser output
print(tokens['ast'])
This is a very elegant way to build the output, with no post-processing required.