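In pyparsing

Time: 2017-07-06 13:06:39

Tags: parsing pyparsing

I discovered the very cool pyparsing module. I am trying to parse a simple set of boolean expressions in which some identifiers (foo, bar, bla and zoo) are compared with numeric values. The parser is used to check that a user's expression is valid, but I would also like to get the names of the identifiers used in the expression (that is, which combination of foo, bar, bla and zoo appears in it). I could not find a simple way to do this. In the example below, foo and bar are used in the expression. But how can I retrieve that information?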


from pyparsing import (
    oneOf, Group, Regex, operatorPrecedence, opAssoc, Literal, Word, nums,
    Combine, Optional, CaselessLiteral, alphanums, quotedString, Forward,
)

lparen              = Literal("(")
rparen              = Literal(")")
and_operator        = CaselessLiteral("and")
or_operator         = CaselessLiteral("or")
comparison_operator = oneOf(['==','!=','>','>=','<', '<='])
point               = Literal('.')
e                   = CaselessLiteral('E')
plusorminus         = Literal('+') | Literal('-')
number              = Word(nums)
integer             = Combine( Optional(plusorminus) + number )
float_nb            = Combine( integer +
                        Optional( point + Optional(number) ) +
                        Optional( e + integer ))
# setResultsName returns a named copy, rather than mutating the attribute directly
value               = float_nb.setResultsName('value')

identifier = oneOf(['foo','bar', 'bla', 'zoo'], caseless=False).setResultsName('key')
group_1 = Group(identifier + comparison_operator + value)
group_2 = Group(value + comparison_operator + identifier)
comparison = group_1 | group_2
boolean_expr = operatorPrecedence(
                    comparison, 
                    [(and_operator, 2, opAssoc.LEFT),
                    (or_operator,  2, opAssoc.LEFT)])

boolean_expr_par = "(" + boolean_expr + ")"

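expression = Forward()
# '<<' binds tighter than '|', so 'expression << a | b' would only assign 'a';
# '<<=' assigns the whole alternation
expression <<= boolean_expr | boolean_expr_par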

exp = expression.parseString('2.5 > foo and (3 < bar or (foo > 10 and bar < 3)) ' , parseAll=True)
# Now how can I get the 'identifiers used in exp' ?

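1 answer:

Answer 0 (score: 0)

I ran into a similar problem.

I used the 'setParseAction' method on the 'identifier' parser to attach a function that records the token of every matched identifier. Then I print the recorded items:

  1. I declared: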

    idSet = set()
    def recordID(tokens):
        idSet.add(tokens[0])
        return
    
  2. I modified the 'identifier' parser as follows:

    identifier = oneOf(['foo','bar', 'bla', 'zoo'], caseless=False).setParseAction(recordID)
    
  3. At the end of the script, I print 'idSet':

    exp = expression.parseString('2.5 > foo and (3 < bar or (foo > 10 and bar < 3)) ' , parseAll=True)
    print(idSet)
    
  4. It gives the following result:

    {'foo', 'bar'}
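
One caveat with the steps above: a module-level set keeps accumulating names across successive parseString calls unless it is cleared in between. A possible alternative, sketched here under assumptions, is to collect the identifiers from the parse result itself after parsing. The sketch assumes the result has been converted to plain nested lists (for example with `exp.asList()`); the `sample` structure is a hand-written approximation of such a result, not captured output, and `collect_identifiers` and `KNOWN_IDENTIFIERS` are hypothetical names introduced for illustration.

```python
# Hypothetical helper: gather the identifiers that appear in a nested token list.
# Assumes the tokens are plain Python lists and strings, as a list conversion
# of the parse result would provide.
KNOWN_IDENTIFIERS = {"foo", "bar", "bla", "zoo"}


def collect_identifiers(tokens, known=KNOWN_IDENTIFIERS):
    """Return the set of known identifier names found anywhere in tokens."""
    found = set()
    for tok in tokens:
        if isinstance(tok, list):
            # Nested group (a comparison or a parenthesized sub-expression)
            found |= collect_identifiers(tok, known)
        elif tok in known:
            found.add(tok)
    return found


# Hand-written approximation of the nested structure for
# '2.5 > foo and (3 < bar or (foo > 10 and bar < 3))'
sample = [
    [
        ["2.5", ">", "foo"],
        "and",
        [
            ["3", "<", "bar"],
            "or",
            [["foo", ">", "10"], "and", ["bar", "<", "3"]],
        ],
    ]
]

print(sorted(collect_identifiers(sample)))  # ['bar', 'foo']
```

Because the function only reads the result of a single parse, calling it for each expression yields an independent set and nothing has to be reset between parses.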