pyparsing - defining keywords - comparing Literal, Word, Keyword and Combine

Time: 2013-12-10 10:08:14

Tags: python pyparsing

My question is the same as the one here (nested function calls).

I also want to restrict the functor to one of a given set of words (a, b, c).

So these are legal:

a(dd, ee)
b(a(1)) 

but this is not:

aa(b(9))  - aa is invalid functor here

I could implement this with any one of the following:

functor1 = Literal('a') | Literal('b') | Literal('c')
functor2 = Word('a') | Word('b') | Word('c')
functor3 = Keyword('a') | Keyword('b') | Keyword('c')
functor4 = Combine(Keyword('a') | Keyword('b') | Keyword('c'))

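The first is easy, but the rest are too ambiguous to me (especially since Word has the parameter asKeyword, yet its code does not use the Keyword class, and vice versa).

Please compare them.

Does a list of ORs act as a Combine?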

1 answer:

Answer 0: (score: 2)

Here is some test code for comparing your pyparsing expressions.

from pyparsing import *

functor1 = Literal('a') | Literal('b') | Literal('c')
functor2 = Word('a') | Word('b') | Word('c')
functor3 = Keyword('a') | Keyword('b') | Keyword('c')
functor4 = Combine(Keyword('a') | Keyword('b') | Keyword('c'))

functor1.setName("Literal('a') | Literal('b') | Literal('c')")
functor2.setName("Word('a') | Word('b') | Word('c')")
functor3.setName("Keyword('a') | Keyword('b') | Keyword('c')")
functor4.setName("Combine(Keyword('a') | Keyword('b') | Keyword('c'))")
functors = [functor1, functor2, functor3, functor4]

tests = "a b c aaa bbb ccc after before".split()
for func in functors:
    # Print the name given via setName, then try every test string.
    print(func)
    for t in tests:
        try:
            print(t, ':', func.parseString(t))
        except ParseException as pe:
            print(pe)
    print()

This prints:

Literal('a') | Literal('b') | Literal('c')
a : ['a']
b : ['b']
c : ['c']
aaa : ['a']
bbb : ['b']
ccc : ['c']
after : ['a']
before : ['b']

Word('a') | Word('b') | Word('c')
a : ['a']
b : ['b']
c : ['c']
aaa : ['aaa']
bbb : ['bbb']
ccc : ['ccc']
after : ['a']
before : ['b']

Keyword('a') | Keyword('b') | Keyword('c')
a : ['a']
b : ['b']
c : ['c']
aaa : Expected "a" (at char 0), (line:1, col:1)
bbb : Expected "a" (at char 0), (line:1, col:1)
ccc : Expected "a" (at char 0), (line:1, col:1)
after : Expected "a" (at char 0), (line:1, col:1)
before : Expected "a" (at char 0), (line:1, col:1)

Combine(Keyword('a') | Keyword('b') | Keyword('c'))
a : ['a']
b : ['b']
c : ['c']
aaa : Expected "a" (at char 0), (line:1, col:1)
bbb : Expected "a" (at char 0), (line:1, col:1)
ccc : Expected "a" (at char 0), (line:1, col:1)
after : Expected "a" (at char 0), (line:1, col:1)
before : Expected "a" (at char 0), (line:1, col:1)

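You should be able to make these observations:

  • Literal will match the given string, even if it is only the start of a larger string.

  • Word will match a run of characters made up of the letters in its constructor string, so Word('a') matches all of 'aaa' but only the leading 'a' of 'after'.

  • Keyword will match the given string only if it is not part of a larger word (that is, it is followed by whitespace or a non-word character).

  • Combine does not really do anything in this example.

The purpose of Combine is to merge multiple matched tokens into a single string. For example, if you define a social security number as: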
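To tie these observations back to the original question, here is a minimal sketch of a nested-call grammar whose functor is restricted with Keyword. The names call, arg, args and is_valid are hypothetical, and the choice of Word(alphanums) for plain arguments is an assumption; the linked question may define arguments differently.

```python
from pyparsing import (Forward, Group, Keyword, ParseException,
                       Suppress, Word, ZeroOrMore, alphanums)

# Only these three names are accepted as functors.
functor = Keyword('a') | Keyword('b') | Keyword('c')

# Forward declaration so a call can appear inside its own argument list.
call = Forward()

# Assumption: an argument is either a nested call or a plain alphanumeric word.
arg = call | Word(alphanums)
args = arg + ZeroOrMore(Suppress(',') + arg)

call <<= Group(functor + Suppress('(') + args + Suppress(')'))


def is_valid(text):
    """Return True if the whole string parses as a call with a legal functor."""
    try:
        call.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


for sample in ["a(dd, ee)", "b(a(1))", "aa(b(9))"]:
    print(sample, ':', is_valid(sample))
```

With Literal in place of Keyword, the leading 'a' of 'aa' would match as a functor and the parse would then fail only at the second 'a', which gives a less helpful error position. Keyword rejects 'aa' at the functor itself.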

Word(nums,exact=3) + '-' + Word(nums,exact=2) + '-' + Word(nums,exact=4)

then parsing "555-66-7777" would give you

['555', '-', '66', '-', '7777']

You would most likely want this as a single string, so merge the results by wrapping the parser expression in Combine:

Combine(Word(nums,exact=3) + '-' + Word(nums,exact=2) + '-' + Word(nums,exact=4))

['555-66-7777']
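The two variants can be run side by side as a small script. This is only a sketch: ssn_parts and accepts are hypothetical helper names, and the last two lines illustrate a second effect of Combine, namely that by default it also requires the matched tokens to be adjacent in the input.

```python
from pyparsing import Combine, ParseException, Word, nums


def ssn_parts():
    # Build a fresh copy of the three-part expression each time it is needed.
    return Word(nums, exact=3) + '-' + Word(nums, exact=2) + '-' + Word(nums, exact=4)


loose = ssn_parts()               # tokens are returned separately
combined = Combine(ssn_parts())   # tokens are merged into one string

print(loose.parseString("555-66-7777").asList())     # ['555', '-', '66', '-', '7777']
print(combined.parseString("555-66-7777").asList())  # ['555-66-7777']


def accepts(expr, text):
    """Return True if expr parses the whole of text."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


# The plain expression skips whitespace between tokens; Combine does not.
print(accepts(loose, "555 - 66 - 7777"))
print(accepts(combined, "555 - 66 - 7777"))
```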