我想,给定一个JSGF语法,生成它将映射到的所有终端字符串。例如,给定A (B | C) D [E]
,我想要的输出是:
A B D E
A C D E
A B D
A C D
我决定从最简单的项目,可选的括号开始,但很快就碰到了一堵砖墙。它适用于1个项目,但不适用于具有替代项目的项目。任何意见,将不胜感激。
我现在拥有的:
import re
rule = raw_input('Enter the rule you want to test: ')
items = re.findall(r"\w[\w ]*\w|\w|\[|\]|\(|\)", rule)
for anitem in range(len(items)):
bracketc = items[:anitem].count('[') - items[:anitem].count(']')
if items[anitem] != '[' and items[anitem] != ']':
if bracketc > 0:
optional = True
else:
optional = False
while optional == True:
print ' '.join(items)
it2 = items[:]
it2.remove(it2[anitem])
print ' '.join(it2)
break
它适用于1个项目,并且给定字符串A B [C] D,返回:
A B [ C ] D
A B [ ] D
但随着复杂性的增加而崩溃,所以我猜我需要完全不同的东西。
答案 0 :(得分:1)
从您的示例中,我编写了以下代码:
rule="A(B|C)D[E]FG"
def generate_strings(rule):
if not rule:
return [""]
begin,end=rule[0],rule[1:]
if begin=='[':
i=end.find(']')
if i==-1:
raise Exception("Unmatched '['")
alt,end=end[0:i],end[i+1:]
return [a+e for e in generate_strings(end) for a in [alt,""]]
if begin=='(':
i=end.find(')')
if i==-1:
raise Exception("Unmatched '('")
alt,end=end[0:i].split('|'),end[i+1:]
return [a+e for e in generate_strings(end) for a in alt]
if begin in [']',')','|']:
raise Exception("Unexpected " + begin)
return [begin + e for e in generate_strings(end)]
print generate_strings(rule)
修改: 这是尝试使用嵌套表达式。它现在并不是很有效,因为现在解析更加微妙:当我们找到一个结束括号时,它可能不是我们想要的那个,而是一个嵌套表达式。管道和括号也是如此。
def flatten(l):
return [item for sublist in l for item in sublist]
def generate_strings(rule):
if not rule:
return [""]
begin,end=rule[0],rule[1:]
if begin=='[':
i=end.find(']')
if i==-1:
raise Exception("Unmatched '['")
alt=flatten([generate_strings(a) for a in [end[0:i],""]])
end=end[i+1:]
return [a+e for e in generate_strings(end) for a in alt]
if begin=='(':
i=end.find(')')
if i==-1:
raise Exception("Unmatched '('")
alt=flatten([generate_strings(a) for a in end[0:i].split('|')])
end=end[i+1:]
return [a+e for e in generate_strings(end) for a in alt]
if begin in [']',')','|']:
raise Exception("Unexpected " + begin)
return [begin + e for e in generate_strings(end)]
print generate_strings(rule)
答案 1 :(得分:0)
Josay回答的生成形式:
def generate_strings(rule):
if not rule:
yield ""
else:
begin, end = rule[0], rule[1:]
if begin == '[':
i = end.find(']')
if i == -1:
raise ValueError("Unmatched '['")
optional, end = end[:i], end[i+1:]
for e in generate_strings(end):
yield e
yield optional + e
elif begin == '(':
i = end.find(')')
if i == -1:
raise ValueError("Unmatched '('")
parts, end = end[:i].split('|'), end[i+1:]
for e in generate_strings(end):
for p in parts:
yield p + e
elif begin in '])|':
raise ValueError("Unexpected " + begin)
else:
for e in generate_strings(end):
yield begin + e
>>> list(generate_strings("A(B|C)D[E]FG"))
['ABDFG', 'ACDFG', 'ABDEFG', 'ACDEFG']