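I'm trying to build a list of literals from a string containing a PL/FOL formula. The relevant piece of code finds the matches but returns them as empty strings. I tried re.escape(formula), but it did nothing. I also tried simple variations of the findall pattern, but those then produced empty lists.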
import re

def clean(formula):
    formula = formula.strip()
    # Remove spaces just inside parentheses.
    formula = re.sub(r"\( +", "(", formula)
    formula = re.sub(r" +\)", ")", formula)
    # Put exactly one space on each side of every binary operator.
    formula = re.sub(r"(?P<b_ops>[&v→↔])", " " + r"\g<b_ops>" + " ", formula)
    formula = re.sub(r"[ ]+", " ", formula)
    # Make an inventory of literals for the original formula.
    orig_lit_inv = re.findall(r"[~]*[A-Z]([a-u]|[w-z]){0,}", formula)
    print(orig_lit_inv)

this_WFF = "(P) & ~(~(Q → (R & ~S)))"
clean(formula=this_WFF)
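When I print the result, I get ['', '', '', '']. In other words, it finds the matches but returns empty strings for them, when it should at least return whatever [A-Z] matched. With this_WFF as the argument, clean(formula) should print ['P', 'Q', 'R', '~S'].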
Answer 0 (score: 1)
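From the re.findall documentation: "If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group."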
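A minimal sketch of what that rule means in practice. The sample strings and variable names here are made up for illustration and are not from the question:

```python
import re

text = "ab cd"

# No capturing group: findall returns the whole matches.
no_group = re.findall(r"[a-z][a-z]", text)

# One capturing group: findall returns only what that group captured.
one_group = re.findall(r"[a-z]([a-z])", text)

# Two capturing groups: findall returns one tuple per match.
two_groups = re.findall(r"([a-z])([a-z])", text)

# A quantified group that matches zero times contributes an empty string.
optional_group = re.findall(r"[a-z]([0-9])*", "a1 b")

print(no_group)        # ['ab', 'cd']
print(one_group)       # ['b', 'd']
print(two_groups)      # [('a', 'b'), ('c', 'd')]
print(optional_group)  # ['1', '']
```

The last case is the one that applies to the question: each literal there is a single letter, so the group after [A-Z] matches zero times and findall reports '' for every match.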
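Your regex contains one capturing group, so findall will never return anything for the [A-Z] part of the regex. It returns only what the group captured, and that is an empty string whenever the group matches zero times, as it does for single-letter literals. Change ([a-u]|[w-z]) to the non-capturing (?:[a-u]|[w-z]) to see the difference: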
>>> import re
>>> this_WFF = "(P) & ~(~(Q → (R & ~S)))"
>>> def clean(formula):
...     formula = formula.strip()
...     formula = re.sub(r"\( +", "(", formula)
...     formula = re.sub(r" +\)", ")", formula)
...     formula = re.sub(r"(?P<b_ops>[&v→↔])", " " + r"\g<b_ops>" + " ", formula)
...     formula = re.sub(r"[ ]+", " ", formula)
...     # Make an inventory of literals for the original formula.
...     # Capturing group: findall returns only the group's content.
...     orig_lit_inv = re.findall(r"[~]*[A-Z]([a-u]|[w-z]){0,}", formula)
...     print(orig_lit_inv)
...
>>> clean(this_WFF)
['', '', '', '']
>>> def clean(formula):
...     formula = formula.strip()
...     formula = re.sub(r"\( +", "(", formula)
...     formula = re.sub(r" +\)", ")", formula)
...     formula = re.sub(r"(?P<b_ops>[&v→↔])", " " + r"\g<b_ops>" + " ", formula)
...     formula = re.sub(r"[ ]+", " ", formula)
...     # Make an inventory of literals for the original formula.
...     # Non-capturing group (?:...): findall returns the whole match.
...     orig_lit_inv = re.findall(r"[~]*[A-Z](?:[a-u]|[w-z]){0,}", formula)
...     print(orig_lit_inv)
...
>>> clean(this_WFF)
['P', 'Q', 'R', '~S']
Since the regex now contains no capturing groups, findall returns only the contents of "group 0" (that is, the entire match) in the result.
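If you would rather keep the capturing group (say, because other code reads it), one alternative is to iterate with re.finditer and take group(0) from each match object. This is only a sketch: the helper name literals and the constant LITERAL are hypothetical and not part of the question's code.

```python
import re

# The original pattern from the question, capturing group left in place.
LITERAL = re.compile(r"[~]*[A-Z]([a-u]|[w-z]){0,}")

def literals(formula):
    # Hypothetical helper: group(0) is always the entire match,
    # no matter how many capturing groups the pattern contains.
    return [m.group(0) for m in LITERAL.finditer(formula)]

print(literals("(P) & ~(~(Q → (R & ~S)))"))  # ['P', 'Q', 'R', '~S']
```

For a simple inventory like this one, the non-capturing group with findall is shorter. finditer is mostly worth it when you also need match positions or the individual groups.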