Specialized powerset requirement

Time: 2017-04-03 19:11:44

Tags: python regex powerset

Suppose I have compiled five regular expression patterns and then created five Boolean variables:

a = re.search(first, mystr)
b = re.search(second, mystr)
c = re.search(third, mystr)
d = re.search(fourth, mystr)
e = re.search(fifth, mystr)

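I want to use the powerset of (a, b, c, d, e) in a function so that it finds the more specific matches first and then falls through to the less specific ones. As you can see, the powerset (or rather its list representation) should be sorted by number of elements in descending order.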

Desired behavior:

 if a and b and c and d and e:
     return 'abcde' 
 if a and b and c and d:
     return 'abcd'
 [... and all the other 4-matches ]
 [now the three-matches]
 [now the two-matches]
 [now the single matches]
 return 'No Match'  # did not match anything

Is there a way to use the powerset programmatically, and ideally concisely, to get this function's behavior?

2 answers:

Answer 0: (score: 2)

You can use the powerset() generator function recipe from the itertools documentation, like this:

from itertools import chain, combinations
from pprint import pprint
import re

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

mystr   = "abcdefghijklmnopqrstuvwxyz"
first   = "a"
second  = "B"  # won't match, should be omitted from result
third   = "c"
fourth  = "d"
fifth   = "e"

a = 'a' if re.search(first, mystr) else ''
b = 'b' if re.search(second, mystr) else ''
c = 'c' if re.search(third, mystr) else ''
d = 'd' if re.search(fourth, mystr) else ''
e = 'e' if re.search(fifth, mystr) else ''

# Keep only the labels whose pattern matched (compare by value, not identity).
elements = [elem for elem in (a, b, c, d, e) if elem != '']
# Largest subsets first; the empty subset is skipped.
spec_ps = [''.join(group)
           for group in sorted(powerset(elements), key=len, reverse=True)
           if group]

pprint(spec_ps)

Output:

['acde',
 'acd',
 'ace',
 'ade',
 'cde',
 'ac',
 'ad',
 'ae',
 'cd',
 'ce',
 'de',
 'a',
 'c',
 'd',
 'e']
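
The listing above only prints the ordered subsets; the question asks for a function that returns the most specific match. One way to get that, sketched here as an illustration rather than as part of the answer, is to walk the subsets from largest to smallest and return the first one whose patterns all match. The function name `most_specific_match` and the `patterns` dict (label to regex) are hypothetical, and the sketch assumes Python 3.7+ so that the dict keeps its insertion order.

```python
from itertools import combinations
import re


def most_specific_match(patterns, text):
    """Return the joined labels of the largest subset of patterns that all match.

    `patterns` is a hypothetical mapping from a one-letter label to a regex.
    """
    # Run each regex once and remember whether it matched.
    matched = {label: re.search(regex, text) is not None
               for label, regex in patterns.items()}
    labels = list(patterns)
    # Try subset sizes from largest to smallest, mirroring the if-chain.
    for size in range(len(labels), 0, -1):
        for subset in combinations(labels, size):
            if all(matched[label] for label in subset):
                return ''.join(subset)
    return 'No Match'  # no pattern matched at all


patterns = {'a': 'a', 'b': 'B', 'c': 'c', 'd': 'd', 'e': 'e'}
print(most_specific_match(patterns, 'abcdefghijklmnopqrstuvwxyz'))
```

With the sample data from this answer, the call prints acde, the first entry of the list shown above.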

Answer 1: (score: 0)

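First, those aren't booleans; they are either match objects or None. Second, going through the powerset would be a very inefficient way to solve this problem. Just add each letter to the string if the corresponding regex matched: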

return ''.join(letter for letter, match in zip('abcde', [a, b, c, d, e]) if match)
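
That one-liner returns an empty string when nothing matches, whereas the question wants 'No Match' in that case. A minimal sketch of a complete function follows; the name `describe_matches` and its parameters are hypothetical, and it assumes the labels and patterns are supplied in the same order.

```python
import re


def describe_matches(text, patterns, labels='abcde'):
    """Join the label of every pattern that matches `text`."""
    # re.search returns a match object (truthy) or None (falsy).
    matches = [re.search(pattern, text) for pattern in patterns]
    result = ''.join(label for label, match in zip(labels, matches) if match)
    # An empty result means no pattern matched.
    return result or 'No Match'


print(describe_matches('abcdefghijklmnopqrstuvwxyz', ['a', 'B', 'c', 'd', 'e']))
```

Because the largest subset of matching patterns is simply the set of all patterns that matched, a single pass like this gives the same answer as checking the subsets from largest to smallest.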