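I have a recursive Python function that generates valid output arrays from an input array whose elements belong to different "types", each type representing a day of the week, for example [m1, m2, m3, m4, t1, t2, t3, t4, w1, w2, w3, w4].
With help from another Stack Overflow user, I worked out a recursive function that takes the input array and returns valid arrays according to my constraints: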
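Example input:
[m1,m2,m3,m4,m5,m6,m7,m8,m9,m10,m11,m12,t1,t2,t3,t4,w1,w2,w3,w4,f1,f2,f3,f4]
Example outputs:
[m1,m2,m3,m4,m5,m6,m7,m8,m9,m10,m11,m12]
(can be a single type, since the other types do not have to be present)
[m5,m6,m7,m8,m9,m10,m11,m12,t1,t2,t3,t4]
(or at least 4 sequential elements of each type)
[m4,m5,m6,m7,w1,w2,w3,w4,f1,f2,f3,f4]
(at least 4 of each type that is present, but a type may be missing), etc.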
Invalid:
[m4,m6,m5,m7,w1,w2,w3,w4,f1,f2,f3,f4]
(out of order)
[m4,m5,m6,m7,m8,w1,w2,w3,w4,f1,f2,f3]
(not every type has at least 4 elements: f has only 3)
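To make the rules above concrete, here is a minimal sketch of a checker. The name `is_valid` and its parameters `total` and `min_run` are hypothetical, and the rule that each day forms exactly one contiguous block is an assumption inferred from the examples and from the way the recursive function below extends a day.

```python
import re
from itertools import groupby

def is_valid(schedule, total=12, min_run=4):
    """Return True if `schedule` meets the constraints as understood here (a sketch)."""
    if len(schedule) != total:
        return False
    parsed = []
    for label in schedule:
        m = re.fullmatch(r'([a-zA-Z]+)(\d+)', label)
        if m is None:
            return False
        parsed.append((m.group(1), int(m.group(2))))
    seen_days = set()
    # groupby splits the schedule into blocks of adjacent elements with the same day
    for day, group in groupby(parsed, key=lambda p: p[0]):
        if day in seen_days:
            return False          # assumption: a day may appear in one block only
        seen_days.add(day)
        numbers = [n for _, n in group]
        if len(numbers) < min_run:
            return False          # a day that is present needs at least min_run elements
        if any(b - a != 1 for a, b in zip(numbers, numbers[1:])):
            return False          # elements of a day must be sequential
    return True

print(is_valid(['m4', 'm5', 'm6', 'm7', 'w1', 'w2', 'w3', 'w4', 'f1', 'f2', 'f3', 'f4']))  # True
print(is_valid(['m4', 'm6', 'm5', 'm7', 'w1', 'w2', 'w3', 'w4', 'f1', 'f2', 'f3', 'f4']))  # False
```

A checker like this is only useful for testing candidate outputs; it does not help generate them quickly.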
Working code:
import collections
import re

data = ['f13', 'f14', 'f15', 'f16', 'f17', 'w0', 'w1', 'w2', 'w3', 't4', 't5', 't6', 't7', 't8', 't9', 'r4', 'r5', 'r6', 'r7', 'r8', 'r9', 'm0', 'm1', 'm2', 'm3']

def combo(d, c = []):
    # c is never mutated (only c+[i] is passed on), so the list default is safe here
    if len(c) == 12:
        yield c
    else:
        for i in d:
            # per-day element counts for c, without and with the candidate i
            _count1 = collections.Counter([re.findall('^[a-zA-Z]+', j)[0] for j in c])
            _count2 = collections.Counter([re.findall('^[a-zA-Z]+', j)[0] for j in c+[i]])
            if i not in c:
                # the "at least 4 per day" rule is only checked when the 12th element is added
                if len(c) < 11 or all(b >= 4 for b in _count2.values()):
                    if re.findall('^[a-zA-Z]+', i)[0] in _count1:
                        # day already present: i must directly follow c[-1] within the same day
                        if int(re.findall(r'\d+$', i)[0])-1 == int(re.findall(r'\d+$', c[-1])[0]) and re.findall('^[a-zA-Z]+', i)[0] == re.findall('^[a-zA-Z]+', c[-1])[0]:
                            yield from combo(d, c+[i])
                    else:
                        # first element of a day not yet in c
                        yield from combo(d, c+[i])

result = combo(data)
print(next(result))
Output:
['f13','f14','f15','f16','w0','w1','w2','w3','t4','t5','t6','t7']
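This function successfully returns a correct/valid schedule, but it takes 299 seconds to produce the first result. Is there a way to optimize the code, or to preprocess the input array somehow, so that results are returned faster? Thanks.
Edit for clarification:
I need a function (like the one I have now) that generates all possible outputs for an input that are valid under my constraints, preferably as a generator, so that I can step through the results one at a time as needed and check whether each combination works in my program.
For example, with the same input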
data = ['f13', 'f14', 'f15', 'f16', 'f17', 'w0', 'w1', 'w2', 'w3', 't4', 't5', 't6', 't7', 't8', 't9', 'r4', 'r5', 'r6', 'r7', 'r8', 'r9', 'm0', 'm1', 'm2', 'm3']
I might get outputs such as
['f13','f14','f15','f16','w0','w1','w2','w3','t4','t5','t6','t7']
['f14','f15','f16','f17','w0','w1','w2','w3','t4','t5','t6','t7']
['f13','f14','f15','f16','r4','r5','r6','r7','t4','t5','t6','t7']
etc.
With a different input
data = ['m0','m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8','m9','m10','m11','t0','t1', 't2', 't3', 't4', 't5']
I might get outputs such as
['m0','m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8','m9','m10','m11']
['m0','m1', 'm2', 'm3', 'm4', 'm5', 't0','t1', 't2', 't3', 't4','t5']
['m0','m1', 'm2', 'm3', 'm4', 'm5', 'm6','t1', 't2', 't3', 't4','t5']
etc.
Note: for my purposes the following two outputs are equivalent, but it is not necessary to print only one of them
['m0','m1', 'm2', 'm3', 'm4', 'm5', 't0','t1', 't2', 't3', 't4','t5']
[ 't0','t1', 't2', 't3', 't4', 't5', 'm0','m1', 'm2', 'm3', 'm4', 'm5']
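If you do want to collapse such equivalent outputs, one option is to compare schedules through an order-independent key. The following is a minimal sketch; the helper names `canonical` and `unique_schedules` are hypothetical, and it assumes every label is letters followed by digits.

```python
import re

def canonical(schedule):
    """Order-independent key: the same elements in any block order give the same key."""
    def parse(label):
        day, number = re.fullmatch(r'([a-zA-Z]+)(\d+)', label).groups()
        return day, int(number)   # sort numerically, so 'm10' comes after 'm9'
    return tuple(sorted(schedule, key=parse))

def unique_schedules(schedules):
    """Yield only the first schedule seen for each canonical key."""
    seen = set()
    for s in schedules:
        key = canonical(s)
        if key not in seen:
            seen.add(key)
            yield s

a = ['m0', 'm1', 'm2', 'm3', 'm4', 'm5', 't0', 't1', 't2', 't3', 't4', 't5']
b = ['t0', 't1', 't2', 't3', 't4', 't5', 'm0', 'm1', 'm2', 'm3', 'm4', 'm5']
print(canonical(a) == canonical(b))          # True
print(len(list(unique_schedules([a, b]))))   # 1
```

Because `unique_schedules` is itself a generator, it can wrap any generator of schedules and still be consumed one result at a time.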
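Answer 0: (score: 1)
You can try this code. I preprocess the data (extract the numbers from the strings and sort them) so that the regex is not executed in every iteration: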
import re
from itertools import groupby
from itertools import combinations

data = ['m0','m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8','m9','m10','m11','t0','t1', 't2', 't3', 't4', 't5']

# returns e.g. (for the first input in the question):
# {'f': [13, 14, 15, 16, 17], 'w': [0, 1, 2, 3], 't': [4, 5, 6, 7, 8, 9], 'r': [4, 5, 6, 7, 8, 9], 'm': [0, 1, 2, 3]}
def preprocess_data(data):
    out = {}
    for item in data:
        for k, v in re.findall(r'(\w)(\d+)', item):
            out.setdefault(k, []).append(int(v))
    for k in out:
        out[k].sort()
    return out

# 1. if an element from a specific day is present, there must be at least 4 of them
# 2. the elements from a certain day must be sequential <- they are, because we preprocessed the data
# 3. must be 12 total elements
def check(data):
    rv = {}
    keys = set()
    for k, v in data.items():
        # index minus value is constant within a run of consecutive numbers
        for vv, gg in groupby(enumerate(v), lambda k: k[0]-k[1]):
            consecutive_elements = [ii[1] for ii in gg]
            keys.add(k)
            # every prefix of the run with at least 4 elements is a candidate
            for i in range(4, len(consecutive_elements) + 1):
                rv.setdefault(k, []).append(consecutive_elements[:i])
            break  # only the first run; the rotations below move the start

    # the empty list stands for "this day is not used"
    for k in [*rv.keys()]:
        rv[k].append([])

    for c in combinations([(k, i) for k, v in rv.items() for i in v], len(rv)):
        if any(len(i[1]) < 4 for i in c if len(i[1]) > 0):
            continue
        # skip picks that use the same day more than once
        elements = [i[0] for i in c]
        if len(elements) != len(set(elements)):
            continue
        c2 = tuple(i[0] + str(ii) for i in c for ii in i[1])
        if len(c2) == 12:
            yield c2

def get_valid_combinations(data, dont_rotate=[], seen=None):
    # create the set per top-level call; a set() default would be shared between calls
    if seen is None:
        seen = set()
    for c in check(data):
        if c not in seen:
            seen.add(c)
            yield c

    for k, v in data.items():
        if k in dont_rotate:
            continue
        # rotate this day's numbers so that check() sees a different starting element
        for n in range(len(v)):
            data[k] = v[n:] + v[:n]
            yield from get_valid_combinations(data, dont_rotate + [k], seen)

for a in get_valid_combinations(preprocess_data(data)):
    print(a)
Prints:
('m0', 'm1', 'm2', 'm3', 'm4', 'm5', 't0', 't1', 't2', 't3', 't4', 't5')
('m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 't0', 't1', 't2', 't3', 't4')
('m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 't0', 't1', 't2', 't3')
('m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 'm11')
('m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 't1', 't2', 't3', 't4', 't5')
('m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 't1', 't2', 't3', 't4')
('m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 't2', 't3', 't4', 't5')
('m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 't0', 't1', 't2', 't3', 't4')
('m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 't0', 't1', 't2', 't3')
('m1', 'm2', 'm3', 'm4', 'm5', 'm6', 't0', 't1', 't2', 't3', 't4', 't5')
('m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 't1', 't2', 't3', 't4', 't5')
('m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 't1', 't2', 't3', 't4')
('m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 't2', 't3', 't4', 't5')
('m2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 't0', 't1', 't2', 't3')
('m2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 't0', 't1', 't2', 't3', 't4')
('m2', 'm3', 'm4', 'm5', 'm6', 'm7', 't0', 't1', 't2', 't3', 't4', 't5')
('m2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 't1', 't2', 't3', 't4', 't5')
('m2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 't1', 't2', 't3', 't4')
('m2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 't2', 't3', 't4', 't5')
('m3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 't0', 't1', 't2', 't3')
('m3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 't0', 't1', 't2', 't3', 't4')
('m3', 'm4', 'm5', 'm6', 'm7', 'm8', 't0', 't1', 't2', 't3', 't4', 't5')
('m3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 't1', 't2', 't3', 't4', 't5')
('m3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 't1', 't2', 't3', 't4')
('m3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 't2', 't3', 't4', 't5')
('m4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 'm11', 't2', 't3', 't4', 't5')
('m4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 'm11', 't0', 't1', 't2', 't3')
('m4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 't0', 't1', 't2', 't3', 't4')
('m4', 'm5', 'm6', 'm7', 'm8', 'm9', 't0', 't1', 't2', 't3', 't4', 't5')
('m4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 't1', 't2', 't3', 't4', 't5')
('m4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 'm11', 't1', 't2', 't3', 't4')
('m5', 'm6', 'm7', 'm8', 'm9', 'm10', 'm11', 't1', 't2', 't3', 't4', 't5')
('m5', 'm6', 'm7', 'm8', 'm9', 'm10', 'm11', 't0', 't1', 't2', 't3', 't4')
('m5', 'm6', 'm7', 'm8', 'm9', 'm10', 't0', 't1', 't2', 't3', 't4', 't5')
('m6', 'm7', 'm8', 'm9', 'm10', 'm11', 't0', 't1', 't2', 't3', 't4', 't5')
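For comparison, here is a minimal sketch of a different technique from the rotation approach above: enumerate every contiguous window of at least 4 numbers per day directly, then pick at most one window per day so that the lengths add up to 12. The function names `runs_by_day`, `windows` and `schedules` and the parameters `total` and `min_run` are hypothetical. It assumes single-block days and treats outputs that differ only in the order of the day blocks as the same schedule, so each one is produced once with days in first-appearance order.

```python
import re
from itertools import groupby

def runs_by_day(data):
    """Group labels by day and split each day's numbers into consecutive runs."""
    days = {}
    for label in data:
        day, number = re.fullmatch(r'([a-zA-Z]+)(\d+)', label).groups()
        days.setdefault(day, set()).add(int(number))
    out = {}
    for day, numbers in days.items():
        ordered = sorted(numbers)
        # value minus index is constant within a run of consecutive numbers
        out[day] = [[n for _, n in g]
                    for _, g in groupby(enumerate(ordered), key=lambda p: p[1] - p[0])]
    return out

def windows(run, min_len, max_len):
    """Yield all contiguous slices of `run` with min_len <= length <= max_len."""
    for size in range(min_len, min(len(run), max_len) + 1):
        for start in range(len(run) - size + 1):
            yield run[start:start + size]

def schedules(data, total=12, min_run=4):
    """Lazily yield each valid schedule once, as a tuple of labels."""
    runs = runs_by_day(data)
    days = list(runs)

    def extend(index, remaining, chosen):
        if remaining == 0:
            yield tuple(chosen)
            return
        if index == len(days) or remaining < min_run:
            return  # nothing left to pick, or too few slots for another day
        day = days[index]
        # option 1: leave this day out
        yield from extend(index + 1, remaining, chosen)
        # option 2: use exactly one window of this day
        for run in runs[day]:
            for w in windows(run, min_run, remaining):
                yield from extend(index + 1, remaining - len(w),
                                  chosen + [f'{day}{n}' for n in w])

    yield from extend(0, total, [])

data = ['m0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8', 'm9', 'm10', 'm11',
        't0', 't1', 't2', 't3', 't4', 't5']
print(len(list(schedules(data))))  # 35
```

For this input the sketch yields 35 schedules, the same number of tuples as printed above, although in a different order. Since `schedules` is a generator, `next()` or `itertools.islice` can be used to pull results one at a time.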