I have this dictionary:
num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)],
}
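I need to find the maximum number of 3-long combinations of the first value of each tuple, where only the values listed under a key may follow that key.
My current code for finding all unique (3-long) combinations is: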
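def count_unique_triples(num_dict):
    ans_set = set()
    for x in num_dict:
        for y in num_dict[x]:
            for z in num_dict[y]:
                # Keep only the first value of each tuple along the chain x -> y -> z
                ans_set.add((x[0], y[0], z[0]))
    return len(ans_set)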
This returns 10, and ans_set ends up as:
{
    (2, 2, 2), (1, 2, 2), (1, 4, 4),
    (2, 2, 4), (1, 1, 2), (4, 4, 4),
    (1, 2, 4), (1, 1, 4), (1, 1, 1),
    (2, 4, 4)
}
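But I don't actually care what the combinations are, only how many there are.
This approach is inefficient because it generates every possible combination and puts it into a set.
I don't need to know each unique combination; I only need to know how many there are.
I feel like this can be done, perhaps by using the lengths of the value lists, but I can't wrap my head around it.
Clarifying questions about what I need are welcome, as I realize I may not have explained it in the clearest way.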
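One possible direction, sketched here under assumptions and not taken from the original post: group the work by the pair of first values (a, b) and only union the small sets of third values, instead of storing every full triple. The names count_triples_grouped, first_after, and thirds are hypothetical. Unlike the loop above, this sketch treats a value that is not itself a key as having no successors.

```python
from collections import defaultdict

num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)],
}

def count_triples_grouped(num_dict):
    # For every key, precompute the set of first values that may follow it.
    first_after = {key: {z[0] for z in values} for key, values in num_dict.items()}

    # For every pair (a, b) of first values, collect the possible third values.
    thirds = defaultdict(set)
    for x, ys in num_dict.items():
        for y in ys:
            # Assumption: a tuple that is not a key has no successors.
            thirds[(x[0], y[0])] |= first_after.get(y, set())

    # Each (a, b) pair contributes one distinct triple per third value.
    return sum(len(c) for c in thirds.values())

print(count_triples_grouped(num_dict))
```

This still builds sets, but they hold single integers keyed by pairs, so the innermost loop over z runs once per key rather than once per (x, y) pair.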
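After re-evaluating what I actually needed, I found the best way to get the number of triples. This method doesn't actually find the triples, it only counts them.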
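def foo(l):
    llen = len(l)
    total = 0
    # cache[i] counts the earlier elements that evenly divide l[i]
    cache = {}
    for i in range(llen):
        cache[i] = 0
    for x in range(llen):
        for y in range(x + 1, llen):
            if l[y] % l[x] == 0:
                cache[y] += 1
                # every divisor pair ending at x extends to a triple ending at y
                total += cache[x]
    return total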
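To make the counting idea easier to check, here is a small sketch that compares the same cache-based logic against a brute-force count. It assumes l is a list of positive integers and that a "triple" means three positions i < j < k where l[i] divides l[j] and l[j] divides l[k], which is what the function above appears to count. The names count_divisor_triples and brute_force are made up for this illustration.

```python
from itertools import combinations

def count_divisor_triples(l):
    # pairs_ending_at[i] = number of earlier elements that divide l[i]
    pairs_ending_at = [0] * len(l)
    total = 0
    for x in range(len(l)):
        for y in range(x + 1, len(l)):
            if l[y] % l[x] == 0:
                pairs_ending_at[y] += 1
                # Each pair (w, x) already counted extends to a triple (w, x, y).
                total += pairs_ending_at[x]
    return total

def brute_force(l):
    # combinations() keeps the original order, so a, b, c come from i < j < k.
    return sum(1 for a, b, c in combinations(l, 3) if b % a == 0 and c % b == 0)

sample = [1, 2, 3, 4, 5, 6]
print(count_divisor_triples(sample), brute_force(sample))
```

The cache-based version needs two nested loops where the brute-force check needs three, which is the whole point of counting instead of enumerating.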
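Here is a version of the function that explains the thought process (though it is not suitable for large lists because of all the printing):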
def bar(l):
    list_length = len(l)
    total_triples = 0
    cache = {}
    for i in range(list_length):
        cache[i] = 0
    for x in range(list_length):
        print("\n\nfor index[{}]: {}".format(x, l[x]))
        for y in range(x + 1, list_length):
            print("\n\ttry index[{}]: {}".format(y, l[y]))
            if l[y] % l[x] == 0:
                print("\n\t\t{} can be evenly divided by {}".format(l[y], l[x]))
                cache[y] += 1
                total_triples += cache[x]
                print("\t\tcache[{0}] is now {1}".format(y, cache[y]))
                print("\t\tcount is now {}".format(total_triples))
                print("\t\t(+{} from cache[{}])".format(cache[x], x))
            else:
                print("\n\t\tfalse")
    print("\ntotal number of triples:", total_triples)
Answer 0: (score: 1)
If I understand you correctly:
from itertools import combinations
num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)]
}
set(combinations([k[0] for k in num_dict.keys()], 3))
Output:
{(1, 4, 1),
(2, 1, 1),
(2, 1, 4),
(2, 2, 1),
(2, 2, 4),
(2, 4, 1),
(2, 4, 4),
(4, 1, 1),
(4, 1, 4),
(4, 4, 1)}
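And its len() is 10.
Basically, you build all length-3 combinations of the first elements of the dict keys with itertools.combinations, then wrap the result in a set to eliminate duplicates.
Update
Since you updated the question with the desired output data, you can do the following: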
from itertools import combinations_with_replacement
list(combinations_with_replacement(set([k[0] for k in num_dict.keys()]), 3))
Output:
[(1, 1, 1),
(1, 1, 2),
(1, 1, 4),
(1, 2, 2),
(1, 2, 4),
(1, 4, 4),
(2, 2, 2),
(2, 2, 4),
(2, 4, 4),
(4, 4, 4)]
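If only the number is needed, the list does not have to be built at all. The number of size-k combinations with replacement drawn from n distinct items is C(n + k - 1, k), so a sketch like the following should give the same count directly. It assumes Python 3.8+ for math.comb, and the variable names are illustrative. Note that this count depends only on how many distinct first values the keys have; it never looks at the value lists in num_dict.

```python
from itertools import combinations_with_replacement
from math import comb  # available from Python 3.8

num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)],
}

# Distinct first values of the keys: {1, 2, 4}
distinct_firsts = {k[0] for k in num_dict}
n = len(distinct_firsts)

# Closed form for multisets of size 3 drawn from n items.
closed_form = comb(n + 3 - 1, 3)

# The enumerated count from the snippet above, for comparison.
enumerated = len(list(combinations_with_replacement(sorted(distinct_firsts), 3)))

print(closed_form, enumerated)
```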
UPD2
Regarding time consumption, I ran:
num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)]
}

# a: the approach from the question
def a(num_dict):
    ans_set = set()
    for x in num_dict:
        for y in num_dict[x]:
            for z in num_dict[y]:
                ans_set.add((x[0], y[0], z[0]))
    return len(ans_set)

# b: the approach from this answer
def b(num_dict):
    from itertools import combinations_with_replacement
    return len(list(combinations_with_replacement(set([k[0] for k in num_dict.keys()]), 3)))

# %timeit is an IPython/Jupyter magic command
%timeit a(num_dict)
%timeit b(num_dict)
The results are:
The slowest run took 4.90 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 12.1 µs per loop
The slowest run took 5.37 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.77 µs per loop
The solution I proposed here is more than twice as fast on this input (4.77 µs versus 12.1 µs per loop).
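The %timeit lines only work inside IPython or Jupyter. For anyone who wants to repeat the comparison in a plain Python script, a rough sketch with the standard timeit module could look like this. The repetition count of 10,000 is an arbitrary choice, and the timings will differ from machine to machine.

```python
import timeit
from itertools import combinations_with_replacement

num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)],
}

def a(num_dict):
    # Approach from the question: walk x -> y -> z and collect first values.
    ans_set = set()
    for x in num_dict:
        for y in num_dict[x]:
            for z in num_dict[y]:
                ans_set.add((x[0], y[0], z[0]))
    return len(ans_set)

def b(num_dict):
    # Approach from the answer: combinations with replacement of distinct first values.
    return len(list(combinations_with_replacement({k[0] for k in num_dict}, 3)))

RUNS = 10_000  # arbitrary repetition count for this sketch
for func in (a, b):
    seconds = timeit.timeit(lambda: func(num_dict), number=RUNS)
    print(f"{func.__name__}: {seconds / RUNS * 1e6:.2f} µs per call")
```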