itertools.product with variable validation

Time: 2016-10-05 08:51:34

Tags: python list dictionary itertools

I have a list of this form:

data = [
   [
    {'name': 's11', 'class': 'c1'}, {'name': 's12', 'class': 'c2'}
   ],
   [
    {'name': 's21', 'class': 'c2'}, {'name': 's22', 'class': 'c2'}
   ],
   [
    {'name': 's31', 'class': 'c1'}, {'name': 's32', 'class': 'c1'}
   ]
]

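Using itertools.product(*data) I take one element from each sublist of the main list data, which gives me all the combinations. What I want is to skip a combination when the element from the first sublist has a different class from the element taken from the second or third sublist.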

Does itertools.product provide any validation option for this case?

The expected result should be:

({'name': 's11', 'class': 'c1'}, {'name': 's31', 'class': 'c1'}),
({'name': 's11', 'class': 'c1'}, {'name': 's32', 'class': 'c1'}),
({'name': 's12', 'class': 'c2'}, {'name': 's21', 'class': 'c2'}),
({'name': 's12', 'class': 'c2'}, {'name': 's22', 'class': 'c2'}),

2 answers:

Answer 0 (score: 1)

from itertools import chain, combinations, product
result = [
    (a, b) for a, b 
    in chain.from_iterable(product(*l) for l in combinations(data, 2)) 
    if a['class'] == b['class']
]
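
Note that combinations(data, 2) also pairs the second sublist with the third, which goes beyond "first sublist against the rest". On the question's data no such pair shares a class, so the same four pairs come out (in a different order). The sketch below uses made-up data (`sample` is hypothetical, not from the question) to show where the two readings diverge:

```python
from itertools import chain, combinations, product

# Hypothetical data: the second and third sublists share class 'c2',
# while the first sublist has no match at all.
sample = [
    [{'name': 'a1', 'class': 'c1'}],
    [{'name': 'b1', 'class': 'c2'}],
    [{'name': 'c1', 'class': 'c2'}],
]

# Every pair of sublists is compared, including (second, third).
all_pairs = [
    (a, b)
    for a, b in chain.from_iterable(product(*pair) for pair in combinations(sample, 2))
    if a['class'] == b['class']
]

# Only the first sublist is compared against the later ones.
first_only = [
    (a, b)
    for a, b in product(sample[0], chain.from_iterable(sample[1:]))
    if a['class'] == b['class']
]

print(len(all_pairs), len(first_only))
```

If only pairs anchored on the first sublist are wanted, the second form is the one that matches the question's wording.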

Answer 1 (score: 0)

Almost the same as what @skovorodkin posted.

from itertools import product, chain

data = [
    [{'name': 's11', 'class': 'c1'}, {'name': 's12', 'class': 'c2'}],
    [{'name': 's21', 'class': 'c2'}, {'name': 's22', 'class': 'c2'}],
    [{'name': 's31', 'class': 'c1'}, {'name': 's32', 'class': 'c1'}]
]
output = [i for i in product(data[0], chain.from_iterable(data[1:])) if i[0]['class'] == i[1]['class']]
output

Output:

[({'class': 'c1', 'name': 's11'}, {'class': 'c1', 'name': 's31'}),
 ({'class': 'c1', 'name': 's11'}, {'class': 'c1', 'name': 's32'}),
 ({'class': 'c2', 'name': 's12'}, {'class': 'c2', 'name': 's21'}),
 ({'class': 'c2', 'name': 's12'}, {'class': 'c2', 'name': 's22'})]
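
itertools.product itself takes only the iterables and a repeat argument, so the check has to live in a filter around it. As a minimal sketch, the same filter can be wrapped in a lazy generator that works for any number of sublists; the name `pairs_matching_first` and its `key` parameter are assumptions made for illustration, not part of itertools:

```python
from itertools import chain, product

def pairs_matching_first(groups, key='class'):
    """Yield (a, b) with a from groups[0] and b from any later group,
    keeping only pairs whose values under `key` are equal."""
    rest = chain.from_iterable(groups[1:])
    for a, b in product(groups[0], rest):
        if a[key] == b[key]:
            yield a, b

data = [
    [{'name': 's11', 'class': 'c1'}, {'name': 's12', 'class': 'c2'}],
    [{'name': 's21', 'class': 'c2'}, {'name': 's22', 'class': 'c2'}],
    [{'name': 's31', 'class': 'c1'}, {'name': 's32', 'class': 'c1'}],
]

result = list(pairs_matching_first(data))
for a, b in result:
    print(a['name'], b['name'])
```

Because it is a generator, pairs are filtered as they are produced and the caller decides whether to materialize them with list().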

Update

Just a small comparison. I used only the default data, and the timing results are as follows:

@skovorodkin's answer:

>>> %timeit result = [(a, b) for a, b in chain.from_iterable(product(*l) for l in combinations(data, 2)) if a['class'] == b['class']]
The slowest run took 8.56 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.69 µs per loop

My answer:

>>> %timeit output = [i for i in product(data[0], chain.from_iterable(data[1:])) if i[0]['class'] == i[1]['class']]
The slowest run took 10.37 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.43 µs per loop
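
With only six dictionaries, a gap of about a microsecond is small next to the run-to-run variation IPython warns about, so these numbers are a rough indication at best. To repeat the comparison outside IPython, a sketch using the standard timeit module might look like this (the function names are made up for illustration, and the timings will differ by machine):

```python
import timeit
from itertools import chain, combinations, product

data = [
    [{'name': 's11', 'class': 'c1'}, {'name': 's12', 'class': 'c2'}],
    [{'name': 's21', 'class': 'c2'}, {'name': 's22', 'class': 'c2'}],
    [{'name': 's31', 'class': 'c1'}, {'name': 's32', 'class': 'c1'}],
]

def with_combinations():
    # Approach from answer 0: compare every pair of sublists.
    return [
        (a, b)
        for a, b in chain.from_iterable(product(*l) for l in combinations(data, 2))
        if a['class'] == b['class']
    ]

def with_first_sublist():
    # Approach from answer 1: compare the first sublist with the rest.
    return [
        i for i in product(data[0], chain.from_iterable(data[1:]))
        if i[0]['class'] == i[1]['class']
    ]

loops = 10_000
for fn in (with_combinations, with_first_sublist):
    # Take the best of three repeats, as %timeit did here.
    best = min(timeit.repeat(fn, number=loops, repeat=3))
    print(f"{fn.__name__}: {best / loops * 1e6:.2f} us per loop")
```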