I have two lists that share some common elements:
p = [('link1/d/b/c', 'target1/d/b/c'), ('link2/a/g/c', 'target2/a/g/c'), ..., ('linkn/b/b/f', 'targetn/b/b/f')]
q = [['target1/d/b/c', 'target1', 123, 334], ['targetn/b/b/f', 'targetn', 23, 64], ... ,['targetx/f/f/f', 'targetx', 999, 888]]
I am trying to compare them, find the common elements, and then do some work with the result:
do_some_job('target1/d/b/c', 'target1', 123, 334, 'link1/d/b/c')
Right now I use a simple and very slow algorithm:
for item in p:
    link = item[0]
    target = item[1]
    for item2 in q:
        target2 = item2[0]
        if target2 == target:
            do_some_job(...)
I know I need to compare the two lists and build a single list containing the merged elements, for example:
pq = [['target1/d/b/c', 'target1', 123, 334, 'link1/d/b/c'], ..., ['targetn/b/b/f', 'targetn', 23, 64, 'linkn/b/b/f']]
and then call do_some_job(pq) once, instead of calling it every time I find a matching element.
How can I get that?
Best regards
Answer 0 (score: 5)
Use chain() to flatten both lists, then use set() and intersection() to get the common elements.
In [78]: from itertools import chain
In [79]: p
Out[79]:
[('link1/d/b/c', 'target1/d/b/c'),
('link2/a/g/c', 'target2/a/g/c'),
('linkn/b/b/f', 'targetn/b/b/f')]
In [80]: q
Out[80]:
[['target1/d/b/c', 'target1', 123, 334],
['targetn/b/b/f', 'targetn', 23, 64],
['targetx/f/f/f', 'targetx', 999, 888]]
In [81]: set(chain(*p)).intersection(set(chain(*q)))
Out[81]: set(['target1/d/b/c', 'targetn/b/b/f'])
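Flattening everything also compares links, names and numbers against each other. If only the target paths matter, a minimal sketch (assuming, as in the question, that the target is the second field of each p tuple and the first field of each q row) could intersect just those two columns:

```python
# Sample data from the question (the elided "..." entries are left out).
p = [('link1/d/b/c', 'target1/d/b/c'),
     ('link2/a/g/c', 'target2/a/g/c'),
     ('linkn/b/b/f', 'targetn/b/b/f')]
q = [['target1/d/b/c', 'target1', 123, 334],
     ['targetn/b/b/f', 'targetn', 23, 64],
     ['targetx/f/f/f', 'targetx', 999, 888]]

# Assumption: the target path is the 2nd field of each p tuple
# and the 1st field of each q row.
targets_in_p = {target for _link, target in p}
targets_in_q = {row[0] for row in q}

# Set intersection keeps only the targets present in both lists.
common_targets = targets_in_p & targets_in_q
print(sorted(common_targets))
```

Sets are unordered, so the result is sorted here only to make the printed output stable.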
Or use a list comprehension with short-circuiting:
In [86]: [j for i in p for j in i if j in (z for y in q for z in y)]
Out[86]: ['target1/d/b/c', 'targetn/b/b/f']
Or use any():
In [87]: [j for i in p for j in i if any (j==z for y in q for z in y)]
Out[87]: ['target1/d/b/c', 'targetn/b/b/f']
timeit:
In [93]: %timeit set(chain(*p)).intersection(set(chain(*q)))
100000 loops, best of 3: 7.38 us per loop ## winner
In [94]: %timeit [j for i in p for j in i if j in (z for y in q for z in y)]
10000 loops, best of 3: 24.9 us per loop
In [95]: %timeit [j for i in p for j in i if any (j==z for y in q for z in y)]
10000 loops, best of 3: 27.4 us per loop
In [97]: %timeit [x for x in chain(*p) if x in chain(*q)]
10000 loops, best of 3: 12.6 us per loop
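The timings above are for three-element lists, so they may not say much about real data. A rough sketch for repeating the comparison outside IPython with the standard timeit module follows; the synthetic data, the size n and the helper names with_sets and with_scan are all made up for illustration:

```python
import timeit
from itertools import chain

# Hypothetical synthetic data: n pairs in p, and q rows for every other target.
n = 200
p = [('link%d' % i, 'target%d' % i) for i in range(n)]
q = [['target%d' % i, 'name%d' % i, i, i + 1] for i in range(0, 2 * n, 2)]

def with_sets():
    # Flatten both lists and intersect them as sets.
    return set(chain(*p)).intersection(chain(*q))

def with_scan():
    # Re-scan a freshly flattened q for every element of p.
    return [x for x in chain(*p) if x in chain(*q)]

# Both approaches should agree on which elements are shared.
assert with_sets() == set(with_scan())

for func in (with_sets, with_scan):
    seconds = timeit.timeit(func, number=5)
    print(func.__name__, seconds)
```

No numbers are quoted here on purpose: the result depends on the machine and on how large and how overlapping the real lists are.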
Answer 1 (score: 1)
You should use a dictionary:
target_to_link = dict((v, k) for (k, v) in p)
for item in q:
    if item[0] in target_to_link:  # skip targets that have no link in p
        args = item + [target_to_link[item[0]]]
        do_some_job(*args)
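The target_to_link dictionary gives you the link corresponding to each target. Just make sure the same target does not appear more than once in p (several links pointing to one target), because later entries would overwrite earlier ones...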
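In the for loop we just build a temporary argument list, args, by joining item (for example ['target1/d/b/c', 'target1', 123, 334]) with the corresponding link, and then pass it using the function(*args) syntax...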
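A plain dictionary keeps only one link per target, so if the same target shows up more than once in p, earlier links are silently overwritten. One possible way around that, sketched here with hypothetical data and a hypothetical calls list standing in for the real do_some_job calls, is to collect the links in a defaultdict(list):

```python
from collections import defaultdict

# Hypothetical data: two links point at the same target.
p = [('link1/d/b/c', 'target1/d/b/c'),
     ('link1-alias/d/b/c', 'target1/d/b/c')]
q = [['target1/d/b/c', 'target1', 123, 334]]

# Map each target to every link that refers to it.
target_to_links = defaultdict(list)
for link, target in p:
    target_to_links[target].append(link)

calls = []  # stand-in for do_some_job(*args)
for item in q:
    # .get avoids creating empty entries for targets missing from p.
    for link in target_to_links.get(item[0], []):
        calls.append(item + [link])

print(calls)
```

Each q row then produces one argument list per link, which may or may not be what the real job expects.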
If you need to loop over p instead, you can build something like
target_to_args = dict((k[0],k[1:]) for k in q)
and then do something like
for (link, target) in p:
    if target in target_to_args:  # skip targets that have no row in q
        args = [target] + target_to_args[target] + [link]
        do_some_job(*args)
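The question asks for a single merged list pq that is handed to do_some_job once. A minimal sketch of that, assuming each target appears at most once in p and using a made-up do_some_job that just prints the rows:

```python
# Sample data from the question (the elided "..." entries are left out).
p = [('link1/d/b/c', 'target1/d/b/c'),
     ('link2/a/g/c', 'target2/a/g/c'),
     ('linkn/b/b/f', 'targetn/b/b/f')]
q = [['target1/d/b/c', 'target1', 123, 334],
     ['targetn/b/b/f', 'targetn', 23, 64],
     ['targetx/f/f/f', 'targetx', 999, 888]]

def do_some_job(rows):
    # Hypothetical stand-in for the real work; takes the whole merged list.
    for row in rows:
        print(row)

# One pass over p to build the lookup table.
target_to_link = {target: link for link, target in p}

# One pass over q: keep rows whose target is also in p, appending the link.
pq = [row + [target_to_link[row[0]]]
      for row in q if row[0] in target_to_link]

do_some_job(pq)
```

This replaces the nested loops with one pass over each list, since a dictionary lookup does not scan p.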
Answer 2 (score: 0)
A list comprehension with chain should work:
[x for x in chain(*p) if x in chain(*q)]
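That comprehension rebuilds and scans the flattened q for every element of p. A small variation, sketched under the assumption that every element of q is hashable (strings and integers in the question's data), builds the lookup set once and still keeps the order of p:

```python
from itertools import chain

# Sample data from the question (the elided "..." entries are left out).
p = [('link1/d/b/c', 'target1/d/b/c'),
     ('link2/a/g/c', 'target2/a/g/c'),
     ('linkn/b/b/f', 'targetn/b/b/f')]
q = [['target1/d/b/c', 'target1', 123, 334],
     ['targetn/b/b/f', 'targetn', 23, 64],
     ['targetx/f/f/f', 'targetx', 999, 888]]

# Build the lookup set once instead of re-scanning q for every element.
q_elements = set(chain.from_iterable(q))

# Walk p in order and keep the elements that also occur somewhere in q.
common = [x for x in chain.from_iterable(p) if x in q_elements]
print(common)
```

Unlike a pure set intersection, this returns a list in the order the elements appear in p.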