我有一对大整数对的列表。例如
pairs = [((3, 5), (5, 5)), ((1, 1), (2, 5)), ((5, 1), (4, 3))]
我还有一大堆整数列表。例如,
nums = [(2, 5), (4, 2), (5, 2)]
我想从对中移除任何在nums中有任何对的对。例如,
nums = set(nums)
pairs = [((x1,y1),(x2,y2)) for ((x1,y1),(x2,y2)) in pairs if not (set([(x1,y1),(x2,y2)]) & nums)]
这给出了
[((3, 5), (5, 5)), ((5, 1), (4, 3))]
问题是当对和nums很大时这很慢。你怎么能加快速度呢?
缓慢输入示例:
import random
nums = [(random.randint(1,50000),random.randint(1,50000)) for i in xrange(1000000)]
pairs = [((random.randint(1,50000),random.randint(1,50000)), (random.randint(1,50000),random.randint(1,50000))) for i in xrange(8000000)]
答案 0 :(得分:4)
这将是最快的
nums=set(nums)
pairs= filter(nums.isdisjoint, pairs)
以下是时间:
In [1]: import random
In [2]: pairs=[(random.randint(0,50),random.randint(0,50)) for i in range (1000)]
In [3]: nums=[random.randint(0,1000) for i in range(500)]
In [4]: numset=set(nums)
In [5]: %timeit [(x,y) for (x,y) in pairs if not (set([x,y]) & numset)]
1000 loops, best of 3: 746 us per loop
In [6]: %timeit [(x,y) for (x,y) in pairs if x not in numset and y not in numset]
10000 loops, best of 3: 145 us per loop
In [7]: %timeit filter(numset.isdisjoint, pairs)
10000 loops, best of 3: 95.1 us per loop
答案 1 :(得分:2)
一个建议是将nums
设为一组,以便查找更快。
pairs = [(1,2),(2,3),(7,2)]
nums = {3,7} # Its a set now
print [(first, second) for first, second in pairs if first not in nums and second not in nums]
<强>输出强>
[(1, 2)]
答案 2 :(得分:0)
这应该是您可以衡量的一些解决方案
def rm_pairs(pairs, nums):
cursor = 0
while True:
if cursor == len(pairs):
break
#print cursor
p = pairs[cursor]
for n in nums:
if n in p:
pairs.pop(cursor)
cursor = 0
break
cursor += 1