Python: merging two arrays on specific columns

Time: 2014-11-14 13:38:13

Tags: python arrays performance list numpy

I have two sets/arrays/lists:

a = [(12, 14, 0.3, 0.6, 0.8), (16, 18, 0.4, 0.5, 0.3), (19, 22, 0.4, 0.5, 0.3)]
b = [(12, 14, 44, 12), (5, 4, 66, 12), (19, 22, 96, 45)]

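I want to find c, the list of items in b that also appear in a, where only the first two elements of each tuple need to match (e.g. 12, 14). So in this case the answer c would be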

c = [(12, 14, 44, 12), (19, 22, 96, 45)]

I used nested loops, but that is too slow. Thanks.

4 answers:

Answer 0 (score: 3)

You can do this with a list comprehension:

>>> a = [(12, 14, 0.3, 0.6, 0.8), (16, 18, 0.4, 0.5, 0.3), (19, 22, 0.4, 0.5, 0.3)]
>>> b = [(12, 14, 44, 12), (5, 4, 66, 12), (19, 22, 96, 45)]
>>> [item for item in b for checker in a if item[:2] == checker[:2]]
[(12, 14, 44, 12), (19, 22, 96, 45)]
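
One caveat with the nested comprehension: it emits an item from b once for every tuple in a whose first two elements match, so repeated keys in a produce repeated rows in the result. A minimal sketch that avoids this with any() follows. The list a_dup is made up for illustration and is not from the question, and the approach is still O(len(a) * len(b)):

```python
# Hypothetical input: the key (12, 14) appears twice in a_dup.
a_dup = [(12, 14, 0.3, 0.6, 0.8), (12, 14, 0.1, 0.2, 0.3), (19, 22, 0.4, 0.5, 0.3)]
b = [(12, 14, 44, 12), (5, 4, 66, 12), (19, 22, 96, 45)]

# Nested comprehension: one output row per matching (item, checker) pair.
nested = [item for item in b for checker in a_dup if item[:2] == checker[:2]]

# any() keeps each item from b at most once and stops at the first match.
deduped = [item for item in b if any(item[:2] == checker[:2] for checker in a_dup)]

print(nested)   # (12, 14, 44, 12) shows up twice here
print(deduped)  # each matching item shows up once
```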

Answer 1 (score: 2)

If you first store all the unique two-item tuples from a in a set, you can do this in O(N) time:

>>> keys = {x[:2] for x in a}
>>> [x for x in b if x[:2] in keys]
[(12, 14, 44, 12), (19, 22, 96, 45)]
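
The set-based idea can be wrapped in a small helper. This is only a sketch: the name filter_by_prefix and the parameter n are made up for illustration and are not part of the answer. Building the set costs O(len(a)) and each membership test is O(1) on average, which is where the linear running time comes from.

```python
def filter_by_prefix(a, b, n=2):
    """Return the tuples of b whose first n elements also start some tuple in a."""
    # Collect the distinct n-element prefixes of a once, up front.
    keys = {x[:n] for x in a}
    # Keep b's order; each lookup in the set is O(1) on average.
    return [y for y in b if y[:n] in keys]


a = [(12, 14, 0.3, 0.6, 0.8), (16, 18, 0.4, 0.5, 0.3), (19, 22, 0.4, 0.5, 0.3)]
b = [(12, 14, 44, 12), (5, 4, 66, 12), (19, 22, 96, 45)]

c = filter_by_prefix(a, b)
print(c)
```

Unlike the zip approach, this does not depend on matching tuples sitting at the same position in both lists, and a and b may have different lengths.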

Note that if you are only trying to match items at the same index, you can simply use zip with a list comprehension:

>>> [y for x, y in zip(a, b) if x[:2] == y[:2]]
[(12, 14, 44, 12), (19, 22, 96, 45)]

# Equivalent NumPy version:
>>> import numpy as np
>>> arr_a = np.array(a)
>>> arr_b = np.array(b)
>>> arr_b[(arr_b[:,:2] == arr_a[:,:2]).all(axis=1)]
array([[12, 14, 44, 12],
       [19, 22, 96, 45]])

Answer 2 (score: 1)

If you are already using numpy (imported as np), you can do this with numpy directly:

In [49]: a = np.array([(12, 14, 0.3, 0.6, 0.8), (16, 18, 0.4, 0.5, 0.3), (19, 22, 0.4, 0.5, 0.3)])

In [50]: b = np.array([(12, 14, 44.0, 12.0), (5, 4, 66.0, 12.0), (19, 22, 96.0, 45.0)])

In [51]: print(b[np.all(a[:,:2]==b[:,:2],1)])
[[ 12.  14.  44.  12.]
 [ 19.  22.  96.  45.]]

How does it work?

In [52]: print(a[:,:2]==b[:,:2])
[[ True  True]
 [False False]
 [ True  True]]

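np.all takes an array of booleans and reduces it with logical AND along the axis given by the optional second argument (or over all elements if no axis is given):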

In [53]: print(np.all(a[:,:2]==b[:,:2]))
False

In [69]: print(np.all(a[:,:2]==b[:,:2],1))
[ True False  True]

In [70]: print(np.all(a[:,:2]==b[:,:2],0))
[False False]
[False False]


In our case, of course, the right axis to use is 1, because we want one True/False value per row.

(PS: I must admit I was a bit sloppy about the types of the array values.)
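
Note that a[:,:2]==b[:,:2] compares row i of a only with row i of b, so it relies on matching rows sitting at the same index and on both arrays having the same number of rows. If that does not hold, one possible sketch (not from the answer above) uses broadcasting to compare every row of b against every row of a. It builds a len(b) x len(a) x 2 boolean array, so it is only reasonable for moderately sized inputs. The reordered, longer b below is made up for illustration.

```python
import numpy as np

a = np.array([(12, 14, 0.3, 0.6, 0.8), (16, 18, 0.4, 0.5, 0.3), (19, 22, 0.4, 0.5, 0.3)])
# Hypothetical b: the question's rows in a different order, plus one extra row.
b = np.array([(19, 22, 96.0, 45.0), (5, 4, 66.0, 12.0), (12, 14, 44.0, 12.0), (7, 7, 1.0, 2.0)])

# pairwise[i, j] is True when the first two columns of b[i] equal those of a[j].
pairwise = (b[:, None, :2] == a[None, :, :2]).all(axis=2)

# A row of b is kept if it matches at least one row of a.
mask = pairwise.any(axis=1)
c = b[mask]
print(c)
```

Here all(axis=2) collapses the two compared columns into one boolean per (b row, a row) pair, and any(axis=1) then asks whether any row of a matched.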

Answer 3 (score: 0)

Using a list comprehension:

      >>> a = [(12, 14, 0.3, 0.6, 0.8), (16, 18, 0.4, 0.5, 0.3), (19, 22, 0.4, 0.5, 0.3)]
      >>> b = [(12, 14, 44, 12), (5, 4, 66, 12), (19, 22, 96, 45)]
      >>> c = [j for i in a for j in b if i[:2] == j[:2]]
      >>> c

Output:

 [(12, 14, 44, 12), (19, 22, 96, 45)]