Python: filtering numpy values based on certain columns

Time: 2018-07-12 10:58:53

Tags: python numpy search filtering

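I'm trying to create a method for evaluating coordinates, for a project that is due in about a week.

Suppose I'm working in a 3D Cartesian coordinate system whose values are stored as rows in a numpy array. I'm trying to find out whether "z" (n[i,2]) values exist for given, predetermined "x" (n[i,0]) and "y" (n[i,1]) values.

For the case where the given values are scalars, I'm fairly happy with: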

# Given that n is some numpy array
x, y = 2, 3
out = []
for i in range(n.shape[0]):
    if n[i, 0] == x and n[i, 1] == y:
        out.append(n[i, 2])
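
For reference, the scalar case can also be written without an explicit Python loop by using a boolean mask. This is a minimal sketch, assuming `n` is an integer array with one (x, y, z) point per row; the sample values are hypothetical and not from the question.

```python
import numpy as np

# Hypothetical sample data: each row is (x, y, z)
n = np.array([[2, 3, 10],
              [1, 1, 20],
              [2, 3, 30],
              [2, 4, 40]])
x, y = 2, 3

# Boolean mask: True where both the x and y columns match
mask = (n[:, 0] == x) & (n[:, 1] == y)

# Select the z column of the matching rows
out = n[mask, 2]
print(out)  # [10 30]
```

The result is a numpy array rather than a list; `out.tolist()` converts it if a list is needed. With floating-point coordinates, exact `==` comparison may miss values that differ only by rounding.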

However, my trouble starts when I have to check whether the values from another numpy array are present in the original numpy array 'n'.

# Given that n is the numpy array that is to be searched
# Given that x contains the 'search elements'
out = []
for i in range(n.shape[0]):
    for j in range(x.shape[0]):
        if n[i, 0] == x[j, 0] and n[i, 1] == x[j, 1]:
            out.append(n[i, 2])

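The problem with this is that the 'n' matrix in my application will most likely be more than 100,000 rows long.

Is there a more efficient way to do this?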

2 Answers:

Answer 0 (score: 1)

This is probably more efficient than the nested loops:

out = []
for row in x:
    idx = np.equal(n[:,:2], row).all(1)
    out.extend(n[idx,2].tolist())

Note that this assumes x has shape (?, 2). If it has more than two columns, just change row to row[:2] in the loop body.
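
The remaining loop over the rows of x could also be replaced by broadcasting. The following is a minimal sketch, not part of the original answer; the sample arrays are hypothetical, and only the first two columns of each array are compared.

```python
import numpy as np

# Hypothetical sample data: n holds (x, y, z) rows, x holds (x, y) search pairs
n = np.array([[1, 2, 10], [1, 2, 11], [3, 4, 12], [5, 6, 13], [3, 4, 14]])
x = np.array([[1, 2], [3, 4]])

# Compare every row of n with every row of x; shape (len(n), len(x), 2)
pairwise = n[:, None, :2] == x[None, :, :2]

# A row of n matches if both of its columns equal those of some row of x
mask = pairwise.all(axis=2).any(axis=1)

out = n[mask, 2]
print(out)  # [10 11 12 14]
```

Unlike the loop, this returns the z values in the row order of n and includes each matching row once. The intermediate boolean array has len(n) * len(x) * 2 elements, so with 100,000 rows in n and many search pairs the loop over x may be the more memory-friendly choice.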

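Answer 1 (score: 0)

A Numpythonic solution without loops.

This solution is intended for cases where the x and y coordinates are non-negative. Note that the dot-product test it relies on is only a necessary condition for two (x, y) pairs to be equal, so distinct pairs can still be reported as matches.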

import numpy as np
# Using a for x and b for n, to avoid confusion with x,y coordinates and array names
a = np.array([[1,2],[3,4]])
b = np.array([[1,2,10],[1,2,11],[3,4,12],[5,6,13],[3,4,14]])

# Adjust the shapes by taking the z coordinate as 0 in a and take the dot product with b transposed
a = np.insert(a,2,0,axis=1)
dot_product = np.dot(a,b.T)

# Reshape a**2 to check the dot product values corresponding to exact values in the x, y coordinates
sum_reshaped = np.sum(a**2,axis=1).reshape(a.shape[0],1)

# Match values for individual elements in a. Can be used if you want the z coordinates for each x, y pair separately
individual_indices = (dot_product == np.tile(sum_reshaped, b.shape[0]))

# Take the OR over the column values and take z if at least one x, y pair is present
indices = np.any(individual_indices, axis=0)
print(b[:, 2][indices])  # prints [10 11 12 14]
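
As a cross-check on the dot-product approach, the sketch below uses hypothetical data in which a row of b passes the test even though its (x, y) pair is not in a. It then shows one exact alternative based on a Python set of tuples. The alternative is an assumption of this sketch, not part of the original answer, and it also works for negative coordinates.

```python
import numpy as np

a = np.array([[1, 2]])                   # query (x, y) pairs
b = np.array([[1, 2, 10], [3, 1, 99]])   # the pair (3, 1) is NOT in a

# Dot-product test: 1*3 + 2*1 == 1**2 + 2**2 == 5, so (3, 1) also passes
a0 = np.insert(a, 2, 0, axis=1)
dot_mask = (np.dot(a0, b.T) == np.sum(a0**2, axis=1).reshape(-1, 1)).any(axis=0)
print(b[dot_mask, 2])    # [10 99]

# Exact alternative: hash the query pairs, then test each row of b
wanted = set(map(tuple, a.tolist()))
exact_mask = np.array([(bx, by) in wanted for bx, by in b[:, :2].tolist()],
                      dtype=bool)
print(b[exact_mask, 2])  # [10]
```

The set lookup still iterates over the rows of b in Python, but it does a constant-time membership test per row instead of comparing against every query pair.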