Accessing an element's 2D index while applying a function element-wise in pandas / numpy?

Time: 2018-10-30 19:28:23

Tags: python pandas numpy

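I am trying to iterate over an array in numpy and apply a function to each element, where the function performs some calculation on the element's index. So my code looks something like this: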

import numpy as np

# f takes in a matrix element and returns some calculation based on the
# element's 2D index i,j
def f(elt, i, j):
    return elt*i + elt*j

# create a 2x3 matrix, A
A = np.array([[1,2,3],
              [4,5,6]])


# Transform A by applying the function `f` over every element.
# (`applyFunction` is a placeholder for the call I am looking for.)
A_1 = applyFunction(f, A)


print(A_1)
# A_1 should now be a matrix that is transformed:
# [[0  2  6]
#  [4 10 18]]

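This is easy to do with a for loop, but my matrix is so large that looping is too inefficient in this case. I am trying to use numpy's built-in methods, such as apply or apply_along_axis.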

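I have also considered converting the matrix to a pandas DataFrame and then perhaps using the column and row names as indices, but I don't know how to access them inside the apply_along_axis function call.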

Any help would be greatly appreciated. Thanks!

1 answer:

Answer 0: (score: 2)

def f(elt, i, j):
    return (i, j)

A = [[1,2,3],
     [4,5,6]]

In [306]: [[f(None,i,j) for j in range(len(A[0]))] for i in range(len(A))] 
Out[306]: [[(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)]]
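
The same nested-comprehension pattern can be applied to the function from the question. The following is only a sketch: it uses enumerate to get each element together with its indices, and it is still a pure Python loop, so it will not be faster than a for loop on a large matrix.

```python
# f and A are copied from the question
def f(elt, i, j):
    return elt * i + elt * j

A = [[1, 2, 3],
     [4, 5, 6]]

# enumerate yields (index, value) pairs, so i and j come for free
A_1 = [[f(elt, i, j) for j, elt in enumerate(row)]
       for i, row in enumerate(A)]

print(A_1)
# [[0, 2, 6], [4, 10, 18]]
```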

An array solution, which will probably run at about the same speed:

In [309]: np.frompyfunc(f,3,1)(None, [[0],[1]],[0,1,2])
Out[309]: 
array([[(0, 0), (0, 1), (0, 2)],
       [(1, 0), (1, 1), (1, 2)]], dtype=object)
In [310]: _.shape
Out[310]: (2, 3)
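
As a sketch of how this could look with the function and matrix from the question, the element array and the two index arrays can be passed together and left to broadcast against each other. The names rows, cols and uf are chosen here for illustration. np.frompyfunc still calls f once per element in Python and returns an object-dtype array, so the result is cast back at the end.

```python
import numpy as np

# f and A are copied from the question
def f(elt, i, j):
    return elt * i + elt * j

A = np.array([[1, 2, 3],
              [4, 5, 6]])

rows = np.arange(A.shape[0])[:, None]  # shape (2, 1), row index of each element
cols = np.arange(A.shape[1])           # shape (3,), column index of each element

uf = np.frompyfunc(f, 3, 1)            # 3 inputs, 1 output
A_1 = uf(A, rows, cols).astype(A.dtype)  # object array cast back to integers

print(A_1.tolist())
# [[0, 2, 6], [4, 10, 18]]
```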

The fastest numpy approach, though it does not use your f function:

In [312]: I,J = np.meshgrid(range(2),range(3),indexing='ij')
In [313]: I
Out[313]: 
array([[0, 0, 0],
       [1, 1, 1]])
In [314]: J
Out[314]: 
array([[0, 1, 2],
       [0, 1, 2]])
In [315]: np.stack((I,J), axis=2)
Out[315]: 
array([[[0, 0],
        [0, 1],
        [0, 2]],

       [[1, 0],
        [1, 1],
        [1, 2]]])
In [316]: _.shape
Out[316]: (2, 3, 2)
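
For the specific f in the question, which uses only arithmetic, the index grids can be combined with A directly. This is a minimal sketch that assumes f works on whole arrays; a function with branches or other scalar-only logic would need to be rewritten first. np.indices is used here as a shorthand for the meshgrid call above.

```python
import numpy as np

# f and A are copied from the question
def f(elt, i, j):
    return elt * i + elt * j

A = np.array([[1, 2, 3],
              [4, 5, 6]])

# I[i, j] == i and J[i, j] == j, both with the same shape as A
I, J = np.indices(A.shape)

# elt*i + elt*j is evaluated on whole arrays, with no Python-level loop
A_1 = f(A, I, J)

print(A_1.tolist())
# [[0, 2, 6], [4, 10, 18]]
```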