In Python, I have a list of ND arrays and I want to count the duplicate arrays in order to compute an average for each duplicated array

Time: 2018-02-16 07:07:05

Tags: python numpy multidimensional-array counter average

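I have a list of ND arrays (vectors), each with shape (1, 300). My goal is to find the duplicated vectors in the list, sum them, divide the sum by the size of the list, and replace each duplicated vector with the resulting vector.
For example, if a is the list of ND arrays a = [[2,3,1],[5,65,-1],[2,3,1]], then the first and last elements are duplicates. Their sum is [4,6,2], which is then divided by the size of the list, size = 3.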

Output: a = [[4/3,6/3,2/3],[5,65,-1],[4/3,6/3,2/3]]

I tried using Counter, but it doesn't work on ndarrays.

What is the NumPy way to do this? Thanks.

2 answers:

Answer 0 (score: 1)

If you have numpy 1.13 or higher, this is pretty simple:

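import numpy as np

def f(a):
    a = np.asarray(a)  # expects 2-D input, one vector per row
    u, inv, c = np.unique(a, return_counts=True, return_inverse=True, axis=0)
    # duplicated rows are scaled by count / n; unique rows keep a factor of 1
    p = np.where(c > 1, c / a.shape[0], 1)[:, None]
    return (u * p)[inv]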

If you don't have 1.13, you'll need some trick to convert a into a 1-d array first. I recommend @Jaime's excellent answer using np.void here

How it works:

  • u is the unique rows of a (usually not in their original order)
  • c is the number of times each row of u is repeated in a
  • inv holds the indices that map u back to a, i.e. u[inv] == a
  • p is the multiplier for each row of u based on your requirements: 1 if c == 1, and c / n (where n is the number of rows in a) if c > 1. [:, None] turns it into a column vector so that it broadcasts against the rows of u

The function returns u * p, indexed back to the original row positions by [inv].
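To check the steps above against the example from the question, here is a minimal sketch. The `np.ravel(inv)` line and the `expected` array are additions for this illustration and are not part of the answer's function; the flatten is only a defensive step so that the inverse index is 1-D before it is used for row indexing.

```python
import numpy as np

def f(a):
    a = np.asarray(a)
    # unique rows, position of each original row within u, and per-row counts
    u, inv, c = np.unique(a, return_counts=True, return_inverse=True, axis=0)
    inv = np.ravel(inv)  # assumption: defensive flatten, not in the original answer
    # count / n for duplicated rows, 1 for rows that occur once
    p = np.where(c > 1, c / a.shape[0], 1)[:, None]
    return (u * p)[inv]

a = np.array([[2, 3, 1], [5, 65, -1], [2, 3, 1]])
result = f(a)

# The output requested in the question
expected = np.array([[4/3, 6/3, 2/3], [5, 65, -1], [4/3, 6/3, 2/3]])
print(result)
print(np.allclose(result, expected))
```

Multiplying a unique row by count / n gives the same value as summing its copies and dividing by n, which is why no explicit sum appears in the function.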

Answer 1 (score: 0)

You can use NumPy's unique with return_counts:

 elements, count = np.unique(a, axis=0, return_counts=True)

return_counts additionally returns the number of times each unique element occurs in the array.

The output looks like this:

(array([[ 2,  3,  1],
        [ 5, 65, -1]]), array([2, 1]))

Then you can multiply them like this:

(count * elements.T).T

Output:

array([[ 4,  6,  2],
       [ 5, 65, -1]])
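The product above gives the per-row sums for the unique rows only. A possible way to finish from there, sketched under the assumption that the goal is the output requested in the question: also ask `np.unique` for `return_inverse`, divide the duplicated rows by the list size, and index back to the original order. The names `sums`, `scaled`, and `inverse` are illustrative and do not come from the answer.

```python
import numpy as np

a = np.array([[2, 3, 1], [5, 65, -1], [2, 3, 1]])

elements, inverse, count = np.unique(
    a, axis=0, return_inverse=True, return_counts=True
)
inverse = np.ravel(inverse)  # keep the inverse index 1-D for row indexing

# Same values as (count * elements.T).T, written with broadcasting
sums = elements * count[:, None]

# Divide only the rows that occur more than once by the list size
scaled = np.where(count[:, None] > 1, sums / len(a), sums)

# Put one scaled row back at every original position
result = scaled[inverse]
print(result)
```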