Vectorizing numpy mean with a condition

Time: 2018-04-15 14:44:35

Tags: python numpy

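Similar to "Numpy mean with condition", my question extends to matrix operations: compute the row means of the matrix rdat while skipping certain cells, as if those values had never been there in the first place. In this example I use 0 to mark a cell to skip. For instance, row index 1 of the matrix below has only 2 entries, so the mean of [4,0,0,1] is 5/2 rather than 5/4: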

import numpy as np

rdat = np.array([
    [5.,3.,0.,1.],
    [4.,0.,0.,1.],
    [1.,1.,0.,5.],
    [1.,0.,0.,4.],
    [0.,1.,5.,4.]
    ],dtype=np.float32)

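The goal is to vectorize the computation, that is, no loops are allowed.

The following code computes the row means of rdat one row at a time. It produces the correct result, but it is not yet vectorized: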

u = np.zeros((5,1))
for i in range(5):
    u[i,0] = rdat[i][rdat[i]>0].mean()
print(u)

What I have tried:

I = 5; J = 4
# Try with numpy to develop syntax for user_bias for tf.
mrdat = np.matrix(rdat)
keep = mrdat > 0
print(keep)

keepr,keepc = np.where(keep)
print(keepr)
print(keepc)
#np.mean(rdat[keepr,keepc], 1)

#(keepr,keepc) = np.where(keep)
#np.mean(rdat[keepr,keepc], 1)

#keepidx = zip(np.where(keep))
#np.mean(rdat[keepidx], 1)

#rdat[keepr, keepc]
#rdat[keepr]
#np.mean(rdat[keepr], 1)

#rdat[0,keep].mean()
#rdat[keep[0]].mean()
#rdat[0,keep[0,:]]
print(keep[0])
x0 = np.ravel(keep[0])
print("flatnonzero: {}".format(np.flatnonzero(mrdat)))
print(x0)
#keepr
#rdat[keep[0]]

x = rdat[0]
print("x:{}".format(x))
x[x>0].mean() #OK
rdat[0][rdat[0]>0].mean() #OK output for single row
print(rdat[:][rdat[:]>0].mean()) # wrong: one mean over all non-zero cells, not one per row
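The last attempt returns a single number because indexing a 2-D array with a 2-D boolean mask flattens the selected elements into a 1-D array, so the row structure is already lost when mean() runs. A minimal sketch of that behaviour (the names selected and overall_mean are illustrative and not part of the original post):

```python
import numpy as np

rdat = np.array([
    [5., 3., 0., 1.],
    [4., 0., 0., 1.],
    [1., 1., 0., 5.],
    [1., 0., 0., 4.],
    [0., 1., 5., 4.],
], dtype=np.float32)

# Indexing with a 2-D boolean mask returns a flat 1-D array,
# so the row boundaries are gone after this step.
selected = rdat[rdat > 0]
print(selected.ndim)    # 1
print(selected.shape)   # one entry per non-zero cell in the whole matrix

# The mean is therefore taken over every non-zero cell at once.
overall_mean = selected.mean()
print(overall_mean)
```

This is why a per-row result needs a reduction that keeps the 2-D shape, such as summing along an axis.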

Have fun, and thanks for reading.

2 answers:

Answer 0 (score: 2)

Simply get the count of non-zeros and divide the sum by it:

from __future__ import division

def meanNA(a, NA, axis):
    mask = a!=NA
    return (a*mask).sum(axis=axis)/mask.sum(axis=axis)

For the specific case of a 2D array reduced along the second axis, (a*mask).sum(axis=axis) can be replaced with np.einsum('ij,ij->i',a,mask) for better performance.

Sample run:
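The einsum variant mentioned here could look like the sketch below. The function name mean_na_rows is hypothetical, and casting the mask to the array's dtype is an extra precaution added for this example rather than something the answer states:

```python
import numpy as np

def mean_na_rows(a, na):
    """Row-wise mean of a 2-D array, ignoring cells equal to `na`."""
    mask = a != na
    # Multiply elementwise and sum over the column index j in one call.
    # The mask is cast to a's dtype so both operands share a type.
    row_sums = np.einsum('ij,ij->i', a, mask.astype(a.dtype))
    # Divide each row sum by the number of cells that were kept.
    return row_sums / mask.sum(axis=1)

rdat = np.array([
    [5., 3., 0., 1.],
    [4., 0., 0., 1.],
    [1., 1., 0., 5.],
    [1., 0., 0., 4.],
    [0., 1., 5., 4.],
], dtype=np.float32)

row_means = mean_na_rows(rdat, 0)
print(row_means)
```

This version only handles the row-wise 2D case; the general meanNA above remains the one to use for other axes.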

In [21]: rdat
Out[21]: 
array([[5., 3., 0., 1.],
       [4., 0., 0., 1.],
       [1., 1., 0., 5.],
       [1., 0., 0., 4.],
       [0., 1., 5., 4.]], dtype=float32)

In [22]: meanNA(rdat, NA=0, axis=1) # mean along each row skipping 0s
Out[22]: array([3.        , 2.5       , 2.33333333, 2.5       , 3.33333333])

In [23]: meanNA(rdat, NA=0, axis=0) # mean along each col skipping 0s
Out[23]: array([2.75      , 1.66666667, 5.        , 3.        ])

In [24]: meanNA(rdat, NA=3, axis=1) # mean along each row skipping 3s
Out[24]: array([2.  , 1.25, 1.75, 1.25, 2.5 ])
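As an alternative that the answer does not mention, NumPy's masked arrays can express the same idea: mark the cells to skip as masked and take an ordinary mean along the axis. The sketch below is only one possible formulation, and the name masked_means is illustrative:

```python
import numpy as np

rdat = np.array([
    [5., 3., 0., 1.],
    [4., 0., 0., 1.],
    [1., 1., 0., 5.],
    [1., 0., 0., 4.],
    [0., 1., 5., 4.],
], dtype=np.float32)

# Mask every cell equal to 0 so it is excluded from reductions.
masked = np.ma.masked_equal(rdat, 0)

# mean() on a masked array averages only the unmasked cells of each row.
masked_means = masked.mean(axis=1)

# Convert back to a plain ndarray; rows with no valid cells become NaN.
row_means = masked_means.filled(np.nan)
print(row_means)
```

A row in which every cell is skipped comes back masked rather than as a number, which filled(np.nan) turns into NaN.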

Answer 1 (score: 0)

How about something like this?

{{1}}