Is there a way to exclude a specific value when applying sum and mean in numpy?
I would like -999 values to be left out when the results are computed.
In [14]: c = np.matrix([[4., 2.],[4., 1.]])
In [15]: d = np.matrix([[3., 2.],[4., -999.]])
In [16]: np.sum([c, d], axis=0)
Out[16]:
array([[   7.,    4.],
       [   8., -998.]])
In [17]: np.mean([c, d], axis=0)
Out[17]:
array([[   3.5,    2. ],
       [   4. , -499. ]])
Answer 0 (score: 8)
Use masked arrays:
>>> c = np.ma.array([[4., 2.], [4., 1.]])
>>> d = np.ma.masked_values([[3., 2.], [4., -999]], -999)
>>> np.ma.array([c, d]).sum(axis=0)
masked_array(data =
 [[7.0 4.0]
 [8.0 1.0]],
             mask =
 [[False False]
 [False False]],
       fill_value = 1e+20)
>>> np.ma.array([c, d]).mean(axis=0)
masked_array(data =
 [[3.5 2.0]
 [4.0 1.0]],
             mask =
 [[False False]
 [False False]],
       fill_value = 1e+20)
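If the inputs start out as plain arrays, the masking can also be applied once, after stacking them. A minimal sketch, assuming `c` and `d` are ordinary float arrays of the same shape and that -999 is the only sentinel value; `stacked`, `total`, and `average` are names chosen here for illustration:

```python
import numpy as np

c = np.array([[4., 2.], [4., 1.]])
d = np.array([[3., 2.], [4., -999.]])

# Stack into shape (2, 2, 2) and mask every entry equal to the sentinel.
stacked = np.ma.masked_values(np.stack([c, d]), -999.)

total = stacked.sum(axis=0)     # masked entries are skipped
average = stacked.mean(axis=0)  # divides by the number of unmasked entries

# Convert back to plain ndarrays. A position that is masked in every
# input would be filled with NaN here.
print(total.filled(np.nan))
print(average.filled(np.nan))
```

Calling `filled(np.nan)` at the end is optional; it is only there for code that expects a regular ndarray rather than a masked array.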
Answer 1 (score: 6)
One option is to replace the specific value with np.nan and then use numpy.nansum and numpy.nanmean, as @s.k suggested:
import numpy as np
def nan_if(arr, value):
    # Replace every entry equal to `value` with NaN; keep the rest unchanged.
    return np.where(arr == value, np.nan, arr)
np.nansum([nan_if(c, -999), nan_if(d, -999)], axis=0)
#array([[ 7.,  4.],
#       [ 8.,  1.]])
np.nanmean([nan_if(c, -999), nan_if(d, -999)], axis=0)
#array([[ 3.5,  2. ],
#       [ 4. ,  1. ]])
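One thing to watch for with this approach is a position that holds -999 in every input. In recent NumPy versions np.nansum gives 0 there, while np.nanmean gives NaN and emits a RuntimeWarning. A small sketch of that case, where `a` and `b` are made-up inputs and `nan_if` is repeated so the snippet runs on its own:

```python
import warnings
import numpy as np

def nan_if(arr, value):
    # Replace every entry equal to `value` with NaN; keep the rest unchanged.
    return np.where(arr == value, np.nan, arr)

# Hypothetical inputs: the second element is the sentinel in both arrays.
a = np.array([1., -999.])
b = np.array([3., -999.])
stacked = [nan_if(a, -999), nan_if(b, -999)]

total = np.nansum(stacked, axis=0)

# Silence the warning np.nanmean raises for the all-NaN column.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    average = np.nanmean(stacked, axis=0)

print(total)
print(average)
```

If a sum of 0 would be misleading for such positions, the masked array approach may be a better fit, since it keeps them masked instead.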