Question

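I have a 2D array containing some measurement data. I need to take the mean of each column, counting only the good data. To do that, I have another 2D array of the same shape containing 1s and 0s, which indicates whether the data at (i, j) is good or bad. Some of the "bad" data can also be nan.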

import numpy as np

def mean_exc_mask(x, mas):  # x is the real data array
                            # mas tells if the data at each location is good/bad
    sum_array   = np.zeros(len(x[0]))
    avg_array   = np.zeros(len(x[0]))
    items_array = np.zeros(len(x[0]))

    for i in range(len(x[0])):      # We take a specific column first
        for j in range(len(x)):     # And then parse across rows
            if mas[j][i] == 0:      # If the data is good
                sum_array[i] += x[j][i]
                items_array[i] += 1

        if items_array[i] == 0:     # If none of the data is good for a particular column
            avg_array[i] = np.nan
        else:
            avg_array[i] = sum_array[i] / items_array[i]
    return avg_array

All my values come out as nan!

Any ideas about what is going wrong here, or about another way to do this?

Answer 1

The code seems to work for me, but you can simplify it by using NumPy's built-in aggregations:

(x*(m==0)).sum(axis=0)/(m==0).sum(axis=0)

I tried it with:

x=np.array([[-0.32220561, -0.93043128, 0.37695923],[ 0.08824206, -0.86961453, -0.54558324],[-0.40942331, -0.60216952, 0.17834533]]) and m=np.array([[1, 1, 0],[1, 0, 0],[1, 1, 1]])

It is usually easier to give a qualified answer if you post sample data.
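
One caveat about the one-liner, in case it explains the nan results: if a "bad" entry of x is itself nan, then x*(m==0) still yields nan there (nan times 0 is nan), and the whole column sum becomes nan. Here is a minimal sketch of a nan-safe variant. It assumes, as in the question's code, that 0 in the mask marks good data. The function names are made up for illustration, and also skipping nan values that are flagged as good is an extra assumption, not something the question states.

```python
import numpy as np

def masked_column_mean(x, m):
    """Column means over entries where m == 0; nan if a column has no usable data."""
    x = np.asarray(x, dtype=float)
    # Usable entries: flagged good (0) and not nan (the nan check is an assumption).
    good = (np.asarray(m) == 0) & ~np.isnan(x)
    # Replace unusable entries with 0 *before* summing, so nan never enters the sum.
    total = np.where(good, x, 0.0).sum(axis=0)
    count = good.sum(axis=0)
    # Divide only where count > 0; columns with no usable data keep the nan fill value.
    out = np.full(total.shape, np.nan)
    np.divide(total, count, out=out, where=count > 0)
    return out

def masked_column_mean_ma(x, m):
    """Same result using NumPy masked arrays (mask=True means 'ignore this entry')."""
    x = np.asarray(x, dtype=float)
    bad = (np.asarray(m) != 0) | np.isnan(x)
    return np.ma.masked_array(x, mask=bad).mean(axis=0).filled(np.nan)

# The sample data from above; column 0 has no good entries, so its mean is nan.
x = np.array([[-0.32220561, -0.93043128,  0.37695923],
              [ 0.08824206, -0.86961453, -0.54558324],
              [-0.40942331, -0.60216952,  0.17834533]])
m = np.array([[1, 1, 0],
              [1, 0, 0],
              [1, 1, 1]])
print(masked_column_mean(x, m))
print(masked_column_mean_ma(x, m))
```

Both versions avoid the explicit loops. The np.where form stays close to the one-liner, while the masked-array form may read more naturally if the mask is reused for other statistics.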
