Setting elements not included in np.where to np.nan

Time: 2014-04-23 02:04:32

Tags: python numpy

How can I set every element that is not 7 to np.nan?

import numpy as np
data  = np.array([[0,1,2,3,4,7,6,7,8,9,10], 
        [3,3,3,4,7,7,7,8,11,12,11],  
        [3,3,3,5,7,7,7,9,11,11,11],
        [3,4,3,6,7,7,7,10,11,11,11],
        [4,5,6,7,7,9,10,11,11,11,11]])

result = np.where(data==7) 
data[~result] = np.nan
print data

Traceback (most recent call last):
  File "M:\test.py", line 10, in <module>
    data[~result] = np.nan
TypeError: bad operand type for unary ~: 'tuple'

3 answers:

Answer 0 (score: 3)

Use the three-argument form of np.where:

In [49]: np.where(data==7, data, np.nan)
Out[49]: 
array([[ nan,  nan,  nan,  nan,  nan,   7.,  nan,   7.,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan,   7.,   7.,   7.,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan,   7.,   7.,   7.,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan,   7.,   7.,   7.,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,   7.,   7.,  nan,  nan,  nan,  nan,  nan,  nan]])

Or use np.choose:

In [62]: np.choose(data==7, [np.nan, data])
Out[62]: 
array([[ nan,  nan,  nan,  nan,  nan,   7.,  nan,   7.,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan,   7.,   7.,   7.,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan,   7.,   7.,   7.,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan,   7.,   7.,   7.,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,   7.,   7.,  nan,  nan,  nan,  nan,  nan,  nan]])
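
Both calls above return a new float array and leave data untouched. If an in-place update is wanted instead, a boolean mask can be used directly. The following is only a minimal sketch on a smaller, made-up array; it assumes the data is first converted to float, because an integer array cannot store NaN.

```python
import numpy as np

# Small illustrative array (not the one from the question).
# Integer arrays cannot hold NaN, so convert to float first.
data = np.array([[0, 1, 2, 7],
                 [7, 7, 3, 4]]).astype(float)

# data != 7 is a boolean mask with the same shape as data;
# assigning through it overwrites only the non-7 positions.
data[data != 7] = np.nan
print(data)
```

The mask is computed before the assignment, so the 7s are left exactly as they were.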

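Regarding the requirement to work with the integer indices returned by using_filters in this solution, I think there is a better option. If you remove the np.where call from using_filters: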

def using_filters(data):
    return np.logical_and.reduce(
        [data == f(data, footprint=np.ones((3,3)), mode='constant', cval=np.inf)
         for f in (filters.maximum_filter, filters.minimum_filter)])

then using_filters returns a boolean mask. Using a boolean mask makes this problem much easier:

import numpy as np
import scipy.ndimage as filters  # the scipy.ndimage.filters namespace is deprecated in newer SciPy

def using_filters(data):
    return np.logical_and.reduce(
        [data == f(data, footprint=np.ones((3,3)), mode='constant', cval=np.inf)
         for f in (filters.maximum_filter, filters.minimum_filter)])


data  = np.array([[0,1,2,3,4,7,6,7,8,9,10], 
        [3,3,3,4,7,7,7,8,11,12,11],  
        [3,3,3,5,7,7,7,9,11,11,11],
        [3,4,3,6,7,7,7,10,11,11,11],
        [4,5,6,7,7,9,10,11,11,11,11]], dtype='float')

result = using_filters(data)
data[~result] = np.nan
print(data)
# [[ nan  nan  nan  nan  nan  nan  nan  nan  nan  nan  nan]
#  [ nan  nan  nan  nan  nan  nan  nan  nan  nan  nan  nan]
#  [ nan  nan  nan  nan  nan   7.  nan  nan  nan  nan  nan]
#  [ nan  nan  nan  nan  nan  nan  nan  nan  nan  11.  nan]
#  [ nan  nan  nan  nan  nan  nan  nan  nan  nan  nan  nan]]

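Answer 1 (score: 2)

There must be a better way, but this is the best I can think of right now. Create another array filled entirely with np.nan, then replace the values at the indices in result with the actual values: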

data_nan = np.full(data.shape, np.nan)
data_nan[result] = data[result]
data = data_nan

If you want a list of all the indices that are not in result, you can do the following, although I think the approach above is probably better:

inc = np.rec.fromarrays(result)  # np.rec is the public path; np.core is deprecated
all_ind = np.rec.fromarrays(np.indices(data.shape).reshape(2,-1))
exc = np.setdiff1d(all_ind, inc)
data[exc['f0'], exc['f1']] = np.nan

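This works by converting each pair of indices into one element of a structured array, so that the pairs can be compared as set elements against a similar array holding all indices. We then take the set difference of the two, which leaves the remaining indices.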
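
As a possibly simpler route to the same complement, the index tuple can be turned into a boolean mask, which can then be inverted with ~. This is only a sketch on a small made-up array; the name mask is introduced here for illustration, and the data is assumed to be float already.

```python
import numpy as np

# Small illustrative float array (not the one from the question).
data = np.array([[0, 1, 2, 7],
                 [7, 7, 3, 4]], dtype=float)

# One-argument np.where returns a tuple: (row_indices, col_indices).
result = np.where(data == 7)

# Build a boolean mask that is True exactly at those positions.
mask = np.zeros(data.shape, dtype=bool)
mask[result] = True

# ~ works on the boolean array, unlike on the tuple itself.
data[~mask] = np.nan
print(data)
```

This avoids the structured-array round trip, at the cost of one extra boolean array the size of data.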

Answer 2 (score: 2)

You can take the values you want out of data, fill data with nan, and then copy the values back into data:

import numpy as np
data  = np.array([[0,1,2,3,4,7,6,7,8,9,10], 
        [3,3,3,4,7,7,7,8,11,12,11],  
        [3,3,3,5,7,7,7,9,11,11,11],
        [3,4,3,6,7,7,7,10,11,11,11],
        [4,5,6,7,7,9,10,11,11,11,11]], float)

result = np.where(data==7) 

values = data[result]
data.fill(np.nan)
data[result] = values