Question

Using a DataArray object in xarray, what is the best way to find all cells that have a value != 0?

For example, in pandas I would do

df.loc[df.col1 > 0]

My specific example is trying to look at 3-dimensional brain imaging data.

first_image_xarray.shape
(140, 140, 96)
dims = ['x','y','z']

Looking at the documentation for xarray.DataArray.where, it seems I want something like this:

first_image_xarray.where(first_image_xarray.y + first_image_xarray.x  > 0,drop = True)[:,0,0]

But I still get arrays with zeros.

<xarray.DataArray (x: 140)>
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0., -0.,  0., -0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
Dimensions without coordinates: x

- A side question: why are there some negative zeros? Are these values rounded, so that -0. is actually equal to something like -0.009876?

Answer 1

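(Answer to the main question)

You are almost there. However, a slight difference in syntax makes a big difference here. On the one hand, here is the solution that filters values > 0 using a "value-based" mask.

# if you want to DROP values that do not satisfy the mask condition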
first_image_xarray[:,0,0].where(first_image_xarray[:,0,0] > 0, drop=True)

or

# if you want to KEEP values that do not satisfy the mask condition, as nan
first_image_xarray[:,0,0].where(first_image_xarray[:,0,0] > 0, np.nan)

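On the other hand, your attempt did not work as you hoped because first_image_xarray.x refers to the index of the elements in the array (in the x direction), not to the values of the elements. As a result, only the first element of your output should be nan instead of 0, because it is the only one that fails the mask condition in the slice [:,0,0]. Yes, you were creating an "index-based" mask.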
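To see what such an index-based mask contains, here is a minimal sketch on a tiny array. The name `tiny` and its 2x3 shape are assumptions chosen for illustration; they are not from the original post.

```python
import numpy as np
import xarray as xr

# Hypothetical 2x3 array of zeros; the dims carry no explicit coordinates,
# just like the array in the question.
tiny = xr.DataArray(np.zeros((2, 3)), dims=("x", "y"))

# tiny.x and tiny.y hold positions along each dimension, not data values.
print(tiny.x.values)
print(tiny.y.values)

# The condition is evaluated on positions only, so it is False solely
# at position (x=0, y=0), whatever the data contains.
index_mask = tiny.x + tiny.y > 0
print(index_mask.values)

# Only that single cell is turned into nan; every other zero survives.
print(tiny.where(index_mask).values)

# A value-based condition looks at the data itself instead.
value_mask = tiny > 0
print(value_mask.values)
```

Because every value in `tiny` is 0, the value-based mask is False everywhere, while the index-based mask is False in exactly one cell.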

The small experiment below (hopefully) illustrates this key difference.

Suppose we have a DataArray that consists only of 0 and 1 (its dimensions, (140,140,96), match those in the original post (OP) of the question). First, let us mask it based on index, as the OP did:

import numpy as np
import xarray as xr

np.random.seed(0)

# create a DataArray which randomly contains 0 or 1 values
a = xr.DataArray(np.random.randint(0, 2, 140*140*96).reshape((140, 140, 96)), dims=('x', 'y', 'z'))

# with this "index-based" mask, only elements where index of both x and y are 0 are replaced by nan
a.where(a.x + a.y > 0, drop=True)[:,0,0]

Out:
<xarray.DataArray (x: 140)>
array([ nan, 0., 1., 1., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 0., 0., 1., 0., 0., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 0., 0., 0., 1., 1., 1., 0., 0., 1., 0., 0., 1., 0., 1., 1., 0., 0., 1., 0., 0., 1., 1., 1., 0., 0., 0., 1., 1., 0., 1., 0., 1., 1., 0., 0., 0., 0., 1., 1., 0., 1., 1., 1., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 1., 0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 1., 0., 1., 0., 0., 1., 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 1., 0., 0., 1., 0., 0., 1.])
Dimensions without coordinates: x

With the mask above, only the element where the index of both x and y is 0 becomes nan; the rest of the elements are neither changed nor dropped.

In contrast, the proposed solution masks the DataArray based on the values of its elements.

# with this "value-based" mask, all the values which do not satisfy the mask condition are dropped
a[:,0,0].where(a[:,0,0] > 0, drop=True)

Out:
<xarray.DataArray (x: 65)>
array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
Dimensions without coordinates: x

This operation successfully dropped all the values that do not satisfy the mask condition, based on the values of the DataArray's elements.
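The same value-based idea can be applied to a whole 3-D array rather than a single slice. The sketch below is only an illustration: the name `vol`, its small (4, 4, 3) shape, and the use of np.argwhere to list positions are assumptions, not part of the original answer. Note that on a full array, drop=True can only remove a label along a dimension when every value at that label is masked, so a nan-filled mask or an explicit list of positions is usually more practical.

```python
import numpy as np
import xarray as xr

# Hypothetical small volume of 0/1 values standing in for the imaging data.
rng = np.random.default_rng(0)
vol = xr.DataArray(
    rng.integers(0, 2, size=(4, 4, 3)).astype(float),
    dims=("x", "y", "z"),
)

# Keep the original shape; cells that fail the condition become nan.
masked = vol.where(vol != 0)

# Integer (x, y, z) positions of every nonzero cell, one row per cell.
nonzero_idx = np.argwhere(vol.values != 0)

# Flat 1-D array holding only the nonzero values.
nonzero_vals = vol.values[vol.values != 0]

print(masked.shape)
print(nonzero_idx[:5])
print(nonzero_vals.size)
```

`masked` is convenient when the spatial layout matters (for example, for plotting), while `nonzero_idx` answers the "which cells" question directly.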

(Answer to the side question)

As for the origin of -0 and 0 in the DataArray, rounding values toward 0 from the negative or positive side is one possibility. A related discussion is here: How to eliminate the extra minus sign when rounding negative numbers towards zero in numpy? Below is a small example of this case.

import numpy as np
import xarray as xr

xr_array = xr.DataArray([-0.1, 0.1])

# you can use either xr.DataArray.round() or np.round() for rounding values of DataArray
xr.DataArray.round(xr_array)

Out:
<xarray.DataArray (dim_0: 2)>
array([-0., 0.])
Dimensions without coordinates: dim_0

np.round(xr_array)

Out:
<xarray.DataArray (dim_0: 2)>
array([-0., 0.])
Dimensions without coordinates: dim_0

As a side note, another way to get -0 in a NumPy array is numpy.set_printoptions(precision=0), which hides everything after the decimal point, as shown below (but I know this is not the case this time, because a DataArray is used):

import numpy as np

# default value is precision=8 in ver1.15
np.set_printoptions(precision=0)

np.array([-0.1, 0.1])

Out:
array([-0., 0.])

Anyway, my best guess is that the conversion to -0 was done manually and intentionally, rather than automatically, in the data preparation and preprocessing phase.
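If the negative zeros are a nuisance, they can be detected and normalized. The following is a minimal sketch, assuming the -0 values come from rounding as described above; the sample values are made up for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical sample: two values that round to zero, plus explicit -0.0 and 0.0.
rounded = xr.DataArray([-0.1, 0.1, -0.0, 0.0]).round()

# -0.0 compares equal to 0.0, so a plain value test cannot tell them apart.
all_equal_zero = bool((rounded == 0).all())

# np.signbit inspects the sign bit, which does distinguish -0.0 from 0.0.
neg_zero = np.signbit(rounded.values) & (rounded.values == 0)

# Under the default IEEE 754 rounding mode, -0.0 + 0.0 gives +0.0,
# so adding 0.0 clears the sign of negative zeros and leaves other values alone.
cleaned = rounded + 0.0

print(all_equal_zero)
print(neg_zero)
print(np.signbit(cleaned.values))
```

Since -0.0 == 0.0 evaluates to True, a mask such as `> 0` or `!= 0` treats both kinds of zero identically, so the negative zeros do not affect the filtering itself.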

Hope this helps.
