Stacking an xarray.DataArray with a MultiIndex twice returns an uninformative error

Asked: 2019-08-17 08:10:40

Tags: python pandas scikit-learn python-xarray

I am trying to use the xr.DataArray.stack() method twice on the same array, stacking a MultiIndex into a new MultiIndex:

import numpy as np
import xarray as xr

da = xr.DataArray(np.random.rand(4,4,4,4), 
     [('latitude', range(4)), ('longitude', range(4)), 
      ('variables', range(4)), ('time', range(4))])

da.stack(dim1=('latitude','longitude')).stack(dim2=('variables','time')) # this works fine
da.stack(dim1=('latitude','longitude')).stack(dim2=('dim1','time'))

The last line produces the following error:

  File "/usr/local/Miniconda3-envs/envs/2018/envs/iacpy3_2018/lib/python3.6/site-packages/xarray/core/dataarray.py", line 1102, in stack
    ds = self._to_temp_dataset().stack(**dimensions)
  File "/usr/local/Miniconda3-envs/envs/2018/envs/iacpy3_2018/lib/python3.6/site-packages/xarray/core/dataset.py", line 2101, in stack
    result = result._stack_once(dims, new_dim)
  File "/usr/local/Miniconda3-envs/envs/2018/envs/iacpy3_2018/lib/python3.6/site-packages/xarray/core/dataset.py", line 2070, in _stack_once
    idx = utils.multiindex_from_product_levels(levels, names=dims)
  File "/usr/local/Miniconda3-envs/envs/2018/envs/iacpy3_2018/lib/python3.6/site-packages/xarray/core/utils.py", line 80, in multiindex_from_product_levels
    return pd.MultiIndex(levels, labels, sortorder=0, names=names)
  File "/usr/local/Miniconda3-envs/envs/2018/envs/iacpy3_2018/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 132, in __new__
    result._set_names(names)
  File "/usr/local/Miniconda3-envs/envs/2018/envs/iacpy3_2018/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 570, in _set_names
    self.levels[l].rename(name, inplace=True)
  File "/usr/local/Miniconda3-envs/envs/2018/envs/iacpy3_2018/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 1167, in set_names
    raise TypeError("Must pass list-like as `names`.")
TypeError: Must pass list-like as `names`.

Motivation:

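My original array has the same coordinates as the toy dataset above, but different dimension sizes. To use sklearn.ensemble.RandomForestRegressor, however, I need to reshape the data to datapoints (the flattened latitude, longitude and time) by variables, without losing any values. So I want to stack latitude and longitude first, drop all ocean points because they are filled with np.nan, and then stack time together with the newly created MultiIndex.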

Question:

To me, this error looks as if the feature is simply not implemented.

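  1. If so, why not?
  2. If so, why is there no more helpful error message?
  3. Is there a workaround for my problem within xarray (i.e. without converting my data to numpy arrays or similar)?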
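Regarding point 3, one possible workaround is sketched below. It avoids stacking a MultiIndex a second time by stacking latitude, longitude and time in a single call and only then dropping the NaN samples. This is a minimal sketch, not a confirmed answer: the ocean mask (latitude 0 set to NaN) and the names `datapoints` and `X` are assumptions made purely for illustration, and it assumes ocean points are NaN for every variable and time step.

```python
import numpy as np
import xarray as xr

# Same toy array as in the question.
da = xr.DataArray(
    np.random.rand(4, 4, 4, 4),
    [("latitude", range(4)), ("longitude", range(4)),
     ("variables", range(4)), ("time", range(4))],
)

# Hypothetical ocean mask (assumption): treat every cell at latitude 0 as ocean.
da = da.where(da.latitude != 0)

# Stack all three dimensions at once instead of re-stacking "dim1".
stacked = da.stack(datapoints=("latitude", "longitude", "time"))

# Samples first, features second; drop every sample that contains a NaN.
X = stacked.transpose("datapoints", "variables").dropna("datapoints", how="any")

print(X.dims, X.shape)
```

The result stays an xarray object with a (latitude, longitude, time) MultiIndex on `datapoints`, so the remaining samples can still be mapped back to their grid cells; `X.values` gives the 2-D array that scikit-learn estimators expect.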

0 answers:

No answers