Indexing by a non-index coordinate that shares its dimension with an index coordinate?

Asked: 2017-01-27 12:36:14

Tags: python python-xarray

Suppose I have several coordinates that share the same dimension, as in the example below:

In [46]: ds = xarray.Dataset({"x": (("a", "b"), arange(25).reshape(5,5)+100), "y": ("b", arange(5)-100)}, {"a": arange(5), "b": arange(5)*2, "c": (("a",), list("ABCDE"))})

In [47]: print(ds)
<xarray.Dataset>
Dimensions:  (a: 5, b: 5)
Coordinates:
  * b        (b) int64 0 2 4 6 8
    c        (a) <U1 'A' 'B' 'C' 'D' 'E'
  * a        (a) int64 0 1 2 3 4
Data variables:
    x        (a, b) int64 100 101 102 103 104 105 106 107 108 109 110 111 ...
    y        (b) int64 -100 -99 -98 -97 -96

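The coordinate a is recognised as an index (by name, I guess, since it matches the dimension name), but the coordinate c is not. I can index using coordinate a: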

In [48]: print(ds.loc[dict(a=slice(0, 2))])
<xarray.Dataset>
Dimensions:  (a: 3, b: 5)
Coordinates:
  * b        (b) int64 0 2 4 6 8
    c        (a) <U1 'A' 'B' 'C'
  * a        (a) int64 0 1 2
Data variables:
    x        (a, b) int64 100 101 102 103 104 105 106 107 108 109 110 111 ...
    y        (b) int64 -100 -99 -98 -97 -96

But I cannot index using coordinate c:

In [49]: print(ds.loc[dict(c=slice("A", "C"))])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-49-33483295fec4> in <module>()
----> 1 print(ds.loc[dict(c=slice("A", "C"))])

/dev/shm/gerrit/venv/stable-3.5/lib/python3.5/site-packages/xarray/core/dataset.py in __getitem__(self, key)
    292         if not utils.is_dict_like(key):
    293             raise TypeError('can only lookup dictionaries from Dataset.loc')
--> 294         return self.dataset.sel(**key)
    295 
    296 

/dev/shm/gerrit/venv/stable-3.5/lib/python3.5/site-packages/xarray/core/dataset.py in sel(self, method, tolerance, drop, **indexers)
   1180         """
   1181         pos_indexers, new_indexes = indexing.remap_label_indexers(
-> 1182             self, indexers, method=method, tolerance=tolerance
   1183         )
   1184         result = self.isel(drop=drop, **pos_indexers)

/dev/shm/gerrit/venv/stable-3.5/lib/python3.5/site-packages/xarray/core/indexing.py in remap_label_indexers(data_obj, indexers, method, tolerance)
    273     new_indexes = {}
    274 
--> 275     dim_indexers = get_dim_indexers(data_obj, indexers)
    276     for dim, label in iteritems(dim_indexers):
    277         try:

/dev/shm/gerrit/venv/stable-3.5/lib/python3.5/site-packages/xarray/core/indexing.py in get_dim_indexers(data_obj, indexers)
    241     if invalid:
    242         raise ValueError("dimensions or multi-index levels %r do not exist"
--> 243                          % invalid)
    244 
    245     level_indexers = defaultdict(dict)

ValueError: dimensions or multi-index levels ['c'] do not exist

Of course, I can build a boolean array from c and pass it as the indexer for a:

In [61]: ds.loc[dict(a=((ds.c>='A') & (ds.c<='C')))]
Out[61]: 
<xarray.Dataset>
Dimensions:  (a: 3, b: 5)
Coordinates:
  * b        (b) int64 0 2 4 6 8
    c        (a) <U1 'A' 'B' 'C'
  * a        (a) int64 0 1 2
Data variables:
    x        (a, b) int64 100 101 102 103 104 105 106 107 108 109 110 111 ...
    y        (b) int64 -100 -99 -98 -97 -96

Or use the where method (although this has the side effect of making ds['y'] bigger (?)):

In [57]: ds.where((ds.c>='A') & (ds.c<='C'), drop=True)
Out[57]: 
<xarray.Dataset>
Dimensions:  (a: 3, b: 5)
Coordinates:
  * b        (b) int64 0 2 4 6 8
    c        (a) <U1 'A' 'B' 'C'
  * a        (a) int64 0 1 2
Data variables:
    x        (a, b) float64 100.0 101.0 102.0 103.0 104.0 105.0 106.0 107.0 ...
    y        (b, a) float64 -100.0 -100.0 -100.0 -99.0 -99.0 -99.0 -98.0 ...
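In the where output above, y has picked up the a dimension and both variables have become float64. As far as I understand, that is because where broadcasts every data variable against the condition and fills the masked cells with NaN. A minimal sketch of one way around it, assuming numpy and xarray are imported as np and xr: convert the mask to integer positions and pass them to isel, which only touches variables that actually have dimension a. The names mask and subset are illustrative and not from the original session.

```python
import numpy as np
import xarray as xr

# Rebuild the dataset from the question with explicit imports.
ds = xr.Dataset(
    {
        "x": (("a", "b"), np.arange(25).reshape(5, 5) + 100),
        "y": ("b", np.arange(5) - 100),
    },
    coords={
        "a": np.arange(5),
        "b": np.arange(5) * 2,
        "c": ("a", list("ABCDE")),
    },
)

# Boolean mask along dimension a, derived from the non-index coordinate c.
mask = (ds["c"] >= "A") & (ds["c"] <= "C")

# Turn the mask into integer positions and select positionally.
# Variables without dimension a (here: y) are left untouched.
subset = ds.isel(a=np.flatnonzero(mask.values))

print(subset["c"].values.tolist())
print(subset["y"].dims, subset["x"].dtype.kind)
```

With this route y should keep its single dimension b and x should stay an integer array, because nothing needs to be filled with NaN.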

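But both of those approaches (the boolean mask and where) would work just as well for any data variable. Is there any practical difference between a non-index coordinate and a data variable? Can I make use of c's status as a coordinate for indexing, or do I need to take the same roundabout route as I would for a data variable?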

1 answer:

Answer 0 (score: 0)

You can also set a new index:

In [67]: print(ds.set_index(a='c').loc[dict(a=slice('A', 'C'))])
<xarray.Dataset>
Dimensions:  (a: 3, b: 5)
Coordinates:
  * b        (b) int64 0 2 4 6 8
  * a        (a) object 'A' 'B' 'C'
Data variables:
    x        (a, b) int64 100 101 102 103 104 105 106 107 108 109 110 111 ...
    y        (b) int64 -100 -99 -98 -97 -96

This is perhaps slightly cleaner than the two alternatives in the question.
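For reference, here is a self-contained sketch of the set_index approach with explicit imports, next to a related option, swap_dims, which is not part of the answer above and is shown only as an assumption about what may suit the same use case. set_index(a='c') replaces the values of a with those of c, while swap_dims({'a': 'c'}) renames the dimension to c and keeps a as an ordinary coordinate. The names by_c and swapped are illustrative.

```python
import numpy as np
import xarray as xr

# Same dataset as in the question, built with explicit imports.
ds = xr.Dataset(
    {
        "x": (("a", "b"), np.arange(25).reshape(5, 5) + 100),
        "y": ("b", np.arange(5) - 100),
    },
    coords={
        "a": np.arange(5),
        "b": np.arange(5) * 2,
        "c": ("a", list("ABCDE")),
    },
)

# Option 1 (from the answer): use the values of c as the index of dimension a.
by_c = ds.set_index(a="c").sel(a=slice("A", "C"))

# Option 2 (assumed alternative): make c the dimension coordinate itself,
# so that label-based selection on c works directly.
swapped = ds.swap_dims({"a": "c"}).sel(c=slice("A", "C"))

print(by_c["a"].values.tolist())
print(swapped["x"].dims, swapped["a"].values.tolist())
```

Both variants select by label, so the slice is inclusive of 'C'. Which one fits better probably depends on whether the original integer values of a are still needed afterwards.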