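DataArray.mean does not preserve coordinates

Time: 2019-11-09 11:25:23

Tags: python netcdf python-xarray

DataArray.mean drops coordinates that depend on the dimension over which the mean is taken.

Note: XLAT and XLONG do not actually vary with time; however, some netCDF files store both of them with a Time dimension.

I have a netCDF file, wrfout_d03.nc, which I open with: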

ds = xr.open_dataset('/Users/jacob/Desktop/wrfpy/wrfout_d03_may.nc')

This gives a Dataset object:

<xarray.Dataset>
Dimensions:                (Time: 193, bio_emissions_dimension_stag: 41, bottom_top: 50, bottom_top_stag: 51, klevs_for_dvel: 1, seed_dim_stag: 12, soil_layers_stag: 4, south_north: 115, south_north_stag: 116, west_east: 115, west_east_stag: 116)
Coordinates:
    XLAT                   (Time, south_north, west_east) float32 ...
    XLONG                  (Time, south_north, west_east) float32 ...
    XTIME                  (Time) datetime64[ns] ...
    XLAT_U                 (Time, south_north, west_east_stag) float32 ...
    XLONG_U                (Time, south_north, west_east_stag) float32 ...
    XLAT_V                 (Time, south_north_stag, west_east) float32 ...
    XLONG_V                (Time, south_north_stag, west_east) float32 ...
Dimensions without coordinates: Time, bio_emissions_dimension_stag, bottom_top, bottom_top_stag, klevs_for_dvel, seed_dim_stag, soil_layers_stag, south_north, south_north_stag, west_east, west_east_stag
Data variables:
datavars...

I then access the PM2_5_DRY variable with:

pm25 = ds.PM2_5_DRY

The resulting object pm25 has the following dimensions and coordinates:

<xarray.DataArray 'PM2_5_DRY' (Time: 193, bottom_top: 50, south_north: 115, west_east: 115)>
[127621250 values with dtype=float32]
Coordinates:
    XLAT     (Time, south_north, west_east) float32 ...
    XLONG    (Time, south_north, west_east) float32 ...
    XTIME    (Time) datetime64[ns] ...
Dimensions without coordinates: Time, bottom_top, south_north, west_east
Attributes:
    FieldType:    104
    MemoryOrder:  XYZ
    description:  pm2.5 aerosol dry mass
    units:        ug m^-3
    stagger: 

I then take the mean of pm25 over the Time dimension:

pm25_mean = pm25.mean(dim='Time', keep_attrs = True)

The result is a DataArray, but it no longer has the XLAT or XLONG coordinates.

<xarray.DataArray 'PM2_5_DRY' (bottom_top: 50, south_north: 115, west_east: 115)>
array([[[14.73083   , 14.756626  , 14.796355  , ..., 20.325712  ,
         20.855696  , 21.381271  ],
        [14.651459  , 14.34477   , 14.371858  , ..., 18.00389   ,
         18.4109    , 21.337002  ],
        [14.59026   , 14.257076  , 14.293012  , ..., 17.391146  ,
         18.217058  , 20.882664  ],
        ...,
        [27.356459  , 27.21468   , 27.757051  , ...,  8.084272  ,
          8.010168  ,  7.989942  ],
        [27.185486  , 27.02623   , 27.776043  , ...,  7.944748  ,
          7.8795266 ,  7.8552976 ],
        [26.926008  , 27.724253  , 28.427626  , ...,  7.8269224 ,
          7.773637  ,  7.741844  ]],

Dimensions without coordinates: bottom_top, south_north, west_east
Attributes:
    FieldType:    104
    MemoryOrder:  XYZ
    description:  pm2.5 aerosol dry mass
    units:        ug m^-3
    stagger:

Checking its coordinates with the following code gives:

pm25_mean.coords

Coordinates:
*empty*

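I looked through the xarray documentation for mean, but I could not find any option for copying the coordinates from the original object to the new one.

Any hints on how to go about this? I suppose I need to read these coordinates from the file and attach them again, but I am not sure how to do that.

Also, is this related to the following?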

XLAT     (Time, south_north, west_east) float32

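XLAT is a multidimensional coordinate that also depends on Time. After taking the mean over the Time dimension, pm25 went from 4 dimensions to 3.

I need the final object to have the XLAT and XLONG coordinates because I am going to visualize it.

Thanks for your help!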

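2 answers:

Answer 0: (score: 0)

There does not seem to be a direct solution in xarray (as far as I know), but I found one using NCO.

It comes from another post here. These are the steps I took to solve the problem.

First, the reason XLAT and XLONG are missing from the final DataArray is that these coordinates depend on Time.

Following this thread: Setting a coordinate constant in time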
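The behaviour described above can be reproduced on a small synthetic array. This is only a sketch: the array sizes and the extra coordinate name XLAT_2D are made up for illustration and do not come from the WRF file. It suggests that a reduction over Time drops any coordinate that has Time among its dimensions, keeps coordinates that do not, and that keep_attrs only affects attributes.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the WRF field; sizes and the XLAT_2D name are made up.
nt, ny, nx = 4, 3, 2
lat2d = np.linspace(14.0, 15.0, ny * nx).reshape(ny, nx)
lat3d = np.broadcast_to(lat2d, (nt, ny, nx)).copy()  # same grid repeated at every time step

pm25 = xr.DataArray(
    np.random.rand(nt, ny, nx).astype("float32"),
    dims=("Time", "south_north", "west_east"),
    coords={
        # Coordinate that carries the Time dimension, as in the question.
        "XLAT": (("Time", "south_north", "west_east"), lat3d),
        # Hypothetical time-independent copy of the same grid.
        "XLAT_2D": (("south_north", "west_east"), lat2d),
    },
    attrs={"units": "ug m^-3"},
    name="PM2_5_DRY",
)

pm25_mean = pm25.mean(dim="Time", keep_attrs=True)

kept_coords = sorted(pm25_mean.coords)  # coordinates that survived the reduction
kept_units = pm25_mean.attrs["units"]   # attributes are kept because of keep_attrs=True
print(kept_coords, kept_units)
```

In this sketch only the coordinate without a Time dimension remains on the result, which is why the fixes below all aim at making XLAT and XLONG two-dimensional before averaging.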

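  1. Take the NetCDF file and extract the variable of interest (which varies in time).
  2. Average XLAT and XLONG over time and write them to a separate nc file.
  3. Append the two files together to get the final, time-independent XLAT and XLONG.

ncks -v variable input.nc variable.nc
ncwa -a Time -v XLAT,XLONG input.nc latlon.nc
ncks -A latlon.nc variable.nc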
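For comparison, the same three steps can be sketched in xarray alone. This is not the NCO workflow itself but an in-memory analogue under stated assumptions: the helper name time_mean_with_latlon is hypothetical, and the small dataset below is synthetic rather than read from the WRF file.

```python
import numpy as np
import xarray as xr

def time_mean_with_latlon(ds, name, time_dim="Time", coord_names=("XLAT", "XLONG")):
    """Hypothetical helper: average `name` over time and re-attach time-averaged lat/lon."""
    # Step 1: isolate the variable of interest.
    var = ds[name]
    # Step 2: average each coordinate over time so it no longer depends on time_dim.
    static = {}
    for c in coord_names:
        averaged = ds[c].mean(dim=time_dim)
        static[c] = (averaged.dims, averaged.values)
    # Step 3: average the variable and attach the time-independent coordinates.
    return var.mean(dim=time_dim, keep_attrs=True).assign_coords(static)

# Synthetic dataset with the same layout as in the question (sizes are made up).
nt, ny, nx = 3, 2, 2
dims = ("Time", "south_north", "west_east")
lat = np.broadcast_to(np.array([[14.0, 14.0], [15.0, 15.0]]), (nt, ny, nx)).copy()
lon = np.broadcast_to(np.array([[120.0, 121.0], [120.0, 121.0]]), (nt, ny, nx)).copy()
demo = xr.Dataset(
    {"PM2_5_DRY": (dims, np.ones((nt, ny, nx), dtype="float32"))},
    coords={"XLAT": (dims, lat), "XLONG": (dims, lon)},
)

result = time_mean_with_latlon(demo, "PM2_5_DRY")
print(result.coords)
```

The coordinates are passed to assign_coords as (dims, values) tuples so that nothing from the original three-dimensional XLAT and XLONG is carried along.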

With this, opening the file in xarray and computing the mean gives the following DataArray:

<xarray.DataArray 'PM2_5_DRY' (bottom_top: 50, south_north: 115, west_east: 115)>
array([[[14.73083   , 14.756626  , 14.796355  , ..., 20.325712  ,
         20.855696  , 21.381271  ],
        [14.651459  , 14.34477   , 14.371858  , ..., 18.00389   ,
         18.4109    , 21.337002  ],
        [14.59026   , 14.257076  , 14.293012  , ..., 17.391146  ,
         18.217058  , 20.882664  ],
        ...,
Coordinates:
    XLAT     (south_north, west_east) float32 ...
    XLONG    (south_north, west_east) float32 ...
Dimensions without coordinates: bottom_top, south_north, west_east

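Hope this helps others who run into the same problem!

Answer 1: (score: 0)

You just need to convert the coordinate variables from 3 dimensions to 2 dimensions by averaging them over Time, so that they no longer depend on it.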

d = xr.open_dataset('.../pm25_sample.nc')
d['XLAT'] = d.XLAT.mean(dim = 'Time')
d['XLONG'] = d.XLONG.mean(dim = 'Time')

d['PM2_5_DRY'].mean(dim = 'Time')

     ...,
    [ 0.03839084,  0.03837739,  0.03835952, ...,  0.03414929,
      0.03412561,  0.03410038],
    [ 0.03837854,  0.03836632,  0.03834687, ...,  0.03414606,
      0.0341224 ,  0.03409675],
    [ 0.03836945,  0.03835024,  0.03833132, ...,  0.03414177,
      0.03411727,  0.03409337]]], dtype=float32)
Coordinates:
    XLAT     (south_north, west_east) float32 14.086891 14.086907 ... 15.111996
    XLONG    (south_north, west_east) float32 120.49799 120.50717 ... 121.55791
Dimensions without coordinates: bottom_top, south_north, west_east
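Since XLAT and XLONG are said not to vary in time, a possible variation is to take the first time step instead of the mean, after checking that the coordinate really is constant. The sketch below assumes a synthetic dataset, and the helper name freeze_coords_in_time is hypothetical; it is one way to apply the idea above, not a built-in xarray feature.

```python
import numpy as np
import xarray as xr

def freeze_coords_in_time(ds, coord_names=("XLAT", "XLONG"), time_dim="Time"):
    """Hypothetical helper: replace time-dependent coordinates by their first time step."""
    out = ds
    for c in coord_names:
        var = ds[c].variable
        first = var.isel({time_dim: 0})
        # Compare every time step with the first one (broadcast by dimension name).
        if not bool((var == first).all().values):
            raise ValueError(f"{c} varies along {time_dim}; one time step is not enough")
        # Overwrite the coordinate with its two-dimensional version.
        out = out.assign_coords({c: (first.dims, first.values)})
    return out

# Synthetic dataset with the same layout as in the question (sizes are made up).
nt, ny, nx = 3, 2, 2
dims = ("Time", "south_north", "west_east")
lat = np.broadcast_to(np.array([[14.0, 14.0], [15.0, 15.0]]), (nt, ny, nx)).copy()
lon = np.broadcast_to(np.array([[120.0, 121.0], [120.0, 121.0]]), (nt, ny, nx)).copy()
ds = xr.Dataset(
    {"PM2_5_DRY": (dims, np.arange(nt * ny * nx, dtype="float32").reshape(nt, ny, nx))},
    coords={"XLAT": (dims, lat), "XLONG": (dims, lon)},
)

frozen = freeze_coords_in_time(ds)
pm25_mean = frozen["PM2_5_DRY"].mean(dim="Time")
print(pm25_mean.coords)
```

Taking one time step avoids any floating-point rounding from averaging, while the explicit check guards against files where the grid does move in time.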