Slicing a NetCDF variable using a bounding-box subset

Asked: 2019-05-12 19:20:20

Tags: python numpy multidimensional-array netcdf netcdf4

Background

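I am trying to slice a NetCDF file using a lat/lon bounding box. The relevant information about the file (variables, shapes, dimensions) was posted as a screenshot:

[image: variables, shapes, and dimensions of the NetCDF file]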

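According to most answers here and the standard tutorials, this should be very straightforward. My understanding is that you just find the indices of the lat/lon values inside the box and then slice the variable array with those indices.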

Attempt/Code

from netCDF4 import Dataset
import numpy as np

def netcdf_worker(nc_file, bbox):
    dataset = Dataset(nc_file)
    for variable in dataset.variables.keys():
        if (variable != 'lat') and (variable != 'lon'):
            var_name = variable
            break

    # Full extent of data
    lats = dataset.variables['lat'][:]
    lons = dataset.variables['lon'][:]

    if bbox:
        lat_bnds = [bbox[0], bbox[2]]  # min lat, max lat
        lon_bnds = [bbox[1], bbox[3]]  # min lon, max lon
        lat_inds = np.where((lats > lat_bnds[0]) & (lats < lat_bnds[1]))
        lon_inds = np.where((lons > lon_bnds[0]) & (lons < lon_bnds[1]))

        var_subset = dataset.variables[var_name][:, lat_inds[0], lon_inds[0]]

        # would also be great to slice the lats and lons too for visualization

Problem

When I try to implement the solution found in other answers on SO with the code above, I get this error:

File "/Users/XXXXXX/Desktop/Viewer/viewer.py", line 41, in netcdf_worker
    var_subset = dataset.variables[var_name][:, lat_inds[0], lon_inds[0]]
  File "netCDF4/_netCDF4.pyx", line 4095, in netCDF4._netCDF4.Variable.__getitem__
  File "/Users/XXXXXX/Viewer/lib/python3.6/site-packages/netCDF4/utils.py", line 242, in _StartCountStride
    ea = np.where(ea < 0, ea + shape[i], ea)
IndexError: tuple index out of range

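I think there is something small about slicing multidimensional arrays that I am missing or misunderstanding, and I would appreciate any help. I am not interested in solutions that bring in other packages or that run outside of Python (please, no CDO or NCKS answers!). Thank you for your help.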

1 answer:

Answer 0 (score: 2)

In Python, I think the simplest solution is to use xarray. A minimal example (using some ERA5 data):

import xarray as xr

f = xr.open_dataset('model_fc.nc')

print(f['latitude'].values)  # [52.771 52.471 52.171 51.871 51.571 51.271 50.971]
print(f['longitude'].values) # [3.927 4.227 4.527 4.827 5.127 5.427 5.727]

f2 = f.sel(longitude=slice(4.5, 5.4), latitude=slice(52.45, 51.5))  

print(f2['latitude'].values)  # [52.171 51.871 51.571]
print(f2['longitude'].values) # [4.527 4.827 5.127]

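As an example I only show the latitude and longitude variables, but every variable in the NetCDF file that has latitude and longitude dimensions is sliced automatically. Note that latitude is stored in descending order in this file, which is why the latitude slice runs from the larger bound to the smaller one.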
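To illustrate that all variables sharing these dimensions are subset in one call, here is a minimal sketch on a synthetic in-memory dataset rather than a real file. The variable names t2m and sp and the zero/one fill values are assumptions made up for the illustration; only the coordinate values are taken from the example above.

```python
import numpy as np
import xarray as xr

# Coordinates copied from the example above; latitude is descending.
lat = np.array([52.771, 52.471, 52.171, 51.871, 51.571, 51.271, 50.971])
lon = np.array([3.927, 4.227, 4.527, 4.827, 5.127, 5.427, 5.727])

# Hypothetical variables (names and values are placeholders).
dims = ("time", "latitude", "longitude")
ds = xr.Dataset(
    data_vars={
        "t2m": (dims, np.zeros((2, lat.size, lon.size))),
        "sp": (dims, np.ones((2, lat.size, lon.size))),
    },
    coords={"time": np.arange(2), "latitude": lat, "longitude": lon},
)

# One .sel() call subsets every variable that has these dimensions.
sub = ds.sel(longitude=slice(4.5, 5.4), latitude=slice(52.45, 51.5))
print(sub["t2m"].shape, sub["sp"].shape)

# Slice bounds given in the opposite order to the coordinate select nothing.
empty = ds.sel(latitude=slice(51.5, 52.45))
print(empty["latitude"].size)
```

If a selection unexpectedly comes back empty, checking whether the coordinate is ascending or descending is a good first step.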


Alternatively, if you want to select the box manually (using NetCDF4):

import netCDF4 as nc4
import numpy as np

f = nc4.Dataset('model_fc.nc')

lat = f.variables['latitude'][:]
lon = f.variables['longitude'][:]

# All indices in bounding box:
where_j = np.where((lon >= 4.5) & (lon <= 5.4))[0]
where_i = np.where((lat >= 51.5) & (lat <= 52.45))[0]

# Start and end+1 indices in each dimension:
i0 = where_i[0]
i1 = where_i[-1]+1

j0 = where_j[0]
j1 = where_j[-1]+1

print(lat[i0:i1])  # [52.171 51.871 51.571]
print(lon[j0:j1])  # [4.527 4.827 5.127]

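Of course, you now have to slice each variable yourself, using e.g. var_slice = var[j0:j1, i0:i1] for a variable whose dimensions are ordered (longitude, latitude). The index order must follow the variable's own dimension order, so for a variable with dimensions (time, latitude, longitude) it would be var[:, i0:i1, j0:j1].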
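The manual approach can be wrapped in a small helper that returns slice objects, which also makes it easy to subset the lat/lon arrays for visualization. This is only a sketch under stated assumptions: bbox_slices is a hypothetical name, a NumPy array stands in for the file variable, the coordinates are 1-D and monotonic, and the variable is assumed to have dimensions (time, latitude, longitude).

```python
import numpy as np

def bbox_slices(lat, lon, lat_min, lat_max, lon_min, lon_max):
    """Return (lat_slice, lon_slice) covering the grid points inside the box.

    Assumes lat and lon are 1-D and monotonic (ascending or descending),
    so the matching indices form one contiguous run.
    """
    lat = np.asarray(lat)
    lon = np.asarray(lon)
    i = np.where((lat >= lat_min) & (lat <= lat_max))[0]
    j = np.where((lon >= lon_min) & (lon <= lon_max))[0]
    if i.size == 0 or j.size == 0:
        raise ValueError("bounding box does not overlap the grid")
    # Start index and end+1 index in each dimension.
    return slice(int(i[0]), int(i[-1]) + 1), slice(int(j[0]), int(j[-1]) + 1)

# Synthetic stand-in for the file contents (coordinates from the example above).
lat = np.array([52.771, 52.471, 52.171, 51.871, 51.571, 51.271, 50.971])
lon = np.array([3.927, 4.227, 4.527, 4.827, 5.127, 5.427, 5.727])
var = np.arange(2 * lat.size * lon.size).reshape(2, lat.size, lon.size)

lat_sl, lon_sl = bbox_slices(lat, lon, 51.5, 52.45, 4.5, 5.4)

# Assumed dimension order: (time, latitude, longitude).
var_subset = var[:, lat_sl, lon_sl]
lat_subset = lat[lat_sl]
lon_subset = lon[lon_sl]
print(var_subset.shape, lat_subset, lon_subset)
```

The same slices can be applied to a netCDF4 variable, provided it really has as many dimensions as indices are supplied. An IndexError like the one in the question may indicate that the selected variable has fewer than three dimensions (for example, if the loop picked a 1-D variable such as a time axis). That is a guess, since the file layout is not shown; printing the variable's .dimensions before slicing would confirm it.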