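Background
I am trying to slice a NetCDF file using a latitude/longitude bounding box. The relevant information about the file is as follows (variables, shape, dimensions):
Most answers here and in the standard tutorials suggest this should be very straightforward. My understanding is that you just find the indices of the latitudes and longitudes inside the box, then slice the variable array with those indices.
Attempt/Code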
import numpy as np
from netCDF4 import Dataset

def netcdf_worker(nc_file, bbox):
    dataset = Dataset(nc_file)
    # Take the first variable that is not a coordinate
    for variable in dataset.variables.keys():
        if (variable != 'lat') and (variable != 'lon'):
            var_name = variable
            break
    # Full extent of data
    lats = dataset.variables['lat'][:]
    lons = dataset.variables['lon'][:]
    if bbox:
        lat_bnds = [bbox[0], bbox[2]]  # min lat, max lat
        lon_bnds = [bbox[1], bbox[3]]  # min lon, max lon
        lat_inds = np.where((lats > lat_bnds[0]) & (lats < lat_bnds[1]))
        lon_inds = np.where((lons > lon_bnds[0]) & (lons < lon_bnds[1]))
        var_subset = dataset.variables[var_name][:, lat_inds[0], lon_inds[0]]
        # would also be great to slice the lats and lons too for visualization
Problem
When I try to implement the solutions from other answers on SO with the code above, I get this error:
File "/Users/XXXXXX/Desktop/Viewer/viewer.py", line 41, in netcdf_worker
var_subset = dataset.variables[var_name][:, lat_inds[0], lon_inds[0]]
File "netCDF4/_netCDF4.pyx", line 4095, in netCDF4._netCDF4.Variable.__getitem__
File "/Users/XXXXXX/Viewer/lib/python3.6/site-packages/netCDF4/utils.py", line 242, in _StartCountStride
ea = np.where(ea < 0, ea + shape[i], ea)
IndexError: tuple index out of range
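I think I am missing or misunderstanding something small about slicing multidimensional arrays, and I would appreciate any help. I am NOT interested in solutions that bring in another package or run outside Python (please no CDO or NCKS answers!). Thank you for your help.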
Answer 0 (score: 2)
In Python, I think the easiest solution is to use xarray. Minimal example (using some ERA5 data):
import xarray as xr
f = xr.open_dataset('model_fc.nc')
print(f['latitude'].values) # [52.771 52.471 52.171 51.871 51.571 51.271 50.971]
print(f['longitude'].values) # [3.927 4.227 4.527 4.827 5.127 5.427 5.727]
f2 = f.sel(longitude=slice(4.5, 5.4), latitude=slice(52.45, 51.5))
print(f2['latitude'].values) # [52.171 51.871 51.571]
print(f2['longitude'].values) # [4.527 4.827 5.127]
As an example I only show the latitude and longitude variables, but every variable in the NetCDF file that has latitude and longitude dimensions is sliced automatically.
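To illustrate that data variables are cut along with the coordinates, here is a minimal sketch on a synthetic dataset that reuses the coordinate values printed above. The variable name t2m and the time dimension are made up for illustration and are not taken from the ERA5 file.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the file; "t2m" and "time" are hypothetical names.
lat = np.array([52.771, 52.471, 52.171, 51.871, 51.571, 51.271, 50.971])
lon = np.array([3.927, 4.227, 4.527, 4.827, 5.127, 5.427, 5.727])
ds = xr.Dataset(
    {"t2m": (("time", "latitude", "longitude"),
             np.zeros((2, lat.size, lon.size)))},
    coords={"time": [0, 1], "latitude": lat, "longitude": lon},
)

# Latitude is stored north to south here, so the slice bounds
# are given from high to low to match the coordinate order.
sub = ds.sel(longitude=slice(4.5, 5.4), latitude=slice(52.45, 51.5))

print(sub["t2m"].shape)  # (2, 3, 3)
```

Note that the latitude bounds are written in the same direction as the coordinate itself (descending in this data); with an ascending latitude axis the bounds would be given low to high.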
Alternatively, if you want to select the box manually (using NetCDF4):
import netCDF4 as nc4
import numpy as np
f = nc4.Dataset('model_fc.nc')
lat = f.variables['latitude'][:]
lon = f.variables['longitude'][:]
# All indices in bounding box:
where_j = np.where((lon >= 4.5) & (lon <= 5.4))[0]
where_i = np.where((lat >= 51.5) & (lat <= 52.45))[0]
# Start and end+1 indices in each dimension:
i0 = where_i[0]
i1 = where_i[-1]+1
j0 = where_j[0]
j1 = where_j[-1]+1
print(lat[i0:i1]) # [52.171 51.871 51.571]
print(lon[j0:j1]) # [4.527 4.827 5.127]
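Of course, you now have to slice each variable manually, using e.g. var_slice = var[j0:j1, i0:i1]. The index ranges must follow the variable's own dimension order; for a variable with dimensions (latitude, longitude) that would be var[i0:i1, j0:j1].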
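The manual approach can be wrapped in a small helper so the same index ranges are reused for every variable. This is only a sketch: bbox_slices is a hypothetical helper, the data array is synthetic with an assumed (time, latitude, longitude) layout, and the coordinates are assumed to be 1-D and monotonic so that the points inside the box are contiguous.

```python
import numpy as np

def bbox_slices(lat, lon, lat_min, lat_max, lon_min, lon_max):
    """Return (lat_slice, lon_slice) covering the grid points inside the box.

    Assumes 1-D, monotonic coordinate arrays (ascending or descending).
    """
    i = np.where((lat >= lat_min) & (lat <= lat_max))[0]
    j = np.where((lon >= lon_min) & (lon <= lon_max))[0]
    if i.size == 0 or j.size == 0:
        raise ValueError("bounding box contains no grid points")
    # Start and end+1 indices in each dimension
    return slice(i[0], i[-1] + 1), slice(j[0], j[-1] + 1)

# Synthetic data with an assumed (time, latitude, longitude) layout
lat = np.array([52.771, 52.471, 52.171, 51.871, 51.571, 51.271, 50.971])
lon = np.array([3.927, 4.227, 4.527, 4.827, 5.127, 5.427, 5.727])
var = np.arange(2 * lat.size * lon.size).reshape(2, lat.size, lon.size)

lat_sl, lon_sl = bbox_slices(lat, lon, 51.5, 52.45, 4.5, 5.4)
var_slice = var[:, lat_sl, lon_sl]

print(var_slice.shape)  # (2, 3, 3)
print(lat[lat_sl])      # [52.171 51.871 51.571]
```

Plain slices like these can also be used to index a netCDF4 variable directly, as long as the number and order of indices match the variable's dimensions.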