Question

I want to extract a spatial subset of a rather large netcdf file. The code below is from "Loop through netcdf files and run calculations - Python or R":

from pylab import *
import netCDF4

f = netCDF4.MFDataset('/usgs/data2/rsignell/models/ncep/narr/air.2m.1989.nc')
# print variables
f.variables.keys()
atemp = f.variables['air'] # TODO: extract spatial subset

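How can I extract only the subset of the netcdf file that corresponds to a state (say, Iowa)? Iowa has the following lat/lon bounds:

Longitude: 89° 5' W to 96° 31' W

Latitude: 40° 36' N to 43° 30' N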

Answer 1

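This is fairly simple: you have to find the indices of the lower and upper bounds in latitude and longitude. You can do that by finding the coordinate value closest to the one you are looking for.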

latbounds = [ 40 , 43 ]
lonbounds = [ -96 , -89 ] # degrees east ? 
lats = f.variables['latitude'][:] 
lons = f.variables['longitude'][:]

# latitude lower and upper index
latli = np.argmin( np.abs( lats - latbounds[0] ) )
latui = np.argmin( np.abs( lats - latbounds[1] ) ) 

# longitude lower and upper index
lonli = np.argmin( np.abs( lons - lonbounds[0] ) )
lonui = np.argmin( np.abs( lons - lonbounds[1] ) )

Then just subset the variable array.

# Air (time, latitude, longitude) 
airSubset = f.variables['air'][ : , latli:latui , lonli:lonui ]

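Note that I am assuming the longitude dimension variable is in degrees east, and that the air variable has time, latitude and longitude dimensions, in that order.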
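As a self-contained check of the nearest-index approach, here is a minimal sketch on a synthetic 1-degree grid rather than the NARR file. The grid, the random data and the helper name `nearest_index` are assumptions made up for illustration. One thing it makes visible: Python slices exclude the upper index, so `+ 1` is needed if the grid point nearest the upper bound should be kept.

```python
import numpy as np

def nearest_index(coords, value):
    """Return the index of the coordinate closest to value."""
    return int(np.argmin(np.abs(coords - value)))

# Synthetic ascending grid (assumption: the real file may be ordered differently)
lats = np.arange(30.0, 51.0, 1.0)      # 30, 31, ..., 50
lons = np.arange(-100.0, -79.0, 1.0)   # -100, -99, ..., -80
air = np.random.default_rng(0).random((4, lats.size, lons.size))  # (time, lat, lon)

latli, latui = nearest_index(lats, 40), nearest_index(lats, 43)
lonli, lonui = nearest_index(lons, -96), nearest_index(lons, -89)

# + 1 so that the grid points nearest the upper bounds are included
air_subset = air[:, latli:latui + 1, lonli:lonui + 1]
print(air_subset.shape)
```

On a grid stored in descending order the lower and upper indices come out swapped, so they would need to be reordered before slicing.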

Answer 2

Favo's answer works (I assume; I haven't checked). A more direct and idiomatic way is to use numpy's where function to find the necessary indices.

lats = f.variables['latitude'][:] 
lons = f.variables['longitude'][:]
lat_bnds, lon_bnds = [40, 43], [-96, -89]

# np.where returns a tuple of arrays; [0] takes the 1-D index array
lat_inds = np.where((lats > lat_bnds[0]) & (lats < lat_bnds[1]))[0]
lon_inds = np.where((lons > lon_bnds[0]) & (lons < lon_bnds[1]))[0]

air_subset = f.variables['air'][:, lat_inds, lon_inds]
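If the variable has already been read into a plain NumPy array, keep in mind that NumPy pairs two index arrays element-wise instead of combining them into a box (and raises an error if their lengths cannot be broadcast). `np.ix_` is one way to get the rectangular subset in that case. A minimal sketch on synthetic data; the grid and the array of zeros are made up for illustration:

```python
import numpy as np

lats = np.arange(30.0, 51.0, 1.0)      # 30, 31, ..., 50
lons = np.arange(-100.0, -79.0, 1.0)   # -100, -99, ..., -80
air = np.zeros((4, lats.size, lons.size))  # (time, lat, lon)

lat_bnds, lon_bnds = [40, 43], [-96, -89]
lat_inds = np.where((lats > lat_bnds[0]) & (lats < lat_bnds[1]))[0]
lon_inds = np.where((lons > lon_bnds[0]) & (lons < lon_bnds[1]))[0]

# np.ix_ builds an open mesh, so every selected latitude is combined
# with every selected longitude (and every time step)
air_subset = air[np.ix_(np.arange(air.shape[0]), lat_inds, lon_inds)]
print(lats[lat_inds], lons[lon_inds], air_subset.shape)
```

Because the comparisons are strict, grid points that fall exactly on a bound (40 and 43 here) are excluded; use `>=` and `<=` to keep them.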

Answer 3

If you like pandas, then you should consider checking out xarray.

import xarray as xr

ds = xr.open_dataset('http://geoport.whoi.edu/thredds/dodsC/usgs/data2/rsignell/models/ncep/narr/air.2m.1980.nc',
                     decode_cf=False)
lat_bnds, lon_bnds = [40, 43], [-96, -89]
ds.sel(lat=slice(*lat_bnds), lon=slice(*lon_bnds))

Answer 4

Note that this can be done even faster with NCO's ncks command-line tool.

ncks -O -v air -d latitude,40.,43. -d longitude,-96.,-89. infile.nc subset_infile.nc

Answer 5

To mirror the response from N1B4, you can also do this in one line with the Climate Data Operators (cdo):

cdo sellonlatbox,-96.5,-89,40,43 in.nc out.nc

So, to loop over a set of files, I would do this in a BASH script, using cdo to process each file and then calling your python script:

#!/bin/bash

# pick up a list of files (I'm presuming the loop is over the years)
files=`ls /usgs/data2/rsignell/models/ncep/narr/air.2m.*.nc`

for file in $files ; do 
   # extract the location, I haven't used your exact lat/lons
   cdo sellonlatbox,-96.5,-89,40,43 $file iowa.nc

   # Call your python or R script here to process file iowa.nc
   python script
done

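I always try to do my file processing "offline", as I find it less prone to error. cdo is an alternative to ncks; I am not saying it is better, I just find its commands easier to remember. nco is in general more powerful, and I resort to it when cdo cannot perform the task I want to carry out.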
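If you would rather keep the loop in Python, the same idea can be sketched with `glob` and `subprocess`. This is only an illustration under assumptions: the helper names `cdo_crop_command` and `crop_all` are hypothetical, and actually running the commands requires cdo to be installed and on the PATH, so `run` defaults to False and the sketch just builds the argument lists.

```python
import glob
import subprocess

def cdo_crop_command(infile, outfile, lon_min, lon_max, lat_min, lat_max):
    """Build the argument list for `cdo sellonlatbox` (does not run it)."""
    box = f"sellonlatbox,{lon_min},{lon_max},{lat_min},{lat_max}"
    return ["cdo", box, infile, outfile]

def crop_all(pattern, outfile="iowa.nc", run=False):
    """Build (and optionally run) one crop command per file matching pattern."""
    commands = []
    for infile in sorted(glob.glob(pattern)):
        cmd = cdo_crop_command(infile, outfile, -96.5, -89, 40, 43)
        if run:  # requires cdo on PATH
            subprocess.run(cmd, check=True)
            # process outfile here, before the next iteration overwrites it
        commands.append(cmd)
    return commands

print(cdo_crop_command("in.nc", "out.nc", -96.5, -89, 40, 43))
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.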

Answer 6

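A small change is needed in the lonbounds part (the data is in degrees east): the longitude values in the data range from 0 to 359, so negative numbers will not work in this case. The calculation of latli and latui also needs to be switched, because the latitude values run from north to south, from 89 to -89.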

latbounds = [ 40 , 43 ]
lonbounds = [ 260 , 270 ] # degrees east
lats = f.variables['latitude'][:] 
lons = f.variables['longitude'][:]

# latitude lower and upper index
latli = np.argmin( np.abs( lats - latbounds[1] ) )
latui = np.argmin( np.abs( lats - latbounds[0] ) ) 

# longitude lower and upper index
lonli = np.argmin( np.abs( lons - lonbounds[0] ) )
lonui = np.argmin( np.abs( lons - lonbounds[1] ) )
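Both adjustments can be written so that they do not depend on how a particular file is laid out. A minimal sketch, with hypothetical helper names `to_degrees_east` and `ordered_slice`, and a synthetic descending latitude axis standing in for the real one:

```python
def to_degrees_east(lon):
    """Map a longitude in [-180, 180] to the 0-360 degrees-east convention."""
    return lon % 360

def ordered_slice(i, j):
    """Return a slice covering both indices inclusively, whichever is larger."""
    lo, hi = sorted((i, j))
    return slice(lo, hi + 1)

# Synthetic latitude axis running north to south: 89, 88, ..., -89
lats = list(range(89, -90, -1))
lat_slice = ordered_slice(lats.index(40), lats.index(43))

print(to_degrees_east(-96), to_degrees_east(-89))  # 264 271
print(lats[lat_slice])                             # [43, 42, 41, 40]
```

With this conversion, -96 and -89 correspond to 264 and 271 degrees east.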

Answer 7

If you are on Linux or macOS, this can be handled easily with nctoolkit (https://nctoolkit.readthedocs.io/en/latest/):

import nctoolkit as nc
data = nc.open_data('/usgs/data2/rsignell/models/ncep/narr/air.2m.1989.nc')
data.crop(lon = [-(96+31/60), -(89+5/60)], lat = [40 + 36/60, 43 + 30/60])
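The degree-and-minute arithmetic in the crop bounds is easy to get wrong by hand, so it may help to wrap it in a small function. A minimal sketch; the helper name `dm_to_decimal` is hypothetical, and the bounds are the Iowa values given in the question:

```python
def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and arc-minutes to signed decimal degrees."""
    value = degrees + minutes / 60
    # West and south are negative in the usual signed convention
    return -value if hemisphere in ("W", "S") else value

lon_bounds = [dm_to_decimal(96, 31, "W"), dm_to_decimal(89, 5, "W")]
lat_bounds = [dm_to_decimal(40, 36, "N"), dm_to_decimal(43, 30, "N")]
print(lon_bounds, lat_bounds)
```

The resulting lists can be passed directly as the lon and lat arguments.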

netcdf4 extract for subset of lat lon
