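I have some NetCDF time-series data. Reading the variables, I want to build a list of global mean values for each of them. The code I wrote works, but it isn't very pretty. How can I write this (loop) in a better way?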
variables = 'name1 name2 name3 name4'.split()
name1 = []
name2 = []
name3 = []
name4 = []
for i in range(9, 40):
    name1_irrY = name1_aSrc[i].mean()
    name1.append(name1_irrY)
    name2_irrY = name2_aSrc[i].mean()
    name2.append(name2_irrY)
    name3_irrY = name3_aSrc[i].mean()
    name3.append(name3_irrY)
    name4_irrY = name4_aSrc[i].mean()
    name4.append(name4_irrY)
Here each "name"_aSrc[i,:,:] is a NetCDF variable.
Since I have many files, I need an efficient approach.
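Answer 0 (score: 2)
I don't think you need a loop at all, because you can specify the axes along which the mean is calculated. So something like this should be enough (it replaces the entire code block you posted):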
name1 = np.mean(name1_aSrc[9:40,:,:], axis=(1,2))
name2 = np.mean(name2_aSrc[9:40,:,:], axis=(1,2))
# etc..
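To avoid repeating that line for every variable, the arrays can be kept in a dictionary keyed by variable name. The following is only a minimal sketch: the `sources` dictionary and its random (time, lat, lon) arrays are hypothetical stand-ins for the real NetCDF variables, which would be read from the files instead.

```python
import numpy as np

# Hypothetical stand-ins for the NetCDF variables, each shaped (time, lat, lon).
# In real use these arrays would come from the NetCDF files.
rng = np.random.default_rng(0)
names = "name1 name2 name3 name4".split()
sources = {name: rng.random((50, 4, 5)) for name in names}

# One spatial mean per time step (steps 9..39) for every variable at once.
means = {
    name: np.mean(data[9:40, :, :], axis=(1, 2))
    for name, data in sources.items()
}

# Cross-check against the per-time-step loop from the question for one variable.
loop_result = [sources["name1"][i].mean() for i in range(9, 40)]
```

Each entry of `means` is a one-dimensional array with one value per selected time step, so it can be used in place of the lists built by the original loop.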
A small example with some NetCDF data I had lying around:
import xarray as xr
import numpy as np
f = xr.open_dataset('u.xz.nc', decode_times=False)
u = f['u'].values
print(u.shape) # prints: (5, 96, 128, 1)
umean = np.mean(u, axis=(1,2,3))
print(umean.shape) # prints: (5,)
Another solution is to let xarray calculate the mean over one or more (named) dimensions. A quick example with some other data:
import xarray as xr
import numpy as np
f = xr.open_dataset('drycblles_default_0000000.nc', decode_times=False)
# Original file has 3 dimensions:
print(f.dims) # prints Frozen(SortedKeysDict({'time': 37, 'z': 32, 'zh': 33}))
# Calculate mean over one single dimension:
fm1 = f.mean(dim='z')
print(fm1.dims) # prints Frozen(SortedKeysDict(OrderedDict([('time', 37), ('zh', 33)])))
# Calculate mean over multiple dimensions:
fm2 = f.mean(dim=['z','zh'])
print(fm2.dims) # prints Frozen(SortedKeysDict(OrderedDict([('time', 37)])))
fm1 and fm2 are again just xarray datasets:
<xarray.Dataset>
Dimensions:  (time: 37)
Coordinates:
  * time     (time) float64 0.0 300.0 600.0 900.0 ... 1.02e+04 1.05e+04 1.08e+04
Data variables:
    iter     (time) float64 0.0 5.0 10.0 15.0 20.0 ... 282.0 293.0 305.0 317.0
    area     (time) float64 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0 1.0
    areah    (time) float64 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0 1.0
    th       (time) float64 304.8 304.8 304.8 304.8 ... 305.1 305.1 305.1 305.1
    th_3     (time) float64 1.246e-08 -3.435e-11 ... 7.017e-06 5.548e-05
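Applied to the layout in the question, the same idea might look like the sketch below. It assumes the variables live in one dataset with dimensions named `time`, `lat` and `lon`; those dimension names and the synthetic data are assumptions for illustration, not taken from the original files.

```python
import numpy as np
import xarray as xr

# Synthetic dataset standing in for a real file; the dimension names
# ("time", "lat", "lon") and the variable contents are assumptions.
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {
        name: (("time", "lat", "lon"), rng.random((50, 4, 5)))
        for name in ["name1", "name2", "name3", "name4"]
    }
)

# Select time steps 9..39, then average every variable over both spatial dimensions.
spatial_mean = ds.isel(time=slice(9, 40)).mean(dim=["lat", "lon"])

# Each variable is now a 1-D series along "time".
name1_series = spatial_mean["name1"].values
```

Because `mean` is applied to the whole dataset, every variable is reduced in a single call and no per-variable code is needed.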