Question

我有大约4TB MERIS时间序列数据，它们采用netCDF格式。

所以我有很多包含几个＆＃39;变量的netCDF文件。 NetCDF格式对我来说是新的，尽管我已经阅读了很多关于netCDF处理的内容，但我并不了解如何做到这一点。这个问题＆＃39; Combining a large amount of netCDF files＆＃39;以某种方式处理我的问题，但我没有到达那里。我的方法是首先镶嵌，然后叠加并最近从每个像素中取出平均值。

一个文件包含以下32个变量

此处还有一天的.nc文件的ncdump输出： http://www.filedropper.com/ncdumpoutput

我设法读取文件，提取我想要的变量（变量＃32）并使用以下代码将它们放入列表中

l = list()
for i in files_in:

# read netCDF file
dset = nc.Dataset(i, mode = 'r')

# save variables
var = dset.variables['vegetation_index_mean'][:]

# write all temp loop outputs in a list
l.append (var)

# close netCDF file
dset.close()

该列表现在包含24＆＃39; masked_arrays＆＃39; 同一日期的不同位置。每当我想要打印列表的内容时，我的Spyder会冻结。我之后运行的每个命令Spyder在开始之前首先冻结五秒钟。

我的目标是针对特定时间范围（存储在单个.nc文件中的每个日期）进行时间序列分析。所以我的计划是镶嵌（这可能吗？）列表中的变量（将它们视为栅格波段），处理其他日期并取每个像素的平均值（1800 x 1800）。

也许我的整个方法都错了？我可以处理这些变量＆＃39;像栅格乐队？

Answer 1

我不确定以下答案是否可以满足您的需求，因为此程序是为了处理时间序列而设计的，非常手动，而且您还有4Tb的数据......

因此，如果这没有帮助，我会道歉。

这适用于Python 2.7：

首先导入所需的所有模块：

import tkFileDialog 
from netCDF4 import Dataset
import matplotlib.pyplot as plt

第二次解析多个nc文件：

n = []
filename = {}
filename = tkFileDialog.askopenfilenames()  
filename = list(filename)
n = len(filename)

第三次读取nc文件并使用循环对字典中的数据和元数据进行分类：

wtr_tem = {}  # create empty arrays for variable sea water temperature
fh = {}       # create empty arrays for filehandler and variables nc file
vars = {}

for i in range(n):
    filename[i]=filename[i].decode('unicode_escape').encode('ascii','ignore') # remove unicode in order to execute the following command
    filename1 = ''.join(filename[i]) # converts list to string
    fh[i] = Dataset(filename1, mode='r') #create the file handle
    vars[i] = fh[i].variables.keys()  #returns a list with the variables of the file

    wtr_tem[i] = fh[i].variables['WTR_TEM']

    #plot variables in different figures
    plt.plot(wtr_tem[i],'r-')

    plt.xlabel(fh[i].title) #add specific title from each nc file
    plt.show()

我希望它对某些人有帮助。

在python中使用多个netCDF文件/变量

1 个答案: