NumPy shape mismatch

Time: 2020-07-21 22:28:40

Tags: python scipy netcdf

Good day,

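Sorry for the long post! I have 258 input files containing 15 years of data. The script extracts a specific column, chosen as a variable, for the period between the start and end years entered by the user. It was originally written to extract the whole column over that period, but I now filter the extracted values further, because I am only interested in the values between 0 and 1 in that column.

The problem is that the original script was designed to handle 731 samples (365 days + 365 days + 1), so it automatically expects 731 values. I don't have much experience with the NetCDF functions in the SciPy and NumPy modules, so I don't know how to restructure this part of the script around the number of days that pass the filter. Because the number of values between 0 and 1 differs from year to year, I can't work out how to set the sample size from the number of filtered days the script finds.

I did find a workaround: supplying a dummy value of -1 avoids the broadcasting problem and satisfies the 731-sample requirement. But when I open the NetCDF file in ArcGIS, the maximum and minimum are 1 and -1, so I can't see the values I want in the raster properly, because the ArcGIS display is dominated by the -1 values.

I hope that explains my problem. Can anyone help me set the sample size from the filtered values the script finds? I also found NetCDF4Excel, an Excel add-in, with which I could simply delete the dummy values stored in the .nc file from within Excel, but it does not seem to work on 64-bit systems. I know the problem lies with the days, but I don't know how to fix it. I forgot to mention that this is a 3D array containing the X coordinate, the Y coordinate, and the date/time.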

# for what date?
start_year = input("Enter start year:")
end_year = input("End year:")

inidate = datetime.date(start_year,1,1)
enddate = datetime.date(end_year,12,31)

days = enddate.toordinal() - inidate.toordinal()+1 
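As a quick check on where 731 comes from: the count includes both end dates, so a two-year range containing one leap year gives 365 + 366 = 731 days. The sketch below is Python 3, not the Python 2 of the script, and the helper name count_days is hypothetical. In Python 3, input() returns a string, so the years would need int(...) before being passed in.

```python
import datetime

def count_days(start_year, end_year):
    """Calendar days from Jan 1 of start_year to Dec 31 of end_year, inclusive."""
    inidate = datetime.date(start_year, 1, 1)
    enddate = datetime.date(end_year, 12, 31)
    # toordinal() counts days, so the difference plus one is the inclusive length.
    return enddate.toordinal() - inidate.toordinal() + 1

print(count_days(2019, 2020))
```

This is the length of the T dimension, which is why every per-file series has to contain exactly that many entries.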

This is the part of the script I use to filter the values in the specific column:

for l in lixo:
        if int(l.split("\t")[0]) in range(inidate.year, enddate.year+1):
            if var==3:
                if previousValue==-10:                    
                    previousValue=float(l.split("\t")[var])
                    dado.append(-1)
                else:
                    currentValue=float(l.split("\t")[var])
                    if currentValue==0 and previousValue>0:
                        dado[-1]=previousValue
                        dado.append(currentValue)
                    else:
                        dado.append(-1)
                    previousValue=currentValue
            else:
                dado.append(float(l.split("\t")[var]))
        # putting data inside array.
        # Since data has lat & lon fixed uses dimension [:,lat_index,lon_index]

    print dado
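One possible way around the display problem, sketched here under assumptions: keep one entry per day so the length still matches the time dimension, but mark the rejected days with the same -9999 value the script already declares as missing_value, not with -1. Whether ArcGIS then excludes those cells from the display stretch is an assumption that would need checking. The function name filter_series is hypothetical, the code is Python 3, and the keep/reject rule is copied from the loop above (keep a positive value and the 0 that immediately follows it).

```python
NODATA = -9999.0  # same value the script stores in data_var.missing_value

def filter_series(values, nodata=NODATA):
    """Return a list as long as `values`, with rejected days set to `nodata`.

    A day is kept only where the series drops to 0 from a positive value:
    both the positive value and the 0 that follows it are retained.
    """
    out = []
    previous = None  # replaces the -10 sentinel used in the script
    for current in values:
        if previous is not None and current == 0 and previous > 0:
            out[-1] = previous      # restore the day before the drop
            out.append(current)     # keep the 0 itself
        else:
            out.append(nodata)      # placeholder keeps the day count intact
        previous = current
    return out

print(filter_series([0.0, 0.4, 0.0, 0.0, 0.7, 0.0]))
```

Because the output always has one entry per input row, it can be assigned to all_data[:, lat_id, lon_id] without changing the size of the time dimension.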

The entire script is as follows:

import os
import sys
# handle dates...
import datetime
# SciPy netCDF and NumPy
from scipy.io.netcdf import *
from numpy import *

skip_lines = 6

# building file list and sorted lat lon list
file_list = os.listdir(sys.argv[1])

lat_t = []
lon_t = []
lat = []
lon = []

for f in file_list:
    lat_t.append(float(f.split("_")[1]))
    lon_t.append(float(f.split("_")[2]))

for i in lat_t:
    if i not in lat:
        lat.append(i)

for i in lon_t:
    if i not in lon:
        lon.append(i)
# putting in order. Lat should be from top to bottom
# lon from left to right 
lon.sort()
lat.sort()
lat.reverse()

del(lat_t)
del(lon_t)

#determining the parameter to use
print "Choose output parameter"
varini = input('Choose output (1 a 8)>')

#getting the column right
if int (varini) < 8:
    var = varini + 2
#set name of out_file. Named after parameter choice
if var == 3:
    var_txt = "ABCD"
    var_name = "ABCD"
# for what date?
start_year = input("Enter start year:")
end_year = input("End year:")

inidate = datetime.date(start_year,1,1)
enddate = datetime.date(end_year,12,31)

days = enddate.toordinal() - inidate.toordinal()+1

print "Go grab a coffee, this could take a while..."

#
# create array containig all data
# This is going to be huge. Create an array with -9999 (NoData)
# Then populate the array by reading each input file
#

all_data = zeros([days,len(lat),len(lon)], float)-9999

c = len(file_list)

# for each file in list
for f in file_list:
    # get lat & lon and it's index
    latitude = float(f.split("_")[1])
    longitude = float(f.split("_")[2])
    lat_id = lat.index(latitude)
    lon_id = lon.index(longitude)

    print "%i files to write." % c
    c = c -1

    infile = open(sys.argv[1]+f, "r")
    # here we skip the number of header lines
    # variable set at the beginning of the code
    lixo = infile.readlines()[skip_lines:]
    infile.close()
    dado = []
    previousValue = -10

    for l in lixo:
        if int(l.split("\t")[0]) in range(inidate.year, enddate.year+1):
            if var==3:
                if previousValue==-10:                    
                    previousValue=float(l.split("\t")[var])
                    dado.append(-1)
                else:
                    currentValue=float(l.split("\t")[var])
                    if currentValue==0 and previousValue>0:
                        dado[-1]=previousValue
                        dado.append(currentValue)
                    else:
                        dado.append(-1)
                    previousValue=currentValue
            else:
                dado.append(float(l.split("\t")[var]))

        # putting data inside array.
        # Since data has lat & lon fixed uses dimension [:,lat_index,lon_index]

    print dado
    all_data[:,lat_id,lon_id] = dado

# writing NetCDF
#

ncfile = netcdf_file(var_txt+".nc", "w")

ncfile.Conventions = "COARDS"
ncfile.history = "Created using flux2cdf.py. " + datetime.date.today().isoformat()
ncfile.production = "ABCD output"

ncfile.start_date = inidate.isoformat()
ncfile.end_date = enddate.isoformat()

#create dimensions
ncfile.createDimension("X", len(lon))
ncfile.createDimension("Y", len(lat))
ncfile.createDimension("T", days)

#create variables
latvar = ncfile.createVariable("Y", "f4", ("Y",))
latvar.long_name = "Latitude"
latvar.units = "degrees_north"
latvar[:] = lat

lonvar = ncfile.createVariable("X", "f4", ("X",))
lonvar.long_name = "Longitude"
lonvar.units = "degrees_east"
lonvar[:] = lon

timevar = ncfile.createVariable("T", "f4", ("T",))
timevar.long_name = "Time"
timevar.units = "days since " + inidate.isoformat()
timevar[:] = range(0, days)

data_var = ncfile.createVariable(var_txt, "f4", ("T","Y","X"))
data_var.long_name = var_name+" calculated by ABCD"
data_var.missing_value = -9999.0
data_var.units = "milimeters"
data_var[:] = all_data

ncfile.close()
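The shape mismatch itself can be reproduced on a small array. This is a minimal Python 3 sketch with made-up sizes and values, not the data from the question: assigning fewer values than there are days along the time axis raises a ValueError, while writing the same values at their day offsets leaves the remaining days at the NoData value.

```python
import numpy as np

days, n_lat, n_lon = 5, 2, 2   # toy sizes, assumed for illustration
NODATA = -9999.0
all_data = np.full((days, n_lat, n_lon), NODATA)

short = [0.4, 0.0, 0.7]        # only the days that passed the filter
try:
    all_data[:, 0, 0] = short  # 3 values into 5 time steps
except ValueError as err:
    print("shape mismatch:", err)

# Same values placed at their (hypothetical) day offsets; other days stay NODATA.
day_index = [1, 2, 4]
all_data[day_index, 0, 0] = short
print(all_data[:, 0, 0])
```

The array keeps a fixed time dimension in both cases; what changes is that the filtered values are positioned by day and no longer packed together.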

In the .nc file, the sample data containing the desired filtered values is highlighted in yellow (screenshot).

Sample data:

245 files to write.

244 files to write.

1 Answer:

Answer 0: (score: 0)

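It turns out there is a simple way: the time slice tool in ArcGIS (https://support.esri.com/en/technical-article/000011318) can extract the specific dates of interest together with their values. I can now extract a separate raster file for each day of interest.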
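For comparison, the same kind of selection can be sketched in Python 3 instead of ArcGIS. This is not the tool from the linked article, and the helper name time_index is hypothetical. Since the script writes T as "days since" the start date, the position of a date along the T axis is its day offset from that start date.

```python
import datetime

def time_index(day, inidate):
    """Offset of `day` along the T axis, where T = 0 corresponds to `inidate`."""
    return day.toordinal() - inidate.toordinal()

inidate = datetime.date(2019, 1, 1)            # example start date
idx = time_index(datetime.date(2020, 3, 1), inidate)
print(idx)
# With the array from the script, the 2-D grid for that day
# would be all_data[idx, :, :].
```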