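I am working with three-dimensional (x, y, time) NetCDF files that contain hourly PM10 concentration estimates for a year. My goal is to extract the hourly estimates at several coordinates (365 days * 24 hours = 8760 estimates per year per coordinate) and then average them to daily (365) estimates.
My script (see below) works fine for 2013, but the 2012 output contains a lot of NAs. The difference I noticed is that in the 2012 file lon/lat are stored as matrices: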
File E:/ENSa.2012.PM10.yearlyrea_.nc (NC_FORMAT_CLASSIC):

     3 variables (excluding dimension variables):
        float lon[x,y]
            long_name: Longitude
            units: degrees_east
        float lat[x,y]
            long_name: Latitude
            units: degrees_north
        float PM10[x,y,time]
            units: ug/m3

     3 dimensions:
        x  Size:701
        y  Size:401
        time  Size:8784   *** is unlimited ***
            units: day as %Y%m%d.%f
            calendar: proleptic_gregorian

head(lon)
      [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9]
[1,] -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0
[2,] -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9
For the 2013 file, lon is 'normal', like this:
File E:/ENSa.2013.PM25.yearlyrea.nc (NC_FORMAT_NETCDF4):

     1 variables (excluding dimension variables):
        float PM25[lon,lat,time]   (Chunking: [701,401,1])
            long_name: PM25
            units: ug
            _FillValue: -999

     3 dimensions:
        lon  Size:701
            standard_name: longitude
            long_name: longitude
            units: degrees_east
            axis: X
        lat  Size:401
            standard_name: latitude
            long_name: latitude
            units: degrees_north
            axis: Y
        time  Size:8760   *** is unlimited ***
            standard_name: time
            long_name: time at end of period
            units: day as %Y%m%d.%f
            calendar: proleptic_gregorian

head(lon)
[1] -25.0 -24.9 -24.8 -24.7 -24.6 -24.5
I use the following script:
library(raster)  # brick(), getZ(), extract(); reading NetCDF also needs the ncdf4 package installed

# brick() reads all layers (time slices) in the file
pm102013 <- brick("ENSa.2013.PM10.yearlyrea.nc", varname = "PM10")

# Get the date index from the file
idx <- getZ(pm102013)

# Station coordinates: first column longitude, second column latitude
coords <- matrix(c(-2.094278, -1.830583, -2.584482, -0.175269, -3.17625, 0.54797, -2.678731, -1.433611, -1.456944, -3.182186,
                   57.15736, 52.511722, 51.462839, 51.54421, 51.48178, 51.374264, 51.638094, 53.230583, 53.231722, 55.945589),
                 ncol = 2)

# Extract values at the coordinates for all time steps
vals <- extract(pm102013, coords, df = TRUE)

# Merge dates and values and fix the data frame names (drop the ID row)
df.pm102013 <- data.frame(idx, t(vals)[-1, ])
rownames(df.pm102013) <- NULL
names(df.pm102013) <- c('date', 'UKA00399', 'UKA00479', 'UKA00494', 'UKA00259', 'UKA00217', 'UKA00553', 'UKA00515', 'UKA00530', 'UKA00529', 'UKA00454')

# Output
options(max.print = 100000000)
sink("PM10_2013.txt")
print(df.pm102013)
sink()
Does anyone know how to 'fix' the lon/lat problem? Or is there another efficient way to extract and average the data?
Answer 0 (score: 0)
You can extract the grid point nearest to a given lon/lat location and compute the daily means with CDO from the bash command line:
    cdo daymean -remapnn,lon=X/lat=Y input.nc output.nc
The minus sign in front of remapnn means that its result is piped to the daymean command. For each point you need, you can put this in a bash loop.
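A minimal sketch of such a loop, using the first three stations from the question. The input file name, the output naming, and the `build_cdo_cmd` helper are hypothetical. The `lon=.../lat=...` point syntax is an assumption: I believe newer CDO releases document it as `lon=..._lat=...`, so check the documentation for your version. As written, the script only prints the commands; the line that would run them is commented out.

```shell
#!/bin/sh
# Sketch: one daily-mean file per station via CDO (dry run by default).
# Assumptions: file names and the lon=/lat= point syntax; adjust to your setup.

infile="ENSa.2012.PM10.yearlyrea_.nc"   # hypothetical path to the hourly file

# Hypothetical helper: build the CDO command line for a single point.
# $1 = longitude, $2 = latitude, $3 = input file, $4 = output file
build_cdo_cmd() {
    printf 'cdo daymean -remapnn,lon=%s/lat=%s %s %s\n' "$1" "$2" "$3" "$4"
}

# Each input line is "station_id longitude latitude".
while read -r id lon lat; do
    cmd=$(build_cdo_cmd "$lon" "$lat" "$infile" "PM10_2012_${id}_daily.nc")
    echo "$cmd"        # dry run: show what would be executed
    # eval "$cmd"      # uncomment to actually run it (requires cdo)
done <<EOF
UKA00399 -2.094278 57.15736
UKA00479 -1.830583 52.511722
UKA00494 -2.584482 51.462839
EOF
```

For the 2012 file each output should then hold 366 daily values, since 2012 is a leap year and the file has 8784 = 366 * 24 hourly steps.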