Question

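I am working with three-dimensional (x, y, time) NetCDF files that contain hourly PM10 concentration estimates for one year. My goal is to extract the hourly estimates at several coordinates (365 days * 24 hours = 8760 estimates per year per coordinate) and then average them to daily (365) estimates.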

My script (see below) works well for 2013, but the output for 2012 contains many NAs. The difference I noticed is that in the 2012 file lon/lat are stored as matrices:

File E:/ENSa.2012.PM10.yearlyrea_.nc (NC_FORMAT_CLASSIC):

     3 variables (excluding dimension variables):
        float lon[x,y]   
            long_name: Longitude
            units: degrees_east
        float lat[x,y]   
            long_name: Latitude
            units: degrees_north
        float PM10[x,y,time]   
            units: ug/m3

     3 dimensions:
        x  Size:701
        y  Size:401
        time  Size:8784   *** is unlimited ***
            units: day as %Y%m%d.%f
            calendar: proleptic_gregorian

head(lon) 
      [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9]
[1,] -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0
[2,] -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9

For the 2013 file, lon is 'normal', like this:

File E:/ENSa.2013.PM25.yearlyrea.nc (NC_FORMAT_NETCDF4):

     1 variables (excluding dimension variables):
        float PM25[lon,lat,time]   (Chunking: [701,401,1])  
            long_name: PM25
            units: ug
            _FillValue: -999

     3 dimensions:
        lon  Size:701
            standard_name: longitude
            long_name: longitude
            units: degrees_east
            axis: X
        lat  Size:401
            standard_name: latitude
            long_name: latitude
            units: degrees_north
            axis: Y
        time  Size:8760   *** is unlimited ***
            standard_name: time
            long_name: time at end of period
            units: day as %Y%m%d.%f
            calendar: proleptic_gregorian

head(lon) 
[1] -25.0 -24.9 -24.8 -24.7 -24.6 -24.5

I use the following script:

library(raster)  # brick(), getZ() and extract(); reading NetCDF also needs the ncdf4 package

# brick() reads all layers (time slices) in the file
pm102013 <- brick("ENSa.2013.PM10.yearlyrea.nc", varname = "PM10")

# Get the date index from the file
idx <- getZ(pm102013)

# Set the coordinates and extract values for all time steps
coords <- matrix(c(-2.094278, -1.830583, -2.584482, -0.175269, -3.17625,
                   0.54797, -2.678731, -1.433611, -1.456944, -3.182186,    # longitudes
                   57.15736, 52.511722, 51.462839, 51.54421, 51.48178,
                   51.374264, 51.638094, 53.230583, 53.231722, 55.945589), # latitudes
                 ncol = 2)  # column 1 = longitude, column 2 = latitude
vals <- extract(pm102013, coords, df = TRUE)

# Merge dates and values and fix the data frame names
df.pm102013 <- data.frame(idx, t(vals)[-1, ])
rownames(df.pm102013) <- NULL
names(df.pm102013) <- c('date', 'UKA00399', 'UKA00479', 'UKA00494', 'UKA00259', 'UKA00217', 'UKA00553', 'UKA00515', 'UKA00530', 'UKA00529', 'UKA00454')

# Write the data frame to a text file
options(max.print = 100000000)
sink("PM10_2013.txt")
print(df.pm102013)
sink()

Does anyone know a way to 'fix' the lon/lat problem? Or is there another efficient way to extract and average the data?

Answer 1

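You can extract the grid point nearest to a lon/lat location and compute the daily mean using CDO from the bash command line, by chaining the remapnn and daymean operators.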


The minus sign in front of remapnn means that its result is piped to the daymean command. For each required point, you can put this in a bash loop.
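
As a rough illustration of the loop described above, here is a minimal sketch. It assumes CDO is installed, that the input file is in the working directory, and that CDO can interpret the file's lon/lat variables as the grid of PM10 (this may require a `coordinates` attribute on the variable). The `<id>_daily.nc` output naming and the `build_cmd` helper are hypothetical, and the `lon=.../lat=...` point syntax should be checked against your CDO version. If CDO or the file is missing, the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: nearest-neighbour extraction plus daily mean, one output file per station.
# Assumptions: CDO is installed, the input file is in the working directory,
# and CDO can read the file's lon/lat variables as the grid of PM10.

infile="ENSa.2012.PM10.yearlyrea_.nc"   # file name taken from the question

# Print the chained CDO call for one point.
# Arguments: $1 = station id, $2 = longitude, $3 = latitude.
# The "-" in front of remapnn pipes its result into daymean.
# The "<id>_daily.nc" output naming is a hypothetical choice.
build_cmd() {
    printf 'cdo daymean -remapnn,lon=%s/lat=%s %s %s_daily.nc\n' \
        "$2" "$3" "$infile" "$1"
}

# Station id, longitude, latitude (first three points from the question).
while read -r id lon lat; do
    cmd=$(build_cmd "$id" "$lon" "$lat")
    if command -v cdo >/dev/null 2>&1 && [ -f "$infile" ]; then
        $cmd            # run the command (relies on names without spaces)
    else
        echo "$cmd"     # dry run: only show what would be executed
    fi
done <<EOF
UKA00399 -2.094278 57.15736
UKA00479 -1.830583 52.511722
UKA00494 -2.584482 51.462839
EOF
```

Each output file would then hold one daily-mean time series for the grid cell nearest to that station. The remaining stations from the question can be added to the list in the same id, longitude, latitude order.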

