Automating "ncvar_get": reading different variable names from NetCDF files

Time: 2018-03-02 11:37:37

Tags: r if-statement automation netcdf4

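I have a NetCDF dataset with two climate scenarios (rcp and hist), each containing 25 files. Each file holds data for one of the variables "pr", "tas", "tasmax" or "tasmin". I wrote a for loop that iterates over the hist and rcp files, opens them with nc_open, extracts the variable with ncvar_get, and finally computes mean(abs(hist - rcp)) to get the mean absolute distance between each hist/rcp pair. The problem: because ncvar_get needs the exact variable name of the current file, I wrote an if/else block (see below) that is supposed to find the variable name of the current file and pass it to ncvar_get. Running the code gives the following error: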

[1] "vobjtovarid4: error #F: I could not find the requsted var (or dimvar) in the file!"

[1] "var (or dimvar) name: tas"

[1] "file name: /data/historical/tasmax_ICHEC-EC-EARTH_DMI-HIRHAM5_r3i1p1.nc"

Error in vobjtovarid4(nc, varid, verbose = verbose, allowdimvar = TRUE) : Variable not found



 #Extract of the files in the hist list. Same file names in the rcp list, but different directory

    > hist.files.cl <- list.files("/historical", full.names = TRUE)

    > hist.files.cl

 [1] "/historical/pr_CNRM-CERFACS-CNRM-CM5_ALADIN53_r1i1p1.nc"           
 [2] "/historical/pr_CNRM-CERFACS-CNRM-CM5_ALARO-0_r1i1p1.nc"            
 [3] "/historical/pr_ICHEC-EC-EARTH_HIRHAM5_r3i1p1.nc"                   
 [4] "/historical/pr_ICHEC-EC-EARTH_RACMO22E_r12i1p1.nc"                 
 [5] "/historical/pr_ICHEC-EC-EARTH_RCA4_r12i1p1.nc"                     
 [6] "/historical/pr_MPI-M-MPI-ESM-LR_RCA4_r1i1p1.nc"                    
 [7] "/historical/pr_MPI-M-MPI-ESM-LR_REMO2009_r1i1p1.nc"                
 [8] "/historical/pr_MPI-M-MPI-ESM-LR_REMO2009_r2i1p1.nc"                
 [9] "/historical/tas_CNRM-CERFACS-CNRM-CM5_CNRM-ALADIN53_r1i1p1.nc"     
[10] "/historical/tas_CNRM-CERFACS-CNRM-CM5_RMIB-UGent-ALARO-0_r1i1p1.nc"
[11] "/historical/tas_ICHEC-EC-EARTH_DMI-HIRHAM5_r3i1p1.nc"              
[12] "/historical/tas_ICHEC-EC-EARTH_KNMI-RACMO22E_r12i1p1.nc"           
[13] "/historical/tas_ICHEC-EC-EARTH_SMHI-RCA4_r12i1p1.nc"               
[14] "/historical/tas_MPI-M-MPI-ESM-LR_MPI-CSC-REMO2009_r1i1p1.nc"       
[15] "/historical/tas_MPI-M-MPI-ESM-LR_MPI-CSC-REMO2009_r2i1p1.nc"       
[16] "/historical/tasmax_ICHEC-EC-EARTH_DMI-HIRHAM5_r3i1p1.nc"           
[17] "/historical/tasmax_ICHEC-EC-EARTH_KNMI-RACMO22E_r12i1p1.nc"        
[18] "/historical/tasmax_ICHEC-EC-EARTH_SMHI-RCA4_r12i1p1.nc"        


euc.distance <- list()


for(i in 1:length(hist.files.cl)) {

#Open ith file in list of hist files as well as in list of rcp files

  hist.data <- nc_open(hist.files.cl[i])   
  rcp.data <- nc_open(rcp.files.cl[i])

  if(grepl("pr", hist.data$filename)){
    hist.var <- ncvar_get(hist.data, "pr")
    rcp.var <- ncvar_get(rcp.data, "pr")
    }else if (grepl("tas", hist.data$filename)){
    hist.var <- ncvar_get(hist.data, "tas")
    rcp.var <- ncvar_get(rcp.data, "tas")
    }else if (grepl("tasmax", hist.data$filename)){
    hist.var <- ncvar_get(hist.data, "tasmax")
    rcp.var <- ncvar_get(rcp.data, "tasmax")
    }else{
    hist.var <- ncvar_get(hist.data, "tasmin")
    rcp.var <- ncvar_get(rcp.data, "tasmin")
    }
 #Converting temperature variable from K to °C: 

  if(grepl("tas", hist.data$filename)){

    hist.var <- hist.var-273.15
    rcp.var <- rcp.var-273.15
  }

 #Find for the ith rcp file with dim=(1,1,360) in the ith hist file with dim=(385,373,360) the grid point with the best fitting distribution (each grid point consists of a distribution of 360 time steps).The calculation may contain errors...

  euc.distance[[i]] <- apply(hist.var, c(1,2), function(x) mean(abs(rcp.var - x)))
  min_values <-  which(rank(euc.distance[[i]], ties.method='min') <= 10)
}

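cath pointed out the likely cause of the error, but the suggested way of extracting the part of interest (= the variable name) from the file name did not work for me. Earlier I tried to extract the variable name automatically with stringr("filename", startposition, endposition), until I noticed that this makes no sense because the variable names (pr, tas, tasmax, tasmin) differ in length. Do you have any suggestions? Thanks a lot!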
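The error message is consistent with the order of the branches in the if/else chain above: grepl does unanchored substring matching, so the pattern "tas" is also found in a file named tasmax_..., and that branch fires before the "tasmax" branch can be reached. A minimal sketch with a plain string (no NetCDF file needed; the file name is copied from the listing above, and the anchored patterns are only one possible way to separate the names):

```r
# File name modelled on entry [16] of the listing in the question
f <- "/historical/tasmax_ICHEC-EC-EARTH_DMI-HIRHAM5_r3i1p1.nc"

# Unanchored substring matching: "tas" is found inside "tasmax"
grepl("tas", f)      # TRUE
grepl("tasmax", f)   # TRUE, but an earlier `else if` on "tas" wins

# Anchoring on the base name and including the underscore
# distinguishes the variable names from each other
tas_hit    <- grepl("^tas_", basename(f))     # FALSE
tasmax_hit <- grepl("^tasmax_", basename(f))  # TRUE
```

The same caveat probably applies to the "pr" test, which would match any path that happens to contain the letters "pr" anywhere.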

1 Answer:

Answer 0: (score: 3)

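To complete my comment: if you need to perform the operation on every file, you can do it in one go and collect everything in a list.

So, first get the "keypart" of each file: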

keyparts <- sub("^([a-z]+)_.+", "\\1", basename(hist.files.cl))
keyparts
# [1] "pr"     "pr"     "pr"     "pr"     "pr"     "pr"     "pr"     "pr"    
# [9] "tas"    "tas"    "tas"    "tas"    "tas"    "tas"    "tas"    "tasmax"
#[17] "tasmax" "tasmax"
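Because the next step pairs the i-th hist file with the i-th rcp file, a quick sanity check before looping may be worthwhile. The following is only a sketch: the three file names are taken from the listing in the question, while the rcp directory name "/rcp" and the object names hist.files, rcp.files and expected are assumptions made for illustration.

```r
# Three file names from the listing in the question
hist.files <- c("/historical/pr_MPI-M-MPI-ESM-LR_RCA4_r1i1p1.nc",
                "/historical/tas_ICHEC-EC-EARTH_DMI-HIRHAM5_r3i1p1.nc",
                "/historical/tasmax_ICHEC-EC-EARTH_DMI-HIRHAM5_r3i1p1.nc")

# Assumption: the rcp files live in a directory called "/rcp" and
# have the same base names (the question only says "different directory")
rcp.files <- file.path("/rcp", basename(hist.files))

# Same extraction as above: everything before the first underscore
keyparts <- sub("^([a-z]+)_.+", "\\1", basename(hist.files))

# Every key part should be one of the four known variable names
expected <- c("pr", "tas", "tasmax", "tasmin")
stopifnot(all(keyparts %in% expected))

# The two lists should line up file by file
stopifnot(identical(basename(hist.files), basename(rcp.files)))
```

If either check fails, the loop would read the wrong variable or compare files that do not belong together, so stopping early seems preferable to a cryptic ncvar_get error.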

Then you can use lapply to perform the operation for every file in one pass:

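library(ncdf4)  # provides nc_open(), ncvar_get(), nc_close()

my_res <- lapply(seq_along(keyparts), 
                 function(i){

         hist.data <- nc_open(hist.files.cl[i])   
         rcp.data <- nc_open(rcp.files.cl[i])

         # keyparts[i] is exactly the variable name stored in the file
         hist.var <- ncvar_get(hist.data, keyparts[i])
         rcp.var <- ncvar_get(rcp.data, keyparts[i])

         # release the file handles once the data are in memory
         nc_close(hist.data)
         nc_close(rcp.data)

         # convert all temperature variables (tas, tasmax, tasmin) from K to °C
         if(startsWith(keyparts[i], "tas")){
           hist.var <- hist.var-273.15
           rcp.var <- rcp.var-273.15
         }

         # mean absolute difference between the rcp series and each hist grid point
         euc.distance <- apply(hist.var, c(1,2), function(x) mean(abs(rcp.var - x)))
         # linear indices of the 10 grid points with the smallest distance
         min.values <- which(rank(euc.distance, ties.method='min') <= 10)

         return(list(euc.distance=euc.distance, min.values=min.values))

                   })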
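To get a feel for what the apply() and rank() steps return, they can be tried on small synthetic arrays instead of real NetCDF data. This is a sketch under stated assumptions: the 4 x 3 x 5 array stands in for the 385 x 373 x 360 hist array, the rcp series is a plain vector (as it should be after ncvar_get drops the length-1 dimensions by default), and the names dist.mat, best and top3 are made up for the example.

```r
set.seed(1)

# Small stand-in for the hist array: 4 x 3 grid points, 5 time steps
hist.var <- array(rnorm(4 * 3 * 5), dim = c(4, 3, 5))

# Make the "rcp" series identical to grid point (2, 3),
# so the best match is known in advance
rcp.var <- hist.var[2, 3, ]

# One mean absolute difference per grid point: a 4 x 3 matrix
dist.mat <- apply(hist.var, c(1, 2), function(x) mean(abs(rcp.var - x)))

# Row/column position of the best-matching grid point
best <- which(dist.mat == min(dist.mat), arr.ind = TRUE)

# rank() returns a plain vector, so which() yields linear indices;
# arrayInd() converts them back to row/column positions
top3.idx <- which(rank(dist.mat, ties.method = "min") <= 3)
top3 <- arrayInd(top3.idx, dim(dist.mat))
```

Here best points at row 2, column 3, and top3 has one row per selected grid point. The same arrayInd() conversion could be applied to the linear indices returned for each file if grid positions rather than linear indices are needed.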