Question

I have over 7,000 *.bil files that I am trying to merge into a single *.csv file and export. I can read a *.bil file using raster and as.data.frame:

setwd("/.../Prism Weather Data All/")
filenames <- list.files(path = "/.../Prism Weather Data All/", pattern = ".bil")
r = raster("PRISM_ppt_stable_4kmM2_189501_bil.bil")
test <- as.data.frame(r, na.rm=TRUE)

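This sets the working directory and grabs every file matching *.bil. So far I have only rasterized a single file and converted it with as.data.frame to verify that it reads correctly, which works perfectly. What I am trying to figure out is how to merge all 7,000 files (filenames) into one.

Any help with this would be greatly appreciated. Thanks in advance.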

Answer 1

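Assuming 7,000 is the exact count rather than an approximation, and that the data in every file has the same structure (the same number of columns and rows):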

library(raster)
setwd("/.../Prism Weather Data All/")

# match only names ending in .bil (a bare ".bil" also matches names that merely contain it)
filenames <- list.files(path = "/.../Prism Weather Data All/", pattern = "\\.bil$")

# take the dimensions from the first file (assuming they're all the same)
first <- as.data.frame(raster(filenames[1]), na.rm=TRUE)
nc <- ncol(first)  ## number of columns of each file
nr <- nrow(first)  ## number of rows of each file

# initialize what is likely to be a large object
final.df <- as.data.frame(matrix(NA, ncol=length(filenames)*nc, nrow=nr))
counter <- 1
# loop through the files
for (i in filenames){
    r <- raster(i)
    test <- as.data.frame(r, na.rm=TRUE)
    # parentheses matter: counter:counter+nc parses as (counter:counter)+nc
    final.df[, counter:(counter+nc-1)] <- test
    counter <- counter+nc
}

# write the csv
write.csv(final.df, "final-filename.csv")

Keep in mind that your computer must have enough memory to hold all of the data, because R needs to keep objects in memory.
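
As a rough sanity check before running the loop, the size of the combined object can be estimated from the grid dimensions. The sketch below assumes the 4 km PRISM grid is 1405 columns by 621 rows and that values are stored as 8-byte doubles; both are assumptions to check against your own files (for example with ncol(r) and nrow(r)), and the helper name estimate_gib is made up for illustration.

```r
# Hypothetical helper: approximate memory, in GiB, for a numeric table
# holding n_cells values for each of n_files files (8 bytes per double).
estimate_gib <- function(n_cells, n_files, bytes_per_value = 8) {
  n_cells * n_files * bytes_per_value / 1024^3
}

# Assumed grid size; replace with ncol(r) * nrow(r) from one of your rasters.
n_cells <- 1405 * 621
gib <- estimate_gib(n_cells, 7000)
print(round(gib, 1))
```

Under those assumptions the estimate comes out at roughly 45 GiB before any NA cells are dropped, so na.rm=TRUE and processing the files in batches may both be worth considering.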

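If the number of columns varies from file to file, you can handle it by adjusting the indices in the final.df assignment inside the loop and incrementing counter accordingly.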
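
One way to do that bookkeeping, sketched here with made-up column counts rather than real files, is to compute each file's first and last column up front from the per-file widths. The widths vector and the filled-in values are purely illustrative.

```r
# Illustrative column counts for three files (not taken from real data).
widths <- c(1, 2, 1)

# First column of each block: 1, then shifted by the widths before it.
starts <- cumsum(c(1, head(widths, -1)))
# Last column of each block.
ends <- starts + widths - 1

# Fill a small table block by block using the computed ranges.
final.df <- as.data.frame(matrix(NA, nrow = 2, ncol = sum(widths)))
for (k in seq_along(widths)) {
  block <- as.data.frame(matrix(k, nrow = 2, ncol = widths[k]))
  final.df[, starts[k]:ends[k]] <- block
}
print(final.df)
```

Computing the ranges ahead of time also gives the total width, sum(widths), which is what the preallocated table needs.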

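Edit: to produce the expected result

I think a for loop is about the only way to do this kind of work. In fact, 7,000 files is a very large set, so expect to spend some time watching it iterate.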

library(raster)
setwd("/.../Prism Weather Data All/")

filenames <- list.files(path = "/.../Prism Weather Data All/", pattern = "\\.bil$")

first <- as.data.frame(raster(filenames[1]), na.rm=TRUE)
nc <- ncol(first)  ## number of columns you expect the data in the files to have
## roughly the rows per file times the number of files (12 for a year of monthly data)
## PLUS some tolerance (10% here, an arbitrary choice), so you'll end up with
## an object actually larger than needed
nr <- ceiling(nrow(first) * length(filenames) * 1.1)

# initialize what is likely to be a large object
final.df <- as.data.frame(matrix(NA, ncol=nc, nrow=nr))
counter <- 1
# loop through the files
for (i in filenames){
    r <- raster(i)
    test <- as.data.frame(r, na.rm=TRUE)
    numrow2 <- nrow(test)
    # parentheses matter: counter:counter+numrow2 parses as (counter:counter)+numrow2
    final.df[counter:(counter+numrow2-1), ] <- test
    counter <- counter+numrow2
}

# remove the unused rows left over from the tolerance
final.df <- final.df[seq_len(counter-1), , drop=FALSE]

# write the csv
write.csv(final.df, "final-filename.csv")
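
A preallocated table filled by a loop is one option. Another, sketched below, is to read each file into its own data frame with lapply and bind everything once at the end, which avoids the index arithmetic. This is an alternative to the loop above, not the method of this answer. The names read_one and combine_long are hypothetical, the value and file columns are illustrative, and the real reader assumes the raster package is installed.

```r
# Hypothetical reader for one .bil file; only used when no other reader is given.
read_one <- function(path) {
  raster::as.data.frame(raster::raster(path), na.rm = TRUE)
}

# Read every path, tag each row with its source file, and stack the pieces.
combine_long <- function(paths, reader = read_one) {
  pieces <- lapply(paths, function(p) {
    d <- reader(p)
    names(d)[1] <- "value"   # illustrative column name
    d$file <- basename(p)    # remember which file the rows came from
    d
  })
  do.call(rbind, pieces)
}

# Demo with a stand-in reader so the sketch runs without any .bil files.
fake_reader <- function(path) data.frame(layer = c(1.5, 2.5))
demo <- combine_long(c("a_189501_bil.bil", "b_189502_bil.bil"),
                     reader = fake_reader)
print(dim(demo))
```

With real data the call would be combine_long(filenames), and the same memory caveat applies because every piece is held in memory before binding.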

Hope it helps.

Answer 2

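I have been working with PRISM data, and below is another approach. It applies if you can merge on each "site" or row name across the 7,000 .bil files. In that case, each month becomes a separate column, aligned to the same station ID / row.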

setwd("/.../Prism Weather Data All/")
library(raster)

#This makes sure only names ending in .bil are read (not names that merely contain .bil)

filenames <- dir("/.../Prism Weather Data All/", pattern = "\\.bil$")

#start empty; the first file creates the table
z <- NULL

#loop through the data, and name each column the name of the date in 
#the spreadsheet (according to Prism's naming convention, the date 
#starts at character 24 and ends at character 29)

for (file in filenames){
  r <- raster(file)  #read the current file, not the whole vector of names
  test <- as.data.frame(r, na.rm=TRUE)
  names(test) <- substring(file, 24, 29)
  z <- if (is.null(z)) test else cbind(z, test)
}

#then export the data.frame to CSV!
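
The export step itself is not shown above. A minimal sketch of it follows, using a tiny made-up table in place of z so that it runs without any PRISM files. The second file name, the values, and the temporary output path are all illustrative; the column naming reuses the substring(file, 24, 29) convention from the loop.

```r
# Made-up stand-ins for two monthly files and their cell values.
files <- c("PRISM_ppt_stable_4kmM2_189501_bil.bil",
           "PRISM_ppt_stable_4kmM2_189502_bil.bil")
values <- list(c(10.2, 0.0, 3.4), c(7.1, 1.3, 0.0))

z <- NULL
for (k in seq_along(files)) {
  test <- data.frame(values[[k]])
  names(test) <- substring(files[k], 24, 29)  # characters 24 to 29 hold the date
  z <- if (is.null(z)) test else cbind(z, test)
}

# Write to a temporary file; swap in your own path for real data.
out <- tempfile(fileext = ".csv")
write.csv(z, out, row.names = TRUE)

# Read it back, keeping the date-like column names as they are.
back <- read.csv(out, row.names = 1, check.names = FALSE)
print(names(back))
```

Passing check.names = FALSE when reading the file back matters here, because column names that start with a digit would otherwise be rewritten by read.csv.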

Merging multiple *.bil climate data files into a *.csv
