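I need to run the same set of code on multiple CSV files, and I would like to automate this the way a macro would. Below is the code I am running, but the result is not right: it reads the data in a two-dimensional format, whereas I need it in a three-dimensional format.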
lf = list.files(path = "D:/THD/data", pattern = ".csv",
full.names = TRUE, recursive = TRUE, include.dirs = TRUE)
ds<-lapply(lf,read.table)
Answer 0 (score: 0)
I don't know whether this will be useful, but one of my approaches is:
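## Step 1: read the files
## "\\.csv$" matches only file names that end in ".csv"
mycsv <- dir(pattern = "\\.csv$")
n <- length(mycsv)
mylist <- vector("list", n)
## seq_len(n) is safe even when no files are found (n == 0)
for (i in seq_len(n)) mylist[[i]] <- read.csv(mycsv[i], header = TRUE)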
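The same read step can also be written with lapply, naming each list element after its file so the source of every data frame stays traceable. This is only a minimal sketch: the temporary directory, the two demo files, and the names demo_dir, csv_paths and csv_list are assumptions made up for illustration.

```r
# Hypothetical demo data: write two small CSV files into a temporary directory.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(a = 1:3, b = 4:6), file.path(demo_dir, "one.csv"), row.names = FALSE)
write.csv(data.frame(a = 7:9, b = 10:12), file.path(demo_dir, "two.csv"), row.names = FALSE)

# List only files whose names end in ".csv", keeping the full paths.
csv_paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list of data frames.
csv_list <- lapply(csv_paths, read.csv, header = TRUE)

# Name each element after its file so it can be looked up later.
names(csv_list) <- basename(csv_paths)

str(csv_list, max.level = 1)
```

An element can then be accessed by file name, for example csv_list[["one.csv"]].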
After that I really just use the apply functions to change things, for example:
## Change the column names: "type", "date", "v1" to "v24", "total"
mylist <- lapply(mylist, function(x) {names(x) <- c("type", "date", paste0("v", 1:24), "total") ; return(x)})
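## Change the type column to weekend ("we") / weekday ("wd")
## This assumes each file has 365 rows and that the first two rows fall on a weekend
mylist <- lapply(mylist, function(x) {
  f <- c("we", "we", "wd", "wd", "wd", "wd", "wd")
  x$type <- rep(f, length.out = 365)
  return(x)
})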
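If the date column holds real calendar dates, the weekday/weekend label could instead be derived from the dates themselves rather than from a fixed repeating pattern. The sketch below assumes ISO-formatted dates (YYYY-MM-DD), which the original files may not use; the data frame x here is a made-up stand-in for one element of mylist.

```r
# Hypothetical stand-in for one data frame, with ISO-formatted dates (an assumption).
x <- data.frame(date = c("2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"),
                v1 = c(10, 20, 30, 40))

# as.POSIXlt()$wday gives the day of the week as 0 (Sunday) to 6 (Saturday),
# independent of the locale.
day_of_week <- as.POSIXlt(as.Date(x$date))$wday

# Saturday (6) and Sunday (0) are weekend days, everything else is a weekday.
x$type <- ifelse(day_of_week %in% c(0, 6), "we", "wd")

print(x)
```

This avoids having to know on which day of the week the first row falls.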
and so on.
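Then, once all the changes are done, I save the files again with the code below (it splits the original file name and keeps parts of it with each file, so that I can track each individual file later):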
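## For example, some of my files had a pattern in the file name such as "201_E424220_N563500.csv", so I split the name and store its parts as new columns:
mylist <- lapply(seq_along(mylist), function(i) {
  mylist.i <- mylist[[i]]
  ## Drop the ".csv" extension first so the last part is "N563500" rather than "N563500.csv"
  s <- strsplit(tools::file_path_sans_ext(basename(mycsv[i])), "_", fixed = TRUE)[[1]]
  d <- cbind(mylist.i[, c("type", "date")], ID = s[1], Easting = s[2], Northing = s[3], mylist.i[, 3:ncol(mylist.i)])
  return(d)
})
## Write each data frame out; [[i]] extracts the data frame itself rather than a one-element list
for (i in seq_len(n))
  write.csv(mylist[[i]], file = paste0("file", i, ".csv"), row.names = FALSE)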
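As a variation on the save step, the output file name itself can carry the ID taken from the original file name, which makes each saved file easier to trace than a running number. This is a sketch under assumptions: the second file name, the "site_" prefix, the temporary output directory, and the names tables and out_dir are all hypothetical.

```r
# Hypothetical output directory inside the session's temporary folder.
out_dir <- file.path(tempdir(), "csv_out")
dir.create(out_dir, showWarnings = FALSE)

# Hypothetical list of data frames, named after the files they were read from.
tables <- list("201_E424220_N563500.csv" = data.frame(v1 = 1:2),
               "202_E424300_N563600.csv" = data.frame(v1 = 3:4))

for (nm in names(tables)) {
  # The first "_"-separated part of the original file name is the ID, e.g. "201".
  id <- strsplit(nm, "_", fixed = TRUE)[[1]][1]
  # Save under a name that contains the ID, e.g. "site_201.csv".
  write.csv(tables[[nm]], file.path(out_dir, paste0("site_", id, ".csv")),
            row.names = FALSE)
}

list.files(out_dir)
```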
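I hope this helps. When you get some time, please read about the plyr package; I believe it will be very useful for you, as it offers many options for data analysis. For lists, plyr provides the following apply-style functions: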
## l_ply split list, apply function and discard result
## ldply split list, apply function and return result in data frame
## laply split list, apply function and return result in an array
For example, you can use ldply to read all the CSV files and return a single data frame, something like:
library(plyr)
## Read every CSV file and row-bind the results into one data frame
data <- ldply(list.files(pattern = "\\.csv$"), function(fname) {
  j <- read.csv(fname, header = TRUE)
  return(j)
})
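So here data will be one data frame holding the rows of all your CSV files (j is only the data frame for a single file inside the function).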
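Since the question asks for a three-dimensional result, one possible base-R route is to turn each data frame into a matrix and stack the matrices along a third dimension. This is only a sketch, and it assumes that every file has the same number of rows and the same all-numeric columns; the names frames, mats and arr, and the random demo data, are made up for illustration.

```r
# Hypothetical stand-ins for data frames read from three CSV files with identical layout.
set.seed(1)
frames <- lapply(1:3, function(i) data.frame(v1 = rnorm(4), v2 = rnorm(4)))

# Convert each data frame to a numeric matrix.
mats <- lapply(frames, as.matrix)

# Stack the equally sized matrices into one array: rows x columns x files.
arr <- simplify2array(mats)

dim(arr)

# The slice for the second file.
arr[, , 2]
```

If the files differ in shape, or contain non-numeric columns such as type and date, they cannot be stacked this way and a list of data frames remains the safer structure.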
Thanks, 阿燕