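I'm new to R and am trying to run some correlation analyses on multiple sets of data. I can do the analysis, but I'm trying to figure out how to output the results. I'd like output like the following: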
NAME,COR1,COR2
....,....,....
....,....,....
If I could write a file like this, I could post-process it as needed. My processing script looks like this:
run_analysis <- function(logfile, name)
{
preds <- read.table(logfile, header=T, sep=",")
# do something with the data: create some_col, another_col, etc.
result1 <- cor(some_col, another_col)
result2 <- cor(some_col2, another_col2)
# somehow output name,result1,result2 to a CSV file
}
args <- commandArgs(trailingOnly = TRUE)
date <- args[1]
basepath <- args[2]
logbase <- paste(basepath, date, sep="/")
logfile_pattern <- paste( "*", date, "csv", sep=".")
logfiles <- list.files(path=logbase, pattern=logfile_pattern)
for (f in logfiles) {
name = unlist(strsplit(f,"\\."))[1]
logfile = paste(logbase, f, sep="/")
run_analysis(logfile, name)
}
Is there an easy way to create an empty data frame and then add data to it row by row?
Answer 0 (score: 4)
Have you looked at the functions in R for writing data to files? For example, write.csv. Perhaps something like this:
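rs <- data.frame(name = name, COR1 = result1, COR2 = result2)
# write.csv() ignores append (with a warning); write.table() honors it.
write.table(rs, "path/to/file", sep = ",", append = TRUE,
            col.names = FALSE, row.names = FALSE)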
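As a rough sketch of the append approach, the file can be given a header on the first write and plain rows afterwards. The helper name append_result, the sample values, and the temporary output path below are made up for illustration; they are not part of the original script.

```r
# Hypothetical helper: append one result row to a CSV file,
# writing the header only if the file does not exist yet.
append_result <- function(path, name, result1, result2) {
  rs <- data.frame(NAME = name, COR1 = result1, COR2 = result2)
  is_new <- !file.exists(path)
  write.table(rs, path, sep = ",", append = !is_new,
              col.names = is_new, row.names = FALSE, quote = FALSE)
}

out <- tempfile(fileext = ".csv")  # illustrative output path
append_result(out, "a", 0.5, -0.25)
append_result(out, "b", 0.75, 0.1)
res <- read.csv(out)               # read back: one header line, two rows
```

Calling such a helper from inside run_analysis would avoid holding all results in memory, at the cost of reopening the file for every row.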
Answer 1 (score: 2)
I like to use the foreach library for this kind of thing:
library(foreach)
run_analysis <- function(logfile, name) {
preds <- read.table(logfile, header=T, sep=",")
# do something with the data: create some_col, another_col, etc.
result1 <- cor(some_col, another_col)
result2 <- cor(some_col2, another_col2)
# Return one row of results.
data.frame(name=name, cor1=result1, cor2=result2)
}
args <- commandArgs(trailingOnly = TRUE)
date <- args[1]
basepath <- args[2]
logbase <- paste(basepath, date, sep="/")
logfile_pattern <- paste( "*", date, "csv", sep=".")
logfiles <- list.files(path=logbase, pattern=logfile_pattern)
## Collect results from run_analysis into a table, by rows.
dat <- foreach (f=logfiles, .combine="rbind") %do% {
name = unlist(strsplit(f,"\\."))[1]
logfile = paste(logbase, f, sep="/")
run_analysis(logfile, name)
}
## Write output.
write.csv(dat, "output.dat", quote=FALSE, row.names=FALSE)
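What this does is generate one row of output on each call to run_analysis and bind the rows into a table called dat (the .combine="rbind" part of the call to foreach causes the row binding). Then you can just use write.csv to get the desired output.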
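For comparison, the same collect-then-write pattern can be sketched in base R without foreach, using lapply and do.call(rbind, ...). This is an alternative, not the code from the answer above: the sample files, the column names x and y, and the single cor1 column are assumptions made so the example is self-contained.

```r
# Illustrative data: two small log files in a temporary directory.
logbase <- file.path(tempdir(), "cor_demo")
dir.create(logbase, showWarnings = FALSE)
write.csv(data.frame(x = 1:5, y = c(2, 4, 6, 8, 10)),
          file.path(logbase, "alpha.2020.csv"), row.names = FALSE)
write.csv(data.frame(x = 1:5, y = c(5, 4, 3, 2, 1)),
          file.path(logbase, "beta.2020.csv"), row.names = FALSE)

# Return one row of results per file (x and y are assumed column names).
run_analysis <- function(logfile, name) {
  preds <- read.csv(logfile)
  data.frame(name = name, cor1 = cor(preds$x, preds$y))
}

# list.files() expects a regular expression; glob2rx() converts a glob.
logfiles <- list.files(logbase, pattern = glob2rx("*.2020.csv"))
rows <- lapply(logfiles, function(f) {
  name <- strsplit(f, ".", fixed = TRUE)[[1]][1]
  run_analysis(file.path(logbase, f), name)
})

# Row-bind the per-file results, much like .combine = "rbind" in foreach.
dat <- do.call(rbind, rows)
write.csv(dat, file.path(logbase, "output.dat"),
          row.names = FALSE, quote = FALSE)
```

Building the rows as a list and binding once at the end is usually cheaper than growing a data frame with rbind inside a loop.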