Finding the mean of space-delimited text files in R

Time: 2016-09-25 21:07:23

Tags: r

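I have files named Reg.stt in 30 different directories, each containing numeric data of integer and float types. I want to average the data into another file. I don't know how to write code in R, but from what I've found, writing a script like this in R should be easy. The files are structured as shown below; the number of columns and rows can vary.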

OS: Ubuntu

Folder structure: /Desktop/folder1/Reg.stt, /Desktop/folder2/Reg.stt ... /Desktop/folder30/Reg.stt

0 0.3857 0.7942 0.0000 12.418 3.626 4 2 12 4 0.3857 0.7942 0.0000 12.418 3.626 4 2 12 4 505 
1 0 0.4269 0.8726 0.0000 11.146 3.730 19 5 8 3 0.4063 0.8726 0.0000 11.782 3.678 19 5 12 4 584 
2 0 0.4427 0.8442 0.0000 11.388 4.014 19 5 15 6 0.4184 0.8726 0.0000 11.651 3.790 19 5 12 4 561 
3 0 0.4472 0.8718 0.0000 11.928 4.134 16 5 23 6 0.4256 0.8726 0.0000 11.720 3.876 19 5 12 4 579 
4 0 0.4511 0.8893 0.0028 11.514 4.176 16 4 31 10 0.4307 0.8893 0.0000 11.679 3.936 16 4 12 4 583 
5 0 0.4546 0.8193 0.0000 11.362 4.204 6 2 6 3 0.4347 0.8893 0.0000 11.626 3.981 16 4 12 4 566 

1 Answer:

Answer 0 (score: 0)

Here is one way to do it. I have commented the code.

# Create a sub-folder into which we create files.
dir.create("temp_dir")
setwd("temp_dir")

# Create some files in temp_dir.
sapply(1:10, FUN = function(i) {
  xy <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
  write.table(xy, file = sprintf("filename_%s.txt", i), row.names = FALSE, col.names = TRUE)
})

# Find the relevant files. Use `pattern` to select them if the folder also
# contains other files.
all.files <- list.files(pattern = "filename_")

# Read in all files. Do not simplify the result, because we need it "raw". The
# result is a list of data.frames.
all.dfs <- sapply(all.files, FUN = read.table, header = TRUE, simplify = FALSE)

# Calculate the grand mean of each data.frame (the mean of its column means).
all.means <- lapply(all.dfs, FUN = function(x) mean(sapply(x, mean)))

# Coerce from a list to a vector.
one.mean <- as.numeric(all.means)

# Calculate the overall mean across all files.
mean(one.mean)

# Clean up this example.
setwd("../")
unlink("temp_dir", recursive = TRUE)  # recursive = TRUE is needed to delete a directory
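
The example above reduces everything to a single number. If what is wanted is instead a file in which each cell is the average of that cell across all the Reg.stt files, something along these lines might work. This is only a sketch under assumptions: it assumes every file is a rectangular, header-less, whitespace-separated table and that all files have the same number of rows and columns. `average_tables`, the demo folder under `tempdir()`, and the output name `Reg_mean.txt` are made up for illustration.

```r
# Hypothetical helper: element-wise mean over a set of same-shaped tables.
average_tables <- function(paths) {
  # Read each file as a numeric matrix (assumes no header line).
  tables <- lapply(paths, function(p) as.matrix(read.table(p, header = FALSE)))

  # Element-wise averaging only makes sense if all files have the same shape.
  shapes <- vapply(tables, function(m) paste(dim(m), collapse = "x"), character(1))
  if (length(unique(shapes)) != 1) {
    stop("Files differ in shape: ", paste(unique(shapes), collapse = ", "))
  }

  # Sum the matrices cell by cell, then divide by the number of files.
  Reduce(`+`, tables) / length(tables)
}

# Build a small demo tree that mimics folder1/Reg.stt, folder2/Reg.stt, ...
root <- file.path(tempdir(), "reg_demo")
for (i in 1:3) {
  d <- file.path(root, sprintf("folder%d", i))
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  write.table(matrix(i, nrow = 2, ncol = 3), file.path(d, "Reg.stt"),
              row.names = FALSE, col.names = FALSE)
}

# Find every Reg.stt below root, average them, and write the result out.
paths <- list.files(root, pattern = "^Reg\\.stt$", recursive = TRUE,
                    full.names = TRUE)
avg <- average_tables(paths)
write.table(avg, file.path(root, "Reg_mean.txt"),
            row.names = FALSE, col.names = FALSE)
```

For the layout in the question, `root` would point at the directory that holds folder1 ... folder30 and the demo loop would be dropped. If the files really do differ in shape, the helper stops with an error rather than guessing how to align them.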