Combining CSV files in R into separate columns

Time: 2016-11-22 22:53:30

Tags: r csv

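I have more than 100 CSV files. Each file contains a Date and Time column (with exactly the same values in every file), followed by a second column that is different in each file. I would like the file name to be the column name.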

So basically I want to add one column to the data frame for each file.

What is the most efficient way to do this?

So far all I have done is list the CSV files in my folder, because everything I have found only seems to explain how to add the data as more rows, not as more columns.

1 answer:

Answer 0: (score: 0)

Here is one option. It is more or less hand-rolled, because I don't know of any single command that does exactly what you describe.

# the working directory contains d1.csv, d2.csv, d3.csv with
# matching first two columns. The third column is random data
read.csv("./d1.csv")
#>     a b         c
#> 1   1 A 0.5526777
#> 2   2 B 0.2161643
#> 3   3 C 0.3311132
#> 4   4 D 0.3577971
#> 5   5 E 0.2298579
#> 6   6 F 0.4014883
#> 7   7 G 0.2789038
#> 8   8 H 0.5729675
#> 9   9 I 0.3413949
#> 10 10 J 0.5807167

## identify the .csv files in the working directory
## (pattern is a regular expression, not a shell glob)
file_list <- dir(".", pattern = "\\.csv$")
file_list
#> [1] "d1.csv" "d2.csv" "d3.csv"

## for each of the .csv files, extract the base filename and
## create a new object with that name containing the data.
## Additionally, rename the third column to the full filename
## (including the .csv extension)
for (file in file_list) {
  f <- sub("(.*)\\.csv$", "\\1", file)
  assign(f, read.csv(file = file))
  assign(f, setNames(get(f), c(names(get(f))[1:2], file)))
}

## at this point, the objects d1, d2, and d3 have been created,
## containing their respective data. The third column of each of
## these is named after its originating file.
d1
#>     a b    d1.csv
#> 1   1 A 0.5526777
#> 2   2 B 0.2161643
#> 3   3 C 0.3311132
#> 4   4 D 0.3577971
#> 5   5 E 0.2298579
#> 6   6 F 0.4014883
#> 7   7 G 0.2789038
#> 8   8 H 0.5729675
#> 9   9 I 0.3413949
#> 10 10 J 0.5807167

## specify the names of the date and time columns (common between files)
date_col <- "a"
time_col <- "b"

## use Reduce to go through the list of created objects and 
## merge them together
list_of_objects <- mget(sub("(.*)\\.csv", "\\1", file_list))
combined_files <- Reduce(function(x, y) merge(x, y, by = c(date_col, time_col)), list_of_objects)

combined_files
#>     a b    d1.csv    d2.csv    d3.csv
#> 1  10 J 0.5807167 0.8181820 0.7073864
#> 2   1 A 0.5526777 0.3225574 0.3758595
#> 3   2 B 0.2161643 0.6933108 0.5654979
#> 4   3 C 0.3311132 0.9309869 0.1727413
#> 5   4 D 0.3577971 0.8810876 0.7802144
#> 6   5 E 0.2298579 0.1023579 0.9925649
#> 7   6 F 0.4014883 0.1328283 0.7610007
#> 8   7 G 0.2789038 0.2926512 0.7469455
#> 9   8 H 0.5729675 0.8727978 0.3073394
#> 10  9 I 0.3413949 0.3107775 0.4778286
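
The same read, rename, and merge steps can also be done without creating one object per file via assign() and get(). The following is only a sketch under assumptions: the helper name combine_csv_columns and its arguments dir_path and key_cols are hypothetical, and each file is assumed to contain exactly one column besides the key columns.

```r
# Hypothetical helper (not part of the original answer): read every .csv in
# dir_path, rename the single non-key column to the file's name, and merge
# all tables on the shared key columns.
combine_csv_columns <- function(dir_path = ".", key_cols = c("a", "b")) {
  files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)

  tables <- lapply(files, function(path) {
    df <- read.csv(path)
    value_col <- setdiff(names(df), key_cols)
    # Assumption: exactly one value column per file
    stopifnot(length(value_col) == 1)
    names(df)[names(df) == value_col] <- basename(path)
    df
  })

  # Merge the data frames pairwise on the key columns
  Reduce(function(x, y) merge(x, y, by = key_cols), tables)
}

# Usage, assuming the files live in the working directory:
# combined_files <- combine_csv_columns(".", key_cols = c("a", "b"))
```

Keeping the data frames in a list means nothing extra is written into the global environment, which may matter with 100+ files.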

If there is a particular aspect you don't understand, let me know and I will expand the comments.
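
Since the question states that the Date and Time values are exactly the same in every file, a merge may not be needed at all: the value columns could simply be bound side by side. This is a sketch under that assumption, not a tested solution for the asker's data; the helper name bind_csv_columns and its arguments are hypothetical, and the code stops with an error if the key columns of any file differ from those of the first.

```r
# Hypothetical helper: assumes every file has the same rows in the same
# order and exactly one column besides the key columns.
bind_csv_columns <- function(dir_path = ".", key_cols = c("a", "b")) {
  files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
  tables <- lapply(files, read.csv)

  # Take the key columns from the first file and verify the others match
  keys <- tables[[1]][key_cols]
  same_keys <- vapply(
    tables,
    function(df) identical(df[key_cols], keys),
    logical(1)
  )
  stopifnot(all(same_keys))

  # Pull the single value column out of each file, named by its file
  values <- lapply(tables, function(df) {
    value_col <- setdiff(names(df), key_cols)
    stopifnot(length(value_col) == 1)
    df[[value_col]]
  })
  names(values) <- basename(files)

  # check.names = FALSE keeps file names as column names unchanged
  data.frame(keys, values, check.names = FALSE)
}
```

Unlike merge(), this keeps the rows in their original file order, but it is only safe when the key columns really are identical across files.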