R: merging tabulated data from different files

Time: 2016-09-27 13:18:29

Tags: r dataframe merge

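I am using R to analyze several experiments whose results are stored in multiple CSV files. I run table() to tabulate the data and get results like the following: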

Tabulations of Combination1.csv
A  1000
B  50
C  200            
Tabulations of Combination2.csv
A 25
B 1500
D 30
Tabulations of Combination3.csv
B 19
C 500
E 2000

I would like to build a single table that combines these tabulations:

Combination A     B     C     D    E
c1          1000   50    200   N/A  N/A
c2          25    1500   N/A   30   N/A
c3          N/A    19    500   N/A  2000    

2 answers:

Answer 0: (score: 1)

Here is how I would do it using tidyr and plyr.

Data

c1 <- rep(LETTERS[1:3], c(1000, 50, 200))
c2 <- rep(LETTERS[c(1:2, 4)], c(25, 1500, 30))
c3 <- rep(LETTERS[c(2:3, 5)], c(19, 500, 2000))

Code

library(tidyr)
library(plyr)
allC <- list(c1 = c1, c2 = c2, c3 = c3)
# get all tables in data.frame format
ldply(names(allC), function(x) {
   tab <- table(allC[[x]]) 
   data.frame(Combination = x, element = names(tab), Freq = c(tab))
}) %>% spread(element, Freq)

#   Combination    A    B   C  D    E
# 1          c1 1000   50 200 NA   NA
# 2          c2   25 1500  NA 30   NA
# 3          c3   NA   19 500 NA 2000

Explanation

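First, convert each table to a data.frame and attach the name of the combination it came from. Then use spread to widen the values, so that each element gets its own column and missing counts become NA.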
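The two steps described above (stack the tables in long form, then widen) can also be sketched without plyr. This is only an illustration on the same simulated data: the long table is built with base R, and the widening uses tidyr::pivot_wider, which superseded spread in tidyr 1.0.0. The names long and wide are chosen here for the example and are not from the original answer.

```r
# Simulated data, same as in the answer above
c1 <- rep(LETTERS[1:3], c(1000, 50, 200))
c2 <- rep(LETTERS[c(1:2, 4)], c(25, 1500, 30))
c3 <- rep(LETTERS[c(2:3, 5)], c(19, 500, 2000))
allC <- list(c1 = c1, c2 = c2, c3 = c3)

# Step 1: one long data frame, one row per (Combination, element) count
long <- do.call(rbind, lapply(names(allC), function(nm) {
  tab <- table(allC[[nm]])
  data.frame(Combination = nm,
             element = names(tab),
             Freq = as.vector(tab),
             stringsAsFactors = FALSE)
}))

# Step 2: widen, one column per element; absent pairs are filled with NA
wide <- tidyr::pivot_wider(long, names_from = element, values_from = Freq)
print(wide)
```

With real files, allC could presumably be built along the lines of lapply(files, function(f) read.csv(f)[[1]]), assuming the values to tabulate sit in the first column of each CSV. That layout is an assumption, since the question does not show the file contents.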

Answer 1: (score: 0)

library(dplyr)
library(tidyr)

x <- table(c(rep("A", 1000), rep("B", 50), rep("C", 200)))
y <- table(c(rep("A", 25), rep("B", 1500), rep("D", 30)))
z <- table(c(rep("B", 19), rep("C", 500), rep("E", 2000)))

X <- data.frame(x) %>% spread(Var1, Freq)
Y <- data.frame(y) %>% spread(Var1, Freq)
Z <- data.frame(z) %>% spread(Var1, Freq)

X %>% full_join(Y) %>% full_join(Z) %>%
  mutate(Combination = paste0("c", seq(1,3)))

Result:

> X %>% full_join(Y) %>% full_join(Z) %>%
+ mutate(Combination = paste0("c", seq(1,3)))
Joining, by = c("A", "B")
Joining, by = c("B", "C")
     A    B   C  D    E Combination
1 1000   50 200 NA   NA          c1
2   25 1500  NA 30   NA          c2
3   NA   19 500 NA 2000          c3

Next time, please consider providing the x, y and z objects yourself, so that the example is reproducible :)
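One caveat about the joins in this answer: as the "Joining, by" messages show, full_join without a by argument matches on the shared count columns (A and B, then B and C). The rows stay separate only because no two files happen to have identical counts there, and the Combination labels depend on the resulting row order. A minimal sketch of an alternative is to stack the one-row tables with dplyr::bind_rows, which fills missing columns with NA and can label rows through its .id argument. The helper as_row and the name combined are hypothetical, introduced only for this illustration.

```r
library(dplyr)

x <- table(c(rep("A", 1000), rep("B", 50), rep("C", 200)))
y <- table(c(rep("A", 25), rep("B", 1500), rep("D", 30)))
z <- table(c(rep("B", 19), rep("C", 500), rep("E", 2000)))

# Hypothetical helper: turn a 1-d table into a one-row data frame
as_row <- function(tab) {
  as.data.frame(as.list(setNames(as.vector(tab), names(tab))))
}

# Stack the rows; the list names become the Combination column
combined <- bind_rows(
  list(c1 = as_row(x), c2 = as_row(y), c3 = as_row(z)),
  .id = "Combination"
)
print(combined)
```

Because nothing is matched on values, two files with the same counts would still produce two rows.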