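I am using R to analyze several experiments whose results are stored in multiple CSV files. I run table() on each file to tabulate the data and get results like the following: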
Tabulations of Combination1.csv
A 1000
B 50
C 200
Tabulations of Combination2.csv
A 25
B 1500
D 30
Tabulations of Combination3.csv
B 19
C 500
E 2000
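I would like to build a single table that combines these tabulations, with one row per file and N/A wherever a value does not occur in that file: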
Combination A B C D E
c1 1000 50 200 N/A N/A
c2 25 1500 N/A 30 N/A
c3 N/A 19 500 N/A 2000
Answer 0 (score: 1)
Here is one way to do it using tidyr and plyr:
Data
c1 <- rep(LETTERS[1:3], c(1000, 50, 200))
c2 <- rep(LETTERS[c(1:2, 4)], c(25, 1500, 30))
c3 <- rep(LETTERS[c(2:3, 5)], c(19, 500, 2000))
Code
library(tidyr)  # spread(); also re-exports the %>% pipe
library(plyr)   # ldply()
allC <- list(c1 = c1, c2 = c2, c3 = c3)
# get all tables in data.frame format, one row per (Combination, element) pair
ldply(names(allC), function(x) {
  tab <- table(allC[[x]])
  data.frame(Combination = x, element = names(tab), Freq = c(tab))
}) %>% spread(element, Freq)
# Combination A B C D E
# 1 c1 1000 50 200 NA NA
# 2 c2 25 1500 NA 30 NA
# 3 c3 NA 19 500 NA 2000
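Explanation
First convert every table into a data.frame, tagging each row with the name of the list element (the combination) it came from. Then use spread to pivot the element values into columns; any value that does not occur in a combination is filled with NA.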
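As a side note, spread() has been superseded by pivot_wider() since tidyr 1.0.0. The following is a minimal sketch of the same reshape that drops the plyr dependency, assuming the same c1, c2 and c3 vectors as above. The names long and wide are only illustrative and are not part of the original answer.

```r
library(tidyr)

c1 <- rep(LETTERS[1:3], c(1000, 50, 200))
c2 <- rep(LETTERS[c(1:2, 4)], c(25, 1500, 30))
c3 <- rep(LETTERS[c(2:3, 5)], c(19, 500, 2000))
allC <- list(c1 = c1, c2 = c2, c3 = c3)

# Build one long data.frame: a row per (Combination, element) pair
long <- do.call(rbind, lapply(names(allC), function(x) {
  tab <- table(allC[[x]])
  data.frame(Combination = x,
             element = names(tab),
             Freq = as.vector(tab),
             stringsAsFactors = FALSE)
}))

# Pivot the element values into columns; pairs that never occur become NA
wide <- pivot_wider(long, names_from = element, values_from = Freq)
print(wide)
```

pivot_wider() returns a tibble; wrap the result in as.data.frame() if a plain data.frame is needed.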
Answer 1 (score: 0)
library(dplyr)
library(tidyr)
x <- table(c(rep("A", 1000), rep("B", 50), rep("C", 200)))
y <- table(c(rep("A", 25), rep("B", 1500), rep("D", 30)))
z <- table(c(rep("B", 19), rep("C", 500), rep("E", 2000)))
X <- data.frame(x) %>% spread(Var1, Freq)
Y <- data.frame(y) %>% spread(Var1, Freq)
Z <- data.frame(z) %>% spread(Var1, Freq)
X %>% full_join(Y) %>% full_join(Z) %>%
  mutate(Combination = paste0("c", seq(1, 3)))
Result:
> X %>% full_join(Y) %>% full_join(Z) %>%
+ mutate(Combination = paste0("c", seq(1,3)))
Joining, by = c("A", "B")
Joining, by = c("B", "C")
A B C D E Combination
1 1000 50 200 NA NA c1
2 25 1500 NA 30 NA c2
3 NA 19 500 NA 2000 c3
Next time, please consider providing the x, y and z objects so that the example is reproducible :)
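One caveat about the full_join chain in this answer: each join matches on whatever columns the two sides share (here A and B, then B and C), so two files that happened to have identical counts in those columns would be merged into a single row. Stacking the one-row tables avoids that. The following is a sketch using dplyr::bind_rows, assuming the same x, y and z tables; the names tabs, wide_rows and combined are hypothetical.

```r
library(dplyr)
library(tidyr)

x <- table(c(rep("A", 1000), rep("B", 50), rep("C", 200)))
y <- table(c(rep("A", 25), rep("B", 1500), rep("D", 30)))
z <- table(c(rep("B", 19), rep("C", 500), rep("E", 2000)))
tabs <- list(c1 = x, c2 = y, c3 = z)

# Turn each table into a one-row data.frame with one column per value
wide_rows <- lapply(tabs, function(tab) {
  spread(as.data.frame(tab, stringsAsFactors = FALSE), Var1, Freq)
})

# Stack the rows; columns missing from a table are filled with NA,
# and the list names become the Combination column
combined <- bind_rows(wide_rows, .id = "Combination")
print(combined)
```

Because rows are stacked rather than matched, the result always has one row per input table, regardless of the counts.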