Here is my list of data frames:
[[1]]
ID Value
A 1
B 1
C 1
[[2]]
ID Value
A 1
D 1
E 1
[[3]]
ID Value
B 1
C 1
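What I want is a single data frame with the unique (non-redundant) IDs in the left column, one value column per replicate, and NULL values set to 0: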
ID [1]Value [2]Value [3]Value
A 1 1 0
B 1 0 1
C 1 0 1
D 0 1 0
E 0 1 0
I have tried:
Reduce(function(x, y) merge(x, y, by=ID), datahere)
This gives a list, but it does not take into account where each original value came from, and it repeats duplicated IDs in new rows.
rbindlist(datahere, use.names=TRUE, fill=TRUE, idcol="Replicate")
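This gives a single list with the [x] value number as a new column named Replicate, but it is still not in the structure I want, because the ID column contains redundant entries.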
Answer 0: (score: 6)
Using dplyr / purrr:
library(tidyverse)
reduce(lst, full_join, by = "ID")
# ID Value.x Value.y Value
# 1 A 1 1 NA
# 2 B 1 NA 1
# 3 C 1 NA 1
# 4 D NA 1 NA
# 5 E NA 1 NA
Or replace the NAs with 0s:
reduce(lst, full_join, by = "ID") %>% replace(., is.na(.), 0)
# ID Value.x Value.y Value
#1 A 1 1 0
#2 B 1 0 1
#3 C 1 0 1
#4 D 0 1 0
#5 E 0 1 0
Sample data
options(stringsAsFactors = FALSE)
lst <- list(
data.frame(ID = c("A", "B", "C"), Value = c(1, 1, 1)),
data.frame(ID = c("A", "D", "E"), Value = c(1, 1, 1)),
data.frame(ID = c("B", "C"), Value = c(1, 1)))
Answer 1: (score: 1)
You already have a good answer, but the typical approach is to use tidyr::spread.
Your data
A <- data.frame(ID=LETTERS[1:3], Value=1, stringsAsFactors=FALSE)
B <- data.frame(ID=LETTERS[c(1,4,5)], Value=1, stringsAsFactors=FALSE)
C <- data.frame(ID=LETTERS[c(2:3)], Value=1, stringsAsFactors=FALSE)
L <- list(A, B, C)
Solution
library(dplyr)  # attaches the %>% pipe used below
dplyr::bind_rows(L, .id="G") %>%
tidyr::spread(G, Value, fill=0)
# ID 1 2 3
# 1 A 1 1 0
# 2 B 1 0 1
# 3 C 1 0 1
# 4 D 0 1 0
# 5 E 0 1 0
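In more recent tidyr releases, spread is documented as superseded by pivot_wider, so the same reshape can be sketched with that function. This is only an illustration, assuming tidyr 1.0 or later is installed; the "Value_" column prefix and the final arrange(ID) are my own choices, not part of the answer above.

```r
library(dplyr)
library(tidyr)

# Same three data frames as in "Your data" above.
L <- list(
  data.frame(ID = LETTERS[1:3], Value = 1, stringsAsFactors = FALSE),
  data.frame(ID = LETTERS[c(1, 4, 5)], Value = 1, stringsAsFactors = FALSE),
  data.frame(ID = LETTERS[2:3], Value = 1, stringsAsFactors = FALSE)
)

# Stack the frames, recording each frame's list position in column G,
# then give every G its own column and fill the missing combinations with 0.
wide <- bind_rows(L, .id = "G") %>%
  pivot_wider(
    names_from   = G,
    values_from  = Value,
    values_fill  = list(Value = 0),
    names_prefix = "Value_"   # assumed naming scheme: Value_1, Value_2, ...
  ) %>%
  arrange(ID)

wide
```

The prefix keeps the column names syntactic (Value_1 rather than a bare 1), which makes them easier to refer to later.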
Answer 2: (score: 1)
With base R, we need all = TRUE in merge:
res <- Reduce(function(...) merge(..., all = TRUE, by="ID"), lst)
replace(res, is.na(res), 0)
# ID Value.x Value.y Value
#1 A 1 1 0
#2 B 1 0 1
#3 C 1 0 1
#4 D 0 1 0
#5 E 0 1 0
Data
lst <- list(structure(list(ID = c("A", "B", "C"), Value = c(1, 1, 1)), .Names = c("ID",
"Value"), row.names = c(NA, -3L), class = "data.frame"), structure(list(
ID = c("A", "D", "E"), Value = c(1, 1, 1)), .Names = c("ID",
"Value"), row.names = c(NA, -3L), class = "data.frame"), structure(list(
ID = c("B", "C"), Value = c(1, 1)), .Names = c("ID", "Value"
), row.names = c(NA, -2L), class = "data.frame"))
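The merged columns above come out as Value.x, Value.y and Value, which does not show which data frame each one came from. One possible way around that, sketched here in base R only, is to rename each Value column by its position in the list before merging. The "Value_" naming scheme and the helper variable renamed are my own assumptions for illustration.

```r
# Same sample data as above, written out compactly.
lst <- list(
  data.frame(ID = c("A", "B", "C"), Value = c(1, 1, 1), stringsAsFactors = FALSE),
  data.frame(ID = c("A", "D", "E"), Value = c(1, 1, 1), stringsAsFactors = FALSE),
  data.frame(ID = c("B", "C"), Value = c(1, 1), stringsAsFactors = FALSE)
)

# Rename each Value column after the position of its data frame in the list
# (Value_1, Value_2, ...), so the origin survives the merge.
renamed <- Map(function(df, i) {
  names(df)[names(df) == "Value"] <- paste0("Value_", i)
  df
}, lst, seq_along(lst))

# Full outer join on ID, then turn the NAs left by the join into 0.
res <- Reduce(function(x, y) merge(x, y, by = "ID", all = TRUE), renamed)
res[is.na(res)] <- 0
res
```

Because every value column already has a unique name, merge no longer needs to add the .x / .y suffixes.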