do.call("rbind", list(data, frames)), but also index each row by its original data frame

Time: 2015-01-14 23:42:54

Tags: r dplyr

df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)

# A common method loses the origin of each row.
do.call("rbind", list(df1, df2))
##   a b
## 1 1 3
## 2 2 4
## 3 5 7
## 4 6 8

# Whereas here, X1 records which data frame each row originated in.
library(plyr)
adply(list(df1, df2), 1)
##   X1 a b
## 1  1 1 3
## 2  1 2 4
## 3  2 5 7
## 4  2 6 8

Are there other ways to do this, perhaps more efficient ones?

2 answers:

Answer 0 (score: 2)

Here is one way.

library(dplyr)
library(tidyr)

foo <- list(df1, df2)

unnest(foo, names) %>%
  mutate(names = gsub("^X", "", names))

#  names a b
#1     1 1 3
#2     1 2 4
#3     2 5 7
#4     2 6 8
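
The `unnest()` call above may not accept a plain list in current tidyr releases. As a hedged alternative, here is a minimal sketch that assumes a reasonably recent dplyr and uses `bind_rows()` with its `.id` argument; the column name `names` is simply carried over from the answer. As far as I know the id column comes back as character, so convert it with `as.integer()` if you need a numeric index.

```r
library(dplyr)

df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)
foo <- list(df1, df2)

# .id adds a column identifying which list element each row came from.
# For an unnamed list, the labels are the positions in the list.
out <- bind_rows(foo, .id = "names")
out
```

If the list is named, for example `list(first = df1, second = df2)`, those names are used as the labels instead of the positions.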

Answer 1 (score: 1)

With base R:

df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)

frames <- list(df1, df2)

do.call(rbind, lapply(seq_along(frames), function(x) {
  frames[[x]]$X1 <- x
  frames[[x]]
}))

##   a b X1
## 1 1 3  1
## 2 2 4  1
## 3 5 7  2
## 4 6 8  2
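
A possible variant of the same base R idea, offered only as a sketch: bind the frames first, then build the index column in one step with `rep()`. The name `combined` is introduced here for illustration. It should give the same four rows and `X1` values as the `lapply()` version, without modifying each data frame inside a function.

```r
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)
frames <- list(df1, df2)

combined <- do.call(rbind, frames)

# Repeat each list position once per row of the corresponding data frame.
combined$X1 <- rep(seq_along(frames), times = vapply(frames, nrow, integer(1)))
combined
```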

By the way, if you want to see how plyr does this, look at (plyr::adply), (plyr:::splitter_a) & (plyr::ldply). These answers are trivial by comparison :-)