Question

I have the following data frame:

data.frame(a = c(1,2,3),b = c(1,2,3))
  a b
1 1 1
2 2 2
3 3 3

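I want to turn it into a data frame in which these rows are repeated, or more generally repeat it N times. Is there a simple function for this in R? Thanks!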

Answer 1

Edit: updated to a better, modern R answer.

You can use replicate() and then rbind the results back together. The row names are automatically renumbered to run from 1 to nrow.

d <- data.frame(a = c(1,2,3),b = c(1,2,3))
n <- 3
do.call("rbind", replicate(n, d, simplify = FALSE))

A more traditional way is to use indexing, but here the row-name change is not quite as neat (though more informative):

 d[rep(seq_len(nrow(d)), n), ]

Here are some improvements on the above. The first two use purrr functional programming. Idiomatic purrr:

purrr::map_df(seq_len(3), ~d)

And less idiomatic purrr (same result, though more awkward):

purrr::map_df(seq_len(3), function(x) d)

And finally, using dplyr with indexing rather than applying over a list:

library(dplyr)  # provides %>%, slice() and row_number()
d %>% slice(rep(row_number(), 3))
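
As a minimal sketch of the row-name difference mentioned earlier in this answer, the base R snippet below builds the result both ways and compares the row names. The names by_rbind, by_index, idx and rn_index are only illustrative.

```r
d <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
n <- 3

# Stack n copies with rbind: the row names are renumbered 1..9
by_rbind <- do.call("rbind", replicate(n, d, simplify = FALSE))

# Index with a repeated row vector: repeated rows get a numeric suffix
idx <- rep(seq_len(nrow(d)), n)   # 1 2 3 1 2 3 1 2 3
by_index <- d[idx, ]

rownames(by_rbind)
rn_index <- rownames(by_index)
rn_index   # "1" "2" "3" "1.1" "2.1" "3.1" "1.2" "2.2" "3.2"

# Once the row names are reset, the two results hold the same values
rownames(by_index) <- NULL
all.equal(by_rbind, by_index)
```

The suffix tells you which copy a row came from, which is presumably why the indexing result is described as less neat but more informative.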

Answer 2

For data.frame objects, this solution is several times faster than @mdsummer's and @wojciech-sobala's.

d[rep(seq_len(nrow(d)), n), ]

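For data.table objects, @mdsummer's solution is slightly faster than converting to a data.frame and then applying the above. For large n this might flip.

[Figure: microbenchmark results]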

Full code:

packages <- c("data.table", "ggplot2", "RUnit", "microbenchmark")
lapply(packages, require, character.only = TRUE)

Repeat1 <- function(d, n) {
  return(do.call("rbind", replicate(n, d, simplify = FALSE)))
}

Repeat2 <- function(d, n) {
  return(Reduce(rbind, list(d)[rep(1L, times=n)]))
}

Repeat3 <- function(d, n) {
  if ("data.table" %in% class(d)) return(d[rep(seq_len(nrow(d)), n)])
  return(d[rep(seq_len(nrow(d)), n), ])
}

Repeat3.dt.convert <- function(d, n) {
  if ("data.table" %in% class(d)) d <- as.data.frame(d)
  return(d[rep(seq_len(nrow(d)), n), ])
}

# Try with data.frames
mtcars1 <- Repeat1(mtcars, 3)
mtcars2 <- Repeat2(mtcars, 3)
mtcars3 <- Repeat3(mtcars, 3)

checkEquals(mtcars1, mtcars2)
#  Only difference is row.names having ".k" suffix instead of "k" from 1 & 2
checkEquals(mtcars1, mtcars3)

# Works with data.tables too
mtcars.dt <- data.table(mtcars)
mtcars.dt1 <- Repeat1(mtcars.dt, 3)
mtcars.dt2 <- Repeat2(mtcars.dt, 3)
mtcars.dt3 <- Repeat3(mtcars.dt, 3)

# No row.names mismatch since data.tables don't have row.names
checkEquals(mtcars.dt1, mtcars.dt2)
checkEquals(mtcars.dt1, mtcars.dt3)

# Time test
res <- microbenchmark(Repeat1(mtcars, 10),
                      Repeat2(mtcars, 10),
                      Repeat3(mtcars, 10),
                      Repeat1(mtcars.dt, 10),
                      Repeat2(mtcars.dt, 10),
                      Repeat3(mtcars.dt, 10),
                      Repeat3.dt.convert(mtcars.dt, 10))
print(res)
ggsave("repeat_microbenchmark.png", autoplot(res))
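
Because the row names differ between the rbind and indexing results, one way to confirm that the approaches agree on the actual values is to reset the row names before comparing. The following is only a base R sketch: Repeat1 and Repeat3 are shortened copies of the functions above so that the snippet runs on its own, and strip_rownames is a hypothetical helper.

```r
Repeat1 <- function(d, n) do.call("rbind", replicate(n, d, simplify = FALSE))
Repeat3 <- function(d, n) d[rep(seq_len(nrow(d)), n), ]

# Hypothetical helper: reset row names to the default 1..nrow sequence
strip_rownames <- function(x) {
  rownames(x) <- NULL
  x
}

r1 <- strip_rownames(Repeat1(mtcars, 3))
r3 <- strip_rownames(Repeat3(mtcars, 3))

# Compare values only, now that the row names match
same_values <- isTRUE(all.equal(r1, r3))
same_values
dim(r1)   # 96 rows, 11 columns (3 copies of the 32-row mtcars)
```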

Answer 3

d <- data.frame(a = c(1,2,3),b = c(1,2,3))
r <- Reduce(rbind, list(d)[rep(1L, times=3L)])
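
To unpack what this one-liner does, here is a small base R sketch. The names copies and r2 are illustrative, and the do.call() comparison is added here only for contrast.

```r
d <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))

# list(d)[rep(1L, times = 3L)] selects the single list element three times,
# giving a list that holds three copies of d
copies <- list(d)[rep(1L, times = 3L)]
length(copies)   # 3

# Reduce() binds pairwise: rbind(rbind(d, d), d)
r <- Reduce(rbind, copies)

# do.call() passes all copies to a single rbind() call instead
r2 <- do.call(rbind, copies)

all.equal(r, r2)
nrow(r)   # 9
```

Because Reduce() calls rbind once per additional copy, it does more intermediate work than a single do.call(rbind, ...) as n grows.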

Answer 4

Just use simple indexing with the rep function.

mydata<-data.frame(a = c(1,2,3),b = c(1,2,3)) #creating your data frame  
n<-10           #defining no. of time you want repetition of the rows of your dataframe

mydata<-mydata[rep(rownames(mydata),n),] #use rep function while doing indexing 
rownames(mydata)<-1:NROW(mydata)    #rename rows just to get cleaner look of data
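
If you need this in more than one place, the two steps can be wrapped in a small function. This is only a sketch: repeat_rows is a hypothetical helper, not part of base R or any package. It indexes by row position rather than by row name, and uses drop = FALSE so that a one-column data frame is not simplified to a vector.

```r
# Hypothetical helper: repeat all rows of df n times and renumber the rows
repeat_rows <- function(df, n) {
  stopifnot(is.data.frame(df), length(n) == 1, n >= 0)
  out <- df[rep(seq_len(nrow(df)), times = n), , drop = FALSE]
  rownames(out) <- NULL   # back to the default 1..nrow(out)
  out
}

mydata <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
res <- repeat_rows(mydata, 10)
nrow(res)                # 30
tail(rownames(res), 1)   # "30"

# Still a data frame when there is only one column
one_col <- repeat_rows(data.frame(a = 1:2), 2)
is.data.frame(one_col)
```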

Answer 5

Even simpler:

library(data.table)  # rbindlist() comes from data.table
my_data <- data.frame(a = c(1,2,3),b = c(1,2,3))
rbindlist(replicate(n = 3, expr = my_data, simplify = FALSE))

Answer 6

With the data.table package, you can use the special symbol .I together with rep:

library(data.table)
df <- data.frame(a = c(1,2,3), b = c(1,2,3))
dt <- as.data.table(df)

n <- 3

dt[rep(dt[, .I], n)]

This returns a data.table in which the rows of dt are repeated n times.
