Question

I have the following data frame:

df = structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3), index = c(1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 1, 2, 3)), row.names = c(NA, -13L), class = c("tbl_df", "tbl", "data.frame"))

I would like to replicate the index column within each Group in two ways: first with each number appearing n times consecutively, and second with the whole sequence of numbers repeated n times as a block, where n is the size of the group (analogous to rep with each versus plain rep).

The output should look like this (only Group 1 is shown, because the full output is too long):

First option:

df = structure(list(Group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1), index = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 
4, 4, 4)), row.names = c(NA, -16L), class = c("tbl_df", "tbl", 
"data.frame"))

Second option:

df = structure(list(Group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), index = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)), row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"))

How can I do this?
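
To make the rep analogy concrete, here is a minimal base R sketch for a single group. The vector idx and the names opt1 and opt2 are illustrative only and stand in for Group 1 of the data above.

```r
# Illustrative group of size n = 4, mirroring Group 1 in the question
idx <- c(1, 2, 3, 4)
n <- length(idx)

# First option: each value repeated n times consecutively
opt1 <- rep(idx, each = n)

# Second option: the whole sequence repeated n times
opt2 <- rep(idx, times = n)

print(opt1)
print(opt2)
```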

Answer 1

You can use rep and slice like this:

library(dplyr)

Option 1:

df %>%
  group_by(Group) %>%
  slice(rep(seq_len(n()), each = n()))

Option 2:

df %>%
  group_by(Group) %>%
  slice(rep(seq_len(n()), n()))
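
As a quick sanity check, the two pipelines can be run on the data from the question and the group sizes inspected. This is only a sketch: the data frame is rebuilt here with tibble() rather than structure(), and opt1 and opt2 are illustrative names.

```r
library(dplyr)

# Same data as in the question, rebuilt with tibble() for brevity
df <- tibble(
  Group = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3),
  index = c(1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 1, 2, 3)
)

# Option 1: each row of a group repeated n() times consecutively
opt1 <- df %>%
  group_by(Group) %>%
  slice(rep(seq_len(n()), each = n())) %>%
  ungroup()

# Option 2: the whole group repeated n() times as a block
opt2 <- df %>%
  group_by(Group) %>%
  slice(rep(seq_len(n()), n())) %>%
  ungroup()

# A group of size n should now contain n * n rows
print(count(opt1, Group))
print(count(opt2, Group))
```

Both results hold the same rows per group and differ only in their order.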

Answer 2

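You can combine do and lapply to replicate each whole group:

# Second option: the whole group repeated n times as a block
df %>% group_by(Group) %>% 
  do(lapply(., rep, times = nrow(.)) %>% as.data.frame())
# First option: each row repeated n times consecutively
df %>% group_by(Group) %>% 
  do(lapply(., rep, each = nrow(.)) %>% as.data.frame())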
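
Since do() is marked as superseded in recent dplyr releases, a similar result can probably be obtained with group_modify(). The following is a sketch of that swapped-in alternative, not part of the original answer, and it assumes a dplyr version that provides group_modify(); the names res_times and res_each are illustrative.

```r
library(dplyr)

# Same data as in the question, rebuilt with tibble() for brevity
df <- tibble(
  Group = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3),
  index = c(1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 1, 2, 3)
)

# .x is the data of one group without the grouping column;
# rows are selected by a repeated row index

# Second option: the whole group repeated n times as a block
res_times <- df %>%
  group_by(Group) %>%
  group_modify(~ .x[rep(seq_len(nrow(.x)), times = nrow(.x)), ])

# First option: each row repeated n times consecutively
res_each <- df %>%
  group_by(Group) %>%
  group_modify(~ .x[rep(seq_len(nrow(.x)), each = nrow(.x)), ])
```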

Answer 3

We can use uncount (as the output shows, this yields the first option, with each row repeated consecutively):

library(tidyverse)
df %>% 
  group_by(Group) %>% 
  uncount(n())
# A tibble: 61 x 2
# Groups:   Group [3]
#   Group index
#   <dbl> <dbl>
# 1     1     1
# 2     1     1
# 3     1     1
# 4     1     1
# 5     1     2
# 6     1     2
# 7     1     2
# 8     1     2
# 9     1     3
#10     1     3
# … with 51 more rows

Or with data.table (this yields the second option):

library(data.table)
setDT(df)[, .SD[rep(seq_len(.N), .N)], Group]

Or with base R (again the second option):

do.call(rbind, lapply(split(df, df$Group), 
       function(x) x[rep(seq_len(nrow(x)), nrow(x)),]))
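
The base R call above repeats each group as a block. For the first option, the same pattern should work with each = added to rep. A minimal sketch, using a plain data.frame and the illustrative name opt1:

```r
# Same data as in the question, as a plain data.frame
df <- data.frame(
  Group = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3),
  index = c(1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 1, 2, 3)
)

# Split by group, repeat each row nrow(x) times consecutively, then recombine
opt1 <- do.call(rbind, lapply(split(df, df$Group),
       function(x) x[rep(seq_len(nrow(x)), each = nrow(x)), ]))

# Drop the composite row names produced by the repeated indexing and rbind
rownames(opt1) <- NULL

print(head(opt1, 8))
```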

Replicate a data frame by group
