Question

I have a data frame. Say,

data.frame(x = c(1, 3), y = c(5, 0), id = c("A", "B"))

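Now I want to duplicate it, so that I have a copy within the same data.frame. I would end up with something like this,

data.frame(x = c(1, 3, 1, 3), y = c(5, 0, 5, 0), id = c("A", "B", "A", "B"))

This is very close to what I want, but I also want to append to the id column so that the ids are unique for each row, depending on how many replicates I want (only one in this case, but I will want many).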

data.frame(x = c(1, 3, 1, 3), y = c(5, 0, 5, 0), id = c("A-1", "B-1", "A-2", "B-2"))

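So, as you can see, I can wrap my head around building the object, but I would like to move on from writing "hacky" code in base R and replicate this functionality with dplyr.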

Answer 1

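I noticed you want to do this with the dplyr package. I think a combination of the group_by(), mutate(), and row_number() functions from dplyr will do the job nicely.

Keep in mind that after the group_by() step below you have a "tibble" / "grouped data.frame" rather than a base data.frame.

If you like, it is easy to convert it back to a plain data.frame.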

library(dplyr)

# so you start with this data.frame:
df <- data.frame(x = c(1, 3), y = c(5, 0), id = c("A", "B"))

# to attach an exact duplication of this df to itself:
df <- rbind(df, df)


# group by id, add a second id to increment within each id group ("A", "B", etc.)
df2 <- group_by(df, id) %>%
    mutate(id2 = row_number())


# paste the id and id2 together for the desired result
df2$id_combined <- paste0(df2$id, '-', df2$id2)

# inspect results
df2
    # x     y     id   id2 id_combined
    # <dbl> <dbl> <fctr> <int>       <chr>
    # 1     1     5      A     1         A-1
    # 2     3     0      B     1         B-1
    # 3     1     5      A     2         A-2
    # 4     3     0      B     2         B-2

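# to convert the grouped tibble back to a plain data.frame:
df2 <- data.frame(df2, stringsAsFactors = FALSE)

# now to remove the additional columns that were added in this process:
df2$id2 <- NULL

Edit - exploring other options for appending n repeats of the same data frame to itself: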

# Not dplyr, but this is how I would normally handle this type of task:
df <- data.frame(x = c(1, 3), y = c(5, 0), id = c("A", "B"))

# set n equal to the number of times you want to replicate the data.frame
n <- 13

# initialize space to hold the data frames
list_dfs <- list()

# loop through, adding individual data frames to the list
for (i in seq_len(n)) {
    list_dfs[[i]] <- df
}

# combine them all with do.call
my_big_df <- do.call(rbind, list_dfs)

You can then use the group_by() and mutate() functions shown above to create new unique keys for the data.frame.
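For what it's worth, the replicate-then-number idea can probably be done in a single dplyr pipeline. The following is only a sketch, not something from the original answer: the column name copy and the value n = 3 are made up for illustration. It relies on bind_rows() filling the .id column with each data frame's position in the list when the list has no names.

```r
library(dplyr)

df <- data.frame(x = c(1, 3), y = c(5, 0), id = c("A", "B"))

# number of copies wanted (illustrative value)
n <- 3

# replicate() builds a list of n copies of df; bind_rows() stacks them and
# records which copy each row came from in the hypothetical `copy` column
df_rep <- bind_rows(replicate(n, df, simplify = FALSE), .id = "copy") %>%
    mutate(id = paste0(id, "-", copy)) %>%  # e.g. "A" in copy 2 becomes "A-2"
    select(-copy)                           # drop the helper column again
```

Because the copy number comes straight from bind_rows(), no group_by() is needed here, so the result should stay an ordinary ungrouped data.frame.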
