Question

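I have a data frame with 5 different groups:

       id group
    1  L1     1
    2  L2     1
    3  L1     2
    4  L3     2
    5  L4     2
    6  L3     3
    7  L5     3
    8  L6     3
    9  L1     4
    10 L4     4
    11 L2     5

I would like to know whether it is possible to get the unique ids from the first group, then from the first and second groups, then from the first, second and third groups, and so on, without writing a loop. I am looking for a solution with the data.table or dplyr package.

Expected result: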

    group      id
1   1          c("L1", "L2")
2   1,2        c("L1", "L2", "L3", "L4")
3   1,2,3      c("L1", "L2", "L3", "L4", "L5")
4   1,2,3,4    c("L1", "L2", "L3", "L4", "L5")
5   1,2,3,4,5  c("L1", "L2", "L3", "L4", "L5")
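
For anyone who wants to reproduce the example, here is a minimal sketch that rebuilds the data frame from the 11 rows listed in the question. The object name df and the column types (character id, integer group) are assumptions, since the original data dump is not shown.

```r
# Rebuild the example data from the rows listed in the question.
# Assumptions: the object is called `df`, `id` is character, `group` is integer.
df <- data.frame(
  id    = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 5L),
  stringsAsFactors = FALSE
)

# Quick look at the structure
str(df)
```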

Answer 1

With base R, you can do:

# create the "growing" sets of groups
combi_groups <- lapply(seq_along(unique(df$group)), function(i) unique(df$group)[1:i])

# get the unique ID for each set of groups
uniq_ID <- setNames(lapply(combi_groups, function(x) unique(df$id[df$group %in% x])), 
                    sapply(combi_groups, paste, collapse=","))

uniq_ID
# $`1`
# [1] "L1" "L2"

# $`1,2`
# [1] "L1" "L2" "L3" "L4"

# $`1,2,3`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"

# $`1,2,3,4`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"

# $`1,2,3,4,5`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"

If you want to format the result as in the expected output:

data.frame(group=sapply(combi_groups, paste, collapse=", "), id=sapply(uniq_ID, function(x) paste0("c(", paste0("\"", x, "\"", collapse=", "), ")")))
#          group                                    id
#1             1                         c("L1", "L2")
#2          1, 2             c("L1", "L2", "L3", "L4")
#3       1, 2, 3 c("L1", "L2", "L3", "L4", "L5", "L6")
#4    1, 2, 3, 4 c("L1", "L2", "L3", "L4", "L5", "L6")
#5 1, 2, 3, 4, 5 c("L1", "L2", "L3", "L4", "L5", "L6")

Another possibility for formatting, if you want the ids in a single column (one row per id):

data.frame(group=rep(names(uniq_ID), sapply(uniq_ID, length)), id=unlist(uniq_ID))
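
If the ids should stay as vectors rather than pasted strings, one option is a list column. The following is only a sketch of that idea: the data frame is rebuilt from the rows in the question, the name res is hypothetical, and I() is used so that data.frame() keeps the list as a single column.

```r
# Assumption: example data rebuilt from the rows listed in the question
df <- data.frame(
  id    = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 5L),
  stringsAsFactors = FALSE
)

# growing sets of groups: 1; 1,2; 1,2,3; ...
groups <- unique(df$group)
combi_groups <- lapply(seq_along(groups), function(i) groups[seq_len(i)])

# unique ids for each set, kept as an unnamed list of character vectors
uniq_ID <- lapply(combi_groups, function(x) unique(df$id[df$group %in% x]))

# one row per set of groups; `id` is a list column (hypothetical name `res`)
res <- data.frame(
  group = sapply(combi_groups, paste, collapse = ","),
  id    = I(uniq_ID),
  stringsAsFactors = FALSE
)

# each cell of `id` is still a character vector
res$id[[2]]
```

A list column keeps the ids usable for further set operations, at the cost of a less compact printed table.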

Answer 2

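Similar to @Cath's answer, but using Reduce(..., accumulate = TRUE) to create the expanding windows of groups. Then use lapply to loop over the sets of groups and get the unique ids in each window: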

grp <- Reduce(c, unique(d$group), accumulate = TRUE)

lapply(grp, function(x) unique(d$id[d$group %in% x]))
# [[1]]
# [1] "L1" "L2"
# 
# [[2]]
# [1] "L1" "L2" "L3" "L4"
# 
# [[3]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# 
# [[4]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# 
# [[5]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"

For naming and prettifying the result, please refer to @Cath's nice answer.
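
To make the expanding windows concrete, here is a small sketch that shows what grp holds and labels each result with its window. The data frame d is rebuilt from the rows in the question, and the name ids and the comma-separated label format are assumptions.

```r
# Assumption: example data rebuilt from the rows listed in the question
d <- data.frame(
  id    = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 5L),
  stringsAsFactors = FALSE
)

# Each element of grp is the previous one with the next group appended:
# 1; 1 2; 1 2 3; 1 2 3 4; 1 2 3 4 5
grp <- Reduce(c, unique(d$group), accumulate = TRUE)

# unique ids within each window
ids <- lapply(grp, function(x) unique(d$id[d$group %in% x]))

# label each result with the groups it covers (hypothetical label format)
names(ids) <- vapply(grp, paste, character(1), collapse = ",")

ids[["1,2"]]
```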

Answer 3

Another approach is to use split and Reduce, feeding the groups to union with accumulate = TRUE:

Reduce(union, split(df$id, df$group), accumulate=TRUE)
[[1]]
[1] "L1" "L2"

[[2]]
[1] "L1" "L2" "L3" "L4"

[[3]]
[1] "L1" "L2" "L3" "L4" "L5" "L6"

[[4]]
[1] "L1" "L2" "L3" "L4" "L5" "L6"

[[5]]
[1] "L1" "L2" "L3" "L4" "L5" "L6"
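
Two details are worth keeping in mind with this approach. split orders its pieces by the sorted group values, and the first element returned by Reduce is the first group as-is, so it is not deduplicated. Neither matters for the data in the question, but the sketch below guards against the second and also labels each step. The data frame is rebuilt from the question's rows, and the names parts and acc are hypothetical.

```r
# Assumption: example data rebuilt from the rows listed in the question
df <- data.frame(
  id    = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 5L),
  stringsAsFactors = FALSE
)

# one character vector of ids per group, named "1" .. "5"
parts <- split(df$id, df$group)

# running union of ids across groups
acc <- Reduce(union, parts, accumulate = TRUE)

# the first element is parts[[1]] unchanged, so deduplicate defensively
acc <- lapply(acc, unique)

# running labels "1", "1,2", "1,2,3", ... built the same way
names(acc) <- unlist(
  Reduce(function(a, b) paste(a, b, sep = ","), names(parts), accumulate = TRUE)
)

acc[["1,2"]]
```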

Unique values of group 1, then of groups 1 and 2, and so on
