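I have a data frame with 5 different groups:
   id group
1  L1     1
2  L2     1
3  L1     2
4  L3     2
5  L4     2
6  L3     3
7  L5     3
8  L6     3
9  L1     4
10 L4     4
11 L2     5
I would like to know whether it is possible to get the unique values of id for the first group; for the first and second groups; for the first, second and third groups; and so on, without using a loop. I am looking for a solution with the dplyr or data.table package.
Expected result: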
  group     id
1 1         c("L1", "L2")
2 1,2       c("L1", "L2", "L3", "L4")
3 1,2,3     c("L1", "L2", "L3", "L4", "L5")
4 1,2,3,4   c("L1", "L2", "L3", "L4", "L5")
5 1,2,3,4,5 c("L1", "L2", "L3", "L4", "L5")
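The sample data can be recreated in R roughly as follows. This is a reconstruction from the table printed in the question, not the asker's original code; the name df and the use of stringsAsFactors = FALSE are assumptions.

```r
# Rebuild the sample data shown in the question (reconstructed from the printed table)
df <- data.frame(
  id    = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5),
  stringsAsFactors = FALSE  # keep id as character rather than factor (assumption)
)
```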
Answer 0 (score: 8)
Using base R, you can do:
# create the "growing" sets of groups
combi_groups <- lapply(seq_along(unique(df$group)), function(i) unique(df$group)[1:i])
# get the unique ID for each set of groups
uniq_ID <- setNames(lapply(combi_groups, function(x) unique(df$id[df$group %in% x])),
                    sapply(combi_groups, paste, collapse=","))
uniq_ID
# $`1`
# [1] "L1" "L2"
# $`1,2`
# [1] "L1" "L2" "L3" "L4"
# $`1,2,3`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# $`1,2,3,4`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# $`1,2,3,4,5`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
If you want the result formatted like the expected output:
data.frame(group=sapply(combi_groups, paste, collapse=", "), id=sapply(uniq_ID, function(x) paste0("c(", paste0("\"", x, "\"", collapse=", "), ")")))
#  group         id
#1 1             c("L1", "L2")
#2 1, 2          c("L1", "L2", "L3", "L4")
#3 1, 2, 3       c("L1", "L2", "L3", "L4", "L5", "L6")
#4 1, 2, 3, 4    c("L1", "L2", "L3", "L4", "L5", "L6")
#5 1, 2, 3, 4, 5 c("L1", "L2", "L3", "L4", "L5", "L6")
Another possibility for the formatting, with the IDs in a single column (one row per group/ID pair):
data.frame(group=rep(names(uniq_ID), sapply(uniq_ID, length)), id=unlist(uniq_ID))
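If the IDs should stay as real character vectors rather than pasted strings, a list column is one option. The following is a minimal sketch and not part of the original answer: it rebuilds the sample data from the question (df, grps and res are assumed names) and uses base R's I() so that data.frame() keeps the list as a single column.

```r
# Sample data reconstructed from the question (df is an assumed name)
df <- data.frame(
  id    = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5),
  stringsAsFactors = FALSE
)

# Growing sets of groups and the unique IDs seen in each set
grps <- unique(df$group)
combi_groups <- lapply(seq_along(grps), function(i) grps[1:i])
uniq_ID <- lapply(combi_groups, function(x) unique(df$id[df$group %in% x]))

# One row per set of groups; I() keeps the list as one list column
res <- data.frame(
  group = sapply(combi_groups, paste, collapse = ","),
  id    = I(uniq_ID),
  stringsAsFactors = FALSE
)

res$id[[2]]              # the IDs seen in groups 1 and 2
unname(lengths(res$id))  # number of unique IDs per set of groups
```

Each cell of res$id is then an ordinary character vector, so it can be passed on to other functions without parsing a string first.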
Answer 1 (score: 6)
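Similar to @Cath's answer, but using Reduce(..., accumulate = TRUE) to create the expanding windows of groups. Then use lapply to loop over the sets of groups and get the unique IDs for each window (here the data frame is called d):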
grp <- Reduce(c, unique(d$group), accumulate = TRUE)
lapply(grp, function(x) unique(d$id[d$group %in% x]))
# [[1]]
# [1] "L1" "L2"
#
# [[2]]
# [1] "L1" "L2" "L3" "L4"
#
# [[3]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
#
# [[4]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
#
# [[5]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
For naming and prettifying the output, see @Cath's nice answer.
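As a rough illustration of the naming step, the windows built by Reduce can be labelled by pasting the group numbers of each window together. This is a sketch rather than part of the original answer; the sample data is rebuilt from the question, and res is a hypothetical name.

```r
# Sample data reconstructed from the question (named d, as in this answer)
d <- data.frame(
  id    = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5),
  stringsAsFactors = FALSE
)

# Expanding windows of groups: 1; 1,2; 1,2,3; ...
grp <- Reduce(c, unique(d$group), accumulate = TRUE)

# Unique IDs per window
res <- lapply(grp, function(x) unique(d$id[d$group %in% x]))

# Label each window with its groups, e.g. "1,2,3"
names(res) <- vapply(grp, paste, character(1), collapse = ",")

res[["1,2"]]  # IDs seen in groups 1 and 2
```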
Answer 2 (score: 4)
Another approach is to use split and Reduce: split the IDs by group, then feed the groups to union with accumulate = TRUE:
Reduce(union, split(df$id, df$group), accumulate=TRUE)
# [[1]]
# [1] "L1" "L2"
# [[2]]
# [1] "L1" "L2" "L3" "L4"
# [[3]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# [[4]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# [[5]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
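One detail worth keeping in mind: split() orders its pieces by the sorted levels of the grouping variable, whereas unique(df$group) follows the order of appearance. For the sample data the two coincide. The sketch below is an illustration only, with assumed names cum_ids and by_window and data rebuilt from the question; it checks that the split/union approach and the %in% approach return the same cumulative sets.

```r
# Sample data reconstructed from the question (df is an assumed name)
df <- data.frame(
  id    = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5),
  stringsAsFactors = FALSE
)

# Cumulative union of the per-group ID vectors
cum_ids <- Reduce(union, split(df$id, df$group), accumulate = TRUE)

# Same result via growing sets of groups and %in%
grps <- unique(df$group)
by_window <- lapply(seq_along(grps), function(i) unique(df$id[df$group %in% grps[1:i]]))

identical(cum_ids, by_window)  # both approaches agree on this data
lengths(cum_ids)               # number of distinct IDs seen up to each group
```

If the groups did not appear in sorted order in the data, the two approaches could accumulate them in a different sequence, so it may be worth sorting first or choosing the order explicitly.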