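Hi, I am looking for a way to filter a data frame in R and summarize the multiple conditions recorded for one person (ID). What should I do after using group_by?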
Df:
Desired result:
ID   | Condition
S123 | D
S123 | H
S123 | D,L
S456 | L
S456 | L,D
S456 | L
S789 | D
S789 | L
S789 | D
Answer 0 (score: 1)
You can split the comma-separated values into separate rows, then paste together the unique Condition values for each ID.
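library(dplyr)

df %>%
  # one row per individual condition, e.g. "D,L" becomes "D" and "L"
  tidyr::separate_rows(Condition, sep = ",") %>%
  group_by(ID) %>%
  # collapse each ID's distinct conditions into one comma-separated string
  summarise(Condition = toString(unique(Condition)))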
# ID Condition
# <fct> <chr>
#1 S123 D, H, L
#2 S456 L, D
#3 S789 D, L
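If the real data has spaces after some commas, or the order of conditions should not depend on row order, the same pipeline can be adapted. The following is only a sketch under those assumptions: df2 is a made-up data frame for illustration, the separator is widened to a regular expression that also matches trailing spaces, and sort() is added before pasting.

```r
library(dplyr)

# Hypothetical input (not from the question): some commas are followed by a space
df2 <- data.frame(
  ID = c("S123", "S123", "S456"),
  Condition = c("D", "H, D", "L,D"),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  # sep is treated as a regular expression, so ",\\s*" also drops spaces after the comma
  tidyr::separate_rows(Condition, sep = ",\\s*") %>%
  group_by(ID) %>%
  # sort() makes the pasted order independent of the order of the rows
  summarise(Condition = toString(sort(unique(Condition))))

res$Condition
# "D, H" "D, L"
```

Without sort(), the conditions appear in order of first occurrence, as in the output above for S456 (L, D).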
In base R, we can use aggregate together with strsplit to split the strings on commas.
aggregate(Condition~ID, df, function(x) toString(unique(unlist(strsplit(x, ",")))))
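To make the one-liner easier to read, the anonymous function can be pulled out into a named helper. This is a sketch, not part of the original answer: collapse_conditions is a hypothetical name, the data frame is rebuilt here with ID as plain character so the block runs on its own, and as.character() is added on the assumption that Condition might be a factor in other data (strsplit only accepts character input).

```r
# Same values as in the Data section, with ID as character for brevity
df <- data.frame(
  ID = rep(c("S123", "S456", "S789"), each = 3),
  Condition = c("D", "H", "D,L", "L", "L,D", "L", "D", "L", "D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split on commas, flatten, keep unique values, re-join
collapse_conditions <- function(x) {
  pieces <- unlist(strsplit(as.character(x), ",", fixed = TRUE))
  toString(unique(pieces))
}

# One row per ID, with that ID's distinct conditions in order of first appearance
out <- aggregate(Condition ~ ID, data = df, FUN = collapse_conditions)

out$Condition
# "D, H, L" "L, D"    "D, L"
```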
Data
df <- structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
3L), .Label = c("S123", "S456", "S789"), class = "factor"), Condition = c("D",
"H", "D,L", "L", "L,D", "L", "D", "L", "D")), row.names = c(NA,-9L),
class = "data.frame")