I have a dataset that looks like this:
ID week action
1 1 TRUE
1 1 FALSE
1 2 FALSE
1 2 FALSE
1 3 FALSE
1 3 TRUE
2 1 FALSE
2 2 TRUE
2 2 FALSE
...
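What I want is to keep a single action value for each ID and each week within that ID, preferring TRUE if one is present and FALSE otherwise.
So the result would look like this: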
ID week action
1 1 TRUE
1 2 FALSE
1 3 TRUE
2 1 FALSE
2 2 TRUE
...
Answer 0: (score: 2)
Try
library(dplyr)
# Sort so that TRUE rows come first, then keep the first row of each ID/week group
df %>%
  group_by(ID, week) %>%
  arrange(desc(action)) %>%
  slice(1)
# ID week action
#1 1 1 TRUE
#2 1 2 FALSE
#3 1 3 TRUE
#4 2 1 FALSE
#5 2 2 TRUE
Or using data.table
library(data.table)
setDT(df)[order(action,decreasing=TRUE),
.SD[1] , by=list(ID, week)][order(ID,week)]
# ID week action
#1: 1 1 TRUE
#2: 1 2 FALSE
#3: 1 3 TRUE
#4: 2 1 FALSE
#5: 2 2 TRUE
Or using base R, similar to the approach used by @Sam Dickson
aggregate(action~., df, FUN=function(x) sum(x)>0)
# ID week action
#1 1 1 TRUE
#2 2 1 FALSE
#3 1 2 FALSE
#4 2 2 TRUE
#5 1 3 TRUE
Or, inspired by @docendo discimus, a data.table option would be
setDT(df)[, .SD[which.max(action)], by=list(ID, week)]
# Data used in the examples above
df <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), week = c(1L,
1L, 2L, 2L, 3L, 3L, 1L, 2L, 2L), action = c(TRUE, FALSE, FALSE,
FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)), .Names = c("ID", "week",
"action"), class = "data.frame", row.names = c(NA, -9L))
Answer 1: (score: 2)
I used plyr:
library(plyr)
ddply(df,.(ID,week),summarize,action=sum(action)>0)
Answer 2: (score: 2)
Two options that are similar to akrun's answer but not identical, which is why I am posting them separately:
aggregate(action ~ ID + week, df, max)
# ID week action
#1 1 1 1 # you can use 1/0s the same way as TRUE/FALSE
#2 2 1 0
#3 1 2 0
#4 2 2 1
#5 1 3 1
library(dplyr)
group_by(df, ID, week) %>% slice(which.max(action))
#Source: local data frame [5 x 3]
#Groups: ID, week
#
# ID week action
#1 1 1 TRUE
#2 1 2 FALSE
#3 1 3 TRUE
#4 2 1 FALSE
#5 2 2 TRUE
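The help page for which.max tells you that it finds the first maximum of a numeric or logical vector, so even if a group has several TRUE entries (TRUE counts as 1 and FALSE as 0), it simply picks the first occurrence and returns it. You can use which.min in the same way to pick the first minimum instead.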
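To make the which.max behaviour concrete, here is a minimal sketch on two made-up logical vectors (the names mixed and all_false are hypothetical and not part of the question's data). It shows why the approach falls back to a FALSE row when a group contains no TRUE at all.

```r
# Hypothetical groups of action values, not taken from df
mixed     <- c(FALSE, TRUE, TRUE)   # contains TRUE more than once
all_false <- c(FALSE, FALSE)        # contains no TRUE

first_true  <- which.max(mixed)      # index of the first TRUE
fallback    <- which.max(all_false)  # every value ties at 0, so the first index wins
first_false <- which.min(mixed)      # index of the first FALSE

print(c(first_true, fallback, first_false))
# 2 1 1
```

One caveat to keep in mind: if a group held only NA values, which.max would return a zero-length result and slice() would keep no row for that group. The example data has no NA values, so this does not come up here.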
Answer 3: (score: 2)
A base R solution using aggregate and any:
aggregate(action ~ week + ID, df, any)
# week ID action
# 1 1 1 TRUE
# 2 2 1 FALSE
# 3 3 1 TRUE
# 4 1 2 FALSE
# 5 2 2 TRUE
Another base R solution:
subset(transform(df, action = ave(action, week, ID, FUN = any)), !duplicated(df[-3]))
# ID week action
# 1 1 1 TRUE
# 3 1 2 FALSE
# 5 1 3 TRUE
# 7 2 1 FALSE
# 8 2 2 TRUE
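As a rough sanity check, the summary functions used across the answers (any, sum(x) > 0 and max) can be compared on the same data. This is only a sketch: it rebuilds df with data.frame() instead of structure(), and the names by_any, by_sum and by_max are made up for illustration.

```r
# Same values as the df defined in the first answer
df <- data.frame(
  ID     = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
  week   = c(1L, 1L, 2L, 2L, 3L, 3L, 1L, 2L, 2L),
  action = c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# Three ways to collapse each ID/week group to a single value
by_any <- aggregate(action ~ ID + week, df, any)
by_sum <- aggregate(action ~ ID + week, df, function(x) sum(x) > 0)
by_max <- aggregate(action ~ ID + week, df, max)  # yields 1/0 rather than TRUE/FALSE

# All three should agree once the 1/0 result is converted back to logical
agree <- all(by_any$action == by_sum$action) &&
  all(by_any$action == as.logical(by_max$action))
print(agree)
```

Of the three, any states the intent most directly, while max has the side effect of turning the logical column into 1/0.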