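Can anyone suggest how to solve the following problem?
I want to concatenate the rows for each ID, with the restriction that only IDs whose type flow contains at least one type = C are considered: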
ID<-c(1,1,1,1,2,2,2,2,3,3)
type<-c("A","B","C","D","A","B","C","D","A","D")
mydata<-as.data.frame(cbind(ID,type))
Answer 0 (score: 2)
library(dplyr)
mydata %>% group_by(ID) %>% filter("C" %in% type) %>%
  #filter(any(type == 'C')) %>% #as Ronak suggests
  #filter(length(unique(type))==4) %>% #OR using length and unique
  summarise(type_flow=paste(type, collapse="->"))
# A tibble: 2 x 2
  ID    type_flow
  <fct> <chr>
1 1     A->B->C->D
2 2     A->B->C->D
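The two commented-out alternatives in the pipeline above are not interchangeable in general. The following base R sketch compares the per-ID tests on hypothetical data (ID 4 is not in the question; it is invented here to show a group that contains "C" but has fewer than four distinct types). The names has_c and has_four are illustrative only.

```r
# Hypothetical data: ID 4 contains "C" but has only two distinct types.
ID   <- c(1, 1, 1, 1, 3, 3, 4, 4)
type <- c("A", "B", "C", "D", "A", "D", "A", "C")

# Per-ID test equivalent to filter("C" %in% type) or filter(any(type == "C"))
has_c <- tapply(type, ID, function(x) "C" %in% x)

# Per-ID test equivalent to filter(length(unique(type)) == 4)
has_four <- tapply(type, ID, function(x) length(unique(x)) == 4)

print(has_c)     # one logical value per ID
print(has_four)  # differs from has_c for ID 4
```

On the question's data both tests select the same IDs, but only the "C" %in% type and any(type == "C") forms state the actual requirement.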
Answer 1 (score: 2)
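A base R option using ave and aggregate:
aggregate(type~ID, mydata[ave(mydata$type == "C", mydata$ID, FUN = any), ],
          function(x) paste0(x, collapse = "->"))
#  ID       type
#1  1 A->B->C->D
#2  2 A->B->C->D
The logic is the same as in @A. Suliman's post: we use ave to filter the data frame down to the IDs that contain a "C", then use aggregate to paste the type values together for each ID.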
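To make the filtering step easier to follow, here is a minimal sketch that splits the one-liner into its intermediate values. It rebuilds the question's data with data.frame() so that it runs on its own; the names keep, kept_rows and result are introduced here for illustration.

```r
# Rebuild the example data from the question
ID   <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3)
type <- c("A", "B", "C", "D", "A", "B", "C", "D", "A", "D")
mydata <- data.frame(ID, type)

# ave() applies any() within each ID and repeats the result on every row of that ID
keep <- ave(mydata$type == "C", mydata$ID, FUN = any)
print(keep)

# Rows belonging to IDs without a "C" are dropped before aggregating
kept_rows <- mydata[keep, ]

# Collapse the remaining type values into one string per ID
result <- aggregate(type ~ ID, kept_rows, function(x) paste0(x, collapse = "->"))
print(result)
```

Because keep has one logical value per row, it can be used directly as a row index, which is why no separate join or lookup of matching IDs is needed.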
Answer 2 (score: 1)
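Using data.table:
library(data.table)
setDT(mydata)
# IDs that have at least one row with type == "C"
idsWC <- mydata[type == "C", unique(ID)]
# Keep only those IDs and collapse type within each ID
mydata[ID %in% idsWC, paste(type, collapse = "->"), ID]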
   ID         V1
1:  1 A->B->C->D
2:  2 A->B->C->D