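I am trying to apply some basic "if" logic by group on a large dataset in R.
I tried writing a function and applying it to the groups with dplyr, but it does not work. What could be the problem?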
# dataframe
db <- data.frame(ID   = c(1, 1),
                 type = c("a", "b"),
                 qual = c("no", "OK"))
# if (no problem); columns are referenced as db$type instead of using attach(db)
if (db[db$type == "a", "qual"] == "OK") {
  db[db$type == "a", "qual_fin"] <- "OK"
  db[db$type == "b", "qual_fin"] <- "no"
} else if (db[db$type == "b", "qual"] == "OK") {
  db[db$type == "b", "qual_fin"] <- "OK"
  db[db$type == "a", "qual_fin"] <- "no"
} else {
  db$qual_fin <- "no"
}
# dataframe with groups
db <- data.frame(ID   = c(1, 1, 2, 2),
                 type = c("a", "b", "a", "b"),
                 qual = c("OK", "OK", "no", "OK"))
# function
quality <- function(a, b, qual_fin_a, qual_fin_b) {
  if (a == "OK") {
    qual_fin_a <- "OK"
    qual_fin_b <- "no"
  } else if (b == "OK") {
    qual_fin_b <- "OK"
    qual_fin_a <- "no"
  } else {
    qual_fin_a <- "no"
    qual_fin_b <- "no"
  }
}
# if by group
library(dplyr)
db2 <- db %>%
  group_by(ID) %>%
  do(quality(a = db[db$type == "a", "qual"],
             b = db[db$type == "b", "qual"],
             qual_fin_a = db[db$type == "a", "qual_fin"],
             qual_fin_b = db[db$type == "b", "qual_fin"]))
I would like to get a result like this:
> db
  ID type qual qual_fin
1  1    a   OK       OK
2  1    b   OK       no
3  2    a   no       no
4  2    b   OK       OK
I think the solution is very simple, but I have been struggling to find it!
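
For reference, here is a minimal sketch of one way the expected table could be produced. It is not necessarily the only or best approach. Three things in the attempt above appear to get in the way: the function only assigns to its own local variables and never returns a data frame, `db` inside `do()` refers to the whole data frame rather than the current group, and the `qual_fin` column does not exist yet when it is passed in. The sketch swaps `do()` for a grouped `mutate()` with `case_when()`. The helper column `a_ok` is a hypothetical name introduced only for this illustration, and the sketch assumes the rule is "type a wins whenever it is OK, otherwise type b keeps its own OK".

```r
library(dplyr)

db <- data.frame(ID   = c(1, 1, 2, 2),
                 type = c("a", "b", "a", "b"),
                 qual = c("OK", "OK", "no", "OK"))

db2 <- db %>%
  group_by(ID) %>%
  mutate(
    # a_ok (hypothetical helper): TRUE for every row of a group
    # whose type "a" row has qual == "OK"
    a_ok = any(qual[type == "a"] == "OK"),
    qual_fin = case_when(
      type == "a" & a_ok                 ~ "OK",  # "a" is OK, so "a" wins
      type == "b" & !a_ok & qual == "OK" ~ "OK",  # "a" is not OK, "b" is
      TRUE                               ~ "no"   # everything else
    )
  ) %>%
  ungroup() %>%
  select(-a_ok)  # drop the helper column

print(as.data.frame(db2))
```

Because `mutate()` runs once per group after `group_by(ID)`, `qual[type == "a"]` only sees the rows of the current ID, which is what the `if` / `else if` chain was trying to express.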