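Counting combinations of correct flags

Time: 2017-06-01 09:29:39

Tags: r if-statement matrix

I am trying to find a more efficient way to count how many flags are correct in each row.

Here is my data: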

head(data)
   email_flag home_number_flag mobile_flag
1:  incorrect        incorrect     correct
2:  incorrect        incorrect   incorrect
3:  incorrect        incorrect   incorrect
4:  incorrect        incorrect   incorrect
5:  incorrect        incorrect   incorrect
6:  incorrect        incorrect   incorrect

My current approach uses nested ifelse statements:

library(dplyr)

# Note: the first four branches already cover every email/mobile combination,
# so the branches after them are never reached.
data <- mutate(data, number_of_correct_flags =
    ifelse(email_flag == "correct" & mobile_flag == "correct", 2,
    ifelse(email_flag != "correct" & mobile_flag == "correct", 1,
    ifelse(email_flag == "correct" & mobile_flag != "correct", 1,
    ifelse(email_flag != "correct" & mobile_flag != "correct", 0,

    ifelse(home_number_flag == "correct" & mobile_flag == "correct", 2,
    ifelse(home_number_flag != "correct" & mobile_flag == "correct", 1,
    ifelse(home_number_flag == "correct" & mobile_flag != "correct", 1,
    ifelse(home_number_flag != "correct" & mobile_flag != "correct", 0,

    ifelse(email_flag == "correct" & mobile_flag == "correct", 2,
    ifelse(email_flag != "correct" & mobile_flag == "correct", 1,
    ifelse(email_flag == "correct" & mobile_flag != "correct", 1,
    ifelse(email_flag != "correct" & mobile_flag != "correct", 0,

    ifelse(email_flag == "correct" & mobile_flag == "correct" & home_number_flag == "correct", 3,
    ifelse(email_flag != "correct" & mobile_flag != "correct" & home_number_flag != "correct", 0, "check")))))))))))))))

Result:

head(data)
      email_flag home_number_flag mobile_flag number_of_correct_flags
    1  incorrect        incorrect     correct                       1
    2  incorrect        incorrect   incorrect                       0
    3  incorrect        incorrect   incorrect                       0
    4  incorrect        incorrect   incorrect                       0
    5  incorrect        incorrect   incorrect                       0
    6  incorrect        incorrect   incorrect                       0

Obviously, this becomes a problem as the number of flags grows.

Any ideas for a more efficient approach?

2 answers:

Answer 0 (score: 1)

data$number_of_correct_flags <- rowSums(data == "correct")

If your data contains variables other than these flag columns, you need to exclude them from the data passed to rowSums, for example with select(data, matches("flag$")).
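
To make the column filtering concrete, here is a minimal base R sketch. The `id` column and the three sample rows are assumptions added for illustration, and `grep` on the column names stands in for the `select(data, matches("flag$"))` call mentioned above.

```r
# Hypothetical example data: the id column and the values are illustrative only.
data <- data.frame(
  id               = 1:3,
  email_flag       = c("incorrect", "correct", "correct"),
  home_number_flag = c("incorrect", "incorrect", "correct"),
  mobile_flag      = c("correct", "correct", "correct"),
  stringsAsFactors = FALSE
)

# Keep only the columns whose names end in "flag".
flag_cols <- grep("flag$", names(data), value = TRUE)

# Comparing a data.frame with "correct" yields a logical matrix;
# rowSums then counts the TRUE values in each row.
data$number_of_correct_flags <- rowSums(data[flag_cols] == "correct")

print(data)
```

This sketch assumes a plain data.frame. If `data` is a data.table, `data[flag_cols]` does not select columns by name, so the selection would need data.table syntax instead.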

Answer 1 (score: 1)

Since this is a data.table, we can use data.table methods:

library(data.table)
data[, number_of_correct_flags := Reduce(`+`, lapply(.SD, `==`, "correct")), 
          .SDcols = c("email_flag", "home_number_flag", "mobile_flag")]
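
The `Reduce`/`lapply` step in the call above can be unpacked in base R. This is a rough sketch in which a plain list with made-up values stands in for `.SD`, so the names and data here are assumptions rather than part of the original answer.

```r
# Hypothetical stand-in for .SD: a list holding one character vector per flag column.
cols <- list(
  email_flag       = c("incorrect", "correct", "correct"),
  home_number_flag = c("incorrect", "incorrect", "correct"),
  mobile_flag      = c("correct", "correct", "correct")
)

# Step 1: compare each column with "correct", giving a list of logical vectors.
is_correct <- lapply(cols, `==`, "correct")

# Step 2: add the logical vectors element by element (TRUE counts as 1),
# which gives the number of correct flags per row.
number_of_correct_flags <- Reduce(`+`, is_correct)

print(number_of_correct_flags)
```

One design point worth noting: with only a single column, `Reduce` returns that column's logical vector unchanged, so the result would be TRUE/FALSE rather than a count.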