How do I extract a group-wise vector to pass to a function inside dplyr's summarize or mutate?

Time: 2019-03-07 22:40:37

Tags: r dplyr tidyverse mutate summarize

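I am trying to build a summary table of accuracy, sensitivity, and specificity using the AUC function from the psych package. I want to define the input vector (t, a 4 x 1 vector) separately for each level of a grouping variable.

What I have tried so far seems to ignore the grouping.

Example: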

library(tidyverse)
library(psych)

Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

Data %>% 
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Accuracy,
         Sensitivity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Sensitivity,
         Specificity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Specificity)

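This gets me close to the correct output, except that the Accuracy, Sensitivity, and Specificity values are computed from the first row only and then repeated for every class: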

# A tibble: 4 x 8
# Groups:   Class [4]
  Class    TP    FP    FN    TN Accuracy Sensitivity Specificity
  <fct> <dbl> <dbl> <dbl> <dbl>    <dbl>       <dbl>       <dbl>
1 A       198     1    42   569    0.947       0.995       0.931
2 B       185     1    55   570    0.947       0.995       0.931
3 C       221     6    19   564    0.947       0.995       0.931
4 D       192     1    48   569    0.947       0.995       0.931

I also tried using summarize:

Data %>% 
  group_by(Class) %>%
  summarize(Accuracy = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Accuracy,
            Sensitivity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Sensitivity,
            Specificity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Specificity)

But the computed values are the same as above.

The desired output has a separate calculation for each level of Class:

# A tibble: 4 x 8
  Class    TP    FP    FN    TN Accuracy Sensitivity Specificity
  <fct> <dbl> <dbl> <dbl> <dbl>    <dbl>       <dbl>       <dbl>
1 A       198     1    42   569     0.95        0.99        0.93
2 B       185     1    55   570     0.93        0.99        0.91
3 C       221     6    19   564     0.97        0.97        0.97
4 D       192     1    48   569     0.94        0.99        0.92

How do I get a function call inside summarize or mutate to respect the groups?

2 Answers:

Answer 0: (score: 0)

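This works, though apparently only because Class is a factor whose integer codes happen to match the row numbers: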

Data %>% 
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Accuracy,
         Sensitivity = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Sensitivity,
         Specificity = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Specificity)

But perhaps this is clearer, since bare column names are evaluated within each group:

Data %>% 
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = c(TP, FP, FN, TN))$Accuracy,
         Sensitivity = AUC(t = c(TP, FP, FN, TN))$Sensitivity,
         Specificity = AUC(t = c(TP, FP, FN, TN))$Specificity)
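
The reason the original attempt repeats the first row seems to be that `.` inside a `%>%` pipeline stands for the whole object on the left-hand side, not for the current group, whereas bare column names and helpers such as `n()` are evaluated group by group. Here is a minimal sketch of that difference, using only dplyr and a cut-down copy of the data. The column names `rows_in_dot`, `rows_in_group`, `first_TP_dot` and `TP_in_group` are made up for illustration.

```r
library(dplyr)

# Cut-down copy of the example data (only two of the count columns)
Data <- data.frame(Class = c("A", "B", "C", "D"),
                   TP = c(198, 185, 221, 192),
                   TN = c(569, 570, 564, 569))

check <- Data %>%
  group_by(Class) %>%
  mutate(rows_in_dot   = nrow(.),  # `.` is the full 4-row table
         rows_in_group = n(),      # n() counts rows in the current group
         first_TP_dot  = .$TP[1],  # always row 1 of the full table
         TP_in_group   = TP[1])    # the value belonging to this group

check
```

Every row reports four rows in `.` but one row in its group, which is why `.[1,2:5]` always feeds the counts for class A to AUC.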

Answer 1: (score: 0)

To avoid calling AUC several times per class, I would write a wrapper like this:

# Load libraries
library(tidyverse)
library(psych)

# Create data frame
Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

# Wrapper function
AUC_wrapper <- function(Class, TP, FP, FN, TN){
  res <- AUC(t = c(TP, FP, FN, TN))
  data.frame(Class = Class, 
             TP = TP,
             FP = FP,
             FN = FN,
             TN = TN,
             Accuracy = res$Accuracy, 
             Sensitivity = res$Sensitivity, 
             Specificity = res$Specificity)
}

# Run using purrr
pmap_dfr(Data, AUC_wrapper)

#   Class  TP FP FN  TN  Accuracy Sensitivity Specificity
# 1     A 198  1 42 569 0.9469136   0.9949749   0.9312602
# 2     B 185  1 55 570 0.9309494   0.9946237   0.9120000
# 3     C 221  6 19 564 0.9691358   0.9735683   0.9674099
# 4     D 192  1 48 569 0.9395062   0.9948187   0.9222042
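
If you would rather stay inside a dplyr pipeline, one possible variant (a sketch, assuming dplyr 1.0 or later) is to use `rowwise()` and store the result of a single call in a list column, then pull the individual statistics out of it. To keep the sketch runnable without psych, it uses a hypothetical stand-in function, `toy_stats()`, that only computes accuracy as (TP + TN) / total. In practice you would call `AUC(t = c(TP, FP, FN, TN))` there and extract `$Sensitivity` and `$Specificity` in the same way.

```r
library(dplyr)

Data <- data.frame(Class = c("A", "B", "C", "D"),
                   TP = c(198, 185, 221, 192),
                   FP = c(1, 1, 6, 1),
                   FN = c(42, 55, 19, 48),
                   TN = c(569, 570, 564, 569))

# Hypothetical stand-in for AUC(): takes c(TP, FP, FN, TN) and
# returns a list holding one named statistic.
toy_stats <- function(t) {
  list(Accuracy = (t[1] + t[4]) / sum(t))
}

result <- Data %>%
  rowwise() %>%                                      # treat each row as its own group
  mutate(res = list(toy_stats(c(TP, FP, FN, TN))),   # one call per row, kept in a list column
         Accuracy = res$Accuracy) %>%                # in rowwise mode `res` is the row's own list
  ungroup() %>%
  select(-res)                                       # drop the helper column

result
```

The expensive call happens once per row, and each additional statistic is just one more line that reads from `res`.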