R sumif based on multiple conditions

Time: 2020-11-06 14:22:00

Tags: r

I am trying to create a summary table based on the sum or mean of another column's values, rather than on counts.

Amount  Age ActualResult    Prediction
100     20  Pass            Pass
200     24  Pass            Pass
300     30  Pass            Fail
400     34  Pass            Fail
500     40  Fail            Pass
600     44  Fail            Pass
700     50  Fail            Fail
800     54  Fail            Fail

I can get the table of counts with the following code:

table(data$ActualResult,data$Prediction)


            Predict Pass    Predict Fail
Actual Pass 2               2
Actual Fail 2               2

But I don't know how to get the same table by the sum of Amount or by the average Age.

By sum of Amount:

            Predict Pass    Predict Fail
Actual Pass 300             700
Actual Fail 1100            1500

By average Age:

            Predict Pass    Predict Fail
Actual Pass 22              32
Actual Fail 42              52

What code would I use to create the tables by sum of Amount and by average Age?

2 answers:

Answer 0: (score: 0)

This can be done with the questionr package, using Amount as the weight:

questionr::wtd.table(
  data$ActualResult,
  data$Prediction,
  weights = data$Amount
)
#>      Fail Pass
#> Fail 1500 1100
#> Pass  700  300

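To get the average Age, weight by Age and divide by the original table of counts. Each cell then holds the sum of Age divided by the number of rows in that cell, which is the cell mean: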

questionr::wtd.table(
  data$ActualResult,
  data$Prediction,
  weights = data$Age
) / table(data$ActualResult,data$Prediction)
#>      Fail Pass
#> Fail   52   42
#> Pass   32   22
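
If adding a package is not an option, the same two tables can probably be built in base R with `xtabs()` and `tapply()`. The following is a minimal sketch, not part of the original answer: it assumes the question's data is rebuilt with `read.table()` under the name `data`, and the object names `sum_amount`, `avg_age`, `counts` and `avg_age_by_division` are chosen here for illustration only.

```r
# Rebuild the question's data (assumption: same columns and values as posted).
data <- read.table(header = TRUE, text = "
Amount Age ActualResult Prediction
100 20 Pass Pass
200 24 Pass Pass
300 30 Pass Fail
400 34 Pass Fail
500 40 Fail Pass
600 44 Fail Pass
700 50 Fail Fail
800 54 Fail Fail
")

# Sum of Amount in each ActualResult x Prediction cell.
sum_amount <- xtabs(Amount ~ ActualResult + Prediction, data = data)

# Mean of Age in each cell, computed directly.
avg_age <- tapply(
  data$Age,
  list(Actual = data$ActualResult, Predicted = data$Prediction),
  mean
)

# The divide-by-counts approach: summed Age per cell over rows per cell.
counts <- table(data$ActualResult, data$Prediction)
avg_age_by_division <- xtabs(Age ~ ActualResult + Prediction, data = data) / counts

print(sum_amount)
print(avg_age)
```

Rows are the actual result and columns the prediction, with levels in alphabetical order (Fail, then Pass), so the layout matches the questionr output above rather than the ordering in the question.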

Answer 1: (score: 0)

Here is a tidyverse approach, using your data as df:

library(tidyverse)

# sum of Amount
sum_amount <-
  df %>%
  group_by(ActualResult, Prediction) %>%
  summarize(sum = sum(Amount), .groups = "drop") %>%
  pivot_wider(names_from = "Prediction", 
              values_from = "sum", 
              names_prefix = "Predict")

# average Age
avg_age <-
  df %>%
  group_by(ActualResult, Prediction) %>%
  summarize(avg = mean(Age), .groups = "drop") %>%
  pivot_wider(names_from = "Prediction", 
              values_from = "avg", 
              names_prefix = "Predict")
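
The code above assumes a data frame named `df` that holds the question's table. A minimal sketch of one way to build it, assuming the columns are numeric and character as they appear in the question (the construction itself is not part of the original answer):

```r
# Hypothetical construction of `df` from the values posted in the question.
df <- data.frame(
  Amount = c(100, 200, 300, 400, 500, 600, 700, 800),
  Age = c(20, 24, 30, 34, 40, 44, 50, 54),
  ActualResult = c("Pass", "Pass", "Pass", "Pass", "Fail", "Fail", "Fail", "Fail"),
  Prediction = c("Pass", "Pass", "Fail", "Fail", "Pass", "Pass", "Fail", "Fail")
)

# Quick look at the structure before summarizing.
str(df)
```

With `names_prefix = "Predict"`, the wide columns come out as `PredictFail` and `PredictPass`, one row per `ActualResult` value.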