R - Combined metric based on another calculated data frame

Asked: 2018-05-01 14:44:31

Tags: r dplyr sample summarization

My data frame looks like this:

df = data.frame(Region=c(rep("NORDICS",1100),rep("DACH",900),rep("MED",1800),rep("CEE",15000),
                     rep("FRANCE",2000),rep("UK&I",2500)),
            Score=c(sample(seq(from = 1, to = 4, by = 1), size = 1100, replace = TRUE,prob = c(0.6,0.2,0.1,0.1)),
                 sample(seq(from = 1, to = 4, by = 1), size = 900, replace = TRUE,prob = c(0.3,0.3,0.2,0.2)),
                 sample(seq(from = 1, to = 4, by = 1), size = 1800, replace = TRUE,prob = c(0.8,0.1,0.05,0.05)),
                 sample(seq(from = 1, to = 4, by = 1), size = 15000, replace = TRUE,prob = c(0.2,0.2,0.2,0.4)),
                 sample(seq(from = 1, to = 4, by = 1), size = 2000, replace = TRUE,prob = c(0.9,0.05,0.03,0.02)),
                 sample(seq(from = 1, to = 4, by = 1), size = 2500, replace = TRUE,prob = c(0.9,0.05,0.03,0.02))))

The data frame is a collection of individual scores by region, where each observation is a single score for one question (column Score).

The question is answered on a scale from 1 to 4.

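From this data frame, I calculate a KPI by region from the Score column. The KPI is the number of responses equal to 1 or 2 divided by the total number of responses for the given region.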
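As a quick illustration of that definition, here is a minimal sketch on a hypothetical toy vector of scores for a single region (the values are made up and are not part of the data above):

```r
# Hypothetical toy scores for a single region (made-up values)
scores <- c(1, 2, 3, 4, 1)

# KPI = number of responses equal to 1 or 2, divided by all responses
kpi <- sum(scores %in% c(1, 2)) / length(scores)
kpi  # 0.6 (3 of the 5 responses are 1 or 2)
```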

The code below calculates the KPI by region:

library(dplyr)

KPI_by_Region <- df %>%
  group_by(Region) %>%
  summarise(KPI = sum(Score %in% c(1, 2)) / n())

My question

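Using only the KPI_by_Region data frame, which contains the KPI scores by region, can I find the KPI score for all regions combined without running my code on the whole data frame (df)?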

1 answer:

Answer 0 (score: 1)

Does this give you the result you want?

KPI_by_Region <- df %>%
  group_by(Region) %>%
  summarise(KPI = sum(Score %in% c(1,2))/n(), Count = n())

allRegionsKPI <- sum(KPI_by_Region$KPI * KPI_by_Region$Count) / sum(KPI_by_Region$Count)
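The weighted average works because KPI times Count recovers each region's number of 1-or-2 responses; summing those and dividing by the total count gives the pooled KPI. This also means the per-region KPI alone is not enough: the counts have to be kept in the summary. The sketch below checks this on a small made-up data set (the regions "A" and "B" and their scores are hypothetical). It uses base R only so that it runs without extra packages.

```r
# Small made-up data set (hypothetical values, not the question's data)
toy <- data.frame(
  Region = c(rep("A", 4), rep("B", 6)),
  Score  = c(1, 2, 3, 4,  1, 1, 1, 2, 2, 4)
)

# Per-region KPI and response count, using base R
kpi   <- tapply(toy$Score %in% c(1, 2), toy$Region, mean)
count <- tapply(toy$Score, toy$Region, length)

# Pooled KPI from the summary alone: a count-weighted mean
pooled_from_summary <- weighted.mean(kpi, w = count)

# Pooled KPI computed directly on the raw rows, for comparison
pooled_direct <- mean(toy$Score %in% c(1, 2))

# Unweighted mean of the regional KPIs, which ignores region size
unweighted <- mean(kpi)

isTRUE(all.equal(pooled_from_summary, pooled_direct))  # TRUE
```

Here both pooled values are 0.7, while the unweighted mean of the two regional KPIs is about 0.667. With the summary above, weighted.mean(KPI_by_Region$KPI, KPI_by_Region$Count) should be equivalent to the explicit sum-and-divide expression.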