My dataset looks like this:
Org_ID   Market volume   Indicator variable
1        100             1
1        200             0
1        300             0
2        50              1
2        500             1
3        400             0
3        200             0
3        300             0
3        100             0
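I would like to summarize it by Org_ID, calculating the percentage of market volume (market TRx) that falls on rows where the indicator variable is 0, like this: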
Org_ID   % of 0's by market volume
1        83.3%
2        0%
3        100%
I tried working with subgroups, but I can't seem to get it to work. Can anyone suggest what I could do?
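To spell out the calculation being asked for, here is one reading of "% of 0's by market volume", checked against the expected figure for Org_ID 1. The symbols are assumptions introduced only for illustration: $v_i$ is the market volume of row $i$, $x_i$ is its indicator variable, and $g$ is the set of rows belonging to one Org_ID.

```latex
\[
\text{pct}_g \;=\; 100 \times
\frac{\sum_{i \in g} v_i \,[x_i = 0]}{\sum_{i \in g} v_i},
\qquad
\text{pct}_{1} \;=\; 100 \times \frac{200 + 300}{100 + 200 + 300} \approx 83.3
\]
```

Under the same reading, Org_ID 2 has no rows with indicator 0, giving 0%, and every row of Org_ID 3 has indicator 0, giving 100%.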
Answer 0 (score: 0)
Using dplyr:
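library(dplyr)

df %>%
  group_by(Org_ID) %>%
  # !Indicator_variable is TRUE (counted as 1) where the indicator is 0,
  # so the product keeps only the volume of the 0-indicator rows
  summarize(sum_market_vol = sum(Market_volume * !Indicator_variable),
            tot_market_vol = sum(Market_volume)) %>%
  # share of 0-indicator volume within each Org_ID, as a percentage
  transmute(Org_ID, Perc_Market_Vol = 100 * sum_market_vol / tot_market_vol)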
Result:
# A tibble: 3 x 2
Org_ID Perc_Market_Vol
<int> <dbl>
1 1 83.33333
2 2 0.00000
3 3 100.00000
Data:
df = structure(list(Org_ID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
Market_volume = c(100L, 200L, 300L, 50L, 500L, 400L, 200L,
300L, 100L), Indicator_variable = c(1L, 0L, 0L, 1L, 1L, 0L,
0L, 0L, 0L)), .Names = c("Org_ID", "Market_volume", "Indicator_variable"
), class = "data.frame", row.names = c(NA, -9L))
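For comparison, the same volume-weighted percentage can be sketched in base R without dplyr. This is only an illustrative alternative, not part of the answer above: the names `zero_vol`, `total_vol`, `result`, and `Label` are made up for this example, and the `Label` column is an assumption about how the percentages might be formatted for display.

```r
# Same data as above, rebuilt with data.frame() so the example is self-contained
df <- data.frame(
  Org_ID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  Market_volume = c(100L, 200L, 300L, 50L, 500L, 400L, 200L, 300L, 100L),
  Indicator_variable = c(1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L)
)

# Volume on rows whose indicator is 0, summed per Org_ID
zero_vol <- tapply(df$Market_volume * (df$Indicator_variable == 0),
                   df$Org_ID, sum)

# Total volume per Org_ID
total_vol <- tapply(df$Market_volume, df$Org_ID, sum)

result <- data.frame(
  Org_ID = as.integer(names(total_vol)),
  Perc_Market_Vol = as.numeric(100 * zero_vol / total_vol)
)

# Hypothetical display column: one decimal place plus a percent sign
result$Label <- sprintf("%.1f%%", result$Perc_Market_Vol)

print(result)
```

Writing the condition as `Indicator_variable == 0` rather than `!Indicator_variable` makes the intent explicit; the two select the same rows as long as the indicator only takes the values 0 and 1.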