How to fix an error when calculating proportions with dplyr

Time: 2019-06-12 11:11:34

Tags: r dplyr

I have the following dataset:

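library(dplyr)

# data_frame() is deprecated; tibble() is the current equivalent
a <- tibble(gender = c(1, 1, 1, 0, 0, 1, 1, 0, 0, 1),
            school = c(2, 2, 2, 2, 2, 3, 3, 3, 3, 3),
            year = c(2011, 2011, 2011, 2012, 2012, 2011, 2011, 2011, 2012, 2012),
            numberofstudents = c(3, 3, 3, 2, 2, 3, 3, 3, 2, 2))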

I want to compute the proportion of males for each school in each year. The result should therefore look something like this:

tibble(maleprop = c(1, 0, 0.66, 0.5),
       school = c(2, 2, 3, 3),
       year = c(2011, 2012, 2011, 2012))

I tried the code below, but unfortunately I get an error: Column `maleprop` must be length 1 (a summary value), not 3.

final <- a %>%
  group_by(school,year) %>%
  dplyr::summarize(
    school<-mean(school),
    year<-mean(year),
    maleprop <-(sum(gender==1))/(numberofstudents))

How can I avoid this problem and get the correct result?

1 answer:

Answer 0: (score: 0)

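It looks like you have more lines than you need. `school` and `year` are already grouping variables, so `summarize()` keeps them automatically. The error comes from dividing by `numberofstudents`, which holds one value per row in the group, so the result has several values instead of one. Reducing it with `mean()` gives a single value per group. This should do it: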

final <- a %>%
  group_by(school,year) %>%
  summarize(maleprop = sum(gender)/mean(numberofstudents))
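As an alternative sketch, not part of the original answer: because `gender` is coded 0/1, the proportion of males in a group is simply the mean of `gender == 1`. This avoids relying on `numberofstudents` matching the number of rows in each group. The name `final_alt` is hypothetical, and the data is rebuilt with `tibble()` (an assumption, replacing the deprecated `data_frame()`) so the block runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt with tibble() so the example is self-contained
a <- tibble(
  gender = c(1, 1, 1, 0, 0, 1, 1, 0, 0, 1),
  school = c(2, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  year = c(2011, 2011, 2011, 2012, 2012, 2011, 2011, 2011, 2012, 2012),
  numberofstudents = c(3, 3, 3, 2, 2, 3, 3, 3, 2, 2)
)

# final_alt is a hypothetical name used only for this illustration
final_alt <- a %>%
  group_by(school, year) %>%
  # gender == 1 is TRUE/FALSE per row; its mean is the share of TRUE values
  summarize(maleprop = mean(gender == 1)) %>%
  # drop the remaining grouping so later steps see a plain tibble
  ungroup()

print(final_alt)
```

With the sample data this should return one row per school and year combination, with `school` and `year` carried over from `group_by()` without being summarized explicitly.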