Using tidyverse to create a proportion variable from factor levels of a data subset

Asked: 2017-09-06 01:33:53

Tags: r tidyverse

I have a data frame like this:

df <- data.frame(
  year   = c("1997", "1997", "1997", "1997", "1997", "1997", "1998", "1998"),
  season = c("W", "W", "W", "D", "D", "D", "W", "W"),
  result = c("Y", "Y", "N", "N", "Y", "N", "N", "N")
)

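I want to subset the data by year and season and compute the proportion of "Y" values in result within each subset. The new proportion column should be called psit_freq. An example of the desired output is below (I have written the proportions as fractions to help readers see the calculation I need).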

output <- data.frame(
  year      = c("1997", "1997", "1998"),
  season    = c("W", "D", "W"),
  psit_freq = c("2/3", "1/3", "0/2")
)

I have tried variations of:

df <- df %>%
  group_by(year, season) %>%
  summarise(psit_freq = freq())

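But I am not sure how to incorporate a conditional if/else statement to compute the proportion of "Y" responses relative to the total number of result rows in each subset.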

2 Answers:

Answer 0: (score: 2)

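All you need to do is convert result to an integer (or logical), then group by year and season and summarise the mean of result. The mean of a 0/1 or TRUE/FALSE vector is the proportion of 1s (or TRUEs), so no if/else statement is needed.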
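The answer's original code did not survive on this page, so the following is only a minimal sketch of the approach it describes, not the answerer's exact code. It assumes dplyr 1.0.0 or later (for the `.groups` argument) and uses the logical comparison `result == "Y"` in place of an explicit conversion step. The helper columns `n_yes` and `n_total` are illustrative additions that expose the numerator and denominator of each fraction.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  year   = c("1997", "1997", "1997", "1997", "1997", "1997", "1998", "1998"),
  season = c("W", "W", "W", "D", "D", "D", "W", "W"),
  result = c("Y", "Y", "N", "N", "Y", "N", "N", "N")
)

out <- df %>%
  group_by(year, season) %>%
  summarise(
    n_yes     = sum(result == "Y"),   # count of "Y" rows in the group (illustrative helper)
    n_total   = n(),                  # total rows in the group (illustrative helper)
    psit_freq = mean(result == "Y"),  # proportion of "Y" rows in the group
    .groups   = "drop"                # assumption: dplyr >= 1.0.0
  )

print(out)
```

Because `result == "Y"` yields a logical vector, `mean()` returns the share of TRUE values directly. If the fraction strings from the question are wanted instead, something like `paste0(n_yes, "/", n_total)` could be used to build them from the helper columns.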

Answer 1: (score: 0)

A3