I have a data frame named cleancc in the following format:
Education  Status
College    Default
College    No Default
HS         Default
PHD        No Default
HS         No Default
College    No Default
I want to run some calculations to see the default rate by education level. For example, something like this:
Education  Def  NDef  DefRate
HS         1    1     50.00%
College    1    2     33.33%
PHD        0    1     0.00%
The following code gives me the count for each education level:
table(cleancc$Education)
I am struggling with how to link these counts to the Status column and create a table showing the default rate.
Answer 0 (score: 1)
We can use the dplyr package to perform this aggregation:
library(dplyr)
dat %>%
  group_by(Education) %>%
  summarise(Def = sum(Status == 'Default'),
            NDef = sum(Status != 'Default'),
            DefRate = mean(Status == 'Default'))
  Education   Def  NDef   DefRate
      <chr> <int> <int>     <dbl>
1   College     1     2 0.3333333
2        HS     1     1 0.5000000
3       PHD     0     1 0.0000000
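The question asks for the rate as a percentage (for example 50.00%), while the output above is a proportion. A minimal sketch of one way to get there, assuming it is acceptable for DefRate to become a character column: append a mutate() step that formats the proportion with sprintf(). The name res is only illustrative.

```r
library(dplyr)

# Same sample data as the `dat` object used in this answer
dat <- data.frame(
  Education = c("College", "College", "HS", "PHD", "HS", "College"),
  Status = c("Default", "No Default", "Default",
             "No Default", "No Default", "No Default"),
  stringsAsFactors = FALSE
)

res <- dat %>%
  group_by(Education) %>%
  summarise(Def = sum(Status == "Default"),
            NDef = sum(Status != "Default"),
            DefRate = mean(Status == "Default")) %>%
  # Format the proportion as a percentage string with two decimals
  mutate(DefRate = sprintf("%.2f%%", 100 * DefRate))

print(res)
```

Because the formatted column is text, it is best to do this as the last step, after any further numeric work on the rate.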
We can also use the aggregate function:
aggregate(Status ~ Education, data = dat, FUN = function(x){
  c('Def' = sum(x == 'Default'),
    'NDef' = sum(x != 'Default'),
    'DefRate' = mean(x == 'Default')
  )
})
  Education Status.Def Status.NDef Status.DefRate
1   College  1.0000000   2.0000000      0.3333333
2        HS  1.0000000   1.0000000      0.5000000
3       PHD  0.0000000   1.0000000      0.0000000
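One thing worth knowing about the aggregate() result: when FUN returns a vector, the result has only two columns, Education and a matrix column named Status, even though it prints as four. That is also why the counts print as 1.0000000: they share a numeric matrix with the rate. A small sketch of flattening it into ordinary columns with do.call(data.frame, ...); the names agg and flat are only illustrative.

```r
# Same sample data as the `dat` object used in this answer
dat <- data.frame(
  Education = c("College", "College", "HS", "PHD", "HS", "College"),
  Status = c("Default", "No Default", "Default",
             "No Default", "No Default", "No Default"),
  stringsAsFactors = FALSE
)

agg <- aggregate(Status ~ Education, data = dat, FUN = function(x) {
  c(Def = sum(x == "Default"),
    NDef = sum(x != "Default"),
    DefRate = mean(x == "Default"))
})

# `agg` holds Education plus a single matrix column called Status
ncol(agg)
is.matrix(agg$Status)

# Expand the matrix column into separate data frame columns
flat <- do.call(data.frame, agg)
names(flat)
```

After flattening, columns such as flat$Status.DefRate can be used like any other numeric column.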
The sample data (dat) used above:
dput(dat)
structure(list(Education = c("College", "College", "HS", "PHD",
"HS", "College"), Status = c("Default", "No Default", "Default",
"No Default", "No Default", "No Default")), .Names = c("Education",
"Status"), row.names = c(NA, -6L), class = "data.frame")
Answer 1 (score: 1)
<button onclick="reload('8IYzyTYucKQ')">Reload</button> <-- Added the parameter to what the `data-video` should be changed