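I'm not sure how to complete my homework. I have to group the passengers by class and sex so that I can find the percentage who survived in each group.
Here is a small sample of the data from the text file I'm using.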
1=alive
0=dead
Name PClass Age Sex Survived
"Allen, Miss Elisabeth Walton" 1st 29 female 1
"Allison, Miss Helen Loraine" 1st 2 female 0
"Allison, Mr Hudson Joshua Creighton" 1st 30 male 0
"Allison, Mrs Hudson JC (Bessie Waldo Daniels)" 1st 25 female 0
"Allison, Master Hudson Trevor" 1st 0.92 male 1
"Anderson, Mr Harry" 1st 47 male 1
"Andrews, Miss Kornelia Theodosia" 1st 63 female 1
"Andrews, Mr Thomas, jr" 1st 39 male 0
"Appleton, Mrs Edward Dale (Charlotte Lamson)" 1st 58 female 1
"Artagaveytia, Mr Ramon" 1st 71 male 0
"Astor, Colonel John Jacob" 1st 47 male 0
Answer 0 (score: 0)
Consider using the dplyr library:
library(dplyr)
data %>%
group_by(PClass, Sex) %>%
summarize(ratio = sum(Survived) / n())
(Untested code, sorry.)
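To make the dplyr approach concrete, here is a minimal sketch run on the eleven sample rows from the question. The data frame name titanic and the column names passengers and pct_survived are my own choices for illustration, not anything from the original file. Because Survived is coded 0/1, its mean within a group is the fraction that survived, so multiplying by 100 gives the percentage.

```r
library(dplyr)

# The eleven sample rows from the question, keeping only the columns used here.
# `titanic` is a hypothetical name; substitute your own data frame.
titanic <- data.frame(
  PClass   = rep("1st", 11),
  Sex      = c("female", "female", "male", "female", "male", "male",
               "female", "male", "female", "male", "male"),
  Survived = c(1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0)
)

# One row per (PClass, Sex) group: group size and survival percentage.
survival <- titanic %>%
  group_by(PClass, Sex) %>%
  summarise(passengers   = n(),
            pct_survived = 100 * mean(Survived),
            .groups = "drop")

print(survival)
```

On this sample, 3 of the 5 first-class women and 2 of the 6 first-class men survived, so the two groups should come out at 60% and about 33.3%.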
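Answer 1 (score: 0)
There are several ways to do this. Assuming DT is your data.table (a plain data.frame works too), one option is the base R aggregate function. Note that it needs a FUN argument; since Survived is coded 0/1, mean gives the fraction that survived:
aggregate(DT$Survived, by = list(PClass = DT$PClass, Sex = DT$Sex), FUN = mean)
See the documentation for aggregate (?aggregate) to learn what the arguments mean.
Another possibility is to load the data.table library and then group by the columns you want: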
library(data.table)
DT[,list(mean_survival = mean(Survived)),by=.(PClass, Sex)]
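If you would rather avoid extra packages, the whole task can be sketched in base R, starting from reading the file. This is only an illustration under a few assumptions: the sample rows are pasted inline through the text argument of read.table (for the real file you would pass its path instead), and the names titanic, pct and PctSurvived are made up for this example. The quoted names contain spaces and commas, but the default quote handling in read.table keeps each one as a single field.

```r
# Sample rows copied from the question. For the real file, something like
# read.table("your_file.txt", header = TRUE) would be used instead
# (the file name here is hypothetical).
sample_lines <- c(
  'Name PClass Age Sex Survived',
  '"Allen, Miss Elisabeth Walton" 1st 29 female 1',
  '"Allison, Miss Helen Loraine" 1st 2 female 0',
  '"Allison, Mr Hudson Joshua Creighton" 1st 30 male 0',
  '"Allison, Mrs Hudson JC (Bessie Waldo Daniels)" 1st 25 female 0',
  '"Allison, Master Hudson Trevor" 1st 0.92 male 1',
  '"Anderson, Mr Harry" 1st 47 male 1',
  '"Andrews, Miss Kornelia Theodosia" 1st 63 female 1',
  '"Andrews, Mr Thomas, jr" 1st 39 male 0',
  '"Appleton, Mrs Edward Dale (Charlotte Lamson)" 1st 58 female 1',
  '"Artagaveytia, Mr Ramon" 1st 71 male 0',
  '"Astor, Colonel John Jacob" 1st 47 male 0'
)
titanic <- read.table(text = sample_lines, header = TRUE)

# Survived is 0/1, so 100 * mean() per group is the survival percentage.
pct <- aggregate(Survived ~ PClass + Sex, data = titanic,
                 FUN = function(x) 100 * mean(x))
names(pct)[3] <- "PctSurvived"

print(pct)
```

The formula interface (Survived ~ PClass + Sex) keeps the grouping columns named in the result, which is a little easier to read than the by = list(...) form.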