# First, I divide the passengers into age groups as follows:
# 1. group A: 0-17 years old
# 2. group B: 18-35 years old
# 3. group C: 36-50 years old
# 4. group D: 51-65 years old
# 5. group E: 66 years old and above
# Then I start writing code to extract the CSV data
Passenger_Age={"PassengerId":titanic["PassengerId"][:],"Age":titanic["Age"][:]}
Passenger_Age_df = pd.DataFrame(Passenger_Age,columns =["Age","PassengerId"])
Passenger_Survived={"PassengerId":titanic["PassengerId"][:],"Survived":titanic["Survived"][:]}
Passenger_Survived_df = pd.DataFrame(Passenger_Survived,columns = ["Survived","PassengerId"])
# Age contains some NaN values, so the code below drops the rows with a missing Age
cleaned_Passenger_Age_df = Passenger_Age_df.dropna()
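As the next step, I want to merge the two data frames, "cleaned_Passenger_Age_df" and "Passenger_Survived_df".
After that, I want to use the applymap function to convert the ages into A, B, C, D, E,
and then use that to find the survival rate of each age group.
My problem is that my idea is clear, but I don't know how to write the code. Can anyone help me? Thanks!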
Answer 0 (score: 0)
You can use pd.cut() to group the ages, for example:
group_names = ['A','B','C','D','E']
bins = [0,17,35,50,65,1000]
df['Age_Group'] = pd.cut(df['Age'], bins=bins, labels=group_names)
More details: pandas.cut
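To see how the bin edges behave, here is a minimal sketch on a few made-up ages (the sample values are an assumption, not taken from the Titanic data). By default pd.cut uses right-closed intervals, so 17 falls into group A and 18 into group B, while an age of exactly 0 would fall outside the first bin unless include_lowest=True is passed.

```python
import pandas as pd

# Hypothetical sample ages for illustration only (not from the Titanic data).
ages = pd.DataFrame({"Age": [0.42, 17, 18, 35, 36, 50, 51, 65, 66, 80]})

group_names = ["A", "B", "C", "D", "E"]
bins = [0, 17, 35, 50, 65, 1000]

# With the default right=True, each bin is the interval (left, right],
# so 17 is labelled A and 18 is labelled B.
ages["Age_Group"] = pd.cut(ages["Age"], bins=bins, labels=group_names)

# An age of exactly 0 is outside (0, 17] and becomes NaN by default;
# include_lowest=True makes the first interval include its left edge.
zero_default = pd.cut(pd.Series([0]), bins=bins, labels=group_names)
zero_included = pd.cut(pd.Series([0]), bins=bins, labels=group_names,
                       include_lowest=True)

print(ages)
```

Fractional ages such as 17.5 land in group B with these bins, which may or may not match the intended "0-17" definition.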
As for computing the survival rate, you can use groupby, for example:
df.groupby(['Age_Group','Survived']).count() / total_numbers
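The line above leaves total_numbers undefined. One possible way to put the whole pipeline together is sketched below, assuming a small made-up stand-in for the titanic DataFrame (the six rows are hypothetical). It merges the two frames on PassengerId, bins the ages, and takes the mean of Survived per group; since Survived is 0 or 1, that mean is the survival rate, so no separate total is needed.

```python
import pandas as pd

# Hypothetical stand-in for the titanic DataFrame from the question.
titanic = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Age": [10, 25, 30, None, 40, 70],
    "Survived": [1, 0, 1, 1, 0, 0],
})

# Same split as in the question: ages (NaN rows dropped) and survival flags.
ages = titanic[["PassengerId", "Age"]].dropna()
survived = titanic[["PassengerId", "Survived"]]

# An inner merge keeps only the passengers whose age is known.
merged = ages.merge(survived, on="PassengerId", how="inner")

merged["Age_Group"] = pd.cut(
    merged["Age"],
    bins=[0, 17, 35, 50, 65, 1000],
    labels=["A", "B", "C", "D", "E"],
)

# Survived is 0/1, so its mean within a group is that group's survival rate.
# observed=False keeps empty groups in the result (their rate is NaN).
survival_rate = merged.groupby("Age_Group", observed=False)["Survived"].mean()
print(survival_rate)
```

Because Age and Survived already live in the same titanic frame, the split and merge can be skipped entirely by calling dropna(subset=["Age"]) on titanic; the merge is kept here only to mirror the approach described in the question.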