occupation     gender  number
administrator  F       36
               M       43
artist         F       13
               M       15
doctor         M        7
educator       F       26
               M       69
How can I get the running average of the first two columns and find the mean of male (M) and female (F) users in each occupation?
import pandas as pd
users = pd.read_table('https://raw.githubusercontent.com/justmarkham/DAT8/master/data/u.user',
                      sep='|', index_col='user_id')
users.head()
         age gender  occupation zip_code
user_id
1         24      M  technician    85711
2         53      F       other    94043
3         23      M      writer    32067
4         24      M  technician    43537
5         33      F       other    15213
Answer 0 (score: 1)
# count the users in each (occupation, gender) pair
gender_ocup = users.groupby(['occupation', 'gender']).agg({'gender': 'count'})
# count the users in each occupation (one count per column)
occup_count = users.groupby(['occupation']).agg('count')
# divide the per-gender counts by the per-occupation counts and multiply by 100
occup_gender = gender_ocup.div(occup_count, level = "occupation") * 100
# show every row of the 'gender' column, i.e. the percentage of each gender per occupation
occup_gender.loc[: , 'gender']
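The same percentage can also be sketched with `size()` on a Series, which avoids the extra `age` and `zip_code` columns produced by `agg('count')`. This is only a minimal sketch, not the method from the answer above. To keep it runnable offline, it rebuilds a small `sample` frame from the counts shown in the question's table instead of downloading `u.user`. The names `counts`, `sample`, `gender_counts`, `occupation_counts`, `gender_pct` and `gender_pct_wide` are made up for illustration.

```python
import pandas as pd

# Counts copied from the table in the question (assumption: only these
# four occupations; the real u.user file contains more).
counts = {
    ("administrator", "F"): 36,
    ("administrator", "M"): 43,
    ("artist", "F"): 13,
    ("artist", "M"): 15,
    ("doctor", "M"): 7,
    ("educator", "F"): 26,
    ("educator", "M"): 69,
}

# Rebuild one row per user so the frame resembles the relevant columns of `users`.
rows = [
    {"occupation": occupation, "gender": gender}
    for (occupation, gender), n in counts.items()
    for _ in range(n)
]
sample = pd.DataFrame(rows)

# Number of users in each (occupation, gender) pair.
gender_counts = sample.groupby(["occupation", "gender"]).size()

# Number of users in each occupation.
occupation_counts = sample.groupby("occupation").size()

# Share of each gender within its occupation, in percent.
# level="occupation" broadcasts the per-occupation totals across the MultiIndex.
gender_pct = gender_counts.div(occupation_counts, level="occupation") * 100

# Wide view: one row per occupation, one column per gender.
# A pair that never occurs (e.g. doctor / F) is filled with 0.
gender_pct_wide = gender_pct.unstack("gender", fill_value=0).round(1)
print(gender_pct_wide)
```

With the real data, replacing `sample` with `users` should give the same kind of result for every occupation in the file.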