Answer 0 (score: 2)
Here is one approach using pandas.pivot_table and vectorized pandas calculations. Note that this approach does not require a separate groupby.
import pandas as pd

df = pd.DataFrame([['A', 'F'], ['A', 'F'], ['A', 'M'], ['B', 'M'], ['B', 'M'], ['B', 'F'],
                   ['C', 'M'], ['C', 'M'], ['D', 'F']], columns=['Occupation', 'Gender'])
# pivot input dataframe
res = df.pivot_table(index='Occupation', columns='Gender', aggfunc='size', fill_value=0)
# calculate ratios
sums = res[['F', 'M']].sum(axis=1)
res['FemaleRatio'] = res['F'] / sums
res['MaleRatio'] = res['M'] / sums
print(res)
Gender      F  M  FemaleRatio  MaleRatio
Occupation
A           2  1     0.666667   0.333333
B           1  2     0.333333   0.666667
C           0  2     0.000000   1.000000
D           1  0     1.000000   0.000000
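As a possible shortcut for the pivot-then-divide steps above, pandas.crosstab can count and normalize in one call. This is a minimal sketch on the same sample data; the variable name `ratios` is introduced here for illustration and is not part of the original answer.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame(
    [['A', 'F'], ['A', 'F'], ['A', 'M'], ['B', 'M'], ['B', 'M'], ['B', 'F'],
     ['C', 'M'], ['C', 'M'], ['D', 'F']],
    columns=['Occupation', 'Gender'])

# normalize='index' divides every count by its row (occupation) total,
# so each row holds the share of F and M within that occupation
ratios = pd.crosstab(df['Occupation'], df['Gender'], normalize='index')
print(ratios)
```

Unlike the pivot_table version, this returns only the ratios, not the raw counts, so it fits best when the counts themselves are not needed.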
Answer 1 (score: 1)
# x: count of each gender within each occupation
x = users.groupby(['occupation', 'gender'])['gender'].count()
# y: total count per occupation
y = users.groupby(['occupation'])['gender'].count()
# ratio rule: share of each gender within its occupation, as a percentage
r = ((x / y) * 100).round(2)
print(r)
occupation     gender
administrator  F          45.57
               M          54.43
artist         F          46.43
               M          53.57
doctor         M         100.00
educator       F          27.37
               M          72.63
engineer       F           2.99
               M          97.01
entertainment  F          11.11
               M          88.89
executive      F           9.38
               M          90.62
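The division above pairs a Series indexed by (occupation, gender) with one indexed by occupation only. A sketch of the same calculation with that alignment spelled out through `Series.div(..., level=...)` follows. The `users` table is not shown in the answer, so the small frame below is a hypothetical stand-in with made-up rows.

```python
import pandas as pd

# Hypothetical stand-in for the `users` table (not the real dataset)
users = pd.DataFrame({
    'occupation': ['artist', 'artist', 'artist', 'doctor', 'doctor'],
    'gender': ['F', 'M', 'M', 'M', 'M'],
})

# Count per (occupation, gender) pair and per occupation
x = users.groupby(['occupation', 'gender'])['gender'].count()
y = users.groupby('occupation')['gender'].count()

# level='occupation' broadcasts each occupation total across its gender rows
r = (x.div(y, level='occupation') * 100).round(2)
print(r)
```

As in the output above, where doctor has only an M row, a gender that never occurs for an occupation is simply absent from the result rather than reported as 0.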
Answer 2 (score: 0)
Maybe late to the party, but here is my exact answer:
# create pivot
male_ratio = users.pivot_table(index='occupation', columns='gender', aggfunc='size', fill_value=0)
# calculate male ratio
sums = male_ratio[['F', 'M']].sum(axis=1)
male_ratio['MaleRatio'] = round(100 * male_ratio['M'] / sums , 1)
# result
male_ratio['MaleRatio']
occupation
administrator 54.4
artist 53.6
doctor 100.0
educator 72.6
engineer 97.0
entertainment 88.9
executive 90.6
healthcare 31.2
homemaker 14.3
lawyer 83.3
librarian 43.1
marketing 61.5
none 55.6
other 65.7
programmer 90.9
retired 92.9
salesman 75.0
scientist 90.3
student 69.4
technician 96.3
writer 57.8
Name: MaleRatio, dtype: float64
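If only the male percentage is needed, one alternative is to skip the pivot and take the per-occupation mean of a boolean "is male" flag. This is a sketch under assumptions: the `users` frame below is a hypothetical stand-in with made-up rows, and `is_male` is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the `users` table (not the real dataset)
users = pd.DataFrame({
    'occupation': ['artist', 'artist', 'artist', 'doctor', 'doctor', 'writer'],
    'gender': ['F', 'M', 'M', 'M', 'M', 'F'],
})

# True where the user is male; the mean of a boolean Series is the share of True values
is_male = users['gender'].eq('M')

# Group the flag by occupation, average it, and express it as a percentage
male_ratio = is_male.groupby(users['occupation']).mean().mul(100).round(1)
print(male_ratio)
```

This variant does not index the 'F' and 'M' columns by name, so it also works on data where one of the two genders does not appear at all.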