How to find ratios from a groupby in a pandas Series

Time: 2018-06-24 14:20:02

Tags: python pandas dataframe pandas-groupby

I grouped my dataset by occupation and gender using groupby. Now I want to find the ratio of men to women for each occupation. I can't figure out how to proceed.

3 Answers:

Answer 0: (score: 2)

Here is one way using pandas.pivot_table and vectorised pandas calculations. Note that this method does not require a separate groupby.

import pandas as pd

df = pd.DataFrame([['A', 'F'], ['A', 'F'], ['A', 'M'], ['B', 'M'], ['B', 'M'], ['B', 'F'],
                   ['C', 'M'], ['C', 'M'], ['D', 'F']], columns=['Occupation', 'Gender'])

# pivot input dataframe
res = df.pivot_table(index='Occupation', columns='Gender', aggfunc='size', fill_value=0)

# calculate ratios
sums = res[['F', 'M']].sum(axis=1)
res['FemaleRatio'] = res['F'] / sums
res['MaleRatio'] = res['M'] / sums

print(res)

Gender      F  M  FemaleRatio  MaleRatio
Occupation                              
A           2  1     0.666667   0.333333
B           1  2     0.333333   0.666667
C           0  2     0.000000   1.000000
D           1  0     1.000000   0.000000
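
As a possible alternative to the pivot above, pd.crosstab can compute the per-occupation shares in one call. This is only a sketch on the same toy data; the variable name ratios is illustrative and not part of the original answer.

```python
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame(
    [['A', 'F'], ['A', 'F'], ['A', 'M'], ['B', 'M'], ['B', 'M'], ['B', 'F'],
     ['C', 'M'], ['C', 'M'], ['D', 'F']],
    columns=['Occupation', 'Gender'],
)

# normalize='index' divides every count by its row (occupation) total,
# so each row of the result sums to 1.
ratios = pd.crosstab(df['Occupation'], df['Gender'], normalize='index')
print(ratios)
```

The result holds fractions only; if the raw counts are needed as well, the pivot_table version above keeps both side by side.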

Answer 1: (score: 1)

# x: number of users per (occupation, gender) pair
x = users.groupby(['occupation', 'gender'])['gender'].count()
# y: total number of users per occupation
y = users.groupby(['occupation'])['gender'].count()
# percentage of each gender within its occupation
r = ((x / y) * 100).round(2)
print(r)

occupation     gender
administrator  F          45.57
               M          54.43
artist         F          46.43
               M          53.57
doctor         M         100.00
educator       F          27.37
               M          72.63
engineer       F           2.99
               M          97.01
entertainment  F          11.11
               M          88.89
executive      F           9.38
               M          90.62
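
The same percentages can probably be obtained with a single groupby by asking value_counts to normalize within each group. The sketch below assumes a small made-up users frame (the real one is not shown in this thread); pct and table are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for the `users` frame, which is not shown in the thread.
users = pd.DataFrame({
    'occupation': ['artist', 'artist', 'artist', 'doctor', 'doctor',
                   'educator', 'educator'],
    'gender':     ['F', 'M', 'M', 'M', 'M', 'F', 'M'],
})

# Share of each gender within each occupation, as a percentage.
pct = (
    users.groupby('occupation')['gender']
         .value_counts(normalize=True)
         .mul(100)
         .round(2)
)

# One column per gender; combinations that never occur become 0
# instead of being silently missing (as with 'doctor' / 'F' here).
table = pct.unstack(fill_value=0)
print(table)
```

Unstacking with fill_value=0 makes occupations with only one gender explicit, which the stacked output above leaves out.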

Answer 2: (score: 0)

Maybe late to the party, but here is my exact answer:

# create pivot
male_ratio = users.pivot_table(index='occupation', columns='gender', aggfunc='size', fill_value=0)

# calculate male ratio
sums = male_ratio[['F', 'M']].sum(axis=1)
male_ratio['MaleRatio'] = (100 * male_ratio['M'] / sums).round(1)

# result
male_ratio['MaleRatio']

occupation
administrator     54.4
artist            53.6
doctor           100.0
educator          72.6
engineer          97.0
entertainment     88.9
executive         90.6
healthcare        31.2
homemaker         14.3
lawyer            83.3
librarian         43.1
marketing         61.5
none              55.6
other             65.7
programmer        90.9
retired           92.9
salesman          75.0
scientist         90.3
student           69.4
technician        96.3
writer            57.8
Name: MaleRatio, dtype: float64