How to get the percentage of group sizes

Time: 2019-02-20 12:28:21

Tags: python pandas

I am looking for a way to get the percentage of each group's size.

The following works, but it returns counts, and I want percentages instead:

df.groupby(['state', 'approved_or_not']).size()

Output:

school_state  project_is_approved
AK            0                         55
              1                        290
AL            0                        256
              1                       1506
AR            0                        177
              1                        872
AZ            0                        347
              1                       1800

I tried but could not find a way to do it. Can anyone help?

2 answers:

Answer 0: (score: 9)

Use SeriesGroupBy.value_counts with the parameter normalize=True:

df.groupby('state')['approved_or_not'].value_counts(normalize=True)

Sample:

import numpy as np
import pandas as pd

np.random.seed(2019)

L = list('ABC')
df = pd.DataFrame({'state':np.random.choice(L, size=10),
                   'approved_or_not':np.random.choice([0,1], size=10)})
print (df)
  state  approved_or_not
0     A                0
1     C                0
2     B                1
3     A                0
4     C                1
5     C                1
6     A                0
7     B                0
8     A                0
9     C                1

a = df.groupby(['state', 'approved_or_not']).size()
print (a)
state  approved_or_not
A      0                  4
B      0                  1
       1                  1
C      0                  1
       1                  3
dtype: int64

a = df.groupby('state')['approved_or_not'].value_counts(normalize=True)
print (a)
state  approved_or_not
A      0                  1.00
B      0                  0.50
       1                  0.50
C      1                  0.75
       0                  0.25
Name: approved_or_not, dtype: float64
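
If actual percentages (0 to 100) in a table layout are preferred over fractions, the result above can be scaled and reshaped. The following is only a sketch: the data frame repeats the ten sample rows by hand so it does not depend on the random generator, and the names `pct` and `wide` are illustrative.

```python
import pandas as pd

# The same ten rows as in the sample above, typed in by hand.
df = pd.DataFrame({
    'state': list('ACBACCABAC'),
    'approved_or_not': [0, 0, 1, 0, 1, 1, 0, 0, 0, 1],
})

# Share of each approved_or_not value within its state, scaled to percent.
pct = (df.groupby('state')['approved_or_not']
         .value_counts(normalize=True)
         .mul(100)
         .round(2))

# One row per state, one column per approved_or_not value.
# Combinations that never occur (here A with 1) are filled with 0.
wide = pct.unstack(fill_value=0)
print(wide)
```

Note that value_counts only lists combinations that actually occur, which is why the fill value is needed after unstacking.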

EDIT: You can also divide each count by the sum of its first level (state) using Series.div:

a = df.groupby(['state', 'approved_or_not']).size()

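# Series.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent.
a = a.div(a.groupby(level=0).sum(), level=0)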
print (a)
state  approved_or_not
A      0                  1.00
B      0                  0.50
       1                  0.50
C      0                  0.25
       1                  0.75
dtype: float64
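
As a different route to the same numbers, not part of the original answer, pd.crosstab can normalize each row directly. This is a minimal sketch on the ten sample rows typed in by hand; the name `shares` is illustrative.

```python
import pandas as pd

# The same ten rows as in the sample above, typed in by hand.
df = pd.DataFrame({
    'state': list('ACBACCABAC'),
    'approved_or_not': [0, 0, 1, 0, 1, 1, 0, 0, 0, 1],
})

# normalize='index' divides every row (one state) by that row's total,
# so each row sums to 1.
shares = pd.crosstab(df['state'], df['approved_or_not'], normalize='index')
print(shares)
```

The result is a wide table with one row per state, and combinations that do not occur show up as 0.0 rather than being left out.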

Answer 1: (score: 0)

I solved this using an aggregation function.

Example:

import pandas as pd
import numpy as np

np.random.seed(316)

lst = ['Karnataka', 'Tamil Nadu', 'Kerala']

data = pd.DataFrame({'state':np.random.choice(lst, size=10), 'approved_or_not':np.random.choice([2,4], size=10)})

print (data)

Output

data.groupby(['state', 'approved_or_not']).agg({'approved_or_not': ["size", "mean"]})
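
The aggregation above reports a size and a mean per group, not a share. One possible way to show the count next to its share of the state total is sketched below. The ten rows are made up for illustration (they are not the output of the seeded random call above), and the names `counts`, `share` and `result` are assumptions.

```python
import pandas as pd

# Hypothetical rows using the same states and the same values (2 and 4).
data = pd.DataFrame({
    'state': ['Karnataka', 'Kerala', 'Kerala', 'Tamil Nadu', 'Karnataka',
              'Kerala', 'Tamil Nadu', 'Karnataka', 'Kerala', 'Karnataka'],
    'approved_or_not': [2, 4, 4, 2, 4, 2, 4, 2, 4, 2],
})

# Number of rows for each (state, approved_or_not) pair.
counts = data.groupby(['state', 'approved_or_not']).size().rename('count')

# Divide each count by the total of its state; transform keeps the
# original MultiIndex so the division aligns row by row.
share = (counts / counts.groupby(level='state').transform('sum')).rename('share')

result = pd.concat([counts, share], axis=1)
print(result)
```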