pandas: find percentile stats of a given column

Time: 2016-09-19 20:50:57

Tags: python python-2.7 pandas statistics

I have a pandas DataFrame my_df, and I can find the mean(), median() and mode() of a given column:

my_df['field_A'].mean()
my_df['field_A'].median()
my_df['field_A'].mode()

Is it possible to get more detailed statistics, such as the 90th percentile? Thanks!

4 answers:

Answer 0: (score: 31)

You can use the pandas.DataFrame.quantile() function, as shown below.

import pandas as pd
import random

A = [ random.randint(0,100) for i in range(10) ]
B = [ random.randint(0,100) for i in range(10) ]

df = pd.DataFrame({ 'field_A': A, 'field_B': B })
df
#    field_A  field_B
# 0       90       72
# 1       63       84
# 2       11       74
# 3       61       66
# 4       78       80
# 5       67       75
# 6       89       47
# 7       12       22
# 8       43        5
# 9       30       64

df.field_A.mean()   # Same as df['field_A'].mean()
# 54.399999999999999

df.field_A.median() 
# 62.0

# You can call `quantile(q)` to get the q-th quantile,
# where `q` is a fraction between 0 and 1.

df.field_A.quantile(0.1) # 10th percentile
# 11.9

df.field_A.quantile(0.5) # same as median
# 62.0

df.field_A.quantile(0.9) # 90th percentile
# 89.10000000000001
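
As a complement to the calls above, here is a minimal sketch that collects several percentiles in one summary through `describe()` and its `percentiles` parameter. The values are copied from the `field_A` column printed above; the variable name `summary` is only illustrative.

```python
import pandas as pd

# Values copied from the field_A column shown in the answer above.
df = pd.DataFrame({"field_A": [90, 63, 11, 61, 78, 67, 89, 12, 43, 30]})

# describe() reports count, mean, std, min, max and the requested percentiles.
# Percentile rows are labelled with strings such as "10%" and "90%".
summary = df["field_A"].describe(percentiles=[0.1, 0.5, 0.9])
print(summary)

# Individual entries can be read back by label.
print(summary["90%"])
```

This is handy when you want the usual summary statistics and a few percentiles side by side; for a single value, `quantile()` is the more direct call.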

Answer 1: (score: 7)

Assume a series s:

import numpy as np
import pandas as pd

s = pd.Series(np.arange(100))

To get the quantiles for [.1, .2, .3, .4, .5, .6, .7, .8, .9]:

s.quantile(np.linspace(.1, 1, 9, endpoint=False))

0.1     9.9
0.2    19.8
0.3    29.7
0.4    39.6
0.5    49.5
0.6    59.4
0.7    69.3
0.8    79.2
0.9    89.1
dtype: float64

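Or, using the 'lower' interpolation so that each result is an actual value from the series:

s.quantile(np.linspace(.1, 1, 9, endpoint=False), interpolation='lower')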

0.1     9
0.2    19
0.3    29
0.4    39
0.5    49
0.6    59
0.7    69
0.8    79
0.9    89
dtype: int32
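
The two outputs above differ only in how pandas handles a quantile that falls between two data points. The sketch below compares the interpolation options that `Series.quantile` accepts for a single quantile; the names `methods` and `results` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(100))

# Interpolation options accepted by Series.quantile.
methods = ["linear", "lower", "higher", "nearest", "midpoint"]

# For q=0.1 the target position lies between the values 9 and 10,
# so each method resolves the gap differently.
results = {m: s.quantile(0.1, interpolation=m) for m in methods}

for name, value in results.items():
    print(name, value)
```

`linear` (the default) interpolates between the two neighbours, while `lower`, `higher` and `nearest` always return a value that is actually present in the data.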

Answer 2: (score: 5)

I found that the following works:

grep

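Answer 3: (score: 0)

You can even pass multiple columns that contain null values and get multiple quantiles at once (I use the 95th percentile for outlier treatment).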
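
A minimal sketch of what this could look like, assuming a small made-up frame with missing values in both columns. The data and the names `q`, `upper` and `capped` are hypothetical, and clipping at the 95th percentile is only one possible reading of "outlier treatment".

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "field_A": [1, 2, 3, 4, np.nan],
    "field_B": [10, np.nan, 30, 40, 50],
})

# Several quantiles for several columns at once.
# Missing values are skipped separately within each column.
q = df[["field_A", "field_B"]].quantile([0.5, 0.95])
print(q)

# One possible outlier treatment: cap each column at its own 95th percentile.
upper = df.quantile(0.95)            # Series indexed by column name
capped = df.clip(upper=upper, axis=1)
print(capped)
```

Note that calling `dropna()` on the frame first would discard every row that has a missing value in any column, which can change the quantiles of the other columns.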
