将百分位数分布显示为python中的数据帧

时间:2017-11-27 18:58:52

标签: python pandas loops dataframe percentile

我正在尝试将每列的百分位数分布输出显示为数据帧,因为我希望稍后将其导出到csv。

我只是简单地循环所有列:

for column in data:
    print(data[column].describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99]))

然而,我无法弄清楚如何休息。非常感谢任何帮助!

+使用其他查询编辑主要问题:

我还希望按照data.groupby(data['MARKET']).describe([.01,.1,.2,.3,.4,.5,.6‌​,.7,.8,.9,.99])这样的列对输出进行分组。但是,我收到错误,如" describe()接受1个位置参数但是2个被给出"。我该如何处理这个问题?

样本数据集:

d = {'col1': [1, 2, 3, 2, 1],
     'col2': [3, 4, 5, 6, 7], 
     'country': ['TR', 'UK', 'UK' , 'TR', 'TR']};

df = pd.DataFrame(data=d)

1 个答案:

答案 0 :(得分:3)

这就是你想要的吗?

In [19]: df = pd.DataFrame(np.arange(15).reshape(5,3)).add_prefix('col')

In [20]: df
Out[20]:
   col0  col1  col2
0     0     1     2
1     3     4     5
2     6     7     8
3     9    10    11
4    12    13    14

In [21]: df.describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99])
Out[21]:
            col0       col1       col2
count   5.000000   5.000000   5.000000
mean    6.000000   7.000000   8.000000
std     4.743416   4.743416   4.743416
min     0.000000   1.000000   2.000000
1%      0.120000   1.120000   2.120000
10%     1.200000   2.200000   3.200000
20%     2.400000   3.400000   4.400000
30%     3.600000   4.600000   5.600000
40%     4.800000   5.800000   6.800000
50%     6.000000   7.000000   8.000000
60%     7.200000   8.200000   9.200000
70%     8.400000   9.400000  10.400000
80%     9.600000  10.600000  11.600000
90%    10.800000  11.800000  12.800000
99%    11.880000  12.880000  13.880000
max    12.000000  13.000000  14.000000

<强>更新

  

d = {&#39; col1&#39;:[1,2,3,2,1],&#39; col2&#39;:[3,4,5,6,7],&# 39;国&#39 ;:   [&#39; TR&#39;,&#39;英国&#39;&#39;英国&#39; ,&#39; TR&#39;,&#39; TR&#39;]};

     

df = pd.DataFrame(data = d)

In [29]: df.groupby('country').apply(lambda x: x.describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99]))
Out[29]:
                   col1      col2
country
TR      count  3.000000  3.000000
        mean   1.333333  5.333333
        std    0.577350  2.081666
        min    1.000000  3.000000
        1%     1.000000  3.060000
        10%    1.000000  3.600000
        20%    1.000000  4.200000
        30%    1.000000  4.800000
        40%    1.000000  5.400000
        50%    1.000000  6.000000
        60%    1.200000  6.200000
        70%    1.400000  6.400000
        80%    1.600000  6.600000
        90%    1.800000  6.800000
        99%    1.980000  6.980000
        max    2.000000  7.000000
UK      count  2.000000  2.000000
        mean   2.500000  4.500000
        std    0.707107  0.707107
        min    2.000000  4.000000
        1%     2.010000  4.010000
        10%    2.100000  4.100000
        20%    2.200000  4.200000
        30%    2.300000  4.300000
        40%    2.400000  4.400000
        50%    2.500000  4.500000
        60%    2.600000  4.600000
        70%    2.700000  4.700000
        80%    2.800000  4.800000
        90%    2.900000  4.900000
        99%    2.990000  4.990000
        max    3.000000  5.000000