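I'm trying to get the percentile distribution of each column as a DataFrame, because I want to export it to CSV later.
At the moment I simply loop over all the columns:
for column in data:
    print(data[column].describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99]))
However, I can't figure out how to do the rest. Any help is much appreciated!
Edit, adding a further query to the main question:
I would also like to group the output by a column, along the lines of data.groupby(data['MARKET']).describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99]).
However, I get an error like "describe() takes 1 positional argument but 2 were given". How can I handle this?
Sample dataset: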
import pandas as pd

d = {'col1': [1, 2, 3, 2, 1],
     'col2': [3, 4, 5, 6, 7],
     'country': ['TR', 'UK', 'UK', 'TR', 'TR']}
df = pd.DataFrame(data=d)
Answer 0 (score: 3)
Is this what you want?
In [19]: df = pd.DataFrame(np.arange(15).reshape(5,3)).add_prefix('col')
In [20]: df
Out[20]:
   col0  col1  col2
0     0     1     2
1     3     4     5
2     6     7     8
3     9    10    11
4    12    13    14
In [21]: df.describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99])
Out[21]:
            col0       col1       col2
count   5.000000   5.000000   5.000000
mean    6.000000   7.000000   8.000000
std     4.743416   4.743416   4.743416
min     0.000000   1.000000   2.000000
1%      0.120000   1.120000   2.120000
10%     1.200000   2.200000   3.200000
20%     2.400000   3.400000   4.400000
30%     3.600000   4.600000   5.600000
40%     4.800000   5.800000   6.800000
50%     6.000000   7.000000   8.000000
60%     7.200000   8.200000   9.200000
70%     8.400000   9.400000  10.400000
80%     9.600000  10.600000  11.600000
90%    10.800000  11.800000  12.800000
99%    11.880000  12.880000  13.880000
max    12.000000  13.000000  14.000000
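Since the goal in the question is a CSV export, here is a minimal sketch of that step. It relies on the fact that describe() called on a whole DataFrame already returns a DataFrame, so no loop over columns is needed. The names percentiles, summary, buffer and csv_text are illustrative only, and an in-memory buffer stands in for a real file path (passing a path such as a hypothetical 'percentiles.csv' to to_csv would write to disk instead).

```python
import io

import numpy as np
import pandas as pd

# Percentiles requested in the question.
percentiles = [.01, .1, .2, .3, .4, .5, .6, .7, .8, .9, .99]

# Same toy frame as in the transcript above.
df = pd.DataFrame(np.arange(15).reshape(5, 3)).add_prefix('col')

# One call summarises every numeric column; the result is a DataFrame
# with one row per statistic and one column per original column.
summary = df.describe(percentiles=percentiles)

# Write the summary as CSV. An in-memory buffer is used here so the
# sketch has no side effects; a file path works the same way.
buffer = io.StringIO()
summary.to_csv(buffer)
csv_text = buffer.getvalue()
```

The statistic names (count, mean, 1%, ...) end up in the first CSV column because they form the index of the summary.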
UPDATE:
d = {'col1': [1, 2, 3, 2, 1], 'col2': [3, 4, 5, 6, 7], 'country': ['TR', 'UK', 'UK', 'TR', 'TR']}
df = pd.DataFrame(data=d)
In [29]: df.groupby('country').apply(lambda x: x.describe([.01,.1,.2,.3,.4,.5,.6,.7,.8,.9,.99]))
Out[29]:
                   col1      col2
country
TR      count  3.000000  3.000000
        mean   1.333333  5.333333
        std    0.577350  2.081666
        min    1.000000  3.000000
        1%     1.000000  3.060000
        10%    1.000000  3.600000
        20%    1.000000  4.200000
        30%    1.000000  4.800000
        40%    1.000000  5.400000
        50%    1.000000  6.000000
        60%    1.200000  6.200000
        70%    1.400000  6.400000
        80%    1.600000  6.600000
        90%    1.800000  6.800000
        99%    1.980000  6.980000
        max    2.000000  7.000000
UK      count  2.000000  2.000000
        mean   2.500000  4.500000
        std    0.707107  0.707107
        min    2.000000  4.000000
        1%     2.010000  4.010000
        10%    2.100000  4.100000
        20%    2.200000  4.200000
        30%    2.300000  4.300000
        40%    2.400000  4.400000
        50%    2.500000  4.500000
        60%    2.600000  4.600000
        70%    2.700000  4.700000
        80%    2.800000  4.800000
        90%    2.900000  4.900000
        99%    2.990000  4.990000
        max    3.000000  5.000000
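As a possible alternative to apply, the sketch below passes the percentiles by keyword to the grouped describe(). This assumes a pandas version whose groupby describe() accepts a percentiles keyword, which is worth checking against the installed version; the error quoted in the question came from passing the list positionally. The names wide, long, buffer and csv_text are illustrative, and the explicit column selection ['col1', 'col2'] is a choice made here to keep the grouping column out of the summary.

```python
import io

import pandas as pd

percentiles = [.01, .1, .2, .3, .4, .5, .6, .7, .8, .9, .99]

d = {'col1': [1, 2, 3, 2, 1],
     'col2': [3, 4, 5, 6, 7],
     'country': ['TR', 'UK', 'UK', 'TR', 'TR']}
df = pd.DataFrame(data=d)

# Keyword form: one row per country, and columns that form a
# (original column, statistic) MultiIndex.
wide = df.groupby('country')[['col1', 'col2']].describe(percentiles=percentiles)

# Move the statistic level from the columns into the index to get the
# same long layout as the apply-based output: (country, statistic) rows.
long = wide.stack(level=1)

# Export; an in-memory buffer stands in for a real file path.
buffer = io.StringIO()
long.to_csv(buffer)
csv_text = buffer.getvalue()
```

Either layout can be written with to_csv; the long one gives a single header row, which tends to be easier to read back in.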