Python pandas groupby correlation includes all columns instead of the selected columns

Time: 2014-03-07 00:02:00

Tags: python pandas

In [27]: df
Out[27]: 
          a         b         c       g
0  0.273411 -0.573347 -0.411541  group1
1 -1.068101 -0.987671  1.501582  group1
2  0.784626  0.003757 -0.938192  group1
3  0.221111  0.480360  0.546703  group1
4  1.277762 -1.285070 -0.415170  group1
5  1.870870  1.320776 -1.628251  group1
6 -0.158213  1.317783  1.803866  group2
7 -1.193541 -2.841256 -0.784413  group2
8 -1.242116  0.037265 -0.888934  group2
9 -0.782190  0.678842 -0.127150  group2

10 rows × 4 columns 

In [28]: df.groupby('g')[['a', 'b']].mean()
Out[28]: 
               a         b
g
group1  0.559947 -0.173533
group2 -0.844015 -0.201841

2 rows × 2 columns 

In [29]: df.groupby('g')[['a', 'b']].corr()
Out[29]: 

                 a         b         c
g
group1 a  1.000000  0.490017 -0.919512
       b  0.490017  1.000000 -0.526212
       c -0.919512 -0.526212  1.000000
group2 a  1.000000  0.696526  0.988657
       b  0.696526  1.000000  0.652678
       c  0.988657  0.652678  1.000000

6 rows × 3 columns 


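As shown above, when I compute the correlation matrix per group, pandas ignores the fact that I selected only columns a and b and computes the correlations across all columns instead. The same selection works as expected with mean(). Is this a bug?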
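Whatever the cause of the behavior above, one way to sidestep it is to drop the unwanted columns before grouping, so that column c cannot enter the calculation at all. The following is a minimal sketch, not a confirmed fix: it rebuilds the frame from the values printed in the question, and the names corr_ab and pairwise are illustrative only. It assumes that groupby(...).corr() leaves the grouping key g out of the result, and the dictionary at the end is an alternative for the case where only the single a-b coefficient per group is wanted.

```python
import pandas as pd

# Data copied from the frame printed in the question.
df = pd.DataFrame({
    "a": [0.273411, -1.068101, 0.784626, 0.221111, 1.277762,
          1.870870, -0.158213, -1.193541, -1.242116, -0.782190],
    "b": [-0.573347, -0.987671, 0.003757, 0.480360, -1.285070,
          1.320776, 1.317783, -2.841256, 0.037265, 0.678842],
    "c": [-0.411541, 1.501582, -0.938192, 0.546703, -0.415170,
          -1.628251, 1.803866, -0.784413, -0.888934, -0.127150],
    "g": ["group1"] * 6 + ["group2"] * 4,
})

# Keep only the columns to correlate plus the grouping key *before*
# grouping, so column c is not present when corr() runs.
corr_ab = df[["a", "b", "g"]].groupby("g").corr()
print(corr_ab)

# Alternative: compute just the a-b coefficient for each group.
pairwise = {name: grp["a"].corr(grp["b"]) for name, grp in df.groupby("g")}
print(pairwise)
```

The first result is a 2x2 correlation matrix per group, indexed by (g, column); the second is a plain mapping from group name to a single coefficient.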

0 answers:

No answers