Question

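Pandas df.describe() is a very useful method for getting an overview of a DataFrame. However, it summarizes by column, and I would like an overview of the rows instead. Is there a way to make it work "by row" without transposing the DataFrame?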

Answer 1

Use apply and pass axis=1 to call describe on each row:

In [274]:
df = pd.DataFrame(np.random.randn(4,5))
df

Out[274]:
          0         1         2         3         4
0  0.651863  0.738034 -0.477668 -0.561699  0.047500
1 -1.565093 -0.671551  0.537272 -0.956520  0.301156
2 -0.951549  2.177592  0.059961 -1.631530 -0.620173
3  0.277796  0.169365  1.657189  0.713522  1.649386

In [276]:
df.apply(pd.DataFrame.describe, axis=1)

Out[276]:
   count      mean       std       min       25%       50%       75%       max
0      5  0.079606  0.609069 -0.561699 -0.477668  0.047500  0.651863  0.738034
1      5 -0.470947  0.878326 -1.565093 -0.956520 -0.671551  0.301156  0.537272
2      5 -0.193140  1.458676 -1.631530 -0.951549 -0.620173  0.059961  2.177592
3      5  0.893451  0.722917  0.169365  0.277796  0.713522  1.649386  1.657189

Answer 2

EdChum's answer did not work for me, because with apply every row is passed in as a Series; using pd.Series.describe works:

df.apply(pd.Series.describe, axis=1)
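
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes an all-numeric DataFrame; the seeded generator and the names `df` and `row_stats` are only for illustration. Calling `describe` through a lambda on each row sidesteps the question of which class the method is taken from, since each row passed by `apply(..., axis=1)` is a Series.

```python
import numpy as np
import pandas as pd

# Illustrative data: 4 rows, 5 numeric columns (seeded so it is repeatable).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 5)))

# With axis=1, apply hands each row to the function as a Series,
# so row.describe() returns the summary statistics for that row.
row_stats = df.apply(lambda row: row.describe(), axis=1)

print(row_stats)
```

The result has one line per original row and one column per statistic (count, mean, std, min, 25%, 50%, 75%, max).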

Answer 3

For me, the solutions above did not work when applied to a single column. The following solution, which uses a transpose, worked for me:

Multiple columns

df = pd.DataFrame(np.random.randn(4,5))
df.describe().T

Single column
```
df[[0]].describe().T
```

Results:

   count      mean       std       min       25%       50%       75%       max
0    4.0  0.341798  1.452760 -0.516745 -0.456267 -0.313535  0.484530  2.511008
1    4.0 -0.151615  0.680945 -0.692965 -0.483918 -0.378968 -0.046664  0.844442
2    4.0  0.186745  1.408807 -1.190361 -0.946510  0.257237  1.390493  1.422869
3    4.0  0.696535  0.850926 -0.134516  0.015622  0.703306  1.384219  1.514046
4    4.0  0.333963  0.693706 -0.269957 -0.083315  0.147917  0.565195  1.309977

and

   count      mean      std       min       25%       50%      75%       max
0    4.0  0.341798  1.45276 -0.516745 -0.456267 -0.313535  0.48453  2.511008

respectively.
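
One thing to keep in mind with the output above: `df.describe().T` still computes the statistics per column (hence a count of 4.0 and one output line per column); it only lays them out sideways. If a transpose is acceptable and per-row statistics are what is wanted, one option is to transpose before describing and then transpose back. A minimal sketch, assuming an all-numeric frame; the variable names are only for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 5)))

# Column statistics, laid out with one line per column.
per_column = df.describe().T

# Row statistics: transpose first so rows become columns,
# describe those, then transpose the summary back.
per_row = df.T.describe().T

# The same row statistics computed without any transpose, for comparison.
per_row_apply = df.apply(pd.Series.describe, axis=1)

print(per_column.shape, per_row.shape)
```

On a frame like this, `per_row` and `per_row_apply` should hold the same numbers, so the choice is mostly a matter of taste and of whether transposing is acceptable for the data at hand.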

If you want to see all columns, you can use pd.set_option('display.max_columns', None)
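
If changing the display setting globally is not desired, the same option can be applied temporarily with `pd.option_context`, which restores the previous value when the block ends. A small sketch, assuming a wide frame of 30 columns made up for the example:

```python
import numpy as np
import pandas as pd

# Illustrative wide frame: 4 rows, 30 columns.
rng = np.random.default_rng(0)
wide = pd.DataFrame(rng.standard_normal((4, 30)))

before = pd.get_option("display.max_columns")

# Inside the block no columns are elided from the printed output.
with pd.option_context("display.max_columns", None):
    inside = pd.get_option("display.max_columns")
    print(wide.describe())

# Outside the block the earlier setting is back in effect.
after = pd.get_option("display.max_columns")
```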
