Question

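I have some data consisting of certain numbers for certain players of particular sports. I want to use a pivot table in Pandas to split the data by sport and, as the value for each sport, show the mean "number" across everyone who plays that sport. (So for basketball, it would average the numbers of all the players who play basketball; the number basically represents a preference.)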

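I can do this quite easily with a pivot table, but I can't figure out how to do the same thing for the standard deviation. I can pass np.mean to get the average, but there is no np.std. I know there is std(), but I'm not sure how I would use it in this context.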

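Are pivot tables not recommended for this task? How should I find the standard deviation of the numeric data for all the players of a particular sport?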

Answer 1

If your DataFrame (df) contains a column named "sport", it is as simple as:

df.groupby(by=['sport']).std()
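
As a minimal sketch of the groupby approach, assuming a small made-up DataFrame (the column names "player", "sport" and "number" and all values are hypothetical, not taken from the question), selecting the numeric column before aggregating keeps text columns such as player names out of the calculation:

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "player": ["a", "b", "c", "d", "e"],
    "sport": ["basketball", "basketball", "basketball", "tennis", "tennis"],
    "number": [2.0, 4.0, 6.0, 1.0, 3.0],
})

# Group by sport, then take the sample standard deviation (ddof=1) of "number".
std_by_sport = df.groupby("sport")["number"].std()
print(std_by_sport)
```

The result is a Series indexed by sport, with one standard deviation per group.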

Answer 2

Which version of numpy are you using? 1.9.2 has np.std:

np.std?
Type:        function
String form: <function std at 0x0000000003EE47B8>
File:        c:\anaconda3\lib\site-packages\numpy\core\fromnumeric.py
Definition:  np.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False)
Docstring:
Compute the standard deviation along the specified axis.

Returns the standard deviation, a measure of the spread of a distribution,
of the array elements. The standard deviation is computed for the
flattened array by default, otherwise over the specified axis.
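
One thing worth noting from the signature above is that np.std defaults to ddof=0 (population standard deviation), whereas the pandas std() method defaults to ddof=1 (sample standard deviation). A small sketch with made-up values shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical values for illustration.
values = pd.Series([2.0, 4.0, 6.0])

# NumPy divides by N by default (ddof=0).
population_std = np.std(values.to_numpy())

# Passing ddof=1 makes NumPy divide by N - 1 instead.
sample_std_np = np.std(values.to_numpy(), ddof=1)

# pandas divides by N - 1 by default (ddof=1).
sample_std_pd = values.std()

print(population_std, sample_std_np, sample_std_pd)
```

If the two libraries appear to disagree on a result, the ddof default is usually the first thing to check.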

Answer 3

df.pivot_table(values='number', index='sport', aggfunc='std')
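
To get the mean and the standard deviation side by side, aggfunc also accepts a list of function names. This is a sketch under the same assumptions as the one-liner above (a "sport" column and a "number" column); the sample data is made up:

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "player": ["a", "b", "c", "d", "e"],
    "sport": ["basketball", "basketball", "basketball", "tennis", "tennis"],
    "number": [2.0, 4.0, 6.0, 1.0, 3.0],
})

# One row per sport; columns are grouped first by aggregation, then by value column.
table = df.pivot_table(values="number", index="sport", aggfunc=["mean", "std"])
print(table)

# Pull out a single statistic for a single sport.
basketball_std = table["std"]["number"]["basketball"]
```

Passing the names as strings lets pandas use its own implementations, so the standard deviation here is the sample version (ddof=1).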

How do I calculate the standard deviation using a pivot table in Pandas?
