Mean of all columns of a pandas DataFrame?

Asked: 2015-11-20 20:55:37

Tags: python-3.x pandas python-3.4

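I'm trying to calculate the mean of all columns of a DataFrame, but it looks like the value in column B of row 6 prevents the mean of column C from being calculated. Why?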

Trials:

import pandas as pd
from decimal import Decimal
d = [
    {'A': 2, 'B': None, 'C': Decimal('628.00')},
    {'A': 1, 'B': None, 'C': Decimal('383.00')},
    {'A': 3, 'B': None, 'C': Decimal('651.00')},
    {'A': 2, 'B': None, 'C': Decimal('575.00')},
    {'A': 4, 'B': None, 'C': Decimal('1114.00')},
    {'A': 1, 'B': 'TEST', 'C': Decimal('241.00')},
    {'A': 2, 'B': None, 'C': Decimal('572.00')},
    {'A': 4, 'B': None, 'C': Decimal('609.00')},
    {'A': 3, 'B': None, 'C': Decimal('820.00')},
    {'A': 5, 'B': None, 'C': Decimal('1223.00')}
]

df = pd.DataFrame(d)

In : df
Out:
   A     B        C
0  2  None   628.00
1  1  None   383.00
2  3  None   651.00
3  2  None   575.00
4  4  None  1114.00
5  1  TEST   241.00
6  2  None   572.00
7  4  None   609.00
8  3  None   820.00
9  5  None  1223.00

dtypes:

# no mean for C column
In : df.mean()
Out:
A    2.7
dtype: float64

# mean for C column when row 6 is left out of the DF
In : df.head(5).mean()
Out:
A      2.4
B      NaN
C    670.2
dtype: float64

# no mean for C column when row 6 is part of the DF
In : df.head(6).mean()
Out:
A    2.166667
dtype: float64

1 answer:

Answer 0 (score: 3)

If you only need the columns that contain numbers, you can select those columns explicitly:

In [90]: df[['A','C']].mean()
Out[90]: 
A      2.7
C    681.6
dtype: float64

Or change the type, as @jezrael suggested in the comments:

df['C'] = df['C'].astype(float)

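Probably df.mean tries to convert all object columns to numeric, and if that fails it rolls back and computes the mean only for the columns that are actually numeric.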
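
For what it's worth, here is a minimal sketch of the type-conversion route. The three-row frame is a shortened stand-in for the data in the question, and `numeric_only=True` is my own addition rather than part of the suggestion above. It makes the skipping of non-numeric columns explicit instead of relying on the default, which I believe differs between pandas versions.

```python
import pandas as pd
from decimal import Decimal

# Shortened stand-in for the question's data (assumption: 3 rows are enough to show the effect)
df = pd.DataFrame({
    'A': [2, 1, 3],
    'B': [None, 'TEST', None],
    'C': [Decimal('628.00'), Decimal('383.00'), Decimal('651.00')],
})

# Decimal values are stored as generic Python objects, not as a numeric dtype
c_dtype_before = df['C'].dtype
print(c_dtype_before)

# Convert the Decimal column to float so pandas treats it as numeric
df['C'] = df['C'].astype(float)

# Explicitly restrict the mean to numeric columns; B is skipped
means = df.mean(numeric_only=True)
print(means)
```

With C stored as float, the string in B no longer affects the result for C, because B is simply left out of the calculation. Note that converting Decimal to float gives up exact decimal arithmetic, which may or may not matter for your data.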