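I am trying to compute the mean of every column of a DataFrame, but it looks like a single value in row 6 of column B prevents the mean of column C from being computed. Why is that?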
Test:
import pandas as pd
from decimal import Decimal
d = [
{'A': 2, 'B': None, 'C': Decimal('628.00')},
{'A': 1, 'B': None, 'C': Decimal('383.00')},
{'A': 3, 'B': None, 'C': Decimal('651.00')},
{'A': 2, 'B': None, 'C': Decimal('575.00')},
{'A': 4, 'B': None, 'C': Decimal('1114.00')},
{'A': 1, 'B': 'TEST', 'C': Decimal('241.00')},
{'A': 2, 'B': None, 'C': Decimal('572.00')},
{'A': 4, 'B': None, 'C': Decimal('609.00')},
{'A': 3, 'B': None, 'C': Decimal('820.00')},
{'A': 5, 'B': None, 'C': Decimal('1223.00')}
]
df = pd.DataFrame(d)
In : df
Out:
   A     B        C
0  2  None   628.00
1  1  None   383.00
2  3  None   651.00
3  2  None   575.00
4  4  None  1114.00
5  1  TEST   241.00
6  2  None   572.00
7  4  None   609.00
8  3  None   820.00
9  5  None  1223.00
dtypes:
# no mean for C column
In : df.mean()
Out:
A 2.7
dtype: float64
# mean for C column when row 6 is left out of the DF
In : df.head(5).mean()
Out:
A 2.4
B NaN
C 670.2
dtype: float64
# no mean for C column when row 6 is part of the DF
In : df.head(6).mean()
Out:
A 2.166667
dtype: float64
Answer 0 (score: 3)
If you only need the columns that contain numbers, you can select those columns explicitly:
In [90]: df[['A','C']].mean()
Out[90]:
A 2.7
C 681.6
dtype: float64
Or change the type, as @jezrael suggested in the comments:
df['C'] = df['C'].astype(float)
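Possibly df.mean tries to convert all object columns to numeric and, if that fails, falls back to computing the mean only over the columns that are actually numeric.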
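To make the two suggestions above concrete, here is a minimal sketch that reuses the data from the question. It assumes the Decimal values in C can safely be treated as floats, and it passes numeric_only=True (a documented DataFrame.mean parameter) so that the non-numeric column B is skipped explicitly rather than relying on any fallback behaviour. The variable name means is only illustrative.

```python
from decimal import Decimal

import pandas as pd

# Same data as in the question: B is mostly None with one string,
# C holds Decimal objects.
a_vals = [2, 1, 3, 2, 4, 1, 2, 4, 3, 5]
b_vals = [None, None, None, None, None, 'TEST', None, None, None, None]
c_vals = ['628.00', '383.00', '651.00', '575.00', '1114.00',
          '241.00', '572.00', '609.00', '820.00', '1223.00']

df = pd.DataFrame({
    'A': a_vals,
    'B': b_vals,
    'C': [Decimal(v) for v in c_vals],
})

# Decimal values are stored with dtype object, so pandas does not
# treat C as a numeric column.
print(df.dtypes)

# Convert C to float explicitly (assumption: float precision is acceptable).
df['C'] = df['C'].astype(float)

# Restrict the mean to numeric columns so B is ignored.
means = df.mean(numeric_only=True)
print(means)
```

After the conversion, C has dtype float64 and is included in the result alongside A, while B is left out.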