I'm new to Python. I want to compute the sum, mean, median, and standard deviation of each column, but I get a long string back as the answer.
import pandas as pd

df = pd.DataFrame({
    'apple': {
        0: '15.8',
        1: '3562',
        2: '51.36',
        3: '179868',
        4: '6.0',
        5: ''
    },
    'banana': {
        0: '27.84883300816733',
        1: '44.64197389840307',
        2: '',
        3: '13.3',
        4: '17.6',
        5: '6.1'
    },
    'cheese': {
        0: '27.68303400840678',
        1: '39.93121897299962',
        2: '',
        3: '9.4',
        4: '7.2',
        5: '6.0'
    },
    'egg': {
        0: '',
        1: '7.2',
        2: '66.0',
        3: '23.77814972104277',
        4: '23967',
        5: ''
    }
})
For example, to compute the sum of the apple column, I used
df['apple'].sum()
It gave me 15.8356251.361798686.0 as the output, which is strange.
Please help.
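Answer 0 (score: 1)
The values in your columns are strings, so sum() concatenates them instead of adding them. Convert the columns to numeric first; errors='coerce' turns the empty strings into NaN, which the statistics then skip. This is what you want to do: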
df = df.apply(pd.to_numeric, errors='coerce')
df.describe()
               apple     banana     cheese           egg
count       5.000000   5.000000   5.000000      4.000000
mean    36700.632000  21.898161  18.042851   6015.994537
std     80047.651817  14.955567  15.077552  11967.362577
min         6.000000   6.100000   6.000000      7.200000
25%        15.800000  13.300000   7.200000     19.633612
50%        51.360000  17.600000   9.400000     44.889075
75%      3562.000000  27.848833  27.683034   6041.250000
max    179868.000000  44.641974  39.931219  23967.000000
df.sum()
apple 183503.160000
banana 109.490807
cheese 90.214253
egg 24063.978150
dtype: float64
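If you only want the four statistics from the question in one table (describe() reports the median as the 50% row), something like the following should work. It is a minimal sketch rather than the only way to do it: it uses a shortened two-column copy of the question's data, and the names numeric and stats are just illustrative.

```python
import pandas as pd

# Shortened copy of the question's data: numbers stored as strings,
# with '' standing in for missing values.
df = pd.DataFrame({
    "apple": ["15.8", "3562", "51.36", "179868", "6.0", ""],
    "banana": ["27.84883300816733", "44.64197389840307", "", "13.3", "17.6", "6.1"],
})

# Convert every column to numbers; entries that cannot be parsed
# (here the empty strings) become NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Passing a list of names to agg gives one row per statistic and one
# column per original column. NaN values are skipped by default.
stats = numeric.agg(["sum", "mean", "median", "std"])
print(stats)
```

A single value can then be read with, for example, stats.loc["median", "apple"].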