Consider the following DataFrame:
from decimal import Decimal
import pandas as pd
from pandas import Timestamp
# Keys are (symbol, date) tuples, so the resulting index is a two-level MultiIndex.
dic = {'volume': {('E7', Timestamp('2016-11-01 00:00:00')): Decimal('1204'),
                  ('E7', Timestamp('2016-08-16 00:00:00')): Decimal('1070'),
                  ('G6', Timestamp('2016-08-17 00:00:00')): Decimal('1702'),
                  ('G6', Timestamp('2016-08-18 00:00:00')): Decimal('1262'),
                  ('G6', Timestamp('2016-08-26 00:00:00')): Decimal('3333'),
                  ('VG', Timestamp('2016-08-31 00:00:00')): Decimal('1123'),
                  ('VG', Timestamp('2016-09-01 00:00:00')): Decimal('1581'),
                  ('VG', Timestamp('2016-09-02 00:00:00')): Decimal('1276'),
                  ('VG', Timestamp('2016-09-06 00:00:00')): Decimal('2417'),
                  }}
df = pd.DataFrame(dic)
For each symbol (the first level of the index), I want to compute the mean of the "volume" column.
I tried df.groupby(level=0).mean(), but it did not work.
Answer 0 (score: 1)
Don't use Decimal in Pandas - it is not a native NumPy / Pandas dtype, so the column is stored as generic Python objects:
In [32]: df.dtypes
Out[32]:
volume object # <---- NOTE
dtype: object
Convert it to a numeric dtype:
In [29]: df['vol'] = pd.to_numeric(df.volume)
In [30]: df
Out[30]:
              volume     vol
E7 2016-08-16   1070  1070.0
   2016-11-01   1204  1204.0
G6 2016-08-17   1702  1702.0
   2016-08-18   1262  1262.0
   2016-08-26   3333  3333.0
VG 2016-08-31   1123  1123.0
   2016-09-01   1581  1581.0
   2016-09-02   1276  1276.0
   2016-09-06   2417  2417.0
In [31]: df.mean(level=0)
Out[31]:
        vol
E7  1137.00
G6  2099.00
VG  1599.25
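Note that the `level` argument of `DataFrame.mean` was deprecated in pandas 1.3 and removed in pandas 2.0, so `df.mean(level=0)` raises an error on current versions. The following is a minimal sketch of the same computation using `groupby(level=0)` instead. It reuses the data from the question; the names `data` and `mean_by_symbol` are only illustrative, and `astype(float)` is used here as an alternative to `pd.to_numeric`, on the assumption that floating-point precision is acceptable for these volumes.

```python
from decimal import Decimal

import pandas as pd
from pandas import Timestamp

# Same sample data as in the question: (symbol, date) -> volume as Decimal.
data = {'volume': {
    ('E7', Timestamp('2016-11-01')): Decimal('1204'),
    ('E7', Timestamp('2016-08-16')): Decimal('1070'),
    ('G6', Timestamp('2016-08-17')): Decimal('1702'),
    ('G6', Timestamp('2016-08-18')): Decimal('1262'),
    ('G6', Timestamp('2016-08-26')): Decimal('3333'),
    ('VG', Timestamp('2016-08-31')): Decimal('1123'),
    ('VG', Timestamp('2016-09-01')): Decimal('1581'),
    ('VG', Timestamp('2016-09-02')): Decimal('1276'),
    ('VG', Timestamp('2016-09-06')): Decimal('2417'),
}}
df = pd.DataFrame(data)

# Decimal values are stored with dtype object; cast them to float64 first.
df['vol'] = df['volume'].astype(float)

# Group on the first index level (the symbol) and average the numeric column.
mean_by_symbol = df.groupby(level=0)['vol'].mean()
print(mean_by_symbol)
```

Selecting the `vol` column before calling `mean()` keeps the object-dtype `volume` column out of the aggregation, so the result is a Series indexed by symbol.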