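I want to slice a multi-indexed pandas DataFrame by year so that I can plot yearly subsets.
My current approach uses a function that builds a boolean mask, which the plotting loop then uses to produce one plot per year: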
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
# dataframe with dates
dates = pd.DataFrame()
dates['2016'] = pd.date_range(start='2016', periods=4, freq='60Min')
dates['2017'] = pd.date_range(start='2017', periods=4, freq='60Min')
dates['2018'] = pd.date_range(start='2018', periods=4, freq='60Min')
dates.reset_index()
dates = dates.unstack()
# multi-indexed dataframe
df = pd.DataFrame(np.random.randn(36, 3))
df['concept'] = np.repeat(np.repeat(['A', 'B', 'C'], 3), 4)
df['datetime'] = pd.concat([dates, dates, dates], ignore_index=True)
df.set_index(['concept', 'datetime'], inplace=True)
df.sort_index(inplace=True)
df.info()
# my current approach is to use mask to slice the year
def mask_year(y):
    y = str(y)
    mask = df.index.get_level_values('datetime').year == \
        pd.Timestamp(y).year
    return mask
# plotting using the mask
for yy in np.unique(df.index.get_level_values('datetime').year):
    idx = pd.IndexSlice
    df_plt = df.loc[idx[:, mask_year(yy)], :]
    df_plt = df_plt.groupby(df_plt.index.get_level_values('concept')).sum()
    df_plt.plot(kind="bar", title=yy, stacked=True)
But building a mask feels very unnatural to me!
I know I can do this instead:
idx = pd.IndexSlice
df_slice = df.loc[idx[:, slice(pd.Timestamp('2016-01-01 00:00:00'),
                               pd.Timestamp('2016-01-01 03:00:00'))], :]
print(df_slice)
which returns the same result as the mask:
                                    0         1         2
concept datetime
A       2016-01-01 00:00:00  0.490030  1.190667  0.186234
        2016-01-01 01:00:00 -1.812031 -0.077387 -0.484749
        2016-01-01 02:00:00  0.098846 -0.580555 -0.044958
        2016-01-01 03:00:00 -1.866516 -0.368678 -0.958492
B       2016-01-01 00:00:00 -2.001319 -0.589698  0.072124
        2016-01-01 01:00:00 -0.267139 -1.374081  1.009086
        2016-01-01 02:00:00  0.271937  2.379305  0.134826
        2016-01-01 03:00:00  0.781606  0.181515 -0.922162
C       2016-01-01 00:00:00 -0.333881 -1.050664 -1.839776
        2016-01-01 01:00:00 -0.535254 -0.800394 -0.551664
        2016-01-01 02:00:00  0.410175  0.438866  0.399537
        2016-01-01 03:00:00  0.813004 -1.005976  0.465938
But with this approach I would have to pass two arguments (a start and an end timestamp) to the plotting function.
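To make the two-argument problem concrete, here is a rough sketch of what that version would look like. The helper name `sum_between` and its parameters `frame`, `start` and `end` are hypothetical names made up for this illustration, and the small frame is only a stand-in for the one built above. Plotting is left out so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Stand-in data: 3 concepts x 3 years x 4 hourly timestamps (assumed layout).
years = ("2016", "2017", "2018")
stamps = pd.DatetimeIndex(
    [ts for y in years for ts in pd.date_range(start=y, periods=4, freq="60min")]
)
index = pd.MultiIndex.from_product(
    [["A", "B", "C"], stamps], names=["concept", "datetime"]
)
df = pd.DataFrame(np.random.randn(len(index), 3), index=index).sort_index()


def sum_between(frame, start, end):
    """Hypothetical helper: sum per concept over the closed range [start, end]."""
    idx = pd.IndexSlice
    # Label-based slice on the 'datetime' level; both endpoints are included.
    subset = frame.loc[idx[:, pd.Timestamp(start):pd.Timestamp(end)], :]
    return subset.groupby(level="concept").sum()


# The caller has to know and pass both ends of the year's range.
totals_2016 = sum_between(df, "2016-01-01 00:00:00", "2016-01-01 03:00:00")
print(totals_2016)
```

Every call site needs both timestamps, whereas `mask_year` only needs the year, which is why I ended up with the mask in the first place.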
Is there a simpler way to slice the datetime level values by year?
Thanks in advance!