Question

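I want to slice a MultiIndex pandas DataFrame by year so that I can plot yearly subsets.

My current approach uses a function to build a boolean mask, which a plotting loop then uses to create one plot per year: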

# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd

# dataframe with dates
dates = pd.DataFrame()
dates['2016'] = pd.date_range(start='2016', periods=4, freq='60Min')
dates['2017'] = pd.date_range(start='2017', periods=4, freq='60Min')
dates['2018'] = pd.date_range(start='2018', periods=4, freq='60Min')
# stack the three year columns into a single Series of 12 timestamps
dates = dates.unstack()

# multi-indexed dataframe
df = pd.DataFrame(np.random.randn(36, 3))
df['concept'] = np.repeat(np.repeat(['A', 'B', 'C'], 3), 4)
df['datetime'] = pd.concat([dates, dates, dates], ignore_index=True)
df.set_index(['concept', 'datetime'], inplace=True)
df.sort_index(inplace=True)
df.info()


# my current approach is to use mask to slice the year
def mask_year(y):
    y = str(y)
    mask = df.index.get_level_values('datetime').year == \
        pd.Timestamp(y).year
    return mask


# plotting using the mask
for yy in np.unique(df.index.get_level_values('datetime').year):
    idx = pd.IndexSlice
    df_plt = df.loc[idx[:, mask_year(yy)], :]
    df_plt = df_plt.groupby(df_plt.index.get_level_values('concept')).sum()
    df_plt.plot(kind="bar", title=yy, stacked=True)

But building a mask seems very unnatural to me!

I know I can do:

idx = pd.IndexSlice
df_slice = df.loc[idx[:, slice(pd.Timestamp('2016-01-01 00:00:00'),
                               pd.Timestamp('2016-01-01 03:00:00'))], :]
print(df_slice)

which returns the same result as the mask for 2016:

                                    0         1         2
concept datetime                                         
A       2016-01-01 00:00:00  0.490030  1.190667  0.186234
        2016-01-01 01:00:00 -1.812031 -0.077387 -0.484749
        2016-01-01 02:00:00  0.098846 -0.580555 -0.044958
        2016-01-01 03:00:00 -1.866516 -0.368678 -0.958492
B       2016-01-01 00:00:00 -2.001319 -0.589698  0.072124
        2016-01-01 01:00:00 -0.267139 -1.374081  1.009086
        2016-01-01 02:00:00  0.271937  2.379305  0.134826
        2016-01-01 03:00:00  0.781606  0.181515 -0.922162
C       2016-01-01 00:00:00 -0.333881 -1.050664 -1.839776
        2016-01-01 01:00:00 -0.535254 -0.800394 -0.551664
        2016-01-01 02:00:00  0.410175  0.438866  0.399537
        2016-01-01 03:00:00  0.813004 -1.005976  0.465938

However, this approach requires passing two arguments (the start and end timestamps) to the plotting function.

Is there a simpler way to slice the DateTime level values by year?
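One candidate is pandas partial string indexing, where a year string such as '2016' stands in for both slice bounds. The following is only a minimal sketch, assuming the MultiIndex is sorted and that the 'datetime' level is a DatetimeIndex. The helper name `slice_year` is hypothetical, and the frame is rebuilt here only so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one above:
# 3 concepts x 3 years x 4 hourly timestamps = 36 rows.
stamps = pd.DatetimeIndex(
    [ts for year in ("2016", "2017", "2018")
     for ts in pd.date_range(start=year, periods=4, freq="60min")]
)
index = pd.MultiIndex.from_product(
    [["A", "B", "C"], stamps], names=["concept", "datetime"]
)
df = pd.DataFrame(np.random.randn(36, 3), index=index).sort_index()


def slice_year(frame, year):
    """Return the rows whose 'datetime' level falls in `year`.

    Hypothetical helper: it relies on partial string indexing,
    so the MultiIndex is assumed to be sorted.
    """
    idx = pd.IndexSlice
    # str(year), e.g. '2016', selects every timestamp in that year
    return frame.loc[idx[:, str(year)], :]


df_2016 = slice_year(df, 2016)
print(df_2016)
```

With this, the plotting loop would only need the year itself, e.g. `slice_year(df, yy)` in place of `df.loc[idx[:, mask_year(yy)], :]`.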

Thanks in advance!
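Another possibility, since the goal is one plot per year, is to skip slicing and group on the year taken from the 'datetime' level. This is a hedged sketch rather than a definitive solution. The helper name `yearly_totals` is hypothetical, and the frame is rebuilt so the snippet is self-contained:

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one above:
# 3 concepts x 3 years x 4 hourly timestamps = 36 rows.
stamps = pd.DatetimeIndex(
    [ts for year in ("2016", "2017", "2018")
     for ts in pd.date_range(start=year, periods=4, freq="60min")]
)
index = pd.MultiIndex.from_product(
    [["A", "B", "C"], stamps], names=["concept", "datetime"]
)
df = pd.DataFrame(np.random.randn(36, 3), index=index).sort_index()


def yearly_totals(frame):
    """Map each year to the per-concept column sums (hypothetical helper)."""
    # Year of every row, taken from the 'datetime' index level
    years = frame.index.get_level_values("datetime").year
    return {
        int(year): chunk.groupby(level="concept").sum()
        for year, chunk in frame.groupby(years)
    }


totals = yearly_totals(df)
for year, table in totals.items():
    print(year)
    print(table)
    # With matplotlib installed, each table could be plotted as before:
    # table.plot(kind="bar", stacked=True, title=str(year))
```

Grouping avoids both the mask function and the explicit slice bounds, at the cost of computing all years in one pass instead of selecting a single year on demand.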

Pandas DataFrame with MultiIndex: from DateTime level values

0 answers: