Question

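I'm not sure whether this is possible with pandas, but I'd like to build a DataFrame like the one below, except that I want the index to hold only months and days, with no year.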

import pandas as pd
import numpy as np 
df2 = pd.DataFrame(np.random.randn(12, 4), index=pd.date_range('1-1', periods=12, freq='M'), columns=['2007', '2008', '2009', '2010'])

Just to give some more information, here is what I have done so far.

df = pd.Series(np.random.randn(72), index=pd.date_range('1/1/2000', periods=72, freq='M'))

I can then use groupby as follows:

groupYear_Month = df.groupby(lambda x: (x.year, x.month)).sum()

Which yields:

groupYear_Month.head()
Out[9]: 
(2000, 1)    1.077949
(2000, 2)   -0.563224
(2000, 3)   -2.016833
(2000, 4)   -0.140693
(2000, 5)    2.113549
dtype: float64

Now I can do:

groupYear_Month.index = pd.MultiIndex.from_tuples(groupYear_Month.index)

However, this kills the date format. For example, I don't get two-digit months 01, 02 ... 12.
I can now unstack it and get the years at the column level.

groupYear_Month.unstack(0)

This works, but it is no longer a date index.

Thanks

Answer 1

One possible solution is to write a small class:

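class Month:
    __slots__ = ['month', 'year']

    def __init__(self, date):
        self.month, self.year = date.month, date.year

    def __repr__(self):
        return '{}-{:0>2}'.format(self.year, self.month)

    def __lt__(self, other):
        return (self.year, self.month) < (other.year, other.month)

    # Dates in the same month must compare and hash equal to share a group.
    def __eq__(self, other):
        return (self.year, self.month) == (other.year, other.month)

    def __hash__(self):
        return hash((self.year, self.month))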

Then:

>>> df.groupby( Month ).sum( )
2000-01   -1.66
2000-02    0.37
2000-03    0.85
...
2005-11   -0.30
2005-12   -0.93
Length: 72, dtype: float64
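
As an aside that is not part of the answer above, here is a rough sketch of two built-in routes that may get close to what the question asks for. It assumes the same kind of month-end Series as in the question; the names `s`, `by_period` and `table` are made up for illustration. The first groups on a monthly `PeriodIndex`, which prints as `2000-01`, `2000-02`, ... much like the `Month` class. The second groups on a zero-padded month string plus the year and unstacks the year, giving months as rows and years as columns. Note that the row labels in the second case are plain strings, not dates.

```python
import numpy as np
import pandas as pd

# Assumed sample data: 72 month-end values starting in January 2000.
idx = pd.date_range('2000-01-31', periods=72, freq=pd.offsets.MonthEnd())
s = pd.Series(np.random.randn(72), index=idx)

# Route 1: group on monthly periods; the result keeps a PeriodIndex.
by_period = s.groupby(s.index.to_period('M')).sum()

# Route 2: group on (zero-padded month, year), then move the year
# level into the columns to get a month-by-year table.
table = s.groupby([s.index.strftime('%m'), s.index.year]).sum().unstack()

print(by_period.head())
print(table)
```

With daily data the same two groupings would sum all days of each month together, since both keys ignore the day.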

Pandas DataFrame index by month?
