Question

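I have a dataframe with dates as columns. I'd like to average the daily values to a monthly level. I have tried TimeGrouper and resample, but they don't like that the column names are strings, and I can't seem to figure out how to turn the columns into something like a DatetimeIndex.

My starting dataframe: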

import pandas as pd

df = pd.DataFrame(data=[[1,2,3,4],[5,6,7,8]],
                  columns=['2013-01-01', '2013-01-02', '2013-02-03', '2013-02-04'], 
                  index=['A', 'B'])

Desired output:

   2013-01-01  2013-02-01
A         1.5         3.5
B         5.5         7.5

Answer 1

You can use resample:

df.columns = pd.to_datetime(df.columns)
df.T.resample('M').mean().T
Out[409]: 
   2013-01-31  2013-02-28
A         1.5         3.5
B         5.5         7.5

Or use groupby with axis=1:

df.groupby(pd.to_datetime(df.columns).to_period('M'), axis=1).mean()
Out[412]: 
   2013-01  2013-02
A      1.5      3.5
B      5.5      7.5
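
As a side note, newer pandas releases deprecate the axis argument of DataFrame.groupby (since 2.1, as far as I know), so the axis=1 call above may warn or fail depending on your version. A minimal sketch of the same idea using a transpose instead, assuming the column labels are parseable date strings as in the question (the names months and monthly are just illustrative):

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 2, 3, 4], [5, 6, 7, 8]],
                  columns=['2013-01-01', '2013-01-02', '2013-02-03', '2013-02-04'],
                  index=['A', 'B'])

# Map each column label to its calendar month (a PeriodIndex of length 4).
months = pd.to_datetime(df.columns).to_period('M')

# Transpose so the dates become rows, group the rows by month,
# take the mean, then transpose back to the original orientation.
monthly = df.T.groupby(months).mean().T
print(monthly)
```

The original df is left untouched here, since the month keys are computed separately rather than assigned to df.columns.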

Answer 2

First, convert the column index to datetime with pd.to_datetime, then use T and groupby together with pd.Grouper (note that pd.TimeGrouper is deprecated; use pd.Grouper instead):

df.columns = pd.to_datetime(df.columns)
df.T.groupby(by=pd.Grouper(freq='MS')).mean().T

Output:

   2013-01-01  2013-02-01
A         1.5         3.5
B         5.5         7.5

Answer 3

You can use pd.PeriodIndex:

In [145]: df.groupby(pd.PeriodIndex(df.columns, freq='M'), axis=1).mean()
Out[145]:
   2013-01  2013-02
A      1.5      3.5
B      5.5      7.5

Answer 4

First, try converting the column names to dates:

df = pd.DataFrame(data=[[1,2,3,4],[5,6,7,8]], columns=pd.to_datetime(['2013-01-01', '2013-01-02', '2013-02-03', '2013-02-04']), index=['A', 'B'])

Hope it helps!
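
The conversion on its own does not produce the monthly averages yet. One possible way to finish from there, sketched under the assumption that month-start labels like those in the question's desired output are wanted (the name monthly and the final strftime step are my additions, not part of this answer):

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 2, 3, 4], [5, 6, 7, 8]],
                  columns=pd.to_datetime(['2013-01-01', '2013-01-02',
                                          '2013-02-03', '2013-02-04']),
                  index=['A', 'B'])

# resample works on the row index, so transpose first.
# 'MS' labels each monthly bin with the first day of the month.
monthly = df.T.resample('MS').mean().T

# Optional: turn the datetime labels back into plain strings.
monthly.columns = monthly.columns.strftime('%Y-%m-%d')
print(monthly)
```

Using 'MS' rather than a month-end alias gives labels on the first of each month, which seems to be what the question asks for.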

Answer 5

import pandas as pd

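cols = df.columns  # avoid shadowing the built-in `list`
df_new = pd.DataFrame()

# Assumes exactly two adjacent columns per month; each pair is averaged
# and stored under the label of the pair's first column.
for i in range(len(cols) // 2):
    df_new[cols[2*i]] = df[[cols[2*i], cols[2*i+1]]].mean(axis=1)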

Output:

   2013-01-01  2013-02-03
A         1.5         3.5
B         5.5         7.5

I don't understand your desired output.
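
The loop above only works when every month has exactly two columns sitting next to each other. A rough sketch of a variant that drops that assumption by grouping on the 'YYYY-MM' prefix of each label; it assumes the labels are ISO-style 'YYYY-MM-DD' strings as in the question, and month_keys and df_monthly are hypothetical names:

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 2, 3, 4], [5, 6, 7, 8]],
                  columns=['2013-01-01', '2013-01-02', '2013-02-03', '2013-02-04'],
                  index=['A', 'B'])

# The first 7 characters of 'YYYY-MM-DD' identify the month ('YYYY-MM').
month_keys = df.columns.str[:7]

# Group the transposed frame by month key, average, and transpose back.
# Any number of columns per month is handled, in any order.
df_monthly = df.T.groupby(month_keys).mean().T
print(df_monthly)
```

This keeps the labels as strings and never builds a DatetimeIndex, at the cost of relying on the label format.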
