Question

I have daily CSV time-series data with cumulative sales, similar to this:

01-01-2010 12:10:10      50.00
01-02-2010 12:10:10      80.00
01-03-2010 12:10:10      110.00
.
. for each day of 2010
.
01-01-2011 12:10:10      2311.00
01-02-2011 12:10:10      2345.00
01-03-2011 12:10:10      2445.00
.
. for each day of 2011
.

and so on.

I want the monthly sales (max - min) for each year. So for the past 5 years I would have 5 January values (max - min), 5 February values (max - min), and so on.

Once I have these, I then take the 5-year average for January, the 5-year average for February, and so on.

Right now I do this by slicing the original df[year/month] and then averaging over the same month across the years.

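I would like to use the time-series resample() method, but I am currently stuck on telling pandas to sample (max - min) for each month over [the past 10 years from today] and then chain a .mean() onto that.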

Any advice on an efficient way to do this with resample() would be appreciated.

Answer 1

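It could look something like this (note: the sample data is random, not cumulative sales values). The key here is to call df.groupby() and pass it dt.year and dt.month.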

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'date': pd.date_range(start='2016-01-01',end='2017-12-31'),
    'sale': np.random.randint(100,200, size = 365*2+1)
})

# Get each month's last, first and size (rows are sorted by date, so for cumulative data last/first are the max/min)
dfg = df.groupby([df.date.dt.year,df.date.dt.month])['sale'].agg(['last','first','size'])

# Assign new cols (diff and avg) and drop last, first and size
dfg = dfg.assign(diff = dfg['last'] - dfg['first'])
dfg = dfg.assign(avg = dfg['diff'] / dfg['size']).drop(['last','first','size'], axis=1)

# Rename index cols
dfg.index = dfg.index.rename(['Year','Month'])

print(dfg.head(6))

Returns:

            diff       avg
Year Month                
2016 1       -56 -1.806452
     2       -17 -0.586207
     3        30  0.967742
     4        34  1.133333
     5        46  1.483871
     6         2  0.066667
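
For the question as asked (max - min per month, then the average of each calendar month across years), the same groupby idea can be taken one step further. The following is only a sketch: the data is a made-up cumulative series that grows by exactly 10 per day, and the names spread and monthly_avg are illustrative, not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical cumulative sales: exactly +10 per day (an assumption for illustration)
dates = pd.date_range(start='2016-01-01', end='2017-12-31')
df = pd.DataFrame({'date': dates, 'sale': 10.0 * np.arange(1, len(dates) + 1)})

# Step 1: (max - min) within each (year, month) group
year = df['date'].dt.year.rename('Year')
month = df['date'].dt.month.rename('Month')
spread = df.groupby([year, month])['sale'].agg(lambda s: s.max() - s.min())

# Step 2: average each calendar month across all years
monthly_avg = spread.groupby(level='Month').mean()

print(monthly_avg)
```

Here monthly_avg is indexed by month number 1 to 12, with one averaged (max - min) value per calendar month.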

Answer 2

You can do this by using resample twice:

First resample to one month (M) and get the difference (max() - min()).
Then resample to 5 years (5AS), group by month, and take the mean().

E.g.:

In []:
import numpy as np
import pandas as pd

date_range = pd.date_range(start='2008-01-01',end='2017-12-31')
df = pd.DataFrame({'sale': np.random.randint(100, 200, size=date_range.size)},
                  index=date_range)

In []:
df1 = df.resample('M').apply(lambda g: g.max()-g.min())
df1.resample('5AS').apply(lambda g: g.groupby(g.index.month).mean()).unstack()

Out[]:
            sale                                                                  
              1     2     3     4     5     6     7     8     9     10    11    12
2008-01-01  95.4  90.2  95.2  95.4  93.2  93.8  91.8  95.6  93.4  93.4  94.2  93.8
2013-01-01  93.2  96.4  92.8  96.4  92.6  93.0  93.2  92.6  91.2  93.2  91.8  92.2
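
If only a single window is needed (say the most recent 5 years), a variant with one resample followed by a plain groupby on the month number may be easier to read. This is a sketch under assumptions: the data is a made-up cumulative series growing by 10 per day, the window 2013 to 2017 is hard-coded for illustration, and 'MS' (month start) is used as the resample frequency instead of 'M'.

```python
import numpy as np
import pandas as pd

# Hypothetical cumulative sales: exactly +10 per day (an assumption for illustration)
idx = pd.date_range(start='2008-01-01', end='2017-12-31')
df = pd.DataFrame({'sale': 10.0 * np.arange(1, idx.size + 1)}, index=idx)

# Monthly (max - min), labelled by the first day of each month
monthly = df['sale'].resample('MS').agg(['max', 'min'])
spread = monthly['max'] - monthly['min']

# Keep only the last 5 calendar years, then average per calendar month
recent = spread.loc['2013':'2017']
avg_by_month = recent.groupby(recent.index.month).mean()

print(avg_by_month)
```

The result is a Series indexed by month number, so avg_by_month.loc[1] is the 5-year average of the January (max - min) values.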
