Question

A year has roughly 54 weeks.

I want to get the total sales for each week.

Given time series data (with datetime or unixtime) (my data looks like this:

 userId,movieId,rating,timestamp
 1,31,2.5,1260759144

)

I would like the output to look like this:

1week (1/1 - 1/7) : 30$
2week (1/8 - 1/14) : 40$
...
54week (12/24 - 12/31) : 50$

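The dates I wrote (1/1 etc.) are only there for explanation. I want to group by week (to get a seasonal index), and the weeks do not have to start on 1/1 or anything like that.

The data may span multiple years.

Edit

I want to group by week across multiple years, the same way you can group by month [jan, feb ... dec] across multiple years (giving 12 groups for multi-year data).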

Answer 1

Use Series.resample with a weekly frequency and an aggregate function, for example mean:

import pandas as pd

# Ten consecutive daily values starting on Monday, 2017-04-03
rng = pd.date_range('2017-04-03', periods=10)
s = pd.DataFrame({'a': range(10)},index=rng)['a']
print (s)
2017-04-03    0
2017-04-04    1
2017-04-05    2
2017-04-06    3
2017-04-07    4
2017-04-08    5
2017-04-09    6
2017-04-10    7
2017-04-11    8
2017-04-12    9
Freq: D, Name: a, dtype: int64

s1 = s.resample('W').mean()
#alternative
#s1 = s.groupby(pd.Grouper(freq='W')).mean()
print (s1)
2017-04-09    3.0
2017-04-16    8.0
Freq: W-SUN, Name: a, dtype: float64

Alternative:

s1 = s.groupby(s.index.strftime('%Y-%U')).mean()
print (s1)
2017-14    2.5
2017-15    7.5
Name: a, dtype: float64
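
The two approaches above give different means because they cut the weeks differently: resample('W') uses bins that end on Sunday, while '%U' numbers weeks that start on Sunday. A minimal sketch, assuming totals are wanted rather than means and that Sunday-to-Saturday weeks are acceptable, shows that anchoring the resample at 'W-SAT' makes both groupings agree on this sample. The variable names by_resample and by_label are only illustrative.

```python
import pandas as pd

# Same ten daily values as in the answer, starting on Monday, 2017-04-03.
rng = pd.date_range('2017-04-03', periods=10)
s = pd.Series(range(10), index=rng, name='a')

# Weekly totals with bins that end on Saturday (weeks run Sunday to Saturday).
by_resample = s.resample('W-SAT').sum()

# Weekly totals keyed by year and '%U' week number (Sunday is the first day of the week).
by_label = s.groupby(s.index.strftime('%Y-%U')).sum()

print(by_resample)
print(by_label)
```

Note that the agreement only holds within a year: '%U' starts a new week 00 on January 1, whereas resample keeps a week that straddles New Year in one bin.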

Edit:

The sample data needs some preprocessing first:

print (df)
   userId  movieId  rating   timestamp
0       1       31     2.5  1260759144
1       1       31     2.5  1560759144


df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')

# Series.dt.weekofyear was removed in pandas 2.0; isocalendar().week replaces it
w = df['timestamp'].dt.isocalendar().week.rename('week')
df = df['rating'].groupby(w).mean().reset_index(name='val')
print (df)
   week  val
0    25  2.5
1    51  2.5
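
Since the question asks for totals per week of the year across several years, the same grouping can be used with sum instead of mean. The following is a rough sketch under assumptions: the 'sales' column and its values are made up for illustration (the sample data only has 'rating'), weeks are ISO weeks, Series.dt.isocalendar() is available (pandas 1.1 or later), and the "seasonal index" is taken to be each week's total divided by the average weekly total, which is only one of several possible definitions.

```python
import pandas as pd

# Hypothetical data: Unix timestamps in seconds plus an assumed 'sales' column.
# The rows fall on 2009-12-14, 2009-12-15, 2019-06-17 and 2020-06-15 (UTC).
df = pd.DataFrame({
    'timestamp': [1260759144, 1260845544, 1560759144, 1592208744],
    'sales': [10.0, 20.0, 30.0, 40.0],
})

df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')

# ISO week number (1 to 53), ignoring the year so that all years share the same groups.
week = df['timestamp'].dt.isocalendar().week.rename('week')

# Total sales per week of the year, pooled over all years.
weekly_total = df['sales'].groupby(week).sum()

# One simple seasonal index: each week's total relative to the average weekly total.
seasonal_index = weekly_total / weekly_total.mean()

print(weekly_total)
print(seasonal_index)
```

Here the 2019 and 2020 rows both land in ISO week 25, so their sales are added together, which is the multi-year pooling described in the question.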

Answer 2

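You can first create a week column; pandas' to_datetime is very useful here:

# unit='s' is needed because the sample timestamps are Unix seconds
iso = pd.to_datetime(df['timestamp'], unit='s').dt.isocalendar()
df['week'] = iso['week']
df['year'] = iso['year']  # ISO year, so a week spanning New Year stays in one group
weekly_sales = df.groupby(['year','week'])['sales'].sum()  # assumes df has a 'sales' column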
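
To get output lines that show each week's date range, as in the question, one option is to resample by week and derive the start date from each bin's end date. This is only a sketch: the data and the 'sales' column are invented for illustration, weeks are assumed to run Monday to Sunday (the default for 'W'), and the label format is a guess at what the asker wants.

```python
import pandas as pd

# Hypothetical sales records; dates are already parsed as datetimes.
df = pd.DataFrame({
    'timestamp': pd.to_datetime(['2017-04-03', '2017-04-05', '2017-04-10']),
    'sales': [10, 20, 40],
})

# Weekly totals; each bin is labelled with the Sunday that ends the week.
weekly = df.set_index('timestamp')['sales'].resample('W').sum()

# Build "start - end : total$" labels; the start is six days before the end.
lines = []
for end, total in weekly.items():
    start = end - pd.Timedelta(days=6)
    lines.append(f"{start.strftime('%m/%d')} - {end.strftime('%m/%d')} : {total}$")

print('\n'.join(lines))
```

Unlike grouping by week number alone, this keeps each calendar week of each year separate, so it suits a per-year report rather than a pooled seasonal index.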

pandas: grouping data by week
