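Group by row index in pandas

Time: 2020-04-10 21:25:47

Tags: pandas

I want to group and sum every 7 rows (so that I get a total for each week). There are currently two columns: one for the date and one for a float value.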

1/22/2020   NaN
1/23/2020   0.0
1/24/2020   1.0
1/25/2020   0.0
1/26/2020   3.0
1/27/2020   0.0
1/28/2020   0.0
1/29/2020   0.0
1/30/2020   0.0
1/31/2020   2.0
2/1/2020    1.0
2/2/2020    0.0
2/3/2020    3.0
2/4/2020    0.0
2/5/2020    0.0
2/6/2020    0.0
2/7/2020    0.0
2/8/2020    0.0
2/9/2020    0.0
2/10/2020   0.0
2/11/2020   1.0
2/12/2020   0.0
2/13/2020   1.0
2/14/2020   0.0
2/15/2020   0.0
2/16/2020   0.0
2/17/2020   0.0
2/18/2020   0.0
2/19/2020   0.0
2/20/2020   0.0
... ...
2/28/2020   0.0
2/29/2020   8.0
3/1/2020    6.0
3/2/2020    23.0
3/3/2020    20.0
3/4/2020    31.0
3/5/2020    68.0
3/6/2020    45.0
3/7/2020    119.0
3/8/2020    114.0
3/9/2020    64.0
3/10/2020   194.0
3/11/2020   397.0
3/12/2020   452.0
3/13/2020   590.0
3/14/2020   710.0
3/15/2020   61.0
3/16/2020   1389.0
3/17/2020   1789.0
3/18/2020   906.0
3/19/2020   3068.0
3/20/2020   4009.0
3/21/2020   4017.0
3/23/2020   25568.0
3/24/2020   10074.0
3/25/2020   12043.0
3/26/2020   18058.0
3/27/2020   17822.0
3/28/2020   19825.0
3/29/2020   19408.0

2 answers:

Answer 0 (score: 1)

Set the date column as the index and use resample:

import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
# Weekly bins follow the calendar (weeks end on Sunday by default).
df.resample('1W').sum()
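
As a rough illustration of what resample does here, the sketch below runs it on the first 14 rows from the question. The value column name `val` is an assumption, since the question does not name its columns. Note that weekly resampling bins by calendar week rather than by every 7 rows, so the first bin only covers Wednesday 1/22 through Sunday 1/26.

```python
import pandas as pd

# First 14 rows from the question; the column names "Date" and "val" are assumed.
df = pd.DataFrame({
    "Date": ["1/22/2020", "1/23/2020", "1/24/2020", "1/25/2020", "1/26/2020",
             "1/27/2020", "1/28/2020", "1/29/2020", "1/30/2020", "1/31/2020",
             "2/1/2020", "2/2/2020", "2/3/2020", "2/4/2020"],
    "val": [float("nan"), 0.0, 1.0, 0.0, 3.0, 0.0, 0.0,
            0.0, 0.0, 2.0, 1.0, 0.0, 3.0, 0.0],
})

# Parse the strings into timestamps and make them the index.
df["Date"] = pd.to_datetime(df["Date"])
weekly = df.set_index("Date").resample("W")["val"].sum()

# Each bin is labelled with the Sunday that closes it; NaN is skipped by sum().
print(weekly)
```

With this input the bins end on 2020-01-26, 2020-02-02 and 2020-02-09, so 14 rows produce three partial or full calendar weeks rather than two groups of 7.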

Answer 1 (score: 1)

Assuming your date column is called dt and your value column is called val:

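import numpy as np
import pandas as pd

# Convert to datetime in case the column is not already in that format:
df["dt"] = pd.to_datetime(df["dt"])
# The data looks sorted, but sorting is a prerequisite here, so do it in case it is not:
df = df.sort_values("dt")

# Integer-divide the row position by 7 so each run of 7 consecutive rows shares one group key.
df = df.groupby(np.arange(len(df)) // 7).agg({"dt": ["min", "max"], "val": "sum"})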

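The aggregation on dt is there only so that you can see explicitly which interval each group covers. Taking just min, for example, might be enough, or you can leave it out altogether.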
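
A minimal sketch of the positional grouping, again on the first 14 rows from the question. The output column names `start`, `end` and `total` are hypothetical and chosen only for this example. Because the groups are formed by row position, a missing day (the question's data skips 3/22/2020) shifts every later group by one day, which is where this approach differs from calendar-based resampling.

```python
import numpy as np
import pandas as pd

# First 14 rows from the question, using the dt/val column names assumed above.
df = pd.DataFrame({
    "dt": pd.to_datetime([
        "1/22/2020", "1/23/2020", "1/24/2020", "1/25/2020", "1/26/2020",
        "1/27/2020", "1/28/2020", "1/29/2020", "1/30/2020", "1/31/2020",
        "2/1/2020", "2/2/2020", "2/3/2020", "2/4/2020",
    ]),
    "val": [float("nan"), 0.0, 1.0, 0.0, 3.0, 0.0, 0.0,
            0.0, 0.0, 2.0, 1.0, 0.0, 3.0, 0.0],
})

df = df.sort_values("dt").reset_index(drop=True)

# Rows 0-6 get key 0, rows 7-13 get key 1, and so on.
block = np.arange(len(df)) // 7

# Named aggregation gives flat column names (start/end/total are hypothetical).
out = df.groupby(block).agg(
    start=("dt", "min"),
    end=("dt", "max"),
    total=("val", "sum"),
)
print(out)
```

Here the two groups span 2020-01-22 to 2020-01-28 and 2020-01-29 to 2020-02-04, each covering exactly 7 rows regardless of the day of the week.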