Pandas: replace a column with its cumulative sum

Time: 2018-08-01 22:56:02

Tags: python-3.x pandas dataframe group-by cumulative-sum

I have a DataFrame grouped by month and customer_id that looks like this:

customer_id | month | total
1           | Jan   | 20
            | Feb   | 10
2           | Jan   | 20
3           | Feb   | 30
            | Mar   | 10
            | Apr   | 5

I want to use the total column to compute, for each customer, the cumulative sum of all previous months up to and including the current month, like this:

customer_id | month | total | cumsum
1           | Jan   | 20    | 20
            | Feb   | 10    | 30
2           | Jan   | 20    | 20
3           | Feb   | 30    | 30
            | Mar   | 10    | 40
            | Apr   | 5     | 45

My attempt did not work. Can anyone help?

1 answer:

Answer 0: (score: 1)

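On a plain DataFrame (one that has not been grouped and whose index has not been altered), simply use df.groupby('customer_id').cumsum(). On recent pandas versions you may need to select the numeric column first (for example [['total']]), since non-numeric columns such as month may no longer be dropped automatically.

Example: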

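import io
import pandas as pd

# Sample data from the question, whitespace-delimited
z = io.StringIO("""customer_id  month  total
1            Jan     20
1             Feb     10
2            Jan     20
3            Feb     30
3             Mar     10
3             Apr     5""")

# sep=r"\s+" splits on runs of whitespace (replaces the deprecated delim_whitespace=True)
df = pd.read_table(z, sep=r"\s+")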

which yields

    customer_id  month      total
0   1            Jan        20
1   1            Feb        10
2   2            Jan        20
3   3            Feb        30
4   3            Mar        10
5   3            Apr        5

Then

df.groupby('customer_id')[['total']].cumsum()


    total
0   20
1   30
2   20
3   30
4   40
5   45

Then assign it back

df['cumsum'] = df.groupby('customer_id')['total'].cumsum()

    customer_id month       total   cumsum
0   1           Jan         20      20
1   1           Feb         10      30
2   2           Jan         20      20
3   3           Feb         30      30
4   3           Mar         10      40
5   3           Apr         5       45
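
The answer above assumes a plain DataFrame with customer_id as an ordinary column. If the frame instead carries customer_id and month in its index, as the question's table suggests, a minimal sketch of the same idea might look like the following. The MultiIndex layout here is an assumption reconstructed from the question, not something the original post confirms.

```python
import pandas as pd

# Assumed layout: (customer_id, month) as a MultiIndex, reconstructed
# from the table in the question for illustration only.
df = pd.DataFrame(
    {
        "customer_id": [1, 1, 2, 3, 3, 3],
        "month": ["Jan", "Feb", "Jan", "Feb", "Mar", "Apr"],
        "total": [20, 10, 20, 30, 10, 5],
    }
).set_index(["customer_id", "month"])

# Group on the index level rather than a column, then take the
# running sum of `total` within each customer. Row order is preserved.
df["cumsum"] = df.groupby(level="customer_id")["total"].cumsum()

print(df)
```

To overwrite total rather than add a new column, assign the result to df["total"] instead of df["cumsum"].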