Python Groupby Running Total / Cumsum column based on a string in another column

Time: 2018-11-16 04:57:25

Tags: python pandas running-total

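I want to create 2 "running total" columns that sum the Amount values within each Deal, depending on whether TYPE is ANNUAL or MONTHLY. So it would be DF.groupby(['Deal','Booking Month']), and then somehow apply a sum function where TYPE==ANNUAL for the first column and where TYPE==MONTHLY for the second column.

This is what my grouped DF looks like, plus the two desired columns: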

Deal  TYPE   Month   Amount     Running Total(ANNUAL)   Running Total(Monthly)
A    ANNUAL   April    1000       1000                    0
A    ANNUAL   April    2000       3000                    0
A    MONTHLY  June     1500       3000                   1500
B    MONTHLY  April    11150      0                      11150
B    ANNUAL   July     700        700                    11150
B    ANNUAL   August   303.63     1003.63                11150
C    ANNUAL   April    25624.59   25624.59                0
D    ANNUAL   June     5000       5000                    0
D    ANNUAL   July     5000       10000                   0
D    ANNUAL   August   5000       15000                   0
E    ANNUAL   April    10         10                      0
E    MONTHLY  May      1000       10                      1000
E    ANNUAL   May      500        510                     1000
E    MONTHLY  June     500.00     510                     1500
E    ANNUAL   June     600        1110                    1500
E    MONTHLY  July     300        1110                    1800
E    MONTHLY  July     8200       1110                    10000         

2 Answers:

Answer 0: (score: 3)

Use groupby + transform.
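
A minimal sketch of the groupby + transform idea, assuming the column names from the question's table. This is one way to do it and not necessarily the answerer's exact code: for each TYPE, the amounts belonging to the other type are replaced with 0, and a cumulative sum is then taken within each Deal.

```python
import pandas as pd

# Sample rows copied from the question's table (Deal, TYPE, Month, Amount).
rows = [
    ("A", "ANNUAL", "April", 1000), ("A", "ANNUAL", "April", 2000),
    ("A", "MONTHLY", "June", 1500),
    ("B", "MONTHLY", "April", 11150), ("B", "ANNUAL", "July", 700),
    ("B", "ANNUAL", "August", 303.63),
    ("C", "ANNUAL", "April", 25624.59),
    ("D", "ANNUAL", "June", 5000), ("D", "ANNUAL", "July", 5000),
    ("D", "ANNUAL", "August", 5000),
    ("E", "ANNUAL", "April", 10), ("E", "MONTHLY", "May", 1000),
    ("E", "ANNUAL", "May", 500), ("E", "MONTHLY", "June", 500),
    ("E", "ANNUAL", "June", 600), ("E", "MONTHLY", "July", 300),
    ("E", "MONTHLY", "July", 8200),
]
df = pd.DataFrame(rows, columns=["Deal", "TYPE", "Month", "Amount"])

# Output column names follow the question's desired layout.
targets = {
    "ANNUAL": "Running Total(ANNUAL)",
    "MONTHLY": "Running Total(Monthly)",
}

for typ, col in targets.items():
    # Keep Amount where TYPE matches, use 0 elsewhere, so rows of the
    # other type carry the previous total forward unchanged.
    masked = df["Amount"].where(df["TYPE"] == typ, 0)
    # Cumulative sum within each Deal, aligned to the original rows.
    df[col] = masked.groupby(df["Deal"]).transform("cumsum")

print(df)
```

Unlike the unstack-based approach, this version names the two types explicitly; the row order of df is assumed to already be the order in which the totals should accumulate.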


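Answer 1: (score: 0)

You can do this with .expanding().sum(), which preserves the group MultiIndex, so you can unstack it to get a separate column for each type. Use another groupby to forward-fill the missing values within each group, then fill what is still missing with 0. Finally, concat the result back onto the original DataFrame.

A nice property is that this works for any number of types without having to define them explicitly anywhere.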

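import pandas as pd

# df is the DataFrame from the question (columns: Deal, TYPE, Month, Amount)
df2 = (df.groupby(['Deal', 'TYPE'])
         .Amount.expanding().sum()         # running total per (Deal, TYPE)
         .unstack(level=1)                 # one column per TYPE
         .groupby(level=0)
         .ffill().fillna(0)                # carry totals forward within each Deal
         .reset_index(level=0, drop=True)  # back to the original row index
         # 'Deal' may or may not be present as a column, depending on the pandas version
         .drop(columns='Deal', errors='ignore'))

pd.concat([df, df2], axis=1)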

Output:

   Deal     TYPE   Month    Amount    ANNUAL  MONTHLY
0     A   ANNUAL   April   1000.00   1000.00      0.0
1     A   ANNUAL   April   2000.00   3000.00      0.0
2     A  MONTHLY    June   1500.00   3000.00   1500.0
3     B  MONTHLY   April  11150.00      0.00  11150.0
4     B   ANNUAL    July    700.00    700.00  11150.0
5     B   ANNUAL  August    303.63   1003.63  11150.0
6     C   ANNUAL   April  25624.59  25624.59      0.0
7     D   ANNUAL    June   5000.00   5000.00      0.0
8     D   ANNUAL    July   5000.00  10000.00      0.0
9     D   ANNUAL  August   5000.00  15000.00      0.0
10    E   ANNUAL   April     10.00     10.00      0.0
11    E  MONTHLY     May   1000.00     10.00   1000.0
12    E   ANNUAL     May    500.00    510.00   1000.0
13    E  MONTHLY    June    500.00    510.00   1500.0
14    E   ANNUAL    June    600.00   1110.00   1500.0
15    E  MONTHLY    July    300.00   1110.00   1800.0
16    E  MONTHLY    July   8200.00   1110.00  10000.0