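pandas: calculate values for a new column based on multiple conditions across multiple columns, without looping

Time: 2016-12-10 14:15:19

Tags: python pandas

My data is in the following DataFrame: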

import pandas as pd

df = pd.DataFrame({'AccID':['001','001','001','002','002','003'],
                   'AccTypes':['A','B','C','A','B','C'],
                   'Status':['Closed','Active','Active','Active','Closed','Active'],
                   'Years':[5,15,10,20,25,30]})

AccID     AccTypes     Status     Years
001       A            Closed     5
001       B            Active     15
001       C            Active     10
002       A            Active     20
002       B            Closed     25
003       C            Active     30

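I want to create another column named "ActiveYears". For rows whose Status is Active, the value should be the maximum Years among the active rows of the same AccID, regardless of AccTypes. For rows whose Status is Closed, the value should simply equal Years. The expected output is as follows: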

AccID     AccTypes     Status     Years     ActiveYears     Explanations
001       A            Closed     5         5               # Status = Closed, we set ActiveYears = Years
001       B            Active     15        15              # Status = Active, we select the maximum year of AccID = 001 with active status
001       C            Active     10        15              # Status = Active, we select the maximum year of AccID = 001 with active status
002       A            Active     20        20              # Status = Active, we select the maximum year of AccID = 002 with active status
002       B            Closed     25        25              # Status = Closed, we set ActiveYears = Years
003       C            Active     30        30              # Status = Active, we select the maximum year of AccID = 003 with active status

I can do this with a loop, but that is not elegant. Is there a better way to do this than looping? Thanks.

1 Answer:

Answer 0: (score: 0)

You can use the following.

First, handle the rows whose Status is Closed:

df.loc[df.Status == 'Closed','ActiveYears'] = df.loc[df.Status == 'Closed','Years']

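Then handle the Active rows with a groupby transform. The string 'max' is passed rather than the builtin max, since recent pandas versions warn when a builtin is passed to transform:

df.loc[df.Status == 'Active', 'ActiveYears'] = df[df.Status == 'Active'].groupby('AccID')['Years'].transform('max')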
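
As a side note on why the assignment above lines up row by row: transform returns one value per input row and keeps the original row labels, whereas agg collapses each group to a single row. A minimal sketch on the same data (the names active, per_group and per_row are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'AccID': ['001', '001', '001', '002', '002', '003'],
                   'AccTypes': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'Status': ['Closed', 'Active', 'Active', 'Active', 'Closed', 'Active'],
                   'Years': [5, 15, 10, 20, 25, 30]})

# Keep only the active rows; their original row labels (1, 2, 3, 5) are preserved.
active = df[df['Status'] == 'Active']

# agg: one row per AccID, indexed by AccID.
per_group = active.groupby('AccID')['Years'].agg('max')

# transform: one row per active row, indexed by the original row labels,
# so it can be assigned back through df.loc with the same boolean mask.
per_row = active.groupby('AccID')['Years'].transform('max')

print(per_group)
print(per_row)
```

Because per_row carries labels 1, 2, 3 and 5, pandas aligns it with the masked rows of df when assigning.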

print(df)

  AccID AccTypes  Status  Years  ActiveYears
0   001        A  Closed      5          5.0
1   001        B  Active     15         15.0
2   001        C  Active     10         15.0
3   002        A  Active     20         20.0
4   002        B  Closed     25         25.0
5   003        C  Active     30         30.0
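
Note that ActiveYears comes out as float in the output above, because the column is first created by a partial assignment and the unassigned rows start as NaN. If an integer column is preferred, one possible alternative (a sketch, not the only way) builds the column in a single pass with Series.where; the names active and active_max are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'AccID': ['001', '001', '001', '002', '002', '003'],
                   'AccTypes': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'Status': ['Closed', 'Active', 'Active', 'Active', 'Closed', 'Active'],
                   'Years': [5, 15, 10, 20, 25, 30]})

active = df['Status'] == 'Active'

# Mask Years of closed rows to NaN so they are ignored by max,
# then broadcast the per-AccID maximum back to every row.
active_max = df['Years'].where(active).groupby(df['AccID']).transform('max')

# Closed rows keep their own Years; active rows take the group maximum.
df['ActiveYears'] = df['Years'].where(~active, active_max).astype(int)

print(df)
```

On the sample data this gives the same values as the two-step approach, only stored as integers. An account with no active rows is still handled, since its closed rows never read from active_max.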