Question

            A    B          C   D  E
0  2002-01-13  Dan 2002-01-15  26 -1
1  2002-01-13  Dan 2002-01-15  10  0
2  2002-01-13  Dan 2002-01-15  16  1
3  2002-01-13  Vic 2002-01-17  14  0
4  2002-01-13  Vic 2002-01-03  18  0
5  2002-01-28  Mel 2002-02-08  37  0
6  2002-01-28  Mel 2002-02-06  29  0
7  2002-01-28  Mel 2002-02-10  20  0
8  2002-01-28  Rob 2002-02-12  30 -1
9  2002-01-28  Rob 2002-02-12  48  1
10 2002-01-28  Rob 2002-02-12   0  1
11 2002-01-28  Rob 2002-02-01  19  0

An hour ago, Wen answered a very similar question, but I forgot some conditions. I'll write them down in bold:

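I want to create a new df['F'] column that follows the conditions below, per B group and ignoring zeros in column D:

F = the D value of the row where E = 0 and the C date is closest to 10 days after the A date.
If the rows at that closest date contain no E = 0 row (the 2002-01-28 Rob case), F is the mean of the D values where E = -1 and E = 1.
If two C dates are equally close to 10 days after A (the 2002-01-28 Mel case), F is the mean of the D values for those dates.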

The output should be:

            A    B          C   D  E   F
0  2002-01-13  Dan 2002-01-15  26 -1  10
1  2002-01-13  Dan 2002-01-15  10  0  10
2  2002-01-13  Dan 2002-01-15  16  1  10
3  2002-01-13  Vic 2002-01-17  14  0  14
4  2002-01-13  Vic 2002-01-03  18  0  14
5  2002-01-28  Mel 2002-02-08  37  0  33
6  2002-01-28  Mel 2002-02-06  29  0  33
7  2002-01-28  Mel 2002-02-10  20  0  33
8  2002-01-28  Rob 2002-02-12  30 -1  39
9  2002-01-28  Rob 2002-02-12  48  1  39
10 2002-01-28  Rob 2002-02-12   0  1  39
11 2002-01-28  Rob 2002-02-01  19  0  39

Wen answered:

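# Absolute distance, in days, between (C - A) and the 10-day target
df['F'] = ((df.C - df.A).dt.days - 10).abs()
# Per B group, keep the rows at the minimum distance and map the mean of their D values back to every row
df['F'] = df.B.map(df.loc[df.F == df.groupby('B').F.transform('min')].groupby('B').D.mean())
df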
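To see what the first step of that answer produces, here is a minimal sketch that rebuilds the sample table and computes the distance from the 10-day target for each row. The names `dist` and `closest` are hypothetical and only used for illustration; the data is copied from the table in the question.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "A": pd.to_datetime(["2002-01-13"] * 5 + ["2002-01-28"] * 7),
    "B": ["Dan"] * 3 + ["Vic"] * 2 + ["Mel"] * 3 + ["Rob"] * 4,
    "C": pd.to_datetime([
        "2002-01-15", "2002-01-15", "2002-01-15", "2002-01-17", "2002-01-03",
        "2002-02-08", "2002-02-06", "2002-02-10",
        "2002-02-12", "2002-02-12", "2002-02-12", "2002-02-01",
    ]),
    "D": [26, 10, 16, 14, 18, 37, 29, 20, 30, 48, 0, 19],
    "E": [-1, 0, 1, 0, 0, 0, 0, 0, -1, 1, 1, 0],
})

# Hypothetical helper: how far each row's C date is from A + 10 days.
dist = ((df["C"] - df["A"]).dt.days - 10).abs()

# True for the rows that sit at the smallest distance within their B group.
closest = dist == dist.groupby(df["B"]).transform("min")

print(df.assign(dist=dist, closest=closest))
```

For Mel, rows 5 and 6 tie at a distance of 1 day, so both are marked as closest. For Rob, the three closest rows (distance 5) have E = -1 or 1, and one of them has D = 0, which is why the extra conditions matter.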

But now I can't work in the new conditions (the ones I put in bold).

Answer 1

Change the mapper to

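# df['F'] must still hold the day distance computed by the first line of Wen's answer.
# Among each group's closest rows with D != 0, take the mean D of the E == 0 rows, else the mean D of all of them.
m = df.loc[(df.F == df.groupby('B').F.transform('min')) & (df.D != 0)].groupby('B').apply(lambda x: x['D'][x['E'] == 0].mean() if (x['E'] == 0).any() else x['D'].mean())

df['F'] = df.B.map(m)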
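For anyone who wants to run this end to end, the following is a sketch of the same idea against the sample data from the question, with the one-liner split into named steps. The names `closest`, `candidates` and `pick` are hypothetical, and selecting `[["D", "E"]]` before `apply` is my own choice to keep the grouping column out of the applied function; it is not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "A": pd.to_datetime(["2002-01-13"] * 5 + ["2002-01-28"] * 7),
    "B": ["Dan"] * 3 + ["Vic"] * 2 + ["Mel"] * 3 + ["Rob"] * 4,
    "C": pd.to_datetime([
        "2002-01-15", "2002-01-15", "2002-01-15", "2002-01-17", "2002-01-03",
        "2002-02-08", "2002-02-06", "2002-02-10",
        "2002-02-12", "2002-02-12", "2002-02-12", "2002-02-01",
    ]),
    "D": [26, 10, 16, 14, 18, 37, 29, 20, 30, 48, 0, 19],
    "E": [-1, 0, 1, 0, 0, 0, 0, 0, -1, 1, 1, 0],
})

# Step 1 (from Wen's answer): distance of C - A from the 10-day target.
df["F"] = ((df["C"] - df["A"]).dt.days - 10).abs()

# Step 2: rows at the minimum distance within their B group, zeros in D dropped.
closest = df["F"] == df.groupby("B")["F"].transform("min")
candidates = df.loc[closest & (df["D"] != 0)]


def pick(group):
    """Mean D of the E == 0 rows if there are any, else mean D of all rows."""
    e0 = group.loc[group["E"] == 0, "D"]
    return e0.mean() if not e0.empty else group["D"].mean()


# Step 3: one value per B group, mapped back onto every row.
m = candidates.groupby("B")[["D", "E"]].apply(pick)
df["F"] = df["B"].map(m)

print(df)
```

One thing to watch: the minimum distance is computed before the D = 0 rows are dropped, so if every closest row in a group had D = 0, that group would be missing from `m` and its F would come out as NaN. That does not happen with the sample data.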

Multiple dataframe date and group conditions
