I have a df:
ID YEART Commdate Cat Category
0 LVI6AE2 1993 2017-03-24 LVI6AE2_1 56
1 LVI6BE2 1994 2017-03-24 LVI6BE2_1 67
2 APJ5LEV 1975 2017-03-13 APJ5LEV_1 78
3 LQL0AE3 1986 2017-03-16 LQL0AE3_1 87
4 BLR3UEV 1982 2017-03-15 BLR3UEV_1 90
5 BRL1NEV 1981 2017-03-15 BRL1NEV_1 90
6 BRL1NEV 1981 2017-03-16 BRL1NEV_1 90
7 BRL1NEV 1981 2017-03-22 Ungrouped 190
8 BRL1NEV 1981 2017-03-17 Ungrouped 190
9 BRL1NEV 1981 2017-03-17 Ungrouped 190
10 BRL1NEV 1981 2017-03-22 Ungrouped 190
11 BRL1NEV 1981 2017-03-20 BRL1NEV_1 90
12 BRL1NEV 1981 2017-02-01 BRL1NEV_1 90
13 UEE6JSV 2000 2017-03-15 UEE6JSV_1 34
14 UGQ4VE2 1993 2014-07-25 UGQ4VE2_1 45
15 UTU6BE1 1986 2017-03-13 UTU6BE1_1 12
16 NVT 1999 2017-03-10 NVT_1 12
17 OTL3JE1 2001 2017-02-01 OTL3JE1_1 12
18 OTL5XS1 2003 2017-03-01 OTL5XS1_1 12
19 OTL6AE1 2001 2017-03-01 OTL6AE1_1 12
20 JVU6AE1 1999 2017-03-31 JVU6AE1_1 12
21 JVU6AE2 1993 2017-03-31 Ungrouped 120
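I want to compute the earliest `Commdate` within each group of rows that share the same `ID` and `YEART`, but only for rows where `Cat` is `Ungrouped` or `Category` > 100.
I came up with the following line: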
#To Datetime
df['Commdate'] =pd.to_datetime(df['Commdate'])
#groupby
df["EarliestD"] =df.groupby(['ID', 'YEART']).filter(lambda x : x['Category'].count()>=90)['Commdate'].min()
The result is `NaT` in every row of `EarliestD`:
ID YEART Commdate Cat Category EarliestD
0 LVI6AE2 1993 2017-03-24 LVI6AE2_1 56 NaT
1 LVI6BE2 1994 2017-03-24 LVI6BE2_1 67 NaT
2 APJ5LEV 1975 2017-03-13 APJ5LEV_1 78 NaT
3 LQL0AE3 1986 2017-03-16 LQL0AE3_1 87 NaT
4 BLR3UEV 1982 2017-03-15 BLR3UEV_1 90 NaT
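Questions:
1. Is it possible to group by multiple columns conditionally, i.e. only for rows that satisfy conditions on other columns?
2. Is it possible to do this multi-condition groupby through a `def` function?
Thanks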
Answer 0 (score: 0)
You can use a boolean filter together with `groupby` + `transform`:
# convert Commdate to datetime if necessary
df['Commdate'] = pd.to_datetime(df['Commdate'])
# calculate mask for splitting dataframe
cat_mask = (df['Cat'] == 'Ungrouped') | (df['Category'] > 100)
# groupby uncategorised / category > 100
df.loc[cat_mask, 'Commdate'] = df.loc[cat_mask].groupby(['ID', 'YEART'])['Commdate'].transform('min')
Result:
print(df)
ID YEART Commdate Cat Category
0 LVI6AE2 1993 2017-03-24 LVI6AE2_1 56
1 LVI6BE2 1994 2017-03-24 LVI6BE2_1 67
2 APJ5LEV 1975 2017-03-13 APJ5LEV_1 78
3 LQL0AE3 1986 2017-03-16 LQL0AE3_1 87
4 BLR3UEV 1982 2017-03-15 BLR3UEV_1 90
5 BRL1NEV 1981 2017-03-15 BRL1NEV_1 90
6 BRL1NEV 1981 2017-03-16 BRL1NEV_1 90
7 BRL1NEV 1981 2017-03-17 Ungrouped 190
8 BRL1NEV 1981 2017-03-17 Ungrouped 190
9 BRL1NEV 1981 2017-03-17 Ungrouped 190
10 BRL1NEV 1981 2017-03-17 Ungrouped 190
11 BRL1NEV 1981 2017-03-20 BRL1NEV_1 90
12 BRL1NEV 1981 2017-02-01 BRL1NEV_1 90
13 UEE6JSV 2000 2017-03-15 UEE6JSV_1 34
14 UGQ4VE2 1993 2014-07-25 UGQ4VE2_1 45
15 UTU6BE1 1986 2017-03-13 UTU6BE1_1 12
16 NVT 1999 2017-03-10 NVT_1 12
17 OTL3JE1 2001 2017-02-01 OTL3JE1_1 12
18 OTL5XS1 2003 2017-03-01 OTL5XS1_1 12
19 OTL6AE1 2001 2017-03-01 OTL6AE1_1 12
20 JVU6AE1 1999 2017-03-31 JVU6AE1_1 12
21 JVU6AE2 1993 2017-03-31 Ungrouped 120
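Note that the snippet above overwrites `Commdate` for the masked rows instead of filling a separate column. The attempt in the question returns `NaT` because `x['Category'].count()` counts the rows in each group rather than testing the values: no group has 90 or more rows, so `filter` returns an empty frame and its `.min()` is `NaT`. The second question asks about a `def` function, so here is a minimal sketch of the same mask + `groupby` + `transform` idea wrapped in one. The function name `add_earliest_date` and its parameters are made up for illustration, the output column name `EarliestD` is taken from the question, and the sample frame is a small subset of the question's rows.

```python
import pandas as pd


def add_earliest_date(df, keys=("ID", "YEART"), date_col="Commdate", out_col="EarliestD"):
    """Return a copy of df with out_col holding the earliest date per group,
    filled only for rows where Cat is 'Ungrouped' or Category > 100.
    (Hypothetical helper; names are assumptions, not from the original answer.)"""
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])

    # Rows that should take part in the grouping
    mask = (out["Cat"] == "Ungrouped") | (out["Category"] > 100)

    # transform('min') keeps the index of the masked rows, so assigning the
    # result aligns on the index and leaves every other row as NaT
    out[out_col] = out.loc[mask].groupby(list(keys))[date_col].transform("min")
    return out


# A few rows taken from the question's data
sample = pd.DataFrame({
    "ID": ["BRL1NEV", "BRL1NEV", "BRL1NEV", "LVI6AE2", "JVU6AE2"],
    "YEART": [1981, 1981, 1981, 1993, 1993],
    "Commdate": ["2017-03-15", "2017-03-22", "2017-03-17", "2017-03-24", "2017-03-31"],
    "Cat": ["BRL1NEV_1", "Ungrouped", "Ungrouped", "LVI6AE2_1", "Ungrouped"],
    "Category": [90, 190, 190, 56, 120],
})

result = add_earliest_date(sample)
print(result)
```

Because the result goes into a new column, the original `Commdate` values are left untouched, and rows outside the mask simply get `NaT` in `EarliestD`.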