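How can I group by different columns depending on a condition? For example:
For rows where column CL is a, b or c, group by columns A and C and take ["TOTAL"].min(); for rows where CL is d, e or f, group by column B and take ["TOTAL"].min().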
CL | A | B | C | TOTAL
a | 1 | 6 | 5 | 125,000
b | 2 | 5 | 5 | 140,000
c | 3 | 4 | 5 | 148,000
d | 4 | 3 | 6 | 125,000
e | 5 | 2 | 6 | 136,000
f | 6 | 1 | 6 | 156,000
Answer 0 (score: 0)
OK, I see 2 options here:
(1) Group and aggregate each subset separately, then concatenate the results:
pd.concat([df.loc[df["CL"].isin(["a", "b", "c"])].groupby(["A", "C"])["TOTAL"].min(),
df.loc[df["CL"].isin(["d", "e", "f"])].groupby("B")["TOTAL"].min()])
Output:
(1, 5) 125000
(2, 5) 140000
(3, 5) 148000
1 156000
2 136000
3 125000
Name: TOTAL, dtype: int64
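For reference, option (1) can be sketched as a self-contained script. The DataFrame below is rebuilt from the table in the question, assuming TOTAL holds plain integers (thousands separators dropped); the names `is_abc`, `is_def`, `min_by_a_c` and `min_by_b` are introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question; TOTAL is assumed to be integer-valued.
df = pd.DataFrame({
    "CL": ["a", "b", "c", "d", "e", "f"],
    "A": [1, 2, 3, 4, 5, 6],
    "B": [6, 5, 4, 3, 2, 1],
    "C": [5, 5, 5, 6, 6, 6],
    "TOTAL": [125000, 140000, 148000, 125000, 136000, 156000],
})

# Boolean masks selecting the two row subsets.
is_abc = df["CL"].isin(["a", "b", "c"])
is_def = df["CL"].isin(["d", "e", "f"])

# Aggregate each subset with its own grouping columns.
min_by_a_c = df.loc[is_abc].groupby(["A", "C"])["TOTAL"].min()
min_by_b = df.loc[is_def].groupby("B")["TOTAL"].min()

# Stack the two results into a single Series.
result = pd.concat([min_by_a_c, min_by_b])
print(result)
```

Keeping the two intermediate Series around can be handy, because the concatenated result mixes tuple labels (A, C) with scalar labels (B) in one index.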
(2) Alternatively, build a dummy grouping key. For example, you can mask the grouping keys you don't need with -1, like this:
import numpy as np
# Work on a copy so the original df is not modified.
df2 = df.copy()
# Mask the unwanted grouping keys with a placeholder value. Do not use None/NaN:
# groupby drops rows with missing keys by default.
# Pick a placeholder that cannot collide with any real key value.
is_abc = df["CL"].isin(["a", "b", "c"])
df2["B"] = np.where(is_abc, -1, df["B"])
df2["A"] = np.where(is_abc, df["A"], -1)
df2["C"] = np.where(is_abc, df["C"], -1)
df2.groupby(["A", "B", "C"])["TOTAL"].min()
Output:
A   B   C
-1   1  -1    156000
     2  -1    136000
     3  -1    125000
 1  -1   5    125000
 2  -1   5    140000
 3  -1   5    148000
Name: TOTAL, dtype: int64
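The same masking idea can also be written with pandas' own `Series.where` and `Series.mask` instead of `np.where`. This is only a sketch: the data is rebuilt from the question's table (TOTAL assumed to be integers), and -1 is assumed never to occur as a real value in A, B or C.

```python
import pandas as pd

# Sample data copied from the question; TOTAL is assumed to be integer-valued.
df = pd.DataFrame({
    "CL": ["a", "b", "c", "d", "e", "f"],
    "A": [1, 2, 3, 4, 5, 6],
    "B": [6, 5, 4, 3, 2, 1],
    "C": [5, 5, 5, 6, 6, 6],
    "TOTAL": [125000, 140000, 148000, 125000, 136000, 156000],
})

is_abc = df["CL"].isin(["a", "b", "c"])

df2 = df.copy()
# where() keeps values where the condition is True and fills -1 elsewhere.
df2["A"] = df["A"].where(is_abc, -1)
df2["C"] = df["C"].where(is_abc, -1)
# mask() is the inverse: it fills -1 where the condition is True.
df2["B"] = df["B"].mask(is_abc, -1)

result = df2.groupby(["A", "B", "C"])["TOTAL"].min()
print(result)
```

If no safe placeholder value exists, option (1) avoids the problem, since it never needs a placeholder.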
Answer 1 (score: 0)
I ended up solving it by adding an extra column, "test", with the following code:
z['test'] = np.where(z['ACTIVITY_PHASE'].isin(['FRAC','COIL']), z['TOTAL'],
(np.where(z['ACTIVITY_PHASE']=='PREW', z.groupby(z['ACTIVITY_PHASE']=='PREW')['TOTAL'].transform('min'),
(np.where(z['ACTIVITY_PHASE']=='WINF', z.groupby(z['ACTIVITY_PHASE']=='WINF')['TOTAL'].transform('min'),
(np.where(z['ACTIVITY_PHASE']=='WOR', z.groupby(z['ACTIVITY_PHASE']=='WOR')['TOTAL'].transform('min'), 0)))))))
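The nested `np.where` calls above can probably be flattened. The sketch below assumes the intent is: keep TOTAL for FRAC and COIL rows, use the per-phase minimum of TOTAL for PREW, WINF and WOR rows, and 0 otherwise. The sample data is hypothetical; only the column names come from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names are taken from the answer.
z = pd.DataFrame({
    "ACTIVITY_PHASE": ["FRAC", "COIL", "PREW", "PREW", "WINF", "WINF", "WOR", "OTHER"],
    "TOTAL": [10, 20, 30, 25, 40, 45, 50, 60],
})

phase = z["ACTIVITY_PHASE"]
# Minimum TOTAL within each phase, broadcast back to every row.
phase_min = z.groupby("ACTIVITY_PHASE")["TOTAL"].transform("min")

# np.select picks, per row, the choice of the first condition that is True.
z["test"] = np.select(
    [phase.isin(["FRAC", "COIL"]), phase.isin(["PREW", "WINF", "WOR"])],
    [z["TOTAL"], phase_min],
    default=0,
)
print(z)
```

Grouping once by ACTIVITY_PHASE replaces the three separate boolean-key groupbys, and adding another phase only means extending one of the two lists.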