I have a dataframe like the one below:
Id Name Mag Out Des
23 Yah 1.0 base n-0
23 Yah 1.0 base n-0
23 Yah 1.0 base n-0
24 Nah 0.99 base n-0
24 Nah 1.01 line-2 line-2
24 Nah 0.95 line-3 line-3
24 Nah 1.1 line-4 line-4
25 lol 1.0 line-1 line-1
25 lol 1.1 line-3 line-3
25 lol 0.9 line-4 line-4
25 lol 0.95 line-5 line-5
The output must satisfy the following conditions:
The output must be in the following format:
Id Name Mag Out Des
23 Yah 1.0 base n-0
24 Nah 0.99 base n-0
24 Nah 0.95 line-3 line-3
24 Nah 1.1 line-4 line-4
25 lol 0.9 line-4 line-4
25 lol 0.95 line-5 line-5
25 lol 1.0 line-1 line-1
25 lol 1.1 line-3 line-3
Answer 0 (score: 1)
Here is one way to do it. For clarity, it is done in several steps:
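def check_base(x):
    if all(elem == "base" for elem in x):
        # Every row in the group is "base": keep the first row, drop the rest.
        return ["keep"] + ["drop"] * (len(x) - 1)
    elif "base" in list(x):
        # Mixed group: keep the "base" rows; the others are decided by Mag below.
        return ["keep" if i == "base" else "maybe" for i in x]
    else:
        # No "base" row in the group: keep every row.
        return ["keep"] * len(x)

df["criteria"] = df.groupby(["Id", "Name"]).Out.transform(check_base)
g_min = df.groupby(["Id", "Name"]).Mag.transform("min")
g_max = df.groupby(["Id", "Name"]).Mag.transform("max")
# "maybe" rows survive only if they hold the group's minimum or maximum Mag.
df = df[(df.criteria == "keep") | ((df.criteria == "maybe") & ((df.Mag == g_min) | (df.Mag == g_max)))]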
The result is:
Id Name Mag Out Des criteria
0 23 Yah 1.00 base n-0 keep
3 24 Nah 0.99 base n-0 keep
5 24 Nah 0.95 line-3 line-3 maybe
6 24 Nah 1.10 line-4 line-4 maybe
7 25 lol 1.00 line-1 line-1 keep
8 25 lol 1.10 line-3 line-3 keep
9 25 lol 0.90 line-4 line-4 keep
10 25 lol 0.95 line-5 line-5 keep
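
For anyone who wants to reproduce this without the helper column, here is a minimal sketch of the same selection written with boolean masks. This is an alternative formulation, not the code above: the variable names (`keys`, `is_base`, `has_base`, `all_base`, `is_extreme`, `result`) are made up for illustration, and the rules are the ones inferred from `check_base` (one row for an all-"base" group, "base" plus the min/max `Mag` rows for a mixed group, everything for a group without "base").

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id":   [23, 23, 23, 24, 24, 24, 24, 25, 25, 25, 25],
    "Name": ["Yah"] * 3 + ["Nah"] * 4 + ["lol"] * 4,
    "Mag":  [1.0, 1.0, 1.0, 0.99, 1.01, 0.95, 1.1, 1.0, 1.1, 0.9, 0.95],
    "Out":  ["base", "base", "base", "base", "line-2", "line-3", "line-4",
             "line-1", "line-3", "line-4", "line-5"],
    "Des":  ["n-0", "n-0", "n-0", "n-0", "line-2", "line-3", "line-4",
             "line-1", "line-3", "line-4", "line-5"],
})

keys = [df["Id"], df["Name"]]
grouped = df.groupby(["Id", "Name"])

# Per-row flags describing the group each row belongs to.
is_base = df["Out"].eq("base")
has_base = is_base.groupby(keys).transform("any")
all_base = is_base.groupby(keys).transform("all")
first_in_group = grouped.cumcount().eq(0)

# Rows holding the smallest or largest Mag of their group.
mag = grouped["Mag"]
is_extreme = df["Mag"].eq(mag.transform("min")) | df["Mag"].eq(mag.transform("max"))

keep = (
    (all_base & first_in_group)                          # all "base": first row only
    | (has_base & ~all_base & (is_base | is_extreme))    # mixed: "base" + min/max rows
    | ~has_base                                          # no "base": keep everything
)
result = df[keep]
print(result)
```

The rows come back in their original order with their original index labels. If the ordering shown in the question matters, a `sort_values` call can be appended, but the exact sort key is not stated in the question, so it is left out here.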