Question

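Yesterday I posted this question about creating a new column in a df. Now I'm curious how to build a new DataFrame that contains only the extreme elements. For example: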

import pandas as pd

df = pd.DataFrame({'Event':['A','A','A','A','A','B','B','B','B','B'], 'Number':[1,2,3,4,5,6,7,8,9,10], 'Ref':[False,False,False,False,True,False,False,False,True,False]})
# For each row, subtract the Number of its group's reference row (the row where Ref is True)
df["new"] = df.Number - df.Number[df.groupby('Event')['Ref'].transform('idxmax')].reset_index(drop=True)
print(df)

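This gives the df shown in Output 1. Now I'd like to create a new df1 that contains only the rows where new has the largest absolute value within each Event group. The result should be Output 2 below. I know I can use something like df1 = df.loc[df['new'].idxmin()], but that returns only a single row. I'm not sure how to iterate over the different groups or how to apply numpy functions to them. I suspect this is a one-liner, but I'm not sure how to approach it.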

Output 1:

  Event  Number    Ref  new
0     A       1  False   -4
1     A       2  False   -3
2     A       3  False   -2
3     A       4  False   -1
4     A       5   True    0
5     B       6  False   -3
6     B       7  False   -2
7     B       8  False   -1
8     B       9   True    0
9     B      10  False    1

Output 2:

  Event  Number    Ref  new
0     A       1  False   -4
1     B       6  False   -3

Answer 1

Let me try to answer your extended question here using merge:

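# Left: for each Event, the row with the largest |new|
# Right: each Event's reference row (Ref is True), keeping only Event and Number
new_df = pd.merge(df.loc[df['new'].abs().groupby(df['Event']).idxmax()],
                  df.loc[df['Ref'], ['Event', 'Number']],
                  on='Event',
                  suffixes=('', '_ref'))
print(new_df)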

Output:

  Event  Number    Ref  new  Number_ref
0     A       1  False   -4           5
1     B       6  False   -3           9
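If the extra Number_ref column is not needed, the selection step inside the merge is enough on its own to produce Output 2. The following is a minimal sketch, not necessarily how the answerer would write it. It rebuilds the sample df and computes new with a map on Event instead of the transform('idxmax') expression from the question, assuming each Event has exactly one row where Ref is True. The names ref_number and extreme_idx are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'Event': ['A'] * 5 + ['B'] * 5,
    'Number': list(range(1, 11)),
    'Ref': [False, False, False, False, True,
            False, False, False, True, False],
})

# Number of the reference row per Event (assumes one Ref == True row per group)
ref_number = df.loc[df['Ref']].set_index('Event')['Number']

# Offset of every row from its group's reference Number
df['new'] = df['Number'] - df['Event'].map(ref_number)

# Index label of the row with the largest |new| within each Event
extreme_idx = df['new'].abs().groupby(df['Event']).idxmax()

# Pick those rows; reset_index gives the 0, 1 index shown in Output 2
df1 = df.loc[extreme_idx].reset_index(drop=True)
print(df1)
```

Note that idxmax returns the first occurrence when several rows in a group tie on the absolute value, so each Event contributes exactly one row.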

Creating a new DataFrame from an existing groupby
