Question

I have two DataFrames, df1 and df2:

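I want to merge the DataFrames, but also include the first and/or last value of each group in column A. Here is an example of the desired result: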

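I have been trying to use merge, but that only keeps the part of the DataFrames that overlaps. Does anyone have an idea how to solve this? Thanks!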

Answer 1

Here is one way to do this, using merge with an indicator column, groupby, and rolling:

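# group_keys=False keeps the original row index so the boolean mask aligns with df1
df1[df1.merge(df2, on='B', how='left', indicator='Ind').eval('Found=Ind == "both"')
       .groupby('A', group_keys=False)['Found']
       .apply(lambda x: x.rolling(3, center=True, min_periods=2).max()).astype(bool)]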

Output:
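
For anyone who wants to try the idea above, here is a minimal sketch on made-up data. The frames and the names merged, found, and mask are my own assumptions, not the asker's data. It uses transform rather than apply so the mask keeps the original row index. As far as I can tell, this approach keeps each matched row plus its direct neighbours within the same A group, which is not necessarily the first or last row of the group.

```python
import pandas as pd

# Hypothetical data for illustration only (not the frames from the question).
df1 = pd.DataFrame({
    "A": [1, 1, 1, 1, 2, 2, 2],
    "B": [10, 20, 30, 40, 50, 60, 70],
})
df2 = pd.DataFrame({"B": [20, 70]})

# A left merge keeps every row of df1; the indicator column records
# whether each B value was also present in df2 ("both") or not ("left_only").
merged = df1.merge(df2, on="B", how="left", indicator="Ind")
found = (merged["Ind"] == "both").astype(float)

# Within each A group, a centered window of 3 rows flags a row when
# the row itself or one of its direct neighbours matched df2.
mask = found.groupby(merged["A"]).transform(
    lambda s: s.rolling(3, center=True, min_periods=2).max()
).astype(bool)

result = merged.loc[mask, ["A", "B"]]
print(result)
```

One caveat: a group with a single row has fewer than min_periods observations in its window, so the rolling max is NaN, and astype(bool) turns NaN into True.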

Answer 2

(pd.concat([df1.groupby('A').min().reset_index(),
            pd.merge(df1, df2, on="B"),
            df1.groupby('A').max().reset_index()])
   .reset_index(drop=True)
   .drop_duplicates()
   .sort_values(['A', 'B']))
    A   B
0   1   2
4   1  32
5   1  42
1   2  16
2   3  13
7   3  24
8   3  35
3   4  12
9   4  39
10  4  49

Breaking down each part:

# Get the minimum of each group in A
df1.groupby('A').min().reset_index()

# Merge on B (inner join: keeps only the rows whose B appears in both frames)
pd.merge(df1, df2, on="B")

# Get the maximum of each group in A
df1.groupby('A').max().reset_index()

# Reset the index and drop duplicated rows, since the merge result may
# overlap with the min/max rows. Then sort by 'A' and by 'B'.
.reset_index(drop=True).drop_duplicates().sort_values(['A','B'])
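
Put together, the steps above can be sketched as a runnable script. The data below is made up for illustration, and the names group_min, overlap, group_max, and combined are my own, not from the question.

```python
import pandas as pd

# Hypothetical data for illustration only (not the frames from the question).
df1 = pd.DataFrame({
    "A": [1, 1, 1, 1, 2, 2, 2],
    "B": [10, 20, 30, 40, 50, 60, 70],
})
df2 = pd.DataFrame({"B": [20, 70]})

# Smallest and largest B within each A group.
group_min = df1.groupby("A", as_index=False)["B"].min()
group_max = df1.groupby("A", as_index=False)["B"].max()

# Inner merge: only the rows of df1 whose B value also appears in df2.
overlap = pd.merge(df1, df2, on="B")

# Stack the three pieces, drop rows that appear more than once
# (a matched row can also be a group extreme), then sort.
combined = (
    pd.concat([group_min, overlap, group_max], ignore_index=True)
    .drop_duplicates()
    .sort_values(["A", "B"])
    .reset_index(drop=True)
)
print(combined)
```

Note that min and max pick the smallest and largest B per group, which match the first and last rows only when B is sorted within each group. If df1 had more columns, groupby('A').min() would take the minimum of every column separately, so the result might not correspond to an actual row.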

Merge DataFrames including extreme values
