How can I match a column in one DataFrame against two different columns in another DataFrame and retrieve the matching rows?

Asked: 2020-09-08 08:58:23

Tags: python pandas

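I have two DataFrames, shown below. I want to match the values in df2['Data1'] against both df1['Data1'] and df1['Data2'], and keep the rows of df1 where either column matches. The approach below works, but it is long. Is there a shorter alternative in pandas?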

import pandas as pd

df1 = pd.read_excel("df1.xlsx")
df2 = pd.read_excel("df2.xlsx")

df1

Data1   Data2   Score
ABC      AB1    1
AB1      ABC    4
AB2      AB2    6
ABC      ABD    0.7
GDH      ABD    0.9
KMN      KSF    0.5
KSF      KSF    6

df2

Data1
AB1
AB2
ABC
ABD

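# Rows whose Data1 appears in df2['Data1']
mapped = pd.merge(df1, df2, left_on='Data1', right_on='Data1')
# Rows whose Data2 appears in df2['Data1']; this merge yields Data1_x and Data1_y
mappedx = pd.merge(df1, df2, left_on='Data2', right_on='Data1')
mappedx.rename(columns={'Data1_x': 'Data1'}, inplace=True)
mappedx = mappedx[['Data1', 'Data2', 'Score']]
# Combine both sets of matches and drop the rows found by both merges
frame = [mapped, mappedx]
result = pd.concat(frame)
result = result.drop_duplicates()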

result

Data1   Data2   Score
ABC      AB1    1
AB1      ABC    4
AB2      AB2    6
ABC      ABD    0.7
GDH      ABD    0.9

1 Answer:

Answer 0 (score: 2)

Use Series.isin on each of the two columns and combine the resulting masks with | (bitwise OR):

df = df1[df1['Data1'].isin(df2['Data1']) | df1['Data2'].isin(df2['Data1'])]
print (df)
  Data1 Data2  Score
0   ABC   AB1    1.0
1   AB1   ABC    4.0
2   AB2   AB2    6.0
3   ABC   ABD    0.7
4   GDH   ABD    0.9
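
To see why this works, it can help to look at the two masks separately. The following is a minimal sketch that builds the DataFrames inline instead of reading the Excel files (an assumption: the values are copied from the tables in the question). The names in_data1 and in_data2 are introduced here only for illustration.

```python
import pandas as pd

# Inline stand-ins for df1.xlsx and df2.xlsx (assumed to hold the values shown in the question).
df1 = pd.DataFrame({
    "Data1": ["ABC", "AB1", "AB2", "ABC", "GDH", "KMN", "KSF"],
    "Data2": ["AB1", "ABC", "AB2", "ABD", "ABD", "KSF", "KSF"],
    "Score": [1, 4, 6, 0.7, 0.9, 0.5, 6],
})
df2 = pd.DataFrame({"Data1": ["AB1", "AB2", "ABC", "ABD"]})

# One boolean mask per column: True where the value appears in df2['Data1'].
in_data1 = df1["Data1"].isin(df2["Data1"])
in_data2 = df1["Data2"].isin(df2["Data1"])

# Keep a row when either column matched.
df = df1[in_data1 | in_data2]
print(df)
```

Because the filter only selects rows from df1, the original index and column names are kept, so no renaming or de-duplication step is needed.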

Or use DataFrame.isin together with DataFrame.any:

df = df1[df1[['Data1','Data2']].isin(df2['Data1'].tolist()).any(axis=1)]
print (df)
  Data1 Data2  Score
0   ABC   AB1    1.0
1   AB1   ABC    4.0
2   AB2   AB2    6.0
3   ABC   ABD    0.7
4   GDH   ABD    0.9
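
The .tolist() call matters here: when DataFrame.isin receives a Series, it matches on the index rather than on the values alone, so the lookup values are passed as a plain list. As a rough sketch, the same idea can be wrapped in a small helper and checked against the merge-based approach from the question. The helper name rows_matching_any and the inline data are assumptions made for this example, not part of pandas or of the original post.

```python
import pandas as pd

def rows_matching_any(df, columns, allowed):
    """Hypothetical helper: keep rows where any of `columns` holds a value from `allowed`."""
    mask = df[columns].isin(list(allowed)).any(axis=1)
    return df[mask]

# Inline stand-ins for the Excel files (assumed to hold the values shown in the question).
df1 = pd.DataFrame({
    "Data1": ["ABC", "AB1", "AB2", "ABC", "GDH", "KMN", "KSF"],
    "Data2": ["AB1", "ABC", "AB2", "ABD", "ABD", "KSF", "KSF"],
    "Score": [1, 4, 6, 0.7, 0.9, 0.5, 6],
})
df2 = pd.DataFrame({"Data1": ["AB1", "AB2", "ABC", "ABD"]})

filtered = rows_matching_any(df1, ["Data1", "Data2"], df2["Data1"])

# Merge-based approach in the spirit of the question, for comparison.
by_data1 = df1.merge(df2, on="Data1")
by_data2 = df1.merge(df2.rename(columns={"Data1": "Data2"}), on="Data2")
merged = pd.concat([by_data1, by_data2]).drop_duplicates()

# Compare the two results while ignoring row order and index labels.
key = ["Data1", "Data2", "Score"]
same = (
    filtered.sort_values(key).reset_index(drop=True)
    .equals(merged.sort_values(key).reset_index(drop=True))
)
print(same)
```

One difference to keep in mind: drop_duplicates in the merge-based version would also collapse rows that are genuinely duplicated in df1, whereas the isin filter keeps every matching row of df1 as it is.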