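Check whether a dataframe cell contains the value of a cell in another dataframe

Date: 2019-04-26 17:21:04

Tags: python pandas dataframe

I am trying to do the following:

Given a row in df1, if str(row['code']) is contained in any row of df2['code'], then I want df2['lamer_url_1'] and df2['shopee_url_1'] in all of those matching rows to take the corresponding values from df1. Then move on to the next row of df1['code'], and so on.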

'''

==============

Initial tables:

df1

         code  lamer_url_1  shopee_url_1
0  L61B18H089            b             a
1  L61S19H014            e             d
2  L61S19H015            z             y

df2

               code  lamer_url_1  shopee_url_1  lamer_url_2  shopee_url_2
0  L61B18H089-F1424          NaN           NaN          NaN           NaN
1  L61S19H014-S1500          NaN           NaN          NaN           NaN
2  L61B18H089-F1424          NaN           NaN          NaN           NaN

==============

Expected output:

df2

               code  lamer_url_1  shopee_url_1  lamer_url_2  shopee_url_2
0  L61B18H089-F1424            b             a          NaN           NaN
1  L61S19H014-S1500            e             d          NaN           NaN
2  L61B18H089-F1424            b             a          NaN           NaN

'''

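1 Answer:

Answer 0 (score: 1)

I am assuming that the common part of 'code' in 'df2' is the characters before the '-'. I am also assuming that from 'df1' we want 'lamer_url_1' and 'shopee_url_1', and from 'df2' we want 'lamer_url_2' and 'shopee_url_2' (if I got this wrong, correct me in the comments so I can refine the code):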

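import pandas as pd

# Index df1 by its code; index df2 by the part of its code before '-'.
df1.set_index(df1['code'], inplace=True)
df2.set_index(df2['code'].apply(lambda x: x.split('-')[0]), inplace=True)
df2.index.names = ['code_join']

# Inner join on the two indexes: df2 rows with no match in df1 are dropped.
df3 = pd.merge(df2[['code', 'lamer_url_2', 'shopee_url_2']],
               df1[['lamer_url_1', 'shopee_url_1']],
               left_index=True, right_index=True)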
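
If the goal is to fill the existing columns of df2 in place rather than build a new df3, one alternative is a lookup with Series.map. This is only a sketch under the same assumption as above (the part of df2['code'] before the first '-' is the df1 code), plus the further assumption that df1['code'] has no duplicates. The sample frames are rebuilt from the tables in the question, and the names key and lookup are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "code": ["L61B18H089", "L61S19H014", "L61S19H015"],
    "lamer_url_1": ["b", "e", "z"],
    "shopee_url_1": ["a", "d", "y"],
})
df2 = pd.DataFrame({
    "code": ["L61B18H089-F1424", "L61S19H014-S1500", "L61B18H089-F1424"],
    "lamer_url_1": np.nan,
    "shopee_url_1": np.nan,
    "lamer_url_2": np.nan,
    "shopee_url_2": np.nan,
})

# Assumption: the text before the first '-' in df2['code'] is the df1 code.
key = df2["code"].str.split("-").str[0]

# Assumption: df1['code'] is unique, so it can act as a lookup index.
lookup = df1.set_index("code")

for col in ["lamer_url_1", "shopee_url_1"]:
    # map() keeps df2's row order; codes that are absent from df1 stay NaN.
    df2[col] = key.map(lookup[col])

print(df2)
```

Unlike the inner merge, this keeps every row of df2 in its original order and leaves lamer_url_2 and shopee_url_2 untouched. If the codes are not always separated by '-', the key would have to be derived differently, for example with a substring test per row.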