How do I check whether the data in one column equals any of the data in another column of a different DataFrame?
Here is some of the df_out1 data:
'LN' 'E' 'DATE'
SMITH H '4-2-2000'
HARPER A '2-9-2018'
JONES NaN '1-1-2000'
Here is some of the ef_file_in data:
'LN' 'E'
PERRY F
JONES H
SMITH B
Here is what I wrote:
df_out1.loc[df_out1['E'].isnull() & df_out1['LN'] == ef_file_in['LN'], 'LN'] = ef_file_in['E']
This is the result I am looking for in df_out1:
'LN' 'E' 'DATE'
SMITH H '4-2-2000'
HARPER A '2-9-2018'
JONES H '1-1-2000'
However, it looks like this only checks df_out1['LN'] == ef_file_in['LN'] on the same row, rather than checking whether df_out1['LN'] equals ef_file_in['LN'] at any position.
Answer 0 (score: 0)
Use the .isin() method to locate those rows:
df_out1.loc[df_out1['E'].isnull() & df_out1['LN'].isin(ef_file_in['LN'])]
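As a minimal sketch of the difference, using the sample data from the question: `==` compares the two columns position by position, while `.isin()` tests each value against the whole other column. The names `row_by_row`, `anywhere`, and `matches` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df_out1 = pd.DataFrame({'LN': ['SMITH', 'HARPER', 'JONES'],
                        'E': ['H', 'A', np.nan],
                        'DATE': ['4-2-2000', '2-9-2018', '1-1-2000']})
ef_file_in = pd.DataFrame({'LN': ['PERRY', 'JONES', 'SMITH'],
                           'E': ['F', 'H', 'B']})

# Element-wise comparison: row 0 against row 0, row 1 against row 1, ...
row_by_row = df_out1['LN'] == ef_file_in['LN']

# Membership test: does each LN appear anywhere in ef_file_in['LN']?
anywhere = df_out1['LN'].isin(ef_file_in['LN'])

# Rows whose E is missing and whose LN exists somewhere in ef_file_in
matches = df_out1.loc[df_out1['E'].isnull() & anywhere]

print(row_by_row.tolist())
print(anywhere.tolist())
print(matches)
```

With this data no name sits on the same row in both frames, so the element-wise comparison finds nothing, while the membership test flags SMITH and JONES.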
Edit: having seen your data, I think this gets you what you want:
import numpy as np
import pandas as pd
df_out1 = pd.DataFrame({'LN':['SMITH','JONES','HARPER'],
'E': ['H',np.nan,'F'],
'DATE':['4-2-2000', '2-9-2018', '1-1-2000']})
ef_file_in = pd.DataFrame({'LN':['PERRY','JONES','SMITH'],'E': ['F','H','B']})
# merge the two dataframes to get all in one
df_out1 = pd.merge(df_out1,ef_file_in,on='LN',how='left')
# LN E_x DATE E_y
# 0 SMITH H 4-2-2000 B
# 1 JONES NaN 2-9-2018 H
# 2 HARPER F 1-1-2000 NaN
# Now E has been renamed to E_x and there is a new column E_y from ef_file_in
# get a new column E = copy E_x except when null, then get E_y
df_out1['E'] = np.where(df_out1.E_x.isnull(),df_out1['E_y'],df_out1['E_x'])
# drop E_x and E_y, not useful now
df_out1.drop(columns=['E_x','E_y'],inplace=True)
# Notice that column E is no longer null for LN == 'JONES'
# LN DATE E
# 0 SMITH 4-2-2000 H
# 1 JONES 2-9-2018 H
# 2 HARPER 1-1-2000 F
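A slightly shorter variant of the same merge idea, offered as a sketch rather than as this answer's own method: giving the right-hand column an explicit suffix keeps `E` under its original name, and `fillna` then fills only the missing values. The suffix `_ef` and the name `merged` are illustrative choices, and the sketch assumes each `LN` appears at most once in `ef_file_in` (otherwise the left merge would duplicate rows).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df_out1 = pd.DataFrame({'LN': ['SMITH', 'HARPER', 'JONES'],
                        'E': ['H', 'A', np.nan],
                        'DATE': ['4-2-2000', '2-9-2018', '1-1-2000']})
ef_file_in = pd.DataFrame({'LN': ['PERRY', 'JONES', 'SMITH'],
                           'E': ['F', 'H', 'B']})

# Left merge keeps every row of df_out1; the right-hand E becomes E_ef
merged = df_out1.merge(ef_file_in, on='LN', how='left', suffixes=('', '_ef'))

# Fill only the missing E values from the looked-up column
merged['E'] = merged['E'].fillna(merged['E_ef'])

# The helper column is no longer needed
merged = merged.drop(columns='E_ef')
print(merged)
```

Existing values of `E` are left untouched, so SMITH keeps `H` even though `ef_file_in` lists `B` for that name.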
Answer 1 (score: 0)
Use isin:
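mask = df_out1['E'].isnull() & df_out1['LN'].isin(ef_file_in['LN'])
# Write into 'E' (not 'LN'), looking each value up by LN instead of by row position
df_out1.loc[mask, 'E'] = df_out1.loc[mask, 'LN'].map(ef_file_in.set_index('LN')['E'])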
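For reference, a runnable sketch of a masked assignment on the question's data. It writes into `E` and looks each value up by `LN`, because a plain Series on the right-hand side is aligned on the row index, not on the name. The names `mask` and `lookup` are illustrative, and the sketch assumes `LN` values are unique in `ef_file_in`.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df_out1 = pd.DataFrame({'LN': ['SMITH', 'HARPER', 'JONES'],
                        'E': ['H', 'A', np.nan],
                        'DATE': ['4-2-2000', '2-9-2018', '1-1-2000']})
ef_file_in = pd.DataFrame({'LN': ['PERRY', 'JONES', 'SMITH'],
                           'E': ['F', 'H', 'B']})

# Rows with a missing E whose LN exists somewhere in ef_file_in
mask = df_out1['E'].isnull() & df_out1['LN'].isin(ef_file_in['LN'])

# Lookup table: LN -> E, taken from ef_file_in
lookup = ef_file_in.set_index('LN')['E']

# Fill only the masked rows, matching on the LN value
df_out1.loc[mask, 'E'] = df_out1.loc[mask, 'LN'].map(lookup)
print(df_out1)
```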