I have two DataFrames:
df:
index some_variable identifier1 identifier2
1 x AB2 AB3
2 x BB2 BB3
3 x CB2 CB3
4 y DB2 DB3
5 y EB2 EB3
dfa:
index some_variable identifier1 identifier2 identifier3
1 x AB5 AB3 AB3
2 x BB5 BB2 AB2
3 x CB5 CB2 AB5
4 y DB5 DB3 AB3
5 y EB5 EB3 AB3
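If an element of df['identifier1'] is in dfa['identifier2'], then df['identifier2'] should be replaced by dfa['identifier3'] at that index, but only if some_variable equals 'x'. So the condition is:
(df['identifier1'].isin(dfa['identifier2'])) & (df['some_variable'] == 'x')
I would like to get: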
index some_variable identifier1 identifier2
1 x AB2 AB3
2 x BB2 AB2
3 x CB2 AB5
4 y DB2 DB3
5 y EB2 EB3
I can set up the condition, but I can't figure out how to get the output.
Answer 0 (score: 1)
I think this is what you are trying to do:
df1
# index some_variable identifier1 identifier2
# 0 1 x AB2 AB3
# 1 2 x BB2 BB3
# 2 3 x CB2 CB3
# 3 4 y DB2 DB3
# 4 5 y EB2 EB3
df2
# index some_variable identifier1 identifier2 identifier3
# 0 1 x AB5 AB3 AB3
# 1 2 x BB5 BB2 AB2
# 2 3 x CB5 CB2 AB5
# 3 4 y DB5 DB3 AB3
# 4 5 y EB5 EB3 AB3
idx = df1['identifier1'].isin(df2['identifier2']) & (df1['some_variable'] == 'x')
df1.loc[idx, 'identifier2'] = df2['identifier3']
df1
# index some_variable identifier1 identifier2
# 0 1 x AB2 AB3
# 1 2 x BB2 AB2
# 2 3 x CB2 AB5
# 3 4 y DB2 DB3
# 4 5 y EB2 EB3
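For anyone who wants to run the steps above, here is a minimal self-contained sketch. Building the frames from dict literals is an assumption about how the data is loaded, and the `index` column is left out because pandas supplies its own row index. Note that the assignment aligns `df2['identifier3']` to the selected rows by row label, not by matching values.

```python
import pandas as pd

# Frames rebuilt from the question's data (construction is an assumption).
df1 = pd.DataFrame({
    'some_variable': ['x', 'x', 'x', 'y', 'y'],
    'identifier1': ['AB2', 'BB2', 'CB2', 'DB2', 'EB2'],
    'identifier2': ['AB3', 'BB3', 'CB3', 'DB3', 'EB3'],
})
df2 = pd.DataFrame({
    'some_variable': ['x', 'x', 'x', 'y', 'y'],
    'identifier1': ['AB5', 'BB5', 'CB5', 'DB5', 'EB5'],
    'identifier2': ['AB3', 'BB2', 'CB2', 'DB3', 'EB3'],
    'identifier3': ['AB3', 'AB2', 'AB5', 'AB3', 'AB3'],
})

# Boolean mask: identifier1 occurs somewhere in df2['identifier2']
# AND some_variable is 'x'. Only rows 1 and 2 satisfy both.
idx = df1['identifier1'].isin(df2['identifier2']) & (df1['some_variable'] == 'x')

# .loc assigns in place; the right-hand Series is aligned on the row index,
# so row 1 receives df2['identifier3'][1] and row 2 receives df2['identifier3'][2].
df1.loc[idx, 'identifier2'] = df2['identifier3']

print(df1)
```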
Answer 1 (score: 0)
Something like the following (though there may be a simpler way):
import pandas as pd

d1 = {'some_variable':['x','x','x','y','y'], 'identifier1':['AB2','BB2','CB2','DB2','EB2'], 'identifier2':['AB3','BB3','CB3','DB3','EB3']}
df = pd.DataFrame(d1)
d2 = {'some_variable':['x','x','x','y','y'], 'identifier1':['AB5','BB5','CB5','DB5','EB5'], 'identifier2':['AB3','BB2','CB2','DB3','EB3'], 'identifier3':['AB3','AB2','AB5','AB3','AB3']}
dfa = pd.DataFrame(d2)
# Rows where identifier1 appears in dfa['identifier2'] and some_variable is 'x'
mask = df['identifier1'].isin(dfa['identifier2']) & (df['some_variable'] == 'x')
# Use .loc instead of chained indexing so the assignment modifies df itself
df.loc[mask, 'identifier2'] = dfa.loc[mask, 'identifier3']
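One caveat: assigning a Series this way matches rows by index label, which gives the desired output here because BB2 and CB2 sit in the same rows of both frames. If the rows of dfa could be in a different order, a value-based lookup may be safer. The sketch below is one possible variant, not something from the question: the helper name `replace_by_value` is hypothetical, and it assumes each dfa['identifier2'] value should map to a single identifier3 (the first occurrence wins if there are duplicates).

```python
import pandas as pd

def replace_by_value(df, dfa):
    """Return a copy of df with identifier2 replaced via a value lookup in dfa."""
    out = df.copy()
    # Lookup Series: dfa['identifier2'] value -> dfa['identifier3'] value.
    # Assumption: on duplicate identifier2 values, keep the first occurrence.
    lookup = dfa.drop_duplicates('identifier2').set_index('identifier2')['identifier3']
    # Same condition as in the question.
    mask = out['identifier1'].isin(lookup.index) & (out['some_variable'] == 'x')
    # Map each selected identifier1 to its identifier3 by value, not by row position.
    out.loc[mask, 'identifier2'] = out.loc[mask, 'identifier1'].map(lookup)
    return out

df = pd.DataFrame({
    'some_variable': ['x', 'x', 'x', 'y', 'y'],
    'identifier1': ['AB2', 'BB2', 'CB2', 'DB2', 'EB2'],
    'identifier2': ['AB3', 'BB3', 'CB3', 'DB3', 'EB3'],
})
dfa = pd.DataFrame({
    'some_variable': ['x', 'x', 'x', 'y', 'y'],
    'identifier1': ['AB5', 'BB5', 'CB5', 'DB5', 'EB5'],
    'identifier2': ['AB3', 'BB2', 'CB2', 'DB3', 'EB3'],
    'identifier3': ['AB3', 'AB2', 'AB5', 'AB3', 'AB3'],
})

result = replace_by_value(df, dfa)

# Reversing the row order of dfa (illustrative only) should not change the result.
dfa_reversed = dfa.iloc[::-1].reset_index(drop=True)
result_reversed = replace_by_value(df, dfa_reversed)

print(result)
```

Because the helper works on a copy, the original df is left untouched and the two calls can be compared directly.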