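I have two DataFrames, and one of them holds values that should replace values in the other. What is the best way to do the replacement?
For example, the entries with type: none in df1 should be replaced with the corresponding values from df2.
Here is what I have so far, but I am not happy with this approach: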
import pandas as pd

df1 = pd.DataFrame({"word": ['The', 'big', 'cat', 'house'], "type": ['article', 'none', 'noun', 'none'], "pos": [1, 2, 3, 4]})
df2 = pd.DataFrame({"word": ['big', 'house'], "type": ['adjective', 'noun'], "pos": [2, 4]})
df1.set_index('pos', inplace=True, drop=True)
df2.set_index('pos', inplace=True, drop=True)
# For each row of df1 whose type is 'none', look up the type stored at the same pos in df2
for i, row in df1.iterrows():
    if row['type'] == 'none':
        df1.loc[i, 'type'] = df2.loc[i, 'type']
The df1 DataFrame should end up as:
    word       type  pos
0    The    article    1
1    big  adjective    2
2    cat       noun    3
3  house       noun    4
Thanks :)
Answer 0 (score: 1)
How about:
df= df2.set_index('word').combine_first(df1.set_index('word'))
df.pos = df.pos.astype(int)
Output:
            type  pos
word
The      article    1
big    adjective    2
cat         noun    3
house       noun    4
Then reset the index to turn word back into a column:
df.reset_index()
Output:
    word       type  pos
0    The    article    1
1    big  adjective    2
2    cat       noun    3
3  house       noun    4
Or, using 'pos' as the key:
df = df2.set_index('pos').combine_first(df1.set_index('pos')).reset_index()
colidx=['word', 'type', 'pos']
df.reindex(columns=colidx)
Output:
    word       type  pos
0    The    article    1
1    big  adjective    2
2    cat       noun    3
3  house       noun    4
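For reference, here is a self-contained sketch of the 'pos' variant using the sample data from the question. The names combined and result are illustrative only, and the explicit column selection is an assumption made so that the example does not depend on the column order combine_first returns.

```python
import pandas as pd

df1 = pd.DataFrame({
    "word": ["The", "big", "cat", "house"],
    "type": ["article", "none", "noun", "none"],
    "pos": [1, 2, 3, 4],
})
df2 = pd.DataFrame({
    "word": ["big", "house"],
    "type": ["adjective", "noun"],
    "pos": [2, 4],
})

# Non-null values from df2 take priority; everything else is filled from df1.
combined = df2.set_index("pos").combine_first(df1.set_index("pos"))

# Bring 'pos' back as a column and pick the column order explicitly.
result = combined.reset_index()[["word", "type", "pos"]]
result["pos"] = result["pos"].astype(int)
print(result)
```

Note that combine_first prefers every non-null value in df2, not only those that line up with a 'none' in df1, and it also keeps rows that exist only in df2. That matches the sample data here, but it may not be what you want in general.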
Answer 1 (score: 1)
Without using the .apply() method.
# Assumes df1 and df2 are both indexed by 'pos', as in the question
condition = df1['type'] == 'none'
df1.loc[condition, 'type'] = df2.loc[condition, 'type']
df1.reset_index(inplace=True)
Output:
   pos   word       type
0    1    The    article
1    2    big  adjective
2    3    cat       noun
3    4  house       noun
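A minimal runnable sketch of the same boolean-mask idea, assuming both frames are indexed by 'pos'. The name needs_type is illustrative. This variant assigns df2['type'] directly and relies on pandas aligning the right-hand Series to the selected rows by index label.

```python
import pandas as pd

df1 = pd.DataFrame({
    "word": ["The", "big", "cat", "house"],
    "type": ["article", "none", "noun", "none"],
    "pos": [1, 2, 3, 4],
}).set_index("pos")
df2 = pd.DataFrame({
    "word": ["big", "house"],
    "type": ["adjective", "noun"],
    "pos": [2, 4],
}).set_index("pos")

# Rows of df1 that still need a real type
needs_type = df1["type"].eq("none")

# The assigned Series is matched to the selected rows by index label (pos),
# so only rows 2 and 4 are touched.
df1.loc[needs_type, "type"] = df2["type"]
print(df1.reset_index())
```

If a 'none' row in df1 has no matching pos in df2, the alignment leaves that cell as NaN, so it is worth checking for missing values afterwards.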
Answer 2 (score: 1)
If df2 always indicates the positions of the words in df1 that should be replaced, you can simply do:
df1.loc[df2.index, "type"] = df2["type"]
print(df1)
# Output:
      word       type
pos
1      The    article
2      big  adjective
3      cat       noun
4    house       noun
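The one-liner depends on every index label of df2 also being present in df1. Below is a hedged sketch that restricts the assignment to the labels both frames share. The extra 'dog' row at pos 9 is hypothetical and is added only to show the guard. The name shared is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "word": ["The", "big", "cat", "house"],
    "type": ["article", "none", "noun", "none"],
    "pos": [1, 2, 3, 4],
}).set_index("pos")

# Hypothetical df2 with one row (pos 9) that has no counterpart in df1
df2 = pd.DataFrame({
    "word": ["big", "house", "dog"],
    "type": ["adjective", "noun", "noun"],
    "pos": [2, 4, 9],
}).set_index("pos")

# Only touch labels that exist in both frames
shared = df2.index.intersection(df1.index)
df1.loc[shared, "type"] = df2.loc[shared, "type"]
print(df1)
```

With the guard in place, the result does not depend on how .loc treats labels that are missing from df1, and df1 keeps its original four rows.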