I want to replace certain values in a DataFrame that meet a condition. I tried writing the code below, but it doesn't seem to work:
dfa = df.copy()
for value in df['Clean Company Name']:
    if value == "NaN":
        dfa['Clean Company Name'].replace(df['Company Name'])
dfa.head()
As you can see, the NaN values were not replaced with the values from 'Company Name'.
How can I achieve this result?
Answer 0 (score: 1)
To replace NaN values, use combine_first or fillna:
df['Clean Company Name'].combine_first(df['Company Name'])
Or:
df['Clean Company Name'].fillna(df['Company Name'])
Sample:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Company Name': ['s', 'd', 'f'], 'Clean Company Name': [np.nan, 'r', 't']})
print(df)
Clean Company Name Company Name
0 NaN s
1 r d
2 t f
# check which values are NaN
print(df['Clean Company Name'].isnull())
0 True
1 False
2 False
Name: Clean Company Name, dtype: bool
df['Clean Company Name'] = df['Clean Company Name'].combine_first(df['Company Name'])
print (df)
Clean Company Name Company Name
0 s s
1 r d
2 t f
See the pandas documentation on missing data for more information.
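As for why the loop in the question has no effect, here is a minimal sketch that reuses the sample frame above. The variable names (string_matches, filled, and so on) are only illustrative. It shows two likely causes: a missing value is the float NaN rather than the string "NaN", and methods such as replace or fillna return a new Series instead of modifying the column unless the result is assigned back.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Company Name': ['s', 'd', 'f'],
                   'Clean Company Name': [np.nan, 'r', 't']})

# A missing value is the float NaN, not the string "NaN",
# so comparing against the string never matches.
string_matches = (df['Clean Company Name'] == "NaN")

# NaN is not equal to anything, including itself,
# so "== np.nan" is not a usable test either.
nan_equals_itself = (np.nan == np.nan)

# isna() (alias: isnull()) is the reliable way to detect missing values.
is_missing = df['Clean Company Name'].isna()

# fillna returns a new Series; the original column stays unchanged
# until the result is assigned back to it.
filled = df['Clean Company Name'].fillna(df['Company Name'])
still_missing = df['Clean Company Name'].isna().sum()

print(string_matches.tolist())
print(nan_equals_itself)
print(is_missing.tolist())
print(filled.tolist())
print(still_missing)
```

Assigning the result back, as in df['Clean Company Name'] = filled, is what makes the change stick.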
Edit:
To replace values by condition, use loc with a boolean mask:
print (df['Company Name'] == 'd')
0 False
1 True
2 False
Name: Company Name, dtype: bool
# applied to the original sample df (before combine_first), so row 0 is still NaN
df.loc[df['Company Name'] == 'd', 'Clean Company Name'] = 'sss'
print (df)
Clean Company Name Company Name
0 NaN s
1 sss d
2 t f
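The same loc pattern can also handle the original question, if the condition is "the value is missing". This is a sketch under that assumption, with mask as an illustrative name; the right-hand side is selected with the same mask so that only the matching rows are copied over.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Company Name': ['s', 'd', 'f'],
                   'Clean Company Name': [np.nan, 'r', 't']})

# Boolean mask: True where 'Clean Company Name' is missing.
mask = df['Clean Company Name'].isna()

# Copy 'Company Name' into 'Clean Company Name' for the masked rows only.
df.loc[mask, 'Clean Company Name'] = df.loc[mask, 'Company Name']

print(df)
```

For this particular case fillna or combine_first is shorter; the mask form is mainly useful when the condition is more involved than a plain missing-value check.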