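My DataFrame has two columns, Sector and Sector Name. I want to set 'Sector' = 'Sector Name' wherever 'Sector' is blank.
I have the script below, but it raises the error "ValueError: cannot reindex from a duplicate axis":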
dataFinal.loc[dataFinal['Sector'] == '', 'Sector'] = \
dataFinal.loc[dataFinal['Sector Name'] != '', 'Sector Name']
Please help.
Answer 0 (score: 2)
I think the best approach is to create a unique index first, and then use loc, mask, or where with the inverted condition:
dataFinal = dataFinal.reset_index(drop=True)
# John Galt's solution from the comments
dataFinal.loc[dataFinal['Sector'] == '', 'Sector'] = dataFinal['Sector Name']
Or, with mask:
m = dataFinal['Sector'] == ''
dataFinal['Sector'] = dataFinal['Sector'].mask(m, dataFinal['Sector Name'])
Or, with where and the inverted condition:
m = dataFinal['Sector'] != ''
dataFinal['Sector'] = dataFinal['Sector'].where(m, dataFinal['Sector Name'])
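As a quick check that the mask and where forms above are interchangeable, here is a minimal sketch on a made-up three-row frame (the data and the names `empty`, `with_mask`, and `with_where` are illustrative, not from the question). `mask` replaces values where the condition is True, while `where` keeps values where the condition is True, so negating the condition should give the same result.

```python
import pandas as pd

# Illustrative data only; the asker's real dataFinal is not shown.
dataFinal = pd.DataFrame({'Sector': ['a', 'ss', ''],
                          'Sector Name': ['r', 't', 'y']})

empty = dataFinal['Sector'] == ''

# mask: replace values where the condition is True
with_mask = dataFinal['Sector'].mask(empty, dataFinal['Sector Name'])
# where: keep values where the condition is True, so negate it
with_where = dataFinal['Sector'].where(~empty, dataFinal['Sector Name'])

print(with_mask.equals(with_where))
```

Which one to use is mostly a matter of which condition reads more naturally.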
Sample:
import pandas as pd
dataFinal = pd.DataFrame({'Sector': ['a', 'ss', ''],
                          'Sector Name': ['r', 't', 'y']}, index=[4, 4, 1])
print(dataFinal)
  Sector Sector Name
4      a           r
4     ss           t
1                  y
dataFinal = dataFinal.reset_index(drop=True)
m = dataFinal['Sector'] == ''
dataFinal['Sector'] = dataFinal['Sector'].mask(m, dataFinal['Sector Name'])
print(dataFinal)
  Sector Sector Name
0      a           r
1     ss           t
2      y           y
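To see why the unique index matters, the sketch below reproduces the underlying failure in isolation. It assumes, as the error message suggests, that assignment through loc aligns the right-hand Series to the selected labels by reindexing it; the variable names `raised`, `fixed`, and `aligned` are illustrative. Reindexing a Series whose index contains duplicate labels raises a ValueError (the exact wording differs between pandas versions), while the same operation works once the index has been reset.

```python
import pandas as pd

# Same shape as the sample above: label 4 appears twice in the index.
dataFinal = pd.DataFrame({'Sector': ['a', 'ss', ''],
                          'Sector Name': ['r', 't', 'y']}, index=[4, 4, 1])

print(dataFinal.index.is_unique)  # False

# Aligning a Series with duplicate labels to a different set of labels fails.
try:
    dataFinal['Sector Name'].reindex([1])
    raised = False
except ValueError as err:
    raised = True
    print('ValueError:', err)

# After resetting to a unique index, the same kind of alignment works.
fixed = dataFinal.reset_index(drop=True)
aligned = fixed['Sector Name'].reindex([2])
print(aligned.tolist())
```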
Answer 1 (score: 1)
You can use np.where:
import numpy as np
dataFinal['Sector'] = \
    np.where(dataFinal['Sector'] == '', dataFinal['Sector Name'], dataFinal['Sector'])
Thanks to jezrael for the data:
dataFinal
  Sector Sector Name
4      a           r
4     ss           t
1                  y
dataFinal['Sector'] = \
np.where(dataFinal['Sector'] == '', dataFinal['Sector Name'], dataFinal['Sector'])
dataFinal
  Sector Sector Name
4      a           r
4     ss           t
1      y           y
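One likely reason this works even though label 4 is duplicated: np.where operates on the underlying values position by position and returns a plain NumPy array, so no index alignment takes place when the result is assigned back. A self-contained sketch with the same sample data (the name `filled` is illustrative):

```python
import numpy as np
import pandas as pd

dataFinal = pd.DataFrame({'Sector': ['a', 'ss', ''],
                          'Sector Name': ['r', 't', 'y']}, index=[4, 4, 1])

# Element-wise choice: 'Sector Name' where Sector is empty, else Sector.
filled = np.where(dataFinal['Sector'] == '',
                  dataFinal['Sector Name'],
                  dataFinal['Sector'])

# The result is an ndarray, assigned back by position; the index is untouched.
dataFinal['Sector'] = filled
print(dataFinal)
```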
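Answer 2 (score: 1)
You can use a mask to find all rows in the DataFrame where Sector contains only whitespace, and then use this mask to apply the corresponding Sector Name: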
# Note: str.isspace() is False for an empty string, so this matches ' ' but not ''
mask = dataFinal['Sector'].str.isspace()
dataFinal.loc[mask, 'Sector'] = dataFinal.loc[mask, 'Sector Name']
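One caveat worth checking against your data: Python's str.isspace() is False for an empty string, so an isspace mask matches ' ' but not ''. If "blank" can mean either, one possible alternative is to strip whitespace and compare with the empty string. The data below is made up for illustration, and the sketch assumes the column holds only strings (missing values would need separate handling, for example with isna()).

```python
import pandas as pd

# Illustrative data: one whitespace-only value and one empty string.
dataFinal = pd.DataFrame({'Sector': ['a', '  ', ''],
                          'Sector Name': ['r', 't', 'y']})

only_spaces = dataFinal['Sector'].str.isspace()
blank = dataFinal['Sector'].str.strip() == ''

print(only_spaces.tolist())  # [False, True, False]
print(blank.tolist())        # [False, True, True]

# The same mask is used on both sides, so the selected labels match exactly.
dataFinal.loc[blank, 'Sector'] = dataFinal.loc[blank, 'Sector Name']
print(dataFinal['Sector'].tolist())  # ['a', 't', 'y']
```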