Imagine we have the following DataFrame:
Name    ID  Phone   Email
Paul    10  000001  paul@mail.com
Sarah   20  None    sara@mail.com
John    30  000003  None
Will    40  None    None
Evelyn  50  000005  evelyn@mail.com
And the following lists:
['Sarah', '20', '000002', 'sara@mail.com']
['John', '30', '000003', 'john@mail.com']
['Will', '40', '000004', 'will@mail.com']
Is there a Pythonic pandas way to fill the None values in the DataFrame from these lists, without looping over the individual fields?
The result should be:
Name    ID  Phone   Email
Paul    10  000001  paul@mail.com
Sarah   20  000002  sara@mail.com
John    30  000003  john@mail.com
Will    40  000004  will@mail.com
Evelyn  50  000005  evelyn@mail.com
Thanks in advance!
Answer 0 (score: 2)
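You can create a DataFrame from the lists, set Name as the index in both DataFrames, and use DataFrame.combine_first. Because combine_first returns the rows sorted by index, first convert the original index to a column with reset_index, then sort by that column at the end to restore the original row order: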
import pandas as pd

# Rows that carry the missing values, keyed by Name
L = [['Sarah', '20', '000002', 'sara@mail.com'],
     ['John', '30', '000003', 'john@mail.com'],
     ['Will', '40', '000004', 'will@mail.com']]
df1 = pd.DataFrame(L, columns=['Name', 'ID', 'Phone', 'Email']).set_index('Name')
print(df1)
       ID   Phone          Email
Name
Sarah  20  000002  sara@mail.com
John   30  000003  john@mail.com
Will   40  000004  will@mail.com
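# Keep the original row position in an 'index' column, fill the gaps from df1,
# then sort by that column to undo the index sorting done by combine_first
df = (df.reset_index()
        .set_index('Name')
        .combine_first(df1)
        .reset_index()
        .sort_values('index', ignore_index=True)
        .reindex(df.columns, axis=1))
print(df)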
     Name  ID   Phone            Email
0    Paul  10  000001    paul@mail.com
1   Sarah  20  000002    sara@mail.com
2    John  30  000003    john@mail.com
3    Will  40  000004    will@mail.com
4  Evelyn  50  000005  evelyn@mail.com
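The snippets above assume df already exists. As a rough end-to-end sketch, the version below rebuilds the question's frame first and then applies the same combine_first chain. The way df is constructed here (string IDs, None for the empty cells) and the name filled are assumptions for illustration, not something the question specifies.

```python
import pandas as pd

# Assumed reconstruction of the question's frame; empty cells are None
df = pd.DataFrame({
    'Name': ['Paul', 'Sarah', 'John', 'Will', 'Evelyn'],
    'ID': ['10', '20', '30', '40', '50'],
    'Phone': ['000001', None, '000003', None, '000005'],
    'Email': ['paul@mail.com', 'sara@mail.com', None, None, 'evelyn@mail.com'],
})

L = [['Sarah', '20', '000002', 'sara@mail.com'],
     ['John', '30', '000003', 'john@mail.com'],
     ['Will', '40', '000004', 'will@mail.com']]
df1 = pd.DataFrame(L, columns=df.columns).set_index('Name')

# 'filled' is a hypothetical name; the original answer reassigns df instead
filled = (df.reset_index()                      # keep original position as 'index'
            .set_index('Name')                  # align both frames on Name
            .combine_first(df1)                 # only missing cells are taken from df1
            .reset_index()
            .sort_values('index', ignore_index=True)  # restore original row order
            .reindex(df.columns, axis=1))       # restore original column order
print(filled)
```

Values that are already present in df are kept as they are; only the missing cells are taken from df1.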
Another idea is to use DataFrame.update, but by default it overwrites every matching value, not only the NaNs:
# update works in place and aligns on the index, so index df by Name first
df = df.set_index('Name')
df.update(df1)
df = df.reset_index()
print(df)
     Name  ID   Phone            Email
0    Paul  10  000001    paul@mail.com
1   Sarah  20  000002    sara@mail.com
2    John  30  000003    john@mail.com
3    Will  40  000004    will@mail.com
4  Evelyn  50  000005  evelyn@mail.com
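If overwriting existing values is a concern, DataFrame.update also accepts overwrite=False, which only fills cells that are missing in the original frame. The sketch below is a minimal illustration under assumptions: the frame is rebuilt as in the question, and Sarah's e-mail in the list is deliberately changed to a made-up address (changed@mail.com) to show that the existing value survives.

```python
import pandas as pd

# Assumed reconstruction of the question's frame, indexed by Name
df = pd.DataFrame({
    'Name': ['Paul', 'Sarah', 'John', 'Will', 'Evelyn'],
    'ID': ['10', '20', '30', '40', '50'],
    'Phone': ['000001', None, '000003', None, '000005'],
    'Email': ['paul@mail.com', 'sara@mail.com', None, None, 'evelyn@mail.com'],
}).set_index('Name')

# Hypothetical data: Sarah's e-mail differs from the one already stored in df
L = [['Sarah', '20', '000002', 'changed@mail.com'],
     ['John', '30', '000003', 'john@mail.com'],
     ['Will', '40', '000004', 'will@mail.com']]
df1 = pd.DataFrame(L, columns=['Name', 'ID', 'Phone', 'Email']).set_index('Name')

# overwrite=False: only cells that are missing in df are taken from df1
df.update(df1, overwrite=False)
out = df.reset_index()
print(out)
```

Because update modifies the frame in place and keeps its index, the original row order is preserved and no extra sorting step is needed.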