Fill NaN values from another DataFrame

Time: 2020-04-01 23:10:25

Tags: python pandas dataframe merge

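I have two DataFrames, df1 and df2. Using the unique identifier (id), I want to fill the null values in df2 with the corresponding entries from df1. Here is the code: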

import pandas as pd
import numpy as np
df1 = pd.DataFrame({"id": [3,4,5,6,7,8,9],
                     "col1": ['mike', 'matt', 'mertha', 'peter', 'tabby', 'carl', 'brian'],
                     "col2": ['645-345', '645-333', '324-543', '123-432', '563-654', '324-123', '902-342'],
                     "col3": ['cat', 'cat','dog', 'none', 't-rex', 'goat', 'snake']})
df2 = pd.DataFrame({"id": [6, 6, 7, 7, 7, 8, 8, 9],
                    "col1": ['peter', 'peter', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
                    "col2": ['324-123','324-123', '902-342', '902-332', '902-123', '556-786', '113-786', '901-345'],
                    "col3": ['none', 'none', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})

I have tried everything I could find on this site and still cannot find an answer. Any help would be greatly appreciated!

Edit: expected output

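I only want to fill the np.nan values in col1 and col3. The string 'none' is just another option (an ordinary value, not a missing one). My expected output is as follows: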

df_merged = pd.DataFrame({"id": [6, 6, 7, 7, 7, 8, 8, 9],
                    "col1": ['peter', 'peter', 'tabby','tabby', 'tabby',  'carl','carl','brian'],
                    "col2": ['324-123','324-123', '902-342', '902-332', '902-123', '556-786', '113-786', '901-345'],
                    "col3": ['none', 'none', 't-rex', 't-rex', 't-rex', 'goat', 'goat', 'snake']})

1 answer:

Answer 0 (score: 1)

If id is the index in both DataFrames, Erfan's comment should work. Otherwise:

(df2.set_index('id')
    .fillna(df1.set_index('id'))
    .reset_index()
)

Output:

   id   col1     col2   col3
0   6  peter  324-123   none
1   6  peter  324-123   none
2   7  tabby  902-342  t-rex
3   7  tabby  902-332  t-rex
4   7  tabby  902-123  t-rex
5   8   carl  556-786   goat
6   8   carl  113-786   goat
7   9  brian  901-345  snake
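
A possible alternative, in case you would rather leave id as a regular column: look up each row's id in df1 with Series.map and fill only the NaN cells in col1 and col3. This is just a sketch that assumes the ids in df1 are unique; the names lookup and filled are mine, not from the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"id": [3, 4, 5, 6, 7, 8, 9],
                    "col1": ['mike', 'matt', 'mertha', 'peter', 'tabby', 'carl', 'brian'],
                    "col2": ['645-345', '645-333', '324-543', '123-432', '563-654', '324-123', '902-342'],
                    "col3": ['cat', 'cat', 'dog', 'none', 't-rex', 'goat', 'snake']})
df2 = pd.DataFrame({"id": [6, 6, 7, 7, 7, 8, 8, 9],
                    "col1": ['peter', 'peter', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
                    "col2": ['324-123', '324-123', '902-342', '902-332', '902-123', '556-786', '113-786', '901-345'],
                    "col3": ['none', 'none', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})

# Lookup table keyed by id (assumes every id appears at most once in df1).
lookup = df1.set_index("id")

# Work on a copy so df2 itself is left unchanged.
filled = df2.copy()
for col in ["col1", "col3"]:
    # Map each row's id to df1's value, then use it only where df2 has NaN.
    filled[col] = filled[col].fillna(filled["id"].map(lookup[col]))

print(filled)
```

Since only col1 and col3 are touched, col2 in df2 stays exactly as it was, and existing values such as the string 'none' are not overwritten.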