Question

Df1:

Df2:

Id    val
1     5
7     2

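Required:

I have df1 and df2 and want to get the Required df, in which Ids present in both Df1 and Df2 are updated and new Ids are appended.

I can't figure out whether I need update, merge, join, or something else.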

Answer 1

Use concat together with drop_duplicates (note that the row order may not be preserved).

pd.concat([df1, df2]).drop_duplicates('Id', keep='last')

   Id  val
1   3    7
2   9    2
3   4    5
0   1    5
1   7    2
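A minimal, self-contained sketch of the same idea, in case a sorted result with a clean index is wanted. The question's df1 is not shown, so the frame below is an assumption reconstructed from the output above; in particular the val for Id 1 in df1 (here 2) is a made-up placeholder.

```python
import pandas as pd

# Assumed sample data: df1 is not shown in the question, so the val for
# Id 1 in df1 (here 2) is a placeholder chosen for illustration only.
df1 = pd.DataFrame({"Id": [1, 3, 9, 4], "val": [2, 7, 2, 5]})
df2 = pd.DataFrame({"Id": [1, 7], "val": [5, 2]})

# Rows from df2 come last, so keep="last" lets df2 win on shared Ids.
combined = pd.concat([df1, df2]).drop_duplicates(subset="Id", keep="last")

# Optional: sort by Id and rebuild a 0..n-1 index.
tidy = combined.sort_values("Id").reset_index(drop=True)
print(tidy)
```

The original index labels survive the concat (which is why 1 appears twice above); reset_index(drop=True) discards them.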

Answer 2

Use combine_first

df2.set_index('Id').combine_first(df1.set_index('Id')).reset_index()
Out[6]: 
   Id  val
0   1  5.0
1   3  7.0
2   4  5.0
3   7  2.0
4   9  2.0
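The val column comes back as float in the output above. A small sketch of casting it back to integers follows, assuming no NaN remains after combining; the sample df1 is an assumption (the question's df1 is not shown, and the val of 2 for Id 1 is a placeholder).

```python
import pandas as pd

# Assumed sample data; the val for Id 1 in df1 is a placeholder.
df1 = pd.DataFrame({"Id": [1, 3, 9, 4], "val": [2, 7, 2, 5]})
df2 = pd.DataFrame({"Id": [1, 7], "val": [5, 2]})

# combine_first keeps df2's values and fills Ids missing from df2 from df1.
merged = df2.set_index("Id").combine_first(df1.set_index("Id"))

# Index alignment can upcast val to float; cast back once no NaN remains.
result = merged.astype({"val": int}).reset_index()
print(result)
```

Whether the intermediate result is float may depend on the pandas version, so the cast is written to be harmless either way.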

Answer 3

`dictionary` unpacking

m1 = dict(zip(df1.Id, df1.val))
m2 = dict(zip(df2.Id, df2.val))

pd.DataFrame([*{**m1, **m2}.items()], columns=['Id', 'val'])

   Id  val
0   1    5
1   3    7
2   4    5
3   7    2
4   9    2
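To make the precedence rule explicit: in {**m1, **m2} the later mapping wins on shared keys, and on Python 3.7+ keys keep their insertion order, so a sorted frame needs an explicit sort. A sketch under assumed data (m1 stands in for dict(zip(df1.Id, df1.val)); its value for Id 1 is a placeholder since df1 is not shown):

```python
import pandas as pd

# Assumed stand-ins for dict(zip(df1.Id, df1.val)) and dict(zip(df2.Id, df2.val)).
m1 = {1: 2, 3: 7, 9: 2, 4: 5}  # the value for key 1 is a placeholder
m2 = {1: 5, 7: 2}

# Later unpacking wins: key 1 takes its value from m2; key 7 is appended.
merged = {**m1, **m2}

out = pd.DataFrame(list(merged.items()), columns=["Id", "val"])

# Insertion order is 1, 3, 9, 4, 7; sort explicitly if Id order is wanted.
out_sorted = out.sort_values("Id").reset_index(drop=True)
print(out_sorted)
```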

Alternative form

cols = ['Id', 'val']
m1 = dict(zip(*map(df1.get, cols)))
m2 = dict(zip(*map(df2.get, cols)))

pd.DataFrame([*{**m1, **m2}.items()], columns=cols)

`get`

m1 = dict(zip(df1.Id, df1.val))
m2 = dict(zip(df2.Id, df2.val))
f = lambda x: m2.get(x, m1.get(x, x))

# sorted() makes the row order deterministic; set iteration order is not guaranteed
pd.DataFrame([[x, f(x)] for x in sorted({*df1.Id, *df2.Id})], columns=['Id', 'val'])

   Id  val
0   1    5
1   3    7
2   4    5
3   7    2
4   9    2

Answer 4

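You can first update with the indexes aligned, then concat the rows that are new. This solution is verbose, but it keeps the row order of your desired result.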

df1 = df1.set_index('Id')
df2 = df2.set_index('Id')

df1.update(df2)

df = pd.concat([df1, df2[~df2.index.isin(df1.index)]])\
       .reset_index().astype(int)

print(df)

   Id  val
0   1    5
1   3    7
2   9    2
3   4    5
4   7    2
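The steps above overwrite df1 and df2 with their indexed versions. A sketch of the same update-then-append approach wrapped in a function that leaves its inputs untouched follows; the name upsert and its key parameter are hypothetical, and the sample df1 is an assumption (the val for Id 1 is a placeholder, since the question's df1 is not shown).

```python
import pandas as pd

def upsert(base, new, key="Id"):
    """Hypothetical helper: update rows of base from new, then append new keys."""
    left = base.set_index(key)   # set_index returns a new frame, base is untouched
    right = new.set_index(key)
    left.update(right)           # overwrite values where the index labels match
    extra = right[~right.index.isin(left.index)]  # rows whose key is new
    out = pd.concat([left, extra]).reset_index()
    # update/concat may upcast to float; restore the dtypes of base.
    return out.astype(base.dtypes.to_dict())

# Assumed sample data; the val for Id 1 in df1 is a placeholder.
df1 = pd.DataFrame({"Id": [1, 3, 9, 4], "val": [2, 7, 2, 5]})
df2 = pd.DataFrame({"Id": [1, 7], "val": [5, 2]})

result = upsert(df1, df2)
print(result)
```

Casting back with base's dtypes assumes the updated columns contain no NaN afterwards.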
