Question

I have two tables like this:

Table A:

id id2 value
1   1   a
1   2   b
2   1   c
3   1   d

Table B:

id value2
1    e
2    g
3    h

I need to join them so that I get something like this:

Table needed:

id   id2  value value2
1     1     a     e
1     2     b     e
2     1     c     g
3     1     d     h

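A solution in Excel, Python, or R would work. What I need is this: if the id in Table A matches an id in Table B, the value from Table B is added to the matching row. However, the two tables differ in size, and sometimes an id in Table B does not exist in Table A. I only need the ids that are in Table A.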

Answer 1

The fastest way is to use map:

df1['value2'] = df1['id'].map(df2.set_index('id')['value2'])
print (df1)
   id  id2 value value2
0   1    1     a      e
1   1    2     b      e
2   2    1     c      g
3   3    1     d      h
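
For anyone who wants to run this from scratch, here is a minimal self-contained sketch of the same map approach. The frames are rebuilt from the question; the extra row with id 4 in df2 is my own assumption, added only to show what happens to an id that exists in Table B but not in Table A.

```python
import pandas as pd

# Table A from the question
df1 = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "id2": [1, 2, 1, 1],
    "value": ["a", "b", "c", "d"],
})

# Table B from the question, plus a hypothetical id 4 that is not in Table A
df2 = pd.DataFrame({"id": [1, 2, 3, 4], "value2": ["e", "g", "h", "x"]})

# Build an id -> value2 lookup Series, then look up each id of df1 in it
lookup = df2.set_index("id")["value2"]
df1["value2"] = df1["id"].map(lookup)

print(df1)
```

Because map only looks up the ids present in df1, the id 4 row from df2 never reaches the result. An id in df1 with no match in df2 would get NaN in value2.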

Edit: if df2 has duplicated values in id, map raises an error:

print (df2)
   id value2
0   1      e
1   1      p
2   2      g
3   3      h

df1['value2'] = df1['id'].map(df2.set_index('id')['value2'])
print (df1)

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
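
Before mapping, it may help to check whether the id column is unique at all. This is only a sketch of one way to do that check, using the duplicated df2 from this edit; the variable names ids_unique and dupes are mine.

```python
import pandas as pd

# df2 with a duplicated id (1 appears twice), as in the edit above
df2 = pd.DataFrame({"id": [1, 1, 2, 3], "value2": ["e", "p", "g", "h"]})

# True only when every id appears exactly once
ids_unique = df2["id"].is_unique

# keep=False marks every row whose id occurs more than once
dupes = df2[df2["id"].duplicated(keep=False)]

print(ids_unique)
print(dupes)
```

If ids_unique is False, the rows in dupes are the ones that have to be resolved before set_index('id') can serve as a lookup for map.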

And the solution is to remove the duplicates:

print (df2)
   id value2
0   1      e
1   1      p
2   2      g
3   3      h

df2 = df2.drop_duplicates(subset='id')
print (df2)
   id value2
0   1      e
2   2      g
3   3      h

df1['value2'] = df1['id'].map(df2.set_index('id')['value2'])
print (df1)
   id  id2 value value2
0   1    1     a      e
1   1    2     b      e
2   2    1     c      g
3   3    1     d      h
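
Note that drop_duplicates keeps the first row per id by default, so the value p is silently discarded. Whether that is acceptable depends on your data. The sketch below shows two alternatives, under the assumption that you might want either the last value or all values per id; the names first, last and joined are only for illustration.

```python
import pandas as pd

df2 = pd.DataFrame({"id": [1, 1, 2, 3], "value2": ["e", "p", "g", "h"]})

# Default keep='first' keeps "e" for id 1; keep='last' keeps "p" instead
first = df2.drop_duplicates(subset="id")
last = df2.drop_duplicates(subset="id", keep="last")

# Alternative: keep all values by joining them into one string per id
joined = df2.groupby("id")["value2"].agg(", ".join)

print(first)
print(last)
print(joined)
```

The joined Series is already indexed by id, so it can be passed to map directly without set_index.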

Solution with merge and a left join, which repeats the rows of df1 for duplicated ids in df2:

df = pd.merge(df1, df2, on='id', how='left')
print (df)
   id  id2 value value2
0   1    1     a      e
1   1    1     a      p
2   1    2     b      e
3   1    2     b      p
4   2    1     c      g
5   3    1     d      h
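
If the repeated rows are not what you want, merge can also be asked to fail when df2 has duplicated ids, via its validate parameter. A small sketch, assuming the frames from above; df2_dup, df2_unique and duplicates_detected are names I made up for the example.

```python
import pandas as pd

df1 = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "id2": [1, 2, 1, 1],
    "value": ["a", "b", "c", "d"],
})
df2_dup = pd.DataFrame({"id": [1, 1, 2, 3], "value2": ["e", "p", "g", "h"]})

# validate='many_to_one' requires the ids in the right frame to be unique
try:
    pd.merge(df1, df2_dup, on="id", how="left", validate="many_to_one")
    duplicates_detected = False
except pd.errors.MergeError:
    duplicates_detected = True

# After dropping duplicates, the left join keeps exactly one row per row of df1
df2_unique = df2_dup.drop_duplicates(subset="id")
result = pd.merge(df1, df2_unique, on="id", how="left", validate="many_to_one")

print(duplicates_detected)
print(result)
```

With unique ids in the right frame, the left join gives the same table as the map solution and keeps only the ids present in df1.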
