I have 2 dataframes:
df1 (sample, has more columns):
+---+----------------+--------------+-----------+
| | Region | Placement ID | Units |
+---+----------------+--------------+-----------+
| 0 | Western Europe | 1.10872E+13 | 367628.76 |
| 1 | Western Europe | 1.10872E+13 | 367628.76 |
| 2 | Western Europe | 1.10872E+13 | 74604.63 |
+---+----------------+--------------+-----------+
df2 (sample, has more columns):
+-----------+----------------+--------------+
| Creatives | Publisher Name | Placement ID |
+-----------+----------------+--------------+
| Temenos | Quantcast | 1.10872E+13 |
| Temenos | Quantcast | 1.10872E+13 |
| Temenos | Quantcast | 1.10872E+13 |
+-----------+----------------+--------------+
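What I want to do is add an extra column to dataframe 2 that holds the index of the matching row in dataframe 1, matched on Placement ID.
Some Placement ID fields in dataframe 1 or 2 may be empty or contain error values. If there is no match, or an error is found, I would like to add a Missing or Error value such as N/A or Missing, or leave the cell blank.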
Answer 0 (score: 1)
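IIUC you need merge, but the duplicates are a problem, so first remove them with drop_duplicates, then select the column to add (Units) together with the column to join on (Placement ID):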
print (pd.merge(df2,
                df1.drop_duplicates('Placement ID')[['Units', 'Placement ID']],
                how='left',
                on='Placement ID'))
Creatives Publisher Name Placement ID Units
0 Temenos Quantcast 1.108720e+13 367628.76
1 Temenos Quantcast 1.108720e+13 367628.76
2 Temenos Quantcast 1.108720e+13 367628.76
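If you need to add the index, call reset_index first. The new column is named level_0 when df1 already has a column called index; with a plain default index it is named index instead: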
print (pd.merge(df2,
                df1.drop_duplicates('Placement ID')
                   .reset_index()[['level_0', 'Placement ID']],
                how='left',
                on='Placement ID'))
Creatives Publisher Name Placement ID level_0
0 Temenos Quantcast 1.108720e+13 0
1 Temenos Quantcast 1.108720e+13 0
2 Temenos Quantcast 1.108720e+13 0
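The question also asks for a Missing or N/A marker where nothing matches. A left merge leaves NaN in those rows, so one possible way is to replace it afterwards. This is only a minimal sketch: the sample values, the extra non-matching rows, the column name df1_index and the 'N/A' label are all made up for illustration. Rows of df1 with an empty Placement ID are dropped first, because pandas matches null keys against each other in a merge.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one matching ID, one unknown ID, one empty ID.
df1 = pd.DataFrame({
    'Region': ['Western Europe'] * 3 + ['Asia'],
    'Placement ID': [1.10872e13, 1.10872e13, 1.10872e13, np.nan],
    'Units': [367628.76, 367628.76, 74604.63, 10.0],
})
df2 = pd.DataFrame({
    'Creatives': ['Temenos'] * 3,
    'Publisher Name': ['Quantcast'] * 3,
    'Placement ID': [1.10872e13, 2.5e13, np.nan],
})

# One row per Placement ID, keeping the first df1 index label for each ID.
# rename_axis gives the index a name so reset_index produces 'df1_index'.
lookup = (df1.dropna(subset=['Placement ID'])
             .drop_duplicates('Placement ID')
             .rename_axis('df1_index')
             .reset_index()[['df1_index', 'Placement ID']])

# indicator=True adds a '_merge' column: 'both' or 'left_only' per row.
result = pd.merge(df2, lookup, how='left', on='Placement ID', indicator=True)

# Unmatched rows hold NaN after the left join; swap it for a text label.
matched = result['df1_index'].notna()
result['df1_index'] = (result['df1_index'].astype('Int64')
                                          .astype(object)
                                          .where(matched, 'N/A'))
print(result)
```

The _merge column is optional, but it makes it easy to filter the rows that had no match without relying on the label text.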
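The duplicates have to be removed because merge returns one row for every matching pair of rows on the join key: df2 contains the value 1.108720e+13 three times and df1 has 3 rows with it, so you get 3 x 3 = 9 rows, like this: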
print (pd.merge(df2,
                df1.reset_index()[['level_0', 'Placement ID']],
                how='left',
                on='Placement ID'))
Creatives Publisher Name Placement ID level_0
0 Temenos Quantcast 1.108720e+13 0
1 Temenos Quantcast 1.108720e+13 1
2 Temenos Quantcast 1.108720e+13 2
3 Temenos Quantcast 1.108720e+13 0
4 Temenos Quantcast 1.108720e+13 1
5 Temenos Quantcast 1.108720e+13 2
6 Temenos Quantcast 1.108720e+13 0
7 Temenos Quantcast 1.108720e+13 1
8 Temenos Quantcast 1.108720e+13 2
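The row counts above can be checked on a small example. The sketch below rebuilds cut-down versions of the two frames (the values are taken from the samples, the other columns are omitted) and also uses the validate argument of pd.merge, which raises an error when the right-hand keys are not unique. Treat it as one way to guard against the 3 x 3 blow-up, not the only one.

```python
import pandas as pd

# Cut-down versions of the sample frames; only the relevant columns are kept.
df1 = pd.DataFrame({
    'Placement ID': [1.10872e13] * 3,
    'Units': [367628.76, 367628.76, 74604.63],
})
df2 = pd.DataFrame({
    'Creatives': ['Temenos'] * 3,
    'Placement ID': [1.10872e13] * 3,
})

# Without de-duplication every df2 row pairs with every df1 row.
many = pd.merge(df2, df1, how='left', on='Placement ID')
print(len(many))

# After drop_duplicates each df2 row finds exactly one partner.
deduped = df1.drop_duplicates('Placement ID')
one = pd.merge(df2, deduped, how='left', on='Placement ID',
               validate='many_to_one')
print(len(one))

# validate='many_to_one' rejects a right frame with repeated keys.
try:
    pd.merge(df2, df1, how='left', on='Placement ID', validate='many_to_one')
    raised = False
except pd.errors.MergeError:
    raised = True
print(raised)
```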