Pandas merge duplicates all rows

Time: 2017-10-18 11:56:52

Tags: python pandas merge

I am trying to merge two DataFrames to find any new entries. At the moment the two DataFrames are identical.

Dataframe A

    BusinessName                        Ubi             IdentifierValue
0   CHULA VISTA PAINTING/SERVICES       604000010       CHULAVP841MQ
1   MANU TECH LLC                       604000040       MANUTTL833BL
2   HAWTHORN LANDSCAPE MTRILS INC       604000042       HAWTHLM845MM
3   M M R CONSTRUCTION LLC              604000082       MMRCOCL848MM
4   HURTADO PAINTING                    604000120       HURTAP*831JJ

Dataframe B

    BusinessName                        Ubi             IdentifierValue
0   CHULA VISTA PAINTING/SERVICES       604000010       CHULAVP841MQ
1   MANU TECH LLC                       604000040       MANUTTL833BL
2   HAWTHORN LANDSCAPE MTRILS INC       604000042       HAWTHLM845MM
3   M M R CONSTRUCTION LLC              604000082       MMRCOCL848MM
4   HURTADO PAINTING                    604000120       HURTAP*831JJ

When I merge on Ubi, every row is duplicated: each one shows up once as left_only and once as right_only.

A = A[['Ubi']]
B = B[['Ubi']]
A = A.merge(B, how='outer', indicator=True)
A


    Ubi         _merge
0   604000010.0 left_only
1   604000040.0 left_only
2   604000042.0 left_only
3   604000082.0 left_only
4   604000120.0 left_only
5   604000010.0 right_only
6   604000040.0 right_only
7   604000042.0 right_only
8   604000082.0 right_only
9   604000120.0 right_only

If I merge on BusinessName only, though, it works as expected.

A = A[['BusinessName']]
B = B[['BusinessName']]
A = A.merge(B, how='outer', indicator=True)
A

    BusinessName                    _merge
0   CHULA VISTA PAINTING/SERVICES   both
1   MANU TECH LLC                   both
2   HAWTHORN LANDSCAPE MTRILS INC   both
3   M M R CONSTRUCTION LLC          both
4   HURTADO PAINTING                both

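I would prefer to merge on Ubi, but I can't find the problem. The Ubi column is Int64, while the other columns are objects. When I merge on the Ubi column, I notice that the column's type switches to float64.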

1 answer:

Answer 0: (score: 1)

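The problem is that the Ubi columns in the two DataFrames have different dtypes; they need to be the same for the keys to match.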

Check:

print (A['Ubi'].dtype)
print (B['Ubi'].dtype)

So you need to cast both to the same type:

A['Ubi'] = A['Ubi'].astype(str)
B['Ubi'] = B['Ubi'].astype(str)

Or:

A['Ubi'] = A['Ubi'].astype(int)
B['Ubi'] = B['Ubi'].astype(int)
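
To tie this back to the original goal of finding new entries, here is a minimal sketch of the cast-then-merge approach. The data is made up for illustration: it assumes A stores Ubi as int64 and B stores it as strings, and the extra value 604000999 in B is a hypothetical new entry that does not appear in the question.

```python
import pandas as pd

# Hypothetical data: the same Ubi values, stored as int64 in A and as strings in B.
# 604000999 is an invented "new entry" that exists only in B.
A = pd.DataFrame({"Ubi": [604000010, 604000040, 604000042]})
B = pd.DataFrame({"Ubi": ["604000010", "604000040", "604000042", "604000999"]})

# Inspect the key dtypes first; a mismatch here means the keys will not match.
print(A["Ubi"].dtype, B["Ubi"].dtype)

# Cast both key columns to one common dtype before merging.
A["Ubi"] = A["Ubi"].astype("int64")
B["Ubi"] = B["Ubi"].astype("int64")

# Outer merge with an indicator column showing where each key was found.
merged = A.merge(B, on="Ubi", how="outer", indicator=True)
print(merged)

# Rows present only in B are the new entries.
new_entries = merged.loc[merged["_merge"] == "right_only", "Ubi"]
print(new_entries.tolist())
```

Once the dtypes agree, the shared keys are marked both and only the genuinely new key is marked right_only. Note that recent pandas versions refuse to merge an integer key with a string key and raise an error instead, so the silent mismatch shown in the question is likely specific to older releases.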