I am trying to merge two dataframes to find any new entries. At the moment the two dataframes are identical.
Dataframe A
   BusinessName                   Ubi        IdentifierValue
0  CHULA VISTA PAINTING/SERVICES  604000010  CHULAVP841MQ
1  MANU TECH LLC                  604000040  MANUTTL833BL
2  HAWTHORN LANDSCAPE MTRILS INC  604000042  HAWTHLM845MM
3  M M R CONSTRUCTION LLC         604000082  MMRCOCL848MM
4  HURTADO PAINTING               604000120  HURTAP*831JJ
Dataframe B
   BusinessName                   Ubi        IdentifierValue
0  CHULA VISTA PAINTING/SERVICES  604000010  CHULAVP841MQ
1  MANU TECH LLC                  604000040  MANUTTL833BL
2  HAWTHORN LANDSCAPE MTRILS INC  604000042  HAWTHLM845MM
3  M M R CONSTRUCTION LLC         604000082  MMRCOCL848MM
4  HURTADO PAINTING               604000120  HURTAP*831JJ
When I merge on Ubi, it duplicates all the rows.
A = A[['Ubi']]
B = B[['Ubi']]
A = A.merge(B, how='outer', indicator=True)
A
   Ubi          _merge
0  604000010.0  left_only
1  604000040.0  left_only
2  604000042.0  left_only
3  604000082.0  left_only
4  604000120.0  left_only
5  604000010.0  right_only
6  604000040.0  right_only
7  604000042.0  right_only
8  604000082.0  right_only
9  604000120.0  right_only
If I merge on BusinessName alone, though, it works as expected.
A = A[['BusinessName']]
B = B[['BusinessName']]
A = A.merge(B, how='outer', indicator=True)
A
   BusinessName                   _merge
0  CHULA VISTA PAINTING/SERVICES  both
1  MANU TECH LLC                  both
2  HAWTHORN LANDSCAPE MTRILS INC  both
3  M M R CONSTRUCTION LLC         both
4  HURTADO PAINTING               both
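I would prefer to merge on Ubi, but I can't seem to find the problem. The Ubi column is Int64, while the other columns are objects. When I merge on the Ubi column, I notice the column type switches to float64.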
Answer 0 (score: 1)
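The problem is that the Ubi columns have different dtypes in the two dataframes; they need to be the same for the keys to match.
Check:
print(A['Ubi'].dtype)
print(B['Ubi'].dtype)
So you need: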
A['Ubi'] = A['Ubi'].astype(str)
B['Ubi'] = B['Ubi'].astype(str)
Or, as long as neither column contains missing values (astype(int) raises on NaN):
A['Ubi'] = A['Ubi'].astype(int)
B['Ubi'] = B['Ubi'].astype(int)
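As a rough sketch of the whole fix end to end: the sample data below is made up, and it assumes one side holds Ubi as integers and the other as strings, which the question does not confirm. The extra row in B is only there to stand in for a "new entry".

```python
import pandas as pd

# Made-up sample data (assumption): the same Ubi values stored as integers in A
# and as strings in B, plus one extra row in B to play the "new entry".
A = pd.DataFrame({"Ubi": [604000010, 604000040, 604000042]})
B = pd.DataFrame({"Ubi": ["604000010", "604000040", "604000042", "604000999"]})

# Cast both key columns to a common dtype before merging.
A["Ubi"] = A["Ubi"].astype(str)
B["Ubi"] = B["Ubi"].astype(str)

merged = A.merge(B, on="Ubi", how="outer", indicator=True)

# Rows that exist only in B are the new entries.
new_entries = merged.loc[merged["_merge"] == "right_only", "Ubi"].tolist()
print(new_entries)  # ['604000999']
```

With matching dtypes, the three shared values come back as both and only the extra row is flagged right_only.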