Computing the differences between two DataFrames

Time: 2020-06-10 14:15:04

Tags: python pandas dataframe

Reading the input and output files

import pandas as pd
input2 = pd.read_excel('~.xlsx')
input = pd.read_excel('~.xlsx')

Finding the columns that differ between the input and output files

inserted_cols = input2

cols = ([col for col in inserted_cols if col not in input ]
            +  [col for col in inserted_cols if col in input ])

input = input2 [cols]

Input:-

ram     redist  rotril  shyam
asdasd  asdasd  fff     rtrr
adsd    adsd    zzz     fhgfhgf
sadasd  sadasd  bbb     cbcbv
zxcxz   zxcxz   xxx     hjhj
fdfsd   fdfsd   rrr     piio

Input2:-

ram    shyam    tramp   rotril
asdasd  rtrr    asdasd  rtrr
adsd    fhgfhgf adsd    fhgfhgf
sadasd  cbcbv   sadasd  cbcbv
zxcxz   hjhj    zxcxz   hjhj
fdfsd   piio    fdfsd   piio

Output I get:-

ram    shyam    tramp   rotril
asdasd  rtrr    asdasd  rtrr
adsd    fhgfhgf adsd    fhgfhgf
sadasd  cbcbv   sadasd  cbcbv
zxcxz   hjhj    zxcxz   hjhj
fdfsd   piio    fdfsd   piio

Expected output:-

ram     shyam   tramp   rotril
asdasd  rtrr    NA      fff
adsd    fhgfhgf NA      zzz
sadasd  cbcbv   NA      bbb
zxcxz   hjhj    NA      xxx
fdfsd   piio    NA      rrr

Can anyone help me see what I am doing wrong in the code?

1 answer:

Answer 0: (score: 0)

Here is one solution:

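# input1 is the frame the question calls `input`; input2 is the second file.
input1.index.name = "inx"
input2.index.name = "inx"

# Columns that exist in input2 but not in input1.
new_cols = list(set(input2.columns) - set(input1.columns))

# Copy those columns over from input2, aligned on the index.
input1 = input1.join(input2[new_cols])

# Reorder the columns to match input2's column order.
res = input1[input2.columns]
print(res)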

The output is:

        ram    shyam   tramp rotril
inx                                
0    asdasd     rtrr  asdasd    fff
1      adsd  fhgfhgf    adsd    zzz
2    sadasd    cbcbv  sadasd    bbb
3     zxcxz     hjhj   zxcxz    xxx
4     fdfsd     piio   fdfsd    rrr
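
Note that the output above fills tramp with the values from input2, whereas the expected output in the question shows NA in that column. If NA is really what is wanted, one possible alternative (a sketch, not part of the original answer) is `DataFrame.reindex`, which keeps input1's values, takes the column order from input2, and leaves columns missing from input1 as NaN. The sample frames below are rebuilt by hand from the tables in the question rather than read from the Excel files, and `Index.difference` is used here in place of the set subtraction only because it returns the missing columns in a deterministic (sorted) order.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question (assumption:
# the real data would come from pd.read_excel instead).
input1 = pd.DataFrame({
    "ram":    ["asdasd", "adsd", "sadasd", "zxcxz", "fdfsd"],
    "redist": ["asdasd", "adsd", "sadasd", "zxcxz", "fdfsd"],
    "rotril": ["fff", "zzz", "bbb", "xxx", "rrr"],
    "shyam":  ["rtrr", "fhgfhgf", "cbcbv", "hjhj", "piio"],
})
input2 = pd.DataFrame({
    "ram":    ["asdasd", "adsd", "sadasd", "zxcxz", "fdfsd"],
    "shyam":  ["rtrr", "fhgfhgf", "cbcbv", "hjhj", "piio"],
    "tramp":  ["asdasd", "adsd", "sadasd", "zxcxz", "fdfsd"],
    "rotril": ["rtrr", "fhgfhgf", "cbcbv", "hjhj", "piio"],
})

# Columns present in input2 but missing from input1.
new_cols = input2.columns.difference(input1.columns)

# Keep input1's values, adopt input2's column order, and fill columns
# that input1 does not have (here: tramp) with NaN.
expected = input1.reindex(columns=input2.columns)
print(expected)
```

With this approach the redist column is dropped because it does not appear in input2, and rotril keeps the values from input1 (fff, zzz, ...), which matches the expected output in the question.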