Python Pandas: comparing columns

Date: 2017-06-01 15:23:22

Tags: python csv pandas multiple-columns

I need help comparing the data in the columns of one CSV file against another CSV file in order to obtain the correct addresses.

import csv
import pandas as pd

num_lines = sum(1 for line in open('example.csv'))  # count the number of lines

for row in range(num_lines - 1):
    df1 = pd.read_csv("example.csv", na_values=['NA'])  # read csv addresses list that need to be fixed
    df2 = pd.read_csv("CTT.csv", na_values=['NA'])      # read csv with correct addresses 

    if cp7 == True:  # compare column with another csv file
        if cp7 == 1:  # cp7 matches only one address

            File = open('Norm.csv', 'w')
            Norm = csv.writer(File)
            Norm = [column for column in Norm]
            File.close()
        else:  # all cp7 possibilities

            File = open('PNorm.csv', 'w')
            PNorm = csv.writer(File)
            PNorm = [column for column in PNorm]
            File.close()

    elif cp4 == True:  # all cp4 possibilities; compare column with another csv file

        File = open('PNorm.csv', 'w')
        PNorm = csv.writer(File)
        PNorm = [column for column in PNorm]
        File.close()
    else:
        pass

    if localidade == True:  # all localidade possibilities; read localidade

        File = open('PNorm.csv', 'w')
        PNorm = csv.writer(File)
        PNorm = [column for column in PNorm]
        File.close()

    else:
        pass

    if tipovia == True:  # compare column with another csv file
        if tipovia == 1:  # TipoVia matches only one address

            File = open('Norm.csv', 'w')
            Norm = csv.writer(File)
            Norm = [column for column in Norm]
            File.close()
        else:  # all cp7 possibilities

            File = open('PNorm.csv', 'w')
            PNorm = csv.writer(File)
            PNorm = [column for column in PNorm]
            File.close()
    else:
        pass

    if nomerua_numpolicia == True:  # compare column with another csv file
        if nomerua_numpolicia == 1:  # NomeRua_NumPolicia matches only one address

            File = open('Norm.csv', 'w')
            Norm = csv.writer(File)
            Norm = [column for column in Norm]
            File.close()
        else:  # all cp7 possibilities

            File = open('PNorm.csv', 'w')
            PNorm = csv.writer(File)
            PNorm = [column for column in PNorm]
            File.close()
    else:
        pass

1 answer:

Answer 0 (score: 0)

Solution:

# Element-wise comparison; assumes df1 and df2 have the same number of rows.
left, right = df1.reset_index(drop=True), df2.reset_index(drop=True)
match = (left["CP4"] == right["CP4"]) & (left["CP3"] == right["CP3"])
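
A row-by-row comparison like the one above only lines up when both files have the same number of rows in the same order. If that does not hold, one possible alternative is to join the two frames on the postal-code columns and count the candidates per input row. The following is only a sketch under assumptions: the `CP4` and `CP3` column names come from the answer, while the `id` and `address` columns and the sample values are hypothetical stand-ins for whatever `example.csv` and `CTT.csv` actually contain.

```python
import pandas as pd

# Hypothetical sample data; the real frames would come from
# pd.read_csv("example.csv") and pd.read_csv("CTT.csv").
df1 = pd.DataFrame({
    "id": [1, 2, 3],                      # assumed unique key per input row
    "CP4": ["1000", "2000", "3000"],
    "CP3": ["001", "002", "003"],
})
df2 = pd.DataFrame({
    "CP4": ["1000", "2000", "2000"],
    "CP3": ["001", "002", "002"],
    "address": ["Rua A", "Rua B", "Rua C"],  # assumed reference column
})

# Left join on both postal-code parts; rows of df1 without a match
# keep NaN in the "address" column.
merged = df1.merge(df2, on=["CP4", "CP3"], how="left")

# Count how many reference addresses matched each input row
# ("count" ignores NaN, so unmatched rows get 0).
merged["n_matches"] = merged.groupby("id")["address"].transform("count")

unique = merged[merged["n_matches"] == 1]     # exactly one candidate address
ambiguous = merged[merged["n_matches"] > 1]   # several candidate addresses
unmatched = merged[merged["n_matches"] == 0]  # no candidate at all

# Writing the results, using the file names from the question:
# unique.to_csv("Norm.csv", index=False)
# ambiguous.to_csv("PNorm.csv", index=False)
```

With this layout the single-match rows would go to `Norm.csv` and the multiple-match rows to `PNorm.csv`, without opening either file inside a loop. The same join could be extended to further columns such as `localidade` by adding them to the `on` list, provided both files contain them.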