Comparing two lists of dictionaries that may or may not share values in Python

Time: 2018-09-03 22:41:57

Tags: python list dictionary

I need help :P

I have two lists of dictionaries (`lista1` and `lista2` below).

I need to find the differences between the two.

So I wrote this code:

lista_final = []  # stores the differences between the two lists
lista1 = (
    {
        'ip': '127.0.0.1',
        'hostname': 'abc',
        'state': 'open',
        'scan_id': '2'
    },
    {
        'ip': '127.0.0.2',
        'hostname': 'bca',
        'state': 'closed',
        'scan_id': '2'
    }
)
lista2 = (
    {
        'ip': '127.0.0.1',
        'hostname': 'abc',
        'state': 'closed',
        'scan_id': '3'
    },
    {
        'ip': '127.0.0.3',
        'hostname': 'qwe',
        'state': 'open',
        'scan_id': '3'
    },
    {
        'ip': '127.0.0.2',
        'hostname': 'xxx',
        'state': 'up',
        'scan_id': '3'
    },
)

and this loop to compare them:

for l1 in lista1:
    for l2 in lista2:
        if l1['ip'] == l2['ip']:  # same IP in both lists
            ip = l1['ip']
            hostname = l1['hostname']  # default: hostname unchanged
            if l1['hostname'] != l2['hostname']:  # hostnames differ, record both
                hostname = '({scan_id_l1}:{valuel1}) != ({scan_id_l2}:{valuel2})'.format(scan_id_l1=l1['scan_id'], valuel1=l1['hostname'], scan_id_l2=l2['scan_id'], valuel2=l2['hostname'])
            state = l1['state']  # default: state unchanged
            if l1['state'] != l2['state']:  # states differ, record both
                state = '({scan_id_l1}:{valuel1}) != ({scan_id_l2}:{valuel2})'.format(scan_id_l1=l1['scan_id'], valuel1=l1['state'], scan_id_l2=l2['scan_id'], valuel2=l2['state'])
            # build a temporary dict for this IP
            tl = {
                'ip': ip,
                'hostname': hostname,
                'state': state
            }
            # append the temporary dict to lista_final
            lista_final.append(tl)
            break  # match handled, move on to the next l1

print(lista_final)

This prints the list below. Note that the IP '127.0.0.3' from lista2 does not appear in lista_final, and I would like it to be included:

[
    {
        'ip': '127.0.0.1',
        'hostname': 'abc',
        'state': '(2:open) != (3:closed)'
    },
    {
        'ip': '127.0.0.2',
        'hostname': '(2:bca) != (3:xxx)',
        'state': '(2:closed) != (3:up)'
    }
] 

Can you suggest the best solution for this?

2 Answers:

Answer 0: (score: 0)

Let's clean up your solution a bit first:

# convert the tuples to lists so they can be sorted in place
lista1 = list(lista1)
lista2 = list(lista2)

# sort both by ip so that zip() pairs entries by position
lista1.sort(key=lambda d: d['ip'])
lista2.sort(key=lambda d: d['ip'])

for d1, d2 in zip(lista1, lista2):
    # list() takes a snapshot of the items because d1 is modified in the loop
    # (items() replaces the Python 2-only iteritems())
    for k, v in list(d1.items()):
        if k != 'scan_id' and v != d2.get(k):
            d1[k] = "({}:{}) != ({}:{})".format(d1['scan_id'], v, d2['scan_id'], d2.get(k))
    d1.pop('scan_id')
    lista_final.append(d1)

This is almost the same as your code, just done in place and in a more Pythonic way. Here is the output:

[
    {
      'hostname': 'abc',
      'ip': '127.0.0.1',
      'state': '(2:open) != (3:closed)'
     },
     {
      'hostname': '(2:bca) != (3:xxx)',
      'ip': '127.0.0.2',
      'state': '(2:closed) != (3:up)'
     }
 ]
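
One caveat worth illustrating: zip() pairs items purely by position and stops at the shorter input, so the loop above only lines up when matching IPs sit at the same index after sorting. A tiny sketch with made-up IP lists (scan_a and scan_b are hypothetical, not the data from the question):

```python
# Hypothetical sorted IP lists; scan_b has IPs that scan_a lacks.
scan_a = ['127.0.0.1', '127.0.0.2']
scan_b = ['127.0.0.1', '127.0.0.3', '127.0.0.4']

# zip() stops at the shorter list, so '127.0.0.4' is never visited.
pairs = list(zip(scan_a, scan_b))

# Pairs whose IPs differ were matched by position only, not by value.
misaligned = [(a, b) for a, b in pairs if a != b]

print(pairs)
print(misaligned)
```

With the question's data the shared IPs happen to line up, which is why the output above is correct there.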

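If you want to cover the edge case where one list is longer than the other, you can compare their lengths and then repeat the process like this: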

# compare lengths; dicts themselves cannot be ordered with > in Python 3
longer_lista = lista1 if len(lista1) > len(lista2) else lista2
# zip() above consumed this many entries from each list
paired = min(len(lista1), len(lista2))
# iterate only over the dictionaries in the unpaired tail of the longer list
for d in longer_lista[paired:]:
    for k, v in list(d.items()):
        if k != 'ip' and k != 'scan_id':
            d[k] = "({}) != ({}:{})".format("NOT EXISTS", d['scan_id'], v)
    lista_final.append(d)

This gives you the expected output. Of course, this code does not cover every possible edge case, but it is a decent starting point.

[
    {
      'hostname': 'abc',
      'ip': '127.0.0.1',
      'state': '(2:open) != (3:closed)'
     },
     {
      'hostname': '(2:bca) != (3:xxx)',
      'ip': '127.0.0.2',
      'state': '(2:closed) != (3:up)'
     },
     {
      'hostname': '(NOT EXISTS) != (3:qwe)',
      'ip': '127.0.0.3',
      'scan_id': '3',
      'state': '(NOT EXISTS) != (3:open)'
     }
 ]
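
As a hedged alternative to pairing by position, the sketch below indexes each scan by its 'ip' and walks the union of IPs, so entries missing from either side are reported regardless of list order or length. The names diff_scans, scan_a, scan_b and the fields parameter are my own assumptions, as is the placement of the NOT EXISTS marker on the side where the IP is absent.

```python
def diff_scans(old, new, fields=('hostname', 'state')):
    """Compare two scans (iterables of dicts), matching entries by 'ip'."""
    # Index each scan by IP so lookups do not depend on position.
    old_by_ip = {d['ip']: d for d in old}
    new_by_ip = {d['ip']: d for d in new}

    result = []
    # Walk every IP seen in either scan (sorted as text for a stable order).
    for ip in sorted(set(old_by_ip) | set(new_by_ip)):
        a = old_by_ip.get(ip)
        b = new_by_ip.get(ip)
        row = {'ip': ip}
        for field in fields:
            if a is None:
                # IP only present in the new scan
                row[field] = '(NOT EXISTS) != ({}:{})'.format(b['scan_id'], b[field])
            elif b is None:
                # IP only present in the old scan
                row[field] = '({}:{}) != (NOT EXISTS)'.format(a['scan_id'], a[field])
            elif a[field] != b[field]:
                row[field] = '({}:{}) != ({}:{})'.format(
                    a['scan_id'], a[field], b['scan_id'], b[field])
            else:
                row[field] = a[field]
        result.append(row)
    return result


# Same data as in the question, under different names.
scan_a = [
    {'ip': '127.0.0.1', 'hostname': 'abc', 'state': 'open', 'scan_id': '2'},
    {'ip': '127.0.0.2', 'hostname': 'bca', 'state': 'closed', 'scan_id': '2'},
]
scan_b = [
    {'ip': '127.0.0.1', 'hostname': 'abc', 'state': 'closed', 'scan_id': '3'},
    {'ip': '127.0.0.3', 'hostname': 'qwe', 'state': 'open', 'scan_id': '3'},
    {'ip': '127.0.0.2', 'hostname': 'xxx', 'state': 'up', 'scan_id': '3'},
]

report = diff_scans(scan_a, scan_b)
print(report)
```

Unlike the in-place version, this leaves the input dictionaries untouched and does not carry 'scan_id' into the result.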

Answer 1: (score: 0)

I used some of the ideas from the answers posted here and was able to solve the problem!

Thanks.

Here is the commented solution.

The function

def search_diffences(list_one, list_two):
    # Store the result
    list_final = []

    # Sort by IP
    list_one.sort( key = lambda d : d['ip'] )
    list_two.sort( key = lambda d : d['ip'] )

    # Find the bigger list
    bigger_list = list_one
    smaller_list = list_two
    if len(list_two) > len(list_one):
        bigger_list = list_two
        smaller_list = list_one

    # Nested loop: look up every entry of bigger_list in smaller_list
    for lo in bigger_list:
        found = False # Whether lo['ip'] was found in smaller_list
        pop_i = 0 # Index of the smaller_list entry to remove once matched (optimization)
        for lt in smaller_list:
            print("lo['ip']({0}) == lt['ip']({1}): {2}".format(lo['ip'], lt['ip'], lo['ip'] == lt['ip'])) # For debug
            if lo['ip'] == lt['ip']: # lo['ip'] from bigger_list was found in smaller_list
                found = True # Set found as True
                # Store scan_id because will be used more than one time
                scan_id_lo = lo['scan_id']
                scan_id_lt = lt['scan_id']

                # Build the entry that will be added to list_final
                ip = lo['ip']
                # Format the result the way I want it
                hostname = lo['hostname']
                if lo['hostname'] != lt['hostname']:
                    hostname = '({SCAN_ID_LO}:{VALUE_LO}) != ({SCAN_ID_LT}:{VALUE_LT})'.format(
                        SCAN_ID_LO=scan_id_lo,
                        SCAN_ID_LT=scan_id_lt,
                        VALUE_LO=lo['hostname'],
                        VALUE_LT=lt['hostname']
                    )

                state = lo['state']
                if lo['state'] != lt['state']:
                    state = '({SCAN_ID_LO}:{VALUE_LO}) != ({SCAN_ID_LT}:{VALUE_LT})'.format(
                        SCAN_ID_LO=scan_id_lo,
                        SCAN_ID_LT=scan_id_lt,
                        VALUE_LO=lo['state'],
                        VALUE_LT=lt['state']
                    )
                # Create the temp_list
                temp_list = {
                    'ip': ip,
                    'hostname': hostname,
                    'state': state
                }
                # Append temp_list in list_final
                list_final.append(temp_list)
                # Pop the matched entry so later iterations over bigger_list do not scan it again
                smaller_list.pop(pop_i)
                # Go to the next value of bigger_list
                break
            # No match at this position, so advance the index that would be popped
            pop_i += 1
        print(found) # Debug
        if not found: # lo['ip'] does not exist in smaller_list, append to list_final anyway
            scan_id_lo = lo['scan_id']
            scan_id_lt = lt['scan_id']

            ip = lo['ip']

            print("This was not found, adding to list_final", ip)

            hostname = '({SCAN_ID_LO}:{VALUE_LO}) != ({SCAN_ID_LT}:{VALUE_LT})'.format(
                SCAN_ID_LO=scan_id_lo,
                SCAN_ID_LT=scan_id_lt,
                VALUE_LO='NOT EXIST',
                VALUE_LT=lo['hostname']
            )

            state = '({SCAN_ID_LO}:{VALUE_LO}) != ({SCAN_ID_LT}:{VALUE_LT})'.format(
                SCAN_ID_LO=scan_id_lo,
                SCAN_ID_LT=scan_id_lt,
                VALUE_LO='NOT EXIST',
                VALUE_LT=lo['state']
            )

            temp_list = {
                'ip': ip,
                'hostname': hostname,
                'state': state
            }
            list_final.append(temp_list)

            # bigger_list.pop(0)

    # If smaller_list still have elements
    for lt in smaller_list:
        scan_id_lt = lt['scan_id']

        ip = lt['ip']

        hostname = '({SCAN_ID_LO}:{VALUE_LO}) != ({SCAN_ID_LT}:{VALUE_LT})'.format(
            SCAN_ID_LO='NOT EXIST',
            SCAN_ID_LT=scan_id_lt,
            VALUE_LO='NOT EXIST',
            VALUE_LT=lt['hostname']
        )

        state = '({SCAN_ID_LO}:{VALUE_LO}) != ({SCAN_ID_LT}:{VALUE_LT})'.format(
            SCAN_ID_LO='NOT EXIST',
            SCAN_ID_LT=scan_id_lt,
            VALUE_LO='NOT EXIST',
            VALUE_LT=lt['state']
        )

        temp_list = {
            'ip': ip,
            'hostname': hostname,
            'state': state
        }

        list_final.append(temp_list) # Simple, append

    return list_final

Main code and lists

# First List
list_one = [
    {
        'ip': '127.0.0.1',
        'hostname': 'abc',
        'state': 'open',
        'scan_id': '2'
    },
    {
        'ip': '127.0.0.2',
        'hostname': 'bca',
        'state': 'closed',
        'scan_id': '2'
    },
    {
        'ip': '100.0.0.4',
        'hostname': 'ddd',
        'state': 'closed',
        'scan_id': '2'
    },
    {
        'ip': '100.0.0.1',
        'hostname': 'ggg',
        'state': 'up',
        'scan_id': '2'
    },
]
# Second List
list_two = [
    {
        'ip': '127.0.0.1',
        'hostname': 'abc',
        'state': 'closed',
        'scan_id': '3'
    },
    {
        'ip': '127.0.0.3',
        'hostname': 'qwe',
        'state': 'open',
        'scan_id': '3'
    },
    {
        'ip': '127.0.0.2',
        'hostname': 'xxx',
        'state': 'up',
        'scan_id': '3'
    },
    {
        'ip': '10.0.0.1',
        'hostname': 'ddd',
        'state': 'open',
        'scan_id': '3'
    },
    {
        'ip': '100.0.0.1',
        'hostname': 'ggg',
        'state': 'down',
        'scan_id': '3'
    },
]

print(search_diffences(list_one, list_two))

Which produces

[{
    'ip': '10.0.0.1',
    'hostname': '(3:NOT EXIST) != (2:ddd)',
    'state': '(3:NOT EXIST) != (2:open)'
}, {
    'ip': '100.0.0.1',
    'hostname': 'ggg',
    'state': '(3:down) != (2:up)'
}, {
    'ip': '127.0.0.1',
    'hostname': 'abc',
    'state': '(3:closed) != (2:open)'
}, {
    'ip': '127.0.0.2',
    'hostname': '(3:xxx) != (2:bca)',
    'state': '(3:up) != (2:closed)'
}, {
    'ip': '127.0.0.3',
    'hostname': '(3:NOT EXIST) != (2:qwe)',
    'state': '(3:NOT EXIST) != (2:open)'
}, {
    'ip': '100.0.0.4',
    'hostname': '(NOT EXIST:NOT EXIST) != (2:ddd)',
    'state': '(NOT EXIST:NOT EXIST) != (2:closed)'
}]
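
A side note on the sort step: sorting by d['ip'] compares the addresses as text. The function does its matching with a linear scan, so this only affects the order of the result, and with the sample lists text order happens to agree with numeric order. If numeric ordering matters, the standard library's ipaddress module can serve as the sort key. A minimal sketch, with a made-up list of addresses (the '9.0.0.1' entry is hypothetical and added only to show the difference):

```python
import ipaddress

# Hypothetical addresses; '9.0.0.1' is not part of the lists above.
ips = ['127.0.0.1', '100.0.0.4', '10.0.0.1', '100.0.0.1', '9.0.0.1']

# Plain string sort: compares character by character.
as_text = sorted(ips)

# Numeric sort: ip_address() parses each string into a comparable object.
as_addr = sorted(ips, key=ipaddress.ip_address)

print(as_text)
print(as_addr)
```

In the function above this would amount to using key=lambda d: ipaddress.ip_address(d['ip']) in both sort calls, assuming every 'ip' value is a valid address.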