Python: merging lists by a common element

Time: 2013-12-05 04:04:12

Tags: python list merge

I am trying to merge two lists that share a common element (in this case the id field). I have something like this:

list1=[(id1,host1),(id2,host2),(id1,host5),(id3,host4),(id4,host6),(id5,host8)]

list2=[(id1,IP1),(id2,IP2),(id3,IP3),(id4,IP4),(id5,IP5)]

The hosts are unique, but as you can see, the ids in list1 can repeat. I want output that uses the id to associate the matching entries from the two lists:

Something like:

IP1(host1,host5), IP2(host2), IP3(host4), IP4(host6), IP5(host8)

As you can see, IP1 has two hosts associated with it.

Is there a quick way to do this?

Thanks

6 answers:

Answer 0 (score: 4)

>>> from collections import defaultdict
>>> list1 = [('id1','host1'),('id2','host2'),('id1','host5'),('id3','host4'),('id4','host6'),('id5','host8')]
>>> list2 = [('id1','IP1'),('id2','IP2'),('id3','IP3'),('id4','IP4'),('id5','IP5')]
>>> d1 = defaultdict(list)
>>> for k,v in list1:
...     d1[k].append(v)
... 

You can print the items like this:

>>> for k, s in list2:
...     print s, d1[k]
... 
IP1 ['host1', 'host5']
IP2 ['host2']
IP3 ['host4']
IP4 ['host6']
IP5 ['host8']

You can use a list comprehension to collect the results in a list:

>>> res = [(s, d1[k]) for k, s in list2]
>>> res
[('IP1', ['host1', 'host5']), ('IP2', ['host2']), ('IP3', ['host4']), ('IP4', ['host6']), ('IP5', ['host8'])]
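
For anyone on Python 3 who wants the exact string from the question, here is a minimal sketch of the same grouping idea. The function name format_hosts_by_ip is my own, and I am assuming the ids in list2 decide the output order:

```python
from collections import defaultdict

list1 = [('id1', 'host1'), ('id2', 'host2'), ('id1', 'host5'),
         ('id3', 'host4'), ('id4', 'host6'), ('id5', 'host8')]
list2 = [('id1', 'IP1'), ('id2', 'IP2'), ('id3', 'IP3'),
         ('id4', 'IP4'), ('id5', 'IP5')]


def format_hosts_by_ip(host_pairs, ip_pairs):
    """Return 'IP(host,host), ...' in the order the ids appear in ip_pairs.

    Hypothetical helper, not part of the original answer.
    """
    hosts_by_id = defaultdict(list)
    for id_, host in host_pairs:
        hosts_by_id[id_].append(host)
    # .get() avoids inserting empty entries for ids that have no hosts.
    return ', '.join(
        '{}({})'.format(ip, ','.join(hosts_by_id.get(id_, [])))
        for id_, ip in ip_pairs
    )


print(format_hosts_by_ip(list1, list2))
# IP1(host1,host5), IP2(host2), IP3(host4), IP4(host6), IP5(host8)
```

Grouping first keeps the whole thing to one pass over each list, instead of rescanning list1 for every IP.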

Answer 1 (score: 1)

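  1. Use collections.defaultdict to map each id to its set of hosts.
  2. Then map each id to its IP.

A defaultdict(set) also discards a host that is added twice:

    >>> from collections import defaultdict
    >>> d = defaultdict(set)
    >>> d['id'].add('host1')
    >>> d['id'].add('host2')
    >>> d['id'].add('host1')
    >>> d
    defaultdict(<type 'set'>, {'id': set(['host2', 'host1'])})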
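
Putting the two steps together on the data from the question might look like the sketch below (Python 3). The names hosts_by_id, ip_by_id and hosts_by_ip are mine, and I am assuming each id appears only once in list2:

```python
from collections import defaultdict

list1 = [('id1', 'host1'), ('id2', 'host2'), ('id1', 'host5'),
         ('id3', 'host4'), ('id4', 'host6'), ('id5', 'host8')]
list2 = [('id1', 'IP1'), ('id2', 'IP2'), ('id3', 'IP3'),
         ('id4', 'IP4'), ('id5', 'IP5')]

# Step 1: id -> set of hosts (a set silently drops repeated hosts).
hosts_by_id = defaultdict(set)
for id_, host in list1:
    hosts_by_id[id_].add(host)

# Step 2: id -> IP (assumes one IP per id).
ip_by_id = dict(list2)

# Join the two maps on id; sort the hosts because sets are unordered.
hosts_by_ip = {
    ip_by_id[id_]: sorted(hosts)
    for id_, hosts in hosts_by_id.items()
    if id_ in ip_by_id
}
print(hosts_by_ip)
```

If the order in which hosts were seen matters, a list is the safer choice than a set.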
    

Answer 2 (score: 1)

Maybe something like this?

#!/usr/local/cpython-3.3/bin/python

import pprint
import collections

class Host_data:
    def __init__(self, ip_address, hostnames):
        self.ip_address = ip_address
        self.hostnames = hostnames
        pass

    def __str__(self):
        return '{}({})'.format(self.ip_address, ','.join(self.hostnames))

    __repr__ = __str__

    # The python 2.x way
    def __cmp__(self, other):
        if self.ip_address < other.ip_address:
            return -1
        elif self.ip_address > other.ip_address:
            return 1
        else:
            if self.hostnames < other.hostnames:
                return -1
            elif self.hostnames > other.hostnames:
                return 1
            else:
                return 0

    # The python 3.x way
    def __lt__(self, other):
        return self.__cmp__(other) < 0


def main():
    list1=[('id1','host1'),('id2','host2'),('id1','host5'),('id3','host4'),('id4','host6'),('id5','host8')]

    list2=[('id1','IP1'),('id2','IP2'),('id3','IP3'),('id4','IP4'),('id5','IP5')]

    keys1 = set(tuple_[0] for tuple_ in list1)
    keys2 = set(tuple_[0] for tuple_ in list2)
    keys = keys1 | keys2

    dict1 = collections.defaultdict(list)
    dict2 = {}

    for tuple_ in list1:
        id_str = tuple_[0]
        hostname = tuple_[1]
        dict1[id_str].append(hostname)

    for tuple_ in list2:
        id_str = tuple_[0]
        ip_address = tuple_[1]
        dict2[id_str] = ip_address

    result_dict = {}
    for key in keys:
        hostnames = []
        ip_address = ''
        if key in dict1:
            hostnames = dict1[key]
        if key in dict2:
            ip_address = dict2[key]
        host_data = Host_data(ip_address, hostnames)
        result_dict[key] = host_data

    pprint.pprint(result_dict)
    print('actual output:')
    values = list(result_dict.values())
    values.sort()
    print(', '.join(str(value) for value in values))

    print('desired output:')
    print('IP1(host1,host5), IP2(host2), IP3(host4), IP4(host6), IP5(host8)')


main()
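
Python 3 ignores __cmp__ and sorting only needs __lt__, so the ordering part could probably be shortened. The following is a rough sketch of one way to do that with functools.total_ordering and tuple comparison. HostData and _key are names I made up for the illustration:

```python
import functools


@functools.total_ordering
class HostData:
    """Illustrative stand-in for a record holding an IP and its hostnames."""

    def __init__(self, ip_address, hostnames):
        self.ip_address = ip_address
        self.hostnames = list(hostnames)

    def _key(self):
        # Compare by IP first, then by the list of hostnames.
        return (self.ip_address, self.hostnames)

    def __eq__(self, other):
        if not isinstance(other, HostData):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, HostData):
            return NotImplemented
        return self._key() < other._key()

    def __str__(self):
        return '{}({})'.format(self.ip_address, ','.join(self.hostnames))


records = [HostData('IP2', ['host2']), HostData('IP1', ['host1', 'host5'])]
print(', '.join(str(r) for r in sorted(records)))
# IP1(host1,host5), IP2(host2)
```

total_ordering fills in the remaining comparison methods from __eq__ and __lt__, so the if/elif chain is not needed.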

Answer 3 (score: 1)

Code:

list1=[('id1','host1'),('id2','host2'),('id1','host5'),('id3','host4'),('id4','host6'),('id5','host8')]
list1 = map(list,list1)
list2=[('id1','IP1'),('id2','IP2'),('id3','IP3'),('id4','IP4'),('id5','IP5')]
list2 = map(list,list2)

for item in list1:
    item += [x[1] for x in list2 if x[0]==item[0]]

list1 += [x for x in list2 if not any(i for i in list1 if x[0]==i[0])]

print list1

Output:

[['id1', 'host1', 'IP1'], ['id2', 'host2', 'IP2'], ['id1', 'host5', 'IP1'], ['id3', 'host4', 'IP3'], ['id4', 'host6', 'IP4'], ['id5', 'host8', 'IP5']]  

Hope this helps :)
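
Note that the code above relies on Python 2, where map returns a list. In Python 3 it returns an iterator, so the += on list1 would fail. The nested scans also check every list2 entry for every list1 entry. A possible Python 3 sketch that produces the same rows with a dictionary lookup is below. It assumes each id appears at most once in list2, and the names ip_by_id, rows and seen are mine:

```python
list1 = [('id1', 'host1'), ('id2', 'host2'), ('id1', 'host5'),
         ('id3', 'host4'), ('id4', 'host6'), ('id5', 'host8')]
list2 = [('id1', 'IP1'), ('id2', 'IP2'), ('id3', 'IP3'),
         ('id4', 'IP4'), ('id5', 'IP5')]

# id -> IP lookup table (assumes one IP per id).
ip_by_id = dict(list2)

# One row per (id, host) pair, with the IP appended when the id is known.
rows = [
    [id_, host] + ([ip_by_id[id_]] if id_ in ip_by_id else [])
    for id_, host in list1
]

# ids that appear only in list2 are kept as [id, IP] rows.
seen = {id_ for id_, _ in list1}
rows += [[id_, ip] for id_, ip in list2 if id_ not in seen]

print(rows)
```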

Answer 4 (score: 1)

from collections import defaultdict
list1 = [("id1","host1"),("id2","host2"),("id1","host5"),("id3","host4"),("id4","host6"),("id5","host8")]
list2 = [("id1","IP1"),("id2","IP2"),("id3","IP3"),("id4","IP4"),("id5","IP5")]
host = defaultdict(list)
IP4id = {}
for k, v in list2:
    IP4id[v] = {"id" : k, "host" : []}

for k, v in list1:
    host[k].append(v)

for item in IP4id:
    IP4id[item]["host"] = host[IP4id[item]["id"]]
print IP4id

Answer 5 (score: 0)

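You need to iterate over each of the two lists and collect its contents in a new defaultdict whose values are of type list.

For list1, this builds a dictionary along the lines of {id1: [host1, host5], id2: [host2], ...}.

You can then map the id values to the corresponding IP values.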

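Note that for this to work, the id values must be hashable. Strings, numbers, and other basic immutable types are hashable.

If the id values are instances of a class you define, they are hashable by default. In Python 3, a class that overrides __eq__ must also define __hash__ to stay hashable. The collections.Hashable abstract base class (collections.abc.Hashable in Python 3.3 and later) can be used with isinstance to check this.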
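
To illustrate the hashability point, here is a small sketch with a made-up id class (DeviceId and its fields are hypothetical). A frozen dataclass gets __eq__ and __hash__ generated from its fields, so two equal ids land on the same dictionary key. This needs Python 3.7 or later:

```python
from collections import defaultdict
from collections.abc import Hashable
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceId:
    """Hypothetical id type; frozen=True makes instances hashable by value."""
    site: str
    number: int


a = DeviceId('lab', 1)
same_as_a = DeviceId('lab', 1)

hosts_by_id = defaultdict(list)
hosts_by_id[a].append('host1')
hosts_by_id[same_as_a].append('host5')  # equal value, so the same key

print(isinstance(a, Hashable))
print(hosts_by_id[DeviceId('lab', 1)])
```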