Question

I'm trying to iterate over a list while also referring back to the preceding items, so that I can compare them.

Here is my code:

list1=[(1,'a','hii'),(2,'a','byee'),(3,'a','yoo'),(4,'b','laa'),(5,'a','mehh')]

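I want to loop over the tuples in list1 so that, if the second value in a tuple is the same as the second value in the previous tuple (both == 'a'), the third items of those tuples are joined together.

The output I want: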

list2=[('a','hii,byee,yoo'),('b','laa'),('a','mehh')]

What I tried:

for item in list1:
    for item2 in list2:
        if item[0] == (item2[0] - 1) and item[1] == item2[1]:
            print(item[2] + ',' + item2[2])
        elif item[0] != item2[0] - 1:
            continue
        elif item[0] == (item2[0] - 1) and item[1] != item2[1]:
            print(item[2])

Incorrect output:

hii,byee
byee,yoo
yoo
laa

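Judging from the first two lines of output, the loop only looks at the one preceding value, not at the preceding two or more. So it joins only 2 words instead of the 3 it should. The output also ends up with duplicates.

How can I fix this?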

Answer 1

I made this way harder than it needed to be.

def combine(inval):
    outval = [inval[0]]
    for item in inval[1:]:
        if item[0] == outval[-1][0] + 1 and item[1] == outval[-1][1]:
            outval[-1] = (item[0], item[1], ",".join([outval[-1][2], item[2]]))
            continue
        outval.append(item)
    return [(item[1], item[2]) for item in outval]

And testing it...

list1 = [(1,'a','hii'),(2,'a','byee'),(3,'a','yoo'),(4,'b','laa'),(5,'a','mehh')]
list2 = [(1,'a','hii'),(3,'a','byee'),(4,'a','yoo'),(5,'b','laa'),(6,'a','mehh')]
list3 = [(1,'a','hoo'),(3,'a','byee'),(5,'a','yoo'),(6,'a','laa'),(7,'a','mehh'),(9, 'b', 'nope')]

for l in (list1, list2, list3):
    print("IN:", l)
    print("OUT:", combine(l))
    print()

Output:

IN: [(1, 'a', 'hii'), (2, 'a', 'byee'), (3, 'a', 'yoo'), (4, 'b', 'laa'), (5, 'a', 'mehh')]
OUT: [('a', 'hii,byee,yoo'), ('b', 'laa'), ('a', 'mehh')]

IN: [(1, 'a', 'hii'), (3, 'a', 'byee'), (4, 'a', 'yoo'), (5, 'b', 'laa'), (6, 'a', 'mehh')]
OUT: [('a', 'hii'), ('a', 'byee,yoo'), ('b', 'laa'), ('a', 'mehh')]

IN: [(1, 'a', 'hoo'), (3, 'a', 'byee'), (5, 'a', 'yoo'), (6, 'a', 'laa'), (7, 'a', 'mehh'), (9, 'b', 'nope')]
OUT: [('a', 'hoo'), ('a', 'byee'), ('a', 'yoo,laa,mehh'), ('b', 'nope')]

This enforces both consecutive sequence numbers at index 0 and equal values at index 1.
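The two conditions can be looked at in isolation with a small sketch. Here is_continuation is a hypothetical helper name, not part of the answer above, and the code assumes every tuple has the shape (number, key, text). It also assumes that an empty input should simply give an empty result, which the version above does not handle.

```python
def is_continuation(prev, item):
    """Return True if item directly follows prev and shares its key."""
    return item[0] == prev[0] + 1 and item[1] == prev[1]


def combine(inval):
    # Assumption: an empty input yields an empty result.
    if not inval:
        return []
    outval = [inval[0]]
    for item in inval[1:]:
        prev = outval[-1]
        if is_continuation(prev, item):
            # Merge into the last entry and keep the newest number,
            # so the next comparison is made against it.
            outval[-1] = (item[0], item[1], prev[2] + "," + item[2])
        else:
            outval.append(item)
    # Drop the sequence number from the result.
    return [(key, text) for _, key, text in outval]


print(combine([(1, 'a', 'hii'), (2, 'a', 'byee'), (3, 'a', 'yoo'),
               (4, 'b', 'laa'), (5, 'a', 'mehh')]))
```

Because the merged entry keeps the number of the last tuple it absorbed, a run of three or more consecutive tuples collapses into one entry instead of into overlapping pairs.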

Answer 2

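Edit: I have updated the algorithm as requested. You can group all tuples that share a key by calling group(values, sort=True), or group only adjacent tuples that share a key by calling group(values). The algorithm also collects every element after the key from each tuple, rather than grabbing only the third element.

groupby does this really well. You can group the values by the second element of each tuple. Then, for each group, grab all the third elements in the group and join them into a single string: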

import itertools

def keySelector(tup):
    return tup[1]

def group(values, sort=False):
    """
    Group tuples by their second element and return a list of 
    tuples (a, b) where a is the second element and b is the 
    aggregated string containing all of the remaining contents
    of the tuple.

    If sort=True, sort the tuples before grouping.  This will
    group all tuples with the same key.  Otherwise, only adjacent
    tuples with the same key will be grouped.
    """

    if sort:
        values.sort(key=keySelector)

    grouped = itertools.groupby(values, key=keySelector)

    result = []
    for k, group in grouped:

        # For each element in the group, grab the remaining contents of the tuple
        allContents = [] 
        for tup in group:
            # Convert tuple to list, grab everything after the second item
            contents = list(tup)[2:]
            allContents.extend(contents)

        # Concatenate everything into one string
        aggregatedString = ','.join(allContents)

        # Add to results
        result.append((k, aggregatedString))

    return result

vals = [(1,'a','hii','abc','def'),
        (2,'a','byee'),
        (3,'a','yoo'),
        (4,'b','laa'),
        (5,'a','mehh','ghi','jkl')]

print(group(vals, sort=True))

Output:

[('a', 'hii,abc,def,byee,yoo,mehh,ghi,jkl'), ('b', 'laa')]
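For the original question, where only neighbouring tuples should be merged, the sort is simply left out. A minimal self-contained sketch, assuming every tuple has the shape (number, key, text): group_adjacent is a hypothetical name, and operator.itemgetter(1) stands in for keySelector. Note that, unlike Answer 1, this does not check that the numbers at index 0 are consecutive.

```python
from itertools import groupby
from operator import itemgetter


def group_adjacent(values):
    # groupby only merges consecutive items with equal keys, so without
    # sorting first, the later 'a' tuple stays in its own group.
    return [(key, ",".join(tup[2] for tup in tuples))
            for key, tuples in groupby(values, key=itemgetter(1))]


list1 = [(1, 'a', 'hii'), (2, 'a', 'byee'), (3, 'a', 'yoo'),
         (4, 'b', 'laa'), (5, 'a', 'mehh')]
print(group_adjacent(list1))
# [('a', 'hii,byee,yoo'), ('b', 'laa'), ('a', 'mehh')]
```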

A shortened version using comprehensions:

def getGroupContents(tuples):
    return ','.join(item for tup in tuples for item in list(tup)[2:])

def group(values, sort=False):
    if sort:
        values.sort(key=keySelector)

    grouped = itertools.groupby(values, key=keySelector)
    return [(k, getGroupContents(tuples)) for k, tuples in grouped]
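One thing to be aware of: values.sort(...) sorts the caller's list in place. If that side effect is unwanted, sorted() returns a new list instead. A sketch under that assumption follows; group_copy and key_selector are hypothetical names used only for this illustration.

```python
import itertools


def key_selector(tup):
    return tup[1]


def group_copy(values, sort=False):
    # sorted() builds a new list, so the caller's list keeps its order.
    if sort:
        values = sorted(values, key=key_selector)
    grouped = itertools.groupby(values, key=key_selector)
    return [(k, ",".join(item for tup in tuples for item in tup[2:]))
            for k, tuples in grouped]


vals = [(1, 'a', 'hii'), (2, 'b', 'laa'), (3, 'a', 'mehh')]
print(group_copy(vals, sort=True))  # [('a', 'hii,mehh'), ('b', 'laa')]
print(vals[1])                      # (2, 'b', 'laa'), original order untouched
```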

Python: referencing previous items in a list loop
