Merge the first elements of a list of tuples into a tuple if the second elements are the same?

Date: 2017-02-03 17:00:20

Tags: python nlp nltk

I have a list of tuples:

[('Donald', 'PERSON'), ('Trump', 'PERSON'), ('enters', 'O'), ('the', 'O'), ('White', 'LOCATION'), ('House', 'LOCATION')]

The output I want is:

[('Donald Trump'), ('enters the'), ('White House')]

The code below gets me closer to the result I want, but I am not yet familiar with the groupby function.

from itertools import groupby

mergedTags = []
for tag, chunk in groupby(tagList, lambda x: x[1]):
    if tag != "O":
        tagMerged = " ".join(w for w, t in chunk)
        mergedTags.extend([tagMerged])
    else:
        #tagMerged = " ".join(t for t, w in chunk)
        for word, chunk in groupby(tagList, lambda x: x[0]):
            mergedTags.extend([word])

print(mergedTags)

1 Answer:

Answer 0: (score: 1)

You can use itertools.groupby together with a list comprehension:

from itertools import groupby
my_list = [('Donald', 'PERSON'), ('Trump', 'PERSON'), ('enters', 'O'), ('the', 'O'), ('White', 'LOCATION'), ('House', 'LOCATION')]

output_list = [tuple(i[0] for i in e) for _, e in groupby(my_list, lambda x: x[1])]
#                 ^ generate the desired tuple

The value held by output_list will be:

[('Donald', 'Trump'), ('enters', 'the'), ('White', 'House')]
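
Note that this result holds tuples of separate words, while the question asked for each group joined into a single string (in Python, ('Donald Trump') with no trailing comma is just a string, not a tuple). If joined strings are the goal, a minimal sketch could look like the following. The names tagged, merged and merged_with_tags are illustrative and not from the original post.

```python
from itertools import groupby
from operator import itemgetter

tagged = [('Donald', 'PERSON'), ('Trump', 'PERSON'), ('enters', 'O'),
          ('the', 'O'), ('White', 'LOCATION'), ('House', 'LOCATION')]

# groupby merges only *consecutive* items that share the same key,
# which is what we want here: each run of identical tags is one chunk.
merged = [" ".join(word for word, _ in group)
          for _, group in groupby(tagged, key=itemgetter(1))]

# Variant that keeps the tag next to each merged chunk.
merged_with_tags = [(" ".join(word for word, _ in group), tag)
                    for tag, group in groupby(tagged, key=itemgetter(1))]

print(merged)
print(merged_with_tags)
```

Because groupby does not sort its input, two PERSON runs separated by other tags stay separate chunks, which is usually the desired behaviour for named-entity output.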