How do I find identical/duplicate elements in a list when each element contains more than one word?

Asked: 2017-10-12 12:21:28

Tags: python python-2.7

For example, I have a list:

lst = ["abc bca","bca abc","cde def"]

I want to treat the elements "abc bca" and "bca abc" as the same (duplicates). What approach should I use?

4 answers:

Answer 0 (score: 3)

>>> [' '.join(j) for j in set(tuple(sorted(i.split())) for i in lst)]
['abc bca', 'cde def']

This works by first splitting each string on whitespace:

>>> [i.split() for i in lst]
[['abc', 'bca'], ['bca', 'abc'], ['cde', 'def']]

Then sort each sublist and convert it to a tuple:

>>> [tuple(sorted(i.split())) for i in lst]
[('abc', 'bca'), ('abc', 'bca'), ('cde', 'def')]

Finally, you can build a set, because the tuples we converted to are hashable (whereas a list is not):

>>> set(tuple(sorted(i.split())) for i in lst)
{('abc', 'bca'), ('cde', 'def')}

The outermost list comprehension just uses join to turn each tuple of sorted words back into a space-separated string.
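
One thing to keep in mind with the one-liner: a set has no defined order, and each result is the sorted form of a string rather than its original spelling. If input order and original spelling matter, a minimal sketch along the following lines may help. The function name `unique_by_words` is hypothetical (it is not part of the answer), and the code is written to run on both Python 2.7 and Python 3.

```python
def unique_by_words(strings):
    """Keep the first string seen for each distinct group of words."""
    seen = set()
    result = []
    for s in strings:
        # Sorting the words gives "abc bca" and "bca abc" the same key.
        key = tuple(sorted(s.split()))
        if key not in seen:
            seen.add(key)
            result.append(s)  # keep the original spelling, in input order
    return result


lst = ["abc bca", "bca abc", "cde def"]
unique = unique_by_words(lst)
# unique == ['abc bca', 'cde def']
```

The key is the same sorted tuple used above; only the bookkeeping differs, so the first occurrence of each group survives unchanged.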

Answer 1 (score: 1)

>>> from collections import Counter
>>> lst = ["abc bca","bca abc","cde def"]
>>> c = Counter(lst)
>>> c
Counter({'abc bca': 1, 'cde def': 1, 'bca abc': 1})
>>> for i in c:
...     if c[i]>1:
...             print i
... 
>>> lst = ["abc","bca","bca","abc","cde","def"]
>>> c = Counter(lst)
>>> for i in c:
...     if c[i]>1:
...             print i
... 
abc
bca
>>> 
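
As the first session above shows, counting the raw strings gives every entry a count of 1, because "abc bca" and "bca abc" are different strings. One possible adaptation is to count an order-independent key instead of the string itself. This is a sketch, not the answerer's code: the helper name `word_key` is an assumption introduced here, and the snippet avoids `print` so that it runs on both Python 2.7 and Python 3.

```python
from collections import Counter


def word_key(s):
    # Hypothetical helper: the words of s in sorted order, as a hashable tuple.
    return tuple(sorted(s.split()))


lst = ["abc bca", "bca abc", "cde def"]

# Count normalized keys rather than the raw strings.
counts = Counter(word_key(s) for s in lst)

# Keys seen more than once correspond to duplicated elements.
duplicates = [" ".join(key) for key, n in counts.items() if n > 1]
# duplicates == ['abc bca']
```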

Answer 2 (score: 0)

You can convert your strings into sets of words:

>>> lst = ["abc bca","bca abc","cde def"]
>>> new_lst = [frozenset(x.split(' ')) for x in lst]

Then you can use any method of finding duplicates in a list, for example:

>>> import collections
>>> print [item for item, count in collections.Counter(new_lst).items() if count > 1]
[frozenset(['abc', 'bca'])]
>>>
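
Two details of the frozenset approach may be worth checking against your data: a set ignores repeated words, and the result above shows the word set rather than the original strings. The sketch below illustrates both points under stated assumptions. The function name `group_by_words` is hypothetical, and it uses a sorted tuple as the key (so repeated words still count) instead of the frozenset from the answer. It runs on Python 2.7 and Python 3.

```python
from collections import defaultdict

# A frozenset drops repeated words; a sorted list keeps them.
same_as_set = frozenset("abc abc bca".split()) == frozenset("abc bca".split())
same_as_sorted = sorted("abc abc bca".split()) == sorted("abc bca".split())
# same_as_set is True, same_as_sorted is False


def group_by_words(strings):
    """Return lists of original strings that share the same words."""
    groups = defaultdict(list)
    for s in strings:
        groups[tuple(sorted(s.split()))].append(s)
    # Only groups with more than one member are duplicates.
    return [members for members in groups.values() if len(members) > 1]


lst = ["abc bca", "bca abc", "cde def"]
duplicate_groups = group_by_words(lst)
# duplicate_groups == [['abc bca', 'bca abc']]
```

Which key to use depends on whether "abc abc bca" should count as a duplicate of "abc bca"; the question does not say.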

Answer 3 (score: 0)

I'm not sure what you mean by "treat as the same", but if you want to get back a set of "unique" items, you can use this approach:

original_list = ["abc bca", "bca abc", "cde def"]
modified_list = []

for original_one_item in original_list:
    original_one_items = original_one_item.split(' ')
    original_one_items.sort()
    modified_list.append(" ".join(original_one_items))

modified_list = set(modified_list)

After sorting, "bca abc" becomes "abc bca", so the set keeps only one copy of it. Note that the result is a set, not a list.