For example, I have a list:
lst = ["abc bca","bca abc","cde def"]
I want to treat the elements "abc bca" and "bca abc" as the same (duplicates). What approach should I use?
Answer 0 (score: 3)
>>> [' '.join(j) for j in set(tuple(sorted(i.split())) for i in lst)]
['abc bca', 'cde def']
This works by first splitting each string on whitespace
>>> [i.split() for i in lst]
[['abc', 'bca'], ['bca', 'abc'], ['cde', 'def']]
then sorting each sublist and converting it to a tuple
>>> [tuple(sorted(i.split())) for i in lst]
[('abc', 'bca'), ('abc', 'bca'), ('cde', 'def')]
Finally, you can build a set from the results, because the tuples we converted to are hashable (whereas a list is not).
>>> set(tuple(sorted(i.split())) for i in lst)
{('abc', 'bca'), ('cde', 'def')}
The outermost list comprehension simply uses join to rebuild a space-separated string from each tuple of sorted words.
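One thing to keep in mind is that a set does not preserve the order of the input, and the joined result contains the sorted words rather than the original string. If that matters, a minimal sketch along the following lines keeps the first string seen for each group of words. The function name dedupe_ignoring_word_order is made up for illustration and is not part of the answer above.

```python
def dedupe_ignoring_word_order(strings):
    """Keep the first string seen for each distinct collection of words."""
    seen = set()
    result = []
    for s in strings:
        # A sorted tuple of words is hashable and ignores word order.
        key = tuple(sorted(s.split()))
        if key not in seen:
            seen.add(key)
            result.append(s)
    return result


lst = ["abc bca", "bca abc", "cde def"]
deduped = dedupe_ignoring_word_order(lst)
print(deduped)  # ['abc bca', 'cde def']
```

Because the original strings are appended unchanged, an input of ["bca abc", "abc bca"] would yield ["bca abc"] rather than the sorted form.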
Answer 1 (score: 1)
>>> from collections import Counter
>>> lst = ["abc bca","bca abc","cde def"]
>>> c = Counter(lst)
>>> c
Counter({'abc bca': 1, 'cde def': 1, 'bca abc': 1})
>>> for i in c:
...     if c[i] > 1:
...         print(i)
...
>>> lst = ["abc","bca","bca","abc","cde","def"]
>>> c = Counter(lst)
>>> for i in c:
...     if c[i] > 1:
...         print(i)
...
abc
bca
>>>
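Note that counting the raw strings, as in the first session above, reports nothing, because "abc bca" and "bca abc" are different strings. One possible way to make Counter useful for the original list is to count an order-independent key instead. This is only a sketch, and the helper word_key is a hypothetical name introduced here.

```python
from collections import Counter

lst = ["abc bca", "bca abc", "cde def"]


def word_key(s):
    # Sorted tuple of words: identical for "abc bca" and "bca abc".
    return tuple(sorted(s.split()))


counts = Counter(word_key(s) for s in lst)

# Original strings whose word collection occurs more than once.
duplicates = [s for s in lst if counts[word_key(s)] > 1]
print(duplicates)  # ['abc bca', 'bca abc']
```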
Answer 2 (score: 0)
You can convert each string into a set of words:
>>> lst = ["abc bca","bca abc","cde def"]
>>> new_lst = [frozenset(x.split(' ')) for x in lst]
Then you can use any method of finding duplicates in a list:
>>> import collections
>>> print([item for item, count in collections.Counter(new_lst).items() if count > 1])
[frozenset(['abc', 'bca'])]
>>>
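A caveat worth checking against your data: a frozenset discards repeated words, so strings that differ only in how often a word appears compare as equal. The short sketch below contrasts that with a sorted-tuple key, using made-up example strings.

```python
a = "abc abc bca"
b = "abc bca"

# Sets ignore repetition, so both strings collapse to {'abc', 'bca'}.
same_as_sets = frozenset(a.split()) == frozenset(b.split())

# Sorted tuples keep every word, so the extra "abc" makes them differ.
same_as_sorted = tuple(sorted(a.split())) == tuple(sorted(b.split()))

print(same_as_sets)    # True
print(same_as_sorted)  # False
```

Which behaviour is right depends on whether repeated words should count as a difference.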
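Answer 3 (score: 0)
I'm not sure what you mean by "I want to treat the elements as the same", but if you want to get back a set of "unique" items, you could use this approach: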
original_list = ["abc bca", "bca abc", "cde def"]
modified_list = []
for original_one_item in original_list:
    # Split into words, sort them, and rejoin so word order no longer matters.
    original_one_items = original_one_item.split(' ')
    original_one_items.sort()
    modified_list.append(" ".join(original_one_items))
modified_list = set(modified_list)
This drops "bca abc" as a duplicate of "abc bca" and leaves modified_list as a set.