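I have a list of strings. The strings are long. I want to remove strings that share the same first 10 (or more) characters, keeping only one string from each such group. For example: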
lst = ['I am going today to London', 'I am going today to Tokyo', 'My name is name']
should give,
lst = ['I am going today to Tokyo', 'My name is name']
Any one of the matching strings may be kept. How can I do this efficiently?
Answer 0: (score: 1)
A solution using a set object:
lst = ['I am going today to London', 'I am going today to Tokyo', 'My name is name']
s10 = set()
result = []
for l in lst:
    if l[0:10] not in s10:
        result.append(l)
    s10.add(l[0:10])
print(result)
Output:
['I am going today to London', 'My name is name']
l[0:10] not in s10 - tests the first 10 characters of the string, l[0:10], for non-membership in the set s10 (s10 is populated with the unique 10-character sequences seen so far).
Answer 1: (score: 0)
def pick_one_from_each_equivalence_class(original_list, equivalence_key_function):
    """Return the first value seen for each distinct key, in original order."""
    res = []
    equivalence_classes = set()  # keys already represented in res
    for value in original_list:
        equivalence_class = equivalence_key_function(value)
        if equivalence_class not in equivalence_classes:
            res.append(value)
            equivalence_classes.add(equivalence_class)
    return res
pick_one_from_each_equivalence_class(lst, lambda x: x[:10])
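To make the behaviour of the call above concrete, here is a minimal self-contained sketch. The helper is restated under the shorter, hypothetical name `pick_one_per_key` so the snippet runs on its own, and the case-insensitive key in the second call is an illustrative assumption, not something the question asks for.

```python
def pick_one_per_key(items, key):
    """Keep the first item seen for each distinct key(item), preserving order."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


lst = ['I am going today to London', 'I am going today to Tokyo', 'My name is name']

# Key on the first 10 characters, as in the question.
first_by_prefix = pick_one_per_key(lst, lambda s: s[:10])
print(first_by_prefix)  # ['I am going today to London', 'My name is name']

# Hypothetical variation: compare prefixes case-insensitively.
mixed = ['HELLO WORLD, one', 'hello world, two', 'other']
case_insensitive = pick_one_per_key(mixed, lambda s: s[:10].lower())
print(case_insensitive)  # ['HELLO WORLD, one', 'other']
```

Passing the key as a function keeps the de-duplication loop unchanged when the notion of "same" changes; only the lambda differs between the two calls.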
Answer 2: (score: 0)
I would use a dictionary to get uniqueness:
list(dict((x[:10], x) for x in lst).values())
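A short sketch of the dictionary idea, under the assumption that the interpreter is Python 3.7 or later, where dicts preserve insertion order. When a dict is built from (prefix, string) pairs, a later string overwrites an earlier one with the same prefix, so the last string of each group survives; `dict.setdefault` can be used instead if the first one should be kept. The variable names are illustrative.

```python
lst = ['I am going today to London', 'I am going today to Tokyo', 'My name is name']

# Later strings overwrite earlier ones with the same 10-character prefix,
# so the last string of each group is kept.
last_wins = list({s[:10]: s for s in lst}.values())
print(last_wins)  # ['I am going today to Tokyo', 'My name is name']

# setdefault stores a value only if the key is not present yet,
# so the first string of each group is kept.
by_prefix = {}
for s in lst:
    by_prefix.setdefault(s[:10], s)
first_wins = list(by_prefix.values())
print(first_wins)  # ['I am going today to London', 'My name is name']
```

Since the question allows any one string per group to be kept, either variant satisfies it; the first matches the expected output shown in the question.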