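Suppose you have a list such as:
first_list = ['a', 'b', 'c']
and a second list:
second_list = ['a', 'a b c', 'abc zyx', 'ab cc ac']
How would you write a function that simply reorders the second list according to the total number of times the elements of the first list occur anywhere in each individual string of the second list?
To clarify further:
My attempt: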
first_list = ['a', 'b', 'c']
second_list = ['a', 'a b c', 'abc zyx', 'ab cc ac']
print(second_list)
i = 0
for keyword in first_list:
    matches = 0
    for s in second_list:
        matches += s.count(keyword)
        if matches > second_list[0].count(keyword):
            popped = second_list.pop(i)
            second_list.insert(0, popped)
print(second_list)
Answer 0 (score: 3)
The most straightforward approach is to use the key parameter of the sorted built-in function:
>>> sorted(second_list, key=lambda s: sum(s.count(x) for x in first_list), reverse=True)
['ab cc ac', 'a b c', 'abc zyx', 'a']
The key function is called once for each item in the list being sorted. Since each s.count call takes time linear in the length of the string, this is still inefficient.
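As a rough illustration of how the repeated linear scans could be avoided, here is a minimal sketch that tallies each string's characters once with collections.Counter and then looks up every keyword in that tally. It assumes every keyword is exactly one character long, as in the question's example; it is not a drop-in replacement for multi-character keywords. The helper name char_match_count is hypothetical.

```python
from collections import Counter

first_list = ['a', 'b', 'c']
second_list = ['a', 'a b c', 'abc zyx', 'ab cc ac']

def char_match_count(s, keywords):
    # Tally every character of s in a single pass, then look up each keyword.
    # Assumption: every keyword is a single character.
    tally = Counter(s)
    # A Counter returns 0 for characters that never occurred.
    return sum(tally[k] for k in keywords)

reordered = sorted(second_list,
                   key=lambda s: char_match_count(s, first_list),
                   reverse=True)
print(reordered)
```

For the short lists in the question the difference is negligible; the sketch only matters when the strings are long and there are many keywords.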
Answer 1 (score: 2)
A similar answer:
first_list = ['a', 'b', 'c']
second_list = ['a', 'a b c', 'abc zyx', 'ab cc ac']
# Find occurrences
list_for_sorting = []
for string in second_list:
    occurrences = 0
    for item in first_list:
        occurrences += string.count(item)
    list_for_sorting.append((occurrences, string))
# Sort list
sorted_by_occurrence = sorted(list_for_sorting, key=lambda tup: tup[0], reverse=True)
final_list = [i[1] for i in sorted_by_occurrence]
print(final_list)
['ab cc ac', 'a b c', 'abc zyx', 'a']
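Since the question asks for a function, the same counting and sorting steps could be packaged roughly as follows. This is only a sketch: the names reorder_by_matches and total_matches are made up for illustration, and the function returns a new list rather than modifying its argument.

```python
def reorder_by_matches(keywords, strings):
    """Return a new list of strings, those with the most keyword matches first."""
    def total_matches(s):
        # str.count returns the number of non-overlapping occurrences.
        return sum(s.count(k) for k in keywords)
    return sorted(strings, key=total_matches, reverse=True)

first_list = ['a', 'b', 'c']
second_list = ['a', 'a b c', 'abc zyx', 'ab cc ac']
print(reorder_by_matches(first_list, second_list))
```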
Answer 2 (score: 1)
Here is an approach that is not stable:
>>> l1 = ['a', 'b', 'c']
>>> l2 = ['a', 'a b c', 'abc zyx', 'ab cc ac']
>>> [s for _, s in sorted(((sum(s2.count(s1) for s1 in l1), s2) for s2 in l2), reverse=True)]
['ab cc ac', 'abc zyx', 'a b c', 'a']
If you need a stable sort, you can use enumerate:
>>> l1 = ['a', 'b', 'c']
>>> l2 = ['a', 'a b c', 'ccc ccc', 'bb bb bb', 'aa aa aa']
>>> [x[-1] for x in sorted(((sum(s2.count(s1) for s1 in l1), -i, s2) for i, s2 in enumerate(l2)), reverse=True)]
['ccc ccc', 'bb bb bb', 'aa aa aa', 'a b c', 'a']
The above generates tuples in which the second item is a string from l2 and the first item is the number of matches against l1:
>>> tuples = [(sum(s2.count(s1) for s1 in l1), s2) for s2 in l2]
>>> tuples
[(1, 'a'), (3, 'a b c'), (3, 'abc zyx'), (6, 'ab cc ac')]
These tuples are then sorted in descending order:
>>> tuples = sorted(tuples, reverse=True)
>>> tuples
[(6, 'ab cc ac'), (3, 'abc zyx'), (3, 'a b c'), (1, 'a')]
Finally, only the strings are taken:
>>> [s for _, s in tuples]
['ab cc ac', 'abc zyx', 'a b c', 'a']
In the second version, the tuples carry the negated index to ensure stability:
>>> [(sum(s2.count(s1) for s1 in l1), -i, s2) for i, s2 in enumerate(l2)]
[(1, 0, 'a'), (3, -1, 'a b c'), (3, -2, 'abc zyx'), (6, -3, 'ab cc ac')]
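For comparison, a small sketch that checks the negated-index version against a plain key-based sort on the list with ties. It relies on the documented behaviour that sorted is stable and that reverse=True still keeps items with equal keys in their original order. The helper name matches is hypothetical.

```python
l1 = ['a', 'b', 'c']
l2 = ['a', 'a b c', 'ccc ccc', 'bb bb bb', 'aa aa aa']

def matches(s):
    # Total occurrences of every keyword from l1 in s.
    return sum(s.count(k) for k in l1)

# Tuple-based version: ties on the count are broken by the negated index.
by_tuples = [x[-1] for x in sorted(((matches(s), -i, s) for i, s in enumerate(l2)), reverse=True)]

# Key-based version: only the count is compared, so tied strings keep their input order.
by_key = sorted(l2, key=matches, reverse=True)

print(by_tuples == by_key)
```

Both variants keep the three strings with six matches in their input order, so the key-based form may be the simpler choice when stability is the only goal.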