I want to shorten a list that contains repeated elements by stating how many times a pair of 2 elements repeats.
list1 = ["New York", "California", "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Indiana"]
repetitives = []
for num, element in enumerate(list1):
    if element == list1[num - 2]:
        repetitives.append(element)
core_repetitives = repetitives[0:2]
string_repetitives = ",".join(repetitives)
string_core_repetitives = ",".join(core_repetitives)
repetitives_times = string_repetitives.count(string_core_repetitives)
string_list1 = ",".join(list1)
print(string_list1.replace(string_repetitives, "(" + "-".join(core_repetitives) + ") " + str(repetitives_times) + " times"))
The output is:
New York,California,(Illinois-Texas) 4 times,Illinois,Texas,Indiana
Obviously it misses one repetition.
The problem is that the list "repetitives" does not capture the right part, starting at the line "if element == list1[num - 2]:".
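To make the miss visible, here is a small diagnostic sketch (the name matched is only for illustration). It lists which indices pass the comparison. The first Illinois/Texas pair sits at indices 2 and 3, where it is compared against "New York" and "California", so it is never collected. Note also that for num 0 and 1 the original loop reads list1[-2] and list1[-1], which wraps around to the end of the list; the sketch starts at index 2 to avoid that.

```python
list1 = ["New York", "California", "Illinois", "Texas", "Illinois", "Texas",
         "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Indiana"]

# Start at index 2 so that list1[num - 2] never wraps around to the end.
matched = [num for num in range(2, len(list1)) if list1[num] == list1[num - 2]]

print(matched)       # indices whose element equals the one two places back
print(len(matched))  # number of collected elements; pairs = this value // 2
```

Only indices 4 through 11 match, which is 8 elements or 4 pairs, one pair short of the 5 that are actually in the list.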
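How can I correctly detect that "Illinois - Texas" is repeated 5 times?
Thanks.
Related question
The question above deals with a known pair of 2 elements. But what if the repeating part is a combination of unknown length?
For example: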
list2 = ["New York", "California", "Illinois", "Texas", "Indiana", "Ohio", "North Carolina", "Washington", "Illinois", "Texas", "Indiana", "Ohio", "North Carolina", "Washington", "Colorado", "Michigan"]
How can I tell that ["Illinois", "Texas", "Indiana", "Ohio", "North Carolina", "Washington"] is repeated twice here?
Thanks again.
Answer 0 (score: 1)
Here is how I would implement your code:
from collections import OrderedDict
def repeats(lst):
    # Every element that occurs more than once, in original order.
    return [el for el in lst if lst.count(el) > 1]

def shorten(lst):
    repeat_els = repeats(lst)
    new_lst = [el for el in lst if el not in repeat_els]
    repeats_str = '-'.join(repeat_els)
    # De-duplicate while keeping order to get the core repeating block.
    core_repeats = '-'.join(list(OrderedDict.fromkeys(repeat_els)))
    repeat_times = repeats_str.count(core_repeats)
    first_repeat_index = lst.index(repeat_els[0])
    repeats_str = '({}) {}'.format(core_repeats, repeat_times)
    # Put the summary where the first repeated element used to be.
    new_lst.insert(first_repeat_index, repeats_str)
    return ','.join(new_lst)
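In summary: the code above first separates the repeated and non-repeated elements into two lists. It then formats the repeated elements into the desired string, inserts that string at the correct position in the list of non-repeated elements, and finally joins the whole list with ','.join.
Here is a demo: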
>>> list1 = ["New York", "California", "Illinois",
... "Texas", "Illinois", "Texas", "Illinois",
... "Texas", "Illinois", "Texas", "Illinois",
... "Texas", "Indiana"]
>>>
>>> shorten(list1)
'New York,California,(Illinois-Texas) 5,Indiana'
>>>
>>> list2 = ["New York", "California", "Illinois",
... "Texas", "Indiana", "Ohio",
... "North Carolina", "Washington", "Illinois",
... "Texas", "Indiana", "Ohio",
... "North Carolina", "Washington", "Colorado",
... "Michigan"]
>>> shorten(list2)
'New York,California,(Illinois-Texas-Indiana-Ohio-North Carolina-Washington) 2,Colorado,Michigan'
>>>
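One caveat: shorten relies on lst.count, so it assumes the repeated elements form a single contiguous block. A value that merely appears twice in unrelated places would be treated as a repeat as well. As an alternative, here is a minimal sketch that only collapses blocks repeated back to back. The function name compress_repeats and its parameters are my own, and picking the shortest repeating block at each position is just one possible rule, not the only reasonable one.

```python
def compress_repeats(items, sep=",", joiner="-"):
    """Collapse back-to-back repeated blocks into '(a-b) N times' entries."""
    out = []
    i = 0
    n = len(items)
    while i < n:
        collapsed = False
        # Try the shortest candidate block first.
        for size in range(1, (n - i) // 2 + 1):
            block = items[i:i + size]
            reps = 1
            # Count how many times the block repeats immediately after itself.
            while items[i + reps * size:i + (reps + 1) * size] == block:
                reps += 1
            if reps > 1:
                out.append("({}) {} times".format(joiner.join(block), reps))
                i += reps * size
                collapsed = True
                break
        if not collapsed:
            # No repeating block starts here; keep the element as-is.
            out.append(items[i])
            i += 1
    return sep.join(out)


list1 = ["New York", "California", "Illinois", "Texas", "Illinois", "Texas",
         "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Indiana"]
print(compress_repeats(list1))
# New York,California,(Illinois-Texas) 5 times,Indiana
```

Because the block size is searched rather than fixed at 2, the same call should also handle list2 from the related question. The nested scan is not efficient for long lists, so treat it as an illustration rather than a tuned solution.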
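Answer 1 (score: 0)
I thought of a way to manipulate the first attempt so that the output looks better...
It is clumsy and not really a proper technique.
Even though the output looks right, the approach is actually wrong: it counts every (Illinois-Texas) as an extra wherever it appears, when it should only count the (Illinois-Texas) occurrence that the first attempt missed.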
list1 = ["New York", "California", "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Indiana"]
repetitives = []
for num, element in enumerate(list1):
    if element == list1[num - 2]:
        repetitives.append(element)
core_repetitives = repetitives[0:2]
string_repetitives = ",".join(repetitives)
string_core_repetitives = ",".join(core_repetitives)
repetitives_times = string_repetitives.count(string_core_repetitives)
string_list1 = ",".join(list1)
first_try = string_list1.replace(string_repetitives, "(" + "-".join(core_repetitives) + ") " + str(repetitives_times) + " times")
# Count the pairs left over after the first replacement and add them on.
extra_count = first_try.count(string_core_repetitives)
actual_times = repetitives_times + extra_count
second_try = string_list1.replace(string_repetitives, "(" + "-".join(core_repetitives) + ") " + str(actual_times) + " times")
# Remove the leftover pairs and tidy up the doubled commas.
print(second_try.replace(string_core_repetitives, "").replace(",,", ","))
The output is:
New York,California,(Illinois-Texas) 5 times,Indiana
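To illustrate the flaw, here is the same logic wrapped in a function (the name second_try_shorten is hypothetical, used only for this sketch) and run on a list where one more Illinois/Texas pair appears later, separated from the main run by "Indiana".

```python
def second_try_shorten(lst):
    # Same steps as the snippet above, condensed into a function.
    repetitives = [el for num, el in enumerate(lst) if el == lst[num - 2]]
    core = repetitives[0:2]
    joined = ",".join(repetitives)
    core_str = ",".join(core)
    times = joined.count(core_str)
    text = ",".join(lst)
    label = "(" + "-".join(core) + ") "
    first_try = text.replace(joined, label + str(times) + " times")
    # Every leftover pair is added to the count, wherever it occurs.
    actual = times + first_try.count(core_str)
    second = text.replace(joined, label + str(actual) + " times")
    return second.replace(core_str, "").replace(",,", ",")


list1 = ["New York", "California", "Illinois", "Texas", "Illinois", "Texas",
         "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Indiana"]
print(second_try_shorten(list1))
print(second_try_shorten(list1 + ["Illinois", "Texas"]))
```

With the extra pair appended, the result reports 6 times and drops the trailing pair entirely (leaving a dangling comma), even though only 5 of the pairs are consecutive.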
Answer 2 (score: -1)
This maps each of your words to the number of times it occurs in the list
from collections import Counter
occurrences = Counter(list1)
Then you can build a new map from it, grouping the words by their count
sublists = {}
# items() works on Python 2 and 3; iteritems() exists only on Python 2.
for k, v in occurrences.items():
    sublists.setdefault(v, []).append(k)
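For reference, a self-contained sketch of what this grouping produces for list1, assuming Python 3.7 or later, where Counter and dict keep insertion order:

```python
from collections import Counter

list1 = ["New York", "California", "Illinois", "Texas", "Illinois", "Texas",
         "Illinois", "Texas", "Illinois", "Texas", "Illinois", "Texas", "Indiana"]

occurrences = Counter(list1)

# Group the words by how often they occur.
sublists = {}
for name, count in occurrences.items():
    sublists.setdefault(count, []).append(name)

print(sublists)
```

Illinois and Texas end up together under the key 5 and the remaining states under the key 1. Keep in mind that this only counts total occurrences. It does not check that the repeats are adjacent or that they form an ordered block, so on its own it does not fully answer the question.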