I have a list of elements to process, and basically I want to achieve this:
input:
my_list = ['Gold Trophy (January)', 'Gold Trophy (February)', 'Bronze Trophy (March)']
output:
['Gold Trophy x2', 'Bronze Trophy (March)']
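When several elements share a common string (as with Gold Trophy here), I want to remove those elements and replace them with a single new element that reads "Gold Trophy x(number of duplicates)".
This is what I have so far: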
my_list = ['Gold Trophy (January)', 'Gold Trophy (February)', 'Bronze Trophy (March)']

# function to count how many elements contain myString
def countX(my_list, myString):
    count = 0
    for ele in my_list:
        if myString in ele:
            count = count + 1
    return count

myString = 'Gold Trophy'
real_count = countX(my_list, myString)
print(*my_list, sep=', ')
print('duplicates = ' + str(real_count))
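At this point the code runs and returns the number of list elements that contain the specified string. Any ideas on how to get from here to the desired output? Thanks!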
Answer 0 (score: 1)
This should work without using regular expressions. I have added comments for clarity.
from collections import Counter

my_list = ['Gold Trophy (January)', 'Gold Trophy (February)', 'Bronze Trophy (March)']
output_ls = []
trophy_ls = []
month_ls = []
trophy_cnt_dc = {}

for item in my_list:
    trophy_ls.append(item.split(' (')[0])
    month_ls.append(item.split(' (')[1])
# print(trophy_ls) >> ['Gold Trophy', 'Gold Trophy', 'Bronze Trophy']
# print(month_ls) >> ['January)', 'February)', 'March)']

trophy_cnt_dc = dict(Counter(trophy_ls))
# print(trophy_cnt_dc) >> {'Gold Trophy': 2, 'Bronze Trophy': 1}

for k, v in trophy_cnt_dc.items():
    if v > 1:
        output_ls.append(k + ' x' + str(v))
    else:
        ind = trophy_ls.index(k)
        output_ls.append(k + ' (' + month_ls[ind])

print(output_ls)
Output:
['Gold Trophy x2', 'Bronze Trophy (March)']
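The same idea can be wrapped in a reusable function. This is only a minimal sketch: the function name collapse_duplicates and its sep parameter are hypothetical, and it assumes every name is the text before the first ' (' in an item. Unique items are kept verbatim, and results follow the order in which each name first appears.

```python
from collections import Counter

def collapse_duplicates(items, sep=' ('):
    """Collapse items sharing the same name (text before `sep`) into 'name xN'."""
    # Name of each item: everything before the first occurrence of sep.
    names = [item.split(sep)[0] for item in items]
    counts = Counter(names)

    result = []
    seen = set()
    for item, name in zip(items, names):
        if name in seen:
            continue  # this name has already been emitted
        seen.add(name)
        # Repeated names become 'name xN'; unique items are kept unchanged.
        result.append(f'{name} x{counts[name]}' if counts[name] > 1 else item)
    return result

my_list = ['Gold Trophy (January)', 'Gold Trophy (February)', 'Bronze Trophy (March)']
print(collapse_duplicates(my_list))
# ['Gold Trophy x2', 'Bronze Trophy (March)']
```

Keeping the original item for unique names avoids having to split off the month and glue it back on.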
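Answer 1 (score: 0)
Here is a solution (see the comments for clarification). Note that I use a small trick to separate the name from the date: I split on '(' and then add it back where needed. It could be made cleaner, but it is not clear that this is necessary.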
my_list = ['Gold Trophy (January)', 'Gold Trophy (February)', 'Bronze Trophy (March)']

# Create a list of tuples: (name, date)
# name keeps its trailing space and date keeps its closing ')'
pairs = [tuple(x.split('(')) for x in my_list]

# count the number of each name
counts = dict()
for (name, date) in pairs:
    counts[name] = counts.get(name, 0) + 1

# create a dictionary from initial list
# it doesn't matter how collisions are resolved
# the dictionary is required to process each name only once
init = dict(pairs)

res = []
# for each name:
# if count is > 1, append the name with its count
# if count is 1, append the name with its date
for (name, date) in init.items():
    if counts[name] > 1:
        # name already ends with a space, so this yields 'Gold Trophy x2'
        res.append(name + 'x' + str(counts[name]))
    else:
        res.append(name + '(' + date)

print(res)
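One possible way to make the split cleaner is str.partition, which also copes with items that have no parenthesised part. The sketch below is an illustration rather than the answer's own method: the function name summarize is hypothetical, and it relies on dictionaries preserving insertion order (Python 3.7+).

```python
def summarize(items):
    """Group items by name and collapse groups larger than one into 'name xN'."""
    # Map each name to the original items carrying it, in first-seen order.
    groups = {}
    for item in items:
        # partition returns (before, separator, after); when ' (' is absent,
        # 'before' is the whole item and the other two parts are empty.
        name, _, _ = item.partition(' (')
        groups.setdefault(name, []).append(item)

    # A group with one member keeps its original text; larger groups are counted.
    return [f'{name} x{len(members)}' if len(members) > 1 else members[0]
            for name, members in groups.items()]

my_list = ['Gold Trophy (January)', 'Gold Trophy (February)', 'Bronze Trophy (March)']
print(summarize(my_list))
# ['Gold Trophy x2', 'Bronze Trophy (March)']
```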