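I have a list of strings in which the first part of each string is a substring shared with other elements of the list. My goal is to find all similar strings, i.e. the elements containing the 'ID_1' substring, group them together, and then add up their respective values after the '='.
Example: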
start_list = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
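I have tried iterating over start_list with for loops, creating various nested lists, and even using a dictionary, but I keep going around in circles.
I know there is an elegant solution somewhere.
My expected output is: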
ID_1 = 6
ID_2 = 15
Thanks in advance!
Answer 0 (score: 1)
You can do this elegantly with groupby from itertools:
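from itertools import groupby

l = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
# Split each 'key=value' string and sort, so that equal keys end up adjacent
l_2 = sorted(x.split('=') for x in l)
# Sum the integer values within each group of equal keys
ans = [(k, sum(int(y) for x, y in g))
       for k, g in groupby(l_2, key=lambda x: x[0])]
for key, value in ans:
    print(key, '=', value)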
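Other elegant solutions could use defaultdict or reduce.
Note that this is an O(n log n) solution, because groupby only groups adjacent items and the list therefore has to be sorted first.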
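To see why the sort matters, here is a minimal sketch with a deliberately unsorted, made-up input. The helper name sum_by_key is an assumption for illustration and is not part of the answer above.

```python
from itertools import groupby

# Hypothetical, deliberately unsorted input for illustration
items = ['ID_1=1', 'ID_2=4', 'ID_1=2']
pairs = [x.split('=') for x in items]

def sum_by_key(pairs):
    # groupby only merges *consecutive* items that share a key
    return [(k, sum(int(v) for _, v in g))
            for k, g in groupby(pairs, key=lambda p: p[0])]

unsorted_result = sum_by_key(pairs)
sorted_result = sum_by_key(sorted(pairs))
print(unsorted_result)  # [('ID_1', 1), ('ID_2', 4), ('ID_1', 2)]
print(sorted_result)    # [('ID_1', 3), ('ID_2', 4)]
```

Without sorting, 'ID_1' shows up as two separate groups, so the totals are wrong.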
Answer 1 (score: 1)
You can use defaultdict. I find it the most compact and the most correct variant.
Code:
from collections import defaultdict

start_list = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
d = defaultdict(int)  # missing keys start at 0
lst = [item.split('=') for item in start_list]
for k, v in lst:
    d[k] += int(v)
print(d.items())
Output:
dict_items([('ID_1', 6), ('ID_2', 15)])
You can iterate over d.items() to print the data in the desired format.
Code:
for k, v in d.items():
    print(f"{k}={v}")
Output:
ID_1=6
ID_2=15
Answer 2 (score: 1)
You can use collections.Counter to keep track of the sums. If you like, combined with functools.reduce, it can even be turned into a one-liner:
>>> from functools import reduce
>>> from collections import Counter
>>> start_list = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
>>> reduce(lambda c, x: c.update({x[0]: int(x[1])}) or c,
...        (x.split("=") for x in start_list), Counter())
Counter({'ID_2': 15, 'ID_1': 6})
(Here, the trailing "or c" makes the lambda return c rather than the result of update, which is None. The Counter repr lists the largest total first.)
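The "or c" idiom can be checked in isolation. This is a small sketch with made-up values, only meant to show the two facts the one-liner relies on:

```python
from collections import Counter

c = Counter()
# Counter.update adds to the existing counts and returns None
result = c.update({'ID_1': 1})
print(result)  # None

# None is falsy, so `None or c` evaluates to c itself
same = c.update({'ID_1': 2}) or c
print(same is c)  # True
print(same)       # Counter({'ID_1': 3})
```

Because update mutates the Counter in place, reduce keeps passing the same object along as the accumulator.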
Answer 3 (score: 0)
start_list = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
totals = {}  # not named dict, which would shadow the built-in type
for item in start_list:
    k, v = item.split('=')
    if k in totals:
        totals[k] += int(v)
    else:
        totals[k] = int(v)
print(totals)  # {'ID_1': 6, 'ID_2': 15}
for key, val in totals.items():
    print("{} = {}".format(key, val))
Output:
ID_1 = 6
ID_2 = 15
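Answer 4 (score: 0)
Considering that this is your first question, my approach aims to be as simple and straightforward as possible, with plenty of comments at every step to explain the details.
More sophisticated or more Pythonic code would be a better solution, but it might leave you with code that you cannot easily understand or customize.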
start_list = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
print(start_list)
# Here I am preparing an empty dictionary to store the counted keys and values
counted = {}
# Now I iterate through every string in start_list
for item in start_list:
    # As 1st thing I will use split method to separate the current_key
    current_key = item.split("=")[0]
    # and the current value.
    current_value = int(item.split("=")[1])
    # Then I check if current_key (e.g. ID_1) is present in the
    # counted dictionary using "in"
    if current_key in counted:
        # If the key is present I update its value with the sum
        # of its old value + new one
        counted[current_key] = current_value + counted[current_key]
    else:
        # If the key doesn't exist it means that we are adding it
        # to the counted dictionary for the 1st time
        counted[current_key] = current_value
# Job is done!
print(counted)
# It is now easy to iterate through counted dict for further manipulation
# for example let's print the number of hits for ID_1
# You can use items() to enumerate keys and values in a dictionary
for key, value in counted.items():
    if key == "ID_1":
        print("Found ID_1 value: " + str(value))
# To obtain the output in your requirement
for key in counted.keys():
    print('%s = %d' % (key, counted[key]))
If you want to learn more about how the split method works, see the following example:
https://www.w3schools.com/python/ref_string_split.asp
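In the other answers you will find more concise ways to obtain this result.
So, to improve on the code I wrote, I suggest reading more about list comprehensions here: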
https://www.pythonforbeginners.com/basics/list-comprehensions-in-python
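As a rough illustration of what a list comprehension buys you here, the sketch below builds the same list of (key, value) pairs twice, first with an explicit loop and then with a comprehension. The variable names pairs_loop and pairs are made up for this example.

```python
start_list = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']

# Explicit loop: split each string and collect (key, int value) tuples
pairs_loop = []
for item in start_list:
    key, value = item.split("=")
    pairs_loop.append((key, int(value)))

# Equivalent list comprehension over the split strings
pairs = [(key, int(value))
         for key, value in (item.split("=") for item in start_list)]

print(pairs == pairs_loop)  # True
```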
Happy hacking!
Answer 5 (score: 0)
You can use a list comprehension plus a dictionary comprehension:
start_list = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
l = [i.split('=') for i in start_list]
d = dict(l)
print({k:sum([int(i[1]) for i in l if i[0] == k]) for k,v in d.items()})
Output:
{'ID_1': 6, 'ID_2': 15}
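Answer 6 (score: 0)
If you can be sure that the data always has the same format, you can iterate over the list and build a dictionary to hold the results: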
start_list = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
result = {}
for item in start_list:
    key, value = item.split('=')
    # Start from 0 if the key is not in 'result' yet, otherwise add to the running total
    result[key] = result.get(key, 0) + int(value)
print(result)  # {'ID_1': 6, 'ID_2': 15}
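If the format cannot be guaranteed, a more defensive variant is possible. The following is only a sketch: the function name sum_by_id, the sample input, and the choice to silently skip malformed entries are all assumptions made for illustration.

```python
def sum_by_id(items):
    """Sum integer values per key, skipping entries that are not 'key=int'."""
    result = {}
    for item in items:
        key, sep, value = item.partition('=')
        if not sep:
            continue  # no '=' in this entry
        try:
            number = int(value)
        except ValueError:
            continue  # the part after '=' is not an integer
        result[key] = result.get(key, 0) + number
    return result

# Hypothetical input containing two malformed entries
totals = sum_by_id(['ID_1=1', 'ID_1=2', 'broken', 'ID_2=x', 'ID_2=4'])
print(totals)  # {'ID_1': 3, 'ID_2': 4}
```

Depending on the use case, raising an error or logging the bad entry may be preferable to skipping it.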
Answer 7 (score: 0)
You can do the following:
l = ['ID_1=1', 'ID_1=2', 'ID_1=3', 'ID_2=4', 'ID_2=5', 'ID_2=6']
def calculate_score_byid(items):
    '''takes a list of items and adds up scores. returns a dictionary of scores'''
    d = dict()
    for i in items:
        key, value = i.split('=')
        if key not in d:
            d[key] = int(value)
        else:
            d[key] += int(value)
    return d
# Keep the returned dictionary; d inside the function is local
d = calculate_score_byid(l)
for key in d.keys():
    print('%s = %d' % (key, d[key]))
Output:
ID_1 = 6
ID_2 = 15