I have a rather ridiculous-looking list like this:
[['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]
I want to turn it into the format ['Type', total, %], like this:
[['Biking',60,'34.7%'],['Gym',50,'28.9%'],['Hiking',36,'20.8%'],['Running',27,'15.6%']]
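I'm sure I'm doing this the hardest way possible. Can anyone point me in a better direction? I've used itertools.groupby before and it looks like it might be a good fit, but I'm not sure how to apply it in this case.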
# TODO: This is totally ridiculous.
running = 0
hiking = 0
gym = 0
biking = 0
no_exercise = 0
for r in exercise_types_l:
    if 'Running' in r[0]:
        running += int(r[1])
    if 'Hiking' in r[0]:
        hiking += int(r[1])
    if 'Gym' in r[0]:
        gym += int(r[1])
    if 'Biking' in r[0]:
        biking += int(r[1])
    if 'None' in r[0]:
        no_exercise += int(r[1])
total = running + hiking + gym + biking + no_exercise
l = list()
l.append(['Running', running, '{percent:.1%}'.format(percent=running/total)])
l.append(['Hiking', hiking, '{percent:.1%}'.format(percent=hiking/total)])
l.append(['Gym', gym, '{percent:.1%}'.format(percent=gym/total)])
l.append(['Biking', biking, '{percent:.1%}'.format(percent=biking/total)])
l.append(['None', no_exercise, '{percent:.1%}'.format(percent=no_exercise/total)])
l = sorted(l, key=lambda r: r[1], reverse=True)
Answer 0 (score: 2)
Given an initial list like
>>> test_list = [['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]
you can first build a defaultdict to sum the values (this gives the second element of each entry in the final result), like
>>> from collections import defaultdict
>>> final_dict = defaultdict(int)
>>> for keys, values in test_list:
...     for elem in keys.split('|'):
...         final_dict[elem] += int(values)
...
>>> final_dict
defaultdict(<type 'int'>, {'Gym': 50, 'Biking': 60, 'Running': 27, 'Hiking': 36})
Then you can use a list comprehension to get the final result.
>>> final_sum = float(sum(final_dict.values()))
>>> [(elem, num, str(num/final_sum)+'%') for elem, num in final_dict.items()]
[('Gym', 50, '0.28901734104%'), ('Biking', 60, '0.346820809249%'), ('Running', 27, '0.156069364162%'), ('Hiking', 36, '0.208092485549%')]
Since you want the results sorted and the percentages formatted, change the final step to:
>>> [(elem, num, '{:.1%}'.format(num/final_sum)) for elem, num in final_dict.items()]
[('Gym', 50, '28.9%'), ('Biking', 60, '34.7%'), ('Running', 27, '15.6%'), ('Hiking', 36, '20.8%')]
>>> from operator import itemgetter
>>> sorted([(elem, num, '{:.1%}'.format(num/final_sum)) for elem, num in final_dict.items()], key = itemgetter(1), reverse=True)
[('Biking', 60, '34.7%'), ('Gym', 50, '28.9%'), ('Hiking', 36, '20.8%'), ('Running', 27, '15.6%')]
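For reference, the steps above can be collapsed into one function. The sketch below is not the answer's code: it assumes Python 3 (so / is true division), swaps defaultdict for collections.Counter so that most_common() handles the descending sort, and uses a hypothetical function name, summarize.

```python
from collections import Counter


def summarize(rows):
    """Return [type, total, percent] rows sorted by total, descending."""
    totals = Counter()
    for types, count in rows:
        # A row like 'Biking|Gym' adds its count to every type it names.
        for name in types.split('|'):
            totals[name] += int(count)
    grand_total = sum(totals.values())
    # most_common() yields (name, total) pairs from largest to smallest total.
    return [[name, total, '{:.1%}'.format(total / grand_total)]
            for name, total in totals.most_common()]


test_list = [['Biking', '10'], ['Biking|Gym', '14'],
             ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]
result = summarize(test_list)
print(result)
```

Returning lists rather than tuples matches the [['Type', total, '%'], ...] shape asked for in the question.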
Answer 1 (score: 1)
You can use collections.defaultdict here. A dict is a better data structure for this because you can access the values associated with any 'Type' in O(1) time.
>>> from collections import defaultdict
>>> lis = [['Biking', '10'], ['Biking|Gym', '14'], ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]
>>> total = 0
>>> dic = defaultdict(lambda :[0])
>>> for keys, val in lis:
...     keys = keys.split('|')
...     val = int(val)
...     total += val*len(keys)
...     for k in keys:
...         dic[k][0] += val
...
>>> for k,v in dic.items():
...     dic[k].append(format(v[0]/float(total), '.2%'))
...
>>> dic
defaultdict(<function <lambda> at 0xb60e772c>,
{'Gym': [50, '28.90%'],
'Biking': [60, '34.68%'],
'Running': [27, '15.61%'],
'Hiking': [36, '20.81%']})
Accessing the values:
>>> dic['Biking']
[60, '34.68%']
>>> dic['Hiking']
[36, '20.81%']
Another approach is to use a dict as the value instead of a list:
>>> dic = defaultdict(lambda :dict(val = 0))
>>> total = 0
>>> for keys, val in lis:
...     keys = keys.split('|')
...     total += int(val)*len(keys)
...     for k in keys:
...         dic[k]['val'] += int(val)
...
>>> for k,v in dic.items():
...     dic[k]['percentage'] = format(v['val']/float(total), '.2%')
...
>>> dic
defaultdict(<function <lambda> at 0xb60e7b8c>,
{'Gym': {'percentage': '28.90%', 'val': 50},
'Biking': {'percentage': '34.68%', 'val': 60},
'Running': {'percentage': '15.61%', 'val': 27},
'Hiking': {'percentage': '20.81%', 'val': 36}})
Accessing the values:
#Return percentage related to 'Gym'
>>> dic['Gym']['percentage']
'28.90%'
#return the total sum of 'Biking'
>>> dic['Biking']['val']
60
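The reason total is accumulated as val*len(keys) rather than plain val may not be obvious: a row such as 'Biking|Gym' contributes its count once per type, so the denominator has to be the sum of the per-type totals for the percentages to add up to 100%. A small check under that reading (the names weighted_total, per_type and plain_total are illustrative, not from the answer):

```python
lis = [['Biking', '10'], ['Biking|Gym', '14'],
       ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]

weighted_total = 0   # same accumulation as in the answer: count * number of types
per_type = {}        # per-type totals, kept in a plain dict for the comparison
for keys, val in lis:
    names = keys.split('|')
    weighted_total += int(val) * len(names)
    for name in names:
        per_type[name] = per_type.get(name, 0) + int(val)

# Summing the raw counts once per row gives a smaller number,
# which would push the percentages above 100% in total.
plain_total = sum(int(val) for _, val in lis)

print(weighted_total, sum(per_type.values()), plain_total)
```

Here the first two numbers agree, while the plain row sum is smaller, which is why it is not used as the denominator.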
Answer 2 (score: 1)
Maybe something like this? (Note: you could use a collections.defaultdict with a default value of 0 instead of the data.get approach.)
total = 0
data = {}
for extype, value in exercise_types_l:
    value = int(value)  # the counts are stored as strings
    for item in extype.split('|'):
        total += value
        data[item] = data.get(item, 0) + value
l = []
for k, v in data.items():
    # float() avoids integer division on Python 2
    l.append([k, v, '{percent:.1%}'.format(percent=v/float(total))])
l = sorted(l, key=lambda r: r[1], reverse=True)
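The defaultdict variant mentioned in the note might look roughly like this. It is only a sketch: the sample data is copied from the question, the counts are converted with int() because they are stored as strings, and float() is used so the division also behaves on Python 2.

```python
from collections import defaultdict

exercise_types_l = [['Biking', '10'], ['Biking|Gym', '14'],
                    ['Biking|Gym|Hiking', '9'], ['Biking|Gym|Hiking|Running', '27']]

data = defaultdict(int)  # missing keys start at 0, so data.get(item, 0) is not needed
for extype, value in exercise_types_l:
    for item in extype.split('|'):
        data[item] += int(value)

total = float(sum(data.values()))
# Build [type, total, percent] rows and sort them by total, largest first.
l = sorted(([k, v, '{:.1%}'.format(v / total)] for k, v in data.items()),
           key=lambda r: r[1], reverse=True)
print(l)
```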