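I'm new to Python and I have a dictionary. I want to find the entry with the maximum value within each group. For example, the entries at keys 0 and 1 share the same first element, '1', so among them I want to identify 0.8 as the maximum and pick out that entry.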
0: ['1', 'Metrolink', 0.7054569125175476],
1: ['1', 'Toronto', 0.8],
I want to do the same for all the other groups as well.
Here is my complete dictionary.
d={
0: ['1', 'Metrolink', 0.7054569125175476],
1: ['1', 'Toronto', 0.8],
4: ['2', 'Residence Inn Bentonville', 0.721284806728363],
5: ['2', 'Bentonville, Arkansas', 0.8],
7: ['2', 'Rogers', 0.5609406232833862],
8: ['2', 'Toronto', 0.8],
10: ['2', 'Arkansas', 0.8871413469314575],
12: ['2', 'CA', 0.5339972972869873],
14: ['3', 'Toronto', 0.8],
19: ['3', 'ik', 0.555569052696228],
21: ['4', 'DL', 0.47785162925720215],
22: ['4', 'MS', 0.5182732939720154],
23: ['4', 'Nashville International Airport', 0.8],
27: ['4', 'Turkey', 0.8],
30: ['5', 'Hebron, Kentucky', 0.8],
32: ['5', 'OAK PARK', 0.6157999038696289],
35: ['5', 'USA', 0.5055036544799805],
36: ['5', 'Tennessee', 0.5752009153366089],
37: ['5', 'Recov', 0.6585434675216675],
38: ['5', 'County (United States)', 0.8],
40: ['6', 'SFO', 0.6019220948219299],
42: ['6', 'Ontario', 0.8],
45: ['7', 'United States', 0.6973987221717834],
47: ['7', 'Buckingham Gate', 0.8],
48: ['7', 'London', 0.9545853137969971],
53: ['8', 'Phoenix, Arizona', 0.8],
55: ['8', 'STE', 0.5046005249023438],
56: ['8', 'TULSA', 0.7144339680671692],
58: ['8', 'UNITED STATES OF AMERICA', 0.8454625606536865],
60: ['9', 'RDU', 0.6373313069343567],
61: ['9', 'Raleigh–Durham International Airport', 0.8],
65: ['9', 'Piauí', 0.8],
69: ['9', 'CAR', 0.6243148446083069],
71: ['10', 'MONMOUTH JUNCTION', 0.7259661555290222],
72: ['10', 'New Jersey', 0.8],
76: ['10', 'PVK', 0.6593300104141235],
79: ['10', 'TWW', 0.6495188474655151],
81: ['10', 'Morrisville, Bucks County, Pennsylvania', 0.8],
84: ['10', 'United States', 0.8],
88: ['10', 'New Brunswick, New Jersey', 0.8]
}
Answer 0 (score: 2)
Pandas is a very effective tool for this kind of tabular data. You can build a pandas DataFrame from your data:
import pandas as pd
df = pd.DataFrame(d).T
df.columns = ('group', 'place', 'value')
Then select only the rows that hold the maximum value of their group:
df[df['value'] == df.groupby('group')['value'].transform('max')]
which gives
Out[41]:
    group  place                                    value
1   1      Toronto                                  0.8
10  2      Arkansas                                 0.887141
14  3      Toronto                                  0.8
23  4      Nashville International Airport          0.8
27  4      Turkey                                   0.8
30  5      Hebron, Kentucky                         0.8
38  5      County (United States)                   0.8
42  6      Ontario                                  0.8
48  7      London                                   0.954585
58  8      UNITED STATES OF AMERICA                 0.845463
61  9      Raleigh–Durham International Airport     0.8
65  9      Piauí                                    0.8
72  10     New Jersey                               0.8
81  10     Morrisville, Bucks County, Pennsylvania  0.8
84  10     United States                            0.8
88  10     New Brunswick, New Jersey                0.8
If you want the output in the original format, you can use df.to_dict:
In [47]: df[df['value'] == df.groupby('group')['value'].transform('max')].T.to_dict(orient='list')
Out[47]:
{1: ['1', 'Toronto', 0.8],
10: ['2', 'Arkansas', 0.8871413469314575],
14: ['3', 'Toronto', 0.8],
23: ['4', 'Nashville International Airport', 0.8],
27: ['4', 'Turkey', 0.8],
30: ['5', 'Hebron, Kentucky', 0.8],
38: ['5', 'County (United States)', 0.8],
42: ['6', 'Ontario', 0.8],
48: ['7', 'London', 0.9545853137969971],
58: ['8', 'UNITED STATES OF AMERICA', 0.8454625606536865],
61: ['9', 'Raleigh–Durham International Airport', 0.8],
65: ['9', 'Piauí', 0.8],
72: ['10', 'New Jersey', 0.8],
81: ['10', 'Morrisville, Bucks County, Pennsylvania', 0.8],
84: ['10', 'United States', 0.8],
88: ['10', 'New Brunswick, New Jersey', 0.8]}
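To make the one-liner easier to follow, here is a minimal sketch that builds the mask in separate steps on a small subset of the data. The names sample, group_max and mask are illustrative and not part of the original answer, and the explicit conversion of the value column to float is an added precaution rather than something the answer requires.

```python
import pandas as pd

# A small subset of the question's dictionary, for illustration only.
sample = {
    0: ['1', 'Metrolink', 0.7054569125175476],
    1: ['1', 'Toronto', 0.8],
    4: ['2', 'Residence Inn Bentonville', 0.721284806728363],
    10: ['2', 'Arkansas', 0.8871413469314575],
}

# orient='index' turns each dictionary key into a row label,
# which gives the same layout as pd.DataFrame(sample).T.
df = pd.DataFrame.from_dict(sample, orient='index',
                            columns=['group', 'place', 'value'])
df['value'] = df['value'].astype(float)  # assumption: make the column numeric explicitly

# Per-group maximum, repeated on every row of that group.
group_max = df.groupby('group')['value'].transform('max')

# True where a row's value equals the maximum of its group.
mask = df['value'] == group_max

# Back to the original {key: [group, place, value]} layout.
result = df[mask].T.to_dict(orient='list')
print(result)
```

Printing group_max and mask separately shows how each row is compared against the maximum of its own group.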
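.T simply transposes the table. df.groupby('group')['value'] returns a SeriesGroupBy object, which behaves much like a regular pandas.Series object. That lets us use its transform method to compute the maximum value within each group, repeated for every row of that group. df['value'] == df.groupby('group')['value'].transform('max') builds a boolean mask, and df[mask] then selects the rows that hold each group's maximum.
Answer 1 (score: 0)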
It sounds like you want the maximum value within each subkey (the first item of each entry's value). You can do that with the following code:
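from collections import defaultdict

# Map each group label to the best (value, text) pair seen so far.
max_values = defaultdict(lambda: (float('-inf'), None))
for label, text, value in d.values():
    # Tuples compare element by element, so ties on value are broken by text.
    max_values[label] = max(max_values[label], (value, text))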
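Using a defaultdict here, with a default value of (float('-inf'), None), lets us compare each new value with the previous maximum without first checking whether a maximum has been recorded yet.
The final content of max_values is: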
{
'1': (0.8, 'Toronto'),
'2': (0.8871413469314575, 'Arkansas'),
'3': (0.8, 'Toronto'),
'4': (0.8, 'Turkey'),
'5': (0.8, 'Hebron, Kentucky'),
'6': (0.8, 'Ontario'),
'7': (0.9545853137969971, 'London'),
'8': (0.8454625606536865, 'UNITED STATES OF AMERICA'),
'9': (0.8, 'Raleigh–Durham International Airport'),
'10': (0.8, 'United States')
}
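The loop above keeps a single entry per group, so ties such as the two 0.8 entries in group '4' collapse into one. If every tied entry should be kept, in the original key: [group, place, value] layout, one possible two-pass variant is sketched below. The function name max_entries_per_group and the small sample dictionary are illustrative assumptions, not part of the answer.

```python
def max_entries_per_group(data):
    """Return every entry whose value equals the maximum of its group."""
    # First pass: highest value seen for each group label.
    best = {}
    for label, _place, value in data.values():
        if label not in best or value > best[label]:
            best[label] = value
    # Second pass: keep all entries that reach that maximum, ties included.
    return {key: entry for key, entry in data.items()
            if entry[2] == best[entry[0]]}


# A small subset of the question's dictionary, for illustration only.
sample = {
    21: ['4', 'DL', 0.47785162925720215],
    22: ['4', 'MS', 0.5182732939720154],
    23: ['4', 'Nashville International Airport', 0.8],
    27: ['4', 'Turkey', 0.8],
    40: ['6', 'SFO', 0.6019220948219299],
    42: ['6', 'Ontario', 0.8],
}
print(max_entries_per_group(sample))
```

Both 0.8 entries of group '4' survive here, which matches what the pandas answer returns for tied groups.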
Answer 2 (score: 0)
You can get a sorted dictionary with the following code:
dict(sorted(d.items(), key=lambda kv:(int(kv[1][0]), kv[1][2])))
If you want to sort by the first and second elements instead, you could use:
dict(sorted(d.items(), key=lambda kv:(int(kv[1][0]), kv[1][1])))
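Sorting on its own only orders the entries; it does not pick out the maxima. With the first sort key (group, then value), the last entry of each group is one with the largest value, so a rough sketch using itertools.groupby can collect it. The names sample, ordered and last_per_group are illustrative, and this variant keeps only one entry per group when values tie.

```python
from itertools import groupby

# A small subset of the question's dictionary, for illustration only.
sample = {
    71: ['10', 'MONMOUTH JUNCTION', 0.7259661555290222],
    72: ['10', 'New Jersey', 0.8],
    4: ['2', 'Residence Inn Bentonville', 0.721284806728363],
    10: ['2', 'Arkansas', 0.8871413469314575],
}

# Sort by numeric group first, then by value (ascending).
ordered = sorted(sample.items(), key=lambda kv: (int(kv[1][0]), kv[1][2]))

# groupby needs input that is already sorted by the grouping key, which it is here.
last_per_group = {}
for _group, items in groupby(ordered, key=lambda kv: kv[1][0]):
    key, entry = list(items)[-1]  # last item of a group has its largest value
    last_per_group[key] = entry

print(last_per_group)
```

The int() conversion in the sort key matters because the group labels are strings: without it, '10' would sort before '2'.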