Find the maximum value in a Python dictionary

Time: 2020-07-20 05:33:26

Tags: python dictionary

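I'm new to Python and I have a dictionary. I want to find the entry with the maximum value among a set of entries, for example keys 0 and 1. These two entries share a common first element, '1'. So here I want to identify 0.8 as the maximum and pick out that entry.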

 0: ['1', 'Metrolink', 0.7054569125175476],
 1: ['1', 'Toronto', 0.8],

In the same way, I want to do this for every other group of entries that share a first element.

Here is my complete dictionary.

 d={
 0: ['1', 'Metrolink', 0.7054569125175476],
 1: ['1', 'Toronto', 0.8],
 4: ['2', 'Residence Inn Bentonville', 0.721284806728363],
 5: ['2', 'Bentonville, Arkansas', 0.8],
 7: ['2', 'Rogers', 0.5609406232833862],
 8: ['2', 'Toronto', 0.8],
 10: ['2', 'Arkansas', 0.8871413469314575],
 12: ['2', 'CA', 0.5339972972869873],
 14: ['3', 'Toronto', 0.8],
 19: ['3', 'ik', 0.555569052696228],
 21: ['4', 'DL', 0.47785162925720215],
 22: ['4', 'MS', 0.5182732939720154],
 23: ['4', 'Nashville International Airport', 0.8],
 27: ['4', 'Turkey', 0.8],
 30: ['5', 'Hebron, Kentucky', 0.8],
 32: ['5', 'OAK PARK', 0.6157999038696289],
 35: ['5', 'USA', 0.5055036544799805],
 36: ['5', 'Tennessee', 0.5752009153366089],
 37: ['5', 'Recov', 0.6585434675216675],
 38: ['5', 'County (United States)', 0.8],
 40: ['6', 'SFO', 0.6019220948219299],
 42: ['6', 'Ontario', 0.8],
 45: ['7', 'United States', 0.6973987221717834],
 47: ['7', 'Buckingham Gate', 0.8],
 48: ['7', 'London', 0.9545853137969971],
 53: ['8', 'Phoenix, Arizona', 0.8],
 55: ['8', 'STE', 0.5046005249023438],
 56: ['8', 'TULSA', 0.7144339680671692],
 58: ['8', 'UNITED STATES OF AMERICA', 0.8454625606536865],
 60: ['9', 'RDU', 0.6373313069343567],
 61: ['9', 'Raleigh–Durham International Airport', 0.8],
 65: ['9', 'Piauí', 0.8],
 69: ['9', 'CAR', 0.6243148446083069],
 71: ['10', 'MONMOUTH JUNCTION', 0.7259661555290222],
 72: ['10', 'New Jersey', 0.8],
 76: ['10', 'PVK', 0.6593300104141235],
 79: ['10', 'TWW', 0.6495188474655151],
 81: ['10', 'Morrisville, Bucks County, Pennsylvania', 0.8],
 84: ['10', 'United States', 0.8],
 88: ['10', 'New Brunswick, New Jersey', 0.8]
 }

3 Answers:

Answer 0: (score: 2)

Pandas is a very effective tool for handling this kind of tabular data. You can create a pandas DataFrame from your data:

import pandas as pd
df = pd.DataFrame(d).T
df.columns = ('group', 'place', 'value')

Then print only the rows that hold the maximum value of their group:

df[df['value'] == df.groupby('group')['value'].transform('max')]

This gives:

Out[41]:
   group                                    place     value
1      1                                  Toronto       0.8
10     2                                 Arkansas  0.887141
14     3                                  Toronto       0.8
23     4          Nashville International Airport       0.8
27     4                                   Turkey       0.8
30     5                         Hebron, Kentucky       0.8
38     5                   County (United States)       0.8
42     6                                  Ontario       0.8
48     7                                   London  0.954585
58     8                 UNITED STATES OF AMERICA  0.845463
61     9     Raleigh–Durham International Airport       0.8
65     9                                    Piauí       0.8
72    10                               New Jersey       0.8
81    10  Morrisville, Bucks County, Pennsylvania       0.8
84    10                            United States       0.8
88    10                New Brunswick, New Jersey       0.8

If you want the output in the original format, you can use df.to_dict:

In [47]: df[df['value'] == df.groupby('group')['value'].transform('max')].T.to_dict(orient='list')
Out[47]:
{1: ['1', 'Toronto', 0.8],
 10: ['2', 'Arkansas', 0.8871413469314575],
 14: ['3', 'Toronto', 0.8],
 23: ['4', 'Nashville International Airport', 0.8],
 27: ['4', 'Turkey', 0.8],
 30: ['5', 'Hebron, Kentucky', 0.8],
 38: ['5', 'County (United States)', 0.8],
 42: ['6', 'Ontario', 0.8],
 48: ['7', 'London', 0.9545853137969971],
 58: ['8', 'UNITED STATES OF AMERICA', 0.8454625606536865],
 61: ['9', 'Raleigh–Durham International Airport', 0.8],
 65: ['9', 'Piauí', 0.8],
 72: ['10', 'New Jersey', 0.8],
 81: ['10', 'Morrisville, Bucks County, Pennsylvania', 0.8],
 84: ['10', 'United States', 0.8],
 88: ['10', 'New Brunswick, New Jersey', 0.8]}

Brief explanation

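  • A pandas DataFrame can be created with a dictionary as the argument. The values should be lists. Each dictionary key becomes a column, so .T simply transposes the table to get one row per key.
  • df.groupby('group')['value'] returns a SeriesGroupBy object, which behaves much like a regular pandas.Series object. This lets us use the transform method to compute the maximum value within each group; the result is aligned with the index of df, so every row carries its own group's maximum.
  • df['value'] == df.groupby('group')['value'].transform('max') creates a boolean mask, which is used to select the maximum rows with df[mask].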
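The same "compute each group's maximum, then mask" idea can be sketched in plain Python, which may make the two steps easier to see. This is only an illustration on a small sample of the question's data; the names sample, group_max and result are made up for this sketch and are not part of the answer.

```python
# Small sample taken from the question's dictionary.
sample = {
    0: ['1', 'Metrolink', 0.7054569125175476],
    1: ['1', 'Toronto', 0.8],
    21: ['4', 'DL', 0.47785162925720215],
    23: ['4', 'Nashville International Airport', 0.8],
    27: ['4', 'Turkey', 0.8],
}

# Pass 1: per-group maximum (the role played by groupby(...).transform('max')).
group_max = {}
for group, _place, value in sample.values():
    group_max[group] = max(value, group_max.get(group, float('-inf')))

# Pass 2: keep the entries whose value equals their group's maximum
# (the role played by the boolean mask). Ties are all kept.
result = {key: row for key, row in sample.items()
          if row[2] == group_max[row[0]]}

print(result)
```

On this sample, keys 1, 23 and 27 survive, which matches the corresponding rows of the pandas output above.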

Answer 1: (score: 0)

It sounds like you want the maximum within each sub-key (the first item of each entry's value). For that, you can use the following code:

from collections import defaultdict

max_values = defaultdict(lambda: (float('-inf'), None))

for label, text, value in d.values():
    max_values[label] = max(max_values[label], (value, text))

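Using a defaultdict here, with a default value of (float('-inf'), None), lets us compare each new value against the previously recorded maximum without first checking whether a maximum has been recorded yet. Note that only one entry is kept per label: when two entries tie on value, the tuple comparison falls back to the text, so the one whose text sorts last wins.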

The final result for max_values is:

{
    '1': (0.8, 'Toronto'), 
    '2': (0.8871413469314575, 'Arkansas'), 
    '3': (0.8, 'Toronto'), 
    '4': (0.8, 'Turkey'), 
    '5': (0.8, 'Hebron, Kentucky'),
    '6': (0.8, 'Ontario'), 
    '7': (0.9545853137969971, 'London'), 
    '8': (0.8454625606536865, 'UNITED STATES OF AMERICA'),
    '9': (0.8, 'Raleigh–Durham International Airport'), 
    '10': (0.8, 'United States')
}
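If breaking ties by text is not what you want, one possible variation compares on the value alone and keeps the first entry seen, while also remembering the original dictionary key. This is a sketch on a subset of the question's data, not part of the answer above; the names sample and best are hypothetical.

```python
# Subset of the question's dictionary (group '4' has a tie at 0.8).
sample = {
    21: ['4', 'DL', 0.47785162925720215],
    22: ['4', 'MS', 0.5182732939720154],
    23: ['4', 'Nashville International Airport', 0.8],
    27: ['4', 'Turkey', 0.8],
}

# Tuple comparison resolves a tie on value by comparing the text.
print(max((0.8, 'Nashville International Airport'), (0.8, 'Turkey')))

# Compare on value only; the strict '>' keeps the first entry seen on a tie.
best = {}  # label -> (original key, text, value)
for key, (label, text, value) in sample.items():
    if label not in best or value > best[label][2]:
        best[label] = (key, text, value)

print(best)
```

Here group '4' resolves to key 23 (Nashville International Airport) rather than Turkey, because dictionaries preserve insertion order and a later tie does not replace the earlier entry.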

Answer 2: (score: 0)

You can get a sorted dictionary with the following code:

dict(sorted(d.items(), key=lambda kv:(int(kv[1][0]), kv[1][2])))

If you want to sort by the first and second elements instead, it would be:

dict(sorted(d.items(), key=lambda kv:(int(kv[1][0]), kv[1][1])))
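Sorting alone does not yet pick out the maxima. One way to finish the job, assuming the sort by group and value shown first, is to take the last item of each group with itertools.groupby. This is a sketch on a small sample; the names sample, ordered and top are made up here, and on a tie it keeps only one entry per group.

```python
from itertools import groupby

# Small sample taken from the question's dictionary.
sample = {
    0: ['1', 'Metrolink', 0.7054569125175476],
    1: ['1', 'Toronto', 0.8],
    7: ['2', 'Rogers', 0.5609406232833862],
    10: ['2', 'Arkansas', 0.8871413469314575],
    12: ['2', 'CA', 0.5339972972869873],
}

# Sort by numeric group, then by value (same key function as the answer).
ordered = sorted(sample.items(), key=lambda kv: (int(kv[1][0]), kv[1][2]))

# Within each group the items are now in ascending value order,
# so the last item of the group holds that group's maximum.
top = {}
for _group, items in groupby(ordered, key=lambda kv: kv[1][0]):
    key, row = list(items)[-1]
    top[key] = row

print(top)
```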