Question

I have the following dictionary, where the keys are "month,country:ID" and the values are just totals:

ID_dict = {'11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6,
           '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16,
           '12,United States:14099': 90000.0, '12,France:12583': 252.0,
           '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0,
           '01,Germany:12600': 14000}

The actual dictionary will be much larger than this one.

I am trying to return, for each "month,country", the key that contains the highest total. If there is a tie, the IDs are separated by commas. Example output based on the dictionary above:

'11,United Kingdom:17001'
'12,United Kingdom:17850'
'12,United States:14035, 14099'
'12,France:12583'
'01,Germany:12600'

I can get the string with the highest value, but I am really struggling to get past that. I have been trying to use re.match and re.search, but have not gotten very far.

Answer 1

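You can find the maximum value for each (month, country) pair and store that relation in a dictionary. Then build a second dictionary that uses the (month, country) pairs as keys and, as values, the lists of IDs whose total equals the maximum for that pair: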

import re

ID_dict = {'11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6,
           '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16, '12,United States:14099': 90000.0,
           '12,France:12583': 252.0, '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0,
           '01,Germany:12600': 14000}

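# Sorting by value (ascending) means larger values overwrite smaller ones,
# so each (month, country) key ends up mapped to its maximum value.
table = {tuple(re.split(',|:', key)[:2]): value for key, value in sorted(ID_dict.items(), key=lambda e: e[1])}

result = {}
for key, value in ID_dict.items():
    splits = re.split(',|:', key)  # [month, country, ID]
    # Keep every ID whose value equals the maximum for its (month, country) pair
    if value == table[tuple(splits[:2])]:
        result.setdefault(tuple(splits[:2]), []).append(splits[2])

for key, value in result.items():
    print('{}:{}'.format(','.join(key), ', '.join(value)))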

Output

01,Germany:12600
12,United States:14099, 14035
12,United Kingdom:17850
11,United Kingdom:17001
12,France:12583

The approach above is O(n log n) because it uses sorted. To make it O(n), you can replace the dictionary comprehension with this loop:

table = {}
for s, v in ID_dict.items():
    key = tuple(re.split(',|:', s)[:2])
    table[key] = max(table.get(key, v), v)
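
As a rough illustration of the O(n) idea, the maximum lookup and the grouping of tied IDs can also be folded into a single pass. This is only a sketch: the function name top_ids and the variables best and winners are made up for this example, and it assumes every key contains exactly one ':' separating the "month,country" prefix from the ID.

```python
from collections import defaultdict


def top_ids(totals):
    """Map each 'month,country' prefix to the IDs sharing its highest total."""
    best = {}                    # prefix -> highest total seen so far
    winners = defaultdict(list)  # prefix -> IDs that reach that total
    for key, total in totals.items():
        # Split on the last ':' so the prefix keeps its internal comma
        prefix, _, idnum = key.rpartition(':')
        if prefix not in best or total > best[prefix]:
            # New maximum: discard earlier IDs for this prefix
            best[prefix] = total
            winners[prefix] = [idnum]
        elif total == best[prefix]:
            # Tie with the current maximum: keep both IDs
            winners[prefix].append(idnum)
    return dict(winners)


ID_dict = {'11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6,
           '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16,
           '12,United States:14099': 90000.0, '12,France:12583': 252.0,
           '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0,
           '01,Germany:12600': 14000}

for prefix, ids in top_ids(ID_dict).items():
    print('{}:{}'.format(prefix, ', '.join(ids)))
```

Because it uses str.rpartition instead of a regular expression, this variant keeps the "month,country" part as a single string rather than a tuple; tied IDs appear in the order they occur in the input dictionary.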

Answer 2

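The following code creates a new dictionary with "month,country" keys and lists of (value, IDnum) tuples as values. It then sorts each list and collects all the IDnums that correspond to the highest value.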

ID_dict = {
    '11,United Kingdom:14416': 129.22, '11,United Kingdom:17001': 357.6, 
    '12,United States:14035': 90000.0, '12,United Kingdom:17850': 241.16,
    '12,United States:14099': 90000.0, '12,France:12583': 252.0, 
    '12,United Kingdom:13047': 215.13, '01,Germany:12662': 78.0, 
    '01,Germany:12600': 14000
}

# Create a new dict with 'month,country' keys 
# and lists of (value, IDnum) as the values
new_data = {}
for key, val in ID_dict.items():
    newkey, idnum = key.split(':')
    new_data.setdefault(newkey, []).append((val, idnum))

# Sort the values for each 'month,country' key,
# and get the IDnums corresponding to the highest values
for key, val in new_data.items():
    val = sorted(val, reverse=True)
    highest = val[0][0]
    # Collect all IDnums that have the highest value
    ids = []
    for v, idnum in val:
        if v != highest:
            break
        ids.append(idnum)
    print(key + ':' + ', '.join(ids))

Output

11,United Kingdom:17001
12,United States:14099, 14035
12,United Kingdom:17850
12,France:12583
01,Germany:12600
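
The sort in the code above is only needed to find the highest value in each list, so it could plausibly be replaced by a call to max followed by a filter. The helper name ids_with_highest below is hypothetical, and the small new_data sample is a hand-written excerpt of what the grouping loop would produce; treat this as a sketch rather than a drop-in replacement.

```python
def ids_with_highest(pairs):
    """Return the IDnums from (value, IDnum) pairs that share the highest value."""
    highest = max(value for value, _ in pairs)
    return [idnum for value, idnum in pairs if value == highest]


# Hand-written excerpt of the grouped data, for illustration only
new_data = {
    '12,United States': [(90000.0, '14035'), (90000.0, '14099')],
    '01,Germany': [(78.0, '12662'), (14000, '12600')],
}

for key, pairs in new_data.items():
    print(key + ':' + ', '.join(ids_with_highest(pairs)))
```

One visible difference: tied IDs are listed in insertion order here, whereas the sorted version lists them in descending order.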

Maximum value among partially matching keys in a Python dictionary

2 answers: