Question

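I have a dictionary,

myDict = {1: 10, 1.1: 10, 2: 15, 2.1: 20}

except that instead of only 4 key-value pairs it has thousands, and some keys are very close to each other, like keys 1 and 1.1 in my example, sometimes differing by as little as machine epsilon.

Is there a simple procedure to merge such keys while summing their corresponding values? With a bin width of 1, my example would become

myBinnedDict = {1.05: 20, 2.05: 35}

I chose each new key as the mean of the keys it replaces (it could even be weighted by the corresponding values, but that is application-specific and doesn't matter here).

Thanks for your help.

P.S.: I know I probably ended up here because I didn't use my data structures skillfully.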

Answer 1

You can combine itertools.groupby with a dict comprehension to do this in a single line:

from itertools import groupby
from statistics import mean

myDict = {1: 10, 1.1: 10, 2: 15, 2.1: 20}

{mean(keys): sum(vals) for keys, vals in (zip(*g) for _, g in groupby(sorted(myDict.items()), key=lambda x: round(x[0])))}

Any keys that round to the same integer are grouped together.
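One thing worth checking before relying on round for the bins: it groups around the nearest integer, so the bins are centered on integers rather than starting at them, and exact halves go to the even neighbour. A small sketch with made-up keys (not from the question) to illustrate:

```python
# Built-in round() maps each key to its nearest integer;
# exact halves are rounded to the even neighbour.
keys = [0.5, 1.5, 2.5, 1.4, 1.7, 2.1]
rounded = {k: round(k) for k in keys}
print(rounded)
# {0.5: 0, 1.5: 2, 2.5: 2, 1.4: 1, 1.7: 2, 2.1: 2}
```

So a key like 1.7 ends up in the same group as 2 and 2.1, which may or may not be what you want.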

Explanation:

{
    mean(keys): sum(vals)
    for keys, vals in (
        zip(*g) for _, g in groupby(
            sorted(myDict.items()), 
            key=lambda x: round(x[0])
        )
    )
}

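sorted(myDict.items()) sorts the dictionary's (key, value) pairs. Tuples compare lexicographically, so the key is compared first. The sort matters because groupby only merges adjacent items.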

groupby(sorted(myDict.items()), key=lambda x: round(x[0])) groups the sorted items by their rounded key.
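To see why the sorting step is needed, here is a minimal sketch with the same four pairs in a deliberately shuffled order (the items list and the key helper are just for illustration):

```python
from itertools import groupby

items = [(1, 10), (2, 15), (1.1, 10), (2.1, 20)]  # deliberately unsorted

def key(pair):
    # Group on the rounded dictionary key, as in the one-liner.
    return round(pair[0])

# groupby starts a new group whenever the grouping key changes.
unsorted_groups = [(k, list(g)) for k, g in groupby(items, key=key)]
sorted_groups = [(k, list(g)) for k, g in groupby(sorted(items), key=key)]

print(len(unsorted_groups))  # 4
print(sorted_groups)
# [(1, [(1, 10), (1.1, 10)]), (2, [(2, 15), (2.1, 20)])]
```

Without sorting, every pair ends up in its own group, and the dict comprehension would not sum anything.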

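zip(*g) for _, g in groupby(...) transforms the groups yielded by groupby. groupby yields two things per group: the "key" (the rounded number), which we don't need and therefore bind to _, and the "group", an iterator of pairs of the form (key, val), (key, val), (key, val), etc. zip(*g) transposes it into (key, key, key, ...), (val, val, val, ...), which is exactly what we need.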
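The transposition on its own, applied to what the first group of the example would contain (the variable names here are only for illustration):

```python
group = [(1, 10), (1.1, 10)]  # one group of (key, val) pairs

# Unpacking the pairs into zip() transposes them:
# one tuple of all keys, one tuple of all values.
keys, vals = zip(*group)

print(keys)  # (1, 1.1)
print(vals)  # (10, 10)
```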

Finally, mean(keys): sum(vals) for keys, vals in (...) transforms the keys and values by applying mean and sum respectively.
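If you need a bin width other than 1, or the value-weighted key mentioned in the question, a plain-loop variant may be easier to adapt than the one-liner. This is only a sketch: bin_dict, width and weighted are names I made up, and it bins with floor(key / width), i.e. half-open bins [i*width, (i+1)*width), which is a different rule than round above.

```python
import math
from collections import defaultdict

def bin_dict(d, width=1.0, weighted=False):
    """Merge keys falling into the same bin of the given width, summing values."""
    if width <= 0:
        raise ValueError("width must be positive")
    groups = defaultdict(list)
    for key, val in d.items():
        # Bin index i covers the half-open interval [i*width, (i+1)*width).
        groups[math.floor(key / width)].append((key, val))

    binned = {}
    for _, pairs in sorted(groups.items()):
        total = sum(v for _, v in pairs)
        if weighted and total != 0:
            # New key: mean of the old keys, weighted by their values.
            new_key = sum(k * v for k, v in pairs) / total
        else:
            # New key: plain mean of the old keys.
            new_key = sum(k for k, _ in pairs) / len(pairs)
        binned[new_key] = total
    return binned

result = bin_dict({1: 10, 1.1: 10, 2: 15, 2.1: 20})
weighted_result = bin_dict({1: 10, 1.1: 30}, weighted=True)
print(result)
print(weighted_result)
```

No sorting of the input is needed here because the groups are collected in a dict. Since the bin index comes from a floating-point division, keys that sit within rounding error of a bin edge can land on either side of it.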

Answer 2

We can use a bit of numpy to take advantage of array operations.

import numpy as np

myDict = {1: 10, 1.1: 10, 1.7: 6, 2: 15, 2.1: 20, 2.3: 50, 2.6: 1, 3: 1}

x = np.array([*myDict])  # just the keys from the dictionary

print(x)

[1.  1.1 1.7 2.  2.1 2.3 2.6 3. ]

clusters = x[x == x.astype(int)]  # integer-valued keys, used as bin edges (assumes each bin contains one)

print(clusters)

[1. 2. 3.]

digits = np.digitize(x, clusters)  # 1-based index of the bin each key falls into

print(digits)

[1 1 1 2 2 2 2 3]

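res = dict()

# digitize returns 1-based bin indices, so iterate over those
# rather than over the edge values themselves
for i in np.unique(digits):
    keys = x[digits == i]  # grab all keys for this bin
    value = sum(myDict[k] for k in keys)  # sum values for these keys from the original dict
    res[round(float(keys.mean()), 2)] = value  # plain float key, rounded to 2 decimals

print(res)

{1.27: 26, 2.25: 86, 3.0: 1}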
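With thousands of keys you may want to avoid the Python-level loop and not depend on integer keys being present. Here is a sketch of one way to do that, assuming a fixed bin width (the width variable and the floor-based binning are my assumptions, not part of the code above); it relies on np.unique with return_inverse and np.bincount with weights:

```python
import numpy as np

myDict = {1: 10, 1.1: 10, 1.7: 6, 2: 15, 2.1: 20, 2.3: 50, 2.6: 1, 3: 1}

keys = np.array(list(myDict), dtype=float)
vals = np.array(list(myDict.values()), dtype=float)

width = 1.0  # assumed bin width
bin_idx = np.floor(keys / width).astype(int)  # bin i covers [i*width, (i+1)*width)

# Relabel the occupied bins as 0..n-1 so bincount has no empty slots.
_, inverse = np.unique(bin_idx, return_inverse=True)

counts = np.bincount(inverse)                      # number of keys per bin
sums = np.bincount(inverse, weights=vals)          # summed values per bin
means = np.bincount(inverse, weights=keys) / counts  # mean key per bin

res = {float(m): float(s) for m, s in zip(means, sums)}
print(res)
```

For the example this gives the same three bins as the loop version, with unrounded mean keys and float sums.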

Combining dictionary keys by bin
