Finding the closest key in a defaultdict

Time: 2017-02-16 08:28:52

Tags: python defaultdict

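I am using defaultdicts to store lists of values, where the keys are the periods in which the values can be observed. When looking up periods from a list of all periods of interest, I would like to find the closest period in my defaultdict (note: not all periods are stored in the defaultdict).

As defaultdicts are not sorted, the approach below does not return the right values.

Is there another way to return the closest available key of a defaultdict?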

from collections import defaultdict
import numpy as np

def_dict = defaultdict(list)
# entries that will be stored in the defaultdict
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}

# store items from regular dict in defaultdict 
for k, v in reg_dict.items():
    def_dict[k] = v

# Lookup periods
periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8]

for period in periods:

    # this approach does not return the right keys as defaultdicts are not sorted
    closest_key = np.abs(np.array(list(def_dict.keys())) - period).argmin()

    print("period: ", period, " - looked up key: ", closest_key)

This returns the following:

period:  -1  - looked up key:  0
period:  0  - looked up key:  0
period:  1  - looked up key:  0
period:  2  - looked up key:  1
period:  3  - looked up key:  1
period:  4  - looked up key:  2
period:  5  - looked up key:  2
period:  6  - looked up key:  2
period:  7  - looked up key:  2
period:  8  - looked up key:  2

4 Answers:

Answer 0: (score: 2)

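With an OrderedDict and sorted keys, you can use binary search. For a large number of keys, the lookup will be much faster than with your current approach.

Since you want the closest key, you need to find the rightmost key lower than x and the leftmost key higher than x. Once you have found the index i of the rightmost key lower than x, the other candidate (the leftmost key higher than x) will be at index i+1.

You need to make sure that these indices are still inside your array.

Finally, you just need to compute the distance from each of these two values to x and pick the smaller one.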

See the documentation for bisect and np.searchsorted.
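The steps above can be sketched with the standard-library bisect module. This is only a minimal sketch, not the answerer's own code: the function name closest_key and the tie-breaking rule (the lower key wins when two keys are equally close) are assumptions made for illustration.

```python
from bisect import bisect_left

def closest_key(sorted_keys, x):
    """Return the key in sorted_keys (ascending, non-empty) closest to x.

    Hypothetical helper; when two keys are equally close, the lower one wins.
    """
    if not sorted_keys:
        raise ValueError("sorted_keys must not be empty")
    # i is the index of the leftmost key >= x, so i - 1 is the rightmost key < x
    i = bisect_left(sorted_keys, x)
    if i == 0:                      # x is at or below the smallest key
        return sorted_keys[0]
    if i == len(sorted_keys):       # x is above the largest key
        return sorted_keys[-1]
    lower, upper = sorted_keys[i - 1], sorted_keys[i]
    return lower if x - lower <= upper - x else upper

reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}
keys = sorted(reg_dict)  # sort once, reuse for every lookup
periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
result = [closest_key(keys, p) for p in periods]
print(result)
```

Sorting costs O(n log n) once, after which each lookup is O(log n). That only pays off when there are many keys or many lookups.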

Answer 1: (score: 1)

The way I understand it, you want an output similar to this?

[0, 0, 0, 2, 2, 5, 5, 5, 5, 5]

If so, the logic would be

closest_key = [min(def_dict.keys(), key = lambda x: abs(x - p)) for p in periods]

The optional key argument accepted by Python's built-in functions (here min) is useful in cases like this.
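To get from the closest key to the stored values, the same min call can be wrapped in a small helper. This is a minimal sketch; the function name lookup_closest is an assumption. Note that on a tie min returns the first candidate in the dict's iteration order, and that indexing with a key that already exists does not create new empty entries in the defaultdict.

```python
from collections import defaultdict

def lookup_closest(d, period):
    """Return (closest key, values stored under it). Hypothetical helper."""
    # min scans all keys linearly; on a tie the first key in iteration order wins
    k = min(d, key=lambda x: abs(x - period))
    return k, d[k]

def_dict = defaultdict(list)
def_dict.update({0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]})

print(lookup_closest(def_dict, 4))  # closest stored period to 4 and its values
print(len(def_dict))                # number of keys is unchanged by the lookup
```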

Answer 2: (score: 1)

I agree with @septra that you want the Euclidean distance, but this can also be achieved with numpy:

from collections import defaultdict
import numpy as np

def_dict = defaultdict(list)
# entries that will be stored in the defaultdict
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}

# store items from regular dict in defaultdict 
for k, v in reg_dict.items():
    def_dict[k] = v

periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
a = list(def_dict.keys())
for period in periods:
    closest_key  = np.sqrt(np.power(np.add(a, -period),2)).argmin()
    # OR closest_key  = np.abs(np.add(a, -period)).argmin()

    print("period: ", period, " - looked up key: ", a[closest_key])

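Answer 3: (score: 1)

As Eric says, to do this efficiently you should use binary search. However, if the number of keys is small, a simple linear search may be adequate. There is no need for a defaultdict or an OrderedDict; just sort the keys.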

import numpy as np

# entries
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}

keys = np.array(sorted(reg_dict.keys()))
print('keys', keys)

# Lookup periods
periods = np.arange(-1, 9)

for period in periods:
    closest_key = keys[np.abs(keys - period).argmin()]
    print("period: ", period, " - looked up key: ", closest_key)

Output

keys [-3  0  2  5]
period:  -1  - looked up key:  0
period:  0  - looked up key:  0
period:  1  - looked up key:  0
period:  2  - looked up key:  2
period:  3  - looked up key:  2
period:  4  - looked up key:  5
period:  5  - looked up key:  5
period:  6  - looked up key:  5
period:  7  - looked up key:  5
period:  8  - looked up key:  5