Question

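In Python 2.7 I have a NumPy array of indices that correspond to entries in a dictionary. I want to create a NumPy array of the corresponding values from the dictionary. The code probably makes this clear right away: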

import numpy as np
indices = np.array([(0, 1), (2, 0), (2, 0)], dtype=[('A', int), ('B', int)])
d = {(0, 1): 10,
     (2, 0): 9}
values = d[(indices['A'], indices['B'])]

The lookup in the last line does not work (I tried to find a way to make a np.array hashable, but that did not work):

TypeError: unhashable type: 'numpy.ndarray'

I could replace it with a loop, but that takes a long time to build the variable values:

np.array([d[(indices[i]['A'], indices[i]['B'])] for i in range(len(indices))])

Is there an alternative that makes this task pythonic, i.e. faster? The variable indices cannot be changed, but I can change the type of the dict.

Edit

The actual index array also contains other entries. That is why I wrote such a complicated lookup:

indices = np.array([(0, 1, 's'), (2, 0, 's'), (2, 0, 't')],
                   dtype=[('A', int), ('B', int), ('C', str)])

Answer 1

I believe you can use a list comprehension (it is a bit faster than a plain for loop). Example -

values = [d[tuple(a)] for a in indices]

Note that I use d rather than dict, because using dict as a variable name is discouraged: it shadows the built-in type dict.
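
One caveat, given the question's edit: tuple(a) builds the key from every field of a record, so with the extra 'C' field the keys would no longer match those in d. Below is a minimal sketch of one way around that, assuming only fields 'A' and 'B' form the key. The 'U1' dtype for 'C' and the helper name lookup_by_fields are my own choices for illustration, not something from the question.

```python
import numpy as np

# Three-field array as in the question's edit; 'U1' (one-character string)
# is an assumed dtype for field 'C', chosen so the example runs.
indices = np.array([(0, 1, 's'), (2, 0, 's'), (2, 0, 't')],
                   dtype=[('A', int), ('B', int), ('C', 'U1')])
d = {(0, 1): 10,
     (2, 0): 9}


def lookup_by_fields(records, mapping, fields=('A', 'B')):
    """Hypothetical helper: look up mapping values keyed by the named fields."""
    # .tolist() converts each column to plain Python ints, so the zipped
    # tuples hash exactly like the dictionary's keys.
    columns = [records[name].tolist() for name in fields]
    return np.array([mapping[key] for key in zip(*columns)])


values = lookup_by_fields(indices, d)
```

This is still a Python-level loop, so it should be read as a correctness fix for the multi-field case rather than a speed-up.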

Demo of the list comprehension -

In [73]: import numpy as np
In [74]: indices = np.array([(0, 1), (2, 0), (2, 0)], dtype=[('A', int), ('B', int)])
In [76]: d = {(0, 1): 10,
   ....:      (2, 0): 9}
In [78]: values = [d[tuple(a)] for a in indices]
In [79]: values
Out[79]: [10, 9, 9]

A faster approach for larger arrays is to use np.vectorize() to vectorize the dict.get() method and then apply it to the indices array. Example -

vecdget = np.vectorize(lambda x: d.get(tuple(x)))
vecdget(indices)

Demo with timing results -

In [88]: vecdget = np.vectorize(lambda x: d.get(tuple(x)))
In [89]: vecdget(indices)
Out[89]: array([10, 9, 9])
In [98]: indices = np.array([(0, 1), (2, 0), (2, 0)] * 100, dtype=[('A', int), ('B', int)])
In [99]: %timeit [d[tuple(a)] for a in indices]
100 loops, best of 3: 1.72 ms per loop
In [100]: %timeit vecdget(indices)
1000 loops, best of 3: 341 µs per loop

@hpaulj suggested a further method in the comments, which was also part of the timing tests -

[d.get(x.item()) for x in indices]
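
Since the question allows changing the type of the dict, another option (not part of the answer above) is to replace it with a dense NumPy lookup table and use integer array indexing, which avoids a Python-level lookup per record. This is a rough sketch, assuming the keys are small non-negative integers. The names table and missing are illustrative, and -1 as the marker for absent keys is an assumption.

```python
import numpy as np

indices = np.array([(0, 1), (2, 0), (2, 0)], dtype=[('A', int), ('B', int)])
d = {(0, 1): 10,
     (2, 0): 9}

# Split the dictionary into an (n, 2) array of keys and an array of values.
keys = np.array(list(d.keys()))
vals = np.array(list(d.values()))

# Dense table covering every (A, B) pair up to the largest key; -1 marks
# pairs that are absent from d (an assumed sentinel).
missing = -1
shape = tuple(keys.max(axis=0) + 1)
table = np.full(shape, missing, dtype=vals.dtype)
table[keys[:, 0], keys[:, 1]] = vals

# One vectorized indexing step replaces the per-record dictionary lookups.
values = table[indices['A'], indices['B']]
```

This trades memory for speed: the table holds one slot per possible (A, B) pair, so it only makes sense when the key range is modest.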
