Question

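I am trying to populate a NumPy array from a dictionary of values.

My dictionary looks like this:

A = {(12, 15): 4, (532, 31): 7, (742, 1757): 1, ...}

I am trying to populate the array so that (using the example above) 4 sits at index (12, 15), and so on. The two parts of each key in A are called 'd' and 's', and the value is called 'count'.

A = {(d, s): count}

My current code for populating the array looks like this: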

N = len(seen)
Am = np.zeros((N,N), 'i')
for key, count in A.items():
    Am[d,s] = count

But this just results in several arrays being created, mostly filled with zeros.

Thanks

Answer 1

Here's one approach -

import numpy as np

def dict_to_arr(A):
    idx = np.array(list(A.keys()))    # (num_entries, 2) array of (row, col) indices
    val = np.array(list(A.values()))  # matching values, in the same order

    m,n = idx.max(0)+1 # max extents of indices to decide o/p array
    out = np.zeros((m,n), dtype=val.dtype)
    out[idx[:,0], idx[:,1]] = val # or out[tuple(idx.T)] = val
    return out

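If we skip the array conversion for the indices and values and use them directly for the assignment in the last step, it might be faster -

# tuple(...) is needed on Python 3, where zip returns an iterator
out[tuple(zip(*A.keys()))] = list(A.values())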
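
To make the direct-assignment step concrete, here is a minimal sketch on a small array. The dictionary contents, the 4x4 shape, and the names `rows` and `cols` are chosen only for illustration and are not part of the original answer.

```python
import numpy as np

# Small illustrative dictionary (hypothetical data, not from the question).
A = {(1, 2): 4, (3, 0): 7, (0, 3): 1}

# zip(*keys) transposes the (row, col) pairs into one tuple of rows
# and one tuple of columns.
rows, cols = zip(*A.keys())

out = np.zeros((4, 4), dtype=int)
# Indexing with one sequence per axis uses NumPy's integer-array indexing,
# so each value is written to its matching (row, col) position.
out[rows, cols] = list(A.values())
print(out)
```

Dictionaries keep insertion order on Python 3.7+, so the keys and values stay aligned.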

Sample run -

In [3]: A = {(12, 15): 4, (532, 31): 7, (742, 1757): 1}

In [4]: arr = dict_to_arr(A)

In [5]: arr[12,15], arr[532,31], arr[742,1757]
Out[5]: (4, 7, 1)
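
For comparison, the explicit loop from the question can probably be made to work as well. In the posted code, `d` and `s` are never assigned inside the loop, which may be the source of the problem. The sketch below unpacks each key in the `for` statement. Since `seen` is not shown in the question, the array size here is derived from the largest indices, which is an assumption.

```python
import numpy as np

A = {(12, 15): 4, (532, 31): 7, (742, 1757): 1}

# Assumption: size the array from the largest indices, because `seen`
# from the question is not available here.
n_rows = max(d for d, s in A) + 1
n_cols = max(s for d, s in A) + 1
Am = np.zeros((n_rows, n_cols), dtype='i')

# Unpack each key into d and s so the assignment uses the current entry.
for (d, s), count in A.items():
    Am[d, s] = count

print(Am[12, 15], Am[532, 31], Am[742, 1757])
```

This is simpler to read than the vectorized version, though it will likely be slower for large dictionaries.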

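Storing into a sparse matrix

To save memory and possibly gain performance, we may want to store the data in a sparse matrix. Let's do that with csr_matrix, like so -

import numpy as np
from scipy.sparse import csr_matrix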

def dict_to_sparsemat(A):
    idx = np.array(list(A.keys()))
    val = np.array(list(A.values()))
    m,n = idx.max(0)+1
    return csr_matrix((val, (idx[:,0], idx[:,1])), shape=(m,n))

Sample run -

In [64]: A = {(12, 15): 4, (532, 31): 7, (742, 1757): 1}

In [65]: out = dict_to_sparsemat(A)

In [66]: out[12,15], out[532,31], out[742,1757]
Out[66]: (4, 7, 1)
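
As a rough illustration of the memory argument, the sketch below builds the same sparse matrix inline and checks how many entries it actually stores, then converts it back to a dense array when one is needed. The names `sp` and `dense` are illustrative only.

```python
import numpy as np
from scipy.sparse import csr_matrix

A = {(12, 15): 4, (532, 31): 7, (742, 1757): 1}

idx = np.array(list(A.keys()))
val = np.array(list(A.values()))
m, n = idx.max(0) + 1

# Build the sparse matrix from (values, (row_indices, col_indices)).
sp = csr_matrix((val, (idx[:, 0], idx[:, 1])), shape=(m, n))

# Only the explicitly stored entries take up space.
print(sp.nnz)  # 3

# Convert to a regular NumPy array when a dense result is required.
dense = sp.toarray()
print(dense.shape)
```

The dense array holds 743 * 1758 elements, while the sparse matrix stores only the three values plus its index arrays.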

Populating a NumPy array using a dictionary containing indices