Question

TL;DR:

What is the theano.scan equivalent of:

import numpy as np
M = np.arange(9).reshape(3, 3)
for i in range(M.shape[0]):
    for j in range(M.shape[1]):
        M[i, j] += 5
M

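possibly (if at all feasible) without using nested scans?

Note that this question is not specifically about how to apply an operation elementwise to a matrix, but more generally about how to implement a nested looping construct like the one above with theano.scan.

Long version:

theano.scan (or, equivalently in this case, theano.map) allows mapping a function that loops over multiple indices by simply passing several sequences to the sequences argument, something like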

import numpy as np
import theano
import theano.tensor as T
M = T.dmatrix('M')
def map_func(i, j, matrix):
    return matrix[i, j] + i * j
results, updates = theano.scan(map_func,
            sequences=[T.arange(M.shape[0]), T.arange(M.shape[1])],
            non_sequences=[M])
f = theano.function(inputs=[M], outputs=results)
f(np.arange(9).reshape(3, 3))
# visits only the diagonal (i == j): array([  0.,   5.,  12.])

which is roughly equivalent in structure to a Python loop of the form:

M = np.arange(9).reshape(3, 3)
for i, j in zip(np.arange(M.shape[0]), np.arange(M.shape[1])):
    M[i, j] += 5
M

which increases all the elements on the diagonal of M by 5.

But what if I want to find the theano.scan equivalent of:
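
To make the pairing behaviour explicit, here is a minimal NumPy-only sketch (not Theano code) that contrasts zip-style iteration, which is how several sequences are consumed one step at a time, with a full nested loop. The names paired, nested and diag_only are illustrative and do not appear in the original post.

```python
import numpy as np

M = np.arange(9).reshape(3, 3)

# zip pairs the two index ranges element by element,
# so only (0, 0), (1, 1) and (2, 2) are visited
paired = list(zip(range(M.shape[0]), range(M.shape[1])))

# a nested loop visits every (i, j) combination instead
nested = [(i, j) for i in range(M.shape[0]) for j in range(M.shape[1])]

# apply the "+= 5" update only at the paired positions
diag_only = M.copy()
for i, j in paired:
    diag_only[i, j] += 5

print(paired)
print(len(nested))
print(diag_only)
```

The paired list has one entry per row, while the nested list has one entry per matrix element, which is the gap the question is about.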

M = np.arange(9).reshape(3, 3)
for i in range(M.shape[0]):
    for j in range(M.shape[1]):
        M[i, j] += 5
M

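possibly without nesting scan?

One way, of course, is to flatten the matrix, scan through the flattened elements, and then reshape the result back to the original shape, something like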

import theano
import theano.tensor as T
M = T.dmatrix('M')
def map_func(i, X):
    return X[i] + .5
M_flat = T.flatten(M)
results, updates = theano.map(map_func,
                              sequences=T.arange(M.shape[0] * M.shape[1]),
                              non_sequences=M_flat)
final_M = T.reshape(results, M.shape)
f = theano.function([M], final_M)
f([[1, 2], [3, 4]])

but is there a better way that does not involve explicitly flattening and reshaping the matrix?
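
For reference, the flat index in the flatten-based approach still determines the row and column, so index-dependent functions remain possible with it. The following is a NumPy-only sketch rather than Theano code; it assumes row-major (C) ordering for flatten and reshape, and the names flat, out and result are illustrative.

```python
import numpy as np

M = np.arange(9).reshape(3, 3).astype(float)
n_rows, n_cols = M.shape
flat = M.flatten()  # row-major order by default

out = np.empty_like(flat)
for k in range(flat.size):
    # recover (row, col) from the flat index: k == i * n_cols + j
    i, j = divmod(k, n_cols)
    out[k] = flat[k] + i * j

# restore the original two-dimensional shape
result = out.reshape(M.shape)
print(result)
```

This keeps a single loop over all elements, at the cost of the explicit flatten and reshape that the question hopes to avoid.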

Answer 1

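Here is an example of how this kind of thing can be achieved with nested theano.scan calls. In this example we add the number 3.141 to every element of a matrix, effectively emulating the output of H + 3.141: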

H = T.dmatrix('H')
def fn2(col, row, matrix):
    return matrix[row, col] + 3.141

def fn(row, matrix):
    res, updates = theano.scan(fn=fn2,
                               sequences=T.arange(matrix.shape[1]),
                               non_sequences=[row, matrix])
    return res

results, updates = theano.scan(fn=fn,
                               sequences=T.arange(H.shape[0]),
                               non_sequences=[H])
f = theano.function([H], results)
f([[0, 1], [2, 3]])
# array([[ 3.141,  4.141],
#        [ 5.141,  6.141]])

As another example, let us add to each element of the matrix the product of its row and column indices:

H = T.dmatrix('H')
def fn2(col, row, matrix):
    return matrix[row, col] + row * col

def fn(row, matrix):
    res, updates = theano.scan(fn=fn2,
                               sequences=T.arange(matrix.shape[1]),
                               non_sequences=[row, matrix])
    return res

results, updates = theano.scan(fn=fn,
                               sequences=T.arange(H.shape[0]),
                               non_sequences=[H])
f = theano.function([H], results)
f(np.arange(9).reshape(3, 3))
# Out[2]:array([[  0.,   1.,   2.],
#               [  3.,   5.,   7.],
#               [  6.,   9.,  12.]])
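
As a cross-check on the two outputs above, the same values can be reproduced in plain NumPy with broadcasting instead of loops. This is only a sketch for verifying the numbers, not a replacement for the nested scan construct; the names rows, cols, plus_const and plus_index_product are illustrative.

```python
import numpy as np

# first example: add a constant to every element
H1 = np.array([[0., 1.], [2., 3.]])
plus_const = H1 + 3.141

# second example: add row * col to every element
H2 = np.arange(9).reshape(3, 3).astype(float)
rows = np.arange(H2.shape[0])[:, None]   # column vector of row indices
cols = np.arange(H2.shape[1])[None, :]   # row vector of column indices
plus_index_product = H2 + rows * cols    # broadcasts to the full 3 x 3 grid

print(plus_const)
print(plus_index_product)
```

For purely elementwise operations, expressions of this kind are usually simpler than any loop; the nested scan remains relevant when the per-element step cannot be written as a single array expression.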
