Question

In TensorFlow, tf.layers.dense(inputs, units, activation) implements a multi-layer perceptron layer with an arbitrary activation function.

output = activation(matmul(inputs, weights) + bias)
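For reference, the formula above can be sketched in NumPy for the ordinary 2-D case. This is only an illustration of the arithmetic, not TensorFlow's implementation; the sizes, the random weights, and the relu helper are assumptions made up for the example.

```python
import numpy as np

def relu(x):
    # Element-wise ReLU, standing in for tf.nn.relu in this sketch.
    return np.maximum(x, 0.0)

# Hypothetical sizes chosen only for illustration.
batch_size, input_size, units = 5, 4, 128

rng = np.random.default_rng(0)
inputs = rng.standard_normal((batch_size, input_size))
weights = rng.standard_normal((input_size, units))
bias = np.zeros(units)

# output = activation(matmul(inputs, weights) + bias)
outputs = relu(np.matmul(inputs, weights) + bias)
print(outputs.shape)  # (5, 128)
```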

Typically the input has shape = [batch_size, input_size], and the call might look like this (units = 128 and activation = tf.nn.relu are chosen arbitrarily):

inputx = tf.placeholder(float, shape=[batch_size, input_size])
dense_layer = tf.layers.dense(inputx, 128, tf.nn.relu)

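I have not found any documentation on how to feed in higher-dimensional input, for example when there are time steps and the tensor has shape = [time_step, batch_size, input_size]. What I want is for the layer to be applied to every single input vector, for every time step of every element of the batch. Put differently, the internal matmul of layers.dense() should simply broadcast in NumPy style. Is the behaviour I expect what actually happens? That is, does: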

inputx = tf.placeholder(float, shape=[time_step, batch_size, input_size])
dense_layer = tf.layers.dense(inputx, 128, tf.nn.relu)

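apply the dense layer to each input of size input_size, for each time_step of each element in the batch? That would result in a tensor (dense_layer above) of shape = [time_step, batch_size, 128]. I am asking because tf.matmul, for example, does not support NumPy-style broadcasting, so I am not sure how TensorFlow handles these cases.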

Edit: This post is related, but does not fully answer my question.

Answer 1

You can verify your expectation by checking the shape of the dense kernel, as follows.

>>> inputx = tf.placeholder(float, shape=[2,3,4])
>>> dense_layer = tf.layers.dense(inputx, 128, tf.nn.relu)
>>> g=tf.get_default_graph()
>>> g.get_collection('variables')
[<tf.Variable 'dense/kernel:0' shape=(4, 128) dtype=float32_ref>, <tf.Variable 'dense/bias:0' shape=(128,) dtype=float32_ref>]

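The kernel has shape (4, 128), so it only acts on the last axis of the input. In this respect the dense layer behaves the same way as a conv layer.

You can think of inputx as an image with width = 2, height = 3 and channel = 4, and of the dense layer as a conv layer with 128 filters and a filter size of 1*1.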
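To make the 1*1 analogy concrete, here is a small NumPy sketch (not TensorFlow code) of a dense layer that contracts only the last axis of a rank-3 input. As far as I know, tf.layers.dense uses a tensordot over the last axis for inputs of rank greater than 2, but treat that as an assumption; the dense helper and the random values below are hypothetical and exist only for this example.

```python
import numpy as np

def dense(inputs, kernel, bias):
    """ReLU dense layer applied to the last axis of `inputs` (illustrative helper)."""
    # axes=1 contracts the last axis of `inputs` with the first axis of `kernel`;
    # all leading axes (time_step, batch_size) are carried through unchanged.
    pre_activation = np.tensordot(inputs, kernel, axes=1) + bias
    return np.maximum(pre_activation, 0.0)

# Same shapes as in the session above: input [2, 3, 4], kernel (4, 128).
time_step, batch_size, input_size, units = 2, 3, 4, 128

rng = np.random.default_rng(0)
inputs = rng.standard_normal((time_step, batch_size, input_size))
kernel = rng.standard_normal((input_size, units))
bias = rng.standard_normal(units)

out = dense(inputs, kernel, bias)
print(out.shape)  # (2, 3, 128)

# Equivalent view: flatten the leading axes, do a plain 2-D matmul, restore the shape.
flat = inputs.reshape(-1, input_size)
out_flat = np.maximum(flat @ kernel + bias, 0.0).reshape(time_step, batch_size, units)
print(np.allclose(out, out_flat))  # True
```

Every input vector of size input_size is multiplied by the same kernel, which is exactly what a 1*1 convolution with 128 filters does at each spatial position.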

How does tf.layers.dense() interact with inputs of higher dim?