Question

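I am trying to train an encoder-decoder network that encodes features of one dimension and decodes them into features of a different dimension. More specifically, the input features have shape [500, 215, 7] (batch, frames, channels) and the output features have shape [500, 110, 7]. I tried two approaches: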

Linear autoencoder

import tensorflow as tf  # TensorFlow 1.x (tf.layers API)

def encoder(x):
    with tf.name_scope('Encoder'):
        # Flatten [batch, 215, 7] to [batch, 1505]; tf.reshape because x is a tensor
        x = tf.reshape(x, [-1, 215*7])
        net = tf.layers.dense(inputs=x, units=2000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=1000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=500, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=250, activation=tf.nn.relu)
        code = tf.layers.dense(inputs=net, units=125, activation=tf.nn.relu)
        return code

def decoder(code):
    with tf.name_scope('Decoder'):
        net = tf.layers.dense(inputs=code, units=125, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=500, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=1000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=2000, activation=tf.nn.relu)
        # Output layer: 770 = 110 * 7 units, also with a ReLU activation
        net = tf.layers.dense(inputs=net, units=770, activation=tf.nn.relu)
        # Restore [batch, 110, 7]; tf.reshape because net is a tensor
        net = tf.reshape(net, [-1, 110, 7])
        return net

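During training, the loss decreased from 20 to 12. After training, I fed in the [500, 215, 7] features and the autoencoder generated [500, 110, 7] features. The network was able to generate 3 channels that are very close to the actual [500, 110, 7] features. Here is one data point: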

Original  [0.35101 2.6753289 -0.84253965 0.971104 -0.34277865 -0.4877893 0.011089]
Generated [0.3437522 2.6829777 0. 0.9715183 0. 0. 0.]

However, the other 4 channels are 0 for every data point.
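One check that can be run on the data point above, offered as a sketch rather than a diagnosis: compare which target channels are negative with which generated channels are zero. The `relu` helper is a plain-Python stand-in for `tf.nn.relu` and is an assumption for illustration, not code from the networks in this question.

```python
def relu(values):
    """Plain-Python stand-in for tf.nn.relu: clamp negatives to zero."""
    return [max(0.0, v) for v in values]

# Values copied from the data point in the question.
original = [0.35101, 2.6753289, -0.84253965, 0.971104,
            -0.34277865, -0.4877893, 0.011089]
generated = [0.3437522, 2.6829777, 0.0, 0.9715183, 0.0, 0.0, 0.0]

clamped = relu(original)
negative_channels = [i for i, v in enumerate(original) if v < 0]
zero_channels = [i for i, v in enumerate(generated) if v == 0.0]

print("negative in target:", negative_channels)
print("zero in output:    ", zero_channels)
```

Every channel that is negative in the target is zero in the output. The last channel is also zero although its target is slightly positive (0.011089), so the match is not exact; whether the ReLU on the output layer explains it is something to verify, not a given.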

Convolutional autoencoder

def encoder(x):
    with tf.name_scope('Encoder'):
        # Treat each sample as a 215 x 7 single-channel image
        net = tf.reshape(x, [-1, 215, 7, 1])
        net = tf.layers.conv2d(inputs=net, filters=32, kernel_size=[20, 5], padding="same", activation=tf.nn.relu)
        net = tf.layers.max_pooling2d(inputs=net, pool_size=[10, 1], strides=2)
        net = tf.layers.conv2d(inputs=net, filters=64, kernel_size=[1, 3], padding="same", activation=tf.nn.relu)
        net = tf.layers.max_pooling2d(inputs=net, pool_size=[10, 1], strides=2)
        net = tf.layers.flatten(net)
        net = tf.layers.dense(inputs=net, units=1024, activation=tf.nn.relu)
        # `rate` is the fraction of units to drop, so convert from keep_prob (defined elsewhere).
        # Note: tf.layers.dropout only drops units when training=True is passed.
        net = tf.layers.dropout(inputs=net, rate=1.0 - keep_prob)
        return net
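To make the encoder's tensor shapes easier to follow, here is a small sketch of the shape arithmetic. It assumes the default `'valid'` padding of `tf.layers.max_pooling2d` and that the `padding="same"` convolutions leave the spatial size unchanged; `valid_pool_out` is a hypothetical helper written for this illustration.

```python
def valid_pool_out(size, pool, stride):
    """Output length along one axis for pooling with 'valid' padding."""
    return (size - pool) // stride + 1

height, width = 215, 7           # frames x channels of one input sample
for _ in range(2):               # the encoder has two pooling layers
    height = valid_pool_out(height, pool=10, stride=2)
    width = valid_pool_out(width, pool=1, stride=2)

filters = 64                     # filters of the last conv layer
flat_units = height * width * filters
print(height, width, flat_units)
```

Under these assumptions the tensor entering `tf.layers.flatten` is 47 x 2 x 64, i.e. 6016 values per sample, and the channel axis has already been reduced from 7 to 2.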

def decoder(code):
    with tf.name_scope('Decoder'):
        # 6528 = 51 * 2 * 64
        net = tf.layers.dense(inputs=code, units=6528, activation=tf.nn.relu)
        net = tf.reshape(net, [-1, 51, 2, 64])
        net = tf.image.resize_bilinear(net, size=[51, 4])
        net = tf.layers.conv2d(inputs=net, filters=32, kernel_size=[1, 3], padding="same", activation=tf.nn.relu)
        net = tf.image.resize_bilinear(net, size=[110, 7])
        # Output layer: a single filter, also with a ReLU activation
        net = tf.layers.conv2d(inputs=net, filters=1, kernel_size=[20, 5], padding="same", activation=tf.nn.relu)
        net = tf.reshape(net, [-1, 110, 7*1])
        return net

The loss decreased from 5.8 to 2.7. Here, too, the network learns 3 channels, but the other channels are 0.

Original  [1.046344 2.77010455 -0.45842518 0.882744 -0.36058491 -0.37393818 -0.01158755]
Generated [1.0245576 2.7641041 0. 0.86774087 0. 0. 0.]

I trained for 5000 epochs with this loss function:

# Define loss
with tf.name_scope('Loss'):
    l2 = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(output, Y)), 1))
    cost = tf.reduce_mean(l2)

# Define optimizer
with tf.name_scope('Optimizer'):
    train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
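For readers who want to see what this loss reduces over, here is a plain-Python sketch of the same computation. It assumes `output` and `Y` both have shape [batch, frames, channels], so that `reduce_sum(..., 1)` sums over the frame axis; `l2_cost` and the toy data are hypothetical names and values made up for illustration.

```python
import math

def l2_cost(output, target):
    """Mean over (sample, channel) of the L2 norm taken along the frame axis."""
    norms = []
    for out_sample, tgt_sample in zip(output, target):
        n_channels = len(out_sample[0])
        for c in range(n_channels):
            # Sum of squared errors over all frames of one channel
            sq = sum((o[c] - t[c]) ** 2 for o, t in zip(out_sample, tgt_sample))
            norms.append(math.sqrt(sq))
    return sum(norms) / len(norms)

# Toy data: 1 sample, 2 frames, 2 channels
output = [[[1.0, 0.0], [1.0, 0.0]]]
target = [[[4.0, 0.0], [5.0, 0.0]]]
print(l2_cost(output, target))
```

In the toy data the first channel has errors of 3 and 4 across its two frames (norm 5) and the second channel has no error (norm 0), so the cost is their mean, 2.5. Each channel contributes its own per-channel norm to the average.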

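I tried several more networks, but they could only learn 2 channels. Why is the network unable to learn all of the channels? Could you point me to some network configurations I can try in which the two ends have different feature dimensions?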

Autoencoder neural network unable to reconstruct all channels for features of different dimensions?
