How do I connect tf.layers.dense to different inputs, or get tf.train.Optimizer to optimize plain tensors?

Asked: 2018-08-08 22:08:52

Tags: python tensorflow

I'm new to TensorFlow and to machine learning beyond the basics, so I'm trying to get a better grasp of RNNs by implementing one by hand instead of using tf.contrib.rnn.RNNCell. My first problem: to backpropagate I need to unroll the network, which means iterating over the whole sequence while keeping one consistent set of weights and biases. I therefore can't re-initialize a dense layer with tf.layers.dense at every timestep, but I also can't find a way to re-connect an existing dense layer to the current timestep of the sequence. To work around this I tried to implement my own version of tf.layers.dense. It worked fine until I tried to optimize my custom dense layers, at which point I got the error: NotImplementedError("Trying to update a Tensor " ...).
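The weight-sharing requirement described above can be sketched in plain NumPy (illustrative only; the array sizes and names `W_in`/`W_out`/`step` are made up, not TensorFlow API):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden = 3, 4

# One weight matrix per layer, created ONCE and reused at every timestep.
W_in = rng.normal(size=(vocab_size, hidden))
W_out = rng.normal(size=(hidden, vocab_size))

def step(x_t):
    """Apply the same dense layers to one timestep of the sequence."""
    h = np.tanh(x_t @ W_in)
    return h @ W_out

sequence = rng.normal(size=(5, vocab_size))  # 5 timesteps
outputs = [step(sequence[t]) for t in range(len(sequence))]
# Every timestep went through the identical W_in/W_out, so gradients from
# all timesteps accumulate into the same parameters during backprop.
```

This is exactly what re-calling tf.layers.dense per timestep breaks: each call creates fresh variables instead of reusing the same ones.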

My code:

import tensorflow as tf
import numpy as np
from tensorflow.contrib import rnn
import random

# -----------------
# WORD PARAMETERS
# -----------------

target_string = ['Hello ','Hello ','World ','World ', '!']
number_input_words = 1

# --------------------------
# TRAINING HYPERPARAMETERS
# --------------------------

training_steps = 4000
batch_size = 9
learning_rate = 0.01
display_step = 150
hidden_cells = 20

# ----------------------
# PREPARE DATA AS DICT
# ----------------------

# TODO AUTOMATICALLY CREATE DICT
dictionary = {'Hello ': 0, 'World ': 1, '!': 2}
reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
vocab_size = len(dictionary)
# ------------
# LSTM MODEL
# ------------

class LSTM:

    def __init__(self, sequence_length, number_input_words, hidden_cells, mem_size_x, mem_size_y, learning_rate):

        self.sequence = tf.placeholder(tf.float32, (sequence_length, vocab_size), 'sequence')

        self.memory = tf.zeros([mem_size_x, mem_size_y])

        # sequence_length = self.sequence.shape[0]
        units = [vocab_size, 5, 4, 2, 6, vocab_size]
        weights = [tf.random_uniform((units[i-1], units[i])) for i in range(len(units))[1:]]
        biases = [tf.random_uniform((1, units[i])) for i in range(len(units))[1:]]

        self.total_loss = 0
        self.outputs = []

        for word in range(sequence_length-1):
            sequence_w = tf.reshape(self.sequence[word], [1, vocab_size])
            layers = []
            for i in range(len(weights)):
                if i == 0:
                    layers.append(tf.matmul(sequence_w, weights[0]) + biases[0])
                else:
                    layers.append(tf.matmul(layers[i-1], weights[i]) + biases[i])
            percentages = tf.nn.softmax(logits=layers[-1])
            self.outputs.append(percentages)
            self.total_loss += tf.losses.absolute_difference(tf.reshape(self.sequence[word+1], (1, vocab_size)), tf.reshape(percentages, (1, vocab_size)))

        optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
        self.train_operation = optimizer.minimize(loss=self.total_loss, var_list=weights+biases, global_step=tf.train.get_global_step())




lstm = LSTM(len(target_string), number_input_words, hidden_cells, 10, 5, learning_rate)

# ---------------
# START SESSION
# ---------------
with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())

    sequence = []

    for i in range(len(target_string)):
        x = [0]*vocab_size
        x[dictionary[target_string[i]]] = 1
        sequence.append(x)
        print(sequence)
        for x in range(1000):
            sess.run(lstm.train_operation, feed_dict={lstm.sequence: sequence})
        prediction, loss = sess.run((lstm.outputs, lstm.total_loss), feed_dict={lstm.sequence: sequence})
        print(prediction)
        print(loss)

Any answer telling me how to connect tf.layers.dense to a different input each time, or how to resolve the NotImplementedError, would be greatly appreciated. Apologies if this question is long or poorly worded; I'm still new to Stack Overflow.

Edit:

I've updated the LSTM-class part of the code (inside def __init__) to:

    self.sequence = [tf.placeholder(tf.float32, (batch_size, vocab_size), 'sequence') for _ in range(sequence_length-1)]

    self.total_loss = 0
    self.outputs = []

    rnn_cell = rnn.BasicLSTMCell(hidden_cells)
    h = tf.zeros((batch_size, hidden_cells))

    for i in range(sequence_length-1):
        current_sequence = self.sequence[i]
        h = rnn_cell(current_sequence, h)
        self.outputs.append(h)

However, I still get an error on the line h = rnn_cell(current_sequence, h) about not being able to iterate over a tensor. I'm not trying to iterate over any tensor, and if I am, it isn't intentional.

1 Answer:

Answer 0 (score: 0)

Instead of trying to create a new list of dense layers, there is a standard way of solving this problem (it's the best way I know of). Do the following. First, assume your hidden layer size is h_dim, the number of unrolled steps is num_unroll, and the batch size is batch_size.

  1. In a for loop, compute the RNNCell's output for each unrolled input:

    state = tf.zeros(...)
    outputs = []
    for ui in range(num_unroll):
        out, state = rnn_cell(x[ui], state)
        outputs.append(out)

  2. Concatenate all the outputs into a single [batch_size*num_unroll, h_dim] tensor.

  3. Send that through a single [h_dim, num_classes] dense layer:

    logits = tf.matmul(tf.concat(outputs, ...), w) + b
    predictions = tf.nn.softmax(logits)

You now have the logits for all the unrolled inputs. Now just reshape that tensor into a [batch_size, num_unroll, num_classes] tensor.
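The shape bookkeeping in steps 2 and 3 can be checked with a small NumPy sketch (the sizes batch_size=2, num_unroll=3, h_dim=4, num_classes=3 are made-up values for illustration, not from the answer):

```python
import numpy as np

batch_size, num_unroll, h_dim, num_classes = 2, 3, 4, 3
rng = np.random.default_rng(1)

# Step 1 stand-in: one [batch_size, h_dim] RNN output per unrolled step.
outputs = [rng.normal(size=(batch_size, h_dim)) for _ in range(num_unroll)]

# Step 2: concatenate into a single [batch_size*num_unroll, h_dim] tensor.
concat = np.concatenate(outputs, axis=0)

# Step 3: one shared [h_dim, num_classes] dense layer applied to every step.
w = rng.normal(size=(h_dim, num_classes))
b = rng.normal(size=(num_classes,))
logits = concat @ w + b

# Reshape back so each batch element / unroll step gets its own row.
# (With axis-0 concatenation the step index is outermost, hence the transpose.)
logits = logits.reshape(num_unroll, batch_size, num_classes).transpose(1, 0, 2)
print(logits.shape)  # (2, 3, 3) == (batch_size, num_unroll, num_classes)
```

The key point is that one weight matrix `w` serves all unrolled steps, which is what makes the dense layer's parameters consistent across the sequence.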

Edited (data input): The data will come in as a list of num_unroll placeholders. So,

x = [tf.placeholder(shape=[batch_size,3]...) for ui in range(num_unroll)]

Now say you have data like the following,

Hello world bye
Bye hello world

Here the batch size is 2 and the sequence length is 3. After converting to one-hot encoding, your data looks like the following (shape [time_steps, batch_size, 3]):

data = [
    [ [1,0,0], [0,0,1] ],
    [ [0,1,0], [1,0,0] ],
    [ [0,0,1], [0,1,0] ]
]

Now feed in the data in the following format:

feed_dict = {}
for ui in range(3):
    feed_dict[x[ui]] = data[ui]
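As a sanity check, the one-hot `data` array above can be built programmatically; a NumPy sketch, assuming the vocabulary mapping {'hello': 0, 'world': 1, 'bye': 2}:

```python
import numpy as np

vocab = {'hello': 0, 'world': 1, 'bye': 2}
batch = ["Hello world bye", "Bye hello world"]  # batch_size=2, seq length=3

time_steps, batch_size = 3, 2
data = np.zeros((time_steps, batch_size, len(vocab)))
for b, sentence in enumerate(batch):
    for t, word in enumerate(sentence.lower().split()):
        data[t, b, vocab[word]] = 1.0  # one-hot: time-major layout

# data[ui] is the [batch_size, 3] slice fed to placeholder x[ui];
# the placeholders themselves are stood in for by string keys here.
feed = {f"x_{ui}": data[ui] for ui in range(time_steps)}
print(data.shape)  # (3, 2, 3)
```

Time-major layout (time_steps first) is what lets `data[ui]` slice out one timestep across the whole batch, matching the list-of-placeholders feeding scheme above.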