Deep learning: how to write a pseudo-algorithm for a network architecture

Date: 2019-06-11 19:29:18

Tags: python algorithm tensorflow machine-learning deep-learning

How do you write a pseudo-algorithm for a deep-learning model?

I have been reading several deep-learning papers, and many of them include a pseudo-algorithm for the network architecture, for example:

[image: example pseudo-algorithm from a paper]

This is the overall architecture as described in the paper.

I want to write a similar pseudo-algorithm for a custom network. The network structure is very simple: it uses a dynamic RNN for sentence classification.

Input: sentences of shape [batch_size x max_sentence_length x embedding_dim]

Labels: shape [batch_size]

The network runs the sentences through an RNN and sends the feature vectors to another "nano network" for the attention part, which produces an attention vector; a dense layer is then applied to the attention vector to project the features down to class logits, from which the per-example loss of shape [batch_size] is computed.
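If I write the forward pass as equations, my understanding of the network below is roughly this (a sketch only; $B$, $T$, $U$ are the batch size, sentence length and LSTM units, $w_a$ is the attention weight in nano_network, and $W_d$, $b_d$ are the dense-layer parameters):

\begin{align*}
  H &= \mathrm{LSTM}(X)        && H \in \mathbb{R}^{B \times T \times U} \\
  A_{b,t} &= H_{b,t,:}\, w_a   && \text{per-timestep attention scores, } A \in \mathbb{R}^{B \times T} \\
  Z &= A\, W_d + b_d           && \text{class logits, } Z \in \mathbb{R}^{B \times 2} \\
  \mathcal{L} &= \tfrac{1}{B} \textstyle\sum_{b=1}^{B} \mathrm{CE}\!\left(y_b,\ \mathrm{softmax}(Z_{b,:})\right)
\end{align*}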

So the nano network is:

import numpy as np
import tensorflow as tf

# simple soft attention: a learned linear score per timestep

def nano_network(logits, lstm_units):

    # flatten [batch, time, units] -> [batch*time, units]
    logits_ = tf.reshape(logits, [-1, lstm_units])
    attention_size = tf.get_variable(name='attention_size',
                                     shape=[lstm_units, 1],
                                     dtype=tf.float32,
                                     initializer=tf.random_uniform_initializer(-0.01, 0.01))

    # score each timestep: [batch*time, 1]
    attention_matmul = tf.matmul(logits_, attention_size)
    # restore the batch/time axes and drop the trailing singleton: [batch, time]
    output_reshape = tf.reshape(attention_matmul,
                                [tf.shape(logits)[0], tf.shape(logits)[1], -1])
    return tf.squeeze(output_reshape)
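To convince myself of the shapes, I ran a quick standalone check like this (my own test code, not part of the model; it assumes TensorFlow 1.x and that nano_network above has already been defined):

# quick shape check for nano_network: batch=12, time=50, lstm_units=5
import numpy as np
import tensorflow as tf

tf.reset_default_graph()
rnn_out = tf.placeholder(tf.float32, [12, 50, 5])   # [batch, time, lstm_units]
scores = nano_network(rnn_out, lstm_units=5)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    vals = sess.run(scores,
                    feed_dict={rnn_out: np.random.randn(12, 50, 5).astype(np.float32)})
    print(vals.shape)  # (12, 50): one unnormalized attention score per timestep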

And the simple RNN network:

import tensorflow as tf

# simple network

class Base_model(object):

    def __init__(self):

        tf.reset_default_graph()

        # define placeholders: batch_size x max_sentence_length x embedding_dim
        self.sentences = tf.placeholder(tf.float32, [12, 50, 10], name='sentences')
        self.targets   = tf.placeholder(tf.int32, [12], name='labels')

        # encode the sentences with a dynamic LSTM
        with tf.variable_scope('dynamic_rnn') as scope:
            cell = tf.nn.rnn_cell.LSTMCell(num_units=5, state_is_tuple=True)
            outputs, _states = tf.nn.dynamic_rnn(cell, self.sentences, dtype=tf.float32)

        # attention function: per-timestep scores of shape [batch, time]
        self.output_s = nano_network(outputs, 5)

        # linear projection of the attention vector to class logits [batch, 2]
        self.output = tf.layers.dense(self.output_s, 2)

        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=self.targets, logits=self.output)

        # loss calculation: mean cross-entropy over the batch
        loss = tf.reduce_mean(cross_entropy)

        # train / network weights update with Adam (default hyperparameters)
        self.train = tf.train.AdamOptimizer().minimize(loss)

        self.out = {'loss': loss, 'train': self.train}

If anyone wants to run this code, it can be tested with the following random values:

# model train on random data
def rand_exec(model):
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for i in range(100):
            loss_ = sess.run(model.out,
                             feed_dict={
                                 model.sentences: np.random.randint(0, 10, [12, 50, 10]),
                                 model.targets:   np.random.randint(0, 2, [12])})

            print(loss_['loss'])

if __name__ == '__main__':
    model = Base_model()
    rand_exec(model)

Now I want to write a pseudo-algorithm for this simple network.

I tried to convert this network architecture into a pseudo-algorithm; here is my pseudocode:

[image: my attempted pseudo-algorithm]

But I am not sure whether it is correct, for example the parameter-update rule and other details of the algorithm.
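For concreteness, here is the structure I am aiming for, written out as a sketch derived from my code above ($\theta$ stands for all trainable parameters: the LSTM weights, $w_a$, $W_d$ and $b_d$; the Adam update is abbreviated to a single step):

% sketch of the training procedure for the model above
\begin{algorithm}
\caption{Training the attention-based sentence classifier}
\begin{algorithmic}[1]
\State Initialize parameters $\theta$ (uniform init for $w_a$, framework defaults elsewhere)
\For{$i = 1, \dots, \text{num\_steps}$}
    \State Sample a batch $(X, y)$ with $X \in \mathbb{R}^{B \times T \times D}$, $y \in \{0, 1\}^{B}$
    \State $H \gets \mathrm{LSTM}(X)$ \Comment{$H \in \mathbb{R}^{B \times T \times U}$}
    \State $A \gets H\, w_a$ \Comment{attention scores, $A \in \mathbb{R}^{B \times T}$}
    \State $Z \gets A\, W_d + b_d$ \Comment{class logits, $Z \in \mathbb{R}^{B \times 2}$}
    \State $\mathcal{L} \gets \frac{1}{B} \sum_{b=1}^{B} \mathrm{CE}\left(y_b, \mathrm{softmax}(Z_{b,:})\right)$
    \State $\theta \gets \mathrm{Adam}(\theta, \nabla_{\theta} \mathcal{L})$ \Comment{parameter update}
\EndFor
\end{algorithmic}
\end{algorithm}

In particular, I am unsure whether the update step should be expanded into Adam's moment-estimate equations or left as a single abstract step like this.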

I would appreciate any suggestions on how to write a pseudo-algorithm for this network architecture.

Thanks!
