How do I write a pseudo-algorithm for a deep learning model?
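I was going through a few deep learning papers on graphs, and they include pseudo-algorithms for the network architecture, for example:
This is the overall architecture described in the paper.
I want to write a pseudo-algorithm of this kind for a custom network. The architecture is very simple: it uses a dynamic RNN for sentence classification.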
The input is a batch of sentences: [ batch_size x max_sentence_length x embedding_dim ]
The labels are: [ batch_size ]
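The network feeds the sentences to the RNN and passes the resulting feature vectors to a second, small "nano" network that handles the attention part and produces an attention vector. A dense layer is then applied to the attention vector to reshape the features and get the probability values, shape [batch_size].
So the nano network is: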
import numpy as np
import tensorflow as tf

# simple soft attention
def nano_network(logits, lstm_units):
    # just for example
    # flatten [batch, time, units] -> [batch * time, units]
    logits_ = tf.reshape(logits, [-1, lstm_units])
    attention_size = tf.get_variable(name='attention_size',
                                     shape=[lstm_units, 1],
                                     dtype=tf.float32,
                                     initializer=tf.random_uniform_initializer(-0.01, 0.01))
    attention_matmul = tf.matmul(logits_, attention_size)
    # back to [batch, time, 1], then drop the trailing axis -> [batch, time]
    output_reshape = tf.reshape(attention_matmul, [tf.shape(logits)[0], tf.shape(logits)[1], -1])
    return tf.squeeze(output_reshape, axis=-1)
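To make the shapes explicit for the pseudo-algorithm, here is a rough NumPy sketch of what `nano_network` computes. The names `nano_network_np` and `attention_weights` are made up for illustration, and the random arrays only stand in for the LSTM outputs and the learned variable. Note that, as written, the function returns one unnormalized score per time step; no softmax is applied.

```python
import numpy as np

def nano_network_np(outputs, attention_weights):
    """NumPy mirror of nano_network: one scalar score per time step."""
    batch_size, max_len, lstm_units = outputs.shape
    flat = outputs.reshape(-1, lstm_units)        # [batch * max_len, units]
    scores = flat @ attention_weights             # [batch * max_len, 1]
    return scores.reshape(batch_size, max_len)    # [batch, max_len]

rng = np.random.default_rng(0)
outputs = rng.standard_normal((12, 50, 5))             # stand-in for the LSTM outputs
attention_weights = rng.uniform(-0.01, 0.01, (5, 1))   # stand-in for 'attention_size'
scores = nano_network_np(outputs, attention_weights)
print(scores.shape)  # (12, 50)
```

In pseudocode this step would read roughly as "for each time step t, score_t = h_t · w".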
The simple RNN network:
import tensorflow as tf

# simple network
class Base_model(object):

    def __init__(self):
        tf.reset_default_graph()

        # define placeholders
        self.sentences = tf.placeholder(tf.float32, [12, 50, 10], name='sentences')  # batch_size x max_sentence_length x dim
        self.targets = tf.placeholder(tf.int32, [12], name='labels')

        with tf.variable_scope('dynamic_rnn') as scope:
            cell = tf.nn.rnn_cell.LSTMCell(num_units=5, state_is_tuple=True)
            outputs, _states = tf.nn.dynamic_rnn(cell, self.sentences, dtype=tf.float32)

        # attention function: one score per time step, shape [12, 50]
        self.output_s = nano_network(outputs, 5)

        # simple linear projection to 2 class logits, shape [12, 2]
        self.output = tf.layers.dense(self.output_s, 2)

        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=self.targets, logits=self.output)

        # loss calculation
        loss = tf.reduce_mean(cross_entropy)

        # train / network weights update
        self.train = tf.train.AdamOptimizer().minimize(loss)
        self.out = {'loss': loss, 'train': self.train}
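For the loss line of the pseudo-algorithm, the computation behind `tf.nn.sparse_softmax_cross_entropy_with_logits` followed by `tf.reduce_mean` can be sketched in NumPy as below. This is only an illustrative re-implementation (the function name and the toy inputs are assumptions), not TensorFlow's actual code.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross entropy from raw logits and integer class labels."""
    # subtract the row maximum for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # pick the log-probability of the true class for each example
    return -log_probs[np.arange(len(labels)), labels]

logits = np.zeros((12, 2))        # toy logits: both classes equally likely
labels = np.array([0, 1] * 6)     # toy integer labels, shape [12]
loss = sparse_softmax_cross_entropy(logits, labels).mean()
print(loss)  # log(2), since every class has probability 0.5
```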
If anyone wants to run this code, the following random values can be used to test it:
# model train
def rand_exec(model):
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(100):
            # one training step on random inputs and random binary labels
            loss_ = sess.run(model.out,
                             feed_dict={
                                 model.sentences: np.random.randint(0, 10, [12, 50, 10]),
                                 model.targets: np.random.randint(0, 2, [12])})
            print(loss_['loss'])

if __name__ == '__main__':
    model = Base_model()
    rand_exec(model)
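Now I want to write a pseudo-algorithm for this simple network.
I tried to convert the network architecture into a pseudo-algorithm; this is my pseudocode:
Here is the textual format of this algorithm: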
But I am not sure whether it is correct, for example the parameter update rule and the other steps in the algorithm.
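On the parameter update rule: the code trains with `tf.train.AdamOptimizer()` at its default settings, so one way to pin that step down is to write Adam out explicitly. Below is a minimal NumPy sketch of a single Adam step following Algorithm 1 of the Adam paper, assuming the defaults learning rate 0.001, beta1 0.9, beta2 0.999 and epsilon 1e-8. The function `adam_step` and the toy loss are illustrative only, and TensorFlow's own implementation differs slightly in how epsilon enters the update.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for m
    v_hat = v / (1.0 - beta2 ** t)                # bias correction for v
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# toy example: loss(w) = sum(w ** 2), so the gradient is 2 * w
w0 = np.array([1.0, -2.0])
m0 = np.zeros_like(w0)
v0 = np.zeros_like(w0)
grad = 2.0 * w0
w1, m1, v1 = adam_step(w0, grad, m0, v0, t=1)
print(w1)  # on the first step each weight moves by about lr toward zero
```

In a pseudo-algorithm this usually collapses to a single line such as "θ ← Adam(θ, ∇θ L)" with the hyperparameters listed once at the top.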
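I would really appreciate it if anyone could give me some suggestions on how to write a pseudo-algorithm for this network architecture.
Thanks!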