I am trying to implement the CNN described in "Richer Convolutional Features for Edge Detection" (RCF). Here is my TensorFlow code.
First, I define a helper function to conveniently create kernels for the conv layers:
import tensorflow as tf
import numpy as np
import cv2

def CreateFilter(name, x, y, z, k):
    return tf.get_variable(name, shape=[x, y, z, k],
                           initializer=tf.contrib.layers.xavier_initializer())
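For context, the Xavier (Glorot) uniform initializer used above samples weights from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). A minimal NumPy sketch of that scheme (the helper name `glorot_uniform` is mine, not part of the TF API):

```python
import numpy as np

def glorot_uniform(shape, rng=None):
    # Sketch of Glorot/Xavier uniform init, assuming a conv kernel
    # layout [h, w, in_ch, out_ch] as used by tf.nn.conv2d:
    # fan_in = h*w*in_ch, fan_out = h*w*out_ch.
    if rng is None:
        rng = np.random.default_rng(0)
    receptive = int(np.prod(shape[:-2])) if len(shape) > 2 else 1
    fan_in = receptive * shape[-2]
    fan_out = receptive * shape[-1]
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape)

# Same shape as CreateFilter('filter1', 3, 3, 3, 64)
w = glorot_uniform([3, 3, 3, 64])
```

The point of this scaling is to keep activation variance roughly constant across layers at initialization.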
The RCF computation graph (I have only built stage 1 so far):
g1 = tf.Graph()
with g1.as_default():
    x = tf.placeholder(tf.float32, shape=[None, None, None, 3], name="x-input")
    y = tf.placeholder(tf.float32, shape=[None, None, None, 3], name="y-input")
    ## stage 1
    layer1 = tf.nn.conv2d(x, CreateFilter('filter1', 3, 3, 3, 64), strides=[1, 1, 1, 1], padding='SAME', name='stage1_layer1')
    layer2 = tf.nn.conv2d(layer1, CreateFilter('filter2', 3, 3, 64, 64), strides=[1, 1, 1, 1], padding='SAME', name='stage1_layer2')
    layer1_side = tf.nn.conv2d(layer1, CreateFilter('side_filter1', 1, 1, 64, 21), strides=[1, 1, 1, 1], padding='SAME', name='stage1_layer1_side')
    layer2_side = tf.nn.conv2d(layer2, CreateFilter('side_filter2', 1, 1, 64, 21), strides=[1, 1, 1, 1], padding='SAME', name='stage1_layer2_side')
    eltwise_layer1 = tf.add(layer1_side, layer2_side, name='eltwise_layer1')
    # deconv1 = tf.nn.conv2d_transpose(eltwise_layer1, ,name='deconv1')
    y_output = tf.nn.conv2d(eltwise_layer1, CreateFilter('Output_filter1', 1, 1, 21, 3), strides=[1, 1, 1, 1], padding='SAME',
                            name='stage1_output_layer')
    pool_layer1 = tf.nn.max_pool(layer2, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='SAME', name='pool_layer1')
    ## stage 2
    ## stage 3
    ## stage 4
    ## stage 5
    ## fusion stage
    # loss function
    loss_fn = tf.losses.sigmoid_cross_entropy(y_output, y)
    #loss_fn_mean = tf.reduce_mean(loss_fn)
    # train model
    optimizer = tf.train.AdamOptimizer(1e-4)
    train = optimizer.minimize(loss_fn)  # training objective: minimize the loss
    # initialize
    init = tf.global_variables_initializer()
Then I train the model; the dataset has 70 images and the batch size is 2:
steps = 1000
with tf.Session(graph=g1) as sess:
    sess.run(init)
    writer = tf.summary.FileWriter('./graph', sess.graph)
    # training process
    for i in range(steps):
        X = np.array([cv2.imread('./Data/X' + str((i % 70) + 1) + '.jpg'),
                      cv2.imread('./Data/X' + str(((i + 1) % 70) + 1) + '.jpg')]) / 255
        Y = np.array([cv2.imread('./Data/Y' + str((i % 70) + 1) + '.jpg'),
                      cv2.imread('./Data/Y' + str(((i + 1) % 70) + 1) + '.jpg')]) / 255
        print('****************')
        print(i)
        sess.run(train, feed_dict={x: X, y: Y})
    predict = sess.run(y_output, feed_dict={x: X})[0, :, :, :] * 255
    cv2.imwrite('./predict.png', predict)
During training, the output (prediction) keeps growing. After training, I find that the network's output is inf; presumably the kernel weights themselves have diverged to inf. What is causing this? Is it the network architecture itself?
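One thing worth double-checking while debugging: `tf.losses.sigmoid_cross_entropy` takes `(multi_class_labels, logits)` in that order, so `sigmoid_cross_entropy(y_output, y)` passes the network output where the labels are expected, and the loss is not symmetric in its two arguments. A NumPy sketch of the per-element loss (in the numerically stable form) can be used to sanity-check values offline:

```python
import numpy as np

def sigmoid_cross_entropy(labels, logits):
    # Numerically stable elementwise sigmoid cross entropy:
    # max(x, 0) - x*z + log(1 + exp(-|x|)), with x = logits, z = labels.
    x = np.asarray(logits, dtype=float)
    z = np.asarray(labels, dtype=float)
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

# At logit 0 the loss is log(2) ~ 0.6931 regardless of the label,
# but in general swapping labels and logits changes the result:
print(sigmoid_cross_entropy(1.0, 0.0))
print(sigmoid_cross_entropy(1.0, 5.0), sigmoid_cross_entropy(5.0, 1.0))
```

Comparing this reference against the values the graph produces for a small batch should show quickly whether the loss being minimized is the one intended.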