TensorFlow - Semantic segmentation

Date: 2016-07-31 11:04:25

Tags: matplotlib neural-network tensorflow

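Posting this to check whether there is anything wrong with my implementation of a simple semantic segmentation model in TensorFlow. The code below is a sanity check that uses only a single image from my database, on which I am trying to overfit the model. It is a binary classification problem: each image pixel maps to a label in [0,1] in the ground truth.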

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

img = plt.imread('image.png') # Image of size [750,750,3]
lab = plt.imread('map.png') # Ground truth of size [750,750]

img = np.expand_dims(img, 0)
lab = np.expand_dims(lab, 0)

# Placeholders are defined first so the graph is actually built on them
x = tf.placeholder(tf.float32, shape=[None, None, None, 3]) # input images, NHWC
y = tf.placeholder(tf.int32, shape=[None]) # flattened per-pixel labels

w1 = tf.Variable(tf.constant(0.001, shape=[3,3,3,32]))
b1 = tf.Variable(tf.constant(0.0, shape=[32]))

w2 = tf.Variable(tf.constant(0.001, shape=[3,3,32,2]))
b2 = tf.Variable(tf.constant(0.0, shape=[2]))

mul = tf.nn.conv2d(x, w1, strides=[1,1,1,1], padding='SAME')
bias_add = tf.add(mul, b1)
conv1 = tf.nn.relu(bias_add)

mul2 = tf.nn.conv2d(conv1, w2, strides=[1,1,1,1], padding='SAME')
bias_add2 = tf.add(mul2, b2)
conv2 = tf.nn.relu(bias_add2)

sess = tf.InteractiveSession()

lab = lab.astype('int32')

conv2_out = tf.reshape(conv2, [-1, 2])
lab = np.reshape(lab, [-1])

prediction = tf.nn.softmax(conv2_out) # I use this to visualize prediction of the model, and calculate accuracy

loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=conv2_out, labels=y))

optimizer = tf.train.AdamOptimizer(0.001).minimize(loss)

# tf.argmax returns int64, so cast it to match the int32 labels
correct_pred = tf.equal(tf.cast(tf.argmax(prediction, 1), tf.int32), y)
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

tf.initialize_all_variables().run()

step = 1
max_steps = 5

while step < max_steps:
    sess.run(optimizer, feed_dict={x: img, y: lab})
    loss_val, acc = sess.run([loss, accuracy], feed_dict={x: img, y: lab})
    print("Iter:" + str(step) + " Loss : " + "{:.6f}".format(loss_val) + " Accuracy : " + "{:.6f}".format(acc))
    step += 1

print("optimization finished!")

prediction_logits = prediction.eval(feed_dict={x: img})
weights = w1.eval() # first layer learned weights

prediction_logits = np.reshape(prediction_logits, [750,750,2])

plt.figure() # Plotting original image with predicted labels
plt.imshow(img[0,:,:,:])
plt.imshow(prediction_logits[:,:,0], cmap=plt.cm.binary, alpha=0.5) # semi-transparent so the image underneath stays visible
plt.show()

plt.figure() # Plotting first layer weights
for i in range(32):
    plt.subplot(8,4,i+1)
    plt.imshow(weights[:,:,:,i])
plt.show()

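When I run this (as an interactive session) just to train the model to overfit on this single image, the loss decreases, but my accuracy does not seem to change. I am not sure I understand how the tf.argmax function works, or whether I have implemented it correctly: the accuracy stays at a single value no matter how many iterations I run.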
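To make the argmax step concrete, here is a minimal NumPy sketch of the same accuracy computation. It assumes `np.argmax` along axis 1 as a stand-in for `tf.argmax(prediction, 1)`; the toy labels and the two "early" and "late" probability tables are made up for illustration and are not outputs of the model above. It shows one way the loss can fall while accuracy stays flat: the probabilities move toward the right answer, but the most probable class per pixel never changes.

```python
import numpy as np

def pixel_accuracy(probs, labels):
    """Fraction of pixels whose most probable class equals the label.

    probs:  float array of shape [num_pixels, num_classes]
    labels: int array of shape [num_pixels]
    """
    predicted = np.argmax(probs, axis=1)  # index of the largest entry in each row
    return float(np.mean(predicted == labels))

def mean_cross_entropy(probs, labels):
    """Average negative log-probability assigned to the true class."""
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs)))

# Hypothetical toy data: 4 pixels, 2 classes, 3 background pixels and 1 foreground.
labels = np.array([0, 0, 0, 1])

# Two made-up snapshots of softmax output, "early" and "late" in training.
early = np.array([[0.55, 0.45], [0.55, 0.45], [0.55, 0.45], [0.55, 0.45]])
late = np.array([[0.90, 0.10], [0.90, 0.10], [0.90, 0.10], [0.60, 0.40]])

for name, probs in [("early", early), ("late", late)]:
    print(name, mean_cross_entropy(probs, labels), pixel_accuracy(probs, labels))
```

In both snapshots every pixel is assigned class 0, so accuracy equals the share of class-0 pixels even though the cross-entropy is lower in the second one. Whether this is what happens in the model above is not established here; it is only one possibility worth ruling out.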

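Thoughts? Also, am I plotting the image and the predicted labels correctly, or is there a mistake here? (Please point out any other errors, or best practices I am not following.)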
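One thing that may be worth checking, offered as a guess rather than a diagnosis: `plt.imread` returns PNG data as floats in the range 0 to 1, and `astype('int32')` truncates toward zero. If the foreground in the ground-truth map is stored as anything other than pure white, the cast would turn those pixels into 0. The sketch below uses a hypothetical 2x2 mask with foreground gray level 200 out of 255 (an assumption about the file, not something stated in the question) to compare truncation with an explicit threshold.

```python
import numpy as np

# Hypothetical 2x2 mask as a PNG reader might return it: floats in [0, 1].
# The foreground gray level of 200/255 is an assumption for illustration.
mask = np.array([[0.0, 200.0 / 255.0],
                 [200.0 / 255.0, 0.0]], dtype=np.float32)

truncated = mask.astype('int32')            # truncates toward zero
thresholded = (mask > 0.5).astype('int32')  # explicit foreground test

print(truncated.tolist())    # [[0, 0], [0, 0]]
print(thresholded.tolist())  # [[0, 1], [1, 0]]
```

Printing `np.unique(lab)` right after the cast is a quick way to confirm that both classes are actually present in the labels.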

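Also, what is the recommended way to apply regularization to the weights? I found tf.contrib.layers.l2_regularizer to be a viable option, but how would I include it in this scenario? As a simple sum with the loss function?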
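For reference, the "simple sum" form of L2 regularization can be sketched in NumPy as follows. The penalty here is half the sum of squared weights, which is the quantity `tf.nn.l2_loss` computes. The names `beta` and `data_loss` are hypothetical: `beta` is a strength that would need tuning, and `data_loss` is a placeholder number standing in for the cross-entropy, not a measured result.

```python
import numpy as np

def l2_penalty(weights):
    """Half the sum of squared entries, the same quantity tf.nn.l2_loss computes."""
    return 0.5 * float(np.sum(np.square(weights)))

# Stand-ins for the two convolution kernels in the question (same shapes and
# constant initial value); biases are left out of the penalty by choice.
w1 = np.full((3, 3, 3, 32), 0.001)
w2 = np.full((3, 3, 32, 2), 0.001)

beta = 1e-4       # hypothetical regularization strength, to be tuned
data_loss = 0.69  # placeholder for the cross-entropy value, not a real result

total_loss = data_loss + beta * (l2_penalty(w1) + l2_penalty(w2))
print(total_loss)
```

In the graph above, the analogous expression would presumably be `loss + beta * (tf.nn.l2_loss(w1) + tf.nn.l2_loss(w2))`, with that sum passed to `minimize` instead of `loss` alone.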

0 answers:

No answers