Problem shaping Tensorflow/TFLearn image input/output

Time: 2016-10-15 21:30:02

Tags: python machine-learning computer-vision neural-network tensorflow

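To learn more about deep learning and computer vision, I'm working on a project to perform lane detection on roads. I'm using TFLearn as a wrapper around Tensorflow.

Background

The training inputs are images of roads (each image is represented as a 50x50 pixel 2D array, with each element being a brightness value from 0.0 to 1.0).

The training outputs have the same shape (a 50x50 array), but represent the marked lane area. Essentially, non-road pixels are 0 and road pixels are 1.

This is not a fixed-size image classification problem, but a problem of detecting road vs. non-road pixels in a picture.

Problem

I have not been able to shape my inputs/outputs in a way that TFLearn/Tensorflow accepts, and I'm not sure why. Here is my sample code: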

# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).

# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
network = input_data(shape=[None, 50, 50, 1])

network = conv_2d(network, 50, 50, activation='relu')

# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')

network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)

model = tflearn.DNN(network, tensorboard_verbose=1)

model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)

The error I receive is on the model.fit call:

ValueError: Cannot feed value of shape (1, 50, 50) for Tensor u'InputData/X:0', which has shape '(?, 50, 50, 1)'

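I've tried reducing the sample input/output arrays to 1D vectors (of length 2500), but that leads to other errors.

I'm a bit lost on how to shape all this; any help would be greatly appreciated!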

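2 Answers:

Answer 0 (score: 1)

Take a look at the imageflow wrapper for tensorflow, which converts a numpy array containing multiple images into a .tfrecords file, the suggested format for use with tensorflow: https://github.com/HamedMP/ImageFlow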

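You have to install it using

$ pip install imageflow

Suppose your numpy array containing some 'k' images is k_images, and the corresponding k labels (one-hot encoded) are stored in k_labels. Then creating a .tfrecords file named 'tfr_file.tfrecords' is as simple as writing the line

imageflow.convert_images(k_images, k_labels, 'tfr_file')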

Alternatively, Google's Inception model contains code to read images from a folder, assuming each folder represents one label: https://github.com/tensorflow/models/blob/master/inception/inception/data/build_image_data.py

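Answer 1 (score: 1)

The error states that you have conflicting tensor shapes, one with 4 dimensions and the other with 3. This is because the input data (X) is not of shape [-1, 50, 50, 1]. All that is needed here is to reshape X to the correct shape before feeding it into the network.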

# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.

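import numpy as np

# Reshape with NumPy rather than tensorflow.reshape: model.fit feeds X to the
# input placeholder, which expects an array, not a Tensor.
X = np.reshape(X, [-1, 50, 50, 1])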
network = input_data(shape=[None, 50, 50, 1])

network = conv_2d(network, 50, 50, activation='relu')

# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')

network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)

model = tflearn.DNN(network, tensorboard_verbose=1)

model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
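
The reshape step can be checked on its own, without building the network. The following is a minimal sketch using NumPy only and made-up random data; num_samples, X_fed and Y_fed are illustrative names that do not appear in the original post. Flattening Y to length 2500 assumes the final fully_connected layer is given 2500 units instead of 50, which is a change this answer does not make, so treat that part as one possible way to get the output shapes to agree rather than a confirmed fix.

```python
import numpy as np

# Hypothetical stand-in data: 4 grayscale road images and 4 lane masks.
num_samples = 4
X = np.random.rand(num_samples, 50, 50).astype(np.float32)          # brightness in [0.0, 1.0)
Y = (np.random.rand(num_samples, 50, 50) > 0.5).astype(np.float32)  # 0 = non-road, 1 = road

# Add the trailing channel axis expected by input_data(shape=[None, 50, 50, 1]).
X_fed = X.reshape(-1, 50, 50, 1)

# Assumption: the last layer has 2500 units, so each mask is flattened to match.
Y_fed = Y.reshape(-1, 50 * 50)

print(X_fed.shape)  # (4, 50, 50, 1)
print(Y_fed.shape)  # (4, 2500)
```

Reshaping only adds or merges axes; the pixel values and their order are unchanged, so X_fed[..., 0] is still equal to the original X.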