Question

To learn more about deep learning and computer vision, I'm working on a project to perform lane detection on roads. I'm using TFLearn as a wrapper around Tensorflow.

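Background

The training inputs are images of roads (each image is represented as a 50x50 pixel 2D array, with each element a luminance value from 0.0 to 1.0).

The training outputs have the same shape (a 50x50 array), but represent the marked lane area. Essentially, non-road pixels are 0 and road pixels are 1.

This is not a fixed-size image classification problem, but a problem of detecting road vs. non-road pixels in a picture.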
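
To make the data layout described above concrete, here is a minimal NumPy sketch using synthetic data. The sample count `n_samples` and the random arrays are placeholders chosen for illustration, not the real dataset.

```python
import numpy as np

n_samples = 8  # hypothetical number of training images
rng = np.random.default_rng(0)

# Inputs: one 50x50 grayscale image per sample, brightness in the 0.0 to 1.0 range.
X = rng.random((n_samples, 50, 50)).astype(np.float32)

# Targets: one 50x50 mask per sample, 1.0 = road pixel, 0.0 = non-road pixel.
Y = (rng.random((n_samples, 50, 50)) > 0.5).astype(np.float32)

print(X.shape, Y.shape)
```

Both arrays carry the sample index on the first axis and have no channel axis yet.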

Problem

I have not been able to shape my inputs/outputs in a way that TFLearn / Tensorflow accepts, and I'm not sure why. Here is my sample code:

# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).

# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
network = input_data(shape=[None, 50, 50, 1])

network = conv_2d(network, 50, 50, activation='relu')

# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')

network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)

model = tflearn.DNN(network, tensorboard_verbose=1)

model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)

The error I get is raised by the model.fit call:

ValueError: Cannot feed value of shape (1, 50, 50) for Tensor u'InputData/X:0', which has shape '(?, 50, 50, 1)'

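I've tried flattening the sample input/output arrays into 1D vectors (of length 2500), but that leads to other errors.

I'm a bit lost on how to shape all this; any help would be greatly appreciated!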

Answer 1

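Check out ImageFlow, a wrapper for Tensorflow that converts a numpy array containing multiple images into a .tfrecords file, which is the recommended format for use with Tensorflow: https://github.com/HamedMP/ImageFlow

Install it with: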

$ pip install imageflow

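Suppose your numpy array containing some 'k' images is k_images and the corresponding k labels (one-hot encoded) are stored in k_labels. Creating a .tfrecords file named 'tfr_file.tfrecords' is then as simple as writing the line: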

imageflow.convert_images(k_images, k_labels, 'tfr_file')
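
If the labels start out as integer class ids, they can be one-hot encoded with NumPy before being passed on. The following is a minimal sketch; `num_classes` and `labels` are hypothetical values, and the dtype ImageFlow expects is not stated here, so `float32` is an assumption.

```python
import numpy as np

num_classes = 3                  # hypothetical number of classes
labels = np.array([0, 2, 1, 2])  # hypothetical integer class ids for k = 4 images

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the label array yields one row per image.
k_labels = np.eye(num_classes, dtype=np.float32)[labels]

print(k_labels.shape)
```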

Also, Google's Inception model includes code that reads images from folders, assuming each folder represents one label: https://github.com/tensorflow/models/blob/master/inception/inception/data/build_image_data.py

Answer 2

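The error says that you have conflicting tensor shapes: one of rank 4 and the other of rank 3. This is because the input data (X) does not have the shape [-1, 50, 50, 1]. All that is needed here is to reshape X to the correct shape before feeding it to the network.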

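import numpy as np
import tflearn
from tflearn.layers.core import input_data, fully_connected
from tflearn.layers.conv import conv_2d
from tflearn.layers.estimator import regression

# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.

# Reshape with NumPy so X stays an array that model.fit can feed
# (tensorflow.reshape would return a Tensor instead).
X = np.reshape(X, [-1, 50, 50, 1])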
network = input_data(shape=[None, 50, 50, 1])

network = conv_2d(network, 50, 50, activation='relu')

# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')

network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)

model = tflearn.DNN(network, tensorboard_verbose=1)

model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
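
The reshape step can be checked on its own with NumPy, without building the network. The sketch below uses placeholder arrays of zeros. Flattening Y to 2500 values per image is only one possible way to match a 2500-unit fully connected output, as the code comment in the question wonders about; it is an assumption, not something this answer prescribes.

```python
import numpy as np

# Placeholder data standing in for the real training set (4 samples).
X = np.zeros((4, 50, 50), dtype=np.float32)
Y = np.zeros((4, 50, 50), dtype=np.float32)

# Add a trailing channel axis so X matches input_data(shape=[None, 50, 50, 1]).
X = X.reshape(-1, 50, 50, 1)

# Assumption: if the last layer had 2500 units, each target mask
# would be flattened to one 2500-element vector per image.
Y_flat = Y.reshape(-1, 50 * 50)

print(X.shape, Y_flat.shape)
```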

Trouble shaping Tensorflow / TFLearn image inputs/outputs
