To learn more about deep learning and computer vision, I'm working on a project to perform lane detection on roads. I'm using TFLearn as a wrapper around TensorFlow.
Background
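The training inputs are images of roads (each image is represented as a 50x50 2D array, with each element holding a brightness value from 0.0 to 1.0).
The training outputs have the same shape (a 50x50 array), but represent the marked lane area. Essentially, non-road pixels are 0 and road pixels are 1.
This is not a fixed-size image classification problem, but a problem of detecting road versus non-road pixels in a picture.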
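As a minimal NumPy sketch of the layout described above: the random frame and the rule that marks the lower half of the frame as road are made-up placeholders for illustration, not real data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder 8-bit grayscale frame; a real frame would come from a camera or a file.
frame_u8 = rng.integers(0, 256, size=(50, 50), dtype=np.uint8)

# Training input: brightness scaled to the range 0.0 to 1.0.
x = frame_u8.astype(np.float32) / 255.0

# Training output: same 50x50 shape, 1.0 for road pixels and 0.0 otherwise.
# Marking the lower half as road is an arbitrary stand-in for a hand-labelled mask.
y = np.zeros((50, 50), dtype=np.float32)
y[25:, :] = 1.0

print(x.shape, y.shape)
```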
Problem
I can't manage to shape my inputs/outputs in a way that TFLearn / TensorFlow accepts, and I'm not sure why. Here is my sample code:
# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
network = input_data(shape=[None, 50, 50, 1])
network = conv_2d(network, 50, 50, activation='relu')
# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')
network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)
model = tflearn.DNN(network, tensorboard_verbose=1)
model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
The error I get is raised by the model.fit call:
ValueError: Cannot feed value of shape (1, 50, 50) for Tensor u'InputData/X:0', which has shape '(?, 50, 50, 1)'
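I tried flattening the sample input/output arrays into 1D vectors (of length 2500), but that leads to other errors.
I'm a bit lost on how to shape all of this, and any help would be much appreciated!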
Answer 0 (score: 1)
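Take a look at ImageFlow, a wrapper for TensorFlow that converts a numpy array containing multiple images into a .tfrecords file, which is the suggested format for working with TensorFlow: https://github.com/HamedMP/ImageFlow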
You have to install it with
$ pip install imageflow
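Suppose your numpy array containing some k images is k_images, and the corresponding k labels (one-hot encoded) are stored in k_labels. Creating a .tfrecords file named 'tfr_file.tfrecords' is then as simple as writing the line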
imageflow.convert_images(k_images, k_labels, 'tfr_file')
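For illustration, the two arrays could be prepared roughly as follows. This is only a sketch: the image count, the number of classes and the random contents are placeholders, and the exact layout ImageFlow expects for k_images (for example, whether a channel axis is required) is an assumption that should be checked against its README. The convert_images call is left commented out so the snippet runs without the package installed.

```python
import numpy as np

k = 4            # hypothetical number of images
num_classes = 3  # hypothetical number of label classes

rng = np.random.default_rng(0)

# k grayscale images of 50x50 pixels with values in [0.0, 1.0) (placeholder data).
k_images = rng.random((k, 50, 50)).astype(np.float32)

# One integer class id per image, turned into one-hot rows by indexing
# the rows of an identity matrix.
class_ids = np.array([0, 2, 1, 2])
k_labels = np.eye(num_classes, dtype=np.float32)[class_ids]

print(k_images.shape, k_labels.shape)

# With the package installed, the call from this answer would then be:
# import imageflow
# imageflow.convert_images(k_images, k_labels, 'tfr_file')
```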
Also, Google's Inception model includes code that reads images from folders, assuming each folder represents one label: https://github.com/tensorflow/models/blob/master/inception/inception/data/build_image_data.py
Answer 1 (score: 1)
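The error says that you have conflicting tensor shapes, one with 4 dimensions and the other with 3. This is because the input data (X) does not have the shape [-1, 50, 50, 1]. All that is needed here is to reshape X into the correct shape before feeding it into the network.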
# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
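import numpy as np
# Add a trailing channel axis so X matches the (?, 50, 50, 1) placeholder.
# NumPy is used instead of tensorflow.reshape because values fed to
# model.fit must be arrays, not tensors.
X = np.reshape(X, (-1, 50, 50, 1))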
network = input_data(shape=[None, 50, 50, 1])
network = conv_2d(network, 50, 50, activation='relu')
# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')
network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)
model = tflearn.DNN(network, tensorboard_verbose=1)
model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
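The reshape itself can be checked in isolation with NumPy, without building the network. The sketch below uses made-up data with a hypothetical 8 samples. It also flattens Y to one row of 2500 values per image, which is only one possible way to make the targets line up with a 2500-unit output layer, and is an assumption that goes beyond what the error message itself requires.

```python
import numpy as np

n_samples = 8  # hypothetical training-set size

# Placeholder data in the question's layout: one 50x50 array per sample.
X = np.random.rand(n_samples, 50, 50).astype(np.float32)
Y = (np.random.rand(n_samples, 50, 50) > 0.5).astype(np.float32)

# Add the trailing channel axis expected by input_data(shape=[None, 50, 50, 1]).
# The -1 lets NumPy infer the number of samples.
X = X.reshape(-1, 50, 50, 1)

# Assumption: flatten each label mask so it can be compared with a
# 2500-unit output layer (one unit per pixel).
Y = Y.reshape(-1, 50 * 50)

print(X.shape)
print(Y.shape)
```

With per-pixel 0/1 targets like these, a sigmoid output paired with a binary cross-entropy loss is a more common choice than softmax with categorical cross-entropy, although that is a separate question from the shape error.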