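I am currently trying to build a CNN with tflearn to detect objects. My data comes from a pickle file, so I don't have any .png files or similar. My images are stored as a numpy.array with shape:
(34799, 32, 32, 3)
34799 is the number of images, so each individual image has shape (32, 32, 3).
My CNN is defined as follows: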
import tflearn
from tflearn.layers.core import input_data, fully_connected, flatten, dropout
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression
from tflearn.metrics import Accuracy

# Building convolutional network
def neural_network(X, y, dropoutRate=0.8):
    network = input_data(shape=[None, 32, 32, 3], name='input')
    network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")
    network = conv_2d(network, 6, 4, activation='relu')
    network = max_pool_2d(network, 2)
    network = conv_2d(network, 16, 5, strides=1, activation="relu", padding="VALID")
    network = max_pool_2d(network, 2, padding="VALID")
    network = dropout(incoming=network, keep_prob=dropoutRate)
    network = fully_connected(network, 84, activation="relu")
    network = flatten(network)
    network = fully_connected(network, 43, activation='softmax')
    acc = Accuracy()
    network = regression(network, optimizer='adam', learning_rate=0.001,
                         loss='categorical_crossentropy', name='target')

    # Training
    model = tflearn.DNN(network, tensorboard_verbose=0)
    # Fit on the X, y passed to this function (the original fitted on the
    # globals X_test, y_test). X_valid and y_valid are still module-level.
    model.fit(X, y, n_epoch=7, batch_size=20, show_metric=True, snapshot_epoch=True,
              run_id="trafficSign", snapshot_step=500, validation_set=(X_valid, y_valid))
    return model
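To make the shapes in the network above easier to follow, here is a small sketch that traces the spatial size of a 32x32 input through the convolution and pooling layers. It is plain Python, not tflearn code. It assumes tflearn's defaults as I understand them (padding "same" for conv_2d and max_pool_2d when not given, and a pooling stride equal to the kernel size); the helper names conv_out and trace_shapes are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding="same"):
    """Spatial output size of one convolution or pooling layer."""
    if padding.lower() == "same":
        # "same" padding: output is ceil(size / stride)
        return -(-size // stride)
    # "valid" padding: no padding, the window must fit entirely
    return (size - kernel) // stride + 1


def trace_shapes(size=32):
    """Return (layer description, spatial size, channels) for each layer."""
    shapes = []
    size = conv_out(size, 5, 1, "valid")
    shapes.append(("conv 5x5, 6 filters, VALID", size, 6))
    size = conv_out(size, 4, 1, "same")   # assumed default padding
    shapes.append(("conv 4x4, 6 filters, same", size, 6))
    size = conv_out(size, 2, 2, "same")   # assumed stride == kernel size
    shapes.append(("max pool 2x2, same", size, 6))
    size = conv_out(size, 5, 1, "valid")
    shapes.append(("conv 5x5, 16 filters, VALID", size, 16))
    size = conv_out(size, 2, 2, "valid")
    shapes.append(("max pool 2x2, VALID", size, 16))
    return shapes


shapes = trace_shapes(32)
for name, size, channels in shapes:
    print(f"{name}: {size}x{size}x{channels}")

# Number of features entering the first fully connected layer
flat_features = shapes[-1][1] ** 2 * shapes[-1][2]
print("flattened features:", flat_features)
```

Under those assumptions the spatial size goes 28, 28, 14, 10, 5, so 5 * 5 * 16 = 400 features reach the fully connected layers. The number of input channels (3 for RGB, 1 for grayscale) does not change any of these spatial sizes; it only has to match what input_data declares.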
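My problem arises when I convert the images to grayscale with the built-in TensorFlow function:
tf.image.rgb_to_grayscale(X_train)
This is the tensor the function returns:
<tf.Tensor 'rgb_to_grayscale_6:0' shape=(34799, 32, 32, 1) dtype=float64>
But when I change the first layer of the CNN, input_data(), to shape [32, 32, 1], I get an error saying the shape is wrong: the value cannot be fed because it has shape [32, 32].
So my question is: is there a simple way to add the 1 to my shape?
Thanks for your help, and let me know if you need more information.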
Answer 0 (score: 2)
First solution: you can do the conversion inside the network:
network = input_data(shape=[None, 32, 32, 3], name='input')
network = tf.image.rgb_to_grayscale(network)
network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")
...
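Second solution: alternatively, you can avoid the extra cost of converting the data on every epoch by doing it once beforehand.
Convert the RGB images to grayscale with PIL or OpenCV.
import numpy as np
# X_TRAIN now has shape (34799, 32, 32)
# convert the input to 4D by appending a channel axis
X_TRAIN = np.expand_dims(X_TRAIN, 3)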
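For reference, np.expand_dims is one of several equivalent NumPy ways to append the missing channel axis. The sketch below uses a small zero-filled stand-in array (8 images instead of 34799, an assumption made only to keep the example light) and shows that all three forms give the same result.

```python
import numpy as np

# Stand-in for the grayscale data: 8 images of 32x32 (the real set has 34799)
gray = np.zeros((8, 32, 32))

a = np.expand_dims(gray, 3)           # same call as in the answer
b = gray[..., np.newaxis]             # index with a new trailing axis
c = gray.reshape(gray.shape + (1,))   # explicit reshape

print(a.shape, b.shape, c.shape)
```

All three return an array of shape (8, 32, 32, 1) without changing the pixel values, so which one to use is mostly a matter of taste.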
Then use a slightly modified version of the first code:
network = input_data(shape=[None, 32, 32, 1], name='input')
network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")
...
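If you would rather not pull in PIL or OpenCV, the whole preprocessing step of the second solution can also be sketched in NumPy alone. This is a minimal sketch, not the answerer's code: the function name to_grayscale is hypothetical, the demo array is random data standing in for X_TRAIN, and the weights are the commonly used luma coefficients (0.2989, 0.5870, 0.1140), which I believe match what tf.image.rgb_to_grayscale uses, but treat that as an assumption.

```python
import numpy as np

# Commonly used RGB -> luma weights (assumed to match TensorFlow's conversion)
LUMA_WEIGHTS = np.array([0.2989, 0.5870, 0.1140])


def to_grayscale(rgb_batch):
    """Convert a (N, H, W, 3) RGB batch to a (N, H, W, 1) float32 batch."""
    rgb_batch = np.asarray(rgb_batch, dtype=np.float32)
    # Weighted sum over the last (channel) axis gives shape (N, H, W)
    gray = np.dot(rgb_batch, LUMA_WEIGHTS)
    # Append the channel axis so the data fits input_data(shape=[None, 32, 32, 1])
    return gray[..., np.newaxis].astype(np.float32)


# Random stand-in for X_TRAIN: 8 RGB images of 32x32
rng = np.random.RandomState(0)
X_demo = rng.randint(0, 256, size=(8, 32, 32, 3)).astype(np.uint8)

X_gray = to_grayscale(X_demo)
print(X_gray.shape, X_gray.dtype)
```

Doing the conversion once up front means the array you pass to model.fit already has the trailing 1 in its shape, and any mini-batch sliced from it keeps that axis.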