Machine learning: converting tflearn / tensorflow images to grayscale

Date: 2017-08-07 16:36:27

Tags: python numpy tensorflow tflearn

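I am currently trying to build a CNN with tflearn to detect objects. My data comes from a pickle file, so I don't have any .png files or similar. My images are stored as a numpy.array with shape:

 (34799, 32, 32, 3)

34799 is the number of images, so each individual image has shape (32, 32, 3).

My CNN is defined as follows: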

import tflearn
from tflearn.layers.core import input_data, fully_connected, flatten, dropout
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression
from tflearn.metrics import Accuracy

# Building convolutional network
def neural_network(X, y, dropoutRate=0.8):
    network = input_data(shape=[None, 32, 32, 3], name='input')

    network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")

    network = conv_2d(network, 6, 4, activation='relu')
    network = max_pool_2d(network, 2)

    network = conv_2d(network, 16, 5, strides=1, activation="relu", padding="VALID")
    network = max_pool_2d(network, 2, padding="VALID")

    network = dropout(incoming=network, keep_prob=dropoutRate)
    network = fully_connected(network, 84, activation="relu")
    network = flatten(network)
    network = fully_connected(network, 43, activation='softmax')

    acc = Accuracy()
    network = regression(network, optimizer='adam', learning_rate=0.001,
                         loss='categorical_crossentropy', name='target')
    # Training
    model = tflearn.DNN(network, tensorboard_verbose=0)
    # Fit on the arrays passed to this function; X_valid / y_valid are expected to exist in the enclosing scope
    model.fit(X, y, n_epoch=7, batch_size=20, show_metric=True, snapshot_epoch=True, run_id="trafficSign", snapshot_step=500, validation_set=(X_valid, y_valid))
    return model

My problem arises when I convert the images to grayscale with the built-in TensorFlow function:

tf.image.rgb_to_grayscale(X_train)

This is the tensor the function returns:

<tf.Tensor 'rgb_to_grayscale_6:0' shape=(34799, 32, 32, 1) dtype=float64>

But when I change the first part of the CNN, input_data(), to shape [32, 32, 1], I get an error saying the shape is wrong: the value cannot be fed because it has shape [32, 32].

So my question is: is there a simple way to add the 1 to my shape?

Thanks for your help, and let me know if you need more information.

1 answer:

Answer 0 (score: 2)

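First solution: make the change inside the network, converting to grayscale right after the input layer (this needs import tensorflow as tf):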

network = input_data(shape=[None, 32, 32, 3], name='input')
network = tf.image.rgb_to_grayscale(network)
network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")
...
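
To illustrate why the first solution needs no reshape, here is a minimal NumPy sketch of the kind of computation `tf.image.rgb_to_grayscale` performs: a weighted sum over the channel axis that keeps that axis with size 1. The function name `rgb_to_grayscale_np` and the random test batch are assumptions made for illustration, and the weights are the standard luma coefficients; TensorFlow's exact rounding and dtype handling may differ.

```python
import numpy as np

# Standard luma weights for R, G, B (assumed to mirror TensorFlow's choice).
LUMA_WEIGHTS = np.array([0.2989, 0.5870, 0.1140])


def rgb_to_grayscale_np(images):
    """Hypothetical NumPy stand-in for tf.image.rgb_to_grayscale.

    images: array whose last axis holds the 3 RGB channels.
    Returns an array of the same rank whose last axis has size 1.
    """
    images = np.asarray(images, dtype=np.float64)
    # np.dot with a 1-D vector sums over the last (channel) axis.
    gray = np.dot(images, LUMA_WEIGHTS)
    # Put the channel axis back so the result stays 4-D for conv_2d.
    return gray[..., np.newaxis]


# Small random batch standing in for the real (34799, 32, 32, 3) data.
batch = np.random.randint(0, 256, size=(8, 32, 32, 3)).astype(np.uint8)
gray_batch = rgb_to_grayscale_np(batch)
print(gray_batch.shape)
```

Because the channel axis is kept, the layers after the conversion see a 4-D tensor with one channel, while `input_data` still accepts the original 3-channel arrays.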

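Second solution: alternatively, you can avoid the extra cost of converting the data in every epoch by converting it once, before training.

Convert the RGB images to grayscale using PIL / OpenCV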

  import numpy as np

  # X_TRAIN now has shape (34799, 32, 32)
  # convert the input to 4D: (34799, 32, 32, 1)
  X_TRAIN = np.expand_dims(X_TRAIN, 3)
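
As a small sketch of this "add the 1" step, the snippet below shows three equivalent NumPy ways to append a trailing channel axis. The array `x_gray` is a zero-filled placeholder assumed for illustration; the real data would hold 34799 grayscale images.

```python
import numpy as np

# Placeholder for grayscale images without a channel axis.
x_gray = np.zeros((10, 32, 32), dtype=np.float32)

a = np.expand_dims(x_gray, 3)       # as in the answer; axis=-1 works too
b = x_gray[..., np.newaxis]         # indexing form of the same operation
c = x_gray.reshape(-1, 32, 32, 1)   # explicit reshape, batch size inferred

print(a.shape, b.shape, c.shape)
```

For a contiguous array like this one, all three return views of the original data, so the step costs essentially nothing even for the full dataset.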

Then use a slightly modified version of the first code:

network = input_data(shape=[None, 32, 32, 1], name='input')
network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")
...
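
Since the original error was a shape mismatch at feed time, it may help to check the array before calling `model.fit`. The helper below is a hypothetical sketch (the name `check_input_shape` and its default argument are assumptions, not part of tflearn); it only compares the per-image shape with the one declared in `input_data`.

```python
import numpy as np


def check_input_shape(x, expected=(32, 32, 1)):
    """Hypothetical helper: verify that each image matches the shape
    declared in input_data(shape=[None, 32, 32, 1])."""
    x = np.asarray(x)
    if x.shape[1:] != tuple(expected):
        raise ValueError(
            "expected per-image shape %s, got %s" % (tuple(expected), x.shape[1:])
        )
    return x


# Placeholder data: a batch with the channel axis passes the check.
ok = check_input_shape(np.zeros((5, 32, 32, 1)))
print(ok.shape)

# A batch without the channel axis is rejected before it reaches the network.
try:
    check_input_shape(np.zeros((5, 32, 32)))
except ValueError as err:
    print("shape check failed:", err)
```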