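I am trying to colorize bird images from the CIFAR-10 dataset.
Problem setup:
X: (5000,32,32,1), where each entry is the grayscale version of a bird image.
Y: (5000,4096), a one-hot encoded array. For example, the first pixel would have [0,0,1,0], where the 1 indicates which color to use. Y is just the flattened version of all the per-pixel one-hot encodings for each image (32 × 32 pixels × 4 colors = 4096).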
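To make the label layout concrete, here is a minimal NumPy sketch of how a Y array with this shape could be built and unpacked again. The random `color_index` array is a hypothetical stand-in for the real per-pixel color labels, and 5 images are used instead of 5000; the actual preprocessing behind the question's data is not shown in the post.

```python
import numpy as np

# Hypothetical stand-in for the real labels: one color index (0-3) per pixel.
rng = np.random.default_rng(0)
color_index = rng.integers(0, 4, size=(5, 32, 32))  # 5 images instead of 5000

# One-hot encode every pixel: (5, 32, 32) -> (5, 32, 32, 4)
one_hot = np.eye(4, dtype=np.float32)[color_index]

# Flatten each image's one-hot codes into a single row: (5, 4096)
Y = one_hot.reshape(len(one_hot), -1)

# Undo the flattening to recover the per-pixel color index.
recovered = Y.reshape(-1, 32, 32, 4).argmax(axis=-1)

print(Y.shape)      # (5, 4096)
print(Y[0].sum())   # 1024.0
```

Note that each row of Y contains 1024 ones (one per pixel), not a single one, so a row is 1024 stacked 4-way one-hot vectors rather than one 4096-way one-hot vector.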
I have followed several articles that implement grayscale image colorization, but my loss stays high and my accuracy stays low.
from keras.models import Sequential
from keras.layers import Convolution2D, Dropout, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Convolution2D(32, (5, 5), strides=(1, 1), input_shape=(32, 32, 1), padding='same', activation='relu'))
model.add(Dropout(0.2))
model.add(Convolution2D(32, (5, 5), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, (5, 5), activation='relu', padding='same'))
model.add(Flatten())
model.add(Dense(128))
model.add(Dense(4096, activation='softmax'))
# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train for 5 epochs, validating on the held-out set
model.fit(Xtrain, Ytrain, validation_data=(val_data, Ytest), epochs=5, batch_size=32)
I expected the accuracy to improve over time, but it keeps getting worse.
Answer 0 (score: 0)
You'll have to put some work into the architecture (which it sounds like you've been thinking about), but to simply black-box it, you can pump in the gray images and draw out the color images. Why not?
Use model.summary() to make sure your shapes are to your liking. (See below)
I haven't tested this code, but it should be pretty close...
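from keras.models import Sequential
from keras.layers import InputLayer, Conv2D, SpatialDropout2D, MaxPool2D, Dense, UpSampling2D

model = Sequential()
model.add(InputLayer(input_shape=(32,32,1)))
model.add(Conv2D(32,(5,5),strides=(1,1), activation='relu', padding='same'))
model.add(SpatialDropout2D(rate=0.2)) # drops whole feature maps instead of single activations
model.add(Conv2D(32,(5,5), activation='relu', padding='same'))
model.add(MaxPool2D((2,2)))
model.add(Conv2D(64,(5,5),activation='relu',padding='same'))
model.add(Dense(128)) # Dense acts on the last (channel) axis at every pixel
# have to upsample to get your height/width back from max pooling!
model.add(UpSampling2D((2,2)))
# sigmoid keeps each RGB value in [0, 1] (assumes targets are scaled to [0, 1]);
# a softmax over the channel axis would force R+G+B to sum to 1 at every pixel
model.add(Conv2D(3,(2,2),activation='sigmoid',padding='same'))
model.compile(optimizer='adam',loss='mse')
model.summary()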
Here's the output of model.summary() [1]. The output layer is (32,32,3): height 32, width 32, and 3 color channels.
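If you want to sanity-check the shapes without building the model, the height/width bookkeeping can be sketched in plain Python. This is only an illustration of the arithmetic, not a Keras API: `trace_shapes` and the layer names in the list are hypothetical, and it assumes stride-1 convolutions with padding='same', which leave height and width unchanged.

```python
def trace_shapes(input_shape, layers):
    """Return (layer kind, (height, width, channels)) after each layer."""
    height, width, channels = input_shape
    shapes = []
    for kind, arg in layers:
        if kind in ("conv_same", "dense"):
            # stride-1 'same' conv and Dense only change the channel count
            channels = arg
        elif kind == "max_pool":
            # pooling divides height and width by the pool size
            height, width = height // arg, width // arg
        elif kind == "upsample":
            # upsampling multiplies height and width by the factor
            height, width = height * arg, width * arg
        elif kind != "keep":
            raise ValueError(f"unknown layer kind: {kind}")
        shapes.append((kind, (height, width, channels)))
    return shapes

# Layer sequence mirroring the model above ("keep" = shape-preserving layer).
layers = [
    ("conv_same", 32),
    ("keep", None),      # spatial dropout
    ("conv_same", 32),
    ("max_pool", 2),
    ("conv_same", 64),
    ("dense", 128),
    ("upsample", 2),
    ("conv_same", 3),
]

shapes = trace_shapes((32, 32, 1), layers)
for kind, shape in shapes:
    print(kind, shape)

final_shape = shapes[-1][1]   # (32, 32, 3)
```

The max pooling step drops the feature maps to 16x16, and the upsampling step is what brings them back to 32x32 before the final 3-channel convolution.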
Now just train it with grayscales as X, and the color originals as Y. And post results, for the curious!
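For the training pairs, something along these lines might work. It is a sketch under assumptions: `make_training_pairs` is a hypothetical helper, the random `fake_birds` array stands in for the real CIFAR-10 bird images, the grayscale conversion uses the common ITU-R BT.601 luma weights (the question does not say how its grayscale images were produced), and pixel values are scaled to [0, 1].

```python
import numpy as np

def make_training_pairs(rgb_images):
    """Turn uint8 RGB images (N, H, W, 3) into (grayscale X, color Y) float arrays."""
    rgb = rgb_images.astype(np.float32) / 255.0                   # scale to [0, 1]
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)   # BT.601 luma weights
    gray = rgb @ weights                                          # (N, H, W)
    X = gray[..., np.newaxis]                                     # (N, H, W, 1)
    Y = rgb                                                       # (N, H, W, 3)
    return X, Y

# Random stand-in for the bird images (assumption: real data has the same layout).
rng = np.random.default_rng(0)
fake_birds = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

X, Y = make_training_pairs(fake_birds)
print(X.shape, Y.shape)   # (8, 32, 32, 1) (8, 32, 32, 3)
```

With real data, X and Y would then go into model.fit(X, Y, ...) with whatever epochs and batch size you settle on.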