Feeding image input to a neural network

Time: 2018-03-22 07:03:41

Tags: python neural-network keras

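I am building a digit extractor for scanned documents. The input is a 1100x850 image, which we divide into a 44x34 grid, so each grid cell covers 25x25 pixels and the last layer is a fully connected layer with 1496 units. The label is a 44x34 BINARY array in which 1 marks a figure region and 0 a non-figure region. For example, if a figure spans from (top right) (x,y)=(0,0) to (bottom left) (x,y)=(50,50), the binary array has 1 at (0,0), (0,1), (1,0) and (1,1) and 0 everywhere else. I have a neural network model with the following structure.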
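The label encoding described above can be sketched in NumPy. This is only an illustration, not the asker's code: the function name `boxes_to_label`, the box format, the axis order, and the exclusive stop coordinates are all assumptions made for the sketch. The 25-pixel cell size follows from 1100/44 = 850/34 = 25.

```python
import numpy as np

CELL = 25              # pixels per grid cell: 1100 / 44 == 850 / 34 == 25
GRID_SHAPE = (44, 34)  # cells along the 1100-pixel and 850-pixel sides


def boxes_to_label(boxes):
    """Return a flat 1496-element 0/1 vector marking the cells covered by boxes.

    Each box is (start0, start1, stop0, stop1) in pixels. Axis 0 runs along
    the 1100-pixel side and axis 1 along the 850-pixel side; the stop
    coordinates are exclusive (assumptions made for this sketch).
    """
    grid = np.zeros(GRID_SHAPE, dtype=np.float32)
    for start0, start1, stop0, stop1 in boxes:
        first0, first1 = start0 // CELL, start1 // CELL
        # Ceiling division, so a partially covered cell is still marked.
        last0 = -(-stop0 // CELL)
        last1 = -(-stop1 // CELL)
        grid[first0:last0, first1:last1] = 1.0
    return grid.reshape(-1)


label = boxes_to_label([(0, 0, 50, 50)])
print(label.shape, int(label.sum()))
```

For the box from (0,0) to (50,50), this marks exactly the four cells (0,0), (0,1), (1,0) and (1,1), matching the example in the question. Reshaping the result back to (44, 34) recovers the grid.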

conv(5,2,48)
maxpool(3,2)
conv(5,2,96)
maxpool(3,2)
conv(5,2,96)
maxpool(3,2)
FC-1496

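The notation conv(k,d,n) denotes a convolutional layer with n filters, each of size k×k, applied with a stride of d pixels; maxpool(k,d) denotes a downsampling operation over k×k windows, applied with a stride of d pixels. FC-1496 refers to the final fully connected layer, which connects the hidden units of the previous layer to 1496 output units (one unit per cell of the 44x34 grid, since 44 × 34 = 1496).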
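To see how many hidden units actually reach FC-1496, the spatial size can be traced through the stack by hand. The sketch below assumes no padding, which is the Keras default (`padding='valid'`) for both `Conv2D` and `MaxPooling2D`. The helper names `out_size` and `final_feature_map` are made up for this illustration.

```python
def out_size(size, kernel, stride):
    """Output length of a conv or pooling layer without padding ('valid')."""
    return (size - kernel) // stride + 1


# (kernel, stride) of each layer in the stack listed above, in order:
# conv(5,2), maxpool(3,2), conv(5,2), maxpool(3,2), conv(5,2), maxpool(3,2)
LAYERS = [(5, 2), (3, 2), (5, 2), (3, 2), (5, 2), (3, 2)]


def final_feature_map(height, width, filters=96):
    """Trace an input of height x width through the stack."""
    for kernel, stride in LAYERS:
        height = out_size(height, kernel, stride)
        width = out_size(width, kernel, stride)
    return height, width, filters


h, w, c = final_feature_map(1100, 850)
print((h, w, c), h * w * c)  # (15, 11, 96) 15840
```

Under these assumptions the last pooling layer yields a 15x11 map with 96 channels, so the fully connected layer maps 15840 flattened features to the 1496 outputs.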

So my question is how to feed the inputs (the images and the label arrays) to this model using Keras and TensorFlow.

Here is the model code:

from keras.layers import Activation, Conv2D, Dense, Flatten, MaxPooling2D
from keras.models import Sequential


xtrain = None  # placeholder: 10 images of 850*1100, shape (10, 850, 1100)
xtest = None   # placeholder: binary label arrays of size 1496 for the 10 images, shape (10, 1496)

# initialize the model
model = Sequential()
model.add(Conv2D(48, 5, 2, input_shape=(1100, 850, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

model.add(Conv2D(96, 5, 2))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=2))

model.add(Conv2D(96, 5, 2))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=2))
model.add(Flatten())
model.add(Dense(1496, activation='sigmoid'))

model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['accuracy'])

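1 answer:

Answer 0 (score: 0)

Here is a working example based on your data (I have made some assumptions about your setup).

I use the labels as a vector of 1s and 0s, e.g. [1,0,1,1,...], where 1 marks a figure region and 0 a non-figure region, for 1496 regions in total.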

from __future__ import print_function
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten

batch_size = 128
nb_epoch = 10
nb_regions = 1496

# input image dimensions
img_rows, img_cols = 850, 1100

# create random test and train sets
X_train = np.random.randint(256, size=(10, img_rows, img_cols))
Y_train = np.random.randint(2, size=(10, nb_regions))

X_test = np.random.randint(256, size=(10, img_rows, img_cols))
Y_test = np.random.randint(2, size=(10, nb_regions))


X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

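model = Sequential([
    Dense(32, input_shape=(1, img_rows, img_cols)),
    Activation('relu'),
    Flatten(),
    Dense(nb_regions),
    # sigmoid gives each region its own independent probability,
    # which suits labels where several regions can be 1 at once
    Activation('sigmoid'),
])

# binary cross-entropy matches one independent 0/1 target per region
model.compile(loss='binary_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])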

model.fit(X_train, Y_train, batch_size=batch_size, epochs=nb_epoch,
          verbose=1, validation_data=(X_test, Y_test))

score = model.evaluate(X_test, Y_test, verbose=0)

print('Test score:', score[0])
print('Test accuracy:', score[1])
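The example above reshapes the images to (N, 1, 850, 1100) for a dense stand-in model. The convolutional model in the question declares `input_shape=(1100, 850, 1)`, so it would instead expect a trailing channel axis. A minimal NumPy sketch of that preparation follows. The helper name `prepare_inputs` and the assumption that images arrive as (N, 1100, 850) uint8 arrays with labels as (N, 44, 34) grids are mine, not the asker's.

```python
import numpy as np


def prepare_inputs(images, label_grids):
    """Convert raw arrays into shapes a channels-last Keras model accepts.

    images:      uint8 array of shape (N, 1100, 850), grayscale pixels 0..255
    label_grids: 0/1 array of shape (N, 44, 34)
    """
    x = images.astype('float32') / 255.0  # scale pixels to [0, 1]
    x = x[..., np.newaxis]                # add channel axis -> (N, 1100, 850, 1)
    # Flatten each 44x34 grid into a 1496-element target vector.
    y = label_grids.reshape(len(label_grids), -1).astype('float32')
    return x, y


# Random stand-in data, as in the example above; real scans would replace it.
rng = np.random.RandomState(0)
images = rng.randint(256, size=(4, 1100, 850)).astype('uint8')
label_grids = rng.randint(2, size=(4, 44, 34))

x, y = prepare_inputs(images, label_grids)
print(x.shape, y.shape)

# With the question's model compiled, training would then look like:
# model.fit(x, y, batch_size=2, epochs=10)
```

The first axis of both arrays is the sample index, so `x[i]` and `y[i]` must belong to the same scan.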