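I have been trying to get a CNN working on the Omniglot dataset (105 x 105 x 1 images) by following two tutorials I found, CNN tutorial 1 and CNN tutorial 2, both of which use the MNIST dataset (28 x 28 x 1 images).
I keep running into all sorts of shape conflicts in my implementation (after a week of on-and-off debugging), and so far nobody has been able to help. In the meantime I have managed to debug it somewhat, and I think I can describe the shape error better here.
Most of my code is below (skipping some irrelevant parts here and there). My placeholders are defined as follows: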
x = tf.placeholder(tf.float32, shape=(None, 105, 105, 1) ) # placeholder for train data
y = tf.placeholder(tf.float32, shape=(None, 20) ) # placeholder for labels
lr = tf.placeholder(tf.float32,shape=(), name="learnRate") # for varying learning rates during training
The labels have dimension 20 because each alphabet has 20 different characters, so each label is a one-hot encoded vector of length 20.
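For illustration, a minimal sketch of what one such label looks like. The helper `one_hot` is hypothetical (it is not part of my code) and uses plain Python lists rather than whatever the real data loader produces:

```python
def one_hot(index, num_classes=20):
    """Return a list of length `num_classes` with 1.0 at `index`, 0.0 elsewhere."""
    if not 0 <= index < num_classes:
        raise ValueError("index must be in [0, num_classes)")
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


# Label for the 4th character (index 3) of an alphabet.
label = one_hot(3)
print(len(label), label.index(1.0))
```

A batch of 160 such labels stacked together has shape [160, 20], which is what the `y` placeholder expects.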
From here, my model is defined as follows (I have noted the resulting output shape of each layer in the comments):
# weight and bias dimension definitions
self.ConFltSize = 3
self.ConOutSize = 7
self.weights = {
    'wc1': tf.Variable(tf.random_normal([self.ConFltSize,self.ConFltSize,1, 32], stddev=0.01, name='W0')),
    'wc2': tf.Variable(tf.random_normal([self.ConFltSize,self.ConFltSize,32, 64], stddev=0.01, name='W1')),
    'wc3': tf.Variable(tf.random_normal([self.ConFltSize,self.ConFltSize,64, 128], stddev=0.01, name='W2')),
    'wd1': tf.Variable(tf.random_normal([self.ConOutSize * self.ConOutSize * 128, 128], stddev=0.01, name='W3')),
    'out': tf.Variable(tf.random_normal([128, self.InLabels.shape[1]], stddev=0.01, name='W4')),
}
self.biases = {
    'bc1': tf.Variable(tf.random_normal([32], stddev=0.01, name='B0')),
    'bc2': tf.Variable(tf.random_normal([64], stddev=0.01, name='B1')),
    'bc3': tf.Variable(tf.random_normal([128], stddev=0.01, name='B2')),
    'bd1': tf.Variable(tf.random_normal([128], stddev=0.01, name='B3')),
    'out': tf.Variable(tf.random_normal([self.InLabels.shape[1]], stddev=0.01, name='B4')),
}
# Model definition + shaping results
# x = provide the input data
# weights = dictionary variables for weights
# biases = dictionary variables for biases
def Architecture(self, x, weights, biases):
    conv1 = self.conv(x, weights['wc1'], biases['bc1']) # convolution layer 1
    conv1 = self.maxPool(conv1) # max pool layer 1
    # out shape -> [None, 53, 53, 32]
    conv2 = self.conv(conv1, weights['wc2'], biases['bc2']) # convolution layer 2
    conv2 = self.maxPool(conv2) # max pool layer 2
    # out shape -> [None, 27, 27, 64]
    conv3 = self.conv(conv2, weights['wc3'], biases['bc3']) # convolution layer 3
    conv3 = self.maxPool(conv3) # max pool layer 3
    # out shape -> [None, 14, 14, 128]
    flayer = tf.reshape(conv3, [-1, weights['wd1'].shape[0]]) # flatten the output from convo layer
    # for 7 x 7 x 128 this is -> [None, 6272]
    flayer = tf.add(tf.matmul(flayer, weights['wd1']), biases['bd1']) # fully connected layer 1
    flayer = tf.nn.relu(flayer)
    # out shape -> [None, 128]
    out = tf.add( tf.matmul(flayer, weights['out']), biases['out'] ) # do last set of output weight * vals + bias
    # out shape -> [None, 20]
    return out # net input to output layer
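The shapes noted in the comments above can be reproduced with a few lines of plain Python. This is only a sketch under assumptions: `self.conv` and `self.maxPool` are not shown here, so it assumes the convolutions use stride 1 with 'SAME' padding (spatial size unchanged) and that each pool is a 2 x 2, stride-2 'SAME' pool (size becomes ceil(size / 2)). The function names are made up for this example.

```python
import math


def same_pool_out(size, stride=2):
    """Spatial size after a 'SAME'-padded pool: ceil(size / stride)."""
    return math.ceil(size / stride)


def trace_shapes(size=105, channels=(32, 64, 128)):
    """Return the (height, width, channels) shape after each conv + pool stage."""
    shapes = []
    for ch in channels:
        # Assumed: a stride-1 'SAME' convolution keeps the spatial size,
        # so only the pooling step shrinks it.
        size = same_pool_out(size)
        shapes.append((size, size, ch))
    return shapes


shapes = trace_shapes()
print(shapes)  # [(53, 53, 32), (27, 27, 64), (14, 14, 128)]

# Number of features per image after the last pool, under these assumptions.
h, w, c = shapes[-1]
print(h * w * c)  # 25088
```

Under these assumptions the last pooling stage yields 14 x 14 x 128 = 25088 values per image, whereas `wd1` is sized for `ConOutSize * ConOutSize * 128` = 7 x 7 x 128 = 6272 rows.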
Now, in my main program, I basically feed the input data into the model in batches:
out = self.Architecture(x, self.weights, self.biases) # Implement network architecture, and get output tensor (net input to output layer)
# normalize, softmax and entropy the net input, in comparison with provided labels
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=out, labels=y) ) # cost function
optimizer = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(cost) # gradient descent optimizer
pred = tf.equal(tf.argmax(out, 1), tf.argmax(y , 1)) # output true / false if predicted value matches label
accuracy = tf.reduce_mean(tf.cast(pred, tf.float32)) # percentage value of correct predictions
for i in range(iters):
    [BX, _, BY, _] = batch.split(trainX, trainY, Bsize) # random split in batch size
    # data shapes: BX -> [160, 105, 105, 1], BY -> [160, 20]
    # Code bombs out after feeding with input data
    opt = sess.run(optimizer, feed_dict={lr:learnr, x:BX, y:BY } )
The exception I then get from the sess.run call is:
'logits and labels must be broadcastable: logits_size=[640,20] labels_size=[160,20]\n\t [[Node: softmax_cross_entropy_with_logits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add_1, softmax_cross_entropy_with_logits/Reshape_1)]'
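From this I gather that the softmax is receiving [640, 20] as input when it expects [160, 20]... I don't understand how the data could have been reshaped to [640, 20]?
If I'm missing something, or misunderstanding the error, please let me know.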
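As a rough check on where a leading dimension of 640 could come from: `tf.reshape` with `-1` infers that dimension from the total number of elements, so a flatten width that does not match the per-image feature count changes the number of rows. The sketch below is plain Python (no TensorFlow) and assumes conv3 really has the shape [160, 14, 14, 128] noted in the comments; `rows_after_flatten` is a hypothetical helper, not part of the code above.

```python
def rows_after_flatten(batch, height, width, channels, flat_width):
    """Rows produced by reshaping [batch, height, width, channels] to [-1, flat_width]."""
    total = batch * height * width * channels
    if total % flat_width != 0:
        raise ValueError("total element count is not divisible by flat_width")
    return total // flat_width


# Flatten width taken from wd1: ConOutSize * ConOutSize * 128 = 7 * 7 * 128.
rows = rows_after_flatten(160, 14, 14, 128, 7 * 7 * 128)
print(rows)  # 640

# Flatten width matching the assumed conv3 shape keeps the batch size.
print(rows_after_flatten(160, 14, 14, 128, 14 * 14 * 128))  # 160
```

If the assumed conv3 shape is correct, this matches the logits_size of [640, 20] in the error, which suggests comparing `ConOutSize` with the actual spatial size of conv3.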