Convolutional neural network architecture, shaping errors

Time: 2018-10-05 10:52:09

Tags: python tensorflow deep-learning conv-neural-network tensorflow-datasets

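I have been trying to get a CNN working on the Omniglot dataset (105 x 105 x 1 images), using two tutorials I found, CNN tutorial 1 and CNN tutorial 2, which were written for the MNIST dataset (28 x 28 x 1 images).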

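I am still running into all sorts of shape conflicts in my implementation (after a week of on-and-off debugging). So far nobody has been able to help, but in the meantime I have managed to debug a bit further, and I think I can describe the shape error better here.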

Most of my code is below (I have only skipped some irrelevant parts here and there). My placeholders are defined as follows:

x = tf.placeholder(tf.float32, shape=(None, 105, 105, 1) )  # placeholder for train data
y = tf.placeholder(tf.float32, shape=(None, 20) )           # placeholder for labels
lr = tf.placeholder(tf.float32,shape=(), name="learnRate")  # for varying learning rates during training

The labels have dimension 20 because each alphabet has 20 different characters, so each label is a one-hot encoded vector of length 20.
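
To illustrate the label format described above, here is a minimal pure-Python sketch of a length-20 one-hot vector. The helper name `one_hot` and its `num_classes` parameter are hypothetical and do not come from the original code, which does not show how the labels are built.

```python
def one_hot(index, num_classes=20):
    """Return a one-hot list of floats with 1.0 at position `index`.

    `one_hot` is a hypothetical helper, used here only for illustration.
    """
    if not 0 <= index < num_classes:
        raise ValueError("index must be in [0, num_classes)")
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


# Label for the fourth character (index 3) of an alphabet
label = one_hot(3)
print(len(label), label.index(1.0))
```

A batch of 160 such vectors gives the [160, 20] shape expected by the `y` placeholder.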

From here, my model is defined as follows (I have added a comment with the resulting output shape of each layer):

# weight and bias dimension definitions
self.ConFltSize = 3
self.ConOutSize = 7
self.weights = {
    'wc1': tf.Variable(tf.random_normal([self.ConFltSize,self.ConFltSize,1, 32], stddev=0.01, name='W0')), 
    'wc2': tf.Variable(tf.random_normal([self.ConFltSize,self.ConFltSize,32, 64], stddev=0.01, name='W1')), 
    'wc3': tf.Variable(tf.random_normal([self.ConFltSize,self.ConFltSize,64, 128], stddev=0.01, name='W2')), 
    'wd1': tf.Variable(tf.random_normal([self.ConOutSize * self.ConOutSize * 128, 128], stddev=0.01, name='W3')), 
    'out': tf.Variable(tf.random_normal([128, self.InLabels.shape[1]], stddev=0.01, name='W4')), 
    }

self.biases = {
    'bc1': tf.Variable(tf.random_normal([32], stddev=0.01, name='B0')), 
    'bc2': tf.Variable(tf.random_normal([64], stddev=0.01, name='B1')), 
    'bc3': tf.Variable(tf.random_normal([128], stddev=0.01, name='B2')),
    'bd1': tf.Variable(tf.random_normal([128], stddev=0.01, name='B3')),
    'out': tf.Variable(tf.random_normal([self.InLabels.shape[1]], stddev=0.01, name='B4')),
    }

# Model definition + shaping results
# x = provide the input data
# weights = dictionary variables for weights
# biases = dictionary variables for biases
def Architecture(self, x, weights, biases): 

    conv1 = self.conv(x, weights['wc1'], biases['bc1'])      # convolution layer 1
    conv1 = self.maxPool(conv1)                              # max pool layer 1
    # out shape -> [None, 53, 53, 32]

    conv2 = self.conv(conv1, weights['wc2'], biases['bc2'])  # convolution layer 2
    conv2 = self.maxPool(conv2)                              # max pool layer 2
    # out shape -> [None, 27, 27, 64]

    conv3 = self.conv(conv2, weights['wc3'], biases['bc3'])  # convolution layer 3
    conv3 = self.maxPool(conv3)                              # max pool layer 3
    # out shape -> [None, 14, 14, 128]

    flayer = tf.reshape(conv3, [-1, weights['wd1'].shape[0]])   # flatten the output from convo layer
    # for 7 x 7 x 128 this is -> [None, 6272] 
    flayer = tf.add(tf.matmul(flayer, weights['wd1']), biases['bd1'])           # fully connected layer 1
    flayer = tf.nn.relu(flayer)
    # out shape -> [None, 128]

    out = tf.add( tf.matmul(flayer, weights['out']), biases['out'] )   # do last set of output weight * vals + bias  
    # out shape -> [None, 20]       

    return out      # net input to output layer
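
The commented output sizes above can be reproduced with a little arithmetic. The sketch below assumes that `self.conv` is a stride-1 convolution with 'SAME' padding and that `self.maxPool` is a 2 x 2, stride-2 max pool with 'SAME' padding. Neither helper is shown in the post, so both are assumptions, although they are consistent with the 53, 27 and 14 in the comments. The function names are hypothetical.

```python
import math


def same_pool_out(size, stride=2):
    # With 'SAME' padding, the output size is ceil(input_size / stride).
    return math.ceil(size / stride)


def trace_spatial_sizes(size, num_blocks=3):
    """Spatial size after each conv + max-pool block (assumed settings)."""
    sizes = []
    for _ in range(num_blocks):
        # Assumed: the stride-1 'SAME' convolution keeps the size unchanged,
        # so only the pooling step shrinks it.
        size = same_pool_out(size)
        sizes.append(size)
    return sizes


sizes = trace_spatial_sizes(105)
flat_features = sizes[-1] * sizes[-1] * 128  # features per image after conv3
print(sizes, flat_features)
```

Under these assumptions each image leaves the third block with 14 x 14 x 128 values, which is worth comparing against the `self.ConOutSize * self.ConOutSize * 128` rows declared for `wd1`.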

Then, in my main program, I basically feed the input data to the model in batches:

    out = self.Architecture(x, self.weights, self.biases)    # Implement network architecture, and get output tensor (net input to output layer)

    # normalize, softmax and entropy the net input, in comparison with provided labels
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=out, labels=y) )   # cost function
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(cost)                  # gradient descent optimizer

    pred = tf.equal(tf.argmax(out, 1), tf.argmax(y , 1)) # output true / false if predicted value matches label
    accuracy = tf.reduce_mean(tf.cast(pred, tf.float32))             # percentage value of correct predictions

    for i in range(iters):          
            [BX, _, BY, _] = batch.split(trainX, trainY, Bsize) # random split in batch size
            # data shapes: BX -> [160, 105, 105, 1], BY -> [160, 20]

            # Code bombs out after feeding with input data
            opt = sess.run(optimizer, feed_dict={lr:learnr, x:BX, y:BY } )

The exception I then get from the sess.run command is:

  

'logits and labels must be broadcastable: logits_size=[640,20] labels_size=[160,20]\n\t [[Node: softmax_cross_entropy_with_logits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add_1, softmax_cross_entropy_with_logits/Reshape_1)]'

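From this, I interpret that the softmax is receiving [640, 20] as input when it expects [160, 20]. What I don't understand is how the data ends up reshaped to [640, 20].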
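
One possible reading of these numbers, offered as a sketch rather than a confirmed diagnosis: `tf.reshape` with `-1` infers the leading dimension from the total element count. The helper below (`inferred_rows` is a hypothetical name) performs the same division in plain Python, using the conv3 shape from the code comments and the `7 * 7 * 128` width taken from `wd1`.

```python
def inferred_rows(batch, height, width, channels, flat_width):
    """Rows that a reshape to [-1, flat_width] would infer for this tensor."""
    total = batch * height * width * channels
    if total % flat_width != 0:
        raise ValueError("total element count is not divisible by flat_width")
    return total // flat_width


# Batch of 160, conv3 output commented as [None, 14, 14, 128],
# flattened to the 7 * 7 * 128 = 6272 columns that wd1 expects.
rows = inferred_rows(160, 14, 14, 128, 7 * 7 * 128)
print(rows)  # 640
```

If this reading is right, the flattened tensor has four rows per image, so the logits come out as [640, 20] while the labels stay at [160, 20]. Flattening to 14 * 14 * 128 columns instead would give 160 rows.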

If I am missing something, or misinterpreting the error, please let me know.

0 answers:

No answers