Convolution matrix filter size and channel length

Date: 2018-01-26 14:44:21

Tags: python tensorflow conv-neural-network

I am a beginner in computer vision. I want to know the relationship between the filter size and the number of channels in the output image.

My goal is to understand the various relationships that connect sizes and channels in a convolutional neural network.

I am using the tensorflow library, and I found an example that applies a CNN to the cifar-10 dataset.

The dataset consists of:

  1. data - a 10000x3072 numpy array of uint8s. Each row of the array stores a 32x32 colour image. The first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The images are stored in row-major order, so the first 32 entries of the array are the red channel values of the first row of the image (a reshape sketch for this layout follows the code below).
  2. labels - a list of 10000 numbers in the range 0-9.
  3. The code:

    import tensorflow as tf

    # Input placeholders: 32x32 RGB images and one-hot labels for 10 classes
    X = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
    y_true = tf.placeholder(tf.float32, shape=[None, 10])

    # Dropout keep-probability, fed at run time
    hold_prob = tf.placeholder(tf.float32)

    # Helper functions

    # Initialise weights from a truncated normal distribution
    def init_weight(shape, name_W):
        init_rand_dist = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(init_rand_dist, name=name_W)
    
    # Initialise biases as a small positive constant
    def init_bias(shape, name_b):
        init_bias_vals = tf.constant(value=0.1, shape=shape)
        return tf.Variable(init_bias_vals, name=name_b)
    
    # 2-D convolution with stride 1 and 'SAME' (zero) padding
    def conv2d(X, W, name_conv):
        # X --> [batch, H, W, channels]
        # W --> [filter H, filter W, channels in, channels out]
        return tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding='SAME', name=name_conv)
    
    # Convolutional layer: convolution, bias add, then ReLU activation
    def convolutional_layer(input_x, shape, name_W, name_b, name_conv):
        W = init_weight(shape=shape, name_W=name_W)
        b = init_bias(shape=[shape[3]], name_b=name_b)
        return tf.nn.relu(conv2d(input_x, W, name_conv=name_conv) + b)
    
    # 2x2 max pooling, halving each spatial dimension
    def max_pooling_2by2(X):
        # X --> [batch, H, W, channels]
        return tf.nn.max_pool(X, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    
    # Fully connected (dense) layer
    def normal_full_layer(input_layer, size, name_W, name_b):
        input_size = int(input_layer.get_shape()[1])
        W = init_weight([input_size, size], name_W=name_W)
        b = init_bias([size], name_b=name_b)
        return tf.matmul(input_layer, W) + b
    
    # Create the layers

    # Conv 1: 4x4 filters, 3 input channels (RGB) -> 32 output channels
    convo_1 = convolutional_layer(X, shape=[4, 4, 3, 32], name_W="W_conv1", name_b="bias_Conv1", name_conv="Conv_1")
    convo_1_pooling = max_pooling_2by2(convo_1)  # 32x32 -> 16x16

    # Conv 2: 4x4 filters, 32 input channels -> 64 output channels
    convo_2 = convolutional_layer(convo_1_pooling, shape=[4, 4, 32, 64], name_W="W_conv2", name_b="bias_Conv2", name_conv="Conv_2")
    convo_2_pooling = max_pooling_2by2(convo_2)  # 16x16 -> 8x8

    # Flatten: after two 2x2 poolings the feature maps are 8x8x64, so [-1, 8*8*64] = [-1, 4096]
    convo_2_flat = tf.reshape(convo_2_pooling, [-1, 8 * 8 * 64])

    # Fully connected layer with 1024 units
    full_layer_one = tf.nn.relu(normal_full_layer(convo_2_flat, 1024, name_W="full_layer_W", name_b="full_layer_b"))

    # Dropout on the dense layer (hold_prob was defined above)
    full_one_dropout = tf.nn.dropout(full_layer_one, keep_prob=hold_prob)

    # Output logits for the 10 classes
    y_pred = normal_full_layer(full_one_dropout, 10, name_W='out_W', name_b='out_b')
    
    # Loss function: softmax cross-entropy
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_pred))

    # Adam optimizer
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
    train = optimizer.minimize(cross_entropy)

    init = tf.global_variables_initializer()
    
    # Graph session

    steps = 5000

    with tf.Session() as sess:
        sess.run(init)

        print(tf.global_variables())

        for i in range(steps):
            # ch: a CIFAR-10 batching helper, defined outside this snippet
            batch_x, batch_y = ch.next_batch(100)
            # print(convo_2_flat.eval(feed_dict={X: batch_x, y_true: batch_y, hold_prob: 1.0}).shape)

            sess.run(train, feed_dict={X: batch_x, y_true: batch_y, hold_prob: 0.5})

            # Print out a message every 100 steps
            if i % 100 == 0:
                print('Currently on step {}'.format(i))
                print('Accuracy is:')
                # Test the trained model; training_images / training_labels are
                # assumed to be the prepared training set, defined outside this snippet
                matches = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
                acc = tf.reduce_mean(tf.cast(matches, tf.float32))
                print(sess.run(acc, feed_dict={X: training_images, y_true: training_labels, hold_prob: 1.0}))
                print('\n')
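
A note on the data layout described in item 1: each row of the 10000x3072 array is a flat vector with the three colour planes stored back to back, so it has to be rearranged into the [32, 32, 3] layout the X placeholder expects. A minimal sketch of that conversion, assuming data is the raw uint8 array from the dataset:

    import numpy as np

    # data: [10000, 3072] uint8, stored as R plane, then G plane, then B plane
    # Reshape to [10000, 3, 32, 32], then move the channel axis last: [10000, 32, 32, 3]
    images = data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    # Convert to floats in [0, 1] before feeding the placeholders
    images = images.astype(np.float32) / 255.0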
    

I want to understand the filter size: for example, the first convolutional layer has a filter size of (4 x 4) and 32 channels. I want to know why these numbers were chosen, and how they cascade into the next layers.

Another example is the last layer:

    tf.nn.relu(normal_full_layer(convo_2_flat, 1024, name_W="full_layer_W", name_b="full_layer_b"))

It takes the output of the flattened layer and resizes it to 1024. What is the reasoning behind that number as well?


1 Answer:

Answer 0 (score: 0)

The filter size is the size of the window over which the convolution is performed. Together with the padding and the stride, it determines the height and width of that layer's output tensor.
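
For intuition, this is the standard output-size arithmetic (general background, not something stated in the original answer): with 'VALID' padding, a filter of width F sliding over an input of width H with stride S produces floor((H - F) / S) + 1 positions, while 'SAME' padding pads the input so the output size is ceil(H / S). A small sketch:

    import math

    def conv_output_size(H, F, S, padding):
        # 'SAME' pads so only the stride shrinks the output; 'VALID' adds no padding
        if padding == 'SAME':
            return math.ceil(H / S)
        return (H - F) // S + 1

    # The question's layers use 4x4 filters, stride 1, 'SAME' padding on 32x32 inputs:
    print(conv_output_size(32, 4, 1, 'SAME'))   # 32 -> spatial size is preserved
    print(conv_output_size(32, 4, 1, 'VALID'))  # 29 -> it would shrink without padding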

The output channels are exactly what they sound like: the number of channels in that layer's output. So a layer like the example's first one, which starts from 3 channels (r, g, b) and has 32 output channels, produces something image-like, but with 32 numbers per pixel instead of 3.
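
To make the channel bookkeeping concrete, here is a toy shape check using the same TF 1.x API as the question's code: a [4, 4, 3, 32] filter bank maps a 3-channel image to a 32-channel one, one output channel per filter:

    import tensorflow as tf

    img = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])        # 3 channels in
    W = tf.Variable(tf.truncated_normal([4, 4, 3, 32], stddev=0.1))  # 32 filters
    out = tf.nn.conv2d(img, W, strides=[1, 1, 1, 1], padding='SAME')
    print(out.get_shape())  # (?, 32, 32, 32): 32 numbers per pixel now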

The filter sizes and the numbers of output channels are important properties of a convolutional neural network, and they are chosen to maximize accuracy. Different architectures (Inception, ResNet, VGG, etc.) make different choices about which filter sizes and how many channels to use.