Question

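I want to implement a simple CNN with the following architecture:

conv1: convolution with rectified linear activation (ReLU)
pool1: max pooling
FC2: fully connected layer with rectified linear activation (ReLU)
softmax layer: the final output prediction, i.e. classification into one of ten classes.

I am following this guide: https://towardsdatascience.com/cifar-10-image-classification-in-tensorflow-5b501f7dc77c, but the CNN there is very complex. Can someone guide me on how to shorten that implementation or code? The dimensions of conv2d, the weights, and the biases are also confusing me.

Below is the code I have started with: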

import pickle
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

dir = 'C:/PythonProjects/cifar-10-batches-py/'

def unpickle(file):
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

def to_onehot(labels, nclasses):
    outlabels = np.zeros((len(labels),nclasses))
    for i,l in enumerate(labels):
        outlabels[i,l]=1
    return outlabels

def normalize(x):
    """
        argument
            - x: input image data in numpy array [32, 32, 3]
        return
            - normalized x 
    """
    min_val = np.min(x)
    max_val = np.max(x)
    x = (x-min_val) / (max_val-min_val)
    return x


data_dash = unpickle(dir+'data_batch_1')
data_test = unpickle(dir+'test_batch')

X = data_dash[b'data'] # m * n
X_test = data_test[b'data'] # m * n

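# Each CIFAR-10 row stores 1024 red, then 1024 green, then 1024 blue values,
# so reshape to channels-first and then move the channel axis to the end.
train_X = X.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
train_y = np.array(data_dash[b'labels'])
train_y = to_onehot(train_y,10)

test_X = X_test.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
test_y = np.array(data_test[b'labels'])
test_y = to_onehot(test_y,10)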

Answer 1

It is best to start with the Keras API. See this CIFAR-10 tutorial:

https://github.com/keras-team/keras/blob/master/examples/cifar10_cnn.py

If you are using a recent version of TensorFlow, the Keras API is available inside TensorFlow as tf.keras. The Keras package does not need to be installed separately.
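As a rough illustration of how short the architecture from the question (conv1 + ReLU, pool1, FC2 + ReLU, softmax) can become with tf.keras, here is a minimal sketch. The number of filters (32), the kernel size (3 x 3), the width of the fully connected layer (128), and the optimizer are assumptions chosen for illustration; the question does not specify them, and the helper name build_simple_cnn is hypothetical.

```python
import tensorflow as tf

def build_simple_cnn(num_classes=10):
    """Sketch of the conv1 -> pool1 -> FC2 -> softmax network from the question.

    Layer sizes are illustrative assumptions, not values from the question.
    """
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),             # CIFAR-10 images, channels last
        tf.keras.layers.Conv2D(32, (3, 3), padding='same',
                               activation='relu'),      # conv1 + ReLU
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),  # pool1: 32x32 -> 16x16
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),   # FC2 + ReLU
        tf.keras.layers.Dense(num_classes,
                              activation='softmax'),     # class probabilities
    ])
    # categorical_crossentropy expects one-hot labels, as produced by to_onehot()
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = build_simple_cnn()
model.summary()
```

With the arrays from the question, training would then look something like model.fit(normalize(train_X), train_y, epochs=5, batch_size=64), where the epoch count and batch size are again only placeholders. Keras creates and tracks the weights and biases of each layer itself, so no tf.Variable has to be declared by hand.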

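"The dimensions of conv2d, the weights, and the biases are also confusing me."

From the source code here: https://github.com/deep-diver/CIFAR10-img-classification-tensorflow/blob/c96a0cbbe91ee280a5de1b3b872e407b0a2c7f34/CIFAR10_image_classification.py#L139

The conv_net() method builds the network. There are 4 conv layers, the first of which is: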

conv1_filter = tf.Variable(tf.truncated_normal(shape=[3, 3, 3, 64], mean=0, stddev=0.08))
conv1 = tf.nn.conv2d(x, conv1_filter, strides=[1,1,1,1], padding='SAME')

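The weights of the conv1 layer are stored in conv1_filter, whose shape has the format [filter_height, filter_width, in_channels, out_channels].

[3, 3, 3, 64] is a 3 x 3 filter with 3 input channels (the RGB input, since this is the first layer) and 64 output channels.

For the conv2 layer, the number of input channels will be 64, which is the number of output channels of the conv1 layer, and so on.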
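To make the shape convention concrete, the small sketch below counts the trainable values implied by a filter shape and checks that consecutive layers chain correctly. The helper conv_param_count is hypothetical, and the conv2 shape is an assumption for illustration: only its in_channels value of 64 follows from the explanation above.

```python
def conv_param_count(filter_shape, use_bias=False):
    """Count trainable values for a filter of shape
    [filter_height, filter_width, in_channels, out_channels]."""
    fh, fw, c_in, c_out = filter_shape
    weights = fh * fw * c_in * c_out
    biases = c_out if use_bias else 0   # one bias per output channel
    return weights + biases

conv1_shape = [3, 3, 3, 64]     # taken from the quoted code
conv2_shape = [3, 3, 64, 128]   # assumed; only in_channels = 64 is given

# in_channels of a layer must equal out_channels of the layer before it
assert conv2_shape[2] == conv1_shape[3]

print(conv_param_count(conv1_shape))                 # 1728
print(conv_param_count(conv1_shape, use_bias=True))  # 1792
print(conv_param_count(conv2_shape))                 # 73728
```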

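The weights stored in conv1_filter are updated by gradient descent during training.

There is no bias here. If a bias is needed, declare another tf.Variable whose size equals the number of output channels, then call tf.nn.bias_add() to add the bias to the output of the conv layer.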
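A minimal sketch of that bias step, assuming TensorFlow 2.x with eager execution. The names conv1_bias and the dummy input x are introduced here for illustration and are not from the linked source. The sketch uses tf.random.truncated_normal, which is the TensorFlow 2.x name for the tf.truncated_normal call in the quoted TensorFlow 1.x code.

```python
import tensorflow as tf

# Dummy batch with one 32x32 RGB image, standing in for real input data.
x = tf.zeros([1, 32, 32, 3])

# Weights: [filter_height, filter_width, in_channels, out_channels]
conv1_filter = tf.Variable(
    tf.random.truncated_normal(shape=[3, 3, 3, 64], mean=0, stddev=0.08))

# Bias: one value per output channel (hypothetical variable name).
conv1_bias = tf.Variable(tf.zeros([64]))

conv1 = tf.nn.conv2d(x, conv1_filter, strides=[1, 1, 1, 1], padding='SAME')
conv1 = tf.nn.bias_add(conv1, conv1_bias)  # broadcasts over batch, height, width
conv1 = tf.nn.relu(conv1)

print(conv1.shape)  # (1, 32, 32, 64)
```

With padding='SAME' and a stride of 1, the spatial size stays at 32 x 32 and only the channel dimension changes, from 3 to 64.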
