How do I read the CIFAR-10 dataset in TensorFlow?

Time: 2016-11-26 19:26:00

Tags: python tensorflow deep-learning

Can anyone provide clean code for loading CIFAR-10 in TensorFlow?

I have looked at the example in TensorFlow's GitHub repo, but I don't want to resize the images to 24x24. Basically, I'm looking for simpler, more straightforward code.

2 Answers:

Answer 0: (score: 2)

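Please take a look at the following GitHub page, where I have done this. If the link fails, follow the trail from kgeorge.github.io to the notebook tf_cifar.ipynb. I tried to load the CIFAR-10 data in baby steps. Look for the function load_and_preprocess_input.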

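The following function from that code accepts the data as a NumPy array of shape (nsamples, 32x32x3) and dtype float32, and the labels as a NumPy array of nsamples int32 values. It standardizes each feature across all samples and reshapes the data to (nsamples, 32, 32, 3) so that it can be consumed by TensorFlow training.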

import numpy as np

image_depth = 3
image_height = 32
image_width = 32

# data:   (nsamples, 32*32*3) float32
# labels: (nsamples,) int32
def prepare_input(data=None, labels=None):
    assert data.shape[1] == image_height * image_width * image_depth
    assert data.shape[0] == labels.shape[0]
    # Standardize each feature (one pixel of one channel) across all samples.
    mu = np.mean(data, axis=0).reshape(1, -1)
    sigma = np.std(data, axis=0).reshape(1, -1)
    data = (data - mu) / sigma
    # A feature with zero standard deviation yields NaN or inf, so report it.
    is_nan = np.isnan(data)
    is_inf = np.isinf(data)
    if np.any(is_nan) or np.any(is_inf):
        print('data is not well-formed : is_nan {n}, is_inf: {i}'.format(
            n=np.any(is_nan), i=np.any(is_inf)))
    # Reshape from (nsamples, 3072) to (nsamples, depth, height, width),
    # then move the channel axis last: (nsamples, height, width, depth).
    data = data.reshape([-1, image_depth, image_height, image_width])
    data = data.transpose([0, 2, 3, 1])
    # Make sure the dtype of the data is np.float32.
    data = data.astype(np.float32)
    return data, labels
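
The function above expects arrays that are already in memory. As a rough sketch of how they might be obtained, the code below reads the "python version" of the CIFAR-10 archive, assuming it has been extracted to a directory containing data_batch_1 to data_batch_5. The helper names load_batch and load_training_set are hypothetical and do not come from the notebook.

```python
import os
import pickle

import numpy as np


def load_batch(path):
    """Read one CIFAR-10 'python version' batch file (hypothetical helper)."""
    with open(path, "rb") as f:
        # The official batches were pickled under Python 2, so keys are bytes.
        batch = pickle.load(f, encoding="bytes")
    data = np.asarray(batch[b"data"], dtype=np.float32)    # (n, 3072)
    labels = np.asarray(batch[b"labels"], dtype=np.int32)  # (n,)
    return data, labels


def load_training_set(batch_dir, n_batches=5):
    """Concatenate data_batch_1 .. data_batch_<n_batches> from batch_dir."""
    parts = [load_batch(os.path.join(batch_dir, "data_batch_%d" % i))
             for i in range(1, n_batches + 1)]
    data = np.concatenate([d for d, _ in parts], axis=0)
    labels = np.concatenate([l for _, l in parts], axis=0)
    return data, labels
```

With the archive extracted, the result of load_training_set("cifar-10-batches-py") could be passed directly to prepare_input, which keeps the images at their original 32x32 size.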

Answer 1: (score: 1)

Note that there is now a built-in function that can load this dataset.
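
The link to that function did not survive on this page, so which one the answer meant is an assumption. One loader that does exist in current TensorFlow is tf.keras.datasets.cifar10.load_data, which downloads the dataset on first use and returns uint8 images of shape (n, 32, 32, 3) with labels of shape (n, 1). A minimal sketch follows; the wrapper names load_cifar10_builtin and to_float_and_flat_labels are hypothetical.

```python
import numpy as np


def load_cifar10_builtin():
    """Load CIFAR-10 through the Keras dataset helper (downloads on first call)."""
    # Imported lazily so the helper below can be used without TensorFlow.
    import tensorflow as tf
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
    return (x_train, y_train), (x_test, y_test)


def to_float_and_flat_labels(images, labels):
    """Scale uint8 images to [0, 1] float32 and flatten (n, 1) labels to (n,)."""
    images = images.astype(np.float32) / 255.0
    labels = labels.reshape(-1).astype(np.int32)
    return images, labels


# Example usage (requires TensorFlow and network access on the first run):
#   (x_train, y_train), (x_test, y_test) = load_cifar10_builtin()
#   x_train, y_train = to_float_and_flat_labels(x_train, y_train)
```

The images come back at their original 32x32 size, so no resizing to 24x24 is involved.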