What file format should training input data be in for TensorFlow?
Does it have to be
image / path / label?
Answer 0 (score: 3)
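There are several ways to handle images in TensorFlow. As one example, I have put together a code snippet that uses NumPy to prepare the CIFAR-10 data for use with TensorFlow.
But first, download the CIFAR-10 Python archive (cifar-10-python.tar.gz) and place the script (.py file) next to the tar.gz file.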
"""Base code is here:
https://github.com/tflearn/tflearn/blob/master/tflearn/datasets/cifar10.py
"""
import os
import pickle
import tarfile

import numpy as np


def to_categorical(labels, num_classes):
    # NumPy stand-in so the script runs on its own: one-hot encodes integer labels.
    return np.eye(num_classes)[np.asarray(labels, dtype=int)]


def Untar(fname):
    # Extract the archive into the directory that contains it.
    if fname.endswith("tar.gz"):
        target_dir = os.path.dirname(fname) or "."
        with tarfile.open(fname) as tar:
            tar.extractall(path=target_dir)
        print("File extracted next to the archive")
    else:
        print("Not a tar.gz file: '%s'" % fname)


def LoadBatch(fpath):
    # Each batch file is a pickled dict; encoding='latin1' lets Python 3
    # read the pickle, which was written by Python 2.
    with open(fpath, 'rb') as f:
        d = pickle.load(f, encoding='latin1')
    return d["data"], d["labels"]


def LoadCifarData(filepath='cifar-10-python.tar.gz',
                  extract_dir='cifar-10-batches-py/',
                  one_hot=False):
    Untar(filepath)

    # Collect the five training batches, then join them into single arrays.
    X_train, Y_train = [], []
    for i in range(1, 6):
        fpath = os.path.join(extract_dir, 'data_batch_' + str(i))
        data, labels = LoadBatch(fpath)
        X_train.append(data)
        Y_train.append(labels)
    X_train = np.concatenate(X_train, axis=0)
    Y_train = np.concatenate(Y_train, axis=0)

    fpath = os.path.join(extract_dir, 'test_batch')
    X_test, Y_test = LoadBatch(fpath)
    Y_test = np.asarray(Y_test)

    # Each row holds 1024 red, 1024 green and 1024 blue values.
    # Stack the three planes, scale to [0, 1], and reshape to (N, 32, 32, 3).
    X_train = np.dstack((X_train[:, :1024], X_train[:, 1024:2048],
                         X_train[:, 2048:])) / 255.
    X_train = np.reshape(X_train, [-1, 32, 32, 3])
    X_test = np.dstack((X_test[:, :1024], X_test[:, 1024:2048],
                        X_test[:, 2048:])) / 255.
    X_test = np.reshape(X_test, [-1, 32, 32, 3])

    if one_hot:
        Y_train = to_categorical(Y_train, 10)
        Y_test = to_categorical(Y_test, 10)

    print(X_train)
    return (X_train, Y_train), (X_test, Y_test)


def main():
    LoadCifarData()


if __name__ == '__main__':
    main()
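The `one_hot` flag in the loader turns integer class labels into one-hot vectors. As a rough illustration of what that conversion produces, here is a minimal NumPy sketch. The helper name `one_hot_encode` is hypothetical and is not part of the code above or of any library.

```python
import numpy as np


def one_hot_encode(labels, num_classes):
    """Hypothetical helper: map integer labels to one-hot rows."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set a single 1.0 per row, in the column given by that row's label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# Three example labels drawn from CIFAR-10's ten classes (0..9).
example = one_hot_encode([3, 0, 9], 10)
```

Each row of `example` has exactly one non-zero entry, so taking `argmax` along axis 1 recovers the original labels.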
The loaded dataset should look like this:
[[0.5372549 0.51764706 0.49411765]
[0.50980392 0.49803922 0.47058824]
[0.49019608 0.4745098 0.45098039]
...
[0.70980392 0.70588235 0.69803922]
[0.79215686 0.78823529 0.77647059]
[0.83137255 0.82745098 0.81176471]]
[[ 0.47843137 0.46666667 0.44705882]
[ 0.4627451 0.45490196 0.43137255]
[ 0.47058824 0.45490196 0.43529412]
...,
[ 0.70196078 0.69411765 0.67843137]
[ 0.64313725 0.64313725 0.63529412]
[ 0.63921569 0.63921569 0.63137255]]]]
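Each innermost row in that output is one pixel's three channel values. To see how the `np.dstack` plus `np.reshape` step gets there from a flat 3072-value row, the sketch below runs it on synthetic data. It assumes the row layout of 1024 red, then 1024 green, then 1024 blue values, and compares the result with an equivalent reshape-and-transpose formulation. All variable names are illustrative.

```python
import numpy as np

# Two fake "images" in the flat layout: 1024 red, 1024 green, 1024 blue values.
n = 2
flat = np.arange(n * 3072, dtype=np.float64).reshape(n, 3072)

# Approach used in the answer: stack the channel planes, then reshape.
stacked = np.dstack((flat[:, :1024], flat[:, 1024:2048], flat[:, 2048:]))
images = stacked.reshape(-1, 32, 32, 3)

# Equivalent formulation: view as (N, C, H, W), then move channels last.
images_alt = flat.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

# The pixel at (row, col) takes its channels from three offsets in the flat row.
row, col = 5, 7
offset = row * 32 + col
pixel = images[0, row, col]
```

Here `pixel` holds `flat[0, offset]`, `flat[0, 1024 + offset]`, and `flat[0, 2048 + offset]`, which confirms that the last axis is the channel axis.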
Once this data is loaded, you can classify the images by building a convolutional neural network, for example with:
https://github.com/tflearn/tflearn/blob/master/examples/images/convnet_cifar10.py
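Whichever network you choose, arrays like `X_train` and `Y_train` are commonly fed to it in shuffled mini-batches. The following is only a sketch of that idea in plain NumPy, using synthetic stand-in arrays. The generator `iterate_minibatches` is a hypothetical helper, not something provided by TensorFlow or tflearn.

```python
import numpy as np


def iterate_minibatches(X, Y, batch_size, seed=0):
    """Hypothetical helper: yield shuffled (X_batch, Y_batch) pairs covering the data once."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], Y[idx]


# Synthetic stand-ins shaped like the loader's output: 10 images, 10 labels.
X_demo = np.zeros((10, 32, 32, 3))
Y_demo = np.arange(10)
batches = list(iterate_minibatches(X_demo, Y_demo, batch_size=4))
```

With 10 samples and a batch size of 4, the last batch is smaller than the others. Whether to keep or drop such a partial batch is a design choice that depends on the model.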