Extracting images from an .idx3-ubyte file or GZIP archive with Python

Date: 2016-11-04 16:22:35

Tags: python

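I built a simple face recognition function using the facerecognizer from OpenCV. It works well on images of people.

Now I would like to test it on handwritten characters instead of people. I came across the MNIST dataset, but it stores its images in a strange file format that I have never seen before.

I just need to extract a few images from:

train-images.idx3-ubyte

and save them in a folder as .gif files.

Or have I misunderstood what MNIST is? If so, where can I get a dataset like this?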

EDIT

I also have the gzip file:

train-images-idx3-ubyte.gz

I am trying to read its contents, but show() does not work, and if I call read() I see random symbols.

import gzip

images = gzip.open("train-images-idx3-ubyte.gz", 'rb')
print(images.read())

EDIT

I managed to get some useful output with:

with gzip.open('train-images-idx3-ubyte.gz','r') as fin:
    for line in fin:
        print('got line', line)

Somehow I now have to convert this into an image. Output:

[image]

8 answers:

Answer 0 (score: 34)

Download the training/test images and labels:

  • train-images-idx3-ubyte.gz: training set images
  • train-labels-idx1-ubyte.gz: training set labels
  • t10k-images-idx3-ubyte.gz: test set images
  • t10k-labels-idx1-ubyte.gz: test set labels

and extract them into a working directory, e.g. samples/.

Get the python-mnist package from PyPI:

pip install python-mnist

Import the mnist package and read the training/test images:

from mnist import MNIST

mndata = MNIST('samples')

images, labels = mndata.load_training()
# or
images, labels = mndata.load_testing()

To display an image on the console:

import random

index = random.randrange(0, len(images))  # choose an index ;-)
print(mndata.display(images[index]))

You will get something like this:

............................
............................
............................
............................
............................
.................@@.........
..............@@@@@.........
............@@@@............
..........@@................
..........@.................
...........@................
...........@................
...........@...@............
...........@@@@@.@..........
...........@@@...@@.........
...........@@.....@.........
..................@.........
..................@@........
..................@@........
..................@.........
.................@@.........
...........@.....@..........
...........@....@@..........
............@@@@............
.............@..............
............................
............................
............................

Notes:

  • Each image in the images list is a Python list of unsigned bytes
  • labels is a Python array of unsigned bytes
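
Because each image arrives as a flat list of pixel values, it has to be split into rows before it can be drawn or saved. A minimal sketch of that step, assuming 28x28 images; to_rows and render are hypothetical helper names (not part of python-mnist), and the threshold of 128 is an arbitrary choice:

```python
def to_rows(flat, width=28):
    """Split a flat pixel list into rows of `width` pixels each."""
    if len(flat) % width != 0:
        raise ValueError("pixel count is not a multiple of the row width")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


def render(rows, threshold=128):
    """Draw rows as text: '@' for bright pixels, '.' for dark ones."""
    return "\n".join(
        "".join("@" if pixel >= threshold else "." for pixel in row)
        for row in rows
    )


# Synthetic 4x4 "image" with a bright diagonal, standing in for MNIST data.
demo = [255 if i % 5 == 0 else 0 for i in range(16)]
demo_rows = to_rows(demo, width=4)
print(render(demo_rows))
```

With real data, to_rows(images[index]) should give 28 rows of 28 values each.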

Answer 1 (score: 7)

(Using only matplotlib, gzip and numpy)
Extract the image data:

import gzip
f = gzip.open('train-images-idx3-ubyte.gz','r')

image_size = 28
num_images = 5

import numpy as np
f.read(16)
buf = f.read(image_size * image_size * num_images)
data = np.frombuffer(buf, dtype=np.uint8).astype(np.float32)
data = data.reshape(num_images, image_size, image_size, 1)

Display an image:

import matplotlib.pyplot as plt
image = np.asarray(data[2]).squeeze()
plt.imshow(image)
plt.show()

Print the first 50 labels:

f = gzip.open('train-labels-idx1-ubyte.gz','r')
f.read(8)
for i in range(0,50):   
    buf = f.read(1)
    labels = np.frombuffer(buf, dtype=np.uint8).astype(np.int64)
    print(labels)
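
The loop above skips the 8-byte header and then reads one label per call. As an alternative sketch, the header can be parsed so that the label count comes from the file itself and all labels are read in one go. This uses only the standard library; read_labels is a hypothetical helper name, and the demo data is made up so that the snippet runs without the real file:

```python
import gzip
import io
import struct


def read_labels(fileobj):
    """Read an IDX1 label stream and return the labels as a list of ints."""
    # Header: two big-endian unsigned 32-bit integers (magic number, count).
    magic, count = struct.unpack(">II", fileobj.read(8))
    if magic != 2049:
        raise ValueError("not an IDX1 unsigned-byte file: magic=%d" % magic)
    data = fileobj.read(count)
    if len(data) != count:
        raise ValueError("file is shorter than its header claims")
    # Iterating over bytes yields ints, so list() gives one int per label.
    return list(data)


# Self-contained demo on an in-memory file holding three fake labels.
fake = gzip.compress(struct.pack(">II", 2049, 3) + bytes([5, 0, 4]))
with gzip.open(io.BytesIO(fake), "rb") as f:
    print(read_labels(f))
```

For the real data, open 'train-labels-idx1-ubyte.gz' with gzip.open in the same way and pass the file object to read_labels.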

Answer 2 (score: 4)

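Install idx2numpy

pip install idx2numpy

Download the data

Download the MNIST dataset from the official website.

Extract the data

In the end, you should have the following files:

train-images-idx3-ubyte
train-labels-idx1-ubyte
t10k-images-idx3-ubyte
t10k-labels-idx1-ubyte

Use idx2numpy

Pass each extracted file to idx2numpy.convert_from_file to read it into a NumPy array, which can then be displayed as an image.

[image: the MNIST picture]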

Answer 3 (score: 4)

import gzip
import numpy as np


def training_images():
    with gzip.open('data/train-images-idx3-ubyte.gz', 'r') as f:
        # first 4 bytes is a magic number
        magic_number = int.from_bytes(f.read(4), 'big')
        # second 4 bytes is the number of images
        image_count = int.from_bytes(f.read(4), 'big')
        # third 4 bytes is the row count
        row_count = int.from_bytes(f.read(4), 'big')
        # fourth 4 bytes is the column count
        column_count = int.from_bytes(f.read(4), 'big')
        # rest is the image pixel data, each pixel is stored as an unsigned byte
        # pixel values are 0 to 255
        image_data = f.read()
        images = np.frombuffer(image_data, dtype=np.uint8)\
            .reshape((image_count, row_count, column_count))
        return images


def training_labels():
    with gzip.open('data/train-labels-idx1-ubyte.gz', 'r') as f:
        # first 4 bytes is a magic number
        magic_number = int.from_bytes(f.read(4), 'big')
        # second 4 bytes is the number of labels
        label_count = int.from_bytes(f.read(4), 'big')
        # rest is the label data, each label is stored as unsigned byte
        # label values are 0 to 9
        label_data = f.read()
        labels = np.frombuffer(label_data, dtype=np.uint8)
        return labels
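
The question asks for image files on disk, which the functions above do not produce. One possible sketch writes a single image as a binary PGM (portable graymap) file. PGM is swapped in for the requested .gif only because it needs no third-party library; OpenCV and many image viewers can read it. save_pgm is a hypothetical helper name and the demo pixels are synthetic:

```python
import os
import tempfile


def save_pgm(path, pixels, rows, cols):
    """Write 8-bit grayscale pixels (row-major bytes) as a binary PGM file."""
    if len(pixels) != rows * cols:
        raise ValueError("expected %d pixels, got %d" % (rows * cols, len(pixels)))
    with open(path, "wb") as out:
        # PGM header: magic "P5", width, height, maximum gray value.
        out.write(b"P5\n%d %d\n255\n" % (cols, rows))
        out.write(bytes(pixels))


# Demo with a synthetic 2x3 image; for the arrays returned by
# training_images(), images[i].tobytes() would supply the pixels.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.pgm")
save_pgm(demo_path, bytes([0, 128, 255, 255, 128, 0]), rows=2, cols=3)
print(os.path.getsize(demo_path))
```

If Pillow is installed, Image.fromarray(images[i]).save('digit.gif') would be the analogous call for writing GIF files directly.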

Answer 4 (score: 1)

Use this to extract the MNIST database into images and CSV labels in Python:

https://github.com/sorki/python-mnist

Answer 5 (score: 1)

Here is a ready-made function for you! (It loads the images binarized, i.e. every pixel is 0 or 1.)

import gzip
import os

import numpy as np
import requests


def load_mnist(train_data=True, test_data=False):
    """
    Get mnist data from the official website and
    load them in binary format.

    Parameters
    ----------
    train_data : bool
        Loads
        'train-images-idx3-ubyte.gz'
        'train-labels-idx1-ubyte.gz'
    test_data : bool
        Loads
        't10k-images-idx3-ubyte.gz'
        't10k-labels-idx1-ubyte.gz' 

    Return
    ------
    tuple
    tuple[0] are images (train & test)
    tuple[1] are labels (train & test)

    """
    RESOURCES = [
        'train-images-idx3-ubyte.gz',
        'train-labels-idx1-ubyte.gz',
        't10k-images-idx3-ubyte.gz',
        't10k-labels-idx1-ubyte.gz']

    # Create the cache directory (and its parent) if it does not exist yet.
    os.makedirs('data/mnist', exist_ok=True)
    for name in RESOURCES:
        path = 'data/mnist/' + name
        if not os.path.isfile(path):
            url = 'http://yann.lecun.com/exdb/mnist/' + name
            r = requests.get(url, allow_redirects=True)
            # Fail loudly rather than caching an HTTP error page.
            r.raise_for_status()
            with open(path, 'wb') as f:
                f.write(r.content)

    return get_images(train_data, test_data), get_labels(train_data, test_data)


def get_images(train_data=True, test_data=False):

    to_return = []

    if train_data:
        with gzip.open('data/mnist/train-images-idx3-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of images
            image_count = int.from_bytes(f.read(4), 'big')
            # third 4 bytes is the row count
            row_count = int.from_bytes(f.read(4), 'big')
            # fourth 4 bytes is the column count
            column_count = int.from_bytes(f.read(4), 'big')
            # rest is the image pixel data, each pixel is stored as an unsigned byte
            # pixel values are 0 to 255
            image_data = f.read()
            train_images = np.frombuffer(image_data, dtype=np.uint8)\
                .reshape((image_count, row_count, column_count))
            to_return.append(np.where(train_images > 127, 1, 0))

    if test_data:
        with gzip.open('data/mnist/t10k-images-idx3-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of images
            image_count = int.from_bytes(f.read(4), 'big')
            # third 4 bytes is the row count
            row_count = int.from_bytes(f.read(4), 'big')
            # fourth 4 bytes is the column count
            column_count = int.from_bytes(f.read(4), 'big')
            # rest is the image pixel data, each pixel is stored as an unsigned byte
            # pixel values are 0 to 255
            image_data = f.read()
            test_images = np.frombuffer(image_data, dtype=np.uint8)\
                .reshape((image_count, row_count, column_count))
            to_return.append(np.where(test_images > 127, 1, 0))

    return to_return


def get_labels(train_data=True, test_data=False):

    to_return = []

    if train_data:
        with gzip.open('data/mnist/train-labels-idx1-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of labels
            label_count = int.from_bytes(f.read(4), 'big')
            # rest is the label data, each label is stored as unsigned byte
            # label values are 0 to 9
            label_data = f.read()
            train_labels = np.frombuffer(label_data, dtype=np.uint8)
            to_return.append(train_labels)
    if test_data:
        with gzip.open('data/mnist/t10k-labels-idx1-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of labels
            label_count = int.from_bytes(f.read(4), 'big')
            # rest is the label data, each label is stored as unsigned byte
            # label values are 0 to 9
            label_data = f.read()
            test_labels = np.frombuffer(label_data, dtype=np.uint8)
            to_return.append(test_labels)

    return to_return
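
The four nearly identical parsing blocks above differ only in how many dimension sizes follow the magic number. In the IDX format, the magic number itself encodes this: two zero bytes, a type code (0x08 for unsigned bytes) and the number of dimensions. A sketch of a single reader built on that, using only the standard library; read_idx is a hypothetical helper name and the demo payload is made up:

```python
import gzip
import io
import struct


def read_idx(fileobj):
    """Parse an IDX stream of unsigned bytes; return (shape, raw_data)."""
    # Magic number: two zero bytes, a type code, and the dimension count.
    zero, type_code, ndim = struct.unpack(">HBB", fileobj.read(4))
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an IDX file of unsigned bytes")
    # One big-endian 32-bit size per dimension follows the magic number.
    shape = struct.unpack(">" + "I" * ndim, fileobj.read(4 * ndim))
    expected = 1
    for size in shape:
        expected *= size
    data = fileobj.read(expected)
    if len(data) != expected:
        raise ValueError("file is shorter than its header claims")
    return shape, data


# Demo: two fake 2x2 images packed the way an idx3-ubyte file is laid out.
payload = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 2) + bytes(range(8))
with gzip.open(io.BytesIO(gzip.compress(payload)), "rb") as f:
    shape, data = read_idx(f)
print(shape, list(data))
```

With NumPy, np.frombuffer(data, dtype=np.uint8).reshape(shape) should then give the same arrays that get_images and get_labels build before binarizing.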

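Answer 6 (score: 0)

You can actually use the idx2numpy package available on PyPI. It is very simple to use and converts the data directly into numpy arrays. Here is what you have to do:

Download the data

Download the MNIST dataset from the official website.
If you are on Linux, you can use wget to fetch it straight from the command line. Just run: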

wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz

Extract the data

Decompress the data. On Linux, you can use gzip.

In the end, you should have the following files:

data/train-images-idx3-ubyte
data/train-labels-idx1-ubyte
data/t10k-images-idx3-ubyte
data/t10k-labels-idx1-ubyte

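The data/ prefix is only there because I extracted the files into a folder named data. Judging from your question, you have already got this far, so read on.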
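
If the gzip command is not available (for example on Windows), the same extraction step can be done from Python. A minimal sketch using only the standard library; gunzip is a hypothetical helper name, and the demo works on a throwaway file rather than the real archives:

```python
import gzip
import os
import shutil
import tempfile


def gunzip(src, dst=None):
    """Decompress `src` (a .gz file) to `dst`; default drops the .gz suffix."""
    if dst is None:
        if not src.endswith(".gz"):
            raise ValueError("cannot derive output name: %r lacks .gz" % src)
        dst = src[:-3]
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        # Stream in chunks so the whole file never has to sit in memory.
        shutil.copyfileobj(fin, fout)
    return dst


# Demo on a throwaway file instead of the real MNIST archives.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "demo-idx3-ubyte.gz")
with gzip.open(archive, "wb") as f:
    f.write(b"\x00\x00\x08\x03demo")
extracted = gunzip(archive)
print(os.path.basename(extracted))
```

Calling gunzip('data/train-images-idx3-ubyte.gz') should leave data/train-images-idx3-ubyte next to the archive.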

Use idx2numpy

Here is a simple piece of Python code that reads everything in an extracted file into a numpy array.

import idx2numpy
import numpy as np
file = 'data/train-images-idx3-ubyte'
arr = idx2numpy.convert_from_file(file)
# arr is now a np.ndarray type of object of shape 60000, 28, 28

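You can now use it with OpenCV in the same way you would display any other image, for example:

import cv2 as cv  # assumes OpenCV's Python bindings are imported under the name cv

cv.imshow("Image", arr[4])
cv.waitKey(0)  # keep the window open until a key is pressed

To install idx2numpy, you can use PyPI (the pip package manager). Just run: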

pip install idx2numpy

Answer 7 (score: -2)

I had the same problem.

Whenever I extracted the file, the extension was not removed, so I had:

train-images-idx3-ubyte.gz

Once I removed the .gz, I had:

train-images-idx3-ubyte

This solved my problem.
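
Since the extension can be misleading in a case like this, one way to tell whether a file is still compressed is to look at its first two bytes: gzip data always starts with 1f 8b. A small sketch using only the standard library; is_gzipped is a hypothetical helper name and the demo files are made up:

```python
import gzip
import os
import tempfile


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 1f 8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"


# Demo: a compressed file and a plain one, named misleadingly on purpose.
workdir = tempfile.mkdtemp()
compressed = os.path.join(workdir, "looks-plain")
plain = os.path.join(workdir, "looks-compressed.gz")
with gzip.open(compressed, "wb") as f:
    f.write(b"payload")
with open(plain, "wb") as f:
    f.write(b"\x00\x00\x08\x03payload")
print(is_gzipped(compressed), is_gzipped(plain))
```

A file that is already extracted can be renamed without the .gz suffix, whereas one that is still compressed needs to be opened with gzip.open or extracted first.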