Question

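I built a neural network and it works fine, but now I'm trying to use it for image recognition. So I wrote a program that takes each 3x3 square of an image and averages all the color values, so that it becomes 1 RGB tuple instead of 9 RGB tuples. It then builds one large list of all these tuples so I can feed it into my neural network. Here is the code that converts an image into usable data: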

from PIL import Image

def norm(lst):
    # Drop the empty placeholder slots left by partial blocks at the image border.
    newtemp = [value for value in lst if value]
    R = [pixel[0] for pixel in newtemp]
    G = [pixel[1] for pixel in newtemp]
    B = [pixel[2] for pixel in newtemp]

    # Average each channel over the block, then scale it to the range [0, 1].
    R = (sum(R) / float(len(R)))/255.
    G = (sum(G) / float(len(G)))/255.
    B = (sum(B) / float(len(B)))/255.

    return [R,G,B]

def imgConverter(img):
    im = Image.open(img)
    im = im.convert("RGB")
    edge = 3  # side length of each averaged block, in pixels
    width = im.size[0]
    height = im.size[1]
    pix = im.load()
    color = []
    for x in range(0, width, edge):
        for y in range(0, height, edge):
            # Blocks on the right and bottom borders can be smaller than edge x edge.
            xmin = min(x+edge, width)-x
            ymin = min(y+edge, height)-y
            temp = [[] for _ in range(edge*edge)]
            for xpos in range(0, xmin):
                for ypos in range(0, ymin):
                    # Use the fixed row stride `edge` so no two pixels share a slot
                    # (ymin*ypos+xpos collides whenever xmin > ymin).
                    temp[edge*ypos+xpos] = pix[x+xpos, y+ypos]
            color.append(norm(temp))
    return color

def datacrunching(color):
    # Flatten to one list: all R values first, then all G, then all B.
    newdata = []
    for RGB in range(3):
        for value in range(len(color)):
            newdata.append(color[value][RGB])
    return newdata

data = str(datacrunching(imgConverter("cat0.jpg")))
with open("img_data.txt", "w") as out_file:
    out_file.write(data)

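This greatly reduces the amount of data to process, but even so, a 500x500 image gives an array of length 83667, which is very large. Feeding it into my neural network is too slow, because every number in the array is an input node. Is the problem the way I reduce the image data (collapsing each 3x3 square into a 1x1 square), or the way I feed it into my neural network? If it is the way I feed it, what should I do? Can someone please help? Thanks!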
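For reference, the figure 83667 follows directly from the block size. The check below is only a rough sketch, assuming the 500x500 image and `edge = 3` from the code above; the variable names are made up for illustration.

```python
import math

width, height, edge = 500, 500, 3

# Stepping from 0 to 500 in steps of 3 gives ceil(500 / 3) = 167 block
# positions per axis (the last block on each axis is only partly filled).
blocks_x = math.ceil(width / edge)
blocks_y = math.ceil(height / edge)

# Each block contributes one averaged (R, G, B) triple, i.e. three inputs.
n_inputs = blocks_x * blocks_y * 3
print(n_inputs)  # 83667
```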

Answer 1

You can use PIL to convert the image to grayscale.

from PIL import Image
image_file = Image.open("convert_image.png")  # open colour image
image_file = image_file.convert('1')  # '1' = 1-bit black and white; use 'L' for 8-bit grayscale

This is much faster than taking the average yourself.

image_file now holds the image as a black-and-white picture. You can convert it to a NumPy matrix and store that as a text file.
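A minimal sketch of that conversion step, assuming NumPy is installed. A small synthetic image stands in for the real file so the snippet runs on its own, the output name img_data.txt is borrowed from the question, and mode 'L' (8-bit grayscale) is used here instead of '1'; all of these are illustrative choices, not part of the original answer.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# Synthetic 6x4 RGB image standing in for "convert_image.png" (assumption:
# an image loaded with Image.open would be handled the same way).
image_file = Image.new("RGB", (6, 4), color=(255, 0, 0))

# Mode "L" is 8-bit grayscale: one value in 0..255 per pixel.
gray = image_file.convert("L")

# PIL reports size as (width, height); the NumPy array is (height, width).
matrix = np.array(gray)

# Store the matrix as plain text, one image row per line.
out_path = os.path.join(tempfile.mkdtemp(), "img_data.txt")
np.savetxt(out_path, matrix, fmt="%d")

# Reading the file back recovers the same matrix.
restored = np.loadtxt(out_path, dtype=np.uint8)
```

Plain text is easy to inspect but bulky; for large images a binary format such as `np.save` may be a better fit.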

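Training: after the previous step you should have the image as a 2D matrix of size 500 x 500. If you feed it to an ANN, you can flatten it into a 1D array of dimension (250000, 1); if you use a CNN, you can keep the 2D input. For images, a CNN is preferred.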
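The two input layouts can be sketched as follows. This is only an illustration, assuming NumPy; random values stand in for a real 500 x 500 grayscale matrix, and the scaling to [0, 1] mirrors the division by 255 in the question rather than anything this answer requires.

```python
import numpy as np

# Stand-in for the 500 x 500 grayscale matrix from the previous step
# (random values; a real image matrix would be used instead).
rng = np.random.default_rng(0)
matrix = rng.integers(0, 256, size=(500, 500), dtype=np.uint8)

# Scale pixel values to [0, 1] (an assumption carried over from the question).
scaled = matrix.astype(np.float32) / 255.0

# ANN (fully connected): one input node per pixel, shape (250000, 1).
ann_input = scaled.reshape(-1, 1)

# CNN: keep the 2D layout so neighbouring pixels stay adjacent.
# Most frameworks also expect batch and channel axes; where those go
# depends on the framework, so they are left out here.
cnn_input = scaled

print(ann_input.shape, cnn_input.shape)
```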

What is a better way to process image data for a neural network?
