Question

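I have 256×192 pixel images and am looking for a small, fast CNN to act as a "pre-scanner" that finds interesting regions of the image (e.g. 52x52 or 32x32 blocks), which will then be examined by a more complex CNN. The reason for such a small CNN is that it has to run on an embedded system with limited resources.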

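Unfortunately, I am new to this topic, to TensorFlow, and to Keras. My first idea was to create a network with just a single 2D convolution that works like a 1D convolution. In this case the kernel would have a height of 192 and a width of 1 (maybe 3 later).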

Here is the model I built with TensorFlow 2:

# Model
model = models.Sequential()
model.add(layers.Conv2D(5, (1, 192), activation='relu', input_shape=(256, 192, 3)))

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

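The idea is to get one value per row that indicates whether the row may contain something "interesting". Based on this information and the neighbouring rows, larger parts of the image will be cut out and fed to the more complex CNN.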
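As a quick sanity check on that idea, the output size of an unpadded, stride-1 convolution can be worked out by hand. The sketch below is plain Python, not Keras; the helper name `valid_conv_output_shape` is made up for illustration, and it assumes the Keras defaults (`padding='valid'`, stride 1) that the model above relies on.

```python
def valid_conv_output_shape(input_hw, kernel_hw, filters):
    """Output shape (dim0, dim1, channels) of an unpadded, stride-1 2D convolution."""
    out_0 = input_hw[0] - kernel_hw[0] + 1
    out_1 = input_hw[1] - kernel_hw[1] + 1
    if out_0 < 1 or out_1 < 1:
        raise ValueError("kernel is larger than the input")
    return (out_0, out_1, filters)


# Input (256, 192), kernel (1, 192), 5 filters, as in the model above.
shape = valid_conv_output_shape((256, 192), (1, 192), filters=5)
print(shape)  # (256, 1, 5)
```

With these numbers the layer produces five values per position along the 256-long axis, plus an axis of length 1, rather than a single value per row.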

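I prepared ordinary 256x192 px images and, for each image, a text file with 256 values (0.0 or 1.0). Each 0/1 represents one row and indicates whether that row contains something interesting.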
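A label file in this format could be turned into numbers along the following lines. This is only a sketch: the helper name `parse_row_labels` and the sample line are hypothetical, and it assumes the values sit on one line, separated by commas.

```python
def parse_row_labels(line, expected_count=256):
    """Parse one line of comma-separated 0.0/1.0 flags into a list of floats."""
    # float() tolerates the space that follows each comma.
    values = [float(token) for token in line.strip().split(',')]
    if len(values) != expected_count:
        raise ValueError(f"expected {expected_count} values, got {len(values)}")
    return values


# Hypothetical sample: 256 flags, the first two rows marked as interesting.
sample_line = ', '.join(['1.0', '1.0'] + ['0.0'] * 254) + '\n'
labels = parse_row_labels(sample_line)
print(len(labels), labels[:3])  # 256 [1.0, 1.0, 0.0]
```

Without the `float(...)` conversion, splitting the line only yields strings.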

That was my naive plan, but training crashes immediately with an error I do not understand:

ValueError: Dimensions must be equal, but are 32 and 256 for 'metrics/accuracy/Equal' (op: 'Equal') with input shapes: [32,256], [32,256,1].

I suspect my basic idea/strategy is wrong. I do not understand where the 32 comes from. Can someone explain my mistake? Is my idea even feasible?
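For reference, the two shapes in the error message can be reproduced without TensorFlow. The NumPy sketch below rests on two assumptions: that the 32 is the batch size (Keras' `fit` uses 32 when `batch_size` is not given), and that the accuracy metric takes an argmax over the last axis of the predictions before comparing them with the labels. The variable names are illustrative.

```python
import numpy as np

batch_size = 32  # assumption: Keras' default batch size for model.fit

# Conv2D(5, (1, 192)) on a (256, 192, 3) input gives (256, 1, 5) per image.
predictions = np.zeros((batch_size, 256, 1, 5))
# One 0/1 flag per row and image.
labels = np.zeros((batch_size, 256))

# Assumed metric behaviour: argmax over the last (filter) axis of the predictions.
predicted_classes = predictions.argmax(axis=-1)

print(labels.shape)             # (32, 256)
print(predicted_classes.shape)  # (32, 256, 1)
```

Under these assumptions the labels and the predicted classes differ by a trailing axis of length 1, which matches the pair of shapes reported in the error.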

Edit:

Here is the complete, messy code, as requested. Please forgive any major flaws; this is my first Python experiment.

import tensorflow as tf
import os
from tensorflow.keras import datasets, layers, models
from PIL import Image
from numpy import asarray

train_images = []
train_labels = []
test_images = []
test_labels = []

# Prepare
dir = os.listdir('images/gt_image')
split = len(dir)*0.2

c = 0

for file in dir:
    c = c + 1

    im = Image.open('images/gt_image/' + file)

    data = im.load()
    image = []

    for x in range(0, im.size[0]):
        row = []
        for y in range(0, im.size[1]):
            row.append([x/255 for x in data[x, y]])
        image.append(row)
    if c <= split:
        test_images.append(image)
    else:
        train_images.append(image)

    file = open('images/gt_labels/' + file + '.txt', 'r')
    label = file.readlines()[0].split(', ')

    if c <= split:
        test_labels.append(label)
    else:
        train_labels.append(label)
print('prepare done')

# Model
model = models.Sequential()
model.add(layers.Conv2D(5, (1, 192), activation='relu', input_shape=(256, 192, 3)))

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

print('compile done')

# Learning
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))

Using a 2D convolution like a "1D convolution" on images?

0 answers: