I'm new to ML. I'm trying to make a basic example of image classification with digits. I created my own dataset, but I get poor accuracy (11%). I have 246 items for training and 62 items for testing. Here is my code:
#TRAINING
import os
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image

def load_data(input_path, img_height, img_width):
    data = []
    labels = []
    for imagePath in os.listdir(input_path):
        labels_path = os.path.join(input_path, imagePath)
        if os.path.isdir(labels_path):
            for img_path in os.listdir(labels_path):
                labels.append(imagePath)
                img_full_path = os.path.join(labels_path, img_path)
                img = image.load_img(img_full_path, target_size=(img_height, img_width))
                img = image.img_to_array(img)
                data.append(img)
    return data, labels
train_data = []
train_labels = []
test_data = []
test_labels = []
train_data, train_labels = load_data(train_path, 28, 28)
test_data, test_labels = load_data(test_path, 28, 28)
train_data = np.array(train_data)
train_data = train_data / 255.0
train_data = tf.reshape(train_data, train_data.shape[:3])
train_labels = np.array(train_labels)
train_labels = np.asfarray(train_labels,float)
test_data = np.array(test_data)
test_data = tf.reshape(test_data, test_data.shape[:3])
test_data = test_data / 255.0
test_labels = np.array(test_labels)
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_data, train_labels, batch_size=10, epochs=5, steps_per_epoch=246)
test_loss, test_acc = model.evaluate(test_data, test_labels, steps=1)
print('Test accuracy:', test_acc)
#CLASSIFICATION
def classify(input_path):
    if os.path.isdir(input_path):
        images = []
        for file_path in os.listdir(input_path):
            full_path = os.path.join(input_path, file_path)
            img_tensor = preprocess_images(full_path, 28, 28, "L")
            images.append(img_tensor)
        images = np.array(images)
        images = tf.reshape(images, (images.shape[0], images.shape[2], images.shape[3]))
        predictions = model.predict(images, steps=1)
        for i in range(len(predictions)):
            print("Image", i, "is", np.argmax(predictions[i]))
def preprocess_images(image_path, img_height, img_width, mode):
    img = image.load_img(image_path, target_size=(img_height, img_width))
    #convert 3-channel image to 1-channel
    img = img.convert(mode)
    img_tensor = image.img_to_array(img)
    img_tensor = np.expand_dims(img_tensor, axis=0)
    img_tensor /= 255.0
    img_tensor = tf.reshape(img_tensor, img_tensor.shape[:3])
    return tf.keras.backend.eval(img_tensor)
When making predictions, I always get the result "Image is 5". So I have two questions:

- How can I get the other [0-9] classes as output?
- Can I improve the accuracy by increasing the amount of data?

Thanks.
Answer 0 (score: 1)
Your load_data() function is the culprit: you need to return the dataset's labels as integers, not as the string file path.

Can you get better accuracy by increasing the amount of data?

Generally, yes.

There is nothing inherently wrong with your model. I obviously don't have access to the dataset you created, but we can test it on the MNIST dataset (which your dataset is presumably trying to mirror):
(train_data, train_labels),(test_data, test_labels) = tf.keras.datasets.mnist.load_data()
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_data, train_labels, batch_size=10, epochs=5)
test_loss, test_acc = model.evaluate(test_data, test_labels)
print('Test accuracy:', test_acc)
Doing this, we can train to roughly 93% accuracy:
Test accuracy: 0.9275
Your inference code then also works as expected on the test data:
predictions = model.predict(test_data)
for i in range(len(predictions)):
    print("Image", i, "is", np.argmax(predictions[i]))
giving the output you would expect:
Image 0 is 7
Image 1 is 2
Image 2 is 1
Image 3 is 0
Image 4 is 4
...
So we know the model works. Is the difference in performance then simply down to the size of your dataset (246) compared with MNIST (60000)?

This is easy to test: we can take a similarly sized slice of the MNIST data and repeat the exercise:
train_data = train_data[:246]
train_labels = train_labels[:246]
test_data = test_data[:62]
test_labels = test_labels[:62]
Now I do see greatly reduced accuracy (around 66% this time), but even with such a small subset I can still train the model to a much higher accuracy than the one you are seeing.
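To see how accuracy scales with the amount of training data, here is a sketch that repeats this experiment at a few subset sizes (using the same model definition as above; the sizes chosen are just illustrative):

```python
import tensorflow as tf

(train_data, train_labels), (test_data, test_labels) = tf.keras.datasets.mnist.load_data()
train_data, test_data = train_data / 255.0, test_data / 255.0

def accuracy_for(n):
    """Train the same small model on the first n MNIST examples
    and return accuracy on the full test set."""
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(train_data[:n], train_labels[:n], epochs=5, verbose=0)
    return model.evaluate(test_data, test_labels, verbose=0)[1]

for n in (246, 1000, 5000):
    print(n, "training examples ->", accuracy_for(n))
```

You should see accuracy climb steadily as n grows, which is the usual pattern: more (correctly labelled) data helps.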
In fact, looking at your load_data() function, I can see the problem is in the labels you are generating. Your labels are just taken from the image path. You have this:
# --snip--
for img_path in os.listdir(labels_path):
    labels.append(imagePath)  ## <-- this does not look right!
# --snip--
You need to populate labels with the integer value of the category each image belongs to (for digits, an integer between 0 and 9).
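For example, assuming each class directory in your dataset is named after its digit (0/ through 9/, which your directory-per-class layout suggests), a minimal fix is to convert the directory name to an integer when building the labels. A sketch of the corrected load_data():

```python
import os
import numpy as np
from tensorflow.keras.preprocessing import image

def load_data(input_path, img_height, img_width):
    data = []
    labels = []
    for class_dir in os.listdir(input_path):
        labels_path = os.path.join(input_path, class_dir)
        if os.path.isdir(labels_path):
            for img_path in os.listdir(labels_path):
                # use the directory name as an integer class label,
                # not the path string (assumes directories named "0".."9")
                labels.append(int(class_dir))
                img_full_path = os.path.join(labels_path, img_path)
                img = image.load_img(img_full_path, target_size=(img_height, img_width))
                data.append(image.img_to_array(img))
    return np.array(data), np.array(labels)
```

With integer labels, sparse_categorical_crossentropy receives targets in the range the 10-way softmax expects, and the model can learn all ten classes instead of collapsing to one.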