Error when using the original image size

Time: 2018-05-21 15:05:02

Tags: python image-processing tensorflow loss-function

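I'm quite new to TensorFlow, so I tried adapting the code from the tutorial to feed a few layers with images of size (944, 944) and yes/no classes (1, 0), to see how it behaves, but I haven't been able to make it work. The latest error I get is: "Dimension size must be evenly divisible by 57032704 but is 3565440 for 'Reshape_1' with input shapes: [10,236,236,64], [2] and with input tensors computed as partial shapes: input[1] = [?,57032704]".

I don't know whether the error comes from one of the reshape operations or whether I simply can't feed the network like this. The code is as follows: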

import tensorflow as tf
import numpy as np
import os
# import cv2
from scipy import ndimage
import PIL

tf.logging.set_verbosity(tf.logging.INFO)

def define_model(features, labels, mode):
    """Model function for CNN."""
    # Input Layer
    input_layer = tf.reshape(features["x"], [-1, 944, 944, 1])

    # Convolutional Layer #1
    conv1 = tf.layers.conv2d(
        inputs=input_layer,
        filters=32,
        kernel_size=[16, 16],
        padding="same",
        activation=tf.nn.relu)

    # Pooling Layer #1
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

    # Convolutional Layer #2 and Pooling Layer #2
    conv2 = tf.layers.conv2d(
        inputs=pool1,
        filters=64,
        kernel_size=[16, 16],
        padding="same",
        activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

    # Dense Layer
    pool2_flat = tf.reshape(pool2, [-1, 944*944*64])
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
    dropout = tf.layers.dropout(
        inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

    # Logits Layer - raw predictions
    logits = tf.layers.dense(inputs=dropout, units=10)

    predictions = {
        # Generate predictions (for PREDICT and EVAL mode)
        "classes": tf.argmax(input=logits, axis=1),
        # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
        # `logging_hook`.
        "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Calculate Loss (for both TRAIN and EVAL modes)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    # Configure the Training Op (for TRAIN mode)
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
        train_op = optimizer.minimize(
            loss=loss,
            global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics (for EVAL mode)
    eval_metric_ops = {
        "accuracy": tf.metrics.accuracy(
            labels=labels, predictions=predictions["classes"])}
    return tf.estimator.EstimatorSpec(
        mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

if __name__ == '__main__':
    # Load training and eval data
    # mnist = tf.contrib.learn.datasets.load_dataset("mnist")
    # train_data = mnist.train.images  # Returns np.array
    # train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
    train_data, train_labels = load_images("C:\\Users\\Heads\\Desktop\\BDManchas_Semi")

    eval_data = train_data.copy()
    eval_labels = train_labels.copy()

    # Create the Estimator
    classifier = tf.estimator.Estimator(
        model_fn=define_model, model_dir="/tmp/convnet_model")

    # Set up logging for predictions
    tensors_to_log = {"probabilities": "softmax_tensor"}
    logging_hook = tf.train.LoggingTensorHook(
        tensors=tensors_to_log, every_n_iter=50)

    # Train the model
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": train_data},
        y=train_labels,
        batch_size=10,
        num_epochs=None,
        shuffle=True)
    classifier.train(
        input_fn=train_input_fn,
        steps=20000,
        hooks=[logging_hook])

    # Evaluate the model and print results
    eval_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": eval_data},
        y=eval_labels,
        num_epochs=1,
        shuffle=False)
    eval_results = classifier.evaluate(input_fn=eval_input_fn)
    print(eval_results)

--------------------------------- MORE: ---------------------------------

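OK, now that I have fixed the reshape, I get another error: the loss during training is NaN. I have been researching this problem (there is a good answer here), but every new function I try gives a different error. I tried changing the loss from: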

loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

to:

loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)

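But this also seems to be a reshape problem: the error says that logits and labels must have the same shape ((10, 10) vs (10,)). I tried reshaping the logits and the labels, but I always get a different error (I guess there is no way to make the two arrays match).

The labels are defined as follows: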

list_of_classes = []
# if ... class == 1
list_of_classes.append(1)
#else
list_of_classes.append(0)

labels = np.array(list_of_classes).astype("int32") 

Any ideas about how to use an appropriate loss?

2 answers:

Answer 0 (score: 1)

Initial problem

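The output of the second pooling layer (pool2) has shape (batch_size, 236, 236, 64): the convolutions use padding="same" and keep the spatial size, while each 2x2 pooling with stride 2 halves it (944 -> 472 -> 236). Trying to reshape it to (-1, 944*944*64) (pool2_flat) therefore throws an error.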
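The numbers in the error message can be checked by hand. The following is a minimal sketch in plain Python (no TensorFlow needed); the helper names `pool_output_size` and `flat_feature_count` are made up for illustration, and it assumes the architecture from the question (two 2x2 poolings with stride 2, 64 filters in the last convolution, batch size 10):

```python
def pool_output_size(size, pool_size=2, stride=2):
    """Spatial size after max pooling with the default 'valid' padding."""
    return (size - pool_size) // stride + 1


def flat_feature_count(image_side, n_pools=2, last_filters=64):
    """Return (spatial side, features per image) after the pooling layers.

    Convolutions with padding="same" keep the spatial size, so only the
    pooling layers are taken into account here.
    """
    side = image_side
    for _ in range(n_pools):
        side = pool_output_size(side)
    return side, side * side * last_filters


side, features = flat_feature_count(944)
requested = 944 * 944 * 64      # second dimension asked for in the reshape
batch_elements = 10 * features  # elements really present in a batch of 10

print("pool2 spatial side:", side)
print("features per image:", features)
print("requested per image:", requested)
# A reshape to [-1, requested] needs the total to be a multiple of `requested`.
print("reshape possible:", batch_elements % requested == 0)
```

A batch of 10 images does not even contain enough elements for one row of the requested size, which is why the reshape cannot succeed.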

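To avoid this, you can define pool2_flat as:

pool2_shape = tf.shape(pool2)
pool2_flat = tf.reshape(pool2, [-1, pool2_shape[1] * pool2_shape[2] * pool2_shape[3]])
# or directly:
pool2_flat = tf.reshape(pool2, [-1, 236 * 236 * 64])  # if your dimensions are fixed...
# or more simply, as suggested by @xdurch0:
pool2_flat = tf.layers.flatten(pool2)

Regarding your edit

Without knowing how the labels are defined, it is hard to say what is going wrong. labels must have shape (None,) (the class ID of each image in the batch), while logits must have shape (None, nb_classes) (the estimated probability of each class, for each image in the batch).

The following code works for me:

(code block not preserved in the source)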
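To illustrate the shape requirement described above, here is an editorial sketch in plain Python; it is not the answerer's code and not TensorFlow's implementation. The function name `sparse_softmax_cross_entropy` merely mirrors the TensorFlow loss it imitates. It assumes one integer class ID per image and one row of nb_classes scores per image; for the yes/no problem in the question that would presumably mean nb_classes = 2 rather than 10. By contrast, tf.nn.sigmoid_cross_entropy_with_logits expects labels and logits of identical shape, which is consistent with the (10, 10) vs (10,) complaint.

```python
import math


def sparse_softmax_cross_entropy(logits, labels):
    """Mean cross-entropy for integer class IDs (illustrative sketch).

    logits: list of rows, one row of nb_classes raw scores per image
    labels: list of ints, one class ID in [0, nb_classes) per image
    """
    if len(logits) != len(labels):
        raise ValueError("need exactly one label per row of logits")
    losses = []
    for row, label in zip(logits, labels):
        # Log-sum-exp with the row maximum subtracted, so that large
        # scores do not overflow in exp().
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        losses.append(log_sum_exp - row[label])
    return sum(losses) / len(losses)


# Three images, two classes: logits are (3, 2), labels are (3,).
logits = [[2.0, -1.0], [0.0, 0.0], [-3.0, 1.5]]
labels = [0, 1, 1]
loss = sparse_softmax_cross_entropy(logits, labels)
print("mean loss:", loss)
```

The point of the sketch is only the pairing of shapes: the labels stay a flat vector of class IDs, and the class dimension lives in the logits alone.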

Answer 1 (score: -1)

So the solution is to change this line:

pool2_flat = tf.reshape(pool2, [-1,944*944*64])

to this line:

pool2_flat = tf.layers.flatten(pool2)

In addition, I had to use images resized to 512x512 instead of 944x944, because the full-size images did not fit in memory...
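A rough estimate suggests why the image size matters so much here. The sketch below counts only the weight matrix of the first dense layer, assuming float32 weights and the architecture from the question (two 2x2 poolings, 64 filters, 1024 dense units); the helper name `dense_weight_bytes` is hypothetical, and activations, gradients and optimizer state are ignored, so real usage would be higher:

```python
BYTES_PER_FLOAT32 = 4


def dense_weight_bytes(image_side, n_pools=2, last_filters=64, dense_units=1024):
    """Bytes needed for the weights between the flattened pool2 and the dense layer."""
    side = image_side // (2 ** n_pools)      # each 2x2/stride-2 pooling halves the side
    flat_features = side * side * last_filters
    return flat_features * dense_units * BYTES_PER_FLOAT32


for image_side in (944, 512):
    gib = dense_weight_bytes(image_side) / 2 ** 30
    print("%dx%d input: about %.1f GiB of dense-layer weights"
          % (image_side, image_side, gib))
```

Under these assumptions the dense layer alone needs several times more memory at 944x944 than at 512x512, so resizing the images (or adding more pooling before the dense layer) is a plausible way to make the model fit.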