Modifying the TensorFlow cifar10 code to read images

Asked: 2016-01-07 11:10:33

Tags: machine-learning classification deep-learning tensorflow

I am trying to modify the code in cifar10.py so that I can feed my own images to the network.

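I can actually run the code and start the training process, but after a while, when I run TensorBoard, the "images" section always shows the same image. In addition, the cross entropy drops to zero. I think I am loading the images incorrectly.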

Here is the code:

    import glob
    import os

    import tensorflow as tf


    def distorted_inputs():
        # Read the file that lists every image directory, one per line.
        filedirs = [line.rstrip('\n') for line in open('image_dirs.txt')]

        # Build a list of "filename label" strings.
        filenames = []
        i = 0

        for f in filedirs:
            png_files_path = glob.glob(os.path.join(f, '*.[pP][nN][gG]'))
            print('found ' + str(len(png_files_path)) + ' files in ' + f)
            for filename in png_files_path:
                # Store "file_name label"; the directory index is the label.
                s = filename + " " + str(i)
                filenames.append(s)
            i = i + 1

        # Create a queue that produces the filenames to read and the labels.
        filename_queue = tf.train.string_input_producer(filenames)

        my_img, label = read_my_file_format(filename_queue.dequeue())
        label = tf.string_to_number(label, tf.int32)
        init_op = tf.initialize_all_variables()
        with tf.Session() as sess:
            sess.run(init_op)

            # Start populating the filename queue.
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(coord=coord)

            image = my_img.eval()

            coord.request_stop()
            coord.join(threads)

        reshaped_image = tf.cast(image, tf.float32)

        resized_image = tf.image.resize_image_with_crop_or_pad(
            reshaped_image, IMAGE_SIZE, IMAGE_SIZE)

        distorted_image = tf.image.random_crop(reshaped_image, [24, 24])

        # Randomly flip the image horizontally.
        distorted_image = tf.image.random_flip_left_right(distorted_image)

        # Because these operations are not commutative, consider randomizing
        # the order in which they are applied.
        distorted_image = tf.image.random_brightness(distorted_image, max_delta=63)
        distorted_image = tf.image.random_contrast(distorted_image, lower=0.2, upper=1.8)

        # Subtract off the mean and divide by the standard deviation of the pixels.
        float_image = tf.image.per_image_whitening(distorted_image)

        # Ensure that the random shuffling has good mixing properties.
        min_fraction_of_examples_in_queue = 0.4
        min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                                 min_fraction_of_examples_in_queue)
        print('Filling queue with ITSD images before starting to train. '
              'This will take a few minutes.')

        # Generate a batch of images and labels by building up a queue of examples.
        return _generate_image_and_label_batch(float_image, label, min_queue_examples)

The image-reading part comes from https://github.com/HamedMP/ImageFlow, and the custom reader comes from "Tensorflow read images with labels". The relevant function is implemented as follows:

 
    def read_my_file_format(filename_and_label_tensor):
        """Consumes a single filename and label as a ' '-delimited string.

        Args:
          filename_and_label_tensor: A scalar string tensor.

        Returns:
          Two tensors: the decoded image, and the string label.
        """
        filename, label = tf.decode_csv(filename_and_label_tensor, [[""], [""]], " ")

        file_contents = tf.read_file(filename)
        example = tf.image.decode_png(file_contents)
        return example, label
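
The "filename label" encoding used above can be illustrated without TensorFlow. The following is only a minimal sketch in plain Python; `encode_example` and `decode_example` are hypothetical helper names, not part of cifar10.py or ImageFlow. It splits on the last space only, because a path that itself contains a space would otherwise yield more than two fields.

```python
def encode_example(path, label):
    """Join a file path and an integer label into one ' '-delimited string."""
    return "%s %d" % (path, label)


def decode_example(line):
    """Split an encoded string back into (path, label).

    Splitting from the right, once, keeps spaces inside the path intact.
    """
    path, label = line.rsplit(" ", 1)
    return path, int(label)


# Hypothetical file names, used only for illustration.
examples = [encode_example("cats/img 01.png", 0),
            encode_example("dogs/img02.png", 1)]
decoded = [decode_example(e) for e in examples]
```

If the image directories may contain spaces, a delimiter that cannot occur in a path (or two parallel lists) would be a safer choice than a single space.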


1 Answer:

Answer 0 (score: 0)

You can use this code, which I wrote for my own classification problem:

        # (Fragment: this runs inside the per-image loop, in the branch
        # taken when the image is valid.)
        resized_image = cv2.resize(image, (WIDTH, HEIGHT))
        label = np.uint8(nclass)

        arr = np.zeros(image_bytes, dtype=np.uint8)
        #  fill the label:
        arr[0] = label
        arr_cnt = 1

        #  fill the image (row-major order). first R values, then G values then B values
        #  OpenCV arrays are indexed [row, column] and store channels as B, G, R.
        for y in range(0, HEIGHT):
            for x in range(0, WIDTH):
                arr[arr_cnt] = np.uint8(resized_image[y, x, 2])  # R
                arr[arr_cnt + 1024] = np.uint8(resized_image[y, x, 1])  # G
                arr[arr_cnt + 2048] = np.uint8(resized_image[y, x, 0])  # B

                arr_cnt += 1

        print("train arr: %d %d" % (arr[0], arr[3072]))
        train_arr = np.append(train_arr, arr)
        #print train_arr[file_in_dir*3073]
    else:
        invalids_cnt += 1
        #print "image", files_in_dir[file_in_dir], "is invalid"

#  Write array to train.bin file.
#  train_arr must have dtype uint8 so that each value is written as one byte.
with open('data_batch_%d.bin' % nclass, 'wb') as f:
    f.write(train_arr)

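Here, resized_image is a resized version of the input image "image". Next, I create a 3073-byte array: the first byte is the label, the next 1024 bytes are the image's red values, the following 1024 bytes are its green values, and the last 1024 bytes are its blue values (1024 = 32 × 32 pixels).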
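
The record layout described above can be sketched in plain Python, without OpenCV or NumPy. This is only an illustration under stated assumptions: the image is 32 × 32, `pack_record` is a hypothetical helper, and `pixels` is a made-up nested list of (r, g, b) tuples rather than a real image.

```python
WIDTH, HEIGHT = 32, 32          # assumed CIFAR-10 image size
PLANE = WIDTH * HEIGHT          # 1024 bytes per colour plane
RECORD_BYTES = 1 + 3 * PLANE    # 3073 bytes: label + R + G + B


def pack_record(label, pixels):
    """Pack one image into a 3073-byte record.

    `pixels` is a HEIGHT x WIDTH nested list of (r, g, b) tuples,
    each value in the range 0..255.
    """
    record = bytearray(RECORD_BYTES)
    record[0] = label
    for y in range(HEIGHT):
        for x in range(WIDTH):
            r, g, b = pixels[y][x]
            offset = 1 + y * WIDTH + x   # row-major position inside a plane
            record[offset] = r
            record[offset + PLANE] = g
            record[offset + 2 * PLANE] = b
    return bytes(record)


# Synthetic image: red encodes the row, green the column, blue is constant.
pixels = [[(y, x, 255) for x in range(WIDTH)] for y in range(HEIGHT)]
record = pack_record(7, pixels)
```

Because each plane is written row by row, the pixel at row y and column x always sits at offset 1 + y * 32 + x within the red plane, which makes the layout easy to check by hand.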

I do this for every input image and then concatenate the results into one large binary array, which is written to the binary file "data_batch_%d.bin".
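
As a rough sketch of the concatenation step, the snippet below writes several fixed-size records back to back and then reads them again by slicing the stream every 3073 bytes. `write_records` and `read_records` are hypothetical names, and an in-memory `io.BytesIO` stands in for the real data_batch file.

```python
import io

RECORD_BYTES = 3073  # 1 label byte + 3 * 1024 colour bytes


def write_records(stream, records):
    """Append each fixed-size record to a binary stream."""
    for record in records:
        if len(record) != RECORD_BYTES:
            raise ValueError("record must be %d bytes" % RECORD_BYTES)
        stream.write(record)


def read_records(stream):
    """Yield (label, image_bytes) pairs from a stream of fixed-size records."""
    while True:
        chunk = stream.read(RECORD_BYTES)
        if not chunk:
            return
        if len(chunk) != RECORD_BYTES:
            raise ValueError("truncated record")
        yield chunk[0], chunk[1:]


# Three dummy records whose image bytes are all zero.
records = [bytes([label]) + bytes(RECORD_BYTES - 1) for label in (0, 1, 2)]

buffer = io.BytesIO()
write_records(buffer, records)
buffer.seek(0)
labels = [label for label, _ in read_records(buffer)]
```

Reading the file back this way is a cheap sanity check: the file size should be an exact multiple of 3073, and the recovered labels should match the ones that were written.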

I have posted my full script (which may be harder to follow) in this gist: gist