Question

I am using TensorFlow's ImageNet-trained model to extract the features of the last pooling layer as representation vectors for a new dataset of images.

As shipped, the model predicts on a single new image like this:

python classify_image.py --image_file new_image.jpeg

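I edited the main function so that it takes a folder of images, returns the predictions for all of them in one run, and writes the feature vectors to a CSV file. Here is how I did it: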

def main(_):
  maybe_download_and_extract()
  #image = (FLAGS.image_file if FLAGS.image_file else
  #         os.path.join(FLAGS.model_dir, 'cropped_panda.jpg'))
  #edit to take a directory of image files instead of a one file
  if FLAGS.data_folder:
    images_folder=FLAGS.data_folder
    list_of_images = os.listdir(images_folder)
  else: 
    raise ValueError("Please specify image folder")

  with open("feature_data.csv", "wb") as f:
    feature_writer = csv.writer(f, delimiter='|')

    for image in list_of_images:
      print(image) 
      current_features = run_inference_on_image(images_folder+"/"+image)
      feature_writer.writerow([image]+current_features)

It works for about 21 images, then crashes with the following error:

  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1912, in as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.

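I assumed that calling run_inference_on_image(images_folder+"/"+image) would overwrite the previous image's data and consider only the new image, but that does not seem to be the case. How can I resolve this?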

Answer 1

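The problem here is that each call to run_inference_on_image() adds nodes to the same graph, which eventually exceeds the maximum size. There are at least two ways to fix this: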

The easy but slow way is to use a different default graph for each call to run_inference_on_image():

for image in list_of_images:
  # ...
  with tf.Graph().as_default():
    current_features = run_inference_on_image(images_folder+"/"+image)
  # ...
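To illustrate why a fresh default graph per call avoids the growth, here is a minimal sketch. It assumes TensorFlow 2.x with the tf.compat.v1 API (on the TF 1.x install from the question, the equivalent call is tf.placeholder), and build_toy_model() is a hypothetical stand-in for the graph construction that happens inside run_inference_on_image(), not the real Inception model.

```python
import tensorflow as tf


def build_toy_model():
    """Hypothetical stand-in for the graph building done on every inference call."""
    x = tf.compat.v1.placeholder(tf.float32, shape=[None, 4], name="x")
    return tf.reduce_mean(x, axis=1, name="features")


# One shared graph: every call appends another copy of the model's nodes.
shared_graph = tf.Graph()
shared_counts = []
with shared_graph.as_default():
    for _ in range(3):
        build_toy_model()
        shared_counts.append(len(shared_graph.get_operations()))

# A fresh graph per call: each graph holds exactly one copy of the model.
fresh_counts = []
for _ in range(3):
    with tf.Graph().as_default() as graph:
        build_toy_model()
        fresh_counts.append(len(graph.get_operations()))

print("shared graph sizes:", shared_counts)  # grows with every call
print("fresh graph sizes: ", fresh_counts)   # stays constant
```

The node count stays flat with fresh graphs, but the model is still rebuilt on every iteration, which is why this approach is slow.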

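The more involved but more efficient way is to modify run_inference_on_image() so that it runs on multiple images. Move the for loop so that it surrounds the sess.run() call; you will then no longer need to rebuild the entire model on each call, which should make processing each image much faster.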
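The build-once, run-many pattern described above might look roughly like the following sketch. It assumes TensorFlow 2.x with the tf.compat.v1 API; create_toy_graph(), the tensor names "input:0" and "features:0", and extract_features() are hypothetical placeholders for the script's own create_graph() and its Inception tensors.

```python
import numpy as np
import tensorflow as tf


def create_toy_graph():
    """Hypothetical stand-in for create_graph(): adds the model's nodes once."""
    x = tf.compat.v1.placeholder(tf.float32, shape=[4], name="input")
    tf.multiply(x, 2.0, name="features")


def extract_features(inputs):
    graph = tf.Graph()
    with graph.as_default():
        create_toy_graph()  # the model is built exactly once

    with tf.compat.v1.Session(graph=graph) as sess:
        input_tensor = graph.get_tensor_by_name("input:0")
        features_tensor = graph.get_tensor_by_name("features:0")
        ops_before = len(graph.get_operations())
        # The loop wraps only sess.run(), so no new nodes are created per item.
        results = [sess.run(features_tensor, {input_tensor: item}) for item in inputs]
        ops_after = len(graph.get_operations())
    return results, ops_before, ops_after


inputs = [np.full(4, i, dtype=np.float32) for i in range(3)]
results, ops_before, ops_after = extract_features(inputs)
print(len(results), ops_before == ops_after)
```

In the real script, the loop body would read each image file and feed its bytes to the model's input tensor instead of the toy arrays used here.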

Answer 2

You can move create_graph() to somewhere before the loop for image in list_of_images: (the loop over the files).

That way, inference is performed multiple times on the same graph.

Answer 3

The simplest way is to call create_graph() at the very start of the main function. Then the graph is created only once.

Answer 4

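Why this kind of error occurs is explained well here. I ran into the same error while using the tf dataset API and learned that, when iterating inside a session, the data gets appended to the existing graph. So what I did was call tf.reset_default_graph() before the dataset iterator to make sure the previous graph was cleared.

Hope this helps in this situation.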
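As a rough illustration of that reset, the sketch below watches the default graph grow and then clears it. It assumes TensorFlow 2.x running in graph mode through tf.compat.v1 (on TF 1.x the call is simply tf.reset_default_graph()); add_nodes() and default_graph_size() are hypothetical helpers, and a constant stands in for the dataset that would otherwise be embedded in the graph.

```python
import tensorflow as tf

# Assumption: TF 2.x, so graph mode must be enabled explicitly.
tf.compat.v1.disable_eager_execution()


def add_nodes():
    """Hypothetical stand-in for code that embeds data in the default graph."""
    tf.constant([1.0, 2.0], name="data")


def default_graph_size():
    return len(tf.compat.v1.get_default_graph().get_operations())


add_nodes()
add_nodes()
size_before = default_graph_size()  # nodes from both calls have accumulated

# Must be called outside any session or `with graph.as_default()` block.
tf.compat.v1.reset_default_graph()
size_after_reset = default_graph_size()

add_nodes()
size_after_rebuild = default_graph_size()

print(size_before, size_after_reset, size_after_rebuild)
```

Tensors and sessions created before the reset belong to the old graph and should not be reused afterwards.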

Overcoming "GraphDef cannot be larger than 2GB" in TensorFlow

4 answers: