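im2txt: Load input images from memory (instead of reading from disk)

Asked: 2017-04-27 15:33:38

Tags: python arrays numpy tensorflow

I am interested in modifying the tensorflow implementation of Show and Tell, specifically this v0.12 snapshot, so that it accepts an image as a numpy array instead of reading it from disk.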

With the upstream code, loading a filename yields a Python string after

with tf.gfile.GFile(filename, "r") as f:
    image = f.read()

in run_inference.py, which then becomes an ndarray with no shape. However, I cannot replicate that.

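I have tried the following:

Loading the numpy array directly

I wrote this function to load a Pillow image from a filename, convert it to a numpy array, and feed it to the beam_search function called in run_inference.py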

def load_image(filename):
    from keras.preprocessing.image import img_to_array
    arr = img_to_array(PILImage.open(filename))
    return arr
...
captions = generator.beam_search(sess, image)

In this case a shape mismatch occurs later on, producing the following stack trace:

Traceback (most recent call last):
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 107, in <module>
    tf.app.run()
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 97, in main
    captions = generator.beam_search(sess, image)
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/caption_generator.py", line 142, in beam_search
    initial_state = self.model.feed_image(sess, encoded_image)
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_wrapper.py", line 41, in feed_image
    feed_dict={"image_feed:0": encoded_image})
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 943, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (960, 640, 3) for Tensor u'image_feed:0', which has shape '()'

Process finished with exit code 1

Can I somehow trick numpy into thinking the array has no shape?
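The shape '()' in the error suggests that image_feed is a scalar input that expects the whole encoded file as a single byte string, not an array of pixels. A minimal sketch of the difference using plain numpy follows; the payload is a made-up stand-in rather than a real JPEG, and wrapping it with dtype=object is an assumption about how a string feed gets converted, not something taken from the im2txt code.

```python
import numpy as np

# Stand-in for the bytes returned by f.read(); not a real JPEG file.
encoded = b"\xff\xd8\xff\xe0 fake jpeg payload"

# A byte string wrapped as an object array is a 0-d (scalar) array.
as_scalar = np.asarray(encoded, dtype=object)

# A decoded image is a 3-d array of pixels, like the one in the traceback.
pixels = np.zeros((960, 640, 3), dtype=np.float32)

print(as_scalar.shape)  # ()
print(pixels.shape)     # (960, 640, 3)
```

So the array does not need to be "tricked"; the feed wants the encoded bytes themselves, and those already have no shape.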

Converting to tf.string

Here I used the following function

def encode_image(filename):
    g2 = tf.Graph()
    from keras.preprocessing.image import img_to_array
    with g2.as_default() as g:
        with g.name_scope("g2") as g2_scope:
            arr = img_to_array(PILImage.open(filename))
            image = tf.image.encode_jpeg(arr)
            return image
...
captions = generator.beam_search(sess, image)

This does not work either:

Traceback (most recent call last):
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 107, in <module>
    tf.app.run()
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 97, in main
    captions = generator.beam_search(sess, image)
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/caption_generator.py", line 142, in beam_search
    initial_state = self.model.feed_image(sess, encoded_image)
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_wrapper.py", line 41, in feed_image
    feed_dict={"image_feed:0": encoded_image})
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 924, in _run
    raise TypeError('The value of a feed cannot be a tf.Tensor object. '
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.

The last line of this stack trace looks helpful, but there is no documentation on what kind of structure is expected

TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.

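So, what should a valid input look like? The internals of the preprocessing are not particularly clear to me.

Thank you for your time!

Edit: Attached gist of the modified inference script for the big picture

Edit 2: The path to sess.run is as follows: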

1: run_inference.py

captions = generator.beam_search(sess, image)

2: caption_generator.py

def beam_search(self, sess, encoded_image):
    initial_state = self.model.feed_image(sess, encoded_image)

3: inference_wrapper.py

def feed_image(self, sess, encoded_image):
    initial_state = sess.run(fetches="lstm/initial_state:0",
                         feed_dict={"image_feed:0": encoded_image})
    return initial_state

Edit 3: I forgot to mention that I am restricted to TensorFlow v0.12, which is why I am using this snapshot of the im2txt repo

1 Answer:

Answer 0: (score: 0)

The original code:

with tf.gfile.GFile(filename, "r") as f:
    image = f.read()

gives you the image as a Python string.

Your code:

def encode_image(filename):
    g2 = tf.Graph()
    from keras.preprocessing.image import img_to_array
    with g2.as_default() as g:
        with g.name_scope("g2") as g2_scope:
            arr = img_to_array(PILImage.open(filename))
            image = tf.image.encode_jpeg(arr)
            return image

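returns a tensorflow.python.framework.ops.Tensor of shape ().

Presumably the function generator.beam_search(sess, image) expects a Python string, whereas you are passing it a Tensor of shape (). I guess the quickest way to fix your code is to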

return image.eval()

instead of

return image

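Note that eval() needs a session bound to the graph that owns the tensor (g2 here), for example image.eval(session=tf.Session(graph=g2)). That said, I still do not see why you would load a JPEG, turn it into an array, and then re-encode it as a JPEG.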

Edit:

If you really do want to take a numpy array and turn it into a Python string holding the JPEG binary, you can use this:

from PIL import Image
import numpy as np
import StringIO


def encode(npdata):
    img = Image.fromarray(npdata)
    output = StringIO.StringIO()
    img.save(output, "jpeg")
    image = output.getvalue()
    output.close()
    return image

npdata = np.random.randint(0,256, (480,640,3)).astype(np.uint8)
print len(encode(npdata))
with open("/tmp/random.jpg", "wb") as fp:  # binary mode, since JPEG data is bytes
    fp.write(encode(npdata)) # Just to prove it actually _is_ working

You should be able to pass the result straight into the following line:

captions = generator.beam_search(sess, encode(mynumpyarray))
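The snippet above targets Python 2 and a uint8 array. A rough variant that also runs on Python 3 and accepts float arrays might look like the sketch below. It assumes pixel values in the 0..255 range and that img_to_array returns float32 by default; array_to_jpeg_bytes is a hypothetical helper name, not part of im2txt or Pillow.

```python
import io

import numpy as np
from PIL import Image


def array_to_jpeg_bytes(arr):
    """Encode an (H, W, 3) array with values in 0..255 as JPEG bytes."""
    # Image.fromarray cannot handle float RGB data, so convert to uint8 first.
    arr8 = np.clip(np.asarray(arr), 0, 255).astype(np.uint8)
    buf = io.BytesIO()
    Image.fromarray(arr8).save(buf, format="JPEG")
    return buf.getvalue()


# Hypothetical input: a float32 array like the one img_to_array produces.
pixels = np.random.randint(0, 256, (480, 640, 3)).astype(np.float32)
jpeg_bytes = array_to_jpeg_bytes(pixels)

# Round trip: the bytes decode back to an image of the same size.
decoded = Image.open(io.BytesIO(jpeg_bytes))
print(decoded.size)  # (640, 480), PIL reports (width, height)
```

The returned bytes play the same role as the result of f.read() in the upstream script, so nothing is written to disk.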

Also, here is the original code as proof: https://github.com/tensorflow/models/blob/f653bd2340b15ce2a22669ba136b77b2751e462e/im2txt/im2txt/run_inference.py#L72

import tensorflow as tf
def puregfile(filename):
    with tf.gfile.GFile(filename, "r") as f:
        image = f.read()
    return image
print type(puregfile("/tmp/random.jpg"))

This prints "<type 'str'>": a plain Python string rather than a Tensor. However, I could not test this fully (or at all), since I did not want to download the model, MSCOCO, and so on.
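Without the model at hand, a cheap sanity check on the value about to be fed can still catch both failures from the question (a pixel array and a Tensor). The sketch below assumes the model decodes JPEG input; check_encoded_image is a hypothetical helper, not part of im2txt.

```python
JPEG_SOI = b"\xff\xd8"  # start-of-image marker that opens every JPEG file


def check_encoded_image(value):
    """Return value unchanged if it looks like encoded JPEG bytes."""
    if not isinstance(value, bytes):
        raise TypeError("expected encoded bytes, got %s" % type(value).__name__)
    if not value.startswith(JPEG_SOI):
        raise ValueError("value does not start with the JPEG marker")
    return value


# Made-up payload for illustration; only the first two bytes matter here.
ok = check_encoded_image(b"\xff\xd8\xff\xe0 rest of file")

try:
    check_encoded_image([[0, 0, 0]])  # a pixel-like nested list
except TypeError as err:
    message = str(err)
```

Calling it as generator.beam_search(sess, check_encoded_image(image)) would fail early with a readable message instead of deep inside sess.run.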