I am interested in modifying the TensorFlow implementation of Show and Tell, specifically this v0.12 snapshot, so that it accepts an image as a numpy array instead of reading it from disk.
Loading the file with the upstream code in run_inference.py

with tf.gfile.GFile(filename, "r") as f:
    image = f.read()

produces a Python string, which then becomes an ndarray with no shape. However, I have not been able to replicate this.
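(For reference, here is a quick check of what the upstream read yields; the file path below is just a placeholder.)

import numpy as np
import tensorflow as tf

# Read the raw JPEG bytes exactly as run_inference.py does.
with tf.gfile.GFile("/tmp/example.jpg", "r") as f:
    image = f.read()

print type(image)                # <type 'str'> on Python 2.7
print np.asarray(image).shape    # () -- a 0-d array holding the whole byte string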
Here is what I have tried.
I wrote this function to load a Pillow image from the filename, convert it to a numpy array, and feed the array to the beam_search call in run_inference.py:
def load_image(filename):
    # Load the image with Pillow and convert it to an HxWx3 numpy array.
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    arr = img_to_array(PILImage.open(filename))
    return arr
...
captions = generator.beam_search(sess, image)
In this case a dimension mismatch occurs later on, producing the following stack trace:
Traceback (most recent call last):
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 107, in <module>
tf.app.run()
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 97, in main
captions = generator.beam_search(sess, image)
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/caption_generator.py", line 142, in beam_search
initial_state = self.model.feed_image(sess, encoded_image)
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_wrapper.py", line 41, in feed_image
feed_dict={"image_feed:0": encoded_image})
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 943, in _run
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (960, 640, 3) for Tensor u'image_feed:0', which has shape '()'
Process finished with exit code 1
Can I somehow trick numpy into thinking the array has no shape?
For my second attempt, I used the following function instead:
def encode_image(filename):
    # Re-encode the pixel array as a JPEG tensor in a throwaway graph.
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    g2 = tf.Graph()
    with g2.as_default() as g:
        with g.name_scope("g2") as g2_scope:
            arr = img_to_array(PILImage.open(filename))
            image = tf.image.encode_jpeg(arr)
    return image
...
captions = generator.beam_search(sess, image)
This does not work either:
Traceback (most recent call last):
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 107, in <module>
tf.app.run()
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 97, in main
captions = generator.beam_search(sess, image)
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/caption_generator.py", line 142, in beam_search
initial_state = self.model.feed_image(sess, encoded_image)
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_wrapper.py", line 41, in feed_image
feed_dict={"image_feed:0": encoded_image})
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 924, in _run
raise TypeError('The value of a feed cannot be a tf.Tensor object. '
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
The last line of this stack trace seems helpful, but there is no documentation on what the expected types are:
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
So what should a valid input look like? The internals of the preprocessing are not particularly clear to me.
Thanks for your time!
Edit: attached a gist of the modified inference script for the big picture.
Edit 2: the call path down to sess.run is as follows:
1. run_inference.py:
   captions = generator.beam_search(sess, image)
2. caption_generator.py:
   def beam_search(self, sess, encoded_image):
       initial_state = self.model.feed_image(sess, encoded_image)
3. inference_wrapper.py:
   def feed_image(self, sess, encoded_image):
       initial_state = sess.run(fetches="lstm/initial_state:0",
                                feed_dict={"image_feed:0": encoded_image})
       return initial_state
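For what it is worth, my understanding (an assumption based on reading the im2txt code, not something I have verified) is that the inference graph defines image_feed as a scalar string placeholder and does the JPEG decoding in-graph, roughly like this:

# Rough sketch of how I believe the inference graph wires the feed; the tensor
# name comes from the error messages above, the exact upstream code may differ.
image_feed = tf.placeholder(dtype=tf.string, shape=[], name="image_feed")
# The JPEG bytes are decoded inside the graph, so the fed value stays a
# 0-d string rather than an HxWx3 pixel array.
decoded = tf.image.decode_jpeg(image_feed, channels=3)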
Edit 3: I forgot to mention that I am limited to TensorFlow v0.12, which is why I am using this snapshot of the im2txt repo.
Answer 0 (score: 0)
The original code:

with tf.gfile.GFile(filename, "r") as f:
    image = f.read()

reads the image as a Python string.
Your code:
def encode_image(filename):
    # Re-encode the pixel array as a JPEG tensor in a throwaway graph.
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    g2 = tf.Graph()
    with g2.as_default() as g:
        with g.name_scope("g2") as g2_scope:
            arr = img_to_array(PILImage.open(filename))
            image = tf.image.encode_jpeg(arr)
    return image
returns a tensorflow.python.framework.ops.Tensor of size ().
I assume that generator.beam_search(sess, image) expects a Python string, while you are passing it a Tensor of size (). I guess the quickest fix is to change your code to
return image.eval()
instead of
return image
However, I still do not understand why you would load a JPEG, turn it into an array, and then re-encode it as a JPEG again.
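Note that image.eval() needs a session whose graph contains the encode op; since encode_jpeg lives in the temporary graph g2, the full fix would look roughly like this (a sketch, not tested against your setup):

def encode_image(filename):
    # Build the JPEG re-encoding op in a throwaway graph and evaluate it
    # immediately, so the caller gets plain JPEG bytes (a Python string).
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    g2 = tf.Graph()
    with g2.as_default():
        # encode_jpeg expects uint8 pixels; img_to_array returns float32.
        arr = img_to_array(PILImage.open(filename)).astype("uint8")
        jpeg_op = tf.image.encode_jpeg(arr)
        with tf.Session() as sess2:
            return sess2.run(jpeg_op)  # same as jpeg_op.eval(session=sess2)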
Edit:
If what you are really trying to do is take a numpy array and turn it into a Python string containing the JPEG binary, you can use this:
from PIL import Image
import numpy as np
import StringIO

def encode(npdata):
    # Encode an HxWx3 uint8 array as JPEG bytes in memory.
    img = Image.fromarray(npdata)
    output = StringIO.StringIO()
    img.save(output, "jpeg")
    image = output.getvalue()
    output.close()
    return image

npdata = np.random.randint(0, 256, (480, 640, 3)).astype(np.uint8)
print len(encode(npdata))
with open("/tmp/random.jpg", "w") as fp:
    fp.write(encode(npdata))  # Just to prove it actually _is_ working
You should be able to pass the result straight in, as in the line below:
captions = generator.beam_search(sess, encode(mynumpyarray))
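To tie it together, the per-image part of your modified run_inference.py could then skip the GFile read entirely, along these lines (my_numpy_images is a hypothetical source of HxWx3 uint8 arrays, so treat the loop as a sketch):

# Hypothetical replacement for the per-file loop in main(): instead of reading
# JPEG bytes from disk, serialize the in-memory array with encode() above.
for np_image in my_numpy_images:
    encoded_image = encode(np_image)  # JPEG bytes as a Python string
    captions = generator.beam_search(sess, encoded_image)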
Also, here is proof using the original code: https://github.com/tensorflow/models/blob/f653bd2340b15ce2a22669ba136b77b2751e462e/im2txt/im2txt/run_inference.py#L72
import tensorflow as tf

def puregfile(filename):
    with tf.gfile.GFile(filename, "r") as f:
        image = f.read()
    return image

print type(puregfile("/tmp/random.jpg"))
This prints "<type 'str'>", i.e. a Python string, not a tf.String. However, since I do not want to download the model, MSCOCO, and so on, I cannot (or will not) test this fully.