I am interested in modifying the TensorFlow implementation of Show and Tell, specifically this v0.12 snapshot, so that it accepts an image as a numpy array instead of reading it from disk.
Loading the file with the upstream code in run_inference.py

with tf.gfile.GFile(filename, "r") as f:
    image = f.read()

produces a Python string, which then becomes an ndarray with no shape. However, I have not been able to replicate this.
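(For reference, here is a quick check of what the upstream read yields; the file path below is just a placeholder.)

import numpy as np
import tensorflow as tf

# Read the raw JPEG bytes exactly as run_inference.py does.
with tf.gfile.GFile("/tmp/example.jpg", "r") as f:
    image = f.read()

print type(image)                # <type 'str'> on Python 2.7
print np.asarray(image).shape    # () -- a 0-d array holding the whole byte string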
Here is what I have tried.
I wrote this function to load a Pillow image from the filename, convert it to a numpy array, and feed the array to the beam_search call in run_inference.py:
def load_image(filename):
    # Load the image with Pillow and convert it to an HxWx3 numpy array.
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    arr = img_to_array(PILImage.open(filename))
    return arr
...
captions = generator.beam_search(sess, image)
In this case a dimension mismatch occurs later on, producing the following stack trace:
Traceback (most recent call last):
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 107, in <module>
tf.app.run()
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 97, in main
captions = generator.beam_search(sess, image)
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/caption_generator.py", line 142, in beam_search
initial_state = self.model.feed_image(sess, encoded_image)
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_wrapper.py", line 41, in feed_image
feed_dict={"image_feed:0": encoded_image})
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 943, in _run
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (960, 640, 3) for Tensor u'image_feed:0', which has shape '()'
Process finished with exit code 1
Can I somehow trick numpy into thinking the array has no shape?
For my second attempt, I used the following function instead:
def encode_image(filename):
    # Re-encode the pixel array as a JPEG tensor in a throwaway graph.
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    g2 = tf.Graph()
    with g2.as_default() as g:
        with g.name_scope("g2") as g2_scope:
            arr = img_to_array(PILImage.open(filename))
            image = tf.image.encode_jpeg(arr)
    return image
...
captions = generator.beam_search(sess, image)
This does not work either:
Traceback (most recent call last):
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 107, in <module>
tf.app.run()
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 97, in main
captions = generator.beam_search(sess, image)
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/caption_generator.py", line 142, in beam_search
initial_state = self.model.feed_image(sess, encoded_image)
File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_wrapper.py", line 41, in feed_image
feed_dict={"image_feed:0": encoded_image})
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 924, in _run
raise TypeError('The value of a feed cannot be a tf.Tensor object. '
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
The last line of this stack trace seems helpful, but there is no documentation on what the expected types are:
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
So what should a valid input look like? The internals of the preprocessing are not particularly clear to me.
Thanks for your time!
Edit: attached a gist of the modified inference script for the big picture.
Edit 2: the call path down to sess.run is as follows:
1. run_inference.py:
   captions = generator.beam_search(sess, image)
2. caption_generator.py:
   def beam_search(self, sess, encoded_image):
       initial_state = self.model.feed_image(sess, encoded_image)
3. inference_wrapper.py:
   def feed_image(self, sess, encoded_image):
       initial_state = sess.run(fetches="lstm/initial_state:0",
                                feed_dict={"image_feed:0": encoded_image})
       return initial_state
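For what it is worth, my understanding (an assumption based on reading the im2txt code, not something I have verified) is that the inference graph defines image_feed as a scalar string placeholder and does the JPEG decoding in-graph, roughly like this:

# Rough sketch of how I believe the inference graph wires the feed; the tensor
# name comes from the error messages above, the exact upstream code may differ.
image_feed = tf.placeholder(dtype=tf.string, shape=[], name="image_feed")
# The JPEG bytes are decoded inside the graph, so the fed value stays a
# 0-d string rather than an HxWx3 pixel array.
decoded = tf.image.decode_jpeg(image_feed, channels=3)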
Edit 3: I forgot to mention that I am limited to TensorFlow v0.12, which is why I am using this snapshot of the im2txt repo.
Answer 0 (score: 0)
The original code:

with tf.gfile.GFile(filename, "r") as f:
    image = f.read()

reads the image as a Python string.
Your code:
def encode_image(filename):
    # Re-encode the pixel array as a JPEG tensor in a throwaway graph.
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    g2 = tf.Graph()
    with g2.as_default() as g:
        with g.name_scope("g2") as g2_scope:
            arr = img_to_array(PILImage.open(filename))
            image = tf.image.encode_jpeg(arr)
    return image
returns a tensorflow.python.framework.ops.Tensor of size ().
I assume that generator.beam_search(sess, image) expects a Python string, while you are passing it a Tensor of size (). I guess the quickest fix is to change your code to
return image.eval()
instead of
return image
However, I still do not understand why you would load a JPEG, turn it into an array, and then re-encode it as a JPEG again.
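Note that image.eval() needs a session whose graph contains the encode op; since encode_jpeg lives in the temporary graph g2, the full fix would look roughly like this (a sketch, not tested against your setup):

def encode_image(filename):
    # Build the JPEG re-encoding op in a throwaway graph and evaluate it
    # immediately, so the caller gets plain JPEG bytes (a Python string).
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    g2 = tf.Graph()
    with g2.as_default():
        # encode_jpeg expects uint8 pixels; img_to_array returns float32.
        arr = img_to_array(PILImage.open(filename)).astype("uint8")
        jpeg_op = tf.image.encode_jpeg(arr)
        with tf.Session() as sess2:
            return sess2.run(jpeg_op)  # same as jpeg_op.eval(session=sess2)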
Edit:
If what you are really trying to do is take a numpy array and turn it into a Python string containing the JPEG binary, you can use this:
from PIL import Image
import numpy as np
import StringIO

def encode(npdata):
    # Encode an HxWx3 uint8 array as JPEG bytes in memory.
    img = Image.fromarray(npdata)
    output = StringIO.StringIO()
    img.save(output, "jpeg")
    image = output.getvalue()
    output.close()
    return image

npdata = np.random.randint(0, 256, (480, 640, 3)).astype(np.uint8)
print len(encode(npdata))
with open("/tmp/random.jpg", "w") as fp:
    fp.write(encode(npdata))  # Just to prove it actually _is_ working
You should be able to pass the result straight in, as in the line below:
captions = generator.beam_search(sess, encode(mynumpyarray))
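To tie it together, the per-image part of your modified run_inference.py could then skip the GFile read entirely, along these lines (my_numpy_images is a hypothetical source of HxWx3 uint8 arrays, so treat the loop as a sketch):

# Hypothetical replacement for the per-file loop in main(): instead of reading
# JPEG bytes from disk, serialize the in-memory array with encode() above.
for np_image in my_numpy_images:
    encoded_image = encode(np_image)  # JPEG bytes as a Python string
    captions = generator.beam_search(sess, encoded_image)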
Also, here is proof using the original code: https://github.com/tensorflow/models/blob/f653bd2340b15ce2a22669ba136b77b2751e462e/im2txt/im2txt/run_inference.py#L72
import tensorflow as tf

def puregfile(filename):
    with tf.gfile.GFile(filename, "r") as f:
        image = f.read()
    return image

print type(puregfile("/tmp/random.jpg"))
This prints "<type 'str'>", i.e. a Python string, not a tf.String. However, since I do not want to download the model, MSCOCO, and so on, I cannot (or will not) test this fully.