I am playing with TensorFlow for image classification. I used image_retraining/retrain.py to retrain the Inception model with new categories, and I classify images with label_image.py from https://github.com/llSourcell/tensorflow_image_classifier/blob/master/src/label_image.py, as follows:
import tensorflow as tf
import sys
# change this as you see fit
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("/root/tf_files/output_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    #predictions = sess.run(softmax_tensor,{'DecodeJpeg/contents:0': image_data})
    predictions = sess.run(softmax_tensor,{'DecodePng/contents:0': image_data})
    # Sort to show labels of first prediction in order of confidence
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))
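I noticed two problems. When I retrain with new categories, it only trains on JPG images. I am a noob in machine learning, so I am not sure whether this is a limitation or whether it is possible to train on images in other formats such as PNG or GIF.
The other is that when classifying images, the input is again JPG only. I tried changing DecodeJpeg to DecodePng in label_image.py above, but it did not work. Another approach I tried was to convert other formats to JPG before classifying them, like this: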
from PIL import Image
im = Image.open('/root/Desktop/200_s.gif').convert('RGB')
im.save('/root/Desktop/test.jpg', "JPEG")
image_path1 = '/root/Desktop/test.jpg'
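Is there another way to do this? Does TensorFlow have functions to handle image formats other than JPG?
As suggested by @mrry, I tried the following, feeding in the parsed image instead of the JPEG data:
import tensorflow as tf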
import sys
import numpy as np
from PIL import Image
# change this as you see fit
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
image = Image.open(image_path)
image_array = np.array(image)[:,:,0:3] # Select RGB channels only.
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("/root/tf_files/output_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor,{'DecodeJpeg:0': image_array})
    # Sort to show labels of first prediction in order of confidence
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))
This works for JPEG images, but when I use PNG or GIF it throws:
Traceback (most recent call last):
  File "label_image.py", line 17, in <module>
    image_array = np.array(image)[:,:,0:3] # Select RGB channels only.
IndexError: too many indices for array
Thanks and regards
Answer 0 (score: 4)
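The model can only train on (and evaluate) JPEG images because the GraphDef that you have saved in /root/tf_files/output_graph.pb contains only a tf.image.decode_jpeg() op, and uses the output of that op to make predictions. There are at least a couple of options for using other image formats:
1. Feed in parsed images rather than JPEG data. In the current program, you feed in a JPEG-encoded image as a string value for the tensor "DecodeJpeg/contents:0". Instead, you can feed in a 3-D array of decoded image data for the tensor "DecodeJpeg:0" (which represents the output of the tf.image.decode_jpeg() op), and you can use NumPy, PIL, or some other Python library to create this array.
2. Remap the image input in tf.import_graph_def(). The tf.import_graph_def() function lets you connect two different graphs together by remapping individual tensor values. For example, you could do the following to add a new image-processing op to the existing graph: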
image_string_input = tf.placeholder(tf.string)
image_decoded = tf.image.decode_png(image_string_input)
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    # Replace the output of the JPEG decoder with the PNG decoder's output.
    softmax_tensor, = tf.import_graph_def(
        graph_def,
        input_map={"DecodeJpeg:0": image_decoded},
        return_elements=["final_result:0"])
with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    predictions = sess.run(softmax_tensor, {image_string_input: image_data})
    # ...
Answer 1 (score: 0)
You should take a look at the tf.image package. It has good functions for decoding and encoding JPEG, GIF, and PNG.
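One way to act on this, as a rough sketch: inspect the leading bytes of the file to decide which tf.image decoder applies. The helper name pick_decoder is hypothetical and not part of TensorFlow; the strings it returns are the names of the tf.image functions decode_jpeg, decode_png, and decode_gif. The block itself uses only plain Python, so it does not need TensorFlow to run.

```python
# Hypothetical helper (not a TensorFlow API): map a file's magic bytes
# to the name of the tf.image decoder that handles that format.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
GIF_MAGICS = (b"GIF87a", b"GIF89a")


def pick_decoder(header):
    """Return 'decode_jpeg', 'decode_png' or 'decode_gif' for the leading bytes."""
    if header.startswith(JPEG_MAGIC):
        return "decode_jpeg"
    if header.startswith(PNG_MAGIC):
        return "decode_png"
    if header.startswith(GIF_MAGICS):
        return "decode_gif"
    raise ValueError("unrecognized image format")
```

Assuming image_data holds the raw file bytes, the decoder could then be looked up with getattr(tf.image, pick_decoder(image_data[:8])). Note that tf.image.decode_gif returns a 4-D tensor of frames, so a single frame would have to be selected before it is mapped onto a 3-D input such as "DecodeJpeg:0".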
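Answer 2 (score: 0)
Following @mrry's suggestion to feed in the parsed image, I converted the image to RGB and then turned the image data into an array. Now I can pass in JPG, PNG, and GIF.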
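The code for this answer is not reproduced here. A minimal sketch of the conversion it describes, assuming Pillow and NumPy are installed, might look like the following. The helper name load_rgb_array is hypothetical. Palette-based GIF and PNG files typically open in PIL as single-channel images, so np.array(image) is 2-D and the [:,:,0:3] slice raises the IndexError shown in the question; calling convert('RGB') first yields a height x width x 3 array regardless of the source format.

```python
import numpy as np
from PIL import Image


def load_rgb_array(image_file):
    """Open an image (path or file object) and return a uint8 array of shape (H, W, 3)."""
    with Image.open(image_file) as image:
        # convert("RGB") expands palette and grayscale images to three
        # channels and drops any alpha channel.
        return np.array(image.convert("RGB"))
```

The resulting array could then be fed to the tensor "DecodeJpeg:0" in place of image_array in the question's second script, for example sess.run(softmax_tensor, {'DecodeJpeg:0': load_rgb_array(image_path)}).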