Question

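I am trying to use a model with pre-trained weights from tensorflow.

I am a bit confused about how to load it to generate predictions. I want to use the faster_rcnn model to perform object detection on images.

For the model faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28, I have the following files: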

|   checkpoint
|   frozen_inference_graph.pb
|   model.ckpt.data-00000-of-00001
|   model.ckpt.index
|   model.ckpt.meta
|   pipeline.config
|
\---saved_model
    |   saved_model.pb
    |
    \---variables

Here is my attempt to load the model and generate some predictions:

import tensorflow as tf
import cv2

model_folder = "faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28"

model_graph_file = model_folder+"/frozen_inference_graph.pb"
model_weights_file = model_folder+"/model.ckpt.data-00000-of-00001"

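# Parse the frozen GraphDef from disk
graph_def = tf.GraphDef()
with tf.gfile.GFile(model_graph_file, 'rb') as f:
    graph_def.ParseFromString(f.read())

# Import it into a Graph so that tensors can be looked up by name below
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')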

#print([n.name + '=>' +  n.op for n in graph_def.node if n.op in ('Placeholder')])
#print([n.name + '=>' +  n.op for n in graph_def.node if n.op in ('Softmax')])

input = graph.get_tensor_by_name('image_tensor:0')
classes = graph.get_tensor_by_name('detection_classes:0')
scores = graph.get_tensor_by_name('detection_scores:0')
boxes = graph.get_tensor_by_name('detection_boxes:0')
softmax = graph.get_tensor_by_name('Softmax:0')

my_image = cv2.imread('resources/my_image.jpg')

with tf.Session(graph=graph) as sess:
    classes_out,scores_out,boxes_out,softmax  = sess.run([classes,scores,boxes,softmax],feed_dict={input:[my_image]})
    print(classes_out)
    print(classes_out.shape)
    print(scores_out)
    print(scores_out.shape)
    print(boxes_out)
    print(boxes_out.shape)
    print(softmax)
    print(softmax.shape)

This prints the following:

[[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
(1, 20)
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
(1, 20)
[[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]
(1, 20, 4)
[[9.9819970e-01 1.8002436e-03]
 [9.9932957e-01 6.7051285e-04]
 [9.9853170e-01 1.4682930e-03]
 ...
 [9.9990737e-01 9.2630769e-05]
 [9.9939859e-01 6.0135941e-04]
 [9.6443009e-01 3.5569914e-02]]
(115200, 2)

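Clearly I am doing something wrong here, but I can't figure out what exactly. How do I know which layers to use as output layers? How do I get the classes, scores and boxes of the objects? Am I loading the model correctly?

Edit:

Based on Lescurel's answer:

For some reason, I had to make a few changes to the code to get it to run: tf.saved_model.tag_constants.SERVING -> [tf.saved_model.tag_constants.SERVING]

and

input_tensor = model_signature["inputs"].name -> input_tensor = model_signature.inputs['inputs'].name (using tensorflow 1.12).

Now I get some results, which I am very happy about, but for the same image and the same model that Lescurel used, I get very different output: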

[array([[0.5936514 , 0.5774365 , 0.519677  , 0.46745843, 0.36366013,
        0.3496253 , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ]],
      dtype=float32), array([[33.,  1., 68., 11., 13.,  7.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
         1.,  1.,  1.,  1.,  1.,  1.,  1.]], dtype=float32), array([6.], dtype=float32), array([[[0.6699049 , 0.68924683, 0.9372702 , 0.78685343],
        [0.21414267, 0.264757  , 0.9868771 , 0.51174635],
        [0.34444967, 0.65146637, 0.70101655, 0.80124986],
        [0.8743748 , 0.7071637 , 0.9687472 , 0.7784833 ],
        [0.7832241 , 0.51456743, 0.9550611 , 0.59617543],
        [0.32543942, 0.6407225 , 0.9539846 , 0.81454873],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ]]], dtype=float32)]

Any idea why?

Answer 1

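You have loaded the graph structure of the network, but not the trained weights, so the network cannot make any meaningful predictions. To load the weights of a graph in tf 1.x, you can refer to the guide.

The following snippet loads the graph together with its weights and runs a prediction (this snippet uses faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco from the model zoo):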

import cv2
import tensorflow as tf  # TF 1.x

model_dir = "faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28/saved_model"

img = cv2.imread("/path/to/image.jpg")

with tf.Session() as sess:
    # Load the SavedModel graph together with its weights.
    # Models from the zoo are frozen, so we use the SERVING tag.
    # The tag is passed as a list and the input is read via .inputs[...],
    # which the question's edit reports was needed on TF 1.12.
    model = tf.saved_model.loader.load(sess,
                                       [tf.saved_model.tag_constants.SERVING],
                                       model_dir)
    # Get the model signature
    model_signature = model.signature_def["serving_default"]
    # Name of the input tensor
    input_tensor = model_signature.inputs["inputs"].name
    # Names of the output tensors
    output_tensor = [v.name for k, v in model_signature.outputs.items() if v.name]
    # Run the prediction on a batch containing the single image
    outs = sess.run(output_tensor, feed_dict={input_tensor: [img]})
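
The order of the arrays in outs follows the iteration order of the signature's output map, so it can be easier to fetch the results by key. Below is a minimal sketch of that idea. The helper names to_rgb_batch and run_by_key are hypothetical (they are not part of TensorFlow), and the key names such as detection_scores are assumed from the tensor names shown in the question. The sketch also flips OpenCV's BGR channel order to RGB, which, as far as I know, is what the model zoo detectors are trained on; treat that as an assumption to verify for your model.

```python
import numpy as np


def to_rgb_batch(bgr_image):
    """Reverse the channel axis (BGR -> RGB) and add a leading batch axis."""
    rgb = bgr_image[..., ::-1]
    return np.expand_dims(rgb, axis=0)


def run_by_key(sess, signature, feed):
    """Run every output of `signature` and return {output key: value}.

    `sess` only needs a TF 1.x style run(fetches, feed_dict=...) method;
    `signature` is a SignatureDef such as model.signature_def["serving_default"].
    """
    keys = sorted(signature.outputs.keys())
    names = [signature.outputs[k].name for k in keys]
    values = sess.run(names, feed_dict=feed)
    return dict(zip(keys, values))


# Usage inside the `with tf.Session() as sess:` block (not executed here):
#   input_name = model_signature.inputs["inputs"].name
#   results = run_by_key(sess, model_signature, {input_name: to_rgb_batch(img)})
#   results["detection_scores"]  # assumed key name
```

Keeping the keys attached to the values avoids having to guess which array in the list is the scores and which is the classes.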

Example output on an image:

>>> outs
[array([[0.9998708 , 0.99963164, 0.9926651 , 0.        , 0.        ,
         0.        , 0.        , 0.        , 0.        , 0.        ,
         0.        , 0.        , 0.        , 0.        , 0.        ,
         0.        , 0.        , 0.        , 0.        , 0.        ]],
       dtype=float32),
 array([[ 1.,  1., 18.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
          1.,  1.,  1.,  1.,  1.,  1.,  1.]], dtype=float32),
 array([3.], dtype=float32),
 array([[[0.35335696, 0.6397857 , 0.96252066, 0.8067749 ],
         [0.25126144, 0.2766906 , 0.97366196, 0.5463176 ],
         [0.7696026 , 0.52089834, 0.9537483 , 0.59052485],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ]]], dtype=float32)]
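
The arrays are padded to a fixed number of slots (20 here), and only the first num_detections entries carry real detections. The following is a small sketch for turning them into a list of detections. It assumes the boxes are normalized [ymin, xmin, ymax, xmax] values, as in the TensorFlow Object Detection API; the function name extract_detections and the 0.5 score threshold are illustrative choices, not something the model prescribes.

```python
import numpy as np


def extract_detections(scores, classes, num_detections, boxes,
                       image_height, image_width, min_score=0.5):
    """Return (class_id, score, (top, left, bottom, right)) tuples in pixels
    for the first image of the batch."""
    n = int(num_detections[0])
    # Scale normalized [ymin, xmin, ymax, xmax] to pixel coordinates.
    scale = np.array([image_height, image_width, image_height, image_width])
    detections = []
    for i in range(n):
        score = float(scores[0, i])
        if score < min_score:
            continue
        top, left, bottom, right = (boxes[0, i] * scale).round().astype(int)
        detections.append((int(classes[0, i]), score,
                           (int(top), int(left), int(bottom), int(right))))
    return detections
```

With the array order shown in the sample output, a call could look like scores, classes, num, boxes = outs followed by extract_detections(scores, classes, num, boxes, img.shape[0], img.shape[1]). That order is not guaranteed, so check it against the signature keys first.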
