Question

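I have a TensorFlow graph from darkflow, and I run inference with it on an Android device (on the CPU of a Snapdragon 820). I found this Graph Transform Tool for optimizing models for deployment, so I optimized my graph and expected it to run faster than before. Instead, it is about 10% slower.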

What could cause this? What am I doing wrong?

Here are the details:

I use the tiny-yolo-voc model from darkflow without any modifications. I created the TensorFlow model like this:

$ ./flow --model cfg/tiny-yolo-voc.cfg --load bin/tiny-yolo-voc.weights --savepb --verbalise

I optimized the graph with the following command:

$ bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=../darkflow/darkflow/built_graph/tiny-yolo-voc.pb \
--out_graph=../darkflow/darkflow/built_graph/optimized-tiny-yolo-voc.pb \
--inputs='input' --outputs='output' \
--transforms='strip_unused_nodes(type=float, shape="1,299,299,3") fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms'

My code:

InferenceRunner.java:

public class InferenceRunner {

    private static final String INPUT_NODE = "input";
    private static final String OUTPUT_NODE = "output";
    protected final TensorFlowInferenceInterface mInferenceInterface;
    private final int mGridSize;
    private final int mNumOfLabels;
    private int mInputSize;

    public InferenceRunner(Context context, String modelFile, int inputSize, int gridSize, int numOfLabels) {
        this.mInputSize = inputSize;
        this.mGridSize = gridSize;
        this.mNumOfLabels = numOfLabels;
        mInferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), modelFile);
    }

    public synchronized void runInference(Bitmap image) {
        Trace.beginSection("imageTransform");
        Bitmap bitmap = Bitmap.createScaledBitmap(image, mInputSize, mInputSize, false);
        int[] intValues = new int[mInputSize * mInputSize];
        float[] floatValues = new float[mInputSize * mInputSize * 3];
        bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());

        for (int i = 0; i < intValues.length; ++i) {
            floatValues[i * 3 + 0] = ((intValues[i] >> 16) & 0xFF) / 255.0f;
            floatValues[i * 3 + 1] = ((intValues[i] >> 8) & 0xFF) / 255.0f;
            floatValues[i * 3 + 2] = (intValues[i] & 0xFF) / 255.0f;
        }
        Trace.endSection();

        Trace.beginSection("inferenceFeed");
        mInferenceInterface.feed(INPUT_NODE, floatValues, 1, mInputSize, mInputSize, 3);
        Trace.endSection();

        Trace.beginSection("inferenceRun");
        mInferenceInterface.run(new String[]{OUTPUT_NODE});
        Trace.endSection();

        final float[] resu =
                new float[mGridSize * mGridSize * (mNumOfLabels + 5) * 5];
        Trace.beginSection("inferenceFetch");
        mInferenceInterface.fetch(OUTPUT_NODE, resu);
        Trace.endSection();
    }
}
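
For reference, the pixel preprocessing inside runInference can be isolated and checked off-device. The following is only a minimal sketch of that loop in plain Java, with no Android classes; the class name PixelNormalizer and the method name toNormalizedRgb are hypothetical and not part of my app.

```java
import java.util.Arrays;

public class PixelNormalizer {

    /**
     * Converts packed ARGB pixels into interleaved RGB floats in [0, 1],
     * mirroring the loop in runInference. The alpha channel is ignored.
     * (Hypothetical helper, used here for illustration only.)
     */
    public static float[] toNormalizedRgb(int[] argbPixels) {
        float[] out = new float[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; i++) {
            int p = argbPixels[i];
            out[i * 3]     = ((p >> 16) & 0xFF) / 255.0f; // red
            out[i * 3 + 1] = ((p >> 8) & 0xFF) / 255.0f;  // green
            out[i * 3 + 2] = (p & 0xFF) / 255.0f;         // blue
        }
        return out;
    }

    public static void main(String[] args) {
        // One opaque pure-red pixel followed by one opaque white pixel.
        int[] pixels = {0xFFFF0000, 0xFFFFFFFF};
        System.out.println(Arrays.toString(toNormalizedRgb(pixels)));
        // prints [1.0, 0.0, 0.0, 1.0, 1.0, 1.0]
    }
}
```

Both models receive exactly the same input from this step, so it should not account for any difference between them.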

MainActivity: onCreate():

...
tinyYolo = new InferenceRunner(getApplicationContext(), TINY_YOLO_MODEL_FILE, TINY_YOLO_INPUT_SIZE, 13, 20);
optimizedTinyYolo = new InferenceRunner(getApplicationContext(), OPTIMIZED_TINY_YOLO_MODEL_FILE, TINY_YOLO_INPUT_SIZE, 13, 20);
...

MainActivity: onResume():

...
mHandler.post(new Runnable() {
        @Override
        public void run() {
            Trace.beginSection("TinyYoloModel");
            for (int i = 0; i < 5; i++) {
                tinyYolo.runInference(b);
            }
            Trace.endSection();

            Log.d(TAG, "run: optimized");
            Trace.beginSection("OptimizedModel");
            for (int i = 0; i < 5; i++) {
                optimizedTinyYolo.runInference(b);
            }
            Trace.endSection();
        }
    });
...
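
The measurement pattern above (five consecutive runs per model) can also be reproduced without Systrace. This is only a rough sketch, assuming System.nanoTime is an acceptable clock for this purpose; RunTimer, timeRuns, and mean are hypothetical names, and the placeholder workload stands in for runInference(b).

```java
public class RunTimer {

    /** Runs the task `runs` times and returns each run's wall time in milliseconds. */
    public static double[] timeRuns(Runnable task, int runs) {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return millis;
    }

    /** Arithmetic mean of the given values, or 0 for an empty array. */
    public static double mean(double[] values) {
        if (values.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Placeholder workload; in the app this would be tinyYolo.runInference(b).
        Runnable placeholder = () -> {
            double acc = 0.0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += Math.sqrt(i);
            }
            if (acc < 0) {
                System.out.println(acc); // keeps the loop from being optimized away
            }
        };
        double[] times = timeRuns(placeholder, 5);
        System.out.println("average ms per run: " + mean(times));
    }
}
```

Keeping the per-run times rather than only the total makes it possible to see whether a single slow run skews the average.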

My Systrace output:

TinyYoloModel wall duration: 5,525 ms
OptimizedModel wall duration: 6,043 ms
TinyYoloModel inferenceRun avg: 1051 ms
OptimizedModel inferenceRun avg: 1158 ms
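
As a quick check of the "about 10%" figure, the relative slowdown can be computed directly from the numbers above. This is just a small arithmetic sketch; SlowdownCheck and slowdownPercent are hypothetical names.

```java
import java.util.Locale;

public class SlowdownCheck {

    /** Relative slowdown of the optimized timing versus the baseline, in percent. */
    public static double slowdownPercent(double baselineMs, double optimizedMs) {
        return (optimizedMs - baselineMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        // Figures copied from the Systrace output above.
        System.out.println(String.format(Locale.ROOT,
                "total wall duration: %.1f%%", slowdownPercent(5525, 6043)));
        System.out.println(String.format(Locale.ROOT,
                "inferenceRun average: %.1f%%", slowdownPercent(1051, 1158)));
        // prints:
        // total wall duration: 9.4%
        // inferenceRun average: 10.2%
    }
}
```

So the slowdown is roughly 9.4% over the whole section and 10.2% for the inferenceRun step alone.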

Do you have any idea why the optimized model is slower?

If you need more information, feel free to leave a comment. Thanks for your help.

0 answers: