TensorFlow Lite and Android Things: locating detected objects and storing them in RectF objects?

Asked: 2018-11-22 04:21:44

Tags: android tensorflow android-things tensorflow-lite

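I have an Android tablet on which I installed the TensorFlow Lite DetectorActivity from the available examples. It runs well on the tablet. However, when I tried to deploy it on a Raspberry Pi 3 Model B running Android Things, it did not run. There seems to be a problem configuring the camera correctly, both for the live camera preview and for running the analysis.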

My initial goal is to get an object detection app running on Android Things. Drawing bounding rectangles over the detected objects is also important.

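I looked for an example Android app that uses TensorFlow Lite and runs on Android Things. I soon found this example from hackster.io that uses Image Classification to dispense candy. I ran it on the Raspberry Pi board and it worked. It reports the object name, the confidence, and an ID. I can build on this sample code. Instead of using a live camera feed, I can have the app take a picture, analyze it, and report the results. It then takes another picture and the loop continues.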


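What I tried to do is adapt the recognizeImage function from the TFLite Android sample, which lives in the TFLiteObjectDetectionAPIModel class. I adapted it to work as the doIdentification function of the Candy Dispenser Android app. My function now looks like this: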

// outputLocations: array of shape [Batchsize, NUM_DETECTIONS,4]
// contains the location of detected boxes
private float[][][] outputLocations;
// outputClasses: array of shape [Batchsize, NUM_DETECTIONS]
// contains the classes of detected boxes
private float[][] outputClasses;
// outputScores: array of shape [Batchsize, NUM_DETECTIONS]
// contains the scores of detected boxes
private float[][] outputScores;
// numDetections: array of shape [Batchsize]
// contains the number of detected boxes
private float[] numDetections;

private static final int NUM_DETECTIONS = 10;

private static final float IMAGE_MEAN = 128.0f;
private static final float IMAGE_STD = 128.0f;

private void doIdentification(Bitmap image) {
    Log.e(TAG, "doing identification!");

    int numBytesPerChannel;
    if (TF_OD_API_IS_QUANTIZED) {
        Log.e(TAG, "model is quantized");
        numBytesPerChannel = 1; // Quantized
    } else {
        Log.e(TAG, "model is NOT quantized");
        numBytesPerChannel = 4; // Floating point
    }

    ByteBuffer imgData = ByteBuffer.allocateDirect(1 * TF_INPUT_IMAGE_HEIGHT * TF_INPUT_IMAGE_HEIGHT
            * 3 * numBytesPerChannel);

    // Preprocess the image data from 0-255 int to normalized float based
    // on the provided parameters.
    int[] intValues = new int[TF_INPUT_IMAGE_HEIGHT * TF_INPUT_IMAGE_HEIGHT];

    image.getPixels(intValues, 0, image.getWidth(), 0, 0, image.getWidth(), image.getHeight());

    for (int i = 0; i < TF_INPUT_IMAGE_HEIGHT; ++i) {
        for (int j = 0; j < TF_INPUT_IMAGE_HEIGHT; ++j) {
            int pixelValue = intValues[i * TF_INPUT_IMAGE_HEIGHT + j];

            if (TF_OD_API_IS_QUANTIZED) {
                imgData.put((byte) ((pixelValue >> 16) & 0xFF));
                imgData.put((byte) ((pixelValue >> 8) & 0xFF));
                imgData.put((byte) (pixelValue & 0xFF));
            } else {
                imgData.putFloat((((pixelValue >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
                imgData.putFloat((((pixelValue >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
                imgData.putFloat(((pixelValue & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
            }
        }
    }

    Trace.endSection(); // preprocessBitmap

    // Allocate space for the inference results
    byte[][] confidencePerLabel = new byte[1][mLabels.size()];

    //for box detections
    // Copy the input data into TensorFlow.
    outputLocations = new float[1][NUM_DETECTIONS][4];
    outputClasses = new float[1][NUM_DETECTIONS];
    outputScores = new float[1][NUM_DETECTIONS];
    numDetections = new float[1];

    Object[] inputArray = {imgData};
    Map<Integer, Object> outputMap = new HashMap<>();
    outputMap.put(0, outputLocations);
    outputMap.put(1, outputClasses);
    outputMap.put(2, outputScores);
    outputMap.put(3, numDetections);

    // Read image data into buffer formatted for the TensorFlow model
    TensorFlowHelper.convertBitmapToByteBuffer(image, intValues, imgData);

    // Run inference on the network with the image bytes in imgData as input,
    // storing results on the confidencePerLabel array.

    mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);

    // TODO - we try and fetch our rectF's here

    final ArrayList<Recognition> recognitions = new ArrayList<>(NUM_DETECTIONS);
    for (int i = 0; i < NUM_DETECTIONS; ++i) {
        final RectF detection =
                new RectF(
                        outputLocations[0][i][1] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][0] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][3] * TF_OD_API_INPUT_SIZE,
                        outputLocations[0][i][2] * TF_OD_API_INPUT_SIZE);
        // SSD Mobilenet V1 Model assumes class 0 is background class
        // in label file and class labels start from 1 to number_of_classes+1,
        // while outputClasses correspond to class index from 0 to number_of_classes
        int labelOffset = 1;

        Log.e(TAG, "adding the following to our results: ");
        Log.e(TAG, "recognition id: " + i);
        Log.e(TAG, "recognition label: " + mLabels.get((int) outputClasses[0][i] + labelOffset));
        Log.e(TAG, "recognition confidence: " + outputScores[0][i]);

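        // Reconstructed call: assumes the four-argument Recognition constructor
        // (id, title, confidence, location) used by the TFLite detection sample.
        recognitions.add(
                new Recognition(
                        "" + i,
                        mLabels.get((int) outputClasses[0][i] + labelOffset),
                        outputScores[0][i],
                        detection));
    }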

    Trace.endSection(); // "recognizeImage"

    // TODO -- This is the old working code
    // Get the results with the highest confidence and map them to their labels
    Collection<Recognition> results = TensorFlowHelper.getBestResults(confidencePerLabel, mLabels);
    Log.e(TAG, "results count is = " + results.size());

    // Report the results with the highest confidence
}



java.lang.IllegalArgumentException: Cannot convert between a TensorFlowLite tensor with type UINT8 and a Java object of type [[[F (which is compatible with the TensorFlowLite type FLOAT32).


mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);


Has anyone implemented object detection on Android Things (drawing a RectF over the detected objects)?

1 answer:

Answer 0 (score: 0)

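The problem here is probably that you are mixing models. Image classification models and object detection models are different. Classification only reports the confidence that some type of object appears in the image, while detection also identifies where the object is. The candy dispenser sample you started from uses an image classification model (mobilenet_quant_v1_224.tflite), while the TFLite sample you mention runs an object detection model (mobilenet_ssd.tflite).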
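To make the mismatch concrete, here is a minimal sketch in plain Java with no TensorFlow Lite dependency. It compares the output container the classifier code in the question allocates (`byte[1][mLabels.size()]`) with the four float arrays the detection code passes in, and shows why the exception describes `[[[F` as FLOAT32 while the loaded model produces UINT8. The class and method names (`OutputContainerSketch`, `tensorTypeName`, and so on) are hypothetical and only illustrate the idea; they are not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: none of these names come from the TensorFlow Lite API. */
public class OutputContainerSketch {

    /** Walks down a (possibly nested) Java array to its innermost element type. */
    static Class<?> elementType(Object array) {
        Class<?> type = array.getClass();
        while (type.isArray()) {
            type = type.getComponentType();
        }
        return type;
    }

    /** Maps the element type to the tensor type name that appears in the error message. */
    static String tensorTypeName(Object array) {
        Class<?> type = elementType(array);
        if (type == byte.class) {
            return "UINT8";
        }
        if (type == float.class) {
            return "FLOAT32";
        }
        return "OTHER";
    }

    /** Output container for a quantized image classifier: one confidence byte per label. */
    static byte[][] classifierOutput(int numLabels) {
        return new byte[1][numLabels];
    }

    /** Output containers for an SSD-style detector, keyed by output index as in the question. */
    static Map<Integer, Object> detectorOutputs(int numDetections) {
        Map<Integer, Object> outputs = new LinkedHashMap<>();
        outputs.put(0, new float[1][numDetections][4]); // box locations
        outputs.put(1, new float[1][numDetections]);    // class indices
        outputs.put(2, new float[1][numDetections]);    // scores
        outputs.put(3, new float[1]);                   // number of detections
        return outputs;
    }

    public static void main(String[] args) {
        // The classifier container matches a UINT8 output tensor ...
        System.out.println(tensorTypeName(classifierOutput(3)));
        // ... while the detector containers only match FLOAT32 output tensors.
        System.out.println(tensorTypeName(detectorOutputs(10).get(0)));
    }
}
```

If the interpreter is still loaded with the quantized classification model, its single output is UINT8, so handing it the float detection arrays fails in the way the exception describes.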

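I suggest starting from the sample that does object detection and solving the camera problem there, rather than working the problem the other way around. The candy dispenser sample (and the official image classifier sample) are a good reference for setting up the camera on the RPi3 to capture images and convert them for use by the model.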
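Once a detection model is actually loaded, the box outputs can be turned into rectangles much as the loop in the question does. The sketch below is plain Java so it runs off-device: `Box` is a hypothetical stand-in for `android.graphics.RectF`, and `decode` and the `minScore` threshold are assumptions added for illustration. The coordinate order (index 0 = top, 1 = left, 2 = bottom, 3 = right, all normalized to 0..1) is taken from how the question's code builds its RectF.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch; Box and decode are hypothetical names, not library API. */
public class BoxDecodeSketch {

    /** Plain stand-in for android.graphics.RectF so the sketch runs off-device. */
    static final class Box {
        final float left, top, right, bottom;
        final int classIndex;
        final float score;

        Box(float left, float top, float right, float bottom, int classIndex, float score) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
            this.classIndex = classIndex;
            this.score = score;
        }
    }

    /**
     * Scales normalized boxes to pixel coordinates of a square model input and
     * drops detections whose score is below minScore (an assumed threshold).
     */
    static List<Box> decode(float[][][] locations, float[][] classes, float[][] scores,
                            int inputSize, float minScore) {
        List<Box> boxes = new ArrayList<>();
        for (int i = 0; i < scores[0].length; i++) {
            if (scores[0][i] < minScore) {
                continue;
            }
            float[] loc = locations[0][i]; // [top, left, bottom, right]
            boxes.add(new Box(
                    loc[1] * inputSize,
                    loc[0] * inputSize,
                    loc[3] * inputSize,
                    loc[2] * inputSize,
                    (int) classes[0][i],
                    scores[0][i]));
        }
        return boxes;
    }

    public static void main(String[] args) {
        float[][][] locations = {{{0.25f, 0.5f, 0.75f, 1.0f}, {0f, 0f, 1f, 1f}}};
        float[][] classes = {{3f, 7f}};
        float[][] scores = {{0.9f, 0.2f}};
        List<Box> boxes = decode(locations, classes, scores, 300, 0.5f);
        for (Box b : boxes) {
            System.out.println(b.left + ", " + b.top + ", " + b.right + ", " + b.bottom);
        }
    }
}
```

On the device, each `Box` would map directly onto a `RectF(left, top, right, bottom)` that can be drawn over the captured image.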