Implementing object detection like OpenCV's

Time: 2017-09-13 13:33:15

Tags: c opencv haar-classifier vivado-hls

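I am trying to implement the Viola-Jones algorithm for object detection in C, using Haar cascades (the way OpenCV does), in order to detect faces. I am writing the C code in a Vivado HLS compatible way so that I can port the implementation to an FPGA. My main goal is to learn as much as possible, not just to get it working. I would also appreciate any help improving my question.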

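I basically started by reading Learning OpenCV by G. Bradski, watched some online tutorials, and began writing code. Sure enough, it does not detect faces, and I do not know why. At this point I care more about understanding my mistakes than about being able to detect faces.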

My implementation steps

I am not sure how much detail is appropriate, but to keep it short:

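  • Extracted the Haar cascade data from haarcascade_frontalface_default.xml into C-readable structures (large arrays)
  • Wrote a function that creates the integral image of any given 8-bit grayscale image of size 24x24 (the same size as listed in the cascade)
  • Applied the knowledge from this great post to do the necessary calculations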
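The integral-image step from the list above can be sketched roughly as follows. This is a minimal sketch under assumptions, not necessarily the function used here: the name `integral_image`, the constant `WIN`, and the inclusive convention (each entry includes its own pixel) are chosen for illustration only.

```c
#include <stdint.h>

#define WIN 24 /* window width = height, as listed in the cascade */

/* Inclusive integral image (assumed convention):
 * ii[r * WIN + c] = sum of img over rows 0..r and columns 0..c.
 * Both arrays are stored row by row. */
void integral_image(const uint8_t img[WIN * WIN], int ii[WIN * WIN])
{
    for (int r = 0; r < WIN; r++) {
        int row_sum = 0; /* running sum of the current row up to column c */
        for (int c = 0; c < WIN; c++) {
            row_sum += img[r * WIN + c];
            /* add the entry directly above, if there is one */
            ii[r * WIN + c] = row_sum + (r > 0 ? ii[(r - 1) * WIN + c] : 0);
        }
    }
}
```

With this convention the last entry, `ii[WIN * WIN - 1]`, holds the sum of all 576 pixels, which is an easy sanity check against the golden data.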

My test plan

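  • Implemented a Python script that uses the OpenCV library with the same Haar cascade as above to detect faces and create golden data: the detected faces are cut out of the image (ensuring a size of 24x24) and stored.
  • Converted the stored images into one-dimensional C arrays containing the pixel values row by row: img = {row0col0, row0col1, ..., row1col0, row1col1, ...}
  • Computed the integral image and applied the face detection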

Results

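  • The faces pass only 6 of the 25 stages of the Haar cascade, so my implementation does not detect them. I know they should be detected, because the Python script using OpenCV and the same Haar cascade does detect them.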
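Since known faces fail early, one thing that might be worth checking is whether the feature values are on the same scale as the thresholds in the XML. Viola and Jones describe variance-normalizing each sub-window before evaluating it, and as far as I can tell the node thresholds in the cascade file are small fractions, which suggests a normalized scale rather than raw pixel sums. The following is only a sketch of the per-window variance, assuming it is taken over the full 24x24 window (the exact region and scaling that OpenCV uses may differ). `window_variance` is a hypothetical helper, not part of my code or of OpenCV.

```c
#include <stdint.h>

#define WIN 24 /* window width = height */

/* Variance of the pixel values in one WIN x WIN window:
 * mean of the squares minus the square of the mean.
 * The standard deviation used for normalization would be its square root. */
double window_variance(const uint8_t img[WIN * WIN])
{
    long sum = 0;    /* sum of pixel values, at most 576 * 255 */
    long sq_sum = 0; /* sum of squared pixel values, at most 576 * 255 * 255 */
    for (int i = 0; i < WIN * WIN; i++) {
        sum += img[i];
        sq_sum += (long)img[i] * img[i];
    }
    double mean = (double)sum / (WIN * WIN);
    return (double)sq_sum / (WIN * WIN) - mean * mean;
}
```

If normalization turns out to be the issue, either the weighted rectangle sum or the node threshold would have to be rescaled by the standard deviation and the window area before the comparison. Which of the two is rescaled, and by exactly what factor, is something I would need to confirm against the OpenCV sources.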

My code

/*
 * This is detectFace.c
 */

#include <stdio.h>
#include "detectFace.h"

// define constants based on Haar cascade in use
// Each feature is made of max 3 rects
//#define FEAT_NO 1     // max no. of features (= 2912 for face_default.xml)
#define RECTS_IN_FEAT 3 // max no. of rect's per feature
//#define INTS_IN_RECT 5    // no. of int's needed to describe a rect
// each node has one feature (bijective relation) and three doubles
#define STAGE_NO 25 // no. of stages
#define NODE_NO 211 // no of nodes per stage, corresponds to FEAT_NO since each Node has always one feature in haarcascade_frontalface_default.xml
//#define ELMNT_IN_NODE 3   // no. of doubles needed to describe a node

// constants for frame size
#define WIN_WIDTH 24 // width = height =24

//int detectFace(int features[FEAT_NO][RECTS_IN_FEAT][INTS_IN_RECT], double stages[STAGE_NO][NODE_NO][ELMNT_IN_NODE], double stageThresh[STAGE_NO], int ii[24][24]){
int detectFace(
    int ii[576],
    int stageNum,
    int stageOrga[25],
    float stageThresholds[25],
    float nodes[8739],
    int featOrga[2913],
    int rectangles[6383][5])
{
    int passedStages = 0; // number of stages passed in this run
    int faceDetected = 0; // turns to 1 if face is detected and to 0 if its not detected
    // Debug:
    int nodesUsed = 0; // number of floats out of nodes[] processed, use to skip to the unprocessed floats
    int rectsUsed = 0; // number of rects processed
    int droppedInStage0 = 0;

    // loop through all stages
    int i;
detectFace_label1:
    for (i = 0; i < STAGE_NO; i++)
    {
        double tmp = 0.0;           //variable to accumulate node-values, to then compare to stage threshold
        int nodeNum = stageOrga[i]; // get number of nodes for this stage from stageOrga using stage index i
        // loop through nodes inside each stage
        // NOTE: it is assumed that each node maps to one corresponding feature. Ex: node[0] has feat[0] and node[1] has feat[1]
        // because this is how it is written in the haarcascade_frontalface_default.xml
        int j;
    detectFace_label0:
        for (j = 0; j < nodeNum; j++) // iterate only over this stage's nodes; NODE_NO is just the per-stage maximum
        {
            // a node is defined by 3 values:
            double nodeThresh = nodes[nodesUsed]; // the first value is the node threshold
            double lValue = nodes[nodesUsed + 1]; // the second value is the left value
            double rValue = nodes[nodesUsed + 2]; // the third value is the right value
            int sum = 0;                          // contains the weighted value of rectangles in one Haar feature
            // loop through rect's in a feature, some have 2 and some have 3 rect's.
            // Each node always refers to one feature in a way that node0 maps to feature0 and node1 to feature1 (The XML file is build like that)
            int rectNum = featOrga[nodesUsed / 3]; // number of rects in the current feature; nodesUsed / 3 is the global node (= feature) index
            int k;
        detectFace_label2:
            for (k = 0; k < rectNum; k++) // features have 2 or 3 rects; RECTS_IN_FEAT is only the maximum
            {
                int x = 0, y = 0, width = 0, height = 0, weight = 0, coordUpL = 0, coordUpR = 0, coordDownL = 0, coordDownR = 0;

                // a rect is defined by 5 values:
                x = rectangles[rectsUsed][0];      // the first value is the x coordinate of the top left corner pixel
                y = rectangles[rectsUsed][1];      // the second value is the y coordinate of the top left corner pixel
                width = rectangles[rectsUsed][2];  // the third value is the width of the current rectangle
                height = rectangles[rectsUsed][3]; // the fourth value is the height of this rectangle
                weight = rectangles[rectsUsed][4]; // the fifth value is the weight of this rectangle

                // calculating 1-Dim index for points of interest. Formula: index = width * row + column, assuming values are stored in row order
                coordUpL = ((WIN_WIDTH * y) - WIN_WIDTH) + (x - 1);
                coordUpR = coordUpL + width;
                coordDownL = coordUpL + (height * WIN_WIDTH);
                coordDownR = coordDownL + width;

                // calculate the area sum according to Viola-Jones
                //sum += (ii[x][y] + ii[x+width][y+height] - ii[x][y+height] - ii[x+width][y]) * weight;
                sum += (ii[coordUpL] + ii[coordDownR] - ii[coordUpR] - ii[coordDownL]) * weight;
                // Debug: counting the number of actual rectangles used
                rectsUsed++; //
            }
            // decide whether the result of the feature calculation reaches the node threshold
            if (sum < nodeThresh)
            {
                tmp += lValue; // add left value to tmp if node threshold was not reached
            }
            else
            {
                tmp += rValue; // add right value to tmp if node threshold was reached
            }
            nodesUsed = nodesUsed + 3; // one node is processed, increase nodesUsed by number of floats needed to represent a node (3)
        }
        //########  at this point we went through each node in the current stage #######
        // check if threshold of current stage was reached
        if (tmp < stageThresholds[i])
        {
            faceDetected = 0; // if any stage threshold is not reached the operation is done and no face is present
            // Debug: show in which stage the frame was dropped
            printf("Face detection failed in stage %d \n", i);
            //i = stageNum;         // breaks out this loop, because i is supposed to stay smaller than STAGE_NO
        }
        else
        {
            passedStages++; // stage threshold is reached, therefore passedStages will count up
        }
    }
    //########  at this point we went through all stages ###############################
    //----------------------------------------------------------------------------------
    // if the number of passed stages reaches the total number of stages, a face is detected
    if (passedStages == stageNum)
    {
        faceDetected = 1; // one symbolizes that the input is a face
    }
    else
    {
        faceDetected = 0; // zero symbolizes that the input is not a face
    }
    return faceDetected;
}
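
One spot in the listing that may deserve a closer look is the corner indexing: for rectangles touching the top or left edge (x = 0 or y = 0), `coordUpL` becomes negative or lands in the previous row of the array. Below is a minimal sketch of a rectangle sum that treats those out-of-image corners as zero, assuming an inclusive 24x24 integral image stored row by row. `ii_at` and `rect_sum` are hypothetical helpers and not part of the code above.

```c
#define WIN 24 /* window width = height */

/* Value of the inclusive integral image at (row, col).
 * Positions just outside the image (row or col == -1) count as 0. */
static int ii_at(const int ii[WIN * WIN], int row, int col)
{
    if (row < 0 || col < 0)
        return 0;
    return ii[row * WIN + col];
}

/* Sum of the pixels in the rectangle whose top-left pixel is (x, y)
 * and whose size is width x height. */
int rect_sum(const int ii[WIN * WIN], int x, int y, int width, int height)
{
    int top = y - 1;             /* row just above the rectangle */
    int left = x - 1;            /* column just left of the rectangle */
    int bottom = y + height - 1; /* last row inside the rectangle */
    int right = x + width - 1;   /* last column inside the rectangle */

    return ii_at(ii, bottom, right) - ii_at(ii, top, right)
         - ii_at(ii, bottom, left) + ii_at(ii, top, left);
}
```

A bounds check like this adds a comparison per corner. An alternative that avoids the branches would be a 25x25 integral image with a leading row and column of zeros, at the cost of a slightly larger array.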

0 answers:

No answers