OpenCV: 3D pose estimation of color markers using a stereo camera system

Date: 2014-07-01 11:08:55

Tags: c++ opencv image-processing tracking

I have a stereo camera system and calibrated it properly using cv::calibrateCamera and cv::stereoCalibrate. My reprojection error seems to be fine:

  • Cam0: 0.401427
  • Cam1: 0.388200
  • Stereo: 0.399642

[Image: A snapshot of the ongoing calibration process]

I verify my calibration by calling cv::stereoRectify and transforming my images with cv::initUndistortRectifyMap and cv::remap. The result is shown below (one thing I noticed as odd: when displaying the rectified images, there are usually artifacts in the form of a deformed copy of the original image on one, or sometimes even both, images):

[Image: Rectification result]
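
For context, a minimal sketch of that rectification step (my reconstruction, not the poster's code; img0/img1 and rmap are placeholder names, while R0/R1/P0/P1 are the cv::stereoRectify outputs shown in the code further down):

    // Build the rectification maps once...
    cv::Mat rmap[2][2];
    cv::initUndistortRectifyMap(cameraMatrix[0], distCoeffs[0], R0, P0,
        cv::Size(656, 492), CV_16SC2, rmap[0][0], rmap[0][1]);
    cv::initUndistortRectifyMap(cameraMatrix[1], distCoeffs[1], R1, P1,
        cv::Size(656, 492), CV_16SC2, rmap[1][0], rmap[1][1]);
    
    // ...then remap every incoming frame pair
    cv::Mat rect0, rect1;
    cv::remap(img0, rect0, rmap[0][0], rmap[0][1], cv::INTER_LINEAR);
    cv::remap(img1, rect1, rmap[1][0], rmap[1][1], cv::INTER_LINEAR);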

I also correctly estimate the positions of my markers in pixel coordinates using cv::findContours on a thresholded HSV image, as sketched below.

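A rough sketch of that detection pipeline (my reconstruction from the steps listed in the code comments further down; the HSV bounds lowerHSV/upperHSV are placeholders, not the poster's values):

    // Threshold in HSV, clean up the mask, then take the ellipse centers
    cv::Mat hsv, mask;
    cv::cvtColor(frame, hsv, CV_BGR2HSV);
    cv::inRange(hsv, lowerHSV, upperHSV, mask);  // placeholder bounds
    cv::morphologyEx(mask, mask, cv::MORPH_OPEN,
        cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5)));
    
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(mask, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
    
    std::vector<cv::Point2f> markerCenters;
    for (size_t i = 0; i < contours.size(); ++i)
        if (contours[i].size() >= 5)  // cv::fitEllipse needs at least 5 points
            markerCenters.push_back(cv::fitEllipse(contours[i]).center);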

Unfortunately, when I now try cv::triangulatePoints, my results are very poor compared to the estimated coordinates, especially in the x-direction:

P1 = {   58 (±1),  150 (±1), -90xx (±2xxx)  } (bottom)
P2 = {  115 (±1),  -20 (±1), -90xx (±2xxx)  } (right)
P3 = { 1155 (±6),  575 (±3), 60xxx (±20xxx) } (top-left)

These results are in mm, in camera coordinates. Both cameras are positioned roughly 550 mm away from the checkerboard, and the square size is 13 mm. Obviously, my results are not even close to what I expect (negative and huge z-coordinates).
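
As a plausibility check: for a rectified pair, depth follows from disparity as

    Z = f * B / d

where f is the focal length in pixels, B the baseline, and d the disparity. With the board roughly 550 mm in front of the cameras, correctly triangulated markers should therefore come out near Z ≈ 550 mm in camera coordinates; the values above miss that by orders of magnitude.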

So my questions are:

  1. I followed the stereo_calib.cpp sample very closely, and at least visually I seem to get good results (see the reprojection errors). Why are my triangulation results so poor?
  2. How do I transform my results into a real-world coordinate system so that I can check them quantitatively? Do I have to do it manually as shown over here, or is there some OpenCV function for that?

Here is my code:

    std::vector<std::vector<cv::Point2f> > imagePoints[2];
    std::vector<std::vector<cv::Point3f> > objectPoints;
    
    imagePoints[0].resize(s->nrFrames);
    imagePoints[1].resize(s->nrFrames);
    objectPoints.resize( s->nrFrames );
    
    // [Obtain image points..]
    // cv::findChessboardCorners, cv::cornerSubPix
    
    // Calc obj points
    for( int i = 0; i < s->nrFrames; i++ )
        for( int j = 0; j < s->boardSize.height; j++ )
            for( int k = 0; k < s->boardSize.width; k++ )
                objectPoints[i].push_back( cv::Point3f( j * s->squareSize, k * s->squareSize, 0 ) );
    
    
    // Mono calibration
    cv::Mat cameraMatrix[2], distCoeffs[2];
    cameraMatrix[0] = cv::Mat::eye(3, 3, CV_64F);
    cameraMatrix[1] = cv::Mat::eye(3, 3, CV_64F);
    
    std::vector<cv::Mat> tmp0, tmp1;  // per-view rvecs/tvecs, not needed afterwards
    
    double err0 = cv::calibrateCamera( objectPoints, imagePoints[0], cv::Size( 656, 492 ),
        cameraMatrix[0], distCoeffs[0], tmp0, tmp1,    
        CV_CALIB_FIX_PRINCIPAL_POINT + CV_CALIB_FIX_ASPECT_RATIO );
    std::cout << "Cam0 reproj err: " << err0 << std::endl;
    
    double err1 = cv::calibrateCamera( objectPoints, imagePoints[1], cv::Size( 656, 492 ),
        cameraMatrix[1], distCoeffs[1], tmp0, tmp1,
        CV_CALIB_FIX_PRINCIPAL_POINT + CV_CALIB_FIX_ASPECT_RATIO );
    std::cout << "Cam1 reproj err: " << err1 << std::endl;
    
    // Stereo calibration
    cv::Mat R, T, E, F;
    
    double err2 = cv::stereoCalibrate(objectPoints, imagePoints[0], imagePoints[1],
        cameraMatrix[0], distCoeffs[0], cameraMatrix[1], distCoeffs[1],
        cv::Size( 656, 492 ), R, T, E, F,
        cv::TermCriteria(CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 100, 1e-5),
        CV_CALIB_USE_INTRINSIC_GUESS + // because of mono calibration
        CV_CALIB_SAME_FOCAL_LENGTH +
        CV_CALIB_RATIONAL_MODEL +
        CV_CALIB_FIX_K3 + CV_CALIB_FIX_K4 + CV_CALIB_FIX_K5);
    std::cout << "Stereo reproj err: " << err2 << std::endl;
    
    // StereoRectify
    cv::Mat R0, R1, P0, P1, Q;
    cv::Rect validRoi[2];
    cv::stereoRectify( cameraMatrix[0], distCoeffs[0],
                cameraMatrix[1], distCoeffs[1],
                cv::Size( 656, 492 ), R, T, R0, R1, P0, P1, Q,
                CV_CALIB_ZERO_DISPARITY, 1, cv::Size( 656, 492 ), &validRoi[0], &validRoi[1]);
    
    
    // [Track marker..]
    // cv::cvtColor, cv::inRange, cv::morphologyEx, cv::findContours, cv::fitEllipse, *calc ellipsoid centers*
    
    // Triangulation
    unsigned int N = centers[0].size();
    
    
    cv::Mat pnts3D;
    cv::triangulatePoints( P0, P1, centers[0], centers[1], pnts3D );
    
    
    cv::Mat t = pnts3D.t();                               // transpose: one homogeneous point per row
    cv::Mat pnts3DT = cv::Mat( N, 1, CV_32FC4, t.data );  // reinterpret as N x 1, 4-channel
    
    cv::Mat resultPoints; 
    cv::convertPointsFromHomogeneous( pnts3DT, resultPoints );
    

Finally, resultPoints is supposed to contain the column vectors of my 3D positions in camera coordinates.
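
For completeness, a small readout sketch (my addition, assuming resultPoints ends up as an N x 1, CV_32FC3 matrix, which is what cv::convertPointsFromHomogeneous produces here):

    // Print the triangulated points, one cv::Point3f per row
    for( int i = 0; i < resultPoints.rows; i++ )
    {
        cv::Point3f p = resultPoints.at<cv::Point3f>(i, 0);
        std::cout << "P" << i << " = { " << p.x << ", " << p.y << ", " << p.z << " }" << std::endl;
    }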

Edit: I removed some unnecessary conversions to shorten the code.

Edit 2: These are the results I get using the triangulation solution suggested by @marols:

[Image: Qualitative and quantitative triangulation results]

    P1 = { 111,  47, 526 } (bottom-right)
    P2 = {  -2,   2, 577 } (left)
    P3 = {  64, -46, 616 } (top)
    

2 Answers:

Answer 0 (score: 4)

My answer focuses on suggesting an alternative solution to triangulatePoints. In the case of a stereo vision system, you can use the matrix Q returned by stereo rectification in the following way:

std::vector<cv::Vec3f> surfacePoints, realSurfacePoints;

unsigned int N = centers[0].size();
for(unsigned int i = 0; i < N; i++) {
    // since you have a stereo vision system in which the cameras lie next to
    // each other on the OX axis, disparity is measured along the OX axis
    double disparity = centers[0][i].x - centers[1][i].x;

    surfacePoints.push_back(cv::Vec3f(centers[0][i].x, centers[0][i].y, disparity));
}

cv::perspectiveTransform(surfacePoints, realSurfacePoints, Q);

Please adapt the snippet above to your code; I may have made some mistakes, but the point is to create an array of cv::Vec3f's, each with the structure (point.x, point.y, disparity to the corresponding point in the second image), and pass it to the perspectiveTransform method (see the docs for details). If you would like to get into more detail on how the matrix Q is created (basically it represents a "reverse" projection from an image to a real-world point), see the "Learning OpenCV" book, page 435.
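
For reference, the cv::stereoRectify documentation gives Q the following form (c_x, c_y: principal point of the first camera; c'_x: principal point x of the second camera; T_x: baseline; f: focal length):

    Q = [ 1   0    0       -c_x             ]
        [ 0   1    0       -c_y             ]
        [ 0   0    0        f               ]
        [ 0   0   -1/T_x   (c_x - c'_x)/T_x ]

Multiplying Q by (x, y, disparity, 1)^T yields a homogeneous point (X, Y, Z, W)^T, and the metric point is (X/W, Y/W, Z/W), which is exactly what perspectiveTransform computes per element.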

In the stereo vision system I developed, the described method works fine and gives correct results even with much larger calibration errors (such as 1.2).

Answer 1 (score: 0)

To project into the real-world coordinate system you need the projective camera matrix. It can be built like this:

cv::Mat KR = CalibMatrix * R;  // CalibMatrix: the camera's 3x3 intrinsic (calibration) matrix
cv::Mat eyeC = cv::Mat::eye(3,4,CV_64F);
eyeC.at<double>(0,3) = -T.at<double>(0);
eyeC.at<double>(1,3) = -T.at<double>(1);
eyeC.at<double>(2,3) = -T.at<double>(2);

// Filling CameraMatrix element by element is just the 3x4 matrix product:
cv::Mat CameraMatrix = KR * eyeC;
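
A follow-up sketch of how such a projection matrix can be fed to cv::triangulatePoints (my addition, not part of the answer; K0/K1 are my names for the two intrinsic matrices, and R, T are taken in OpenCV's stereoCalibrate convention, which maps a point x in the first camera's frame to R*x + T in the second):

cv::Mat P0 = K0 * cv::Mat::eye(3, 4, CV_64F);  // first camera: K0 * [I | 0]

cv::Mat Rt;
cv::hconcat(R, T, Rt);                         // [R | T], 3x4
cv::Mat P1 = K1 * Rt;                          // second camera: K1 * [R | T]

// 4xN homogeneous points in the first camera's (metric) frame
cv::Mat pointsHom;
cv::triangulatePoints(P0, P1, centers[0], centers[1], pointsHom);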