Question

I am trying to use my webcam to get a global pose estimate from an image of four fiducials with known global positions.

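I have checked many Stack Exchange questions and a few papers, and I cannot seem to get a correct solution. The position numbers I get are repeatable, but they are in no way linearly proportional to the camera movement. FYI, I am using C++ OpenCV 2.1.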

My coordinate systems and the test data used below are pictured at this link.

% Input to solvePnP():
imagePoints =     [ 481, 831; % [x, y] format
                    520, 504;
                   1114, 828;
                   1106, 507]
objectPoints = [0.11, 1.15, 0; % [x, y, z] format
                0.11, 1.37, 0; 
                0.40, 1.15, 0;
                0.40, 1.37, 0]

% camera intrinsics for Logitech C910
cameraMat = [1913.71011, 0.00000,    1311.03556;
             0.00000,    1909.60756, 953.81594;
             0.00000,    0.00000,    1.00000]
distCoeffs = [0, 0, 0, 0, 0]

% output of solvePnP():
tVec = [-0.3515;
         0.8928; 
         0.1997]

rVec = [2.5279;
       -0.09793;
        0.2050]
% using Rodrigues to convert back to rotation matrix:

rMat = [0.9853, -0.1159,  0.1248;
       -0.0242, -0.8206, -0.5708;
        0.1686,  0.5594, -0.8114]

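Can anyone see anything wrong with these numbers so far? I would appreciate it if someone could check them in, for example, MATLAB (the code above is m-file friendly).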
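One way to check the Rodrigues step without MATLAB or OpenCV is to apply the axis-angle formula by hand. The following is a minimal pure-Python sketch; the function name `rodrigues` and the variable names are illustrative and are not part of OpenCV. It rebuilds the rotation matrix from the posted rVec and compares it with the posted rMat.

```python
import math

def rodrigues(rvec):
    """Convert an axis-angle vector (unit axis times angle in radians) to a 3x3 rotation matrix."""
    theta = math.sqrt(sum(v * v for v in rvec))
    if theta < 1e-12:
        # Zero rotation: return the identity
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (v / theta for v in rvec)
    c, s = math.cos(theta), math.sin(theta)
    one_c = 1.0 - c
    # R = cos(theta) * I + (1 - cos(theta)) * k k^T + sin(theta) * [k]_x
    return [
        [c + one_c * kx * kx,      one_c * kx * ky - s * kz, one_c * kx * kz + s * ky],
        [one_c * kx * ky + s * kz, c + one_c * ky * ky,      one_c * ky * kz - s * kx],
        [one_c * kx * kz - s * ky, one_c * ky * kz + s * kx, c + one_c * kz * kz],
    ]

# Values copied from the solvePnP output posted above
r_vec = [2.5279, -0.09793, 0.2050]
r_mat_posted = [
    [0.9853, -0.1159, 0.1248],
    [-0.0242, -0.8206, -0.5708],
    [0.1686, 0.5594, -0.8114],
]

r_mat = rodrigues(r_vec)
max_diff = max(abs(r_mat[i][j] - r_mat_posted[i][j])
               for i in range(3) for j in range(3))
print("largest element-wise difference:", max_diff)
```

If the largest difference is on the order of the rounding in the posted values, the rVec to rMat conversion itself is consistent, and any remaining problem lies in how rMat and tVec are combined afterwards.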

From this point, I am unsure how to get the global pose from rMat and tVec. From what I have read in this question, getting the pose from rMat and tVec is simply:

position = transpose(rMat) * tVec   % matrix multiplication

But other sources I have read make me suspect that it is not that simple.

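What do I need to do to get the position of the camera in real-world coordinates? Since I am unsure whether this is an implementation problem (though it is most likely a theory problem), I would like someone who has used the solvePnP function successfully in OpenCV to answer this question, although any ideas are welcome!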

Thank you very much for your time.

Answer 1

I solved this a while ago; apologies for the year-long delay.

In the Python OpenCV 2.1 I was using, and in the newer 3.0.0-dev version, I have verified that to get the pose of the camera in the global frame you must:

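import cv2

_, rVec, tVec = cv2.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs)
Rt, _ = cv2.Rodrigues(rVec)   # Rodrigues returns (rotation matrix, Jacobian)
R = Rt.transpose()            # inverse of the rotation returned by solvePnP
pos = -R.dot(tVec)            # matrix product; * on NumPy arrays is element-wise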

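Now pos is the position of the camera expressed in the global frame (the same frame the objectPoints are expressed in). R is an attitude matrix (DCM), which is a good form in which to store the attitude. If you need Euler angles, you can convert the DCM to Euler angles, given an XYZ rotation sequence, using: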

roll = atan2(-R[2][1], R[2][2])
pitch = asin(R[2][0])
yaw = atan2(-R[1][0], R[0][0])
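The three formulas above can be checked with a round trip. The sketch below is pure Python and assumes one specific reading of "XYZ rotation sequence", namely R = (Rx(roll) · Ry(pitch) · Rz(yaw))ᵀ, which is the convention under which those formulas recover the angles; that reading and all helper names (`rot_x`, `dcm_from_euler`, `euler_from_dcm`, and so on) are assumptions made for illustration.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def dcm_from_euler(roll, pitch, yaw):
    # Assumed convention: R = (Rx(roll) * Ry(pitch) * Rz(yaw))^T
    return transpose(matmul(rot_x(roll), matmul(rot_y(pitch), rot_z(yaw))))

def euler_from_dcm(R):
    # Same formulas as in the answer; asin argument clamped against rounding noise
    roll = math.atan2(-R[2][1], R[2][2])
    pitch = math.asin(max(-1.0, min(1.0, R[2][0])))
    yaw = math.atan2(-R[1][0], R[0][0])
    return roll, pitch, yaw

angles = (0.3, -0.2, 1.1)  # arbitrary test angles in radians
recovered = euler_from_dcm(dcm_from_euler(*angles))
print(recovered)
```

The round trip only holds while the pitch stays strictly between -90 and +90 degrees; at exactly ±90 degrees roll and yaw are no longer separately determined (gimbal lock).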

Answer 2

If by global pose you mean a 4x4 camera pose matrix that can be used in OpenGL, I do it this way:

CvMat* ToOpenGLCos( const CvMat* tVec, const CvMat* rVec )
{
    //** flip COS 180 degree around x-axis **//

    // Rodrigues to rotation matrix
    CvMat* extRotAsMatrix = cvCreateMat(3,3,CV_32FC1);
    cvRodrigues2(rVec,extRotAsMatrix);

    // Simply merge rotation matrix and translation vector to 4x4 matrix 
    CvMat* world2CameraTransformation = CreateTransformationMatrixH(tVec,
    extRotAsMatrix );

    // Create correction rotation matrix (180 deg x-axis)
    CvMat* correctionMatrix = cvCreateMat(4,4,CV_32FC1);
    /* 1.00000   0.00000   0.00000   0.00000
       0.00000  -1.00000  -0.00000   0.00000
       0.00000   0.00000  -1.00000   0.00000
       0.00000   0.00000   0.00000   1.00000 */
    cvmSet(correctionMatrix,0,0,1.0); cvmSet(correctionMatrix,0,1,0.0);
    ... 

    // Flip it
    CvMat* world2CameraTransformationOpenGL = cvCreateMat(4,4,CV_32FC1);
    cvMatMul(correctionMatrix,world2CameraTransformation,   world2CameraTransformationOpenGL);

    CvMat* camera2WorldTransformationOpenGL = cvCreateMat(4,4,CV_32FC1);
    cvInv(world2CameraTransformationOpenGL,camera2WorldTransformationOpenGL,
    CV_LU );

    cvReleaseMat( &world2CameraTransformationOpenGL );
    ...

    return camera2WorldTransformationOpenGL;
}

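I think flipping the coordinate system is necessary because OpenCV and OpenGL/VTK/etc. use different coordinate systems, as illustrated in the picture OpenGL and OpenCV Coordinate Systems.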

Well, it works this way, but somebody might have a better explanation.
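The same sequence of steps (build the 4x4 world-to-camera matrix, rotate 180 degrees about the x-axis, invert) can be sketched in pure Python. This is an illustration rather than a port of the function above: the helper names are hypothetical, the input values are the rMat and tVec from the question, and the closed-form rigid inverse [Rᵀ | -Rᵀt] is swapped in for the general LU inversion done by cvInv.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def make_transform(r_mat, t_vec):
    """Stack a 3x3 rotation and a translation into a 4x4 homogeneous matrix."""
    return [r_mat[i] + [t_vec[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def rigid_inverse(t):
    """Invert a 4x4 rigid transform [R | t] as [R^T | -R^T t]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(r_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [r_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# 180-degree rotation about the x-axis (OpenCV camera axes to OpenGL camera axes)
FLIP_X_180 = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# rMat and tVec as posted in the question (rounded to four digits)
r_mat = [
    [0.9853, -0.1159, 0.1248],
    [-0.0242, -0.8206, -0.5708],
    [0.1686, 0.5594, -0.8114],
]
t_vec = [-0.3515, 0.8928, 0.1997]

world_to_camera = make_transform(r_mat, t_vec)
world_to_camera_gl = matmul(FLIP_X_180, world_to_camera)
camera_to_world_gl = rigid_inverse(world_to_camera_gl)

# The last column of the camera-to-world matrix is the camera position in world coordinates
camera_position = [row[3] for row in camera_to_world_gl[:3]]
print(camera_position)
```

The flip only changes which way the camera's y and z axes point; the camera position in the last column is the same as without the flip, because the flip matrix cancels itself out in the translation term.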

Answer 3

The position of the camera is {-transpose(r) * t}. That's it.
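The minus sign follows from the solvePnP convention x_cam = R · x_world + t: the camera centre is the world point that maps to x_cam = 0, so 0 = R · C + t and therefore C = -Rᵀ · t. As a rough numerical check, here is a pure-Python sketch that applies this to the rMat and tVec posted in the question; the function and variable names are illustrative only.

```python
def camera_position(r_mat, t_vec):
    """Return C = -R^T * t, the camera centre in world (objectPoints) coordinates."""
    return [-sum(r_mat[k][i] * t_vec[k] for k in range(3)) for i in range(3)]

# Values as posted in the question (rounded to four digits)
r_mat = [
    [0.9853, -0.1159, 0.1248],
    [-0.0242, -0.8206, -0.5708],
    [0.1686, 0.5594, -0.8114],
]
t_vec = [-0.3515, 0.8928, 0.1997]

pos = camera_position(r_mat, t_vec)

# Consistency check: mapping pos into the camera frame should give (almost) zero
residual = [sum(r_mat[i][k] * pos[k] for k in range(3)) + t_vec[i]
            for i in range(3)]

print("camera position:", pos)
print("R * pos + t:", residual)
```

With the posted numbers this comes out at roughly (0.33, 0.58, 0.72) in the objectPoints frame. The residual is not exactly zero only because the posted rMat is rounded and so not perfectly orthonormal.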

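You have done everything correctly, except that, if I remember correctly, cv::solvePnP gives a (4 x 1) vector for the translation, so you would have to divide x, y, z by the w coordinate.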

Camera pose estimation (OpenCV PnP)
