Question

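I have 4 coplanar points in a video (or image) that represent a quadrilateral (not necessarily a square or rectangle), and I would like to display a virtual cube on top of them, with the cube's corners standing exactly on the corners of the video quadrilateral.

Since the points are coplanar, I can compute the homography between the corners of a unit square (i.e. [0,0] [0,1] [1,0] [1,1]) and the video coordinates of the quadrilateral.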
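As an illustration of this step, here is a minimal sketch that estimates the homography from the four correspondences with the direct linear transform (DLT), using only NumPy. The function names and the pixel coordinates of the quadrilateral are made up for the example; in practice a library routine such as OpenCV's findHomography would probably be used instead.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (3x3) such that dst ~ H @ src, from >= 4 point pairs (DLT)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    # Fix the arbitrary scale (assumes H[2, 2] is not zero).
    return H / H[2, 2]

def apply_homography(H, pt):
    """Map a 2D point through H and return the dehomogenized result."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]

unit_square = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Hypothetical pixel coordinates of the quadrilateral, in the same corner order.
quad = [(120.0, 80.0), (100.0, 300.0), (400.0, 110.0), (430.0, 340.0)]
H = homography_from_points(unit_square, quad)
```

The homography is only defined up to scale, so dividing by H[2, 2] is a convention rather than a requirement.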

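From this homography I should be able to compute the correct camera pose, i.e. [R | t], where R is a 3x3 rotation matrix and t is a 3x1 translation vector, so that the virtual cube sits on the video quadrilateral.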
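To make the role of [R | t] concrete, the sketch below projects the eight vertices of a unit cube into the image with P = K [R | t]. The intrinsics, the pose, and the function name are assumptions chosen for the example, and whether the cube should rise along +z or -z depends on the axis convention in use.

```python
import numpy as np

def project_points(K, R, t, pts3d):
    """Project Nx3 world points to pixel coordinates with P = K [R | t]."""
    pts3d = np.asarray(pts3d, dtype=float)
    cam = pts3d @ R.T + t.reshape(1, 3)   # world -> camera coordinates
    pix = cam @ K.T                        # apply the intrinsics
    return pix[:, :2] / pix[:, 2:3]        # perspective divide

# Unit cube whose base (z = 0) is the unit square [0,0] [0,1] [1,0] [1,1].
cube = np.array([[x, y, z] for z in (0, 1) for x in (0, 1) for y in (0, 1)],
                dtype=float)

# Assumed intrinsics and pose, for illustration only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])

pixels = project_points(K, R, t, cube)  # 8 x 2 array of pixel coordinates
```

If the pose is right, the first four rows of pixels (the z = 0 face) should coincide with the four corners of the video quadrilateral.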

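I have read many solutions (some of them on SO) and tried implementing them, but they seem to work only in some "simple" cases (such as when the video quadrilateral is a square) and fail in most others.

Here are the methods I tried (most of them are based on the same principle; only the computation of the translation differs slightly). Let K be the camera's intrinsic matrix and H the homography. We compute:

A = K^-1 * H

Let a1, a2, a3 be the column vectors of A, and r1, r2, r3 the column vectors of the rotation matrix R.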

r1 = a1 / ||a1||
r2 = a2 / ||a2||
r3 = r1 x r2
t = a3 / sqrt(||a1||*||a2||)
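The formulas above can be sketched in NumPy and checked against a synthetic, noise-free homography. The function name, the intrinsics, and the ground-truth pose are assumptions made for this example. The sign check is an addition that is not part of the formulas above: H is only defined up to scale, including a negative scale, so the sketch flips the sign to keep the plane in front of the camera (t_z > 0).

```python
import numpy as np

def pose_from_homography(H, K):
    """Recover (R, t) from a plane-to-image homography using the formulas above."""
    A = np.linalg.inv(K) @ H
    # Added step: choose the sign that gives a positive depth t_z.
    if A[2, 2] < 0:
        A = -A
    a1, a2, a3 = A[:, 0], A[:, 1], A[:, 2]
    n1, n2 = np.linalg.norm(a1), np.linalg.norm(a2)
    r1 = a1 / n1
    r2 = a2 / n2
    r3 = np.cross(r1, r2)
    t = a3 / np.sqrt(n1 * n2)
    return np.column_stack([r1, r2, r3]), t

# Synthetic ground truth (assumed values).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a, b = 0.4, -0.3
Rx = np.array([[1, 0, 0],
               [0, np.cos(a), -np.sin(a)],
               [0, np.sin(a), np.cos(a)]])
Ry = np.array([[np.cos(b), 0, np.sin(b)],
               [0, 1, 0],
               [-np.sin(b), 0, np.cos(b)]])
R_true = Rx @ Ry
t_true = np.array([0.2, -0.1, 4.0])

# For points on the plane z = 0, the projection reduces to H = K [r1 r2 t].
H = K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])
H = -2.5 * H  # arbitrary scale, including a sign flip

R, t = pose_from_homography(H, K)
```

With an exact homography this recovers the pose. With a homography estimated from noisy points, a1 and a2 are generally neither orthogonal nor of equal length, so the R built this way is not guaranteed to be a true rotation matrix.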

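The problem is that this does not work in most cases. To check my results, I compared R and t with those obtained from OpenCV's solvePnP method (using the 3D points [0,0,0] [0,1,0] [1,0,0] [1,1,0]).

Since I display the cube in the same way in both cases, I noticed that solvePnP gives correct results every time, while the pose obtained from the homography is mostly wrong.

In theory, since my points are coplanar, it should be possible to compute the pose from the homography, but I could not find the correct way to compute the pose from H.

Any insight into what I am doing wrong?

Edit after trying @Jav_Rock's method

Hi Jav_Rock, thanks a lot for your answer. I tried your approach (and many others as well), and it seems to work more or less. However, I still run into some problems when computing the pose from 4 coplanar points. To check the results, I compare them with those of solvePnP (which should be much better because of its iterative reprojection-error minimization).

Here is an example: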

[image: cube]

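Yellow cube: solvePnP
Black cube: Jav_Rock's technique
Cyan (and purple) cubes: some other techniques that give exactly the same results

As you can see, the black cube is more or less OK but does not look well proportioned, although the vectors seem to be orthonormal.

EDIT 2: I normalized v3 after computing it (to enforce orthonormality), and this seems to fix some of the problems as well.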
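Normalizing a single vector restores its unit length but does not by itself make the three columns mutually orthogonal. One common alternative, shown here as a sketch and not necessarily what was done in the edit above, is to replace the approximate rotation by the nearest true rotation matrix using an SVD. The function name and the sample matrix are made up for the example.

```python
import numpy as np

def nearest_rotation(M):
    """Closest rotation matrix to M in the Frobenius norm, via SVD."""
    # np.linalg.svd returns U, the singular values, and V already transposed.
    U, _, Vt = np.linalg.svd(M)
    # The last diagonal entry guards against returning a reflection (det = -1).
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    return U @ D @ Vt

# An approximate rotation whose columns are slightly non-orthogonal.
M = np.array([[1.0, 0.05, 0.0],
              [0.02, 0.98, 0.0],
              [0.0, 0.0, 1.0]])
R = nearest_rotation(M)
```

The result satisfies R^T R = I and det(R) = +1 up to floating-point error, which keeps the rendered cube from looking sheared.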

Answer 1

If you have the homography, you can compute the camera pose like this:

void cameraPoseFromHomography(const Mat& H, Mat& pose)
{
    pose = Mat::eye(3, 4, CV_32FC1);      // 3x4 matrix, the camera pose
    float norm1 = (float)norm(H.col(0));  
    float norm2 = (float)norm(H.col(1));  
    float tnorm = (norm1 + norm2) / 2.0f; // Normalization value

    Mat p1 = H.col(0);       // Pointer to first column of H
    Mat p2 = pose.col(0);    // Pointer to first column of pose (empty)

    cv::normalize(p1, p2);   // Normalize the rotation, and copies the column to pose

    p1 = H.col(1);           // Pointer to second column of H
    p2 = pose.col(1);        // Pointer to second column of pose (empty)

    cv::normalize(p1, p2);   // Normalize the rotation and copies the column to pose

    p1 = pose.col(0);
    p2 = pose.col(1);

    Mat p3 = p1.cross(p2);   // Computes the cross-product of p1 and p2
    Mat c2 = pose.col(2);    // Pointer to third column of pose
    p3.copyTo(c2);       // Third column is the crossproduct of columns one and two

    pose.col(3) = H.col(2) / tnorm;  //vector t [R|t] is the last column of pose
}

This method works for me. Good luck.

Answer 2

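The answer proposed by Jav_Rock does not provide a valid solution for camera poses in three-dimensional space.

There are several methods for estimating the three-dimensional translation and rotation induced by a homography. One of them provides closed-form formulas for decomposing the homography, but they are quite complex. Moreover, the solution is never unique.

Fortunately, OpenCV 3 already implements this decomposition (decomposeHomographyMat). Given a homography and a correctly scaled intrinsic matrix, the function provides a set of four possible rotations and translations.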

Answer 3

In case anyone needs a Python port of the function written by @Jav_Rock:

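import numpy as np

def cameraPoseFromHomography(H):
    # Note: H1 and H2 are used without normalization, and the result stacks
    # H1, H2, H3, T as rows, giving a 4x3 matrix (the transpose of [R | t]).
    H1 = H[:, 0]
    H2 = H[:, 1]
    H3 = np.cross(H1, H2)

    norm1 = np.linalg.norm(H1)
    norm2 = np.linalg.norm(H2)
    tnorm = (norm1 + norm2) / 2.0

    T = H[:, 2] / tnorm
    # np.asmatrix replaces the np.mat alias, which was removed in NumPy 2.0.
    return np.asmatrix([H1, H2, H3, T])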

Works fine in my task.

Answer 4

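Computing [R | T] from the homography matrix is a bit more complicated than Jav_Rock's answer.

In OpenCV 3.0 there is a method called cv::decomposeHomographyMat that returns four possible solutions, one of which is correct. However, OpenCV does not provide a way to pick out the correct one.

I am working on this now and may post my code here later this month.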

Answer 5

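The plane that contains your square has a vanishing line in the camera image. The equation of this line is A*x + B*y + C = 0.

The normal of your plane is (A, B, C)!

Let p00, p01, p10, p11 be the coordinates of the points after applying the camera's intrinsic parameters, in homogeneous form, e.g. p00 = (x00, y00, 1).

The vanishing line can be computed as: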

down = p00 cross p01;
left = p00 cross p10;
right = p01 cross p11;
up = p10 cross p11;
v1 = left cross right;
v2 = up cross down;
vanish_line = v1 cross v2;

where cross is the standard vector cross product.
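The steps above can be sketched in NumPy as follows. The synthetic pose and the helper names are assumptions made for this check: the corners of a unit square are placed with a known rotation and translation, converted to normalized image coordinates (i.e. with the intrinsics already removed), and the resulting vanishing line is compared with the known plane normal, which is the third column of the rotation.

```python
import numpy as np

def vanishing_line(p00, p01, p10, p11):
    """Vanishing line of the plane through four homogeneous corner points."""
    down = np.cross(p00, p01)    # line through p00 and p01
    left = np.cross(p00, p10)    # line through p00 and p10
    right = np.cross(p01, p11)   # line through p01 and p11
    up = np.cross(p10, p11)      # line through p10 and p11
    v1 = np.cross(left, right)   # vanishing point of the first pair of edges
    v2 = np.cross(up, down)      # vanishing point of the second pair of edges
    line = np.cross(v1, v2)
    return line / np.linalg.norm(line)  # defined up to scale and sign

# Assumed ground-truth pose of the plane, for illustration only.
a, b = 0.4, -0.3
Rx = np.array([[1, 0, 0],
               [0, np.cos(a), -np.sin(a)],
               [0, np.sin(a), np.cos(a)]])
Ry = np.array([[np.cos(b), 0, np.sin(b)],
               [0, 1, 0],
               [-np.sin(b), 0, np.cos(b)]])
R = Rx @ Ry
t = np.array([-0.5, -0.5, 4.0])

def normalized_image_point(X):
    """Camera-frame point divided by its depth: (x, y, 1)."""
    c = R @ np.asarray(X, dtype=float) + t
    return c / c[2]

p00 = normalized_image_point([0, 0, 0])
p01 = normalized_image_point([0, 1, 0])
p10 = normalized_image_point([1, 0, 0])
p11 = normalized_image_point([1, 1, 0])

line = vanishing_line(p00, p01, p10, p11)
normal = R[:, 2]  # plane normal expressed in the camera frame
```

Because the line is only defined up to sign, the comparison with the normal should use the absolute value of the dot product.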

Answer 6

You can use this function. It works for me.

import numpy as np

def find_pose_from_homography(H, K):
    '''
    Estimate the camera pose from the homography matrix, given the intrinsics.

    :param H(np.array): size (3 x 3) homography matrix
    :param K(np.array): size (3 x 3) intrinsics of the camera
    :Return R: size (3 x 3) rotation matrix of the transformation (orthogonal matrix)
    :Return t: size (3 x 1) translation vector of the transformation
    '''
    # Remove the intrinsics so that the columns are [r1 r2 t] up to scale.
    # (The original snippet accepted K but never used it, which is only
    # correct if H has already been multiplied by K^-1.)
    H = np.linalg.inv(K) @ H

    # To disambiguate the two solutions (t and -t), multiply H by the sign of
    # the z-component of t, enforcing the constraint that points in front of
    # the camera have a positive z-component. This yields a unique rotation.
    H = H * np.sign(H[2, 2])

    h1, h2, h3 = H[:, 0].reshape(-1, 1), H[:, 1].reshape(-1, 1), H[:, 2].reshape(-1, 1)

    R_ = np.hstack((h1, h2, np.cross(h1, h2, axis=0)))

    # np.linalg.svd returns U, S and Vh (V already transposed).
    U, S, Vh = np.linalg.svd(R_)

    # Nearest rotation matrix; the last diagonal entry forces det(R) = +1.
    R = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vh)]) @ Vh

    t = (h3 / np.linalg.norm(h1)).reshape(-1, 1)

    return R, t

Answer 7

Here is a Python version, based on the one posted by Dmitriy Voloshyn, that normalizes the rotation matrix and transposes the result to 3x4.

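import numpy as np

def cameraPoseFromHomography(H):
    norm1 = np.linalg.norm(H[:, 0])
    norm2 = np.linalg.norm(H[:, 1])
    tnorm = (norm1 + norm2) / 2.0

    # Normalize the first two columns and build the third by cross product.
    H1 = H[:, 0] / norm1
    H2 = H[:, 1] / norm2
    H3 = np.cross(H1, H2)
    T = H[:, 2] / tnorm

    # Stack as rows, then transpose to get the 3x4 [R | t] layout.
    return np.array([H1, H2, H3, T]).transpose()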

Computing camera pose with homography matrix based on 4 coplanar points
