I want to build a panorama of the ground as seen by a downward-facing camera (at a fixed height, around 1 m above the floor). This could run to thousands of frames, so the Stitcher class's built-in panorama method isn't suitable: it's too slow and too memory-hungry.
Instead, I'm assuming the floor and the motion are planar (not unreasonable here) and trying to build up a cumulative homography as I see each frame. That is, for each frame, I calculate the homography from the previous frame to the new one. I then get the cumulative homography by multiplying it with the product of all the previous homographies.
Say I get H01 between frames 0 and 1, then H12 between frames 1 and 2. To get the transformation that places frame 2 on the mosaic, I need H01*H12. This continues as the number of frames grows, so I end up with H01*H12*H23*H34*H45*...
In code, this looks something like:
cv::Mat previous, current;
// Init cumulative homography
cv::Mat cumulative_homography = cv::Mat::eye(3, 3, CV_64F);
video_stream >> previous;
for(;;) {
video_stream >> current;
// Here I do some checking of the frame, etc
// Get the homography using my DenseMosaic class (using Farneback to get OF)
cv::Mat tmp_H = DenseMosaic::get_homography(previous,current);
// Now normalise the homography by its bottom right corner
tmp_H /= tmp_H.at<double>(2, 2);
cumulative_homography *= tmp_H;
previous = current.clone();
}
It works pretty well, except that as the camera moves "up" with respect to the viewpoint, the scale of the homography decreases. As it moves back down, the scale increases again. This gives my panorama a perspective-warp effect that I really don't want.
For example, this is taken over a few seconds of video, moving forward and then backwards. The first frame looks fine:
The problem appears as we move forward a few frames:
Then when we come back again, you can see the frame gets bigger again:
I'm at a loss as to where this comes from.
I'm using Farneback dense optical flow to calculate pixel-to-pixel correspondences as below (sparse feature matching doesn't work well on this data), and I've checked my flow vectors. They're generally very good, so it's not a tracking problem. I also tried switching the order of the inputs when finding the homography (in case I had the frames mixed up), but that was no better.
cv::calcOpticalFlowFarneback(grey_1, grey_2, flow_mat, 0.5, 6, 50, 5, 7, 1.5, flags);
// Using the flow_mat optical flow map, populate grid point correspondences between images
std::vector<cv::Point2f> points_1, points_2;
median_motion = DenseMosaic::dense_flow_to_corresp(flow_mat, points_1, points_2);
cv::Mat H = cv::findHomography(cv::Mat(points_2), cv::Mat(points_1), CV_RANSAC, 1);
The other thing I thought it might be is the translation I include in the transformation to make sure my panorama is centred within the scene:
cv::warpPerspective(init.clone(), warped, translation*homography, init.size());
but having checked the values in the homography before the translation is applied, the scaling issue I mention is still present.
Any tips gratefully received. There's a lot of code I could put in, but it all seems irrelevant; if anything seems to be missing, please let me know.
UPDATE
I've tried switching the *= operator for a full multiplication, and tried reversing the order in which the homographies are multiplied, but no luck. Below is my code for calculating the homography:
/**
\brief Calculates the homography between the current and previous frames
*/
cv::Mat DenseMosaic::get_homography()
{
cv::Mat grey_1, grey_2; // Grayscale versions of frames
cv::cvtColor(prev, grey_1, CV_BGR2GRAY);
cv::cvtColor(cur, grey_2, CV_BGR2GRAY);
// Calculate the dense flow
int flags = cv::OPTFLOW_FARNEBACK_GAUSSIAN;
if (frame_number > 2) {
flags = flags | cv::OPTFLOW_USE_INITIAL_FLOW;
}
cv::calcOpticalFlowFarneback(grey_1, grey_2, flow_mat, 0.5, 6, 50, 5, 7, 1.5, flags);
// Convert the flow map to point correspondences
std::vector<cv::Point2f> points_1, points_2;
median_motion = DenseMosaic::dense_flow_to_corresp(flow_mat, points_1, points_2);
// Use the correspondences to get the homography
cv::Mat H = cv::findHomography(cv::Mat(points_2), cv::Mat(points_1), CV_RANSAC, 1);
return H;
}
Here is the function I use to find the correspondences from the flow map:
/**
\brief Calculate pixel->pixel correspondences given a map of the optical flow across the image
\param[in] flow_mat Map of the optical flow across the image
\param[out] points_1 The set of grid points from #prev
\param[out] points_2 The corresponding points from #cur
\param[in] step_size The size of spaces between the grid lines
\return The median motion as a point
Uses a dense flow map (such as that created by cv::calcOpticalFlowFarneback) to obtain a set of point correspondences across a grid.
*/
cv::Point2f DenseMosaic::dense_flow_to_corresp(const cv::Mat &flow_mat, std::vector<cv::Point2f> &points_1, std::vector<cv::Point2f> &points_2, int step_size)
{
std::vector<double> tx, ty;
for (int y = 0; y < flow_mat.rows; y += step_size) {
for (int x = 0; x < flow_mat.cols; x += step_size) {
/* Flow is basically the delta between left and right points */
cv::Point2f flow = flow_mat.at<cv::Point2f>(y, x);
tx.push_back(flow.x);
ty.push_back(flow.y);
/* There's no need to calculate for every single point,
if there's not much change, just ignore it
*/
if (fabs(flow.x) < 0.1 && fabs(flow.y) < 0.1)
continue;
points_1.push_back(cv::Point2f(x, y));
points_2.push_back(cv::Point2f(x + flow.x, y + flow.y));
}
}
// I know this should be median, not mean, but it's only used for plotting the
// general motion direction so it's unimportant.
cv::Point2f t_median;
cv::Scalar mtx = cv::mean(tx);
t_median.x = mtx[0];
cv::Scalar mty = cv::mean(ty);
t_median.y = mty[0];
return t_median;
}
Answer 0 (score: 1)
It turns out this was because my viewpoint was close to the features, meaning the non-planarity of the tracked surface caused the homography to be skewed. I managed to prevent this (it's more of a hack than a method...) by using estimateRigidTransform instead of findHomography, as this does not estimate perspective variation.
In this particular case it makes sense to do so, as the view will only ever undergo rigid transformations.