OK, I must confess that I'm new to OpenCV, and that my MATLAB / linear algebra background may be introducing some bias. But what I want to do is really simple, yet I still couldn't find an answer.
When trying to rectify an image (or part of an image) under a perspective transformation, you basically perform two steps (assuming you have the 4 points that define the distorted object):
1. Find the transformation between some ideal rectangle and the distorted shape, using findHomography() or getPerspectiveTransform() (why these two behave differently on the same points is another story, and also frustrating); this gives us a matrix T.
2. Apply the transformation to the image to undo the distortion; in OpenCV this is done with warpPerspective().
Now, this last function (warpPerspective()) requires the user to specify the size of the destination image.
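(For reference, a minimal sketch of those two calls; the corner coordinates, the output size, and the names srcQuad/dstQuad are made up for illustration and are not from the original post.)
cv::Mat src = cv::imread("input.jpg");                       // hypothetical input image
std::vector<cv::Point2f> srcQuad = { {57.f, 32.f}, {410.f, 11.f}, {455.f, 300.f}, {34.f, 330.f} };  // distorted object
std::vector<cv::Point2f> dstQuad = { {0.f, 0.f}, {400.f, 0.f}, {400.f, 300.f}, {0.f, 300.f} };      // target rectangle
cv::Mat T = cv::getPerspectiveTransform(srcQuad, dstQuad);   // step 1: the 3x3 matrix
cv::Mat rectified;
cv::warpPerspective(src, rectified, T, cv::Size(400, 300));  // step 2: a destination size must be given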
My question is: how is the user supposed to know this size beforehand? The low-level way of doing it is simply to apply the transformation T to the corner points of the image in which the object was found, which guarantees that the newly transformed shape does not fall out of bounds. However, even if you take the matrix out of T and apply it to those points manually, the result looks strange.
Is there a way to do this within OpenCV? Thanks!
P.S. Here is some code:
float leftX, lowerY, rightX, higherY;
float minX = std::numeric_limits<float>::max(), maxX = std::numeric_limits<float>::min(),
      minY = std::numeric_limits<float>::max(), maxY = std::numeric_limits<float>::min();
Mat value, pt;
for (int i = 0; i < 4; i++)
{
    switch (i)
    {
    case 0:
        pt = (Mat_<float>(3, 1) << 1.00, 1.00, 1.00);
        break;
    case 1:
        pt = (Mat_<float>(3, 1) << srcIm.cols, 1.00, 1.00);
        break;
    case 2:
        pt = (Mat_<float>(3, 1) << 1.00, srcIm.rows, 1.00);
        break;
    case 3:
        pt = (Mat_<float>(3, 1) << srcIm.cols, srcIm.rows, 1.00);
        break;
    default:
        cerr << "Wrong switch." << endl;
        break;
    }
    value = invH * pt;
    value /= value.at<float>(2);
    minX = min(minX, value.at<float>(0));
    maxX = max(maxX, value.at<float>(0));
    minY = min(minY, value.at<float>(1));
    maxY = max(maxY, value.at<float>(1));
}
leftX = std::min<float>(1.00, -minX);
lowerY = std::min<float>(1.00, -minY);
rightX = max(srcIm.cols - minX, maxX - minX);
higherY = max(srcIm.rows - minY, maxY - minY);
warpPerspective(srcIm, dstIm, H, Size(rightX - leftX, higherY - lowerY), cv::INTER_CUBIC);
Update: Maybe my results don't look right because the matrix I'm using is wrong. Since I cannot observe what is happening inside getPerspectiveTransform(), I don't know how this matrix is computed, but it contains some very small and some very large values, which makes me think they are garbage.
This is how I obtain the data from T:
for (int row = 0; row < 3; row++)
    for (int col = 0; col < 3; col++)
        T.at<float>(row, col) = ((float*)(H.data + (size_t)H.step * row))[col];
(Although the output matrix of getPerspectiveTransform() is 3x3, trying to access its values directly via T.at<float>(row, col) results in a segmentation fault.)
Is this the right way to do it? Maybe this is where the original problem comes from, since I'm not getting the correct matrix...
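(For comparison, a sketch of the same copy assuming H holds the CV_64F/double elements that getPerspectiveTransform() actually returns, rather than float:)
// getPerspectiveTransform() returns a 3x3 CV_64F (double) matrix, so read doubles:
Mat T(3, 3, CV_32F);
for (int row = 0; row < 3; row++)
    for (int col = 0; col < 3; col++)
        T.at<float>(row, col) = (float)H.at<double>(row, col);
// or, more simply: H.convertTo(T, CV_32F);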
Answer 0 (score: 4)
If you know the size of your image before calling warpPerspective, you can take the coordinates of its four corners and transform them with perspectiveTransform to see where they end up after the transformation. Presumably, they will no longer form a nice rectangle, so you will probably want to compute the minimums and maximums to obtain a bounding box. The size of this bounding box is then the destination size you want. (Also, don't forget to translate the box as needed if any of the corners fall below zero.) Here is a Python example that uses warpPerspective to blit a transformed version of an image on top of itself.
from typing import Tuple

import cv2
import numpy as np
import math


# Input: a source image and perspective transform
# Output: a warped image and 2 translation terms
def perspective_warp(image: np.ndarray, transform: np.ndarray) -> Tuple[np.ndarray, int, int]:
    h, w = image.shape[:2]
    corners_bef = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
    corners_aft = cv2.perspectiveTransform(corners_bef, transform)
    xmin = math.floor(corners_aft[:, 0, 0].min())
    ymin = math.floor(corners_aft[:, 0, 1].min())
    xmax = math.ceil(corners_aft[:, 0, 0].max())
    ymax = math.ceil(corners_aft[:, 0, 1].max())
    x_adj = math.floor(xmin - corners_aft[0, 0, 0])
    y_adj = math.floor(ymin - corners_aft[0, 0, 1])
    translate = np.eye(3)
    translate[0, 2] = -xmin
    translate[1, 2] = -ymin
    corrected_transform = np.matmul(translate, transform)
    return cv2.warpPerspective(image, corrected_transform, (math.ceil(xmax - xmin), math.ceil(ymax - ymin))), x_adj, y_adj


# Just like perspective_warp, but it also returns an alpha mask that can be used for blitting
def perspective_warp_with_mask(image: np.ndarray, transform: np.ndarray) -> Tuple[np.ndarray, np.ndarray, int, int]:
    mask_in = np.empty(image.shape, dtype=np.uint8)
    mask_in.fill(255)
    output, x_adj, y_adj = perspective_warp(image, transform)
    mask, _, _ = perspective_warp(mask_in, transform)
    return output, mask, x_adj, y_adj


# alpha_blits src onto dest according to the alpha values in mask at location (x, y),
# ignoring any parts that do not overlap
def alpha_blit(dest: np.ndarray, src: np.ndarray, mask: np.ndarray, x: int, y: int) -> None:
    dl = max(x, 0)
    dt = max(y, 0)
    sl = max(-x, 0)
    st = max(-y, 0)
    sr = max(sl, min(src.shape[1], dest.shape[1] - x))
    sb = max(st, min(src.shape[0], dest.shape[0] - y))
    dr = dl + sr - sl
    db = dt + sb - st
    m = mask[st:sb, sl:sr]
    dest[dt:db, dl:dr] = (dest[dt:db, dl:dr].astype(np.float64) * (255 - m) + src[st:sb, sl:sr].astype(np.float64) * m) / 255


# blits a perspective-warped src image onto dest
def perspective_blit(dest: np.ndarray, src: np.ndarray, transform: np.ndarray) -> None:
    blitme, mask, x_adj, y_adj = perspective_warp_with_mask(src, transform)
    cv2.imwrite("blitme.png", blitme)
    alpha_blit(dest, blitme, mask, int(transform[0, 2] + x_adj), int(transform[1, 2] + y_adj))


# Read an input image
image: np.ndarray = cv2.imread('input.jpg')

# Make a perspective transform
h, w = image.shape[:2]
corners_in = np.float32([[[0, 0]], [[w, 0]], [[w, h]], [[0, h]]])
corners_out = np.float32([[[100, 100]], [[300, -100]], [[500, 300]], [[-50, 500]]])
transform = cv2.getPerspectiveTransform(corners_in, corners_out)

# Blit the warped image on top of the original
perspective_blit(image, image, transform)
cv2.imwrite('output.jpg', image)
Example result:
Answer 1 (score: 0)
If the result looks strange, it's probably because your points are not set correctly in getPerspectiveTransform. The point vectors need to be in the right order (top-left, top-right, bottom-right, bottom-left).
But to answer your initial question, there is no such thing as a "best output size". You have to decide depending on what you want to do; try to find a size that works for you.
Edit:
If the problem comes from the transformation matrix, how do you create it? A good way of doing this in OpenCV is:
// Corners of the distorted object in the source image (topleft etc. defined elsewhere)
vector<Point2f> corners;
corners.push_back(topleft);
corners.push_back(topright);
corners.push_back(bottomright);
corners.push_back(bottomleft);
// Corners of the destination image
// output is the output image, should be defined before this operation
vector<cv::Point2f> output_corner;
output_corner.push_back(cv::Point2f(0, 0));
output_corner.push_back(cv::Point2f(output.cols, 0));
output_corner.push_back(cv::Point2f(output.cols, output.rows));
output_corner.push_back(cv::Point2f(0, output.rows));
// Get transformation matrix
Mat H = getPerspectiveTransform(corners, output_corner);
Answer 2 (score: 0)
Only half a decade late!... I'm going to answer your questions one at a time:
"My question is how the users should know beforehand what that size would be"
You are actually just missing a step. I also recommend using perspectiveTransform, just for the sake of ease, over calculating the minimum and maximum X's and Y's yourself.
So once you have calculated the minimum X and Y, recognize that those can be negative. If they are negative, it means your image will be cropped. To fix this, you create a translation matrix and then correct your original homography:
Mat translate = Mat::eye(3, 3, CV_64F);   // same element type (double) as the homography H
translate.at<double>(0, 2) = -minX;
translate.at<double>(1, 2) = -minY;
Mat corrected_H = translate * H;
Then the calculation for the destination size is just:
Size(maxX - minX, maxY - minY)
although also note that you'll want to convert minX, maxX, minY, and maxY to integers.
"As I cannot observe what's happening inside getPerspectiveTransform(), I cannot know how this matrix is computed"
https://github.com/opencv/opencv
That's the source code for OpenCV. You can definitely observe what's happening inside getPerspectiveTransform.
Also this: https://docs.opencv.org/2.4/modules/imgproc/doc/geometric_transformations.html
The documentation for getPerspectiveTransform doesn't say much about what it's doing, but the documentation for findHomography does. I'm pretty sure getPerspectiveTransform is just the simple case where you have exactly the minimum number of points required to solve for the 8 parameters (4 pairs of points, i.e. the corners).
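Putting those pieces together, here is a minimal C++ sketch of the recipe described above (an illustrative sketch rather than code from the post; srcIm is assumed to be the source image and H the 3x3 CV_64F homography returned by getPerspectiveTransform or findHomography):
// Transform the four source corners to see where they land
std::vector<cv::Point2f> corners = {
    {0.f, 0.f},
    {(float)srcIm.cols, 0.f},
    {(float)srcIm.cols, (float)srcIm.rows},
    {0.f, (float)srcIm.rows}
};
std::vector<cv::Point2f> warped;
cv::perspectiveTransform(corners, warped, H);

// Bounding box of the transformed corners
float minX = warped[0].x, maxX = warped[0].x;
float minY = warped[0].y, maxY = warped[0].y;
for (const cv::Point2f& p : warped) {
    minX = std::min(minX, p.x); maxX = std::max(maxX, p.x);
    minY = std::min(minY, p.y); maxY = std::max(maxY, p.y);
}

// Shift the homography so nothing falls below (0, 0)
cv::Mat translate = cv::Mat::eye(3, 3, CV_64F);
translate.at<double>(0, 2) = -minX;
translate.at<double>(1, 2) = -minY;
cv::Mat corrected_H = translate * H;

// Warp into a destination exactly the size of the bounding box
cv::Mat dstIm;
cv::warpPerspective(srcIm, dstIm, corrected_H,
                    cv::Size((int)std::ceil(maxX - minX), (int)std::ceil(maxY - minY)),
                    cv::INTER_CUBIC);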