Question

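I am trying to implement a 2D phase correlation algorithm in R, using the recipe from Wikipedia (http://en.wikipedia.org/wiki/Phase_correlation), in order to track the movement between two images. These images (frames) were captured with a camera shaking in the wind, and the ultimate goal is to remove the shake from these and subsequent frames. The two example images and the R code are below: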

frame 1 frame 2

## we will need the tiff library 
library(tiff)

## read in the tiff files 
f1=as.matrix(readTIFF('f1.tiff',native=TRUE))
f2=as.matrix(readTIFF('f2.tiff',native=TRUE))

## take the fft of the first  frame
F1 <- fft(f1)
## take the Conjugate fft of the second frame
F2.c <- Conj(fft(f2))

## calculate the cross power spectrum according to the wiki article
R <- (F1*F2.c)/abs(F1*F2.c)
## take the inverse fft of R
r <- fft(R,inv=TRUE)/length(R)
## because the zero valued imaginary numbers are not needed
r <- Re(r)

## show the normalized cross-correlation
image(r)

## find the max in the cross correlation matrix, or the phase shift -
## between the two images
shift <- which(r==max(r),arr.ind=TRUE)

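From my understanding, the vector shift should contain the translational shift (dx and dy) that best aligns these two images. However, the shift variable gives dx = 1 and dy = 1, which I assume indicates no shift in either the x or the y direction. The same thing happens for subsequent frames where there are visible shifts of several pixels in both the x and y directions.

Does anyone see an error in my code or formulas? Or do I need to try something fancier, like filtering the images first, before doing the phase correlation?

Cheers gals and guys!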

Answer 1

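From what I know about phase correlation, the code looks correct. If I understand what you want correctly, you are trying to use phase correlation to determine the offset between two images, given that their homography is nothing more than a horizontal and vertical offset. The fact that you only get a shift at the origin is most likely because your images lack sufficient high-frequency information to determine a good shift.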
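To make the "pure translation" assumption concrete, here is a short derivation using the symbols from the question's code (F1, F2, R, r). It is only a sketch: it assumes f_2 is f_1 circularly displaced by (Δx, Δy) on an M×N grid, which real camera frames satisfy only approximately.

```latex
\begin{aligned}
f_2(x,y) &= f_1(x-\Delta x,\; y-\Delta y) \\
F_2(u,v) &= F_1(u,v)\, e^{-2\pi i\left(\frac{u\,\Delta x}{M}+\frac{v\,\Delta y}{N}\right)} \\
R(u,v) &= \frac{F_1\,\overline{F_2}}{\lvert F_1\,\overline{F_2}\rvert}
        = e^{+2\pi i\left(\frac{u\,\Delta x}{M}+\frac{v\,\Delta y}{N}\right)} \\
r(x,y) &= \mathcal{F}^{-1}\{R\}(x,y) = \delta(x+\Delta x,\; y+\Delta y)
\end{aligned}
```

With this sign convention the peak sits at (-Δx, -Δy) modulo the image size, so a peak at element [1,1] of r corresponds to zero displacement.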

Try these two images instead (they are from the Wikipedia article you referenced, but I extracted them and saved them as individual images):

Image #1 Image #2

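When I run your R code on these two images, I get the phase correlation map shown below. Note that these images are actually saved as .png, so I had to change the library to library(png) and use readPNG instead of readTIFF. Keep that in mind when you try to run your code with the example images above.

[Image: phase correlation map]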

Also, the location where the maximum peak occurred was:

> shift
     row col
[1,] 132 153

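This tells us that the image shifted over by 132 rows and 153 columns. Take note that this is with respect to the centre of the image. If you want to determine the actual offset, you need to subtract half the number of rows from the vertical coordinate and half the number of columns from the horizontal coordinate.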
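The index arithmetic can be sketched as follows. peak_to_offset is a hypothetical helper, not part of the question's code, and the 256 x 256 size in the usage lines is made up for illustration. It covers two layouts: a correlation matrix taken straight from fft (element [1,1] means zero offset and large indices wrap around to negative offsets), and one that has been rearranged MATLAB-style so that zero offset sits at floor(n/2) + 1. Whether a positive offset means the second image moved down/right or up/left depends on which spectrum was conjugated.

```r
## Hypothetical helper: convert a 1-based peak index (row, col), as
## returned by which(..., arr.ind = TRUE), into a signed pixel offset.
peak_to_offset <- function(peak, dims, centered = FALSE) {
  peak <- as.integer(peak)
  if (centered) {
    ## assumption: zero offset sits at floor(n/2) + 1 (fftshift layout)
    return(peak - (floor(dims / 2) + 1))
  }
  ## raw FFT layout: index 1 is zero offset; indices past the midpoint
  ## wrap around and represent negative offsets
  off <- peak - 1
  ifelse(off > dims / 2, off - dims, off)
}

## usage with made-up numbers for a 256 x 256 correlation matrix
peak_to_offset(c(3, 255), dims = c(256, 256))                    # 2 -2
peak_to_offset(c(129, 140), dims = c(256, 256), centered = TRUE) # 0 11
```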

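Therefore, the code works totally fine... it's just that your images lack sufficient high-frequency information for phase correlation to work. What the correlation does in this case is look for "similar" variations between the two images. If each image contains a lot of variation, and those variations are very similar between the images, phase correlation works well. If there is not much variation, phase correlation won't work.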

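Why is that the case? The basis of phase correlation is the assumption that the image is corrupted with Gaussian white noise. If we correlate white noise with itself (from one image to the other), we get a very nice, high peak at the offset or shift, and almost zero everywhere else. Because your images lack a lot of high-frequency information, and because they are clean, phase correlation doesn't actually work on them. What some people therefore suggest is to pre-whiten your images so that they contain white noise, which gives you the nice peak at the offset that we are talking about.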
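A small synthetic check of this claim, as a sketch: a is a made-up white-noise matrix and b is a circularly displaced by a known amount (5 rows down, 9 columns right). The correlation formulas are the ones from the question; everything else is illustrative.

```r
set.seed(1)                        # reproducible synthetic data
n <- 64
a <- matrix(rnorm(n * n), n, n)    # synthetic white-noise "image"

## b is a moved circularly: 5 rows down and 9 columns to the right
dr <- 5; dc <- 9
b <- a[((seq_len(n) - 1 - dr) %% n) + 1,
       ((seq_len(n) - 1 - dc) %% n) + 1]

## same formulas as in the question
R <- fft(a) * Conj(fft(b))
R <- R / Mod(R)
r <- Re(fft(R, inverse = TRUE)) / length(R)

## a single sharp peak; with F1 * Conj(F2) it sits at minus the
## displacement modulo n, i.e. row n - dr + 1 and column n - dc + 1
peak <- which(r == max(r), arr.ind = TRUE)
peak                               # row 60, col 56
```

Because the displacement here is exactly circular, r is essentially 1 at the peak and 0 elsewhere. Real frames are not circularly shifted, so their peak is lower and surrounded by clutter.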

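However, to make sure that you eliminate any false maxima, it is also a good idea to smooth the cross-correlation matrix (r in your R code), so that there is a high probability of having only one true maximum. Applying a Gaussian filter in the frequency / FFT domain should work fine.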
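One way this smoothing could look, as a minimal sketch: smooth_gaussian_fft is a hypothetical helper and sigma = 2 is an arbitrary choice. The kernel is built from wrapped distances to element [1,1], so the circular convolution does not move the peak.

```r
## Hypothetical helper: circular Gaussian smoothing of a real matrix,
## computed as a product in the FFT domain.
smooth_gaussian_fft <- function(m, sigma = 2) {
  nr <- nrow(m); nc <- ncol(m)
  ## wrapped distance of every row/column from index 1, so the kernel
  ## is centred on the FFT origin and peaks stay where they are
  d.row <- pmin(seq_len(nr) - 1, nr - seq_len(nr) + 1)
  d.col <- pmin(seq_len(nc) - 1, nc - seq_len(nc) + 1)
  k <- exp(-outer(d.row^2, d.col^2, "+") / (2 * sigma^2))
  k <- k / sum(k)                  # unit gain: the sum of m is preserved
  Re(fft(fft(m) * fft(k), inverse = TRUE)) / length(m)
}

## usage: a lone spike stays at [10, 20] but is spread into a blob
m <- matrix(0, 32, 32)
m[10, 20] <- 1
s <- smooth_gaussian_fft(m, sigma = 2)
which(s == max(s), arr.ind = TRUE)
```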

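In any case, I don't see much variation in your images, so the takeaway is that you have to make sure your images contain a lot of high-frequency information for this to work!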

Answer 2

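Below is a qualitative description of the routine, followed by the R code to efficiently and robustly find the phase correlation between the two images posted in the original question. Thanks to @rayreng for the advice and for pointing me in the right direction!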

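1. Read in both images
2. Add noise to the second image
3. Transform both images into the frequency domain using fft
4. Combine the transformed images with an element-wise multiplication (the first spectrum times the conjugate of the second, normalized by its magnitude)
5. Return to the spatial domain with an inverse fft. This is your normalized cross-correlation
6. Rearrange the normalized cross-correlation matrix so that the zero frequency is in the middle of the matrix (similar to fftshift in MATLAB)
7. Use a 2D Gaussian distribution to smooth the normalized cross-correlation matrix
8. Determine the shift by finding the index of the maximum value of the smoothed normalized cross-correlation matrix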

Note that this routine also uses a custom 2D Gaussian generator (see below) and a custom function similar to MATLAB's fftshift.

### R CODE ###
rm(list=ls())
library(tiff)

## read in the tiff images
f1 <- readTIFF('f1.tiff',native=TRUE)
f1 <- matrix(f1,ncol=ncol(f1),nrow=nrow(f1))

## take the fft of f1
F1 <- fft(f1)

## what is the subsequent frame?
f2 <- readTIFF('f2.tiff',native=TRUE)
f2 <- matrix(f2,ncol=ncol(f2),nrow=nrow(f2))

## create a vector of uniform random noise the same length as f2
noise.b <- runif(length(f2),min(f2),max(f2))
## add the noise to f2
f2 <- noise.b+f2

## take the fft and conjugate of the f2
F2.c <- Conj(fft(f2))

## calculate the cross-power spectrum
R <- (F1*F2.c)/abs(F1*F2.c)
## calculate the normalized cross-correlation with fft inverse
r <- fft(R,inv=TRUE)/length(R)
## rearrange the r matrix so that zero frequency component is in the -
## middle of the matrix.  This is similar to the function - 
## fftshift in matlab 

fftshift <- function(x) {
    ## is.matrix() is used because class(x) has length 2 for a matrix
    ## in R >= 4.0, which breaks the test class(x)=='matrix'
    if(!is.matrix(x)) {
        stop('sorry, this class of input x is not supported yet')
    }
    rd2 <- floor(nrow(x)/2)
    cd2 <- floor(ncol(x)/2)

    ## Identify the four quadrants: q1 top-left, q2 top-right,
    ## q3 bottom-right, q4 bottom-left
    q1 <- x[1:rd2,1:cd2,drop=FALSE]
    q2 <- x[1:rd2,(cd2+1):ncol(x),drop=FALSE]
    q3 <- x[(rd2+1):nrow(x),(cd2+1):ncol(x),drop=FALSE]
    q4 <- x[(rd2+1):nrow(x),1:cd2,drop=FALSE]

    ## swap the quadrants diagonally (q1 <-> q3, q2 <-> q4)
    centered.l <- rbind(q3,q2)
    centered.r <- rbind(q4,q1)
    centered <- cbind(centered.l,centered.r)

    return(Re(centered))
}

## now use the defined function fftshift on the matrix r
r <- fftshift(r)
r <- Re(r)

## try and smooth the matrix r to find the peak!
## first build a 2d gaussian matrix
sig <- 5 ## define a sigma

## row (ys) and column (xs) offsets measured from the centre
## element, floor(n/2)+1, of the r matrix
ys <- seq_len(nrow(r)) - (floor(nrow(r)/2) + 1)
xs <- seq_len(ncol(r)) - (floor(ncol(r)/2) + 1)

## apply the gaussian blur formula
## (http://en.wikipedia.org/wiki/Gaussian_blur)
## to every row/column offset pair; outer() returns a matrix
## with the same dimensions as r
gaus <- 1/(2*pi*sig^2) * exp(-outer(ys^2,xs^2,'+')/(2*sig^2))

## now convolve the gaus with r in the frequency domain
r.filt <- fft((fft(r)*fft(gaus)),inv=TRUE)/length(r)
r.filt <- fftshift(Re(r.filt))

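## find dx and dy with the peak in r
min.err <- which(r.filt==max(r.filt),arr.ind=TRUE)

## how did the image move from the previous?
## zero displacement lands on element ceiling(n/2)+1 after the
## fftshift above (equal to (n+3)/2 when n is odd)
shift <- ceiling(dim(f1)/2)+1-min.err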

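The vector shift indicates that the image was shifted in the positive x direction and the negative y direction. In other words, the second image (f2) moved roughly up and to the right. The values in shift will change slightly with each run because of the noise added to the second image and the smoothing of the r matrix by the Gaussian filter. I have noticed that a phase correlation like the one outlined above works better on larger images/matrices. To see the results of the algorithm above, visit the stabilized video at https://www.youtube.com/watch?v=irDFk2kbKaE.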
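Once a shift has been estimated, the frame still has to be moved back to remove the shake. A minimal sketch of that last step, assuming whole-pixel shifts and wrap-around at the edges: circ_shift is a hypothetical helper, not part of the code above, and the sign to pass depends on the convention used when computing shift.

```r
## Hypothetical helper: move matrix content circularly by whole pixels.
## Positive dr moves content towards higher row indices, positive dc
## towards higher column indices; content wraps around the edges.
circ_shift <- function(m, dr, dc) {
  rows <- ((seq_len(nrow(m)) - 1 - dr) %% nrow(m)) + 1
  cols <- ((seq_len(ncol(m)) - 1 - dc) %% ncol(m)) + 1
  m[rows, cols, drop = FALSE]
}

## usage on a toy matrix: displace it, then undo the displacement
m <- matrix(1:12, nrow = 3, ncol = 4)
moved <- circ_shift(m, 1, -2)
restored <- circ_shift(moved, -1, 2)
identical(restored, m)
```

If shift holds the (row, column) displacement of f2 relative to f1, then something like circ_shift(f2, -round(shift[1]), -round(shift[2])) would realign f2; flip the signs if the convention is the other way round. For real video the wrapped border pixels would normally be cropped or padded rather than kept.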

Performing a phase correlation with fft in R