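Given some generated two-dimensional toy data, I have been trying to fit the amplitude, frequency, and phase of a sine curve. (Code at the end.)
To get estimates of the three parameters, I first perform an FFT. I use the values from the FFT as initial guesses for the actual frequency and phase, and then fit for them (row by row). I wrote my code so that I can input which FFT bin I want the frequency to fall in, so I can check whether the fit is working. But there is some very strange behavior. If my input bin is, say, 3.1 (a non-integer bin, so the FFT won't give me the correct frequency), the fit works very well. But if the input bin is 3 (so the FFT outputs the exact frequency), my fit fails, and I am trying to understand why.
Here is the output when the input bins (in the X and Y directions) are 3.0 and 2.1 respectively:
(The plot on the right is data minus fit.)
Here is the output when the input bins are 3.0 and 2.0:
Question: why does the nonlinear fit fail when I input the exact frequency of the curve?
Code: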
#! /usr/bin/python
# For the purposes of this code, it's easier to think of the X-Y axes as transposed,
# so the X axis is vertical and the Y axis is horizontal
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as optimize
import sys
PI = np.pi
# Function which accepts parameters to define a sine curve
# Used for the non-linear fit
def sineFit(t, a, f, p):
    return a * np.sin(2.0 * PI * f*t + p)
xSize = 18
ySize = 60
npt = xSize * ySize
# Get frequency bin from user input
xFreq = float(sys.argv[1])
yFreq = float(sys.argv[2])
xPeriod = xSize/xFreq
yPeriod = ySize/yFreq
# arrays should be defined here
# Generate the 2D sine curve
for jj in range(0, xSize):
    for ii in range(0, ySize):
        sineGen[jj, ii] = np.cos(2.0*PI*(ii/xPeriod + jj/yPeriod))
# Compute 2dim FFT as well as freq bins along each axis
fftData = np.fft.fft2(sineGen)
fftMean = np.mean(fftData)
fftRMS = np.std(fftData)
xFreqArr = np.fft.fftfreq(fftData.shape[1])  # Frequency bins along x
yFreqArr = np.fft.fftfreq(fftData.shape[0])  # Frequency bins along y
# Find peak of FFT, and position of peak
maxVal = np.amax(np.abs(fftData))
maxPos = np.where(np.abs(fftData) == maxVal)
# Iterate through peaks in the FFT
# For this example, number of loops will always be only one
prevPhase = -1000
for col, row in zip(maxPos[0], maxPos[1]):
    # Initial guesses for fit parameters from FFT
    init_phase = np.angle(fftData[col,row])
    init_amp = 2.0 * maxVal/npt
    init_freqY = yFreqArr[col]
    init_freqX = xFreqArr[row]
    cntr = 0
    if prevPhase == -1000:
        prevPhase = init_phase
    guess = [init_amp, init_freqX, prevPhase]
    # Fit each row of the 2D sine curve independently
    for rr in sineGen:
        (amp, freq, phs), pcov = optimize.curve_fit(sineFit, xDat, rr, guess)
        # xDat is a linspace array, containing a list of numbers from 0 to xSize-1
        # Subtract fit from original data and plot
        fitData = sineFit(xDat, amp, freq, phs)
        sub1 = rr - fitData
        # Plot
        fig1 = plt.figure()
        ax1 = fig1.add_subplot(121)
        p1, = ax1.plot(rr, 'g')
        p2, = ax1.plot(fitData, 'b')
        plt.legend([p1,p2], ["data", "fit"])
        ax2 = fig1.add_subplot(122)
        p3, = ax2.plot(sub1)
        plt.legend([p3], ['residual1'])
        fig1.tight_layout()
        plt.show()
        cntr += 1
        prevPhase = phs  # Update guess for phase of sine curve
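Answer 0 (score: 3)
I have tried to distill the important parts of your question into this answer.
While you are on the right track with the FFT idea, I don't think your implementation is quite right. The code below should be a good toy system. It generates random data of the type f(x) = a0*sin(a1*x+a2). Sometimes a random initial guess works, and sometimes it fails. However, when the FFT is used to guess the frequency, the fit should always converge for this system.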
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as optimize
# This is your target function; params is the sequence (a, f, p)
def sineFit(t, params):
    a, f, p = params
    return a * np.sin(2.0*np.pi*f*t + p)
# This is our "error" function: the sum of squared residuals
def err_func(p0, X, Y, target_function):
    err = ((Y - target_function(X, p0))**2).sum()
    return err
# Try out different parameters, sometimes the random guess works
# sometimes it fails. The FFT solution should always work for this problem
initial_args = np.random.random(3)
X = np.linspace(0, 10, 1000)
Y = sineFit(X, initial_args)
# Use a random initial guess
initial_guess = np.random.random(3)
# Fit
sol = optimize.fmin(err_func, initial_guess, args=(X,Y,sineFit))
# Plot the fit
Y2 = sineFit(X, sol)
plt.figure(figsize=(15,10))
plt.subplot(211)
plt.title("Random Initial Guess: Final Parameters: %s"%sol)
plt.plot(X,Y)
plt.plot(X,Y2,'r',alpha=.5,lw=10)
# Use an improved "fft" guess for the frequency
# this will be the max (in magnitude) in k-space
timestep = X[1]-X[0]
guess_k = np.argmax(np.abs(np.fft.rfft(Y)))
guess_f = np.fft.rfftfreq(X.size, timestep)[guess_k]
initial_guess[1] = guess_f
# Guess the amplitude by taking the max of the absolute values
initial_guess[0] = np.abs(Y).max()
sol = optimize.fmin(err_func, initial_guess, args=(X,Y,sineFit))
Y2 = sineFit(X, sol)
plt.subplot(212)
plt.title("FFT Guess : Final Parameters: %s"%sol)
plt.plot(X,Y)
plt.plot(X,Y2,'r',alpha=.5,lw=10)
plt.show()
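Answer 1 (score: 1)
The problem is caused by a wrong initial guess for the phase, not for the frequency. When looping over the rows of sineGen (the inner loop), you use the fit result of the previous row as the initial guess for the next row, which does not always work. If you determine the phase from the FFT of the current row and use that as the initial guess, the fit will succeed. You can change the inner loop as follows: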
for n, rr in enumerate(sineGen):
    fftx = np.fft.fft(rr)
    fftx = fftx[:len(fftx)//2]  # keep the non-negative frequencies only
    idx = np.argmax(np.abs(fftx))
    init_phase = np.angle(fftx[idx])
    print(fftx[idx], init_phase)
    ...
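You also need to change
def sineFit(t, a, f, p):
    return a * np.sin(2.0 * np.pi * f*t + p)
to
def sineFit(t, a, f, p):
    return a * np.cos(2.0 * np.pi * f*t + p)
because phase = 0 means the imaginary part of the FFT is zero, and hence the function is a cosine.
By the way, your example above is still missing the definitions of sineGen and xDat.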
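The per-row phase guess described in this answer can be sketched end to end as follows. This is only an illustration under stated assumptions: `sine_gen`, `x_dat`, `col_freq`, and `row_freq` are hypothetical stand-ins for the arrays missing from the question, chosen so that the frequency along each row falls exactly on FFT bin 3.

```python
import numpy as np
from scipy.optimize import curve_fit

def cosine_model(t, a, f, p):
    # Cosine form, so that a phase of 0 matches a purely real FFT peak
    return a * np.cos(2.0 * np.pi * f * t + p)

# Hypothetical stand-ins for the arrays that the question does not define
n_rows, n_cols = 18, 60
col_freq = 3.0 / n_cols   # cycles per sample along a row: exactly FFT bin 3
row_freq = 2.0 / n_rows   # phase advance (in cycles) from one row to the next
x_dat = np.arange(n_cols, dtype=float)
rows = np.arange(n_rows, dtype=float)[:, None]
sine_gen = np.cos(2.0 * np.pi * (col_freq * x_dat[None, :] + row_freq * rows))

freqs = np.fft.fftfreq(n_cols)
max_residual = 0.0
for rr in sine_gen:
    half = np.fft.fft(rr)[: n_cols // 2]   # non-negative frequencies only
    idx = np.argmax(np.abs(half))
    # Amplitude, frequency and phase guesses all come from this row's own FFT
    guess = [2.0 * np.abs(half[idx]) / n_cols, freqs[idx], np.angle(half[idx])]
    popt, _ = curve_fit(cosine_model, x_dat, rr, p0=guess)
    residual = np.max(np.abs(rr - cosine_model(x_dat, *popt)))
    max_residual = max(max_residual, residual)

print("largest residual over all rows:", max_residual)
```

Because each row gets its own phase guess, no row depends on the fitted phase of the row before it, which is the dependency this answer identifies as the source of the failure.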
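Answer 2 (score: 0)
Without understanding most of your code, according to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html:
(amp2, freq2, phs2), pcov = optimize.curve_fit(sineFit, tDat,
                                                sub1, guess2)
should become:
(amp2, freq2, phs2), pcov = optimize.curve_fit(sineFit, tDat,
                                                sub1, p0=guess2)
Assuming tDat and sub1 are x and y, that should do the trick. But, once again, it is hard to understand code this complex, with so many interlinked variables and no comments at all. Code should always be built from the bottom up: don't fit in a loop while a single fit isn't working, and don't add noise before the code can fit the noise-free example... Good luck!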
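For reference, here is a minimal sketch of passing the initial guess by keyword, on a noise-free single curve as this answer recommends. The data and the names `sine_model`, `t`, `y`, and `guess` are made up for illustration. In the documented signature `curve_fit(f, xdata, ydata, p0=None, ...)`, `p0` is also the fourth positional parameter, so the sketch runs both spellings side by side; the keyword form is simply harder to misread.

```python
import numpy as np
from scipy.optimize import curve_fit

def sine_model(t, a, f, p):
    return a * np.sin(2.0 * np.pi * f * t + p)

# Noise-free toy data with known parameters (illustrative values)
t = np.arange(100, dtype=float)
true_params = [1.0, 0.05, 0.3]
y = sine_model(t, *true_params)

# A guess with the right frequency and slightly wrong amplitude and phase
guess = [0.9, 0.05, 0.2]

popt_keyword, _ = curve_fit(sine_model, t, y, p0=guess)   # explicit keyword
popt_positional, _ = curve_fit(sine_model, t, y, guess)   # same slot, positional

print("keyword:   ", popt_keyword)
print("positional:", popt_positional)
```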
Answer 3 (score: 0)
By "nothing fancy", I mean removing everything that is unrelated to the fit and building a simplified mock example, such as:
import numpy as np
import scipy.optimize as optimize
def sineFit(t, a, f, p):
    return a * np.sin(2.0 * np.pi * f*t + p)
# Create array of x and y with given parameters
x = np.arange(100)
y = sineFit(x, 1, 0.05, 0)
# Give a guess and fit, printing result of the fitted values
guess = [1., 0.05, 0.]
print(optimize.curve_fit(sineFit, x, y, guess)[0])
The result is exactly the answer:
[1. 0.05 0.]
However, changing the guess only slightly is enough to break it:
# Give a guess and fit, printing result of the fitted values
guess = [1., 0.06, 0.]
print(optimize.curve_fit(sineFit, x, y, guess)[0])
The result is absurdly wrong:
[ 0.00823701 0.06391323 -1.20382787]
Can you explain this behavior?
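One way to look into this, offered as a sketch rather than a full explanation, is to scan the sum of squared residuals over the frequency alone while holding the amplitude and phase at their true values. The grid range and spacing below are arbitrary choices. On this toy data the cost has several local minima in frequency, which is consistent with a local optimizer settling somewhere other than 0.05 when the starting frequency is slightly off.

```python
import numpy as np

def sine_model(t, a, f, p):
    return a * np.sin(2.0 * np.pi * f * t + p)

# Same toy data as in the example above
x = np.arange(100)
y = sine_model(x, 1.0, 0.05, 0.0)

# Sum of squared residuals as a function of frequency only (a = 1, p = 0 fixed)
freq_grid = np.linspace(0.01, 0.10, 181)
cost = np.array([np.sum((y - sine_model(x, 1.0, f, 0.0)) ** 2) for f in freq_grid])

# Interior grid points that are lower than both of their neighbours
is_local_min = (cost[1:-1] < cost[:-2]) & (cost[1:-1] < cost[2:])
local_min_freqs = freq_grid[1:-1][is_local_min]
best_freq = freq_grid[np.argmin(cost)]

print("frequency with the lowest cost:", best_freq)
print("local minima of the cost at:", local_min_freqs)
```

The scan only covers one slice of the three-parameter cost surface, so it hints at the cause without proving what `curve_fit` does along its actual path.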
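Answer 4 (score: 0)
You can use curve_fit with a series of trigonometric functions. This is usually very robust, and you can tune it to the precision you need simply by increasing the number of terms... Here is an example: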
import numpy as np
import matplotlib.pyplot as plt
from numpy import sin, cos, linspace
from scipy.optimize import curve_fit
def f(x, a0,s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12,
         c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12):
    return a0 + s1*sin(1*x) + c1*cos(1*x) \
              + s2*sin(2*x) + c2*cos(2*x) \
              + s3*sin(3*x) + c3*cos(3*x) \
              + s4*sin(4*x) + c4*cos(4*x) \
              + s5*sin(5*x) + c5*cos(5*x) \
              + s6*sin(6*x) + c6*cos(6*x) \
              + s7*sin(7*x) + c7*cos(7*x) \
              + s8*sin(8*x) + c8*cos(8*x) \
              + s9*sin(9*x) + c9*cos(9*x) \
              + s10*sin(10*x) + c10*cos(10*x) \
              + s11*sin(11*x) + c11*cos(11*x) \
              + s12*sin(12*x) + c12*cos(12*x)
# x and y are the data arrays to fit (not defined in this snippet)
# Rescale x so that the data spans a range of pi/2
norm_factor = np.pi/2. / (x.max() - x.min())
x_norm = x * norm_factor
popt, pcov = curve_fit(f, x_norm, y)
x_fit = linspace(x_norm.min(), x_norm.max(), 1000)
y_fit = f(x_fit, *popt)
# Plot against the original (un-normalized) x scale
plt.plot(x_fit/norm_factor, y_fit)
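Since the series above is linear in its coefficients, the same kind of fit can also be sketched with ordinary linear least squares, which needs no initial guess. Note that this swaps `curve_fit` for `np.linalg.lstsq` and is not the method used in this answer. The helper `fourier_design_matrix` and the test signal are hypothetical, made up for illustration.

```python
import numpy as np

def fourier_design_matrix(x, n_terms):
    # Columns: 1, sin(x), cos(x), sin(2x), cos(2x), ..., sin(n x), cos(n x)
    cols = [np.ones_like(x)]
    for k in range(1, n_terms + 1):
        cols.append(np.sin(k * x))
        cols.append(np.cos(k * x))
    return np.column_stack(cols)

# Made-up test signal containing only the k = 3 sine and k = 5 cosine terms
x = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
y = 0.5 + 2.0 * np.sin(3 * x) - 1.0 * np.cos(5 * x)

A = fourier_design_matrix(x, n_terms=12)
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
y_fit = A @ coeffs

# Coefficient layout is [a0, s1, c1, s2, c2, ...]
print("a0 =", coeffs[0], " s3 =", coeffs[5], " c5 =", coeffs[10])
```

Changing `n_terms` plays the same role as adding or removing terms in `f` above, without having to rewrite the function signature.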