Question

time = 1:100  
head(y)  
0.07841589 0.07686316 0.07534116 0.07384931 0.07238699 0.07095363   
plot(time,y)

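This is an exponential curve.
1) How do I fit a line to this curve without knowing the formula? I can't use 'nls' because the formula is unknown (only the data points are given).
2) How do I get the equation of this curve and determine the constants in the equation?
I tried loess, but it doesn't give an intercept.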

Answer 1

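Edit: mra68's answer made me realize that my original answer does not work well for data with some noise (which is more realistic). I have therefore updated it with a potential solution that uses the nls() function.

Original answer

As Adama's answer says, I believe this can be solved by taking the logarithm of the variables and then fitting a linear model.

Your assumption is y = timeⁿ, and you want to estimate n. First take the logarithm of both sides: log(y) = log(timeⁿ)

Then apply the logarithm rule log(xⁿ) = n·log(x), which gives log(y) = n·log(time). This is a linear equation in log(time).

So, more concretely: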

#Simulate some data
#I will use 25 as the exponent, but in your case this is unknown
time = 1:100
y = time^25

plot(time, y)

#Plot the log of both variables.
plot(log(time), log(y))

#Fit the linear model
fit = lm(log(y) ~ log(time))

# Check that the estimated coefficient is 25, just as we expected!
fit$coefficients
#  (Intercept)     log(time) 
# -1.477929e-13  2.500000e+01 

#Plot the fitted line
plot(time, y)
lines(time, time ^ fit$coefficients[2], col = "red")
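The line plotted above uses only the slope, which works here because the intercept is essentially zero. If the data carry a multiplicative constant, y = a·timeⁿ, the intercept of the log-log fit estimates log(a), and the fitted curve needs both coefficients. A minimal sketch, assuming made-up values a = 3 and n = 2.5 that do not come from the question:

```r
# Hypothetical data: y = 3 * time^2.5 (constant and exponent chosen for illustration)
time <- 1:100
y <- 3 * time^2.5

# Linear fit on the log-log scale: log(y) = log(a) + n * log(time)
fit <- lm(log(y) ~ log(time))

a_hat <- exp(unname(coef(fit)[1]))  # back-transform the intercept to get a
n_hat <- unname(coef(fit)[2])       # the slope is the exponent n

# Fitted curve on the original scale, using both coefficients
y_hat <- a_hat * time^n_hat
```

Calling lines(time, y_hat, col = "red") after plot(time, y) would overlay the fitted curve.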

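Updated answer

When some noise is introduced, the solution above gives a worse estimate than one would like. Using the nls() function seems to overcome this problem to some extent, as shown here.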

#Simulate some data with noise added
set.seed(10021)
time = 1:100
y2 = (time + rnorm(100, sd = 2))^25

# Plot both non-transformed and log-transformed data
par(mfrow = c(1, 2))
plot(time, y2)

plot(log(time), log(y2))
lines(log(time), log(time^25), col = "red") # line when there is no noise

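As the right-hand plot shows, for small values the log transformation produces large deviations from the underlying model (which is known in this case). The residuals of the lm() fit illustrate this further: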

# Fit the model using the log-transformed variables
fit_lm = lm(log(y2) ~ log(time))

# Plot fitted values vs. residuals
plot(fit_lm, which = 1)

# And the estimated coefficient is slightly above known
coef(fit_lm)
# (Intercept)   log(time) 
#  -9.327772   27.383641

Using nls() seems to improve the estimate of the exponent.

# Fit using nls
fit_nls = nls(y2 ~ (time ^ b), start = c(b = 24), trace = TRUE)

# The coefficient is much closer to the known
coef(fit_nls)
#        b 
# 25.04061

# Plot of data and two estimates
plot(time, y2)
lines(time, time^coef(fit_nls), col = "red")
lines(time, time^coef(fit_lm)[2], col = "green3")
legend("topleft", c("fit_lm", "fit_nls"), lwd = 2, col = c("green3", "red"))

Answer 2

If it really is exponential, you can try taking the logarithm of the variable and fitting a linear model.
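For an exponential model y = a·exp(b·time), only y is log-transformed: log(y) = log(a) + b·time. A minimal sketch follows. The full data are not given in the question, so it assumes y = 0.08·exp(-0.02·time). That curve reproduces the six head(y) values shown, but it is an inference from them, not something the asker states.

```r
# Assumed data: the question does not give the full y, so simulate a curve
# that matches the six values printed by head(y).
time <- 1:100
y <- 0.08 * exp(-0.02 * time)

# log(y) = log(a) + b * time is linear in time
fit <- lm(log(y) ~ time)

a_hat <- exp(unname(coef(fit)[1]))  # constant a, back-transformed from the intercept
b_hat <- unname(coef(fit)[2])       # rate b, the slope

# Fitted curve on the original scale
y_hat <- a_hat * exp(b_hat * time)
```

With real data, replace the simulated y with the observed values; lines(time, y_hat, col = "red") after plot(time, y) overlays the fit.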

Answer 3

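Unfortunately, taking the logarithm and fitting a linear model is not optimal. The reason is that, once the exponential function is applied to get back to the original model, the errors for large y values carry far more weight than those for small y values. Here is an example: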

f <- function(x){exp(0.3*x+5)}

squaredError <- function(a,b,x,y) {sum((exp(a*x+b)-f(x))^2)}

x <- 0:12
y <- f(x) * ( 1 + sample(-300:300,length(x),replace=TRUE)/10000 )
x
y   
#--------------------------------------------------------------------

M <- lm(log(y)~x)
a <- unlist(M[1])[2]
b <- unlist(M[1])[1]
print(c(a,b))

squaredError(a,b,x,y)

approxPartAbl_a <- (squaredError(a+1e-8,b,x,y) - squaredError(a,b,x,y))/1e-8

for ( i in 0:10 )
{
  eps <- -i*sign(approxPartAbl_a)*1e-5
  print(c(eps,squaredError(a+eps,b,x,y)))
}

Result:

> f <- function(x){exp(0.3*x+5)}

> squaredError <- function(a,b,x,y) {sum((exp(a*x+b)-f(x))^2)}

> x <- 0:12

> y <- f(x) * ( 1 + sample(-300:300,length(x),replace=TRUE)/10000 )

> x
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12

> y
 [1]  151.2182  203.4020  278.3769  366.8992  503.5895  682.4353  880.1597 1186.5158 1630.9129 2238.1607 3035.8076 4094.6925 5559.3036

> #--------------------------------------------------------------------
> 
> M <- lm(log(y)~x)

> a <- unlist(M[1])[2]

> b <- unlist(M[1])[1]

> print(c(a,b))
          coefficients.x coefficients.(Intercept) 
               0.2995808                5.0135529 

> squaredError(a,b,x,y)
[1] 5409.752

> approxPartAbl_a <- (squaredError(a+1e-8,b,x,y) - squaredError(a,b,x,y))/1e-8

> for ( i in 0:10 )
+ {
+   eps <- -i*sign(approxPartAbl_a)*1e-5
+   print(c(eps,squaredError(a+eps,b,x,y)))
+ }
[1]    0.000 5409.752
[1]   -0.00001 5282.91927
[1]   -0.00002 5157.68422
[1]   -0.00003 5034.04589
[1]   -0.00004 4912.00375
[1]   -0.00005 4791.55728
[1]   -0.00006 4672.70592
[1]   -0.00007 4555.44917
[1]   -0.00008 4439.78647
[1]   -0.00009 4325.71730
[1]   -0.0001 4213.2411
>

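Perhaps one could try a numerical method, such as a gradient search, to find the minimum of the squared error function.

Of course this is not a nice answer. Please don't punish me.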
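One way to sketch such a numerical search is base R's optim() with the quasi-Newton "BFGS" method, started from the log-linear estimates. This is an illustration under assumptions, not the answerer's own method: the seed and the runif() noise are arbitrary choices, and the error is measured against the observed y, whereas squaredError above compares against the noise-free f(x).

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
x <- 0:12
y <- exp(0.3 * x + 5) * (1 + runif(length(x), -0.03, 0.03))  # up to 3% multiplicative noise

# Sum of squared errors against the observed y on the original scale
sse <- function(p) sum((exp(p[1] * x + p[2]) - y)^2)

# Starting values from the log-linear fit, reordered to p = c(slope a, intercept b)
start <- unname(rev(coef(lm(log(y) ~ x))))

# Quasi-Newton (gradient-based) minimisation; gradients are approximated numerically
opt <- optim(start, sse, method = "BFGS")

# Compare the parameters and squared error of the two fits
rbind(lm = c(start, sse(start)), optim = c(opt$par, opt$value))
```

The call nls(y ~ exp(a * x + b), start = list(a = start[1], b = start[2])) is an alternative that minimises the same criterion.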

Exponential curve fitting in R
