非线性回归与对数模型

时间:2018-10-22 14:22:33

标签: r exponential curve-fitting nls

import cv2
import multiprocessing as mp
import numpy as np
import ctypes
import time


class Worker(mp.Process):
    def __init__(self,sharedArray,lock, width, height, channels):
        super(Worker, self).__init__()
        self.s=sharedArray
        self.lock = lock
        self.w = width
        self.h = height
        self.c = channels
        return

    def run(self):
        print("worker running")

        self.lock.acquire()
        buf = np.frombuffer(self.s.get_obj(), dtype='uint8')
        buf2 = buf.reshape(self.h, self.w, self.c)
        self.lock.release()

        print("worker ",buf2.shape, buf2.size)
        cv2.imshow("worker",buf2)
        cv2.waitKey(-1)


if __name__ == '__main__':

    img = cv2.imread('pic640x480.jpg')
    shape = img.shape
    size = img.size
    width = shape[1]
    height = shape[0]
    channels = shape[2]

    realsize = width * height * channels
    print('main ', shape, size, realsize)

    s = mp.Array(ctypes.c_uint8, realsize)
    lock = mp.Lock()
    lock.acquire()
    buf = np.frombuffer(s.get_obj(), dtype='uint8')
    buf2 = buf.reshape(height, width, channels)
    buf2[:] = img
    lock.release()

    worker = Worker(s,lock,width, height, channels)
    worker.start()

    cv2.imshow("img",img)
    cv2.waitKey(-1)

    worker.join()
    cv2.destroyAllWindows()

enter image description here

这种关系似乎是非线性关系。因此,我将拟合一个我记录了y和x的模型

library(ggplot2)

dat <- structure(list(y = c(52L, 63L, 59L, 58L, 57L, 54L, 27L, 20L, 15L, 27L, 27L, 26L, 70L, 70L, 70L, 70L, 70L, 70L, 45L, 42L, 41L, 55L, 45L, 39L, 51L, 
                        64L, 57L, 39L, 59L, 37L, 44L, 44L, 38L, 57L, 50L, 56L, 66L, 66L, 64L, 64L, 60L, 55L, 52L, 57L, 47L, 57L, 64L, 63L, 49L, 49L, 
                        56L, 55L, 57L, 42L, 60L, 53L, 53L, 57L, 56L, 54L, 42L, 45L, 34L, 52L, 57L, 50L, 60L, 59L, 52L, 42L, 45L, 47L, 45L, 51L, 39L, 
                        38L, 42L, 33L, 62L, 57L, 65L, 44L, 44L, 39L, 46L,  49L, 52L, 44L, 43L, 38L), 
                  x = c(122743L, 132300L, 146144L, 179886L, 195180L, 233605L, 1400L, 1400L, 3600L, 5000L, 14900L, 16000L, 71410L, 85450L, 106018L, 
                        119686L, 189746L, 243171L, 536545L, 719356L, 830031L, 564546L, 677540L, 761225L, 551561L, 626799L, 68618L, 1211267L, 1276369L,
                        1440113L, 1153720L, 1244575L, 1328641L, 610452L, 692624L, 791953L, 4762522L, 5011232L, 5240402L, 521339L, 
                        560098L, 608641L, 4727833L, 4990042L, 5263899L, 1987296L, 2158704L, 2350927L, 7931905L, 8628608L, 8983683L, 2947957L, 3176995L, 3263118L, 
                        55402L, 54854L, 55050L, 52500L, 72000L, 68862L, 1158244L, 1099976L, 1019490L, 538146L, 471219L, 437954L, 863592L, 661055L, 
                        548097L, 484450L, 442643L, 404487L, 1033728L, 925514L, 854793L, 371420L, 285257L, 260157L, 2039241L, 2150710L, 1898614L, 
                        1175287L, 1495433L, 1569586L, 2646966L, 3330486L, 3282677L, 745784L, 858574L, 1119671L)), 
                class = "data.frame", row.names = c(NA, -90L))

 ggplot(dat, aes(x = x, y = y)) + geom_point()

enter image description here

但是,我看到对于较低的值,对数转换会导致较大的差异,如残差所示。然后,我转向非线性最小二乘法。我以前没有使用过,但是使用了这篇文章

Why is nls() giving me "singular gradient matrix at initial parameter estimates" errors?

mod.lm <- lm(log(y) ~ log(x), data = dat)
ggplot(dat, aes(x = log(x), y = log(y))) + geom_point() + geom_smooth(method = "lm") 

有人能告诉我这个错误是什么意思,以及如何将nls模型拟合到上述数据中吗?

1 个答案:

答案 0 :(得分:1)

在您的情况下,nls遇到问题,因为起始值不好,并且您引入了线性化形式中不存在的系数c。 为了适合您的nls,您可以按照以下方式进行操作,使用更好的凝视值并删除系数c:

mod.glm <- glm(y ~ x, dat=dat, family=poisson(link = "log"))
start <- list(a = coef(mod.glm)[1], b = coef(mod.glm)[2])
mod.nls <- nls(y ~ exp(a + b * x), data = dat, start = start)

我建议使用如上所述的glm而不是nls来找到系数。

如果线性化模型(mod.lm)的估计值不应有偏差,则需要对其进行调整。

mod.lm <- lm(log(y) ~ log(x), data = dat)
mean(dat$y) #50.44444
mean(predict(mod.glm, type="response")) #50.44444
mean(predict(mod.nls)) #50.44499
mean(exp(predict(mod.lm))) #49.11622 !
f <- log(mean(dat$y) / mean(exp(predict(mod.lm)))) #bias corection for a
mean(exp(coef(mod.lm)[1] + f + coef(mod.lm)[2]*log(dat$x))) #50.44444

如果您想自己获得詹姆斯·菲利普斯(James Phillips)在评论中给出的系数,可以尝试:

mod.nlsJP <- nls(y ~ a * (x^(b*x)) + offset, data=dat, start=list(a=-30, b=-5e-6, offset=50))