如何在R中手动计算标准化的Pearson残差?

时间:2019-03-27 00:14:58

标签: r

我正在尝试手动计算R中的标准化Pearson残差。但是,在计算帽子矩阵时,我很费力。

我建立了自己的逻辑回归,并尝试在logReg函数中计算标准化的Pearson残差。

logRegEst <- function(x, y, threshold = 1e-10, maxIter = 100)
{
  calcPi <- function(x, beta)
  {
    beta <- as.vector(beta)
    return(exp(x %*% beta) / (1 + exp(x %*% beta)))
  }

  beta <- rep(0, ncol(x))   # initial guess for beta
  diff <- 1000
  # initial value bigger than threshold so that we can enter our while loop
  iterCount = 0
  # counter for the iterations to ensure we're not stuck in an infinite loop

  while(diff > threshold) # tests for convergence
  {
    pi <- as.vector(calcPi(x, beta))
    # calculate pi by using the current estimate of beta

    W <-  diag(pi * (1 - pi))
    # calculate matrix of weights W as defined int he fisher scooring algorithem

    beta_change <- solve(t(x) %*% W %*% x) %*% t(x) %*% (y - pi)
    # calculate the change in beta

    beta <- beta + beta_change   #  new beta
    diff <- sum(beta_change^2)
    # calculate how much we changed beta by in this iteration
    # if this is less than threshold, we'll break the while loop

    iterCount <- iterCount + 1
    # see if we've hit the maximum number of iterations
    if(iterCount > maxIter){
      stop("This isn't converging.")
    }
    # stop if we have hit the maximum number of iterations
  }
  n <- length(y)
  df <- length(y) - ncol(x)
  # calculating the degrees of freedom by taking the length of y minus
  # the number of x columns
  vcov <- solve(t(x) %*% W %*% x)
  logLik <- sum(y * log(pi / (1 - pi)) + log(1 - pi))
  deviance <- -2 * logLik
  AIC <- -2 * logLik + 2 * ncol(x)
  rank <- ncol(x)
  list(coefficients = beta, vcov = vcov, df = df, deviance = deviance,
       AIC = AIC, iter = iterCount - 1, x = x, y = y, n = n, rank = rank)
  # returning results
}

logReg <- function(formula, data)
{
  if (sum(is.na(data)) > 0) {
    print("missing values in data")
  } else {
    mf <- model.frame(formula = formula, data = data)
    # model.frame() returns us a data.frame with the variables needed to use the
    # formula.
    x <- model.matrix(attr(mf, "terms"), data = mf)
    # model.matrix() creates a disign matrix. That means that for example the
    #"Sex"-variable is given as a dummy variable with ones and zeros.
    y <- as.numeric(model.response(mf)) - 1
    # model.response gives us the response variable.
    est <- logRegEst(x, y)
    # Now we have the starting position to apply our function from above.
    est$formula <- formula
    est$call <- match.call()
    # We add the formular and the call to the list.
    nullModel <- logRegEst(x = as.matrix(rep(1, length(y))), y)
    est$nullDeviance <- nullModel$deviance
    est$nullDf <- nullModel$df
    mu <- exp(as.vector(est$x %*% est$coefficients)) /
      (1 + exp(as.vector(est$x %*% est$coefficients)))
    # computing the fitted values
    est$residuals <- (est$y - mu) / sqrt(mu * (1 - mu))
    est$mu <- mu
    est$x <- x
    est$y <- y
    est$data <- data
    hat <- (t(mu))^(1/2)%*%x%*%(t(x)%*%mu%*%x)^(-1)%*%t(x)%*%mu^(1/2)
    est$stdresiduals <- est$residuals/(sqrt(1-hat))
    class(est) <- "logReg"
    # defining the class
    est
  }
}

在计算= ̂1 / 2(̂)−1̂1 / 2时,我很挣扎。在我的代码中,这被称为帽子。 如果我尝试计算帽子矩阵(帽子),则会出现这样的错误:在这种情况下,我无法将向量mu和矩阵x相乘:t(x)%*%mu%*%x。 我可以看到矩阵的秩不相同,因此无法将它们相乘。

任何人都可以看到我的错误在哪里吗?非常感谢您的帮助。谢谢!

0 个答案:

没有答案