How can I compute OLS coefficients in R without the lm() function? Formula: β = (X'X)^(-1) X'y
X <- cbind(runif(1000000), rnorm(1000000), rchisq(1000000,50))
y <- 100 * X[,1] + 200 * X[,2] + rnorm(nrow(X), 0, 10)
I would really appreciate your help, as I don't know how to go about this.
Answer 0 (score: 1)
This is the basic linear algebra behind OLS. You may want to have a look at https://en.wikipedia.org/wiki/Linear_regression
set.seed(123)
X <- cbind(runif(1000000), rnorm(1000000), rchisq(1000000,50))
y <- 100 * X[,1] + 200 * X[,2] + rnorm(nrow(X), 0, 10)
# (X'X)^-1*X'y
# basic matrix algebra
solve(t(X) %*% X) %*% (t(X) %*% y)
# crossprod() computes X'X and X'y without forming t(X) explicitly (a speed gain)
crossprod(solve(crossprod(X)), crossprod(X, y))
# same in lm()
lm(y~0+X)
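Note that crossprod(solve(crossprod(X)), crossprod(X, y)) still inverts X'X explicitly. A small check I would add (not part of the original answer): passing the right-hand side as the second argument of solve() solves the normal equations directly, which is faster and more numerically stable, and the result matches lm() to floating-point precision.
# solve the normal equations without forming the inverse explicitly
beta.hat <- solve(crossprod(X), crossprod(X, y))
# the coefficients should agree with lm() up to floating-point error
all.equal(as.numeric(beta.hat), unname(coef(lm(y ~ 0 + X))))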
If the linear model has an intercept:
x <- cbind(1, X)
# (X'X)^-1*X'y
solve(t(x) %*% x) %*% (t(x) %*% y)
crossprod(solve(crossprod(x)), crossprod(x,y))
lm(y~X)
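As an aside, lm() does not invert X'X at all; it fits via a QR decomposition of the design matrix, which copes better with ill-conditioned data. A minimal sketch using base R's QR routines (my addition, not from the original answer):
# least-squares fit via QR, roughly what lm() does internally
qr.coef(qr(x), y)
# equivalent one-liner
qr.solve(x, y)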
Answer 1 (score: 0)
Here is my version, including gradient descent. Hat tip also to this post.
x0 <- c(1,1,1,1,1) # Intercept
x1 <- c(1,2,3,4,5)
x2 <- c(8,4,3,1,8)
x <- as.matrix(cbind(x0,x1,x2))
y <- as.matrix(c(3,7,5,11,14))
# (X'X)^-1 X'y
beta1 <- solve(t(x) %*% x) %*% (t(x) %*% y)
# R's regression command, for comparison
beta2 <- summary(lm(y ~ x[, 2:3]))
# Gradient descent
m <- nrow(y)  # number of observations
# gradient of the mean-squared-error cost: (1/m) * x'(x theta' - y)
grad <- function(x, y, theta) {
  gradient <- (1/m) * (t(x) %*% ((x %*% t(theta)) - y))
  return(t(gradient))
}
# Define the gradient descent update algorithm
grad.descent <- function(x, maxit) {
  theta <- matrix(c(0, 0, 0), nrow = 1)  # initialize the parameters
  alpha <- 0.05                          # learning rate
  for (i in 1:maxit) {
    theta <- theta - alpha * grad(x, y, theta)
  }
  return(theta)
}
# Results without feature scaling
print(grad.descent(x,2000))
beta1
beta2
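The "without feature scaling" comment matters: with unscaled predictors, gradient descent at alpha = 0.05 sits close to its stability limit on this design and converges slowly. A sketch of the same fit with standardized predictors (my addition, not from the original answer; the back-transformation maps the coefficients to the original units):
# standardize the predictors; keep the intercept column as-is
xs <- x
xs[, 2:3] <- scale(x[, 2:3])
theta.s <- grad.descent(xs, 2000)
# map the scaled-fit coefficients back to the original units
mu <- colMeans(x[, 2:3])
sdev <- apply(x[, 2:3], 2, sd)
slopes <- theta.s[2:3] / sdev
intercept <- theta.s[1] - sum(theta.s[2:3] * mu / sdev)
c(intercept, slopes)  # should be close to beta1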