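Simulate the conditions of linear regression and show that the estimates from a multiple linear regression (three or more parameters) are unbiased. Then try to construct a biased estimate of the regression parameters and show by simulation that you have actually introduced bias.
Here is what I have tried, but I am stuck on how to get from the unbiased estimate to a biased one.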
b0=2
b1=1.3
b2=5
N=1000
matrica=matrix(rep(0,N*3),ncol=3)
for (i in 1:N) {
x1=rnorm(100) # the mean and variance of the predictors are arbitrary
x2=rnorm(100)
err=rnorm(100)
y=b0+b1*x1+b2*x2+err
fit=lm(y~x1+x2) # named fit so the object does not share a name with lm()
matrica[i,]=coef(fit) # store the intercept, x1 slope and x2 slope
}
matrica
rez1 <- matrica[1:N ,1]
rez2 <- matrica[1:N ,2]
rez3 <- matrica[1:N ,3]
## now we need to show that the estimates are unbiased (b0~mean(rez1...))
summary(rez1)
summary(rez2)
summary(rez3)
cor(rez1,rez2) # correlation between the intercept and x1-slope estimates
cor(rez2,rez3) # correlation between the x1-slope and x2-slope estimates
Answer 0 (score: 1)
Very much along the lines of what you started with, you can do the following:
# True Values
b0=2
b1=1.3
b2=5
# Simulation Set Points
N=1000
n <- 100
set.seed(42)
collector <- matrix(ncol = 3,nrow = N)
colnames(collector) <- c("b0_hat", "b1_hat", "b2_hat")
for(i in 1:N){
# Generate Data
x1 <- rnorm(n, mean = 1, sd = 1)
x2 <- rnorm(n, mean = 1, sd = 1)
y_hat <- b1 * x1 + b2 * x2 + b0
# Add Noise
y <- rnorm(n, y_hat, 1)
# Fit Data
fit <- lm(y ~ x1 + x2)
# Store Results
collector[i, ] <- fit$coefficients
}
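Then, to show the results, you can plot histograms and check that the mean of the estimates is close to the true value of each beta parameter.
# Graph to show unbiasedness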
par(mfrow = c(3,1))
hist(collector[,1], main =expression(hat(beta[0])),breaks = 30)
abline(v =b0, col = "red", lwd = 2)
hist(collector[,2], main =expression(hat(beta[1])),breaks = 30)
abline(v =b1, col = "red", lwd = 2)
hist(collector[,3], main =expression(hat(beta[2])),breaks = 30)
abline(v =b2, col = "red", lwd = 2)
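The histograms give a visual check. As a numerical complement, here is a minimal sketch that compares the average estimate with the true value and reports a Monte Carlo standard error. The helper `simulate_ols` and the `check` table are hypothetical names introduced only for this illustration; the data-generating setup mirrors the simulation above.

```r
# Hypothetical helper (not part of the original answer): repeat the
# normal-error simulation and return one row of OLS estimates per replication.
simulate_ols <- function(N = 1000, n = 100, beta = c(2, 1.3, 5)) {
  est <- matrix(NA_real_, nrow = N, ncol = 3,
                dimnames = list(NULL, c("b0_hat", "b1_hat", "b2_hat")))
  for (i in seq_len(N)) {
    x1 <- rnorm(n, mean = 1, sd = 1)
    x2 <- rnorm(n, mean = 1, sd = 1)
    y  <- beta[1] + beta[2] * x1 + beta[3] * x2 + rnorm(n)
    est[i, ] <- coef(lm(y ~ x1 + x2))
  }
  est
}

set.seed(42)
beta <- c(2, 1.3, 5)
est  <- simulate_ols(beta = beta)

# Estimated bias and its Monte Carlo standard error for each coefficient
bias  <- colMeans(est) - beta
mc_se <- apply(est, 2, sd) / sqrt(nrow(est))
check <- cbind(truth = beta, mean_est = colMeans(est),
               bias = bias, mc_se = mc_se)
print(round(check, 4))
```

If the estimator is unbiased, each entry in the `bias` column should be within a few multiples of the corresponding `mc_se`.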
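A biased estimate is one whose long-run average (that is, its expected value) differs from the true parameter value. One way to produce this is to draw the response from a noncentral t distribution, with the linear predictor as the noncentrality parameter, instead of from a normal distribution centered on the linear predictor. Note that simply swapping in zero-mean t-distributed errors would not bias least squares. The bias arises here because the mean of a noncentral t distribution is not equal to its noncentrality parameter: with 3 degrees of freedom it is about 1.38 times as large.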
# True Values
b0=2
b1=1.3
b2=5
# Simulation Set Points
N=1000
n <- 100
set.seed(42)
collector <- matrix(ncol = 3,nrow = N)
colnames(collector) <- c("b0_hat", "b1_hat", "b2_hat")
for(i in 1:N){
# Generate Data
x1 <- rnorm(n, mean = 1, sd = 1)
x2 <- rnorm(n, mean = 1, sd = 1)
y_hat <- b1 * x1 + b2 * x2 + b0
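# Draw y from a noncentral t distribution (df = 3) with noncentrality y_hat.
# By definition this is (normal with mean y_hat) / sqrt(chi-square / df).
# It is written out explicitly because rt(n, df = 3, ncp = y_hat) may not
# accept a vector ncp in recent R versions.
y <- rnorm(n, mean = y_hat) / sqrt(rchisq(n, df = 3) / 3)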
# Fit Data
fit <- lm(y ~ x1 + x2)
# Store Results
collector[i, ] <- fit$coefficients
}
Now you can see from the plots produced by the code below that our estimates are biased.
# Graph to show the bias
par(mfrow = c(3,1))
hist(collector[,1], main =expression(hat(beta[0])),breaks = 30)
abline(v =b0, col = "red", lwd = 2)
hist(collector[,2], main =expression(hat(beta[1])),breaks = 30)
abline(v =b1, col = "red", lwd = 2)
hist(collector[,3], main =expression(hat(beta[2])),breaks = 30)
abline(v =b2, col = "red", lwd = 2)
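To put a number on the bias, here is a rough sketch that reruns the noncentral t simulation and compares the average estimates with the true values. It relies on the standard result that a noncentral t variable with `df` degrees of freedom and noncentrality `mu` has mean `mu * sqrt(df / 2) * gamma((df - 1) / 2) / gamma(df / 2)`, so every coefficient should be inflated by roughly that factor. The names `scale_factor` and `est` are my own, not from the code above.

```r
set.seed(42)
beta <- c(2, 1.3, 5)   # true b0, b1, b2
df   <- 3              # degrees of freedom of the noncentral t
N    <- 1000           # number of replications
n    <- 100            # sample size per replication

est <- matrix(NA_real_, nrow = N, ncol = 3,
              dimnames = list(NULL, c("b0_hat", "b1_hat", "b2_hat")))
for (i in seq_len(N)) {
  x1 <- rnorm(n, mean = 1, sd = 1)
  x2 <- rnorm(n, mean = 1, sd = 1)
  mu <- beta[1] + beta[2] * x1 + beta[3] * x2
  # Noncentral t draw with noncentrality mu, written out by its definition
  y <- rnorm(n, mean = mu) / sqrt(rchisq(n, df = df) / df)
  est[i, ] <- coef(lm(y ~ x1 + x2))
}

# Ratio of the noncentral t mean to its noncentrality parameter
scale_factor <- sqrt(df / 2) * gamma((df - 1) / 2) / gamma(df / 2)

print(round(cbind(truth    = beta,
                  expected = scale_factor * beta,
                  mean_est = colMeans(est)), 3))
```

The `mean_est` column should sit near `expected` and clearly away from `truth`, which is the bias the histograms show.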