I have implemented multivariate linear regression in R, followed by a batch-update gradient descent algorithm, and I am now trying to plot the results of this gradient descent.
The problem with the tutorials I have found is that in both cases they define the linear regression equation explicitly (and it is not a multivariate one either).
How can I create a similar plot that overlays the results of running the gradientDesc function in the code below multiple times, with different learning rates and convergence thresholds?
data <- read.csv("Data/Bike-Sharing-Dataset/hour.csv")
# Select the useable features
data1 <- data[, c("season", "mnth", "hr", "holiday", "weekday", "workingday", "weathersit", "temp", "atemp", "hum", "windspeed", "cnt")]
# Set seed
set.seed(100)
# Split the data
trainingObs <- sample(nrow(data1), 0.70 * nrow(data1), replace = FALSE)
# Create the training dataset
trainingDS <- data1[trainingObs, ]
# Create the test dataset
testDS <- data1[-trainingObs, ]
# Create the variables
y <- trainingDS$cnt
y_test <- testDS$cnt
X <- as.matrix(trainingDS[-ncol(trainingDS)])   # all predictor columns, i.e. everything except cnt
X_test <- as.matrix(testDS[-ncol(testDS)])
int <- rep(1, length(y))
# Add intercept column to X and X_test (the ones vector must match each set's row count)
X <- cbind(int, X)
X_test <- cbind(int = rep(1, length(y_test)), X_test)
# Solve for beta
betas <- solve(t(X) %*% X) %*% t(X) %*% y
# Round the beta values
betas <- round(betas, 2)
# Gradient descent 1
gradientDesc <- function(x, y, learn_rate, conv_threshold, max_iter) {
  n <- nrow(x)
  m <- runif(ncol(x), 0, 1)
  yhat <- x %*% m
  cost <- sum((y - yhat) ^ 2) / (2 * n)
  converged = F
  iterations = 0
  while(converged == F) {
    ## Implement the gradient descent algorithm
    m <- m - learn_rate * (1/n * t(x) %*% (yhat - y))
    yhat <- x %*% m
    new_cost <- sum((y - yhat) ^ 2) / (2 * n)
    if(abs(cost - new_cost) <= conv_threshold) {
      converged = T
    }
    iterations = iterations + 1
    cost <- new_cost
    if(iterations >= max_iter) break
  }
  return(list(converged = converged,
              num_iterations = iterations,
              cost = cost,
              new_cost = new_cost,
              coefs = m))
}
out <- gradientDesc(X, y, 0.005, 0.0000001, 200000)
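The closed-form betas computed above can double as a sanity check on what gradient descent converges to. A minimal sketch, assuming only the objects already defined above (the comparison data frame itself is illustrative, not part of the original code):
# Compare the gradient-descent coefficients against the closed-form solution
comparison <- data.frame(closed_form  = as.vector(betas),
                         grad_descent = round(as.vector(out$coefs), 2),
                         row.names    = colnames(X))
print(comparison)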
Note: the data used is the Bike Sharing Dataset from the UCI Machine Learning Repository.
Answer (score: 2):
Since this is a multivariate case, it is hard to plot cost against the parameters. However, cost can be plotted against the number of iterations. To do that, we need to keep the value of cost at every iteration: we can build a data.frame inside the while loop and add it to the list the function returns, as shown below.
data <- read.csv("Data/Bike-Sharing-Dataset/hour.csv")
# Select the useable features
data1 <- data[, c("season", "mnth", "hr", "holiday", "weekday", "workingday", "weathersit", "temp", "atemp", "hum", "windspeed", "cnt")]
# Set seed
set.seed(100)
# Split the data
trainingObs <- sample(nrow(data1), 0.70 * nrow(data1), replace = FALSE)
# Create the training dataset
trainingDS <- data1[trainingObs, ]
# Create the test dataset
testDS <- data1[-trainingObs, ]
# Create the variables
y <- trainingDS$cnt
y_test <- testDS$cnt
X <- as.matrix(trainingDS[-ncol(trainingDS)])   # all predictor columns, i.e. everything except cnt
X_test <- as.matrix(testDS[-ncol(testDS)])
int <- rep(1, length(y))
# Add intercept column to X and X_test (the ones vector must match each set's row count)
X <- cbind(int, X)
X_test <- cbind(int = rep(1, length(y_test)), X_test)
# Solve for beta
betas <- solve(t(X) %*% X) %*% t(X) %*% y
# Round the beta values
betas <- round(betas, 2)
# Gradient descent 1
gradientDesc <- function(x, y, learn_rate, conv_threshold, max_iter) {
  n <- nrow(x)
  m <- runif(ncol(x), 0, 1)
  yhat <- x %*% m
  cost <- sum((y - yhat) ^ 2) / (2 * n)
  converged = F
  iterations = 0
  while(converged == F) {
    ## Implement the gradient descent algorithm
    m <- m - learn_rate * (1/n * t(x) %*% (yhat - y))
    yhat <- x %*% m
    new_cost <- sum((y - yhat) ^ 2) / (2 * n)
    if(abs(cost - new_cost) <= conv_threshold) {
      converged = T
    }
    ## Record the cost at this iteration so it can be plotted later
    step <- data.frame(iteration = iterations,
                       cost = cost,
                       new_cost = new_cost)
    ## inherits = FALSE keeps the check local, so a global `iters` cannot interfere
    if(exists("iters", inherits = FALSE)) {
      iters <- rbind(iters, step)
    } else {
      iters <- step
    }
    iterations = iterations + 1
    cost <- new_cost
    if(iterations >= max_iter) break
  }
  return(list(converged = converged,
              num_iterations = iterations,
              cost = cost,
              new_cost = new_cost,
              coefs = m,
              iters = iters))
}
Now visualize new_cost over 10,000 iterations:
out <- gradientDesc(X, y, 0.005, 0.0000001, 10000)
library(ggplot2)
ggplot(data = out$iters, mapping = aes(x = iteration, y = new_cost)) +
  geom_line()
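To get back to the original question of overlaying several runs, one option is to call gradientDesc once per setting, label each returned iters data frame with the learning rate and threshold that produced it, bind the results together, and map that label to colour. This is only a sketch built on the function above; the helper run_gd, the settings data frame, and the specific rates/thresholds are illustrative assumptions, not part of the original code:
# Hypothetical helper: run one gradient-descent setting and tag its per-iteration costs
run_gd <- function(learn_rate, conv_threshold, max_iter) {
  res <- gradientDesc(X, y, learn_rate, conv_threshold, max_iter)
  transform(res$iters, run = paste0("rate=", learn_rate, ", thr=", conv_threshold))
}
# Illustrative settings; adjust to the values you actually want to compare
settings <- data.frame(learn_rate     = c(0.001, 0.003, 0.005),
                       conv_threshold = c(1e-7,  1e-7,  1e-7))
# Run every setting and stack the per-iteration records into one data frame
runs <- do.call(rbind, lapply(seq_len(nrow(settings)), function(i) {
  run_gd(settings$learn_rate[i], settings$conv_threshold[i], max_iter = 10000)
}))
# One line per run, distinguished by colour
ggplot(runs, aes(x = iteration, y = new_cost, colour = run)) +
  geom_line()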
Hope this helps.