How do I get the rows of X and Y to match?

Time: 2019-06-20 21:48:34

Tags: r glmnet

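I am working on a new kind of code and need some help with ridge-regularized regression. I am trying to build a predictive model, but first I need the rows of my x and y matrices to match.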

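I found similar questions through a Google search, but they use randomly generated data rather than supplied data like mine. My data is a large dataset with more than 500,000 observations and 670 variables.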

 library(rsample)
 library(glmnet)
 library(dplyr)
 library(ggplot2)

 # Create training (70%) and test (30%) sets
 # Use set.seed for reproducibility

 set.seed(123)

 alumni_split <- initial_split(alumni, prop = .7, strata = "Id.Number")
 alumni_train <- training(alumni_split)
 alumni_test <- testing(alumni_split)

 #----
 # Create training and testing feature model matrices and response vectors.
 # We use model.matrix(...)[, -1] to discard the intercept
 alumni_train_x <- model.matrix(Id.Number ~ ., alumni_train)[, -1]
 alumni_test_x <- model.matrix(Id.Number ~ ., alumni_test)[, -1]

 alumni_train_y <- log(alumni_train$Id.Number)
 alumni_test_y <- log(alumni_test$Id.Number)

 # What is the dimension of your feature matrix?
 dim(alumni_train_x)

 #---- [HERE]
 # Apply Ridge regression to alumni data
 alumni_ridge <- glmnet(alumni_train_x, alumni_train_y, alpha = 0)

Error message (with code):

 alumni_ridge <- glmnet(alumni_train_x, alumni_train_y, alpha = 0)
 Error in glmnet(alumni_train_x, alumni_train_y, alpha = 0) :
   number of observations in y (329870) not equal to the number of rows of x (294648)
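A likely cause, though it cannot be confirmed without the data, is missing values: with R's default `na.action` (`na.omit`), `model.matrix()` drops every row that contains an `NA`, while `alumni_train$Id.Number` keeps all rows. That would explain x having 35,222 fewer rows than y. The sketch below reproduces the mismatch on a small made-up data frame (`toy`, `y`, `x1`, `x2` are hypothetical stand-ins, not columns of `alumni`) and shows one way to keep x and y aligned by taking both from the same model frame.

```r
# Toy data standing in for `alumni` (hypothetical columns; the real data is not shown)
set.seed(123)
toy <- data.frame(
  y  = rexp(10) + 1,                 # strictly positive so log() is defined
  x1 = rnorm(10),
  x2 = factor(rep(c("a", "b"), 5))
)
toy$x1[c(2, 7)] <- NA                # two rows with a missing predictor

# Reproduce the mismatch: model.matrix() drops the NA rows, toy$y does not
x_bad <- model.matrix(y ~ ., toy)[, -1]
y_bad <- log(toy$y)
nrow(x_bad)    # 8
length(y_bad)  # 10

# One possible fix: build a single model frame, then take x and y from it,
# so both lose exactly the same rows
mf   <- model.frame(y ~ ., data = toy, na.action = na.omit)
x_ok <- model.matrix(attr(mf, "terms"), mf)[, -1]
y_ok <- log(model.response(mf))
stopifnot(nrow(x_ok) == length(y_ok))

# Ridge fit on the aligned data (only runs if glmnet is installed)
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(x_ok, y_ok, alpha = 0)
}
```

Applied to the question's code, the same idea would mean building `alumni_train_x` and `alumni_train_y` from one `model.frame(Id.Number ~ ., alumni_train)`, and likewise for the test set. An alternative is to filter first, for example `alumni_train <- alumni_train[complete.cases(alumni_train), ]`, at the cost of discarding incomplete rows up front. Running `sum(!complete.cases(alumni_train))` would show whether the number of incomplete rows matches the gap in the error message.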

0 answers:

No answers yet