Error in rep(1, N): invalid 'times' argument when trying to use cv.glmnet

Posted: 2019-05-13 09:33:15

Tags: r regression glmnet non-linear-regression

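I am trying to perform ridge regression on a large dataset with 11 predictor variables. I extracted all the relevant variables from a CSV dataset and stored them as matrices so that the glmnet library can use them. I also split the dataset into two halves, one for training and the other for testing. When I try to fit the first test model, I keep getting the error Error in rep(1, N) : invalid 'times' argument, which stops the code before it can return the optimal value of lambda.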

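Following other posts, I have checked that no NA columns or rows are being read in. The other questions I found about this error do not seem to involve the glmnet library. I used the "mgaussian" family because I am testing multiple variables, but I also tried the standard "gaussian" family and it gives the same error.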
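A side note on the family choice, as a rough sketch: in glmnet, "mgaussian" is the multi-response Gaussian family (a response matrix with several columns). It does not refer to having several predictors. With a single response column, "gaussian" would be the usual choice. The helper below, `pick_family`, is a hypothetical name used only to illustrate that distinction in base R; it is not part of glmnet.

```r
# Hypothetical helper: choose a glmnet family name from the shape of the response.
# NCOL() treats a plain vector as a single column.
pick_family <- function(y) {
  if (NCOL(y) > 1) "mgaussian" else "gaussian"
}

# Synthetic responses (made-up data, for illustration only).
set.seed(1)
y_single <- matrix(rnorm(10), ncol = 1)  # one response column
y_multi  <- matrix(rnorm(20), ncol = 2)  # two response columns

fam_single <- pick_family(y_single)
fam_multi  <- pick_family(y_multi)
print(c(fam_single, fam_multi))
```

Since switching families did not change the error here, the family is probably not the root cause, but it may still be worth settling before interpreting any fit.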

library(glmnet) #use the glmnet library to perform ridge regression
SWVars = read.csv(file.choose('SWData'), header = TRUE) #read the data into RStudio
n = 11940 #length of dataset
x = as.matrix(SWVars[0:12]) #read the desired variables in as a matrix
y = as.matrix(SWVars[16:16]) #read the desired response variable in as a matrix
train_rows = sample(1:n, 0.5*n) #randomly designate half of the data as training rows
x.train = x[train_rows] #designate half of the independent variable data for training
x.test = x[-train_rows] #designate the other half of the independent variable data for testing
y.train = y[train_rows] #designate half of the dependent variable data for training
y.test = y[-train_rows] #designate the other half of the dependent variable data for testing
#Fit a training curve for the data using cv.glmnet, minimising MSE, and using the mgaussian family for multiple regression
alpha0.fit = cv.glmnet(x.train,y.train, type.measure="mse", alpha=0, family="mgaussian")
structure(list(
Latitude = c(33.37648429, 33.58147205, 43.76802869, 
33.55658479, 44.36456222, 40.16155115, 45.77329011, 36.81228138, 
39.37683345, 34.4202345), 
ABSLt = c(33.37648429, 33.58147205, 
43.76802869, 33.55658479, 44.36456222, 40.16155115, 45.77329011, 
36.81228138, 39.37683345, 34.4202345), 
Longitude = c(-111.9196013, 
-111.2821257, -103.5206581, -111.5104323, -101.0545158, -79.05296653, 
-99.64100853, -96.04132091, -89.02743535, -111.0896969), 
ABSLn = c(111.9196013, 
111.2821257, 103.5206581, 111.5104323, 101.0545158, 79.05296653, 
99.64100853, 96.04132091, 89.02743535, 111.0896969), 
Eleveation = c(360.29, 
583, 1581.8, 459, 597.43, 551.86, 562.5, 230.56, 195.91, 2220
), 
Deuiterium = c(-32.16640732, -60.6107658, -64.8100282, -61.11196959, 
-22.34856023, -58.2616656, -69.80240134, -12.77002745, -37.88557439, 
-55.65939053), 
ABSd2H = c(32.16640732, 60.6107658, 64.8100282, 
61.11196959, 22.34856023, 58.2616656, 69.80240134, 12.77002745, 
37.88557439, 55.65939053), 
Oxygen.18 = c(-0.679664825, -7.65316576, 
-6.660581453, -7.378091132, 0.207154673, -8.921727565, -7.789111383, 
-2.43286863, -5.066014096, -7.447887386), 
ABSd18O = c(0.679664825, 
7.65316576, 6.660581453, 7.378091132, 0.207154673, 8.921727565, 
7.789111383, 2.43286863, 5.066014096, 7.447887386), dex = c(-27L, 
1L, -12L, -2L, -24L, 13L, -7L, 7L, 3L, 4L), 
ABSdex = c(27L, 1L, 
12L, 2L, 24L, 13L, 7L, 7L, 3L, 4L), 
DOY = c(15L, 15L, 15L, 15L, 
15L, 15L, 15L, 15L, 15L, 15L), 
sine_DOY = c(0.128748177, 0.128748177, 
0.128748177, 0.128748177, 0.128748177, 0.128748177, 0.128748177, 
0.128748177, 0.128748177, 0.128748177), 
PDSI = c(-1.133137345, 
-1.133137345, -0.944772124, -1.163842678, 3.165101767, -2.081107855, 
-3.871144056, -1.90775156, -2.455032349, -2.285209417)), 
.Names = c("Latitude", 
"ABSLt", "Longitude", "ABSLn", "Eleveation", "Deuiterium", "ABSd2H", 
"Oxygen.18", "ABSd18O", "dex", "ABSdex", "DOY", "sine_DOY", "PDSI"
), row.names = c(NA, 10L), class = "data.frame")

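I am not sure what is causing this problem, and I would like this part of the code to run cleanly. I would love to figure out where the problem lies so that I can carry on with the regression step by step. Thanks for any help you can provide.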
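One possible cause worth checking, offered as a sketch and not a confirmed diagnosis: indexing a matrix with a single subscript, as in `x[train_rows]`, returns a plain vector instead of a sub-matrix, so `nrow()` of the result is `NULL`. If `cv.glmnet` takes `N` from `nrow(x)` (an assumption about its internals, suggested by the `rep(1, N)` in the error message), that would lead to this error. The example below uses a small synthetic matrix and base R only; `x_demo` and `rows_demo` are hypothetical names, not taken from the code above.

```r
# Synthetic stand-in for the predictor matrix (made-up data).
set.seed(1)
x_demo <- matrix(rnorm(40), nrow = 10, ncol = 4)
rows_demo <- sample(seq_len(nrow(x_demo)), 5)

# Single-subscript indexing treats the matrix as one long vector.
bad <- x_demo[rows_demo]
is.matrix(bad)   # FALSE
nrow(bad)        # NULL

# Row indexing with a comma keeps the matrix structure.
good <- x_demo[rows_demo, , drop = FALSE]
dim(good)        # 5 4

# rep() with a NULL count fails; capture the message instead of stopping.
msg <- tryCatch(rep(1, nrow(bad)), error = function(e) conditionMessage(e))
print(msg)
```

If this is the cause, the corresponding change would be to index rows explicitly, for example `x[train_rows, ]` and `x[-train_rows, ]`, so that the training and test predictors remain matrices.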

0 answers:

No answers yet