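I am trying to run a linear regression on a dataset (x_final). I have split the data into test and train sets; my dependent variable is YDepDelay, and I have stored the independent variables as XDepDelay. I get this error: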
`[[<-.data.frame`(`*tmp*`, i, value = c(304L, 304L, 304L, 304L, : replacement has 19000000 rows, data has 1000000
Code:
library("dplyr")
# Set the seed before sampling so the split is reproducible
set.seed(123)
# Randomly sample 524735 row indexes from x_final for the test set
indexes <- sample(1:nrow(x_final), size = 524735)
# Split data: rows not in indexes form the training set
train <- x_final[-indexes, ]
dim(train)
test <- x_final[indexes, ]
dim(test)
# For DepDelay: response and predictors stored as matrices
YDepDelay <- as.matrix(train$DepDelay)
XDepDelay <- as.matrix(train[, names(train) != "DepDelay"])
model_DepDelay <- lm(YDepDelay ~ XDepDelay, data = train, drop.unused.levels = TRUE)
summary(model_DepDelay)
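
One possible reading of the error, not verified against x_final: 19000000 is 19 × 1000000, which would be consistent with the whole XDepDelay matrix entering the formula as a single term while train has 1000000 rows. Note also that as.matrix() on a data frame with any non-numeric column returns a character matrix. The sketch below shows the same split-and-fit workflow using the formula interface on the data frame directly, so no matrices are needed. Everything in it is hypothetical except the column name DepDelay: x_demo is a small synthetic stand-in for x_final, and Distance and Carrier are invented columns.

```r
# Hypothetical stand-in for x_final: a small synthetic data frame.
# Only the column name DepDelay comes from the question.
set.seed(123)  # seed before any sampling so the split is reproducible
n <- 200
x_demo <- data.frame(
  DepDelay = rnorm(n, mean = 10, sd = 5),
  Distance = runif(n, min = 100, max = 2000),
  Carrier  = factor(sample(c("AA", "DL", "UA"), n, replace = TRUE))
)

# Hold out 50 rows for testing; the remaining rows form the training set.
test_idx   <- sample(seq_len(nrow(x_demo)), size = 50)
train_demo <- x_demo[-test_idx, ]
test_demo  <- x_demo[test_idx, ]

# Formula interface: "." stands for every column of `data` except the response.
# lm() builds the design matrix itself, including dummy coding of factors.
model_demo <- lm(DepDelay ~ ., data = train_demo)
summary(model_demo)

# Predict on the held-out rows using the same column names.
pred <- predict(model_demo, newdata = test_demo)
length(pred)
```

With the real data, the equivalent call would be lm(DepDelay ~ ., data = train), assuming every other column of train is meant to be a predictor.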