Error when running linear regression in R: replacement has 19000000 rows, data has 1000000

Time: 2018-04-22 18:44:34

Tags: r linear-regression large-data

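I am trying to run a linear regression on a dataset (x_final). I have split the data into test and train sets. My dependent variable is YDepDelay, and I store the independent variables as XDepDelay. I get this error: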

  

Error in `[[<-.data.frame`(`*tmp*`, i, value = c(304L, 304L, 304L, 304L, : replacement has 19000000 rows, data has 1000000
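
For reference, this message comes from the data frame assignment method `[[<-.data.frame`, which raises it when the value being assigned has more rows than the data frame itself. The sketch below reproduces the same kind of message on a toy data frame; the names `df`, `b`, and `res` are made up for illustration and are not part of the question. Note that 19,000,000 is 19 × 1,000,000, which would be consistent with a 19-column matrix being treated as a single variable, although the question does not say how many columns x_final has.

```r
# Toy data frame with 3 rows (hypothetical, for illustration only)
df <- data.frame(a = 1:3)

# Assigning a 6-element vector as a new column fails, because the
# replacement has more rows than the data frame
res <- tryCatch(
  {
    df[["b"]] <- 1:6
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# In an English locale the captured message is:
# "replacement has 6 rows, data has 3"
print(res)
```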

Code:

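# Randomly pick 524,735 row indexes from x_final to hold out as the test set;
# the remaining rows (1,000,000 according to the error message) form the training set
set.seed(123)  # set the seed before sampling so the split is reproducible
indexes <- sample(seq_len(nrow(x_final)), size = 524735)

# Split data
library("dplyr")
train <- x_final[-indexes, ]
dim(train)
test <- x_final[indexes, ]
dim(test)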

# For DepDelay: response and predictors, each converted to a matrix

YDepDelay <- as.matrix(train$DepDelay)
XDepDelay <- as.matrix(train[, names(train) != "DepDelay"])
model_DepDelay <- lm(YDepDelay ~ XDepDelay, data = train, drop.unused.levels = TRUE)
summary(model_DepDelay)
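
One possible workaround, offered as a sketch rather than a confirmed fix: pass the data frame to `lm()` through the formula interface (`DepDelay ~ .`) instead of wrapping all predictors in a single matrix. `as.matrix()` on a data frame that contains character or factor columns returns a character matrix, so `XDepDelay` may not be the numeric design matrix the call expects. The toy data frame below and its columns `Distance` and `Carrier` are hypothetical stand-ins for x_final, whose actual columns are not shown in the question.

```r
set.seed(123)

# Hypothetical stand-in for x_final: one numeric and one factor predictor
toy <- data.frame(
  DepDelay = rnorm(200),
  Distance = runif(200, min = 100, max = 2000),
  Carrier  = factor(sample(c("AA", "DL", "UA"), 200, replace = TRUE))
)

# Hold out 50 rows as the test set; the rest is the training set
test_idx <- sample(seq_len(nrow(toy)), size = 50)
train <- toy[-test_idx, ]
test  <- toy[test_idx, ]

# Formula interface: "." means every column of `train` except the response.
# lm() expands factor columns into dummy variables on its own.
model <- lm(DepDelay ~ ., data = train)
summary(model)

# Predict on the held-out rows
pred <- predict(model, newdata = test)
```

With the formula interface, each column stays a separate term in the model frame, so no single variable has more rows than `train`.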

0 answers:

No answers