Factor error when trying to predict with an lm model

Asked: 2020-05-23 09:10:33

Tags: r machine-learning linear-regression

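I have this data frame: all six combinations

I need to build a linear regression model. When I try to "factor" some of the features, I get this error:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor grade has new levels 1

I don't know what to do about it. I think I need to convert almost all of the features I use to factors, but I always get this error.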

My code:

house.data.raw <- read.csv('housedata.csv')
library(ggplot2)
house.data.prepared <- house.data.raw

# Convert the date column (e.g. "20141013T000000") to Date type
house.data.prepared$date <- as.Date(house.data.prepared$date, "%Y%m%dT000000")

# Remove all rows that contain one or more NA values
numberOfNA <- sum(is.na(house.data.prepared))
if (numberOfNA > 0)
{
  cat('Number of missing values: ', numberOfNA)
  cat('\nRemoving missing values...')
  house.data.prepared <- house.data.prepared[complete.cases(house.data.prepared), ]
}
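# Assumption: house.data.final is not defined in the posted code;
# it is taken here to be the cleaned data frame from above
house.data.final <- house.data.prepared

# Convert categorical features to factors
house.data.final$bedrooms <- factor(house.data.final$bedrooms)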
house.data.final$floors <- factor(house.data.final$floors)
house.data.final$waterfront <- factor(house.data.final$waterfront)
house.data.final$view <- factor(house.data.final$view)
house.data.final$condition <- factor(house.data.final$condition)
house.data.final$grade <- factor(house.data.final$grade)

library(caTools)
filter <- sample.split(house.data.final$bedrooms, SplitRatio = 0.7)

# Training set
house.train <- subset(house.data.final, filter == TRUE)

# Test set
house.test <- subset(house.data.final, filter == FALSE)

dim(house.data.final)
dim(house.train)
dim(house.test)

model <- lm(price ~ ., data = house.train)
summary(model)

predict.train <- predict(model, house.train)
predict.test <- predict(model, house.test)
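
For reference, here is a minimal sketch of one way this error can arise. It assumes the cause is a factor level (such as grade 1) that ends up only in the test split, which the post does not confirm. The toy data frame and the names toy, train, test, unseen and test.ok are hypothetical and not taken from housedata.csv. The sketch compares the levels the model saw (model$xlevels) with the levels in the test set and predicts only on rows whose levels were seen during training.

```r
# Hypothetical toy data: grade 1 occurs in exactly one row
toy <- data.frame(
  price = c(101, 118, 152, 179, 214, 238, 262, 88),
  sqft  = c(50, 60, 75, 90, 105, 120, 130, 45),
  grade = factor(c(2, 2, 3, 3, 4, 4, 3, 1))
)

train <- toy[1:6, ]   # grades 2, 3, 4 only
test  <- toy[7:8, ]   # one grade-3 row and the only grade-1 row

# lm() drops factor levels that have no rows in the training data
model <- lm(price ~ sqft + grade, data = train)

# Predicting on a row with an unseen level fails; capture the error object
result <- tryCatch(predict(model, newdata = test), error = function(e) e)

# Levels present in the test set that the model never saw
unseen <- setdiff(as.character(unique(test$grade)), model$xlevels$grade)

# One possible workaround: predict only on rows with known levels
test.ok <- test[!(test$grade %in% unseen), , drop = FALSE]
pred <- predict(model, newdata = test.ok)
```

Dropping rows is only one option. Depending on the data, it may be more appropriate to split so that every level appears in the training set, to merge rare levels, or to keep a column such as grade numeric instead of converting it to a factor.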

0 answers:

No answers