执行预测功能时在R中获取错误

时间:2017-09-25 18:35:29

标签: r lm predict

每当我尝试根据线性模型预测数据时,我都会收到以下错误。

  

警告消息:'newdata'有101行,但找到的变量有296行

以下是代码段

trainingFrame = data.frame(weeksTrainingConv,bugsTraining)
validateFrame = data.frame(weekTestConv,bugsTest)


model <- lm(totWeekConv ~ totBugs,trainingFrame)
myPrediction <- predict(model,validateFrame)

数据框及其组件的计算是在单独的表格中编写的。这是片段。我根据代码的性质注释掉了块。第一个块代表我的训练数据集,第二个块是我将用来测试我的模型的数据集。最后,最后一个块是总数据集。

library(lubridate)
#training DataSet

weeksTraining = as.Date(c("2003-12-28","2004-01-04","2004-01-11","2004-01-18","2004-01-25","2004-02-01","2004-02-08","2004-02-15","2004-02-22","2004-02-29","2004-03-07","2004-03-14","2004-03-21","2004-03-28","2004-04-04","2004-04-11","2004-04-18","2004-04-25","2004-05-02","2004-05-09","2004-05-16","2004-05-23","2004-05-30","2004-06-06","2004-06-13","2004-06-20","2004-06-27","2004-07-04","2004-07-11","2004-07-18","2004-07-25","2004-08-01","2004-08-08","2004-08-15","2004-08-22","2004-08-29","2004-09-05","2004-09-12","2004-09-19","2004-09-26","2004-10-03","2004-10-10","2004-10-17","2004-10-24","2004-10-31","2004-11-07","2004-11-14","2004-11-21","2004-11-28","2004-12-05","2004-12-12","2004-12-19","2004-12-26","2005-01-02","2005-01-09","2005-01-16","2005-01-23","2005-01-30","2005-02-06","2005-02-13","2005-02-20","2005-02-27","2005-03-06","2005-03-13","2005-03-20","2005-03-27","2005-04-03","2005-04-10","2005-04-17","2005-04-24","2005-05-01","2005-05-08","2005-05-15","2005-05-22","2005-05-29","2005-06-05","2005-06-12","2005-06-19","2005-06-26","2005-07-03","2005-07-10","2005-07-17","2005-07-24","2005-07-31","2005-08-07","2005-08-14","2005-08-21","2005-08-28","2005-09-04","2005-09-11","2005-09-18","2005-09-25","2005-10-02","2005-10-09","2005-10-16","2005-10-23","2005-10-30","2005-11-06","2005-11-13","2005-11-20","2005-11-27","2005-12-04","2005-12-11","2005-12-18","2005-12-25","2006-01-01","2006-01-08","2006-01-15","2006-01-22","2006-01-29","2006-02-05","2006-02-12","2006-02-19","2006-02-26","2006-03-05","2006-03-12","2006-03-19","2006-03-26","2006-04-02","2006-04-09","2006-04-16","2006-04-23","2006-04-30","2006-05-07","2006-05-14","2006-05-21","2006-05-28","2006-06-04","2006-06-11","2006-06-18","2006-06-25","2006-07-02","2006-07-09","2006-07-16","2006-07-23","2006-07-30","2006-08-06","2006-08-13","2006-08-20","2006-08-27","2006-09-03","2006-09-10","2006-09-17","2006-09-24","2006-10-01","2006-10-08","2006-10-15","2006-10-22","2006-10-29","2006-11-05","2006-11-12","2006-11-19","2006-11-26","2006-12-03","2006-12-10","2006-12-17","2006-12-24","2006-12-31","2007-01-07","2007-01-14","2007-01-21","2007-01-28","2007-02-04","2007-02-11","2007-02-18","2007-02-25","2007-03-04","2007-03-11","2007-03-18","2007-03-25","2007-04-01","2007-04-08","2007-04-15","2007-04-22","2007-04-29","2007-05-06","2007-05-13","2007-05-20","2007-05-27","2007-06-03","2007-06-10","2007-06-17","2007-06-24","2007-07-01","2007-07-08","2007-07-15","2007-07-22","2007-07-29","2007-08-05","2007-08-12","2007-08-19","2007-08-26","2007-09-02","2007-09-09","2007-09-16"))
bugsTraining = c(3,18,14,25,21,13,17,25,21,18,20,11,17,19,23,9,7,18,13,17,16,15,16,18,20,12,14,16,19,23,18,10,24,23,11,14,16,19,22,20,15,21,14,9,19,12,18,12,20,10,20,16,14,12,16,11,10,18,20,17,17,20,16,15,20,19,9,11,11,17,10,14,10,16,7,14,11,9,10,9,14,7,13,13,13,16,17,7,17,8,11,11,10,16,9,20,9,13,13,6,11,21,8,10,7,14,16,13,12,9,13,12,17,13,10,12,15,14,8,8,9,13,9,9,18,9,6,10,14,11,5,6,7,4,9,9,9,6,4,5,7,10,12,7,4,13,11,9,6,6,2,8,10,2,7,7,4,1,5,5,10,11,5,11,9,14,5,9,2,6,6,4,4,2,5,7,13,6,4,3,1,5,4,4,2,6,3,5,2,5,5,3,1,5,2)
weeksTrainingConv = numeric();

#converting Dates to numerical Value
for(i in 1:length(weeksTraining)){
  val = ymd(weeksTraining[i])
  val = as.numeric(val)
  weeksTrainingConv[i] = c(val)
  print(weeksTrainingConv[i])
}

#end Training DataSet




#test DataSet

weekTest = as.Date(c("2007-09-23","2007-09-30","2007-10-07","2007-10-14","2007-10-21","2007-10-28","2007-11-04","2007-11-11","2007-11-18","2007-11-25","2007-12-02","2007-12-09","2007-12-16","2007-12-30","2008-01-06","2008-01-13","2008-01-20","2008-01-27","2008-02-03","2008-02-10","2008-02-17","2008-02-24","2008-03-02","2008-03-09","2008-03-16","2008-03-23","2008-03-30","2008-04-06","2008-04-13","2008-04-20","2008-04-27","2008-05-04","2008-05-11","2008-05-18","2008-05-25","2008-06-01","2008-06-08","2008-06-15","2008-06-22","2008-06-29","2008-07-06","2008-07-20","2008-07-27","2008-08-03","2008-08-10","2008-08-17","2008-08-24","2008-08-31","2008-09-07","2008-09-14","2008-09-21","2008-09-28","2008-10-05","2008-10-12","2008-10-19","2008-10-26","2008-11-02","2008-11-09","2008-11-16","2008-11-30","2008-12-07","2008-12-14","2009-01-04","2009-01-11","2009-01-18","2009-01-25","2009-02-01","2009-02-15","2009-02-22","2009-03-15","2009-03-22","2009-03-29","2009-04-05","2009-04-12","2009-04-19","2009-04-26","2009-05-10","2009-05-17","2009-05-24","2009-05-31","2009-06-21","2009-06-28","2009-07-05","2009-07-12","2009-07-19","2009-07-26","2009-08-02","2009-08-09","2009-08-16","2009-08-23","2009-09-06","2009-09-20","2009-09-27","2009-10-04","2009-10-11","2009-10-25","2009-11-01","2009-11-08","2009-11-15","2009-11-29","2009-12-06"));
bugsTest = c(2,4,5,1,4,4,2,4,1,7,2,2,4,1,2,3,1,2,3,1,4,2,10,1,1,6,3,5,1,4,2,3,2,4,2,1,5,6,3,1,1,2,2,5,1,1,2,1,2,3,3,4,4,3,2,3,1,2,6,1,1,1,2,2,2,3,1,1,2,1,3,4,2,3,1,3,1,2,2,1,1,2,2,1,1,1,2,2,2,1,4,3,2,2,6,2,4,3,2,2,1)
weekTestConv = numeric()
#converting Dates to numerical Value
for(i in 1:length(weekTest)){
  val = ymd(weekTest[i])
  val = as.numeric(val)
  weekTestConv[i] = c(val)
}

#end Test DataSet




#total DataSet
totWeek = as.Date(c("2003-12-28","2004-01-04","2004-01-11","2004-01-18","2004-01-25","2004-02-01","2004-02-08","2004-02-15","2004-02-22","2004-02-29","2004-03-07","2004-03-14","2004-03-21","2004-03-28","2004-04-04","2004-04-11","2004-04-18","2004-04-25","2004-05-02","2004-05-09","2004-05-16","2004-05-23","2004-05-30","2004-06-06","2004-06-13","2004-06-20","2004-06-27","2004-07-04","2004-07-11","2004-07-18","2004-07-25","2004-08-01","2004-08-08","2004-08-15","2004-08-22","2004-08-29","2004-09-05","2004-09-12","2004-09-19","2004-09-26","2004-10-03","2004-10-10","2004-10-17","2004-10-24","2004-10-31","2004-11-07","2004-11-14","2004-11-21","2004-11-28","2004-12-05","2004-12-12","2004-12-19","2004-12-26","2005-01-02","2005-01-09","2005-01-16","2005-01-23","2005-01-30","2005-02-06","2005-02-13","2005-02-20","2005-02-27","2005-03-06","2005-03-13","2005-03-20","2005-03-27","2005-04-03","2005-04-10","2005-04-17","2005-04-24","2005-05-01","2005-05-08","2005-05-15","2005-05-22","2005-05-29","2005-06-05","2005-06-12","2005-06-19","2005-06-26","2005-07-03","2005-07-10","2005-07-17","2005-07-24","2005-07-31","2005-08-07","2005-08-14","2005-08-21","2005-08-28","2005-09-04","2005-09-11","2005-09-18","2005-09-25","2005-10-02","2005-10-09","2005-10-16","2005-10-23","2005-10-30","2005-11-06","2005-11-13","2005-11-20","2005-11-27","2005-12-04","2005-12-11","2005-12-18","2005-12-25","2006-01-01","2006-01-08","2006-01-15","2006-01-22","2006-01-29","2006-02-05","2006-02-12","2006-02-19","2006-02-26","2006-03-05","2006-03-12","2006-03-19","2006-03-26","2006-04-02","2006-04-09","2006-04-16","2006-04-23","2006-04-30","2006-05-07","2006-05-14","2006-05-21","2006-05-28","2006-06-04","2006-06-11","2006-06-18","2006-06-25","2006-07-02","2006-07-09","2006-07-16","2006-07-23","2006-07-30","2006-08-06","2006-08-13","2006-08-20","2006-08-27","2006-09-03","2006-09-10","2006-09-17","2006-09-24","2006-10-01","2006-10-08","2006-10-15","2006-10-22","2006-10-29","2006-11-05","2006-11-12","2006-11-19","2006-11-26","2006-12-03","2006-12-10","2006-12-17","2006-12-24","2006-12-31","2007-01-07","2007-01-14","2007-01-21","2007-01-28","2007-02-04","2007-02-11","2007-02-18","2007-02-25","2007-03-04","2007-03-11","2007-03-18","2007-03-25","2007-04-01","2007-04-08","2007-04-15","2007-04-22","2007-04-29","2007-05-06","2007-05-13","2007-05-20","2007-05-27","2007-06-03","2007-06-10","2007-06-17","2007-06-24","2007-07-01","2007-07-08","2007-07-15","2007-07-22","2007-07-29","2007-08-05","2007-08-12","2007-08-19","2007-08-26","2007-09-02","2007-09-09","2007-09-16","2007-09-23","2007-09-30","2007-10-07","2007-10-14","2007-10-21","2007-10-28","2007-11-04","2007-11-11","2007-11-18","2007-11-25","2007-12-02","2007-12-09","2007-12-16","2007-12-30","2008-01-06","2008-01-13","2008-01-20","2008-01-27","2008-02-03","2008-02-10","2008-02-17","2008-02-24","2008-03-02","2008-03-09","2008-03-16","2008-03-23","2008-03-30","2008-04-06","2008-04-13","2008-04-20","2008-04-27","2008-05-04","2008-05-11","2008-05-18","2008-05-25","2008-06-01","2008-06-08","2008-06-15","2008-06-22","2008-06-29","2008-07-06","2008-07-20","2008-07-27","2008-08-03","2008-08-10","2008-08-17","2008-08-24","2008-08-31","2008-09-07","2008-09-14","2008-09-21","2008-09-28","2008-10-05","2008-10-12","2008-10-19","2008-10-26","2008-11-02","2008-11-09","2008-11-16","2008-11-30","2008-12-07","2008-12-14","2009-01-04","2009-01-11","2009-01-18","2009-01-25","2009-02-01","2009-02-15","2009-02-22","2009-03-15","2009-03-22","2009-03-29","2009-04-05","2009-04-12","2009-04-19","2009-04-26","2009-05-10","2009-05-17","2009-05-24","2009-05-31","2009-06-21","2009-06-28","2009-07-05","2009-07-12","2009-07-19","2009-07-26","2009-08-02","2009-08-09","2009-08-16","2009-08-23","2009-09-06","2009-09-20","2009-09-27","2009-10-04","2009-10-11","2009-10-25","2009-11-01","2009-11-08","2009-11-15","2009-11-29","2009-12-06"))
totBugs = c(3,18,14,25,21,13,17,25,21,18,20,11,17,19,23,9,7,18,13,17,16,15,16,18,20,12,14,16,19,23,18,10,24,23,11,14,16,19,22,20,15,21,14,9,19,12,18,12,20,10,20,16,14,12,16,11,10,18,20,17,17,20,16,15,20,19,9,11,11,17,10,14,10,16,7,14,11,9,10,9,14,7,13,13,13,16,17,7,17,8,11,11,10,16,9,20,9,13,13,6,11,21,8,10,7,14,16,13,12,9,13,12,17,13,10,12,15,14,8,8,9,13,9,9,18,9,6,10,14,11,5,6,7,4,9,9,9,6,4,5,7,10,12,7,4,13,11,9,6,6,2,8,10,2,7,7,4,1,5,5,10,11,5,11,9,14,5,9,2,6,6,4,4,2,5,7,13,6,4,3,1,5,4,4,2,6,3,5,2,5,5,3,1,5,2,2,4,5,1,4,4,2,4,1,7,2,2,4,1,2,3,1,2,3,1,4,2,10,1,1,6,3,5,1,4,2,3,2,4,2,1,5,6,3,1,1,2,2,5,1,1,2,1,2,3,3,4,4,3,2,3,1,2,6,1,1,1,2,2,2,3,1,1,2,1,3,4,2,3,1,3,1,2,2,1,1,2,2,1,1,1,2,2,2,1,4,3,2,2,6,2,4,3,2,2,1)

totWeekConv = numeric();

#converting Dates to numerical Value
for(i in 1:length(totWeek)){
  val = ymd(totWeek[i])
  val = as.numeric(val)
  totWeekConv[i] = c(val)
}


#end Total DataSet

我想创建一个线性模型,并建立周与错误之间的关系。我将周日期转换为数字格式以便于计算。 我可以使用lm()命令创建模型,并为其提供了训练数据集,如第一个代码片段所示。每当我想使用测试数据集来预测模型时,在这种情况下测试数据集是一个名为“validateFrame”的数据帧,程序会给我一个错误说明

  

警告消息:'newdata'有101行,但找到的变量有296行   行

我是R的新手,我已经用谷歌搜索了但是我在某个地方失败了。我已经用Google搜索过了,但我找到的解决方案似乎对我不起作用。

1 个答案:

答案 0 :(得分:0)

问题在于您的初始代码段。

trainingFrame = data.frame(weeksTrainingConv,bugsTraining)
validateFrame = data.frame(weekTestConv,bugsTest)

model <- lm(totWeekConv ~ totBugs, trainingFrame)
myPrediction <- predict(model,validateFrame)

首先,您使用totWeekConv中的totBugstrainingFrame创建模型。但是trainingFrame没有带有这些名称的变量。它包含名为weeksTrainingConvbugsTraining的列。然后,您尝试评估变量名称不同的validateFrame上的模型 - weekTestConvbugsTest。您必须始终使用相同的变量名称。

我不太确定你打算如何使用totWeekConvtotBugs,但我相信你想要的是:

trainingFrame = data.frame(weeksConv = weeksTrainingConv,bugs = bugsTraining)
validateFrame = data.frame(weeksConv = weekTestConv,bugs = bugsTest)

model <- lm(weeksConv ~ bugs,trainingFrame)
myPrediction <- predict(model,validateFrame)

在这里,您正在培训培训数据和测试测试数据,但两个地方的列名相同。