Why do some factor levels disappear when fitting a regression model?

Asked: 2019-10-23 06:11:52

Tags: regression prediction lm levels

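I want to make some predictions using regression analysis.

After fitting the model with the 'lm' function, I tried to use the 'predict' function to make predictions.

But instead of predictions, I got this message: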

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor g2 has new levels 28, ...

Here is the code I used.

# Note: despite its name, testid2 holds the row indices of the 80% training sample
testid2<-sample(1:length(h2),0.8*length(h2))

h2data<-data.frame(h2,e2,g2,i2,k2)

h2.train<-h2data[testid2,]

h2.test<-h2data[-testid2,]


 str(h2data)

'data.frame':   1936 obs. of  5 variables:

 $ h2: int  30 41 41 40 35 35 31 44 35 37 ...

 $ e2: int  374 362 814 719 704 724 714 689 687 660 ...

 $ g2: Factor w/ 76 levels "1","2","3","4",..: 6 7 26 27 28 36 40 41 42 43 ...

 $ i2: num  0.913 0.666 0.946 0.971 0.935 0.977 0.838 0.972 0.985 0.996 ...

 $ k2: int  50 36 172 187 170 173 166 145 157 157 ...


rreg2<-lm(log(h2+1)~e2+g2+i2+k2,data=h2.train)

pred2<-predict(rreg2,newdata=h2.test)

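By searching on Google, I found that this error occurs because my test data contains some levels of g2 that are not in the fitted model.

But the training data I used to fit the model contains all of the levels.

I found that some factor levels disappeared during model fitting.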
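One likely explanation, sketched on toy data: 'levels' reports the levels a factor declares, and subsetting a data frame keeps that full list even for levels that have no rows in the subset. 'lm' builds its model frame with unused levels dropped, so 'xlevels' lists only the levels that actually occur in the rows used for fitting. As far as I know, rows removed for containing NA are excluded before that step, which can drop further levels. The data frame d and the split below are hypothetical, not the data from the question.

```r
# Hypothetical toy data: a factor that declares 5 levels but uses only 3.
set.seed(1)
d <- data.frame(
  y = rnorm(12),
  g = factor(rep(c("a", "b", "c"), each = 4),
             levels = c("a", "b", "c", "d", "e"))
)

# Training rows contain only "a" and "b".
train <- d[d$g != "c", ]

# Subsetting keeps every declared level, observed or not.
declared <- levels(train$g)

# droplevels() shows the levels that really occur in the rows.
observed <- levels(droplevels(train$g))

# lm() keeps only the observed levels in the fitted object.
fit <- lm(y ~ g, data = train)
fit$xlevels$g
```

So seeing all 76 levels in levels(h2.train$g2) does not by itself show that every level has rows in the training sample.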

levels(h2.train$g2)

[1] "1"  "2"  "3"  "4"  "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37"

[18] "38" "39" "40" "41" "42" "43" "44" "45" "46" "47" "48" "49" "50" "51" "52" "53" "54"

[35] "55" "56" "57" "58" "59" "60" "61" "62" "63" "64" "65" "66" "67" "68" "69" "70" "71"

[52] "72" "73" "74" "75" "76" "77" "78" "79" "80" "81" "82" "83" "84" "85" "86" "87" "88"

[69] "89" "90" "91" "92" "93" "94" "95" "96"

levels(h2.test$g2)

[1] "1"  "2"  "3"  "4"  "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37"

[18] "38" "39" "40" "41" "42" "43" "44" "45" "46" "47" "48" "49" "50" "51" "52" "53" "54"

[35] "55" "56" "57" "58" "59" "60" "61" "62" "63" "64" "65" "66" "67" "68" "69" "70" "71"

[52] "72" "73" "74" "75" "76" "77" "78" "79" "80" "81" "82" "83" "84" "85" "86" "87" "88"


[69] "89" "90" "91" "92" "93" "94" "95" "96"

rreg2$xlevels$g2

[1] "26" "27" "38" "40" "41" "42" "43" "44" "45" "46" "47" "49" "50" "56" "57" "58" "59"

[18] "61" "62" "63" "65" "66" "67" "68" "69" "70" "71" "72" "77" "84" "88" "91" "95"
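To see which levels are behind the error, it may help to count rows per level rather than rely on 'levels', and to compare the values that occur in the test rows with the levels stored in the fit. This is a minimal sketch on hypothetical toy data (d, train, test and unseen are illustrative names); with the data from the question the same two checks would be run on h2.train, h2.test and rreg2.

```r
# Hypothetical toy data: level "e" is declared but never observed.
d <- data.frame(
  y = c(1.2, 0.7, 1.9, 2.4, 0.3, 1.1, 2.2, 1.6),
  g = factor(c("a", "a", "b", "b", "c", "d", "a", "b"),
             levels = c("a", "b", "c", "d", "e"))
)
train <- d[1:4, ]  # rows with levels "a" and "b" only
test  <- d[5:8, ]  # rows with levels "c", "d", "a", "b"

fit <- lm(y ~ g, data = train)

# Rows per level in the training data: a zero count marks a level
# that is declared but has no training rows.
counts <- table(train$g)
counts

# Values present in the test rows that the model has no coefficient for.
unseen <- setdiff(unique(as.character(test$g)), fit$xlevels$g)
unseen
```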

So my question is: what should I do to keep all levels in the model and get predictions without errors?
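A level with no training rows has no estimated coefficient, so the model cannot produce a prediction for it. One possible workaround, sketched here on hypothetical toy data, is to predict only for test rows whose level is known to the fit and to leave the remaining rows as NA. The names d, train, test, keep and pred are illustrative and not taken from the question.

```r
# Hypothetical toy data: "c" and "d" occur only in the test rows.
d <- data.frame(
  y = c(1.2, 0.7, 1.9, 2.4, 0.3, 1.1, 2.2, 1.6),
  g = factor(c("a", "a", "b", "b", "c", "d", "a", "b"))
)
train <- d[1:4, ]
test  <- d[5:8, ]

fit <- lm(y ~ g, data = train)

# Predict only for rows whose level has an estimated coefficient;
# the other rows stay NA instead of making predict() stop with an error.
keep <- as.character(test$g) %in% fit$xlevels$g
pred <- rep(NA_real_, nrow(test))
pred[keep] <- predict(fit, newdata = test[keep, , drop = FALSE])
pred
```

Other options that may fit better, depending on the data, are to split the rows so that every level that occurs at all also occurs in the training sample, or to merge rare levels into a single category before fitting.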

Thank you for your help.

0 answers:

No answers yet