fitted() returns NULL for a fitted model

Date: 2018-12-31 10:26:10

Tags: r machine-learning neural-network random-forest

I want to train models with different algorithms. For example, this works:

library(foreign)  # one package that provides read.arff(); RWeka has its own version
dd <- read.arff("china.arff")
model <- lm(Effort ~ ., data = dd)
fitted(model)

But the following code returns NULL for the same dataset:

install.packages("neuralnet")
library(neuralnet)
model=neuralnet(Effort~N_effort+Duration, data=dd, 
                   hidden=1,err.fct="ce", linear.output=FALSE)
fitted(model)

# returns NULL

A randomForest model shows a similar result.

These models cannot possibly be free of error, so what could the problem be?

structure(list(Output = c(150, 98, 27, 60, 69, 19, 14, 17, 64, 
60, 27, 17, 41, 40, 12, 38, 57, 20, 66, 112, 28, 68, 15, 15), 
    Inquiry = c(75, 70, 0, 20, 1, 0, 0, 15, 14, 20, 29, 8, 16, 
    20, 13, 24, 12, 24, 13, 21, 4, 0, 6, 0), RawFPcounts = c(1750, 
    1902, 535, 660, 478.89, 377.33, 256.25, 262.73, 715.79, 690.43, 
    465.45, 298.67, 490.59, 802.35, 220, 487.62, 550.91, 363.64, 
    1073.91, 1310, 476.19, 694, 189.52, 273.68), AdjFP = c(1750, 
    1902, 428, 759, 431, 283, 205, 289, 680, 794, 512, 224, 417, 
    682, 209, 512, 606, 400, 1235, 1572, 500, 694, 199, 260), 
    Effort = c(102.4, 105.2, 11.1, 21.1, 28.8, 10, 8, 4.9, 12.9, 
    19, 10.8, 2.9, 7.5, 12, 4.1, 15.8, 18.3, 8.9, 38.1, 61.2, 
    3.6, 11.8, 0.5, 6.1)), class = "data.frame", row.names = c(NA, 
-24L))

1 Answer:

Answer 0 (score: 0)

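As far as I know, fitted is not widely used in R (except perhaps in the context of GLMs); honestly, I had never heard of the function before (and I have been programming in R for about 7 years).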

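So it is not surprising that, outside the context of (generalized) linear models (that is, for models such as neural networks or random forests), the method is not actually implemented and simply returns NULL. More precisely, fitted() is a generic whose default method just returns the fitted.values component stored in the model object; neuralnet and randomForest objects store no such component, so the result is NULL.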
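The NULL result can be reproduced without any modeling package. The sketch below uses a made-up class name (toy_model, purely hypothetical) to show that the default fitted() method only looks up a stored component and computes nothing itself:

```r
# A toy object standing in for a model that stores no fitted values.
# "toy_model" is a hypothetical class with no fitted() method of its own,
# so the call falls through to the default method in the stats package.
toy_without <- structure(list(coefficients = c(a = 1)), class = "toy_model")
res_without <- fitted(toy_without)
is.null(res_without)  # no fitted.values component, so nothing to return

# The same kind of object, but with a stored fitted.values component.
toy_with <- structure(list(fitted.values = c(1.5, 2.5)), class = "toy_model")
res_with <- fitted(toy_with)
res_with              # the stored values come back unchanged
```

This is why a NULL from fitted() says nothing about whether the model was trained successfully; it only says the object does not keep its fitted values under that name.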

The good news may come from asking yourself why exactly you want to use fitted. In practice, fitted is roughly equivalent to predict, at least for simple linear models:

df <- data.frame(income=c(5,3,47,8,6,5),
               won=c(0,0,1,1,1,0),
               age=c(18,18,23,50,19,39),
               home=c(0,0,1,0,0,1))

md1 <- lm(income ~ age + home, data=df) # linear model



fitted(md1)
        1         2         3         4         5         6 
 7.893273  7.893273 28.320749 -1.389725  7.603179 23.679251 

predict(md1)
        1         2         3         4         5         6 
 7.893273  7.893273 28.320749 -1.389725  7.603179 23.679251 

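With a GLM, you only need to specify type='response' in predict for the two functions to return practically the same results again (by default, predict on a GLM returns values on the scale of the link function):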

md2 <- glm(factor(won) ~ age + home, data=df, family=binomial(link="logit")) #glm

fitted(md2)
        1         2         3         4         5         6 
0.4208590 0.4208590 0.4193888 0.7274819 0.4308001 0.5806112 

predict(md2)
         1          2          3          4          5          6 
-0.3192480 -0.3192480 -0.3252830  0.9818840 -0.2785876  0.3252830 

predict(md2, type='response')
        1         2         3         4         5         6 
0.4208590 0.4208590 0.4193888 0.7274819 0.4308001 0.5806112 
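Rather than comparing the printed numbers by eye, the equivalence can be checked programmatically. This is a small sketch that reuses the df, md1 and md2 definitions from above; the use of all.equal() and plogis() is my addition, not part of the original answer:

```r
df <- data.frame(income = c(5, 3, 47, 8, 6, 5),
                 won    = c(0, 0, 1, 1, 1, 0),
                 age    = c(18, 18, 23, 50, 19, 39),
                 home   = c(0, 0, 1, 0, 0, 1))

md1 <- lm(income ~ age + home, data = df)
md2 <- glm(factor(won) ~ age + home, data = df,
           family = binomial(link = "logit"))

# Linear model: fitted values and in-sample predictions should agree.
same_lm <- isTRUE(all.equal(fitted(md1), predict(md1)))

# GLM: agreement is expected only on the response scale.
same_glm <- isTRUE(all.equal(fitted(md2), predict(md2, type = "response")))

# The default predict() output is on the link (log-odds) scale; applying the
# inverse logit, plogis(), should recover the fitted probabilities.
same_link <- isTRUE(all.equal(fitted(md2), plogis(predict(md2))))

c(lm = same_lm, glm = same_glm, link = same_link)
```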

So, although fitted does indeed return NULL for a random forest model:

library(randomForest)
rf <- randomForest(income ~ age + home, data=df)
fitted(rf)
NULL

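you can simply get the desired results with predict. Note that, when called without newdata, predict on a randomForest object returns the out-of-bag predictions; pass newdata=df if you want predictions from the full forest on the training rows: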

predict(rf)
        1         2         3         4         5         6 
 9.748170 11.463800  5.186755 13.905696  8.791710 29.000931 
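The same idea carries over to the neuralnet model from the question. The following is only a sketch under several assumptions of mine: it uses two columns (AdjFP and Effort) copied from the data posted in the question, rescales them to [0, 1], and switches to the regression defaults (linear.output = TRUE, sum-of-squares error) instead of the err.fct="ce" setting in the question, since Effort is continuous. The helper rescale01 is hypothetical. The in-sample predictions are read from the net.result component of the returned object, which is absent if training does not converge.

```r
library(neuralnet)

# Two columns copied from the data posted in the question.
dd <- data.frame(
  AdjFP  = c(1750, 1902, 428, 759, 431, 283, 205, 289, 680, 794, 512, 224,
             417, 682, 209, 512, 606, 400, 1235, 1572, 500, 694, 199, 260),
  Effort = c(102.4, 105.2, 11.1, 21.1, 28.8, 10, 8, 4.9, 12.9, 19, 10.8, 2.9,
             7.5, 12, 4.1, 15.8, 18.3, 8.9, 38.1, 61.2, 3.6, 11.8, 0.5, 6.1)
)

# Hypothetical helper: rescale a numeric vector to [0, 1].
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))
scaled <- as.data.frame(lapply(dd, rescale01))

set.seed(42)  # starting weights are random
nn <- neuralnet(Effort ~ AdjFP, data = scaled, hidden = 1,
                linear.output = TRUE)

nn_fitted <- fitted(nn)  # NULL: the object has no fitted.values component

# In-sample predictions are stored per repetition in net.result
# (the component is missing if training did not converge).
pred <- NULL
if (!is.null(nn$net.result)) {
  pred_scaled <- nn$net.result[[1]][, 1]
  # Undo the rescaling to express predictions in the original Effort units.
  pred <- pred_scaled * (max(dd$Effort) - min(dd$Effort)) + min(dd$Effort)
  print(head(pred))
}
```

For predictions on new data, recent releases of neuralnet provide a predict method for these objects, while older releases rely on compute(); which one is available depends on the installed version.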

The following threads may also be useful:

Is there a difference between the R functions fitted() and predict()?

Finding the fitted and predicted values for a statistical model