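Predicting Weekly_Sales in R using the 'neuralnet' package

Time: 2017-07-08 08:47:32

Tags: r neural-network

I am new to R, and I am trying to predict Weekly_Sales for a test dataset by training a neural network with the neuralnet package in R.

The data I am working with (train1):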

  Store Dept Date     Temperature  Fuel_Price MarkDown1 MarkDown2 MarkDown3  MarkDown4  MarkDown5   CPI         Unemployment   IsHoliday    Rank Weekly_Sales
   1    1   5/2/2010       42.31      2.572     -2000      -500      -100        -500      -700 211.0963582        8.106         0         13     24924.50
   1    1  12/2/2010       38.51      2.548     -2000      -500      -100        -500      -700 211.2421698        8.106         1         13     46039.49
   1    1 19/02/2010       39.93      2.514     -2000      -500      -100        -500      -700 211.2891429        8.106         0         13     41595.55
   1    1 26/02/2010       46.63      2.561     -2000      -500      -100        -500      -700 211.3196429        8.106         0         13     19403.54
   1    1   5/3/2010       46.50      2.625     -2000      -500      -100        -500      -700 211.3501429        8.106         0         13     21827.90
   1    1  12/3/2010       57.79      2.667     -2000      -500      -100        -500      -700 211.3501429        8.106         0         13     21827.90

Splitting the data

> ind <- sample(2, nrow(train1), replace = TRUE, prob = c(0.7, 0.3))
> train <- train1[ind == 1, ]
> test <- train1[ind == 2, ]

Training set

>head(train)
Store Dept  Date    Temperature   Fuel_Price MarkDown1 MarkDown2 MarkDown3  MarkDown4 MarkDown5      CPI    Unemployment      IsHoliday    Rank  Weekly_Sales
 1    1   5/2/2010       42.31      2.572     -2000      -500      -100     -500      -700       211.0963582     8.106         0          13     24924.50
 1    1 26-02-2010       46.63      2.561     -2000      -500      -100     -500      -700       211.3196429     8.106         0          13     19403.54
 1    1   5/3/2010       46.50      2.625     -2000      -500      -100     -500      -700       211.3501429     8.106         0          13     21827.90
 1    1 19-03-2010       54.58      2.720     -2000      -500      -100     -500      -700       211.2156350     8.106         0          13     22136.64
 1    1 26-03-2010       51.45      2.732     -2000      -500      -100     -500      -700       211.0180424     8.106         0          13     26229.21
 1    1   2/4/2010       62.27      2.719     -2000      -500      -100     -500      -700       210.8204499     7.808         0          13     57258.43

Test set

>head(test)
Store Dept  Date   Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3  MarkDown4 MarkDown5  CPI     Unemployment   IsHoliday Rank Weekly_Sales
1    1  12/2/2010       38.51      2.548     -2000      -500      -100       -500      -700   211.2421698    8.106       1     13     46039.49
1    1 19-02-2010       39.93      2.514     -2000      -500      -100       -500      -700   211.2891429    8.106       0     13     41595.55
1    1  12/3/2010       57.79      2.667     -2000      -500      -100       -500      -700   211.3806429    8.106       0     13     21043.39
1    1   7/5/2010       72.55      2.835     -2000      -500      -100       -500      -700   210.3399684    7.808       0     13     17413.94
1    1 21-05-2010       76.44      2.826     -2000      -500      -100       -500      -700   210.6170934    7.808       0     13     14773.04
1    1 28-05-2010       80.44      2.759     -2000      -500      -100       -500      -700   210.8967606    7.808       0     13     15580.43

The code I used is as follows:

> library(neuralnet)
> n <- neuralnet(Weekly_Sales ~ Temperature + Fuel_Price + MarkDown1 + MarkDown2 + MarkDown3 + MarkDown4 + MarkDown5 + CPI + Unemployment + IsHoliday + Rank, data = train, hidden = c(4, 3), err.fct = "sse", linear.output = FALSE)
> plot(n)
> output <- compute(n, test[, 4:14])
> output1 <- output$net.result * (max(test$Weekly_Sales) - min(test$Weekly_Sales)) + min(test$Weekly_Sales)

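The neural network trains, but the reported error is on the order of 10^13. I get the same output every time I run the code, and the predictions are not even close to the actual Weekly_Sales in the test data. I also tried a dataset from another department, but still got the same predictions.

Output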

>head(output$net.result)
      [,1]
2  0.9999999998
3  0.9999999998
6  0.9999999998
14 0.9999999998
16 0.9999999998
17 0.9999999998



> head(output1)
    [,1]
2  149743.97
3  149743.97
6  149743.97
14 149743.97
16 149743.97
17 149743.97

1 answer:

Answer 0 (score: 1)

You need to normalize the data before applying neuralnet(). So, before splitting train1 into train/test, use the code below:

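# Scale only the numeric columns; apply() over the whole data frame would turn everything into text because of the Date column
num_cols <- sapply(train1, is.numeric)
maximum <- sapply(train1[num_cols], max)
minimum <- sapply(train1[num_cols], min)
train1_scaled <- train1
train1_scaled[num_cols] <- as.data.frame(scale(train1[num_cols], center = minimum, scale = maximum - minimum))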
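As a rough illustration of why the scaling matters, the sketch below feeds three CPI values from the question into a logistic function, which is the default activation in neuralnet (act.fct = "logistic"). The helper names min_max and logistic are made up for this example. On the raw values the activation is fully saturated, while on min-max scaled values it still varies with the input.

```r
# Hypothetical helper: min-max scale a numeric vector to the range [0, 1].
min_max <- function(x) (x - min(x)) / (max(x) - min(x))

# Logistic activation, written out by hand for illustration.
logistic <- function(z) 1 / (1 + exp(-z))

# Three CPI values taken from the question's data.
cpi <- c(211.0963582, 211.2421698, 211.2891429)

raw_act    <- logistic(cpi)           # saturated: every value is 1 in double precision
scaled_act <- logistic(min_max(cpi))  # values between 0.5 and about 0.73

print(raw_act)
print(scaled_act)
```

Note that a column holding a single repeated value has max equal to min, so this kind of scaling divides by zero and yields NaN. Such columns carry no information for the network and are probably best left out of the formula.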

Then split the scaled data with your code (using train1_scaled in place of train1) and fit the network as follows:

# linear.output should be TRUE because you are predicting a continuous dependent variable
n <- neuralnet(Weekly_Sales ~ Temperature + Fuel_Price + MarkDown1 + MarkDown2 + MarkDown3 + MarkDown4 + MarkDown5 + CPI + Unemployment + IsHoliday + Rank, data = train, hidden = c(4, 3), err.fct = "sse", linear.output = TRUE)
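With linear.output = FALSE the activation function is also applied to the output neuron, so the network can only return values between 0 and 1. The arithmetic below is a small sketch of what the question's back-transformation then does with a saturated output. It uses the six Weekly_Sales values shown in head(test), not the full test set, so the resulting number differs from the 149743.97 in the question, which is presumably the maximum of the asker's complete test set.

```r
# Weekly_Sales values from the six test rows shown in the question.
sales <- c(46039.49, 41595.55, 21043.39, 17413.94, 14773.04, 15580.43)

# Saturated network outputs, as printed in the question.
net_result <- rep(0.9999999998, length(sales))

# The question's back-transformation, applied to those outputs.
back <- net_result * (max(sales) - min(sales)) + min(sales)

# Every prediction collapses onto (almost exactly) the maximum of sales.
print(back)
print(max(sales))
```

This is why every row in output1 showed the same number: the prediction no longer depends on the inputs at all.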

The code that follows the neuralnet() call also needs a small modification:

# To convert the predictions back to the original scale, use the range of the original, unscaled data (train1), not the 'test' dataset
output1 <- output$net.result * (max(train1$Weekly_Sales) - min(train1$Weekly_Sales)) + min(train1$Weekly_Sales)

# The dependent variable in the test dataset needs the same conversion
test$Weekly_Sales_nonScaled <- test$Weekly_Sales * (max(train1$Weekly_Sales) - min(train1$Weekly_Sales)) + min(train1$Weekly_Sales)

# After this you can compare the original data (test$Weekly_Sales_nonScaled) with the predicted data (output1)
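Putting the steps together, here is a minimal end-to-end sketch. It runs on synthetic data generated inside the example (the column names are borrowed from the question, but the numbers and the linear relationship are made up), uses only two predictors, and adds an RMSE at the end as one possible way to do the comparison. Treat it as an illustration of the workflow under those assumptions rather than a drop-in replacement for your code.

```r
library(neuralnet)

set.seed(42)  # make the random split and weight initialisation repeatable

# Synthetic stand-in for train1 (made-up data, not the asker's).
n_obs <- 200
df <- data.frame(Temperature = runif(n_obs, 30, 90),
                 Fuel_Price  = runif(n_obs, 2.5, 3.0))
df$Weekly_Sales <- 20000 + 150 * df$Temperature + 5000 * df$Fuel_Price +
  rnorm(n_obs, sd = 500)

# Min-max scale every column using the full-data minimum and maximum.
mins <- sapply(df, min)
maxs <- sapply(df, max)
df_scaled <- as.data.frame(scale(df, center = mins, scale = maxs - mins))

# Same 70/30 split as in the question, applied to the scaled data.
ind   <- sample(2, nrow(df_scaled), replace = TRUE, prob = c(0.7, 0.3))
train <- df_scaled[ind == 1, ]
test  <- df_scaled[ind == 2, ]

fit <- neuralnet(Weekly_Sales ~ Temperature + Fuel_Price, data = train,
                 hidden = c(4, 3), err.fct = "sse", linear.output = TRUE)

pred_scaled <- compute(fit, test[, c("Temperature", "Fuel_Price")])$net.result

# Undo the scaling with the range of the original, unscaled target.
rng    <- maxs["Weekly_Sales"] - mins["Weekly_Sales"]
pred   <- pred_scaled * rng + mins["Weekly_Sales"]
actual <- test$Weekly_Sales * rng + mins["Weekly_Sales"]

# Root mean squared error on the original scale.
rmse <- sqrt(mean((pred - actual)^2))
print(rmse)
```

If neuralnet() warns that the algorithm did not converge, raising its stepmax argument or using fewer hidden units are two things worth trying.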

Please don't forget to let us know whether this helped :)