What is wrong with my experiment (trying to predict car sales)?

Time: 2017-08-08 02:23:35

Tags: machine-learning regression azure-machine-learning-studio

I have a dataset like this (this is only a sample of it):

DATE_REF,MONTH,YEAR,DAY_OF_YEAR,DAY_OF_MONTH,WEEK_DAY,WEEK_DAY_1,WEEK_DAY_2,WEEK_DAY_3,WEEK_DAY_4,WEEK_DAY_5,WEEK_DAY_6,WEEK_DAY_7,WEEK_NUMBER_IN_MONTH,WEEKEND,WORK_DAY,AMOUNT_SOLD
20100101,1,2010,1,1,6,0,0,0,0,0,1,0,1,0,0,0
20100102,1,2010,2,2,7,0,0,0,0,0,0,1,1,1,0,2
20100103,1,2010,3,3,1,1,0,0,0,0,0,0,2,1,0,0
20100104,1,2010,4,4,2,0,1,0,0,0,0,0,2,0,1,12830
20100105,1,2010,5,5,3,0,0,1,0,0,0,0,2,0,1,19200
20100106,1,2010,6,6,4,0,0,0,1,0,0,0,2,0,1,22930
20100107,1,2010,7,7,5,0,0,0,0,1,0,0,2,0,1,23495
20100108,1,2010,8,8,6,0,0,0,0,0,1,0,2,0,1,23215
20100109,1,2010,9,9,7,0,0,0,0,0,0,1,2,1,0,172
20100110,1,2010,10,10,1,1,0,0,0,0,0,0,3,1,0,0
20100111,1,2010,11,11,2,0,1,0,0,0,0,0,3,0,1,18815
20100112,1,2010,12,12,3,0,0,1,0,0,0,0,3,0,1,25415
20100113,1,2010,13,13,4,0,0,0,1,0,0,0,3,0,1,25262
20100114,1,2010,14,14,5,0,0,0,0,1,0,0,3,0,1,27967
20100115,1,2010,15,15,6,0,0,0,0,0,1,0,3,0,1,26352
20100116,1,2010,16,16,7,0,0,0,0,0,0,1,3,1,0,202
20100117,1,2010,17,17,1,1,0,0,0,0,0,0,4,1,0,10
20100118,1,2010,18,18,2,0,1,0,0,0,0,0,4,0,1,20295
20100119,1,2010,19,19,3,0,0,1,0,0,0,0,4,0,1,25982
20100120,1,2010,20,20,4,0,0,0,1,0,0,0,4,0,1,24745
20100121,1,2010,21,21,5,0,0,0,0,1,0,0,4,0,1,28087
20100122,1,2010,22,22,6,0,0,0,0,0,1,0,4,0,1,28417
20100123,1,2010,23,23,7,0,0,0,0,0,0,1,4,1,0,115
20100124,1,2010,24,24,1,1,0,0,0,0,0,0,5,1,0,5
20100125,1,2010,25,25,2,0,1,0,0,0,0,0,5,0,1,20185
20100126,1,2010,26,26,3,0,0,1,0,0,0,0,5,0,1,25932
20100127,1,2010,27,27,4,0,0,0,1,0,0,0,5,0,1,31710
20100128,1,2010,28,28,5,0,0,0,0,1,0,0,5,0,1,21020
20100129,1,2010,29,29,6,0,0,0,0,0,1,0,5,0,1,51460
20100130,1,2010,30,30,7,0,0,0,0,0,0,1,5,1,0,670
20100131,1,2010,31,31,1,1,0,0,0,0,0,0,6,1,0,17

I am trying to predict AMOUNT_SOLD for new dates (DATE_REF) with the following experiment on Azure ML:

Azure ML Experiment

I then deployed the web service and tested a prediction, but the AMOUNT_SOLD column I get back is zero.

What might I be missing?

1 answer:

Answer 0: (score: 1)

I wanted to replicate your Azure ML experiment, but I do not have enough data. Here is what I did instead:


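I copied your sample data and quadrupled it by applying Add Rows twice. I then used Split Data (70%/30%) with a random seed of 7 so that the results are reproducible, and Boosted Decision Tree Regression with default parameters. In Tune Model Hyperparameters I selected AMOUNT_SOLD as the label column. Score Model and Evaluate Model come after that.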
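The data preparation above can be approximated outside the Studio as a rough sketch. The helper names `add_rows` and `split_rows` are made up for illustration, and the shuffle-and-cut split is only an approximation of what the Split Data module does, not its actual implementation. Only a few (DATE_REF, AMOUNT_SOLD) pairs from the question's sample are used.

```python
import random

def add_rows(first, second):
    # Mimics Add Rows: append the rows of one dataset to another.
    return first + second

def split_rows(rows, fraction=0.7, seed=7):
    # Approximation of a seeded random 70/30 split (not Azure's exact algorithm).
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]

# A few (DATE_REF, AMOUNT_SOLD) pairs taken from the sample in the question.
sample = [
    (20100104, 12830),
    (20100105, 19200),
    (20100106, 22930),
    (20100107, 23495),
    (20100108, 23215),
]

doubled = add_rows(sample, sample)        # Add Rows, first pass: 2x
quadrupled = add_rows(doubled, doubled)   # Add Rows, second pass: 4x
train, test = split_rows(quadrupled)

# Count test rows that also have an identical copy in the training set.
leaked = sum(1 for row in test if row in train)
print(len(train), len(test), leaked)
```

Because every row appears four times, test rows will often have an exact copy in the training set, so an evaluation on data prepared this way may look better than it would on genuinely unseen dates.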


The accuracy (coefficient of determination) is very good.
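For reference, the coefficient of determination is usually defined as 1 minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch of that textbook formula follows; the function name `r_squared` is illustrative, and this is not claimed to be the exact code behind Evaluate Model.

```python
def r_squared(actual, predicted):
    # R^2 = 1 - SS_res / SS_tot (standard definition).
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# A perfect prediction scores 1.0; always predicting the mean scores 0.0.
perfect = r_squared([1, 2, 3], [1, 2, 3])
baseline = r_squared([1, 2, 3], [2, 2, 2])
```

A value close to 1 means the model explains most of the variance in AMOUNT_SOLD on the rows it was scored on.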

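After that, to deploy it as a web service, you first have to create a predictive experiment from the training experiment. Choose Setup Web Service > Predictive Experiment and your experiment will rearrange itself as if by magic.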


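By default the Web Service Input module sits at the top of the experiment. I moved it and connected it to the right-hand input of Score Model, so that the parameters you send to the web service are scored with your Trained Model.

After the Score Model module I placed a Select Columns in Dataset module and selected only the column named Scored Labels. This column contains the model's predictions. I then used the Edit Metadata module to rename the Scored Labels column and passed the result to the Web Service Output module.

Your experiment is now ready to be deployed as a web service.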
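In plain Python terms, the two modules after Score Model do something like the following. This is only an analogy: `select_columns` and `rename_column` are hypothetical helpers, the scored value is made up, and the new column name `PREDICTED_AMOUNT_SOLD` is an assumption, since the answer does not say which name was chosen.

```python
def select_columns(rows, names):
    # Keep only the listed columns (analogous to Select Columns in Dataset).
    return [{name: row[name] for name in names} for row in rows]

def rename_column(rows, old, new):
    # Rename one column (analogous to renaming with Edit Metadata).
    return [{(new if key == old else key): value for key, value in row.items()}
            for row in rows]

# Made-up scored row: the input features plus the model's prediction.
scored = [{"DATE_REF": 20170818, "WEEK_DAY": 6, "Scored Labels": 12345.0}]

output = rename_column(
    select_columns(scored, ["Scored Labels"]),
    "Scored Labels",
    "PREDICTED_AMOUNT_SOLD",  # hypothetical name
)
```

The effect is that the web service returns only the prediction instead of echoing every input column back to the caller.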

To predict a new value, I tested the web service with the current date's details as input. (The DATE_REF input should have been 20170818, though :D)
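Typing the date details by hand is error-prone, so here is a small sketch that derives them from a date. The conventions are inferred from the sample rows in the question: Sunday is WEEK_DAY 1, weeks of the month start on Sunday, and WEEKEND is 1 on Saturday and Sunday. WORK_DAY cannot be derived from the date alone (2010-01-01 is a Friday with WORK_DAY 0), so the `holidays` argument is an assumption about how that column was built.

```python
from datetime import date

COLUMNS = [
    "DATE_REF", "MONTH", "YEAR", "DAY_OF_YEAR", "DAY_OF_MONTH", "WEEK_DAY",
    "WEEK_DAY_1", "WEEK_DAY_2", "WEEK_DAY_3", "WEEK_DAY_4",
    "WEEK_DAY_5", "WEEK_DAY_6", "WEEK_DAY_7",
    "WEEK_NUMBER_IN_MONTH", "WEEKEND", "WORK_DAY",
]

def sunday_first_weekday(d):
    # isoweekday() is Monday=1..Sunday=7; the sample uses Sunday=1..Saturday=7.
    return d.isoweekday() % 7 + 1

def calendar_features(d, holidays=frozenset()):
    week_day = sunday_first_weekday(d)
    first_week_day = sunday_first_weekday(d.replace(day=1))
    # Week of the month, with a new week starting every Sunday.
    week_in_month = (d.day + first_week_day - 2) // 7 + 1
    weekend = int(week_day in (1, 7))
    # Assumption: a work day is a weekday that is not a listed holiday.
    work_day = int(not weekend and d not in holidays)
    one_hot = [int(week_day == i) for i in range(1, 8)]
    values = [
        int(d.strftime("%Y%m%d")), d.month, d.year,
        d.timetuple().tm_yday, d.day, week_day,
        *one_hot, week_in_month, weekend, work_day,
    ]
    return dict(zip(COLUMNS, values))

features = calendar_features(date(2017, 8, 18))
```

Checking the function against a couple of rows from the sample data is a quick way to confirm that the inferred conventions match the training data before sending anything to the service.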


The service then returned a prediction for that date.

Your web service can now predict new values.
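If you call the service from code rather than from the test dialog, the request body could be built roughly as below. This is a sketch under assumptions: the `Inputs` / `ColumnNames` / `Values` / `GlobalParameters` layout follows the request-response sample code that Azure ML Studio generates as far as I recall, the input name `input1` is the usual default, and the AMOUNT_SOLD value of 0 is only a placeholder for the label column. Check the API help page of your own service for the exact schema, URL and key.

```python
import json

COLUMNS = [
    "DATE_REF", "MONTH", "YEAR", "DAY_OF_YEAR", "DAY_OF_MONTH", "WEEK_DAY",
    "WEEK_DAY_1", "WEEK_DAY_2", "WEEK_DAY_3", "WEEK_DAY_4",
    "WEEK_DAY_5", "WEEK_DAY_6", "WEEK_DAY_7",
    "WEEK_NUMBER_IN_MONTH", "WEEKEND", "WORK_DAY", "AMOUNT_SOLD",
]

def build_request_body(values, input_name="input1"):
    # input_name is an assumption; use the name shown on your service's help page.
    if len(values) != len(COLUMNS):
        raise ValueError("expected one value per column")
    payload = {
        "Inputs": {
            input_name: {
                "ColumnNames": COLUMNS,
                "Values": [[str(v) for v in values]],
            }
        },
        "GlobalParameters": {},
    }
    return json.dumps(payload).encode("utf-8")

# Feature row for 2017-08-18 (a Friday); WORK_DAY=1 assumes it is not a holiday,
# and the trailing 0 is a placeholder for the AMOUNT_SOLD label.
row = [20170818, 8, 2017, 230, 18, 6, 0, 0, 0, 0, 0, 1, 0, 3, 0, 1, 0]
body = build_request_body(row)
```

The resulting bytes would be sent with an HTTP POST to the service's request-response URL, with `Content-Type: application/json` and an `Authorization: Bearer <your API key>` header.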