Azure Machine Learning web service training output

Time: 2017-08-23 21:44:42

Tags: azure azure-data-factory azure-machine-learning-studio

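I recently set up an Azure Machine Learning experiment that is retrained, updated, and executed against sample documents every day using Azure Data Factory.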

My pipeline is set up similar to the following:

{
  "name": "RetrainAndExecutePipeline",
  "properties": {
    "activities": [{
      "type": "AzureMLBatchExecution",
      "typeProperties": {
        "webServiceOutputs": {
          "Output-TrainedModel": "TrainedModel"
        },
        "webServiceInputs": {},
        "globalParameters": {}
      },
      "outputs": [{
          "name": "TrainedModel"
        }
      ],
      "policy": {
        "timeout": "01:00:00",
        "concurrency": 1,
        "executionPriorityOrder": "NewestFirst",
        "retry": 3
      },
      "scheduler": {
        "frequency": "Day",
        "interval": 1,
        "offset": "22:00:00",
        "style": "StartOfInterval"
      },
      "name": "Retrain ML Model",
      "linkedServiceName": "TrainingService"
    }],
    "start": "2017-08-20T22:00:00Z",
    "end": "9999-09-09T00:00:00Z",
    "isPaused": false,
    "hubName": "autdatafactoryml_hub",
    "pipelineMode": "Scheduled"
  }
}

with the following TrainedModel dataset:

{
  "name": "TrainedModel",
  "properties": {
      "published": false,
      "type": "AzureBlob",
      "linkedServiceName": "AzureStorageLinkedService",
      "typeProperties": {
          "fileName": "trainedModel.ilearner",
          "folderPath": "trainingoutput",
          "format": {
              "type": "TextFormat"
          }
      },
      "availability": {
          "frequency": "Day",
          "interval": 1,
          "offset": "22:00:00",
          "style": "StartOfInterval"
      }
  }
}

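I have noticed that after training completes, the web service output connected to the "Train Model" node writes three files to Azure Blob storage: the ilearner file and two randomly named files with no extension, even though I never specified the latter two. One is an XML-formatted file with the following content: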

<?xml version="1.0" encoding="utf-8"?>
<RuntimeInfo>
  <Language>DotNet</Language>
  <Version>4.5.0</Version>
</RuntimeInfo>

The other contains the information you see when you visualize the output in the Azure ML experiment, formatted as JSON, as follows:

{
  "visualizationType": "learner",
  "learner": {
    "name": "LogisticRegressionClassifier",
    "isTrained": true,
    "settings": {
      "records": [
        ...
      ],
      "features": [
        {
          "name": "Setting",
          "index": 0,
          "elementType": "System.String",
          "featureType": "String Feature"
        },
        {
          "name": "Value",
          "index": 1,
          "elementType": "System.String",
          "featureType": "String Feature"
        }
      ],
      "name": null,
      "numberOfRows": 8,
      "numberOfColumns": 2
    },
    "weights": {
      "records": [
        ...
      ],
      "features": [
        {
          "name": "Feature",
          "index": 0,
          "elementType": "System.String",
          "featureType": "String Feature"
        },
        {
          "name": "Weight",
          "index": 1,
          "elementType": "System.Double",
          "featureType": "Numeric Feature"
        }
      ],
      "name": null,
      "numberOfRows": 92,
      "numberOfColumns": 2
    }
  }
}
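Because the two extra files have random names and no extension, one way to tell them apart is by content rather than by name. The following is a minimal sketch, assuming the raw bytes of each blob have already been downloaded by some other means; the function name `classify_output` and its return labels are hypothetical, and only the `visualizationType` key and the `RuntimeInfo` root element come from the files shown above.

```python
import json
import xml.etree.ElementTree as ET


def classify_output(payload: bytes) -> str:
    """Return 'visualization', 'runtime_info', or 'unknown' for a blob's raw bytes."""
    # utf-8-sig drops a leading byte order mark if one is present.
    text = payload.decode("utf-8-sig", errors="replace").strip()

    # The visualization file is a JSON object carrying a "visualizationType" key.
    try:
        doc = json.loads(text)
        if isinstance(doc, dict) and "visualizationType" in doc:
            return "visualization"
    except ValueError:
        pass

    # The other extra file is XML whose root element is <RuntimeInfo>.
    try:
        if ET.fromstring(text.encode("utf-8")).tag == "RuntimeInfo":
            return "runtime_info"
    except ET.ParseError:
        pass

    # Anything else, such as the binary .ilearner file, is left unclassified.
    return "unknown"
```

Checking content instead of names keeps the check stable across runs, since the file names change each time the model is retrained.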

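This JSON file is the one I am interested in, because I believe it holds the coefficient values. I want to track how individual coefficients change each time the trained model is updated, but I have not found a way to capture this output.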
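If the JSON file can be captured, tracking the coefficients could look roughly like the sketch below. It rests on an assumption that must be checked against a real file: the `records` arrays are elided above, so the code assumes each record is a row (a list) whose positions match the `index` values in `features`. The function names `extract_weights` and `weight_changes` are hypothetical.

```python
def extract_weights(visualization: dict) -> dict:
    """Map feature name -> weight from a parsed visualization document.

    ASSUMPTION: each entry of learner.weights.records is a row (list) whose
    positions match the "index" values listed under learner.weights.features.
    The records are elided in the sample, so verify this on a real file.
    """
    table = visualization["learner"]["weights"]
    column = {f["name"]: f["index"] for f in table["features"]}
    name_at, weight_at = column["Feature"], column["Weight"]
    return {row[name_at]: float(row[weight_at]) for row in table["records"]}


def weight_changes(previous: dict, current: dict) -> dict:
    """Return current minus previous per feature; a missing feature counts as 0.0."""
    names = sorted(set(previous) | set(current))
    return {n: current.get(n, 0.0) - previous.get(n, 0.0) for n in names}
```

Running `extract_weights` on each day's file and passing two consecutive results to `weight_changes` would give the per-feature drift between retraining runs.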

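My question is: is there a way to capture multiple outputs from a single web service output in an Azure ML experiment using Azure Data Factory? Or is there a completely different way I should approach this?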

I appreciate everyone's feedback. Thanks in advance.

1 Answer:

Answer 0: (score: 0)

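In Azure ML Studio, you can create a web service with multiple outputs by attaching multiple Web Service Output modules. When the web service is called, the outputs of these modules are returned in JSON format. Alternatively, you can use multiple Export Data modules, for example, to write several results to Azure storage.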
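On the Data Factory side, each additional Web Service Output module needs its own entry in the activity's `webServiceOutputs` map and a matching dataset in `outputs`. The sketch below shows that change on an activity shaped like the one in the question. The output name `Output-Weights` and the dataset name `ModelWeights` are hypothetical, as is the helper `with_extra_output`; the new dataset would also need its own definition, similar to the `TrainedModel` dataset.

```python
import json


def with_extra_output(activity: dict, ws_output: str, dataset: str) -> dict:
    """Return a copy of an AzureMLBatchExecution activity with one more output mapped."""
    updated = json.loads(json.dumps(activity))  # deep copy of plain JSON data
    # Map the web service output name to the Data Factory dataset name.
    updated["typeProperties"].setdefault("webServiceOutputs", {})[ws_output] = dataset
    # The dataset must also be listed among the activity's outputs.
    updated.setdefault("outputs", []).append({"name": dataset})
    return updated


activity = {
    "type": "AzureMLBatchExecution",
    "typeProperties": {
        "webServiceOutputs": {"Output-TrainedModel": "TrainedModel"},
        "webServiceInputs": {},
        "globalParameters": {},
    },
    "outputs": [{"name": "TrainedModel"}],
}

# "Output-Weights" and "ModelWeights" are placeholder names for illustration.
extended = with_extra_output(activity, "Output-Weights", "ModelWeights")
print(json.dumps(extended, indent=2))
```

Whether the coefficient table can be exposed through a second output depends on how the experiment is wired, so treat this only as the pipeline half of the change.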