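I'm having trouble training a model. I have a series of HTTP requests, and I want to determine whether a given request came from a bot. For training, I have the following class: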
public class Request
{
    public string Url { get; set; }
    public string UserAgent { get; set; }
    public bool IsBot { get; set; }
}
And a prediction class like this:
public class IsBotPrediction
{
    [ColumnName("PredictedLabel")]
    public bool Prediction { get; set; }
    public float Score { get; set; }
}
For this example only, I created a hard-coded list of data:
var trainingData = new List<Request>
{
    new Request { Url = "/wp-admin", UserAgent = "a bot", IsBot = true },
    new Request { Url = "/backoffice", UserAgent = "a bot", IsBot = true },
    new Request { Url = "/hack", UserAgent = "a bot", IsBot = true },
    new Request { Url = "/login", UserAgent = "a bot", IsBot = false },
    new Request { Url = "/dashboard", UserAgent = "a bot", IsBot = false },
    new Request { Url = "/humans.txt", UserAgent = "a bot", IsBot = false },
    new Request { Url = "/admin", UserAgent = "a bot", IsBot = true },
};
To train the model, I'm using the following code:
IDataView mlData = mlContext.Data.LoadFromEnumerable(trainingData);
var dataPrepPipeline = mlContext
    .Transforms
    .Text
    .FeaturizeText("UrlF", "Url")
    .Append(mlContext.Transforms.Text.FeaturizeText("UserAgentF", "UserAgent"))
    .Append(mlContext.Transforms.Concatenate("Features", "UrlF", "UserAgentF"))
    .Append(mlContext.Transforms.NormalizeMinMax("Features", "Features"))
    .AppendCacheCheckpoint(mlContext);
var prepPipeline = dataPrepPipeline.Fit(mlData);
var trainer = mlContext
    .BinaryClassification
    .Trainers
    .AveragedPerceptron(labelColumnName: "IsBot", numberOfIterations: 10, featureColumnName: "Features");
var preprocessedData = prepPipeline.Transform(mlData);
ITransformer trainedModel = trainer.Fit(preprocessedData);
Training the model appears to succeed. But when I try to create a prediction engine:
var predEngine = mlContext.Model.CreatePredictionEngine<Request, IsBotPrediction>(trainedModel);
I get the following exception:
System.ArgumentOutOfRangeException: 'Features column 'Features' not found (Parameter 'schema')'
Can you help me figure out what this means?
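Answer 0 (score: 2)
This is probably because the data was transformed separately before the trainer was fit. As a result, trainedModel contains only the trainer and not the transforms that produce the Features column, so the prediction engine cannot find that column in a raw Request.
The following setup, which fits the data preparation steps and the trainer as one pipeline, should work.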
// Data preparation steps (an estimator chain, not yet fitted).
var dataPrepPipeline = mlContext.Transforms.Text.FeaturizeText("UrlF", "Url")
    .Append(mlContext.Transforms.Text.FeaturizeText("UserAgentF", "UserAgent"))
    .Append(mlContext.Transforms.Concatenate("Features", "UrlF", "UserAgentF"))
    .Append(mlContext.Transforms.NormalizeMinMax("Features", "Features"))
    .AppendCacheCheckpoint(mlContext);

// Optional: fit and save the data preparation step on its own.
// Model.Save expects the schema of the data the model takes as input.
var dataPrepModel = dataPrepPipeline.Fit(mlData);
mlContext.Model.Save(dataPrepModel, mlData.Schema, "./dataprep.zip");

// Append the trainer so that the fitted model includes the transforms.
var pipeline = dataPrepPipeline.Append(
    mlContext.BinaryClassification.Trainers.AveragedPerceptron(labelColumnName: "IsBot", numberOfIterations: 10, featureColumnName: "Features"));

// Fit on the raw data; the pipeline featurizes it internally.
var model = pipeline.Fit(mlData);
mlContext.Model.Save(model, mlData.Schema, "./model.zip");

// The model now accepts a raw Request and creates the Features column itself.
var predEngine = mlContext.Model.CreatePredictionEngine<Request, IsBotPrediction>(model);
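If you would rather keep the two-step fit from the question, another option is to chain the two fitted transformers before creating the prediction engine. The following is a minimal sketch, assuming the Microsoft.ML NuGet package is referenced; the names prepModel and fullModel, the fixed seed, and the sample request passed to Predict are introduced here for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public class Request
{
    public string Url { get; set; }
    public string UserAgent { get; set; }
    public bool IsBot { get; set; }
}

public class IsBotPrediction
{
    [ColumnName("PredictedLabel")]
    public bool Prediction { get; set; }
    public float Score { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Fixed seed is an assumption made here for repeatable runs.
        var mlContext = new MLContext(seed: 0);

        // Hard-coded sample rows taken from the question.
        var trainingData = new List<Request>
        {
            new Request { Url = "/wp-admin", UserAgent = "a bot", IsBot = true },
            new Request { Url = "/backoffice", UserAgent = "a bot", IsBot = true },
            new Request { Url = "/hack", UserAgent = "a bot", IsBot = true },
            new Request { Url = "/login", UserAgent = "a bot", IsBot = false },
            new Request { Url = "/dashboard", UserAgent = "a bot", IsBot = false },
            new Request { Url = "/humans.txt", UserAgent = "a bot", IsBot = false },
            new Request { Url = "/admin", UserAgent = "a bot", IsBot = true },
        };
        IDataView mlData = mlContext.Data.LoadFromEnumerable(trainingData);

        // Step 1: fit the data preparation transforms on the raw data.
        var dataPrepPipeline = mlContext.Transforms.Text.FeaturizeText("UrlF", "Url")
            .Append(mlContext.Transforms.Text.FeaturizeText("UserAgentF", "UserAgent"))
            .Append(mlContext.Transforms.Concatenate("Features", "UrlF", "UserAgentF"))
            .Append(mlContext.Transforms.NormalizeMinMax("Features", "Features"));
        ITransformer prepModel = dataPrepPipeline.Fit(mlData);
        IDataView preprocessedData = prepModel.Transform(mlData);

        // Step 2: fit the trainer on the already featurized data.
        var trainer = mlContext.BinaryClassification.Trainers.AveragedPerceptron(
            labelColumnName: "IsBot", featureColumnName: "Features", numberOfIterations: 10);
        ITransformer trainedModel = trainer.Fit(preprocessedData);

        // Step 3: chain the fitted transforms and the fitted predictor, so the
        // combined model accepts a raw Request and builds "Features" itself.
        ITransformer fullModel = prepModel.Append(trainedModel);

        var predEngine = mlContext.Model
            .CreatePredictionEngine<Request, IsBotPrediction>(fullModel);

        // Hypothetical request used only to exercise the engine.
        var result = predEngine.Predict(new Request { Url = "/wp-admin", UserAgent = "a bot" });
        Console.WriteLine($"Prediction: {result.Prediction}, Score: {result.Score}");
    }
}
```

Either way, the point is the same: the model passed to CreatePredictionEngine has to start from the columns of Request, not from the already featurized data.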