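Training set has 0 instances, aborting training

Asked: 2019-02-11 17:56:45

Tags: c# .net-core ml.net

I am rebuilding my project on ML.NET 0.10. I got the data from this link, and it looks like this (I saved it as a .csv file in this form):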

diagnosis;radius_mean;texture_mean;perimeter_mean;area_mean;smoothness_mean;compactness_mean;concavity_mean;concave points_mean;symmetry_mean;fractal_dimension_mean;radius_se;texture_se;perimeter_se;area_se;smoothness_se;compactness_se;concavity_se;concave points_se;symmetry_se;fractal_dimension_se;radius_worst;texture_worst;perimeter_worst;area_worst;smoothness_worst;compactness_worst;concavity_worst;concave points_worst;symmetry_worst;fractal_dimension_worst
B;11.62;18.18;76.38;408.8;0.1175;0.1483;0.102;0.05564;0.1957;0.07255;0.4101;1.74;3.027;27.85;0.01459;0.03206;0.04961;0.01841;0.01807;0.005217;13.36;25.4;88.14;528.1;0.178;0.2878;0.3186;0.1416;0.266;0.0927
B;9.667;18.49;61.49;289.1;0.08946;0.06258;0.02948;0.01514;0.2238;0.06413;0.3776;1.35;2.569;22.73;0.007501;0.01989;0.02714;0.009883;0.0196;0.003913;11.14;25.62;70.88;385.2;0.1234;0.1542;0.1277;0.0656;0.3174;0.08524

My data class looks like this:

class CancerData
{
    [LoadColumn(0, 30), ColumnName("Features")]
    public float FeatureVector { get; set; }

    [LoadColumn(31)]
    public float Target { get; set; }
}

Now, my Program.cs file:

var mlContext = new MLContext();
var trainData = mlContext.Data.ReadFromTextFile<CancerData>("Cancer-train.csv", 
                             hasHeader: true, 
                             separatorChar: ';');

var pipeline = mlContext.Transforms
    .Normalize("Features")
    .AppendCacheCheckpoint(mlContext)
    .Append(mlContext.BinaryClassification.Trainers.StochasticDualCoordinateAscent(
        labelColumn: "Target",
        featureColumn: "Features"));

var model = pipeline.Fit(trainData);

var testData = mlContext.Data.ReadFromTextFile<CancerData>("Cancer-test.csv", 
                             hasHeader: true, 
                             separatorChar: ';');

var metrics = mlContext.BinaryClassification.Evaluate(model.Transform(testData), label: "Target");

This code throws an exception:

  

System.InvalidOperationException: 'Training set has 0 instances, aborting training.'


My question is: is my code correct? My .csv files are in the project folder, and the same setup worked with ML.NET 0.5. Thanks for any advice!

1 Answer:

Answer 0 (score: 2)

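LoadColumn(0, 30) tells the loader to read columns 0 through 30, but FeatureVector is declared as a single float. It needs to be at least a float[].

The first column contains text data, so it should be excluded from the FeatureVector array.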
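
To illustrate why the first column does not belong in a float vector, here is a minimal sketch that does not use ML.NET at all. It splits the first few fields of a sample row from the question and tests each with float.TryParse. The class and method names (FirstColumnCheck, IsNumeric) are hypothetical and exist only for this illustration; this is not a description of how the ML.NET text loader handles such values internally.

```csharp
using System;
using System.Globalization;

public static class FirstColumnCheck
{
    // Hypothetical helper: true when the field parses as a float.
    // Invariant culture is used so that '.' is the decimal separator
    // regardless of the machine's regional settings.
    public static bool IsNumeric(string field) =>
        float.TryParse(field, NumberStyles.Float, CultureInfo.InvariantCulture, out _);

    public static void Main()
    {
        // First few fields of a data row from the question.
        string row = "B;11.62;18.18;76.38";
        string[] fields = row.Split(';');

        for (int i = 0; i < fields.Length; i++)
        {
            Console.WriteLine($"column {i}: '{fields[i]}' numeric = {IsNumeric(fields[i])}");
        }
    }
}
```

Column 0 holds the letter B, which is not a number, while the remaining fields parse without trouble.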

CancerData should look like this:

class CancerData
{
    [LoadColumn(1, 30), ColumnName("Features")]
    public float[] FeatureVector { get; set; }

    [LoadColumn(31)]
    public float Target { get; set; }
}

If you also need the diagnosis column, it should be:

class CancerData
{
    [LoadColumn(0)]
    public string Diagnosis { get; set; }

    [LoadColumn(1, 30), ColumnName("Features")]
    public float[] FeatureVector { get; set; }

    [LoadColumn(31)]
    public float Target { get; set; }
}
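
One more thing worth checking is the index used for Target. The sketch below is plain C# with no ML.NET dependency. It counts the fields in the header line quoted in the question and tests a few indices against that count. The names ColumnIndexCheck, CountColumns and IsValidIndex are hypothetical, and the check assumes the real file has exactly the columns shown in the excerpt.

```csharp
using System;

public static class ColumnIndexCheck
{
    // Header line copied from the sample data in the question.
    public const string Header =
        "diagnosis;radius_mean;texture_mean;perimeter_mean;area_mean;smoothness_mean;" +
        "compactness_mean;concavity_mean;concave points_mean;symmetry_mean;fractal_dimension_mean;" +
        "radius_se;texture_se;perimeter_se;area_se;smoothness_se;compactness_se;concavity_se;" +
        "concave points_se;symmetry_se;fractal_dimension_se;" +
        "radius_worst;texture_worst;perimeter_worst;area_worst;smoothness_worst;compactness_worst;" +
        "concavity_worst;concave points_worst;symmetry_worst;fractal_dimension_worst";

    // Number of fields in a separator-delimited header line.
    public static int CountColumns(string headerLine, char separator) =>
        headerLine.Split(separator).Length;

    // True when a zero-based column index falls inside the file.
    public static bool IsValidIndex(int index, int columnCount) =>
        index >= 0 && index < columnCount;

    public static void Main()
    {
        int count = CountColumns(Header, ';');
        Console.WriteLine($"columns: {count}, valid indices: 0..{count - 1}");

        // Indices used by the LoadColumn attributes in this thread.
        foreach (int index in new[] { 0, 1, 30, 31 })
        {
            Console.WriteLine($"index {index} valid = {IsValidIndex(index, count)}");
        }
    }
}
```

The quoted header has 31 fields, so the valid indices are 0 through 30. Under that assumption, LoadColumn(1, 30) covers the 30 numeric features, but LoadColumn(31) points past the last column. If your file really has no extra column, the label probably has to come from the diagnosis column rather than from index 31.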