Can I create cross-validation for the ID3 algorithm in Accord.NET?

Asked: 2014-05-07 22:25:46

Tags: c# id3 cross-validation accord.net

A snapshot of my code (full version: http://pastebin.com/7ALhSKgX):

        var crossvalidation = new CrossValidation(size: data.Rows.Count, folds: 7);

        crossvalidation.Fitting = 
             delegate(int k, int[] indicesTrain, int[] indicesValidation)
        {
            //omitted declarations for clarity
            DecisionTree tree = new DecisionTree(attributes, classCount);

            //omitted
            double trainingError = 
               id3learning.ComputeError(trainingInputs, trainingOutputs);
            double validationError = 
               id3learning.ComputeError(validationInputs, validationOutputs);
            return new CrossValidationValues<DecisionTree>
               (tree, trainingError, validationError);
        };

The error is on this line:

          return new CrossValidationValues<DecisionTree>
                        (tree, trainingError, validationError);

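and the compiler reports: Cannot convert anonymous method to delegate type 'Accord.MachineLearning.CrossValidationFittingFunction' because some of the return types in the block are not implicitly convertible to the delegate return type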

2 answers:

Answer 0: (score: 1)

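The problem is that you are using the non-generic CrossValidation constructor to initialize the crossvalidation variable. The non-generic CrossValidation class inherits from CrossValidation<object>.

The Fitting property is a CrossValidationFittingFunction<TModel> delegate, and for the non-generic CrossValidation class TModel is object rather than DecisionTree. The delegate must therefore return CrossValidationValues<object>, and because generic classes are invariant in C#, CrossValidationValues<DecisionTree> cannot be implicitly converted to that type.

Depending on your intent, you can fix this by using the more specific generic constructor: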

var crossvalidation = new CrossValidation<DecisionTree>(size: data.Rows.Count, folds: 7);

Or by returning the less specific cross-validation values:

return new CrossValidationValues<object>(tree, trainingError, validationError);
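
To see why both fixes work, here is a minimal sketch that reproduces the same generic/non-generic relationship without Accord.NET. The types Values, FittingFunction and Validation, and this empty DecisionTree, are hypothetical stand-ins invented for illustration; they only mirror the shape described above and are not the library's classes.

```csharp
using System;

// Hypothetical stand-ins, NOT the Accord.NET types.
class DecisionTree { }

class Values<TModel>
{
    public TModel Model { get; }
    public double TrainingError { get; }
    public double ValidationError { get; }

    public Values(TModel model, double trainingError, double validationError)
    {
        Model = model;
        TrainingError = trainingError;
        ValidationError = validationError;
    }
}

delegate Values<TModel> FittingFunction<TModel>(
    int k, int[] indicesTrain, int[] indicesValidation);

class Validation<TModel>
{
    public FittingFunction<TModel> Fitting { get; set; }
}

// The non-generic convenience class fixes TModel to object.
class Validation : Validation<object> { }

static class Demo
{
    // Fix 1: use the generic class so the delegate returns Values<DecisionTree>.
    public static Validation<DecisionTree> BuildTyped()
    {
        var typed = new Validation<DecisionTree>();
        typed.Fitting = delegate(int k, int[] train, int[] validation)
        {
            return new Values<DecisionTree>(new DecisionTree(), 0.0, 0.0);
        };
        return typed;
    }

    // Fix 2: keep the non-generic class and return Values<object>.
    public static Validation BuildUntyped()
    {
        var untyped = new Validation();
        // Returning new Values<DecisionTree>(...) here would not compile:
        // Values<DecisionTree> is not implicitly convertible to Values<object>.
        untyped.Fitting = delegate(int k, int[] train, int[] validation)
        {
            return new Values<object>(new DecisionTree(), 0.0, 0.0);
        };
        return untyped;
    }

    public static void Main()
    {
        Values<DecisionTree> a = BuildTyped().Fitting(0, new int[0], new int[0]);
        Values<object> b = BuildUntyped().Fitting(0, new int[0], new int[0]);
        Console.WriteLine(a.Model.GetType().Name); // prints "DecisionTree"
        Console.WriteLine(b.Model is DecisionTree); // prints "True"
    }
}
```

With the first fix the model keeps its static type, so no cast is needed afterwards; with the second you get an object back and have to cast it to DecisionTree yourself.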

Answer 1: (score: 0)

As of version 3.7.0, you can create and run cross-validation without writing your own Fitting function. An example is shown below:

// Ensure we have reproducible results
Accord.Math.Random.Generator.Seed = 0;

// Get some data to be learned: Here we will download and use Wisconsin's
// (Diagnostic) Breast Cancer dataset, where the goal is to determine
// whether the characteristics extracted from a breast cancer exam
// correspond to a malignant or benign type of cancer. In order to do
// this using the Accord.NET Framework, all we have to do is:
var data = new WisconsinDiagnosticBreastCancer();

// Now, we can import the input features and output labels using
double[][] input = data.Features; // 569 samples, 30-dimensional features
int[] output = data.ClassLabels;  // 569 samples, 2 different class labels

// Now, let's say we want to measure the cross-validation performance of
// a decision tree with a maximum tree height of 5 and where variables
// are able to join the decision path at most 2 times during evaluation:
var cv = CrossValidation.Create(

    k: 10, // We will be using 10-fold cross validation

    learner: (p) => new C45Learning() // here we create the learning algorithm
    {
        Join = 2,
        MaxHeight = 5
    },

    // Now we have to specify how the tree performance should be measured:
    loss: (actual, expected, p) => new ZeroOneLoss(expected).Loss(actual),

    // This function can be used to perform any special
    // operations before the actual learning is done, but
    // here we will just leave it as simple as it can be:
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Finally, we have to pass the input and output data
    // that will be used in cross-validation. 
    x: input, y: output
);

// After the cross-validation object has been created,
// we can call its .Learn method with the input and 
// output data that will be partitioned into the folds:
var result = cv.Learn(input, output);

// We can grab some information about the problem:
int numberOfSamples = result.NumberOfSamples; // should be 569
int numberOfInputs = result.NumberOfInputs;   // should be 30
int numberOfOutputs = result.NumberOfOutputs; // should be 2

double trainingError = result.Training.Mean; // should be 0
double validationError = result.Validation.Mean; // should be 0.089661654135338359
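
For readers wondering what the indicesTrain and indicesValidation arrays in the question (or the k folds above) stand for, here is a rough sketch of k-fold index splitting in plain C#. It is only an illustration under simplifying assumptions: KFoldSketch and Split are hypothetical names, the folds are assigned round-robin without shuffling, and this is not how Accord.NET implements its partitioning.

```csharp
using System;
using System.Linq;

static class KFoldSketch
{
    // Assigns sample index i to fold (i % folds), then returns, for each fold,
    // the indices used for training and the indices held out for validation.
    public static (int[] Train, int[] Validation)[] Split(int size, int folds)
    {
        if (folds < 2 || folds > size)
            throw new ArgumentOutOfRangeException(nameof(folds));

        int[] indices = Enumerable.Range(0, size).ToArray();
        var result = new (int[] Train, int[] Validation)[folds];

        for (int k = 0; k < folds; k++)
        {
            int fold = k;
            result[k] = (
                indices.Where(i => i % folds != fold).ToArray(),
                indices.Where(i => i % folds == fold).ToArray());
        }
        return result;
    }

    public static void Main()
    {
        // 10 samples in 5 folds: each fold trains on 8 and validates on 2.
        foreach (var (train, validation) in Split(size: 10, folds: 5))
            Console.WriteLine($"train={train.Length}, validation={validation.Length}");
    }
}
```

Every sample lands in exactly one validation set, and the reported cross-validation error is the mean of the per-fold validation errors.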