Weka in C#: writing labels and data to an array

Date: 2013-12-03 21:28:07

Tags: c# arrays data-mining weka

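I have already asked a few questions about this block of code and got help with running parts of it concurrently and with its general structure. I have one more question, this time in the Weka context. I want to write the resulting labels into a data structure so that I can plot them later. Here is the complete code: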

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using weka.classifiers.meta;
using weka.classifiers.functions;
using weka.core;
using java.io;
using weka.clusterers;
using System.Diagnostics;
using System.Threading;

// From http://weka.wikispaces.com/IKVM+with+Weka+tutorial

class MainClass
{
    public static void Main(string[] args)
    {
        System.Console.WriteLine("J48 in C#");
        classifyTest();
    }

    const int percentSplit = 66;
    public static void classifyTest()
    {
        try
        {

            Stopwatch stopwatch1 = new Stopwatch();
            stopwatch1.Start();

            weka.core.Instances insts = new weka.core.Instances(new java.io.FileReader(@"C:\Users\Deines\Documents\School\Software\WekaSharp2012\data\sonar.arff"));
            insts.setClassIndex(insts.numAttributes() - 1);

            weka.classifiers.Classifier cl = new weka.classifiers.trees.J48();
            System.Console.WriteLine("Performing " + percentSplit + "% split evaluation.");

            // Stop timing
            stopwatch1.Stop();

            // Write result
            System.Console.WriteLine("Load The Data Set: {0}",
              stopwatch1.ElapsedMilliseconds);


            Stopwatch stopwatch2 = new Stopwatch();
            stopwatch2.Start();

            //randomize the order of the instances in the dataset.
            weka.filters.Filter myRandom = new weka.filters.unsupervised.instance.Randomize();
            myRandom.setInputFormat(insts);
            insts = weka.filters.Filter.useFilter(insts, myRandom);

            int trainSize = insts.numInstances() * percentSplit / 100;
            int testSize = insts.numInstances() - trainSize;
            weka.core.Instances train = new weka.core.Instances(insts, 0, trainSize);

            // Stop timing
            stopwatch2.Stop();

            // Write result
            System.Console.WriteLine("Tasks With Parameter Set: {0}",
              stopwatch2.ElapsedMilliseconds);


            Stopwatch stopwatch3 = new Stopwatch();
            stopwatch3.Start();

            cl.buildClassifier(train);
            int numCorrect = 0;
            for (int i = trainSize; i < insts.numInstances(); i++)
            {
                weka.core.Instance currentInst = insts.instance(i);
                double predictedClass = cl.classifyInstance(currentInst);
                if (predictedClass == insts.instance(i).classValue())
                    numCorrect++;
            }
            // Stop timing
            stopwatch3.Stop();

            // Write result
            System.Console.WriteLine("Sequential Time: {0}",
              stopwatch3.ElapsedMilliseconds);

            Stopwatch stopwatch4 = new Stopwatch();
            stopwatch4.Start();

            //Parallel Calculation
            cl.buildClassifier(train);
            int numCorrectpara = 0;
            System.Threading.Tasks.Parallel.For(trainSize, insts.numInstances(), i =>
            {
                weka.core.Instance currentInst = insts.instance(i);
                double predictedClass = cl.classifyInstance(currentInst);
                if (predictedClass == insts.instance(i).classValue())
                    Interlocked.Increment(ref numCorrectpara);
            });

            // Stop timing
            stopwatch4.Stop();

            // Write result
            System.Console.WriteLine("Parallel Time: {0}",
              stopwatch4.ElapsedMilliseconds);


            System.Console.WriteLine(numCorrect + " out of " + testSize + " correct (" +
           (double)((double)numCorrect / (double)testSize * 100.0) + "%)");


            System.Console.WriteLine(numCorrectpara + " out of " + testSize + " correct (" +
           (double)((double)numCorrectpara/ (double)testSize * 100.0) + "%)");

        }
        catch (java.lang.Exception ex)
        {
            ex.printStackTrace();
        }
    }
}

The part that should build the array is below; this is how I thought it would work:

    cl.buildClassifier(train);
    int numCorrect = 0;
    string[,] array = new string[38, insts.numInstances()];

    for (int i = trainSize; i < insts.numInstances(); i++)
    {
        weka.core.Instance currentInst = insts.instance(i);
        double predictedClass = cl.classifyInstance(currentInst);
        string value = array[currentInst, predictedClass];
        Console.WriteLine(value);

        if (predictedClass == insts.instance(i).classValue())
            numCorrect++;
    }

I get these errors:

Error 1 Cannot implicitly convert type 'weka.core.Instance' to 'int' C:\Users\Deines\Documents\School\Software\WekaCSharp2012\Class1.cs 77 38 WekaSample
Error 2 Cannot implicitly convert type 'double' to 'int'. An explicit conversion exists (are you missing a cast?) C:\Users\Deines\Documents\School\Software\WekaCSharp2012\Class1.cs 77 51 WekaSample

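I assumed a data set of 38 elements and a length defined by the number of instances. I am still new to C#, since I have spent most of my time in functional programming, so I apologize if the question is naive. Thank you very much.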

1 Answer:

Answer 0: (score: 0)

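First, your array is a two-dimensional array of strings. Cells of an array are accessed with integer indices. It looks like you are using two types that are not integers as indices: one is of type weka.core.Instance and the other is of type double. If these really are your indices, you should either convert them to integers in some way or use a different "container".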
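To make the indexing rule concrete, here is a minimal sketch that does not depend on Weka. The class name `IndexDemo`, the helper `ToIndex`, and the literal values are made up for illustration; the double simply stands in for whatever `cl.classifyInstance(...)` returns.

```csharp
using System;

static class IndexDemo
{
    // Converts a class value (a double holding a whole number) to an int index.
    public static int ToIndex(double classValue)
    {
        return (int)classValue; // explicit cast; truncates toward zero
    }

    static void Main()
    {
        string[,] table = new string[2, 3];  // 2 rows, 3 columns
        int row = 1;                          // e.g. the loop counter i
        double predictedClass = 2.0;          // stand-in for a classifier result

        // Both indices are now ints, so this compiles.
        table[row, ToIndex(predictedClass)] = "hit";
        Console.WriteLine(table[row, ToIndex(predictedClass)]);
    }
}
```

There is no comparable cast from an Instance object to an int, so the loop counter `i` is the natural choice for that index.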

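Second, you wrote "should build the array". Unless some code is missing, the array appears to be empty. Your assignment on this line (once the index errors are fixed) will always return null, because no element of the array has been assigned: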

string value = array[currentInst, predictedClass];
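As a quick check of that point, the sketch below shows that elements of a freshly allocated string array are null until something is written to them. `DefaultValueDemo` and `IsUnassigned` are hypothetical names used only for this illustration.

```csharp
using System;

static class DefaultValueDemo
{
    // Returns true when the element has never been assigned.
    public static bool IsUnassigned(string[,] cells, int row, int col)
    {
        return cells[row, col] == null;
    }

    static void Main()
    {
        string[,] cells = new string[2, 2];
        Console.WriteLine(IsUnassigned(cells, 0, 0)); // nothing stored yet

        cells[0, 0] = "label";
        Console.WriteLine(IsUnassigned(cells, 0, 0)); // now holds a value
    }
}
```

Reading from the array before writing to it therefore gives nothing useful; the loop has to assign to the array first.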

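After reading it a few more times, I think I understand what you want. Assuming the information you want to store in the array is the value assigned to predictedClass, and assuming you want the results stored in instance order, here is what you should do: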

Change the array to a one-dimensional array of double.

double[] array = new double[insts.numInstances()];

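I am also assuming you want results for all instances, so let's start the loop at 0. You can of course store results for only some of the instances, but then part of the array stays unused (those elements keep their default value of 0), and it may be better to use a smaller array.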
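If only the test instances are of interest, the smaller-array variant could look roughly like the sketch below. It is written without Weka so it runs on its own: the `classify` delegate is a hypothetical stand-in for `cl.classifyInstance(insts.instance(i))`, and the instance count of 208 is just an example value.

```csharp
using System;

static class TestSetBuffer
{
    // Fills an array holding one prediction per test instance.
    // 'classify' is a hypothetical stand-in for the real classifier call.
    public static double[] Collect(int trainSize, int total, Func<int, double> classify)
    {
        double[] predictions = new double[total - trainSize];
        for (int i = trainSize; i < total; i++)
        {
            // Shift the index so the first test instance lands in slot 0.
            predictions[i - trainSize] = classify(i);
        }
        return predictions;
    }

    static void Main()
    {
        int total = 208;                   // example instance count
        int trainSize = total * 66 / 100;  // same split arithmetic as the question
        double[] predictions = Collect(trainSize, total, i => (double)(i % 2));
        Console.WriteLine(predictions.Length);
    }
}
```

The offset `i - trainSize` is the only change compared with an array sized for all instances.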

for (int i = 0; i < insts.numInstances(); i++)

Next we take an instance, get its predicted class, and store the result in the array:

weka.core.Instance currentInst = insts.instance(i);
double predictedClass = cl.classifyInstance(currentInst);
array[i] = predictedClass;
//.. the rest of the loop

If you want to store the actual class value instead, you can use this assignment:

array[i] = insts.instance(i).classValue();
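Since the question asks for labels rather than numbers, one possible follow-up is to turn the stored class values into label text. For a nominal class attribute, Weka returns the class as a double holding the index of the label, and `insts.classAttribute().value(index)` gives the label string. The sketch below avoids the Weka dependency by using a plain string array instead; `LabelLookup`, `ToLabels`, and the label names and their order are assumptions made for illustration.

```csharp
using System;

static class LabelLookup
{
    // Maps each class value (a double holding an index) to its label text.
    public static string[] ToLabels(double[] classValues, string[] labelNames)
    {
        string[] labels = new string[classValues.Length];
        for (int i = 0; i < classValues.Length; i++)
        {
            labels[i] = labelNames[(int)classValues[i]];
        }
        return labels;
    }

    static void Main()
    {
        string[] labelNames = { "Rock", "Mine" };   // assumed labels and order, illustration only
        double[] predictions = { 0.0, 1.0, 1.0 };   // stand-in for the stored predictions
        Console.WriteLine(string.Join(",", ToLabels(predictions, labelNames)));
    }
}
```

Keeping the numeric array and converting only when the labels are needed leaves the values easy to compare against `classValue()`.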