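Using a pre-built model file to predict data created on the fly in WEKA

Time: 2013-11-20 12:55:33

Tags: java machine-learning artificial-intelligence weka indexoutofboundsexception

I want to create a WEKA Java program that reads a newly created set of data and feeds it to a pre-built model that was produced with the GUI version.

Here is the program: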

import java.util.ArrayList;
import weka.classifiers.Classifier;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;
import weka.core.Utils;


public class UseModelWithData {

  public static void main(String[] args) throws Exception {
    // load model
    String rootPath = "G:/";
    Classifier classifier = (Classifier) weka.core.SerializationHelper.read(rootPath+"j48.model");

    // create instances
    Attribute attr1 = new Attribute("age");
    Attribute attr2 = new Attribute("menopause");
    Attribute attr3 = new Attribute("tumor-size");
    Attribute attr4 = new Attribute("inv-nodes");
    Attribute attr5 = new Attribute("node-caps");
    Attribute attr6 = new Attribute("deg-malig");
    Attribute attr7 = new Attribute("breast");
    Attribute attr8 = new Attribute("breast-quad");
    Attribute attr9 = new Attribute("irradiat");
    Attribute attr10 = new Attribute("Class");

    ArrayList<Attribute> attributes = new ArrayList<Attribute>();
    attributes.add(attr1);
    attributes.add(attr2);
    attributes.add(attr3);
    attributes.add(attr4);
    attributes.add(attr5);
    attributes.add(attr6);
    attributes.add(attr7);
    attributes.add(attr8);
    attributes.add(attr9);
    attributes.add(attr10);

    // predict instance class values
    Instances testing = new Instances("Test dataset", attributes, 0);

    // add data
    double[] values = new double[testing.numAttributes()];
    values[0] = testing.attribute(0).addStringValue("60-69");
    values[1] = testing.attribute(1).addStringValue("ge40");
    values[2] = testing.attribute(2).addStringValue("10-14");
    values[3] = testing.attribute(3).addStringValue("15-17");
    values[4] = testing.attribute(4).addStringValue("yes");
    values[5] = testing.attribute(5).addStringValue("2");
    values[6] = testing.attribute(6).addStringValue("right");
    values[7] = testing.attribute(7).addStringValue("right_up");
    values[8] = testing.attribute(8).addStringValue("yes");
    values[9] = Utils.missingValue();

    // add data to instance
    testing.add(new DenseInstance(1.0, values));
    // instance row to predict
    int index = 10;
    // perform prediction
    double myValue = classifier.classifyInstance(testing.instance(10));
    // get the name of class value
    String prediction = testing.classAttribute().value((int) myValue);

    System.out.println("The predicted value of the instance [" 
        + Integer.toString(index) + "]: " + prediction);

  }

}

My references include:

So far, the part of my program that creates the new Instance causes the following error:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 10, Size: 1

double myValue = classifier.classifyInstance(testing.instance(10));

I just want to pass the instance values of the most recent row to the pre-built WEKA model. How can I solve this?



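1 Answer:

Answer 0 (score: 2)

You get the error because you are trying to access the 11th instance (indices are zero-based, so instance(10) is the 11th) while you have only created one.

If you always want to access the last instance, you can try the following: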

double myValue = classifier.classifyInstance(testing.lastInstance());  
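
To see why the index fails, here is a minimal plain-Java sketch that uses an ordinary `ArrayList` as a stand-in for the `Instances` object. It does not call WEKA, and the class `LastRowDemo` and the helper `last` are hypothetical names chosen for illustration. It shows that a collection holding one row only accepts index 0, and that the last row always sits at `size() - 1`, which is presumably what `lastInstance()` returns.

```java
import java.util.ArrayList;
import java.util.List;

class LastRowDemo {

    // Returns the last element of a non-empty list.
    // ASSUMPTION: this mirrors the behaviour of lastInstance(); it is not WEKA code.
    static <T> T last(List<T> rows) {
        if (rows.isEmpty()) {
            throw new IllegalStateException("no rows have been added yet");
        }
        return rows.get(rows.size() - 1);
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        rows.add("row-0"); // exactly one row, like the single DenseInstance in the question

        try {
            rows.get(10); // same mistake as testing.instance(10)
        } catch (IndexOutOfBoundsException e) {
            System.out.println("index 10 is invalid when size is " + rows.size());
        }

        // Asking for the last row works no matter how many rows were added.
        System.out.println("last row: " + last(rows));
    }
}
```

If you do need a specific row, check the index against `numInstances()` first instead of hard-coding it.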

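Also, I don't believe you are creating the instances you intend to. Having looked at the ".arff" file you provided, which I assume you are trying to mimic, I think you should build the instances as follows: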

// FastVector is deprecated in recent WEKA versions, so this uses ArrayList,
// as the question's code already does.
ArrayList<Attribute> atts;
ArrayList<String>    attAge;

Instances       testing;
double[]        vals;

// 1. set up attributes
atts = new ArrayList<Attribute>();

// age: a nominal attribute, declared together with its allowed values
attAge = new ArrayList<String>();
attAge.add("10-19");
attAge.add("20-29");
attAge.add("30-39");
attAge.add("40-49");
attAge.add("50-59");
attAge.add("60-69");
attAge.add("70-79");
attAge.add("80-89");
attAge.add("90-99");
atts.add(new Attribute("age", attAge));

// 2. create Instances object
testing = new Instances("breast-cancer", atts, 0);

// 3. fill with data (a nominal value is stored as the index of its label)
vals = new double[testing.numAttributes()];
vals[0] = attAge.indexOf("10-19");
testing.add(new DenseInstance(1.0, vals));

// 4. output data
System.out.println(testing);

Of course, I did not create the whole dataset, but the technique is the same.
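
The technique above relies on a nominal value being stored as the position of its label in the declared value list. The following plain-Java sketch illustrates that encoding without calling WEKA. The class `NominalEncodingDemo`, the helpers `encode` and `decode`, and the two class labels are assumptions made for illustration (the labels are what I recall from the standard breast-cancer.arff, so check them against your own file). A missing value is represented as `Double.NaN`, which is what `Utils.missingValue()` returns.

```java
import java.util.Arrays;
import java.util.List;

class NominalEncodingDemo {

    // Declared labels for "age", copied from the answer above.
    static final List<String> AGE = Arrays.asList(
        "10-19", "20-29", "30-39", "40-49", "50-59",
        "60-69", "70-79", "80-89", "90-99");

    // ASSUMPTION: class labels as recalled from the standard breast-cancer.arff.
    static final List<String> CLASS = Arrays.asList(
        "no-recurrence-events", "recurrence-events");

    // Encodes a label as its index in the declared list; null means "missing".
    static double encode(List<String> declared, String label) {
        if (label == null) {
            return Double.NaN; // missing value
        }
        int idx = declared.indexOf(label);
        if (idx < 0) {
            throw new IllegalArgumentException("undeclared value: " + label);
        }
        return idx;
    }

    // Turns a predicted index back into its label.
    static String decode(List<String> declared, double value) {
        if (Double.isNaN(value)) {
            throw new IllegalArgumentException("cannot decode a missing value");
        }
        return declared.get((int) value);
    }

    public static void main(String[] args) {
        double[] vals = new double[2];
        vals[0] = encode(AGE, "60-69"); // the age used in the question
        vals[1] = encode(CLASS, null);  // class is unknown before prediction
        System.out.println(Arrays.toString(vals));

        double predicted = 1.0; // hypothetical classifier output
        System.out.println(decode(CLASS, predicted));
    }
}
```

In the WEKA version, the decode step corresponds to `testing.classAttribute().value((int) myValue)`. For that call to work you would also need to tell the dataset which attribute is the class, for example with `testing.setClassIndex(testing.numAttributes() - 1)`, and the attribute declarations should match the ones the model was trained on.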