weka: adding a new instance to a dataset

Time: 2013-05-28 17:05:41

Tags: java dataset instance weka

I have a Weka dataset:

@attribute uid numeric
@attribute itemid numeric
@attribute rating numeric
@attribute timestamp numeric

@data
196 242 3   881250949
186 302 3   891717742
22  377 1   878887116
196 51  5   881250949
244 51  2   880606923

Suppose I want to add a new instance like this:

244 59  2   880606923

How should I do it?

Something like this?

Instances newData = arffLoader.getDataSet();
for (int i = 0; i < newData.numInstances(); i++) {
    Instance one = newData.instance(i);
    one.setDataset(data);
    data.add(one);
}

3 answers:

Answer 0: (score: 1)

Try the following code. You need to create a double array holding the new values, wrap it in a DenseInstance, and add that to the Instances object.

// Requires weka.core.DenseInstance and weka.core.Instances
public static void main(String[] args) {

    String dataSetFileName = "stackoverflowQuestion.arff";
    Instances data = MyUtilsForWekaInstanceHelper.getInstanceFromFile(dataSetFileName);
    System.out.println("Before adding");
    System.out.println(data);

    // One value per attribute, in the order declared in the ARFF header
    double[] instanceValue1 = new double[data.numAttributes()];
    instanceValue1[0] = 244;
    instanceValue1[1] = 59;
    instanceValue1[2] = 2;
    instanceValue1[3] = 880606923;

    // First argument is the instance weight
    DenseInstance denseInstance1 = new DenseInstance(1.0, instanceValue1);

    data.add(denseInstance1);

    System.out.println("-----------------------------------------------------------");
    System.out.println("After adding");
    System.out.println(data);
}

import java.io.BufferedReader;
import java.io.FileReader;

import weka.core.Instances;

public class MyUtilsForWekaInstanceHelper {

    public static Instances getInstanceFromFile(String pFileName) {
        // try-with-resources closes the reader even if parsing fails
        try (BufferedReader reader = new BufferedReader(new FileReader(pFileName))) {
            Instances data = new Instances(reader);
            // use the last attribute as the class attribute
            data.setClassIndex(data.numAttributes() - 1);
            return data;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

The output is as follows.

Before adding
@relation stackoverflowQuestion

@attribute uid numeric
@attribute itemid numeric
@attribute rating numeric
@attribute timestamp numeric

@data
196,242,3,881250949
186,302,3,891717742
22,377,1,878887116
196,51,5,881250949
244,51,2,880606923
---------------------------------------------------------------------------------
After adding
@relation stackoverflowQuestion

@attribute uid numeric
@attribute itemid numeric
@attribute rating numeric
@attribute timestamp numeric

@data
196,242,3,881250949
186,302,3,891717742
22,377,1,878887116
196,51,5,881250949
244,51,2,880606923
244,59,2,880606923
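
The answer above fills the double array by hand. As a rough sketch, a row written the way the question shows it (values separated by whitespace, or by commas) could be parsed into such an array first. `RowParser` and `parseRow` are hypothetical names introduced here for illustration; they are not part of Weka, and the sketch assumes every attribute is numeric, as in the question's dataset.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: turns one row of numeric values
// into the double[] that the DenseInstance constructor takes.
class RowParser {

    static double[] parseRow(String row, int numAttributes) {
        // Split on runs of whitespace and/or commas after trimming the ends.
        String[] tokens = row.trim().split("[\\s,]+");
        if (tokens.length != numAttributes) {
            throw new IllegalArgumentException(
                "Expected " + numAttributes + " values but found " + tokens.length);
        }
        double[] values = new double[numAttributes];
        for (int i = 0; i < tokens.length; i++) {
            // Assumes numeric attributes only; throws NumberFormatException otherwise.
            values[i] = Double.parseDouble(tokens[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        double[] values = parseRow("244 59  2   880606923", 4);
        System.out.println(Arrays.toString(values));
        // With Weka on the classpath, the array could then be added as in the answer:
        // data.add(new DenseInstance(1.0, values));
    }
}
```

Checking the token count against `data.numAttributes()` before building the instance catches short or long rows early instead of leaving trailing attributes at 0.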

Answer 1: (score: 0)

You can simply append the new row to your ARFF file, like this:

String filename= "MyDataset.arff";
FileWriter fwriter = new FileWriter(filename,true); //true will append the new instance
fwriter.write("244 59  2   880606923\n"); //appends the string to the file
fwriter.close();
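
One caveat with appending directly: if the file does not already end with a line break, the new row is glued onto the last existing row. A minimal sketch that guards against this is shown below, using only the Java standard library. `ArffRowAppender` and `appendRow` are hypothetical names, and the sketch assumes the file is small enough to read into memory and that the `@data` section is the last thing in it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: appends one data row to an existing ARFF file.
class ArffRowAppender {

    static void appendRow(Path arffFile, String row) throws IOException {
        // Reading the whole file is fine for a sketch; a large file would
        // call for checking only the final byte instead.
        byte[] existing = Files.readAllBytes(arffFile);
        boolean endsWithNewline =
            existing.length == 0 || existing[existing.length - 1] == '\n';

        // Start on a fresh line if the last row was not terminated.
        String text = (endsWithNewline ? "" : System.lineSeparator())
            + row + System.lineSeparator();
        Files.write(arffFile, text.getBytes(StandardCharsets.UTF_8),
            StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // Demo on a temporary file whose last row has no trailing line break.
        Path demo = Files.createTempFile("demo", ".arff");
        Files.write(demo, "@data\n244,51,2,880606923".getBytes(StandardCharsets.UTF_8));
        appendRow(demo, "244,59,2,880606923");
        Files.readAllLines(demo).forEach(System.out::println);
        Files.delete(demo);
    }
}
```

Note that this only changes the file on disk; an `Instances` object already loaded from it has to be reloaded to see the new row.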

Answer 2: (score: -2)

A new instance can be added to any existing dataset as follows:

 // assuming we already have the ARFF loaded in a variable called dataset
     // Instance is an interface in Weka 3.7+, so create a DenseInstance
     // (in Weka 3.6, use new Instance(dataset.numAttributes()) instead)
     Instance newInstance = new DenseInstance(dataset.numAttributes());
     for (int i = 0; i < dataset.numAttributes(); i++)
     {
         newInstance.setValue(i, value);
         // i is the index of the attribute
         // value is the value that you want to set
     }
     // add the new instance to the main dataset at the last position
     dataset.add(newInstance);
     // repeat as necessary