Weka: How to prepare a test set in Weka

Time: 2013-08-08 04:26:37

Tags: weka

I have been using an SVM classifier with the following data:

@relation whatever

@attribute mfe numeric
@attribute GB numeric
@attribute GTB numeric
@attribute Seeds numeric
@attribute ABP numeric
@attribute AU_Seed numeric
@attribute GC_Seed numeric
@attribute GU_Seed numeric
@attribute UP numeric
@attribute AU numeric
@attribute GC numeric
@attribute GU numeric
@attribute A-U_L numeric
@attribute G-C_L numeric
@attribute G-U_L numeric
@attribute (G+C) numeric
@attribute MFEi1 numeric
@attribute MFEi2 numeric
@attribute MFEi3 numeric
@attribute MFEi4 numeric
@attribute dG numeric
@attribute dP numeric
@attribute dQ numeric
@attribute dD numeric
@attribute Outcome {Yes,No}


@data
-24.3,1,18,2,9,4,3,0.5,8,10,7,1,0.454545455,0.318181818,0.045454545,7,-0.157792208,-0.050206612,-1.104545455,-1.35,-1.104545455,0,0,0,Yes
-24.8,2,15,2,7.5,2,3,1,7,5,8,2,0.208333333,0.333333333,0.083333333,8,-0.129166667,-0.043055556,-0.516666667,-1.653333333,-1.033333333,0,0,0,No
-24.4,1,16,3,5.333333333,1.666666667,2.666666667,1,4,5,8,3,0.217391304,0.347826087,0.130434783,8,-0.132608696,-0.046124764,-1.060869565,-1.525,-1.060869565,0,0,0,Yes
-24.2,1,18,2,9,2,2.5,1,10,5,11,2,0.227272727,0.5,0.090909091,11,-0.1,-0.05,-1.1,-1.344444444,-1.1,0,0,0,Yes
-24.5,3,17,2,8.5,2,3,1,5,6,9,2,0.272727273,0.409090909,0.090909091,9,-0.123737374,-0.050619835,-0.371212121,-1.441176471,-1.113636364,-0.12244898,0,0,Yes

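This is my training set, and it defines whether each instance belongs to the Yes class or the No class. My problem is that my test data comes from an unknown source, so I don't know which class each instance belongs to. How do I prepare the test set in that case? Without the Outcome attribute, Weka reports "error: Data mismatch". I want to use the SVM to separate my instances into the Yes and No classes.

2 answers:

Answer 0: (score: 9)

Steps to prepare the test set:

  1. Create the training set in CSV format.
  2. Likewise, create the test set in CSV format with the same number of attributes and the same attribute types.
  3. Copy the test set, paste it at the end of the training set, and save the result as a new CSV file.
  4. Import the CSV file saved in step 3 using Weka >> Explorer >> Preprocess.
  5. In the Filter panel, choose filters >> unsupervised >> instance >> RemoveRange.
  6. Click the text field that shows RemoveRange -R first-last to open the filter's options.
  7. Specify the range to remove. For example, if the training data has 100 instances, select the first 100 (a range of 1-100) and apply the filter.
  8. Save the result as an ARFF file; it can now be used as the test set.
  9. Then apply this set. If you still get any errors, reply to this post.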
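
The merge-then-remove steps above leave you with a test file that carries the same header as the training data. For readers who would rather script this than click through the Explorer, here is a minimal Python sketch of that idea. It is not part of the original answer: the function names (`split_arff`, `count_attributes`, `build_test_arff`) and the tiny sample data are hypothetical. It assumes the test CSV has a header row, lists its columns in the same order as the training attributes (class column included, for example as `?`), and contains no quoted commas.

```python
import csv
import io


def split_arff(arff_text):
    """Split ARFF text into (header_lines, data_lines) at the @data marker."""
    lines = arff_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lower() == "@data":
            data = [row for row in lines[i + 1:] if row.strip()]
            return lines[: i + 1], data
    raise ValueError("no @data section found")


def count_attributes(header_lines):
    """Count the @attribute declarations in an ARFF header."""
    return sum(
        1 for line in header_lines
        if line.strip().lower().startswith("@attribute")
    )


def build_test_arff(train_arff_text, test_csv_text):
    """Reuse the training header verbatim and append the rows of a test CSV."""
    header, _ = split_arff(train_arff_text)
    n_attrs = count_attributes(header)
    rows = [row for row in csv.reader(io.StringIO(test_csv_text)) if row]
    body = rows[1:]  # assumption: the first CSV row holds column names
    for row in body:
        if len(row) != n_attrs:
            raise ValueError(f"expected {n_attrs} values, got {len(row)}: {row}")
    return "\n".join(header + [",".join(row) for row in body]) + "\n"


# Hypothetical miniature data, only to show the call.
train = """@relation demo
@attribute mfe numeric
@attribute GB numeric
@attribute Outcome {Yes,No}
@data
-24.3,1,Yes
-24.8,2,No
"""
test_csv = "mfe,GB,Outcome\n-24.4,1,?\n-24.2,1,?\n"
print(build_test_arff(train, test_csv))
```

Copying the header rather than regenerating it keeps the declaration of the class attribute identical in both files, which is what a header mismatch would otherwise break.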

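Answer 1: (score: 2)

If you don't want to go through that trouble, you can prepare a test set that has exactly the same attribute names, data types, and data ranges as your training set, along with the attribute values themselves. The class attribute is still present, but its value should be a question mark (?). For example, to turn the training set you gave into a test set, you can make the following changes: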

    @relation whatever-TEST

    @attribute mfe numeric
    @attribute GB numeric
    @attribute GTB numeric
    @attribute Seeds numeric
    @attribute ABP numeric
    @attribute AU_Seed numeric
    @attribute GC_Seed numeric
    @attribute GU_Seed numeric
    @attribute UP numeric
    @attribute AU numeric
    @attribute GC numeric
    @attribute GU numeric
    @attribute A-U_L numeric
    @attribute G-C_L numeric
    @attribute G-U_L numeric
    @attribute (G+C) numeric
    @attribute MFEi1 numeric
    @attribute MFEi2 numeric
    @attribute MFEi3 numeric
    @attribute MFEi4 numeric
    @attribute dG numeric
    @attribute dP numeric
    @attribute dQ numeric
    @attribute dD numeric
    @attribute Outcome {Yes,No}


    @data
    -24.3,1,18,2,9,4,3,0.5,8,10,7,1,0.454545455,0.318181818,0.045454545,7,-0.157792208,-0.050206612,-1.104545455,-1.35,-1.104545455,0,0,0,?
    -24.8,2,15,2,7.5,2,3,1,7,5,8,2,0.208333333,0.333333333,0.083333333,8,-0.129166667,-0.043055556,-0.516666667,-1.653333333,-1.033333333,0,0,0,?
    -24.4,1,16,3,5.333333333,1.666666667,2.666666667,1,4,5,8,3,0.217391304,0.347826087,0.130434783,8,-0.132608696,-0.046124764,-1.060869565,-1.525,-1.060869565,0,0,0,?
    -24.2,1,18,2,9,2,2.5,1,10,5,11,2,0.227272727,0.5,0.090909091,11,-0.1,-0.05,-1.1,-1.344444444,-1.1,0,0,0,?
    -24.5,3,17,2,8.5,2,3,1,5,6,9,2,0.272727273,0.409090909,0.090909091,9,-0.123737374,-0.050619835,-0.371212121,-1.441176471,-1.113636364,-0.12244898,0,0,?

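
If the unlabeled data is large, replacing the class values by hand gets tedious. The following is a minimal Python sketch of the same edit, not something from the original answer: `mask_class_labels` and its `relation_suffix` parameter are hypothetical names. It assumes the class attribute is the last column, the rows are dense (not sparse ARFF), the relation name is unquoted, and no value contains a quoted comma.

```python
def mask_class_labels(arff_text, relation_suffix="-TEST"):
    """Return ARFF text with the last value of every data row set to '?'."""
    out = []
    in_data = False
    for line in arff_text.splitlines():
        stripped = line.strip()
        if not in_data:
            if stripped.lower().startswith("@relation"):
                # Rename the relation so the two files are easy to tell apart.
                line = stripped + relation_suffix
            elif stripped.lower() == "@data":
                in_data = True
            out.append(line)
        elif stripped and not stripped.startswith("%"):
            # Keep every feature value, drop the known label.
            features, _, _label = stripped.rpartition(",")
            out.append(features + ",?")
        else:
            # Blank lines and % comments pass through unchanged.
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical miniature input, only to show the call.
sample = """@relation whatever
@attribute mfe numeric
@attribute Outcome {Yes,No}
@data
-24.3,Yes
-24.8,No
"""
print(mask_class_labels(sample))
```

The header, including `@attribute Outcome {Yes,No}`, is left as it is, so the file stays compatible with the training set while every instance is marked as having an unknown class.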