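I am trying to pass an .arff file to a LinearRegression object, and doing so throws this exception: Cannot handle multi-valued nominal class!
What actually happens is that I perform attribute selection with the CfsSubsetEval evaluator and GreedyStepwise as the search method, and then pass the selected attributes to LinearRegression as follows:
LinearRegression rl=new LinearRegression(); rl.buildClassifier(data);
data is an Instances object holding the data from the .arff file, which was previously converted to nominal values using only Weka. Am I doing anything wrong here? I tried searching for this error on Google but couldn't find anything.
Code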
package com.attribute;
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Random;
import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.GreedyStepwise;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.LinearRegression;
import weka.classifiers.meta.AttributeSelectedClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.Utils;
import weka.filters.supervised.attribute.NominalToBinary;
/**
* performs attribute selection using CfsSubsetEval and GreedyStepwise
* (backwards) and trains J48 with that. Needs 3.5.5 or higher to compile.
*
* @author FracPete (fracpete at waikato dot ac dot nz)
*/
public class AttributeSelectionTest2 {
/**
* uses the meta-classifier
*/
protected static void useClassifier(Instances data) throws Exception {
System.out.println("\n1. Meta-classifier");
AttributeSelectedClassifier classifier = new AttributeSelectedClassifier();
CfsSubsetEval eval = new CfsSubsetEval();
GreedyStepwise search = new GreedyStepwise();
search.setSearchBackwards(true);
J48 base = new J48();
classifier.setClassifier(base);
classifier.setEvaluator(eval);
classifier.setSearch(search);
Evaluation evaluation = new Evaluation(data);
evaluation.crossValidateModel(classifier, data, 10, new Random(1));
System.out.println(evaluation.toSummaryString());
}
/**
* uses the low level approach
*/
protected static void useLowLevel(Instances data) throws Exception {
System.out.println("\n3. Low-level");
AttributeSelection attsel = new AttributeSelection();
CfsSubsetEval eval = new CfsSubsetEval();
GreedyStepwise search = new GreedyStepwise();
search.setSearchBackwards(true);
attsel.setEvaluator(eval);
attsel.setSearch(search);
attsel.SelectAttributes(data);
int[] indices = attsel.selectedAttributes();
System.out.println("selected attribute indices (starting with 0):\n"
+ Utils.arrayToString(indices));
useLinearRegression(indices, data);
}
protected static void useLinearRegression(int[] indices, Instances data) throws Exception{
System.out.println("\n 4. Linear-Regression on above selected attributes");
BufferedReader reader = new BufferedReader(new FileReader(
"C:/Entertainement/MS/Fall 2014/spdb/project 4/healthcare.arff"));
Instances data1 = new Instances(reader);
data.setClassIndex(data.numAttributes() - 1);
/*NominalToBinary nb = new NominalToBinary();
for(int i=0;i<=20; i++){
//Still coding left here, create an Instance variable to store the data from 'data' variable for given indices
Instances data_lr=data1.
}*/
LinearRegression rl=new LinearRegression(); //Creating a LinearRegression Object to pass data1
rl.buildClassifier(data1);
}
/**
* takes a dataset as first argument
*
* @param args
* the commandline arguments
* @throws Exception
* if something goes wrong
*/
public static void main(String[] args) throws Exception {
// load data
System.out.println("\n0. Loading data");
BufferedReader reader = new BufferedReader(new FileReader(
"C:/Entertainement/MS/Fall 2014/spdb/project 4/healthcare.arff"));
Instances data = new Instances(reader);
if (data.classIndex() == -1)
data.setClassIndex(data.numAttributes() - 14);
// 1. meta-classifier
useClassifier(data);
// 2. filter
//useFilter(data);
// 3. low-level
useLowLevel(data);
}
}
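Note: since I have not yet written the code that builds an Instances variable containing only the attributes in 'indices', I load the data from the same original file so that the program can run.
I don't know how to upload a file with sample data, but it looks like this: [link](https://scontent-a-dfw.xx.fbcdn.net/hphotos-xfa1/t31.0-8/p552x414/10496920_756438941076936_8448023649960186530_o.jpg)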
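Answer 0 (score: 4)
Based on your data, your last attribute appears to be of nominal type (it contains mostly numbers, but also a few strings). LinearRegression cannot predict a nominal class; the class attribute has to be numeric.
To make sure your dataset works, you can run it through the Weka Explorer with LinearRegression and see whether it produces the desired result. After that, there is a good chance the data will also work correctly in your code.
Hope this helps!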
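The check described above can also be done in code before calling buildClassifier. The following is only a minimal sketch, assuming Weka is on the classpath; the class name NumericClassCheck, the helper hasNumericClass, and the tiny in-memory dataset are made up for illustration and are not part of the question's project.

```java
import java.io.StringReader;

import weka.classifiers.functions.LinearRegression;
import weka.core.Instances;

public class NumericClassCheck {

    /** True only if a class index is set and the class attribute is numeric. */
    static boolean hasNumericClass(Instances data) {
        return data.classIndex() >= 0 && data.classAttribute().isNumeric();
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory ARFF with made-up values, so the sketch needs no file on disk.
        String arff = String.join("\n",
                "@RELATION demo",
                "@ATTRIBUTE x NUMERIC",
                "@ATTRIBUTE y NUMERIC",
                "@DATA",
                "1,2.1", "2,3.9", "3,6.2", "4,7.8", "5,10.1");
        Instances data = new Instances(new StringReader(arff));

        // Use the last attribute as the class, as the question's code does.
        data.setClassIndex(data.numAttributes() - 1);

        if (!hasNumericClass(data)) {
            // A nominal class is what makes LinearRegression refuse the data.
            System.out.println("Class attribute is not numeric; skipping LinearRegression.");
            return;
        }

        LinearRegression lr = new LinearRegression();
        lr.buildClassifier(data);
        System.out.println(lr);
    }
}
```

If the check fails for your own file, the class attribute was most likely declared with a nominal value list in the ARFF header rather than as NUMERIC.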
Answer 1 (score: 0)
Here is an example dataset for LinearRegression (source):
@RELATION house
@ATTRIBUTE houseSize NUMERIC
@ATTRIBUTE lotSize NUMERIC
@ATTRIBUTE bedrooms NUMERIC
@ATTRIBUTE granite NUMERIC
@ATTRIBUTE bathroom NUMERIC
@ATTRIBUTE sellingPrice NUMERIC
@DATA
3529,9191,6,0,0,205000
3247,10061,5,1,1,224900
4032,10150,5,0,1,197900
2397,14156,4,1,0,189900
2200,9600,4,0,1,195000
3536,19994,6,1,1,325000
2983,9365,5,0,1,230000
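Note that every attribute in this sample, including the class sellingPrice, is declared NUMERIC. As a rough sketch, the declared type of the last attribute can be inspected with plain Java before the file is handed to Weka. This is not Weka's own ARFF parser; the class ArffHeaderCheck and its methods are hypothetical, and the code assumes a simple header with unquoted attribute names that contain no spaces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class ArffHeaderCheck {

    /** Returns the declared type of the last @ATTRIBUTE line before @DATA, or null if none. */
    static String lastAttributeType(List<String> lines) {
        String type = null;
        for (String raw : lines) {
            String line = raw.trim();
            String upper = line.toUpperCase(Locale.ROOT);
            if (upper.startsWith("@DATA")) {
                break; // header ends here
            }
            if (upper.startsWith("@ATTRIBUTE")) {
                // Expected shape: @ATTRIBUTE <name> <type>
                String[] parts = line.split("\\s+", 3);
                if (parts.length == 3) {
                    type = parts[2].trim();
                }
            }
        }
        return type;
    }

    /** True for the numeric ARFF type keywords; a nominal list such as {a,b,c} yields false. */
    static boolean looksNumeric(String type) {
        if (type == null) {
            return false;
        }
        String t = type.toUpperCase(Locale.ROOT);
        return t.equals("NUMERIC") || t.equals("REAL") || t.equals("INTEGER");
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList(
                "@RELATION house",
                "@ATTRIBUTE houseSize NUMERIC",
                "@ATTRIBUTE sellingPrice NUMERIC",
                "@DATA");
        String type = lastAttributeType(header);
        System.out.println(type + " -> numeric class? " + looksNumeric(type));
    }
}
```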