I am trying to run linear regression on a CSV file. Here are the contents of the CSV file:
X1; X2; X3; X4; X5; X6; X7; X8; Y1; Y2;
0.98; 514.50; 294.00; 110.25; 7.00; 2; 0.00; 0; 15.55; 21.33;
0.98; 514.50; 294.00; 110.25; 7.00; 3; 0.00; 0; 15.55; 21.33;
0.98; 514.50; 294.00; 110.25; 7.00; 4; 0.00; 0; 15.55; 21.33;
0.98; 514.50; 294.00; 110.25; 7.00; 5; 0.00; 0; 15.55; 21.33;
0.90; 563.50; 318.50; 122.50; 7.00; 2; 0.00; 0; 20.84; 28.28;
0.90; 563.50; 318.50; 122.50; 7.00; 3; 0.00; 0; 21.46; 25.38;
0.90; 563.50; 318.50; 122.50; 7.00; 4; 0.00; 0; 20.71; 25.16;
0.90; 563.50; 318.50; 122.50; 7.00; 5; 0.00; 0; 19.68; 29.60;
0.86; 588.00; 294.00; 147.00; 7.00; 2; 0.00; 0; 19.50; 27.30;
0.86; 588.00; 294.00; 147.00; 7.00; 3; 0.00; 0; 19.95; 21.97;
0.86; 588.00; 294.00; 147.00; 7.00; 4; 0.00; 0; 19.34; 23.49;
0.86; 588.00; 294.00; 147.00; 7.00; 5; 0.00; 0; 18.31; 27.87;
0.82; 612.50; 318.50; 147.00; 7.00; 2; 0.00; 0; 17.05; 23.77;
...
0.71; 710.50; 269.50; 220.50; 3.50; 2; 0.40; 5; 12.43; 15.59;
0.71; 710.50; 269.50; 220.50; 3.50; 3; 0.40; 5; 12.63; 14.58;
0.71; 710.50; 269.50; 220.50; 3.50; 4; 0.40; 5; 12.76; 15.33;
0.71; 710.50; 269.50; 220.50; 3.50; 5; 0.40; 5; 12.42; 15.31;
0.69; 735.00; 294.00; 220.50; 3.50; 2; 0.40; 5; 14.12; 16.63;
0.69; 735.00; 294.00; 220.50; 3.50; 3; 0.40; 5; 14.28; 15.87;
0.69; 735.00; 294.00; 220.50; 3.50; 4; 0.40; 5; 14.37; 16.54;
0.69; 735.00; 294.00; 220.50; 3.50; 5; 0.40; 5; 14.21; 16.74;
0.66; 759.50; 318.50; 220.50; 3.50; 2; 0.40; 5; 14.96; 17.64;
0.66; 759.50; 318.50; 220.50; 3.50; 3; 0.40; 5; 14.92; 17.79;
0.66; 759.50; 318.50; 220.50; 3.50; 4; 0.40; 5; 14.92; 17.55;
0.66; 759.50; 318.50; 220.50; 3.50; 5; 0.40; 5; 15.16; 18.06;
0.64; 784.00; 343.00; 220.50; 3.50; 2; 0.40; 5; 17.69; 20.82;
0.64; 784.00; 343.00; 220.50; 3.50; 3; 0.40; 5; 18.19; 20.21;
0.64; 784.00; 343.00; 220.50; 3.50; 4; 0.40; 5; 18.16; 20.71;
0.64; 784.00; 343.00; 220.50; 3.50; 5; 0.40; 5; 17.88; 21.40;
0.62; 808.50; 367.50; 220.50; 3.50; 2; 0.40; 5; 16.54; 16.88;
0.62; 808.50; 367.50; 220.50; 3.50; 3; 0.40; 5; 16.44; 17.11;
0.62; 808.50; 367.50; 220.50; 3.50; 4; 0.40; 5; 16.48; 16.61;
0.62; 808.50; 367.50; 220.50; 3.50; 5; 0.40; 5; 16.64; 16.03;
I read this CSV file and ran linear regression on it. This is the Java source code:
import java.io.File;
import java.io.IOException;

import weka.classifiers.functions.LinearRegression;
import weka.core.Instances;
import weka.core.converters.CSVLoader;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public static void main(String[] args) throws IOException
{
    String csvFile = null;
    CSVLoader loader = null;
    Remove remove = null;
    Instances data = null;
    LinearRegression model = null;
    int numberofFeatures = 0;
    try
    {
        csvFile = "C:\\Users\\Taha\\Desktop/ENB2012_data.csv";
        loader = new CSVLoader();
        // load CSV
        loader.setSource(new File(csvFile));
        data = loader.getDataSet();
        //System.out.println(data);
        numberofFeatures = data.numAttributes();
        System.out.println("number of features: " + numberofFeatures);
        // use the second-to-last attribute (Y1) as the class
        data.setClassIndex(data.numAttributes() - 2);
        // remove last attribute Y2
        remove = new Remove();
        remove.setOptions(new String[]{"-R", data.numAttributes() + ""});
        remove.setInputFormat(data);
        data = Filter.useFilter(data, remove);
        // data.setClassIndex(data.numAttributes() - 2);
        model = new LinearRegression();
        model.buildClassifier(data);
        System.out.println(model);
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
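I get the error weka.core.UnassignedClassException: Class index is negative (not set)! at the line model.buildClassifier(data);. The number of features is reported as 1, whereas it is expected to be 9. The columns are X1; X2; X3; X4; X5; X6; X7; X8; Y1; Y2. What am I missing? Thanks in advance.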
Answer 0 (score: 5)
You can add the following lines right after the line data = loader.getDataSet(); they resolve the exception:
// If no class attribute has been set, fall back to the last attribute
if (data.classIndex() == -1) {
    System.out.println("reset index...");
    data.setClassIndex(data.numAttributes() - 1);
}
This worked for me.
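The reported feature count of 1 points to something the class-index guard may not address on its own: the sample file is semicolon-delimited, and a loader that splits on commas would treat each line as a single field. With only one attribute, data.numAttributes() - 2 evaluates to -1, which would match the "class index is negative" message. The sketch below uses only the Java standard library (no Weka) to show the effect on the header line from the question; DelimiterCheck and countFields are hypothetical names introduced for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Weka): counts the fields in one CSV line
// for a given delimiter.
class DelimiterCheck {

    // Splits on the literal delimiter. String.split drops trailing empty
    // strings, so a trailing delimiter does not create an extra field.
    static int countFields(String line, String delimiter) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split(Pattern.quote(delimiter)).length;
    }

    public static void main(String[] args) {
        String header = "X1; X2; X3; X4; X5; X6; X7; X8; Y1; Y2;";
        System.out.println("comma: " + countFields(header, ","));     // comma: 1
        System.out.println("semicolon: " + countFields(header, ";")); // semicolon: 10
    }
}
```

If this is the cause, it may be worth checking whether your Weka version's CSVLoader lets you set the field separator (recent releases appear to offer setFieldSeparator(";"), to be called before getDataSet()); treat that as an assumption to verify against the documentation for your version.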
Answer 1 (score: 0)
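Since I could not find any solution, I decided to put the data into an Oracle database and read it from there instead. Oracle SQL Developer has an import utility, which I used. This solved my problem. I am writing this for anyone who runs into the same issue. Details on connecting Weka to an Oracle database are here: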
http://tahasozgen.blogspot.com.tr/2016/10/connection-to-oracle-database-in-weka.html