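I'm trying to write a Java program that converts a text file into a Weka ARFF file. For some reason my name attribute ends up numeric, but it should be a string. I've tried everything I could think of, including changing
attr.add(new Attribute("name"));
to
attr.add(new Attribute("name",true));
But when I run it, the name is still printed as a number (in column 2):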
1,0,?,?,?
1000,1,?,?,?
1002,2,?,?,?
2,3,?,?,?
3000,4,?,?,?
What am I doing wrong?
import java.util.ArrayList;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.*;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import java.util.*;
import weka.core.Instances;
import weka.core.converters.ArffSaver;

public class WekaCreateARFF {
    private static final String FILENAME = "Some File";

    public static void main(String[] args) throws IOException {
        ArrayList<String> input = new ArrayList<String>();
        ArrayList<Attribute> attr = new ArrayList<Attribute>();
        Instances dataset;
        double [] values;
        BufferedReader br = null;
        FileReader fr = null;
        String date = null;
        double id;
        String n = null;
        Instance inst = new DenseInstance(5);

        List nominal_state = new ArrayList(5);
        nominal_state.add("CA");
        nominal_state.add("NC");
        nominal_state.add("TX");
        nominal_state.add("SC");
        nominal_state.add("NY");

        List nominal_party = new ArrayList(2);
        nominal_party.add("republican");
        nominal_party.add("democrat");

        attr.add(new Attribute("id"));
        attr.add(new Attribute("name",true));
        attr.add(new Attribute("political party", nominal_party));
        attr.add(new Attribute("state", nominal_state));
        attr.add(new Attribute("birth date", date));

        try {
            fr = new FileReader(FILENAME);
            br = new BufferedReader(fr);
            String entry;
            dataset = new Instances("SimpleARFF",attr,0);
            values = new double[dataset.numAttributes()];
            while ((entry = br.readLine()) != null) {
                //System.out.println(entry);
                input.add(entry);
                for (int i = 0; i<5; i++ ) {
                    String[] parts = entry.split(",");
                    String part1 = parts[0];
                    String name = parts[1];
                    id = Double.parseDouble(part1);
                    inst.setValue(attr.get(0), id);
                    inst.setValue(attr.get(1), name);
                }
                System.out.println(inst);
                dataset.add(new DenseInstance(1.0, values));
            }
            //System.out.println(dataset);
            //ArffSaver arff = new ArffSaver();
            //arff.setInstances(dataset);
            //arff.setFile(new File("Simple.arff"));
            //arff.writeBatch();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (br != null)
                    br.close();
                if (fr != null)
                    fr.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }
}
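Answer 0 (score: 1)
You probably want this constructor:
http://weka.sourceforge.net/doc.dev/weka/core/Attribute.html#Attribute-java.lang.String-boolean-
That is, you basically have to add a boolean flag to tell Weka that you want a String attribute rather than a numeric attribute (the default):
new Attribute("blah", true)
should give you a String attribute.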
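A possible explanation for why the question's output still shows numbers even with the boolean flag: as far as I know, Weka stores every instance as an array of doubles, and for a String attribute the stored number is an index into that attribute's table of strings. An instance that is not attached to a dataset has no header to look the text up in, so printing it shows the raw index. In the posted code, inst is printed without ever being attached to dataset. The sketch below is only a toy model of that idea, written with plain Java collections. The classes ToyStringAttribute and ToyInstance and the names "Alice" and "Bob" are made up for illustration and are not part of Weka.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every class here is a hypothetical stand-in, not Weka's API.
class StringIndexSketch {

    /** Stand-in for a string attribute: a lookup table of the strings seen so far. */
    static class ToyStringAttribute {
        private final List<String> table = new ArrayList<>();

        /** Returns the table index for value, adding it first if it is new. */
        int indexFor(String value) {
            int index = table.indexOf(value);
            if (index < 0) {
                table.add(value);
                index = table.size() - 1;
            }
            return index;
        }

        String valueAt(int index) {
            return table.get(index);
        }
    }

    /** Stand-in for an instance: a numeric id plus the index of its name. */
    static class ToyInstance {
        private final int id;
        private final int nameIndex;
        private ToyStringAttribute header; // null while the row is detached

        ToyInstance(int id, int nameIndex) {
            this.id = id;
            this.nameIndex = nameIndex;
        }

        void attach(ToyStringAttribute header) {
            this.header = header;
        }

        /** Detached rows can only show the stored index; attached rows show the text. */
        String render() {
            String name = (header == null)
                    ? Integer.toString(nameIndex)
                    : header.valueAt(nameIndex);
            return id + "," + name;
        }
    }

    public static void main(String[] args) {
        ToyStringAttribute nameAttr = new ToyStringAttribute();
        // "Alice" and "Bob" are made-up names; the question does not show the input file.
        ToyInstance first = new ToyInstance(1, nameAttr.indexFor("Alice"));
        ToyInstance second = new ToyInstance(1000, nameAttr.indexFor("Bob"));

        System.out.println(first.render());   // prints 1,0
        System.out.println(second.render());  // prints 1000,1

        first.attach(nameAttr);
        second.attach(nameAttr);
        System.out.println(first.render());   // prints 1,Alice
        System.out.println(second.render());  // prints 1000,Bob
    }
}
```

If that is what is happening here, calling inst.setDataset(dataset) before printing, or adding inst itself to dataset (instead of new DenseInstance(1.0, values)) and printing the dataset, should show the names as text.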