I have a vector of labeled data elements, like this:
[label1: 1.1, label2: 2.43, label3: 0.5]
[label1: 0.1, label2: 2.0, label3: 1.0]
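There can be any number of elements, and each element essentially corresponds to one row of data. I am trying to parse this into a CSV with column headers, like this:
label1 label2 label3
1.1 2.43 0.5
0.1 2.0 1.0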
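I have been using the StringBuilder() constructor and would prefer to stick with it, but I can use something else if necessary.
I have this almost working, except that the header is not separated from the first row of numeric results.
I have an outer loop that iterates over the array elements (the "rows") and an inner loop that iterates over the parts of each element (the "columns"). In the example above there are 2 "rows" (elements) and 3 "columns" (member indices).
My code looks like this (the block below both builds the CSV and prints to the screen):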
StringBuilder builder = new StringBuilder();
// Write predictions to file
for (int i = 0; i < labeled.size(); i++)
{
    // Discrete prediction
    double predictionIndex =
        clf.classifyInstance(newTest.instance(i));
    // Get the predicted class label from the predictionIndex.
    String predictedClassLabel =
        newTest.classAttribute().value((int) predictionIndex);
    // Get the prediction probability distribution.
    double[] predictionDistribution =
        clf.distributionForInstance(newTest.instance(i));
    // Print out the true predicted label, and the distribution
    System.out.printf("%5d: predicted=%-10s, distribution=",
        i, predictedClassLabel);
    // Loop over all the prediction labels in the distribution.
    for (int predictionDistributionIndex = 0;
         predictionDistributionIndex < predictionDistribution.length;
         predictionDistributionIndex++)
    {
        // Get this distribution index's class label.
        String predictionDistributionIndexAsClassLabel =
            newTest.classAttribute().value(
                predictionDistributionIndex);
        // Get the probability.
        double predictionProbability =
            predictionDistribution[predictionDistributionIndex];
        System.out.printf("[%10s : %6.3f]",
            predictionDistributionIndexAsClassLabel,
            predictionProbability);
        if (i == 0) {
            builder.append(predictionDistributionIndexAsClassLabel + ",");
            if (predictionDistributionIndex == predictionDistribution.length) {
                builder.append("\n");
            }
        }
        // Add probabilities as rows
        builder.append(predictionProbability + ",");
    }
    System.out.printf("\n");
    builder.append("\n");
}
The result currently looks like this:
setosa,1.0,versicolor,0.0,virginica,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
where setosa, versicolor and virginica are the labels. You can see that it works from the second row onward, but I can't figure out how to fix the first row.
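The interleaved first row can be reproduced without the classifier. The following is only a minimal sketch of the same loop structure: `labels` and `distributions` are hypothetical stand-ins for the class labels and the per-instance probability distributions, and the class and method names are made up for illustration.

```java
// Hypothetical stand-in for the loop in the question: plain arrays
// replace the classifier and the test set.
final class InterleavedHeaderDemo {

    // Same structure as the original loop: on the first row, each label
    // is appended immediately before its probability.
    static String buildCsv(String[] labels, double[][] distributions) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < distributions.length; i++) {
            for (int j = 0; j < distributions[i].length; j++) {
                if (i == 0) {
                    builder.append(labels[j]).append(",");
                    // j never reaches length inside this loop, so this
                    // branch never runs and the header gets no line break.
                    if (j == distributions[i].length) {
                        builder.append("\n");
                    }
                }
                builder.append(distributions[i][j]).append(",");
            }
            builder.append("\n");
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String[] labels = {"setosa", "versicolor", "virginica"};
        double[][] distributions = {{1.0, 0.0, 0.0}, {1.0, 0.0, 0.0}};
        System.out.print(buildCsv(labels, distributions));
    }
}
```

With these inputs the first line comes out as label,value pairs and the second line as plain values, which matches the output shown above.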
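Answer 0 (score: 1)
If I understand your problem correctly, the inner for loop produces both the label and the value for the first row, so they get appended one after the other. If you want the labels on their own row, you can change the inner loop section as follows: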
// Collects the header labels separately from the data rows.
StringBuilder labelRow = new StringBuilder();
// Loop over all the prediction labels in the distribution.
for (int predictionDistributionIndex = 0;
     predictionDistributionIndex < predictionDistribution.length;
     predictionDistributionIndex++)
{
    // Get this distribution index's class label.
    String predictionDistributionIndexAsClassLabel =
        newTest.classAttribute().value(
            predictionDistributionIndex);
    // Get the probability.
    double predictionProbability =
        predictionDistribution[predictionDistributionIndex];
    System.out.printf("[%10s : %6.3f]",
        predictionDistributionIndexAsClassLabel,
        predictionProbability);
    if (i == 0) {
        // Only the first instance contributes to the header row.
        labelRow.append(predictionDistributionIndexAsClassLabel + ",");
    }
    // Add probabilities as rows
    builder.append(predictionProbability + ",");
}
if (i == 0) {
    // Put the header on its own line ahead of the first data row.
    builder.insert(0, labelRow.toString() + "\n");
}
What this does is collect the labels in a separate StringBuilder, which you can then insert at the beginning of the final builder value.
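A variation on the same idea is to write the header once before the row loop, which avoids the `i == 0` checks and the insert at position 0. The following is only a sketch under stated assumptions: `labels` and `distributions` are hypothetical arrays standing in for the values read from `newTest.classAttribute().value(...)` and `clf.distributionForInstance(...)`, and the class and method names are made up for illustration. It also drops the trailing comma on each line, which the code above keeps.

```java
// Hypothetical sketch: header first, then one line per instance.
final class HeaderFirstCsv {

    static String buildCsv(String[] labels, double[][] distributions) {
        StringBuilder builder = new StringBuilder();
        // Header row: written once, before any data row.
        builder.append(String.join(",", labels)).append("\n");
        for (double[] row : distributions) {
            for (int j = 0; j < row.length; j++) {
                // Separator goes between values, so no trailing comma.
                if (j > 0) {
                    builder.append(",");
                }
                builder.append(row[j]);
            }
            builder.append("\n");
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String[] labels = {"setosa", "versicolor", "virginica"};
        double[][] distributions = {{1.0, 0.0, 0.0}, {0.1, 0.9, 0.0}};
        System.out.print(buildCsv(labels, distributions));
    }
}
```

Whether to build the header up front or insert it afterwards is mostly a matter of taste. Building it up front keeps the data loop free of special cases, at the cost of reading the labels in a separate pass.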