Parsing an array into a CSV with headers using StringBuilder: problem with the header row

Time: 2017-04-05 18:54:59

Tags: java arrays csv stringbuilder

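I have a vector of labeled data elements, like this:

[label1: 1.1, label2: 2.43, label3: 0.5]
[label1: 0.1, label2: 2.0, label3: 1.0]

There can be any number of elements, and each element corresponds to one row of data. I am trying to turn this into a CSV with column headers, like this: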

label1 label2 label3
1.1    2.43   0.5
0.1    2.0    1.0

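I have been using a StringBuilder and would prefer to stick with it, but I can use something else if necessary.

I have it almost working, except for separating the header from the first row of numeric results.

I have an outer loop that iterates over the array elements ("rows") and an inner loop that iterates over each part of each array element ("columns"); in the example above there are 2 "rows" (elements) and 3 "columns" (member indices).

My code looks like this (the block below both builds the CSV and prints to the screen):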

StringBuilder builder  = new StringBuilder();

// Write predictions to file
for (int i = 0; i < labeled.size(); i++)      
{
    // Discrete prediction
    double predictionIndex = 
        clf.classifyInstance(newTest.instance(i)); 

    // Get the predicted class label from the predictionIndex.
    String predictedClassLabel =
        newTest.classAttribute().value((int) predictionIndex);

    // Get the prediction probability distribution.
    double[] predictionDistribution = 
        clf.distributionForInstance(newTest.instance(i)); 

    // Print out the true predicted label, and the distribution
    System.out.printf("%5d: predicted=%-10s, distribution=", 
                      i, predictedClassLabel); 

    // Loop over all the prediction labels in the distribution.
    for (int predictionDistributionIndex = 0; 
         predictionDistributionIndex < predictionDistribution.length; 
         predictionDistributionIndex++)
    {
        // Get this distribution index's class label.
        String predictionDistributionIndexAsClassLabel = 
            newTest.classAttribute().value(
                predictionDistributionIndex);

        // Get the probability.
        double predictionProbability = 
            predictionDistribution[predictionDistributionIndex];

        System.out.printf("[%10s : %6.3f]", 
                          predictionDistributionIndexAsClassLabel, 
                          predictionProbability );
        if(i == 0){
            builder.append(predictionDistributionIndexAsClassLabel+",");

            if(predictionDistributionIndex == predictionDistribution.length){
                builder.append("\n");
            }
        }
        // Add probabilities as rows     
        builder.append(predictionProbability+",");

    }

    System.out.printf("\n");
    builder.append("\n");

}

The result currently looks like this:

setosa,1.0,versicolor,0.0,virginica,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,
1.0,0.0,0.0,

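Here setosa, versicolor and virginica are the labels. You can see that it works from the second row onward, but I can't figure out how to fix the first row.

1 Answer:

Answer 0: (score: 1)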

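If I understand your problem correctly, you get both the labels and the values for the first row inside the inner for loop, so they are appended interleaved. If you want to keep the labels separate, you can make a few changes to the inner loop, as follows: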

// Declare this once, before the outer loop over the instances.
StringBuilder labelRow = new StringBuilder();

    // Loop over all the prediction labels in the distribution.
    for (int predictionDistributionIndex = 0; 
         predictionDistributionIndex < predictionDistribution.length; 
         predictionDistributionIndex++)
    {
        // Get this distribution index's class label.
        String predictionDistributionIndexAsClassLabel = 
            newTest.classAttribute().value(
                predictionDistributionIndex);

        // Get the probability.
        double predictionProbability = 
            predictionDistribution[predictionDistributionIndex];

        System.out.printf("[%10s : %6.3f]", 
                          predictionDistributionIndexAsClassLabel, 
                          predictionProbability );

        // Collect the labels only while processing the first instance.
        // (The original index == length check could never be true inside
        // this loop, so it has been dropped.)
        if (i == 0) {
            labelRow.append(predictionDistributionIndexAsClassLabel + ",");
        }

        // Add probabilities as rows
        builder.append(predictionProbability + ",");
    }

    // Once the first instance is done, put the finished header row
    // at the very start of the builder.
    if (i == 0) {
        builder.insert(0, labelRow.toString() + "\n");
    }

What this does is collect the labels in a separate StringBuilder, which you can then insert at the beginning of the final builder value.
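
The same idea can be tried without Weka. The following is only a minimal sketch under stated assumptions: the class name `DistributionCsv`, the method `toCsv`, and the sample arrays are hypothetical stand-ins for the class labels and for the arrays returned by `distributionForInstance`. It also swaps in `String.join` and `StringJoiner` in place of appending `+ ","`, so the header is written once up front and no row ends with a trailing comma.

```java
import java.util.StringJoiner;

class DistributionCsv {

    // Hypothetical helper: "labels" stands in for the class labels and
    // "rows" for one probability distribution per instance.
    static String toCsv(String[] labels, double[][] rows) {
        StringBuilder builder = new StringBuilder();

        // Write the header row exactly once, before any data row.
        builder.append(String.join(",", labels)).append("\n");

        for (double[] row : rows) {
            // StringJoiner puts the comma only between values.
            StringJoiner line = new StringJoiner(",");
            for (double probability : row) {
                line.add(Double.toString(probability));
            }
            builder.append(line.toString()).append("\n");
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        String[] labels = {"setosa", "versicolor", "virginica"};
        double[][] rows = {{1.0, 0.0, 0.0}, {0.1, 0.7, 0.2}};
        System.out.print(toCsv(labels, rows));
    }
}
```

Writing the header before the loop avoids the `i == 0` special case altogether. If the labels are only available inside the loop, the `insert(0, ...)` approach above works just as well.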