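I have implemented several machine learning algorithms with the Apache Spark MLlib library. I am using a MulticlassClassificationEvaluator object to get the results. The results I want are precision, recall, and accuracy.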
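The problem is that accuracy and recall come out identical for every algorithm I use. For example, for random forest both accuracy and recall are 98%, and for naive Bayes both are 95%. The same happens with the other algorithms I use. Is this normal? Is it related to the way I obtain the results?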
Here is part of the implementation I am using. Random forest:
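import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineModel;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.ml.classification.RandomForestClassifier;
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator;
import org.apache.spark.ml.feature.IndexToString;
import org.apache.spark.ml.feature.StringIndexer;
import org.apache.spark.ml.feature.StringIndexerModel;
import org.apache.spark.ml.feature.VectorIndexer;
import org.apache.spark.ml.feature.VectorIndexerModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// sparkBase, mainController, svFilePath and categoryCount are defined elsewhere (not shown).
// Load the data set in LIBSVM format.
Dataset<Row> dataFrame = sparkBase
        .getSpark()
        .read()
        .format("libsvm")
        .load(svFilePath);

// Index the labels and the categorical features.
StringIndexerModel labelIndexer = new StringIndexer()
        .setInputCol("label")
        .setOutputCol("indexedLabel")
        .fit(dataFrame);
VectorIndexerModel featureIndexer = new VectorIndexer()
        .setInputCol("features")
        .setOutputCol("indexedFeatures")
        .setMaxCategories(categoryCount)
        .fit(dataFrame);

// Split into training and test sets with a fixed seed.
Dataset<Row>[] splits = dataFrame.randomSplit(
        new double[] {mainController.getTrainingDataRate(), mainController.getTestDataRate()}, 1234L);
Dataset<Row> train = splits[0];
Dataset<Row> test = splits[1];

RandomForestClassifier rf = new RandomForestClassifier()
        .setLabelCol("indexedLabel")
        .setFeaturesCol("indexedFeatures");

// Map the predicted indices back to the original label strings.
IndexToString labelConverter = new IndexToString()
        .setInputCol("prediction")
        .setOutputCol("predictedLabel")
        .setLabels(labelIndexer.labels());

Pipeline pipeline = new Pipeline()
        .setStages(new PipelineStage[] {labelIndexer, featureIndexer, rf, labelConverter});
PipelineModel model = pipeline.fit(train);
Dataset<Row> predictions = model.transform(test);

// Evaluate the same predictions with three different metric names.
MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator()
        .setLabelCol("indexedLabel")
        .setPredictionCol("prediction")
        .setMetricName("accuracy");
accuracy = evaluator.evaluate(predictions);
evaluator.setMetricName("weightedRecall");
recall = evaluator.evaluate(predictions);
evaluator.setMetricName("weightedPrecision");
precision = evaluator.evaluate(predictions);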
Am I doing something wrong? Regards
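A possible explanation, offered as a sketch rather than a definitive diagnosis: `weightedRecall` averages the per-class recalls weighted by each class's share of the true labels. If class c has n_c true samples out of N, of which TP_c are predicted correctly, its recall TP_c / n_c is multiplied by n_c / N, so the sum collapses to (sum of TP_c) / N, which is exactly the accuracy. The plain-Java check below uses a made-up 3-class confusion matrix (the class name `WeightedMetricsSketch`, the helper methods, and the counts are all hypothetical, not part of Spark) to show the two numbers coinciding while weighted precision differs.

```java
public class WeightedMetricsSketch {

    // cm[i][j] = number of samples whose true class is i and predicted class is j.
    static long total(long[][] cm) {
        long n = 0;
        for (long[] row : cm) {
            for (long v : row) {
                n += v;
            }
        }
        return n;
    }

    // Fraction of all samples on the diagonal (predicted correctly).
    static double accuracy(long[][] cm) {
        long correct = 0;
        for (int c = 0; c < cm.length; c++) {
            correct += cm[c][c];
        }
        return (double) correct / total(cm);
    }

    // Per-class recall, weighted by the class's share of true labels.
    static double weightedRecall(long[][] cm) {
        double n = total(cm);
        double sum = 0.0;
        for (int c = 0; c < cm.length; c++) {
            long support = 0;
            for (long v : cm[c]) {
                support += v;
            }
            if (support == 0) {
                continue; // a class with no true samples contributes nothing
            }
            double recall = (double) cm[c][c] / support;
            sum += (support / n) * recall;
        }
        return sum;
    }

    // Per-class precision, weighted by the same true-label share.
    static double weightedPrecision(long[][] cm) {
        double n = total(cm);
        double sum = 0.0;
        for (int c = 0; c < cm.length; c++) {
            long support = 0;
            long predicted = 0;
            for (int k = 0; k < cm.length; k++) {
                support += cm[c][k];   // row sum: true samples of class c
                predicted += cm[k][c]; // column sum: samples predicted as c
            }
            double precision = predicted == 0 ? 0.0 : (double) cm[c][c] / predicted;
            sum += (support / n) * precision;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical counts, chosen only for illustration.
        long[][] cm = {
            {50, 2, 0},
            {3, 40, 1},
            {0, 4, 20}
        };
        System.out.println("accuracy          = " + accuracy(cm));
        System.out.println("weightedRecall    = " + weightedRecall(cm));
        System.out.println("weightedPrecision = " + weightedPrecision(cm));
    }
}
```

If this reasoning holds for the Spark metric as well, identical accuracy and weighted recall would be expected for any classifier, and per-class values (for example via `MulticlassMetrics.recall(label)` in `org.apache.spark.mllib.evaluation`) may be more informative than the weighted average.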