Multiclass Classification Evaluator
===================================

Evaluator for multiclass classification, which expects two input columns: prediction and label.

Type
---------

ml-evaluator

Class
---------

fire.nodes.ml.NodeMulticlassClassificationEvaluator

Fields
---------

.. list-table::
      :widths: 10 5 10
      :header-rows: 1

      * - Name
        - Title
        - Description
      * - labelCol
        - Label Column
        - The label column used for model fitting.
      * - predictionCol
        - Prediction Column
        - The prediction column.
      * - modelUUID
        - Model UUID
        - The UUID of the model.
      * - confusionMatrix
        - Confusion Matrix
        -
      * - output_confusion_matrix_chart
        - Output Confusion Matrix Chart
        - Whether to display the confusion matrix chart.
      * - cm_chart_title
        - Confusion Matrix Chart Title
        - Title to display in the confusion matrix chart.
      * - cm_chart_description
        - Confusion Matrix Chart Description
        - Description to display in the confusion matrix chart.
      * - confusionMatrixTargetLegend
        - Confusion Matrix Target Legend
        - Legend name to display for Target in the confusion matrix.
      * - confusionMatrixPredictedLabelLegend
        - Confusion Matrix PredictedLabel Legend
        - Legend name to display for Predicted Label in the confusion matrix.
      * - confusionMatrixCountLegend
        - Confusion Matrix Count Legend
        - Legend name to display for Count in the confusion matrix.
      * - Description
        - Confusion Matrix Description
        -
      * - confusionMatrixRowDescription
        - Confusion Matrix Outcome description
        - Business description of the outcome represented by each confusion matrix row.

Details
-------

Evaluator for multiclass classification, which expects two input columns: prediction and label.
More details are available on the Spark MLlib/ML docs page: https://spark.apache.org/docs/1.6.0/mllib-evaluation-metrics.html#multiclass-classification

Examples
--------

The example below is taken from: https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html#multiclass-classification

.. code-block:: scala

    import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
    import org.apache.spark.mllib.evaluation.MulticlassMetrics
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.util.MLUtils

    // Load training data in LIBSVM format
    val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_multiclass_classification_data.txt")

    // Split data into training (60%) and test (40%)
    val Array(training, test) = data.randomSplit(Array(0.6, 0.4), seed = 11L)
    training.cache()

    // Run training algorithm to build the model
    val model = new LogisticRegressionWithLBFGS()
      .setNumClasses(3)
      .run(training)

    // Compute raw scores on the test set
    val predictionAndLabels = test.map { case LabeledPoint(label, features) =>
      val prediction = model.predict(features)
      (prediction, label)
    }

    // Instantiate metrics object
    val metrics = new MulticlassMetrics(predictionAndLabels)

    // Confusion matrix
    println("Confusion matrix:")
    println(metrics.confusionMatrix)

    // Overall Statistics
    val accuracy = metrics.accuracy
    println("Summary Statistics")
    println(s"Accuracy = $accuracy")

    // Precision by label
    val labels = metrics.labels
    labels.foreach { l =>
      println(s"Precision($l) = " + metrics.precision(l))
    }

    // Recall by label
    labels.foreach { l =>
      println(s"Recall($l) = " + metrics.recall(l))
    }

    // False positive rate by label
    labels.foreach { l =>
      println(s"FPR($l) = " + metrics.falsePositiveRate(l))
    }

    // F-measure by label
    labels.foreach { l =>
      println(s"F1-Score($l) = " + metrics.fMeasure(l))
    }

    // Weighted stats
    println(s"Weighted precision: ${metrics.weightedPrecision}")
    println(s"Weighted recall: ${metrics.weightedRecall}")
    println(s"Weighted F1 score: ${metrics.weightedFMeasure}")
    println(s"Weighted false positive rate: ${metrics.weightedFalsePositiveRate}")
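
To make the reported numbers easier to interpret, the following plain-Scala sketch recomputes a confusion matrix, accuracy, and per-label precision and recall from a small list of (prediction, label) pairs, with no Spark dependency. The object name ``MulticlassMetricsSketch`` and its methods are hypothetical and exist only for illustration; they are not part of Spark or of this node. The layout assumed here (rows are actual labels, columns are predicted labels, both in ascending label order) matches how Spark documents ``MulticlassMetrics.confusionMatrix``. Returning 0.0 when a denominator is zero is a choice made for this sketch.

```scala
// Hypothetical helper for illustration only; not part of Spark or this node.
object MulticlassMetricsSketch {
  // Each pair is (prediction, label), the same order used in the example above.
  type Pairs = Seq[(Double, Double)]

  /** Sorted distinct class values seen in either position. */
  def labels(pairs: Pairs): Seq[Double] =
    pairs.flatMap { case (p, l) => Seq(p, l) }.distinct.sorted

  /** Confusion matrix: rows = actual label, columns = predicted label. */
  def confusionMatrix(pairs: Pairs): Vector[Vector[Int]] = {
    val ls = labels(pairs)
    // Count how often each (prediction, label) combination occurs.
    val counts = pairs.groupBy(identity).map { case (k, v) => k -> v.size }
    ls.map { actual =>
      ls.map(predicted => counts.getOrElse((predicted, actual), 0)).toVector
    }.toVector
  }

  /** Fraction of pairs where the prediction equals the label. */
  def accuracy(pairs: Pairs): Double =
    if (pairs.isEmpty) 0.0
    else pairs.count { case (p, l) => p == l }.toDouble / pairs.size

  /** True positives for `label` divided by everything predicted as `label`. */
  def precision(pairs: Pairs, label: Double): Double = {
    val predictedAsLabel = pairs.count(_._1 == label)
    val truePositives = pairs.count { case (p, l) => p == label && l == label }
    if (predictedAsLabel == 0) 0.0 else truePositives.toDouble / predictedAsLabel
  }

  /** True positives for `label` divided by everything actually `label`. */
  def recall(pairs: Pairs, label: Double): Double = {
    val actuallyLabel = pairs.count(_._2 == label)
    val truePositives = pairs.count { case (p, l) => p == label && l == label }
    if (actuallyLabel == 0) 0.0 else truePositives.toDouble / actuallyLabel
  }
}
```

For example, calling ``MulticlassMetricsSketch.confusionMatrix(pairs)`` on a handful of hand-written pairs gives a matrix whose diagonal holds the correctly classified counts, which is a quick way to sanity-check the chart this node produces.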