Binary Classification Evaluator

Evaluator for binary classification, which expects two input columns: rawPrediction and label.

Output

It outputs the probability for each class.

Type

ml-evaluator

Class

fire.nodes.ml.NodeBinaryClassificationEvaluator

Fields

Name | Title | Description
labelCol | Label Column | The label column for model fitting.
predictionCol | Prediction Column | The prediction column.
modelUUID | Model UUID | Enter the model UUID.
confusionMatrix | Confusion Matrix |
output_confusion_matrix_chart | Output Confusion Matrix Chart | Whether to display the confusion matrix chart.
cmChartTitle | Confusion Matrix Chart Title | Title to display in the confusion matrix chart.
cmChartDescription | Confusion Matrix Chart Description | Description to display in the confusion matrix chart.
confusionMatrixTargetLegend | Confusion Matrix Target Legend | Legend name to display for the target in the confusion matrix.
confusionMatrixPredictedLabelLegend | Confusion Matrix PredictedLabel Legend | Legend name to display for the predicted label in the confusion matrix.
Description | Confusion Matrix Description |
path | Save Confusion Matrix Path | Path at which to save the confusion matrix.
confusionMatrixRowDescription | Confusion Matrix Outcome description | Add the business details of the outcome of the confusion matrix rows.
ROC Curve | ROC Curve |
output_roc_chart | Output ROC Curve | Whether to display the ROC curve chart.
roc_title | ROC Curve Chart Title | Title to display in the ROC curve chart.
roc_description | ROC Curve Chart Description | Description to display in the ROC curve chart.
xlabel | X Label | Label for the x-axis.
ylabel | Y Label | Label for the y-axis.
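For orientation, the confusion matrix that the chart fields above configure boils down to four counts: true positives, false positives, true negatives, and false negatives. The following is a minimal sketch in plain Scala of how those counts, and the precision, recall, and F-measure derived from them, could be computed from (score, label) pairs. The names `ConfusionMatrixSketch`, `ConfusionMatrix`, and `confusionMatrix`, the sample data, and the 0.5 threshold are all illustrative assumptions; this is not the node's actual implementation.

```scala
object ConfusionMatrixSketch {

  // Hypothetical container for the four confusion matrix counts.
  final case class ConfusionMatrix(tp: Long, fp: Long, tn: Long, fn: Long) {
    // Fraction of predicted positives that are truly positive.
    def precision: Double = if (tp + fp == 0) 0.0 else tp.toDouble / (tp + fp)

    // Fraction of actual positives that were predicted positive.
    def recall: Double = if (tp + fn == 0) 0.0 else tp.toDouble / (tp + fn)

    // F-measure: (1 + beta^2) * P * R / (beta^2 * P + R); beta = 1 gives F1.
    def fMeasure(beta: Double = 1.0): Double = {
      val b2 = beta * beta
      val denom = b2 * precision + recall
      if (denom == 0.0) 0.0 else (1 + b2) * precision * recall / denom
    }
  }

  // Each pair is (score, label), with label 1.0 for positive and 0.0 for negative.
  // A row counts as predicted positive when its score is at or above the threshold.
  def confusionMatrix(scoreAndLabels: Seq[(Double, Double)], threshold: Double): ConfusionMatrix =
    scoreAndLabels.foldLeft(ConfusionMatrix(0, 0, 0, 0)) { case (cm, (score, label)) =>
      (score >= threshold, label == 1.0) match {
        case (true, true)   => cm.copy(tp = cm.tp + 1)
        case (true, false)  => cm.copy(fp = cm.fp + 1)
        case (false, false) => cm.copy(tn = cm.tn + 1)
        case (false, true)  => cm.copy(fn = cm.fn + 1)
      }
    }

  def main(args: Array[String]): Unit = {
    // Made-up scores and labels, for illustration only.
    val scored = Seq((0.9, 1.0), (0.8, 1.0), (0.4, 0.0), (0.3, 1.0), (0.1, 0.0))
    val cm = confusionMatrix(scored, threshold = 0.5)
    println(cm)
    println(s"Precision: ${cm.precision}, Recall: ${cm.recall}, F1: ${cm.fMeasure()}")
  }
}
```

With the sample data and a 0.5 threshold, the row scored 0.3 is a positive that falls below the threshold, so it is counted as a false negative.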

Details

Evaluator for binary classification, which expects two input columns: rawPrediction and label.

More on the Spark MLlib docs page: http://spark.apache.org/docs/latest/mllib-evaluation-metrics.html#binary-classification
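As a rough illustration of an evaluator that takes a rawPrediction column and a label column, the sketch below uses Spark ML's `BinaryClassificationEvaluator`. That this node wraps that class is an assumption based on its name, not something the page states. The object name `EvaluatorSketch`, the helper `areaUnderRoc`, and the sample rows are hypothetical, and the code needs the spark-sql and spark-mllib libraries on the classpath.

```scala
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.sql.SparkSession

object EvaluatorSketch {

  // Builds a tiny scored DataFrame and evaluates it.
  // The rawPrediction column is a Double score here; the evaluator also accepts a Vector.
  def areaUnderRoc(spark: SparkSession): Double = {
    import spark.implicits._

    // Made-up (score, label) rows, for illustration only.
    val scored = Seq((0.9, 1.0), (0.8, 1.0), (0.4, 0.0), (0.3, 1.0), (0.1, 0.0))
      .toDF("rawPrediction", "label")

    val evaluator = new BinaryClassificationEvaluator()
      .setLabelCol("label")
      .setRawPredictionCol("rawPrediction")
      .setMetricName("areaUnderROC") // "areaUnderPR" is the other supported metric

    evaluator.evaluate(scored)
  }

  def main(args: Array[String]): Unit = {
    // Local session for the sketch; a real workflow would reuse its existing session.
    val spark = SparkSession.builder()
      .appName("EvaluatorSketch")
      .master("local[*]")
      .getOrCreate()
    try println(s"Area under ROC = ${areaUnderRoc(spark)}")
    finally spark.stop()
  }
}
```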

Examples

The example below is available at: https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html#binary-classification

    import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.util.MLUtils

    // Load training data in LIBSVM format
    // (sc is an existing SparkContext, e.g. the one provided by spark-shell)
    val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_binary_classification_data.txt")

    // Split data into training (60%) and test (40%)
    val Array(training, test) = data.randomSplit(Array(0.6, 0.4), seed = 11L)
    training.cache()

    // Run training algorithm to build the model
    val model = new LogisticRegressionWithLBFGS()
      .setNumClasses(2)
      .run(training)

    // Clear the prediction threshold so the model will return probabilities
    model.clearThreshold

    // Compute raw scores on the test set
    val predictionAndLabels = test.map { case LabeledPoint(label, features) =>
      val prediction = model.predict(features)
      (prediction, label)
    }

    // Instantiate metrics object
    val metrics = new BinaryClassificationMetrics(predictionAndLabels)

    // Precision by threshold
    val precision = metrics.precisionByThreshold
    precision.collect.foreach { case (t, p) =>
      println(s"Threshold: $t, Precision: $p")
    }

    // Recall by threshold
    val recall = metrics.recallByThreshold
    recall.collect.foreach { case (t, r) =>
      println(s"Threshold: $t, Recall: $r")
    }

    // Precision-Recall Curve
    val PRC = metrics.pr

    // F-measure
    val f1Score = metrics.fMeasureByThreshold
    f1Score.collect.foreach { case (t, f) =>
      println(s"Threshold: $t, F-score: $f, Beta = 1")
    }

    val beta = 0.5
    val fScore = metrics.fMeasureByThreshold(beta)
    fScore.collect.foreach { case (t, f) =>
      println(s"Threshold: $t, F-score: $f, Beta = 0.5")
    }

    // AUPRC
    val auPRC = metrics.areaUnderPR
    println(s"Area under precision-recall curve = $auPRC")

    // Compute thresholds used in ROC and PR curves
    val thresholds = precision.map(_._1)

    // ROC Curve
    val roc = metrics.roc

    // AUROC
    val auROC = metrics.areaUnderROC
    println(s"Area under ROC = $auROC")
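To make the ROC curve and the area under it more concrete, here is a minimal plain-Scala sketch of the underlying idea: sweep a threshold over the distinct scores, record a (false positive rate, true positive rate) point for each, and integrate with the trapezoidal rule. It does not use Spark and is not how `BinaryClassificationMetrics` is implemented internally. The names `RocSketch`, `rocPoints`, and `areaUnder`, and the sample data, are hypothetical.

```scala
object RocSketch {

  // Each pair is (score, label), with label 1.0 for positive and 0.0 for negative.
  // Returns ROC points as (falsePositiveRate, truePositiveRate), from (0, 0) to (1, 1).
  def rocPoints(scoreAndLabels: Seq[(Double, Double)]): Seq[(Double, Double)] = {
    val positives = scoreAndLabels.count(_._2 == 1.0).toDouble
    val negatives = scoreAndLabels.size - positives
    require(positives > 0 && negatives > 0, "need at least one example of each class")

    // One threshold per distinct score, highest first.
    val thresholds = scoreAndLabels.map(_._1).distinct.sortWith(_ > _)

    val points = thresholds.map { t =>
      val predictedPositive = scoreAndLabels.filter(_._1 >= t)
      val tp = predictedPositive.count(_._2 == 1.0)
      val fp = predictedPositive.size - tp
      (fp / negatives, tp / positives)
    }
    (0.0, 0.0) +: points :+ (1.0, 1.0)
  }

  // Trapezoidal rule over consecutive curve points.
  def areaUnder(points: Seq[(Double, Double)]): Double =
    points.sliding(2).collect {
      case Seq((x1, y1), (x2, y2)) => (x2 - x1) * (y1 + y2) / 2.0
    }.sum

  def main(args: Array[String]): Unit = {
    // Made-up scores and labels, for illustration only.
    val scored = Seq((0.9, 1.0), (0.8, 1.0), (0.4, 0.0), (0.3, 1.0), (0.1, 0.0))
    val roc = rocPoints(scored)
    roc.foreach { case (fpr, tpr) => println(s"FPR: $fpr, TPR: $tpr") }
    println(s"Area under ROC = ${areaUnder(roc)}")
  }
}
```

In the sample data, 5 of the 6 positive/negative pairs are ranked correctly (only the positive scored 0.3 falls below the negative scored 0.4), so the area works out to 5/6.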