Binned metrics (BinaryClassificationBinMetrics) for binary classification models
Metrics for binary classification models
Overall precision of model, TP / (TP + FP)
Overall recall of model, TP / (TP + FN)
Overall F1 score of model, 2 / (1 / Precision + 1 / Recall)
Area under the ROC curve (AuROC) of model
Area under the precision-recall curve (AuPR) of model
Overall error rate of model
True positive count at Spark's default decision threshold (0.5)
True negative count at Spark's default decision threshold (0.5)
False positive count at Spark's default decision threshold (0.5)
False negative count at Spark's default decision threshold (0.5)
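These summary metrics follow from the four counts above by the usual confusion-matrix arithmetic. A minimal illustrative sketch (not the library's code; ConfusionCounts and safeDiv are made-up names):

case class ConfusionCounts(tp: Long, tn: Long, fp: Long, fn: Long) {
  // guard against empty denominators
  private def safeDiv(num: Double, den: Double): Double = if (den == 0.0) 0.0 else num / den
  def precision: Double = safeDiv(tp.toDouble, (tp + fp).toDouble)              // TP / (TP + FP)
  def recall: Double    = safeDiv(tp.toDouble, (tp + fn).toDouble)              // TP / (TP + FN)
  def f1: Double        = safeDiv(2.0 * precision * recall, precision + recall) // 2 / (1/Precision + 1/Recall)
  def error: Double     = safeDiv((fp + fn).toDouble, (tp + tn + fp + fn).toDouble) // assumed: 1 - accuracy
}

For example, ConfusionCounts(tp = 80, tn = 90, fp = 10, fn = 20).f1 evaluates to roughly 0.842.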
Metrics across different threshold values
Threshold metrics for binary classification predictions
Sequence of thresholds for subsequent threshold metrics
Sequence of precision values at thresholds
Sequence of recall values at thresholds
Sequence of false positive rates, FP / (FP + TN), at thresholds
Sequence of true positive counts at thresholds
Sequence of false positive counts at thresholds
Sequence of true negative counts at thresholds
Sequence of false negative counts at thresholds
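A hypothetical sketch of how such per-threshold counts relate to a set of (score, label) pairs; treating a score equal to the threshold as a positive prediction is an assumption here, as are all the names. Precision, recall, and false positive rate at each threshold then follow as TP / (TP + FP), TP / (TP + FN), and FP / (FP + TN).

def countsByThreshold(
  scoresAndLabels: Seq[(Double, Double)],  // (predicted score, 0/1 label)
  thresholds: Seq[Double]
): Seq[(Long, Long, Long, Long)] = thresholds.map { t =>
  // counts of (TP, FP, TN, FN) when predicting positive iff score >= t
  val tp = scoresAndLabels.count { case (s, l) => s >= t && l > 0 }.toLong
  val fp = scoresAndLabels.count { case (s, l) => s >= t && l <= 0 }.toLong
  val tn = scoresAndLabels.count { case (s, l) => s < t && l <= 0 }.toLong
  val fn = scoresAndLabels.count { case (s, l) => s < t && l > 0 }.toLong
  (tp, fp, tn, fn)
}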
Container to store the count of a class
Classification Metrics
Eval metric
Trait for all different kinds of evaluation metrics
Forecast Metrics
Metrics for forecasting problems
Symmetric Mean Absolute Percentage Error
Seasonal Error
Mean Absolute Scaled Error
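These three metrics follow standard forecasting conventions. The sketch below shows one common formulation; the exact edge-case handling (zero denominators, the season length used for the seasonal error) and the function names are assumptions, not the library's implementation.

// Symmetric Mean Absolute Percentage Error: mean of 2|y - yHat| / (|y| + |yHat|)
def smape(actual: Seq[Double], predicted: Seq[Double]): Double = {
  val terms = actual.zip(predicted).map { case (y, yHat) =>
    val denom = math.abs(y) + math.abs(yHat)
    if (denom == 0.0) 0.0 else 2.0 * math.abs(y - yHat) / denom
  }
  terms.sum / terms.size
}
// Seasonal error: mean absolute difference between observations one season apart
def seasonalError(history: Seq[Double], seasonLength: Int): Double = {
  val diffs = history.drop(seasonLength).zip(history).map { case (yt, ytLag) => math.abs(yt - ytLag) }
  diffs.sum / diffs.size
}
// Mean Absolute Scaled Error: mean absolute forecast error scaled by the seasonal error
def mase(actual: Seq[Double], predicted: Seq[Double], sErr: Double): Double = {
  val mae = actual.zip(predicted).map { case (y, yHat) => math.abs(y - yHat) }.sum / actual.size
  mae / sErr
}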
Multiclass misclassification metrics, including the top n (n = confMatrixMinSupport) most frequently misclassified classes for each label or prediction category.
Container to store the most frequently misclassified classes for each label/prediction category
Metrics for multiclass classification problems
Metrics for topK MultiClassification
Each metric contains a list of metrics corresponding to each of the topK most occurring labels. If the predicted label is outside of the topK most occurring labels, it is treated as incorrect.
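An illustrative sketch of that topK bookkeeping, counting correct predictions per label for the K most frequent labels only (all names here are made up; this is not the evaluator's code):

def perLabelCorrectCounts(
  labelsAndPredictions: Seq[(Double, Double)],  // (true label, predicted label)
  topK: Int
): Seq[(Double, Long, Long)] = {
  // the topK most frequently occurring true labels
  val topKLabels: Seq[Double] = labelsAndPredictions
    .groupBy { case (label, _) => label }
    .map { case (label, rows) => label -> rows.size }
    .toSeq.sortBy { case (_, count) => -count }
    .take(topK).map { case (label, _) => label }
  val topKSet = topKLabels.toSet
  topKLabels.map { label =>
    val rows = labelsAndPredictions.filter { case (l, _) => l == label }
    // a predicted label outside the topK set is always counted as incorrect
    val correct = rows.count { case (l, p) => topKSet.contains(p) && p == l }.toLong
    (label, correct, rows.size.toLong)  // (label, correct count, total count)
  }
}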
A container for multiple evaluation metrics for evaluators
map of evaluation metrics
Metrics for the multi-class confusion matrix. It captures the confusion matrix of records for which 1) the label belongs to the top n most occurring classes (n = confMatrixNumClasses), and 2) the top predicted probability exceeds one of the thresholds in confMatrixThresholds
Threshold-based metrics for multiclass classification
A classification is counted as correct, incorrect, or no prediction based on the topN value and the score threshold:
Correct - the score of the true label is among the top N scores AND the score of the true label is >= threshold.
Incorrect - the score of the top predicted label is >= threshold AND (the true label is NOT among the top N predicted labels OR the score of the true label is < threshold).
No prediction - otherwise (the score of the top predicted label is < threshold).
list of topN values (used as keys for the count maps)
list of threshold values (correspond to thresholds at the indices of the arrays in the count maps)
map from topN value to an array of counts of correct classifications at each threshold
map from topN value to an array of counts of incorrect classifications at each threshold
map from topN value to an array of counts of no prediction at each threshold
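A minimal sketch of that per-record decision rule for a single topN value and threshold (how ties between equal scores are broken is an assumption, and the names are illustrative only):

sealed trait Outcome
case object Correct extends Outcome
case object Incorrect extends Outcome
case object NoPrediction extends Outcome

def classify(scores: Array[Double], trueLabel: Int, topN: Int, threshold: Double): Outcome = {
  // indices of the topN highest-scoring classes
  val topNIndices = scores.zipWithIndex.sortBy { case (s, _) => -s }.take(topN).map(_._2).toSet
  val trueScore = scores(trueLabel)
  val topScore = scores.max
  if (topNIndices.contains(trueLabel) && trueScore >= threshold) Correct
  else if (topScore >= threshold) Incorrect
  else NoPrediction
}

Aggregating these outcomes over all records, once per (topN, threshold) combination, yields the three count maps described above.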
Base Interface for OpBinaryClassificationEvaluator
Base Interface for OpEvaluator, to be used in Evaluator creation. Can be used for both OP and Spark evaluation (so with workflows and cross-validation).
GeneralMetrics
Trait for labelCol param
Trait for the predictionCol param, which contains all output results
Trait for internal flattened predictionCol param
Trait for internal flattened probabilityCol Param
Trait for internal flattened rawPredictionCol param
Base Interface for OpMultiClassificationEvaluator
Base Interface for OpRegressionEvaluator
Regression Metrics
Metrics for regression problems
Histogram of signed percentage errors
Histogram bins, where for example [-1, 0, 1] refers to the bins [-1, 0) and [0, 1]
Histogram counts (one fewer entry than the bins parameter)
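A sketch of how such a histogram could be filled. The error definition ((prediction - label) / label), the skipping of zero labels, and the closed right edge of the last bin are assumptions, as is the function name.

def signedPercentageErrorHistogram(
  labelsAndPredictions: Seq[(Double, Double)],  // (label, prediction)
  bins: Seq[Double]                             // e.g. Seq(-1.0, 0.0, 1.0) -> bins [-1, 0) and [0, 1]
): Seq[Long] = {
  val counts = Array.fill(bins.size - 1)(0L)
  for ((label, prediction) <- labelsAndPredictions if label != 0.0) {
    val err = (prediction - label) / label
    val lastBin = bins.size - 2
    // all bins are half-open [lo, hi) except the last one, which is closed on the right
    val idx = bins.indices.init.find { i =>
      err >= bins(i) && (if (i == lastBin) err <= bins(i + 1) else err < bins(i + 1))
    }
    idx.foreach(i => counts(i) += 1)
  }
  counts.toSeq
}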
A container for a single evaluation metric for evaluators
metric name
metric value
Binary Classification Metrics
Eval metric companion object
Just a handy factory for evaluators
Multi Classification Metrics
Contains evaluator names used in logging
Regression Metrics
Binned metrics for binary classification models (BinaryClassificationBinMetrics), computed over bins of the model score
Brier score for the overall dataset
size of each bin
center of each bin
total number of data points in each bin
count of labels > 0 in each bin
average score in each bin
average conversion rate in each bin
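A hedged sketch of how these binned statistics fit together, assuming equal-width bins over the [0, 1] score range (the library may choose bins differently); the Brier score is the mean squared difference between the score and the 0/1 label over the whole dataset. All names below are illustrative.

def binnedScoreMetrics(scoresAndLabels: Seq[(Double, Double)], numBins: Int): Unit = {
  val n = scoresAndLabels.size
  val brierScore = scoresAndLabels.map { case (s, l) => (s - l) * (s - l) }.sum / n
  val binSize = 1.0 / numBins
  val binCenters = (0 until numBins).map(i => i * binSize + binSize / 2)
  // bucket each score into its bin index, clamping a score of 1.0 into the last bin
  val byBin = scoresAndLabels.groupBy { case (s, _) => math.min((s / binSize).toInt, numBins - 1) }
  println(f"BrierScore = $brierScore%.4f, binSize = $binSize%.3f")
  (0 until numBins).foreach { i =>
    val rows = byBin.getOrElse(i, Seq.empty)
    val count = rows.size
    val positives = rows.count { case (_, l) => l > 0 }
    val avgScore = if (count == 0) 0.0 else rows.map(_._1).sum / count
    val conversionRate = if (count == 0) 0.0 else positives.toDouble / count
    println(f"bin ${binCenters(i)}%.2f: count=$count positives=$positives " +
      f"avgScore=$avgScore%.3f conversionRate=$conversionRate%.3f")
  }
}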