Package com.salesforce.op.filters

package filters

Linear Supertypes
AnyRef, Any

Type Members

  1. case class FeatureDistribution(name: String, key: Option[String], count: Long, nulls: Long, distribution: Array[Double], summaryInfo: Array[Double], type: FeatureDistributionType = FeatureDistributionType.Training) extends FeatureDistributionLike with Product with Serializable

    Class containing summary information for a feature

    name

    name of the feature

    key

    map key associated with distribution (when the feature is a map)

    count

    total number of times the feature was seen

    nulls

    number of empty values seen for the feature

    distribution

    binned counts of feature values (hashed for strings, evenly spaced bins for numerics)

    summaryInfo

    either min and max number of tokens for text data, or splits used for bins for numeric data
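The fields above can be illustrated with a minimal standalone sketch. It does not use the library: `SimpleDistribution`, `fillRate`, and `fromNumeric` are hypothetical names, and the sketch assumes that `nulls` is included in `count` and that numeric bins are evenly spaced between the observed minimum and maximum.

```scala
// Standalone sketch: no TransmogrifAI or Spark dependency.
object FeatureDistributionSketch {

  // Hypothetical stand-in mirroring a subset of the documented fields.
  final case class SimpleDistribution(
    name: String,
    count: Long,
    nulls: Long,
    distribution: Array[Double],
    summaryInfo: Array[Double]
  ) {
    // Hypothetical helper: fraction of rows with a non-empty value
    // (assumes `nulls` is counted within `count`).
    def fillRate: Double =
      if (count == 0L) 0.0 else (count - nulls).toDouble / count
  }

  // Bin optional numeric values into `bins` evenly spaced buckets over [min, max].
  def fromNumeric(name: String, values: Seq[Option[Double]], bins: Int): SimpleDistribution = {
    require(bins > 0, "bins must be positive")
    val present = values.flatten
    val counts = Array.fill(bins)(0.0)
    val total = values.size.toLong
    if (present.isEmpty) {
      // Every value is empty: no bin edges can be derived.
      SimpleDistribution(name, total, total, counts, Array.empty[Double])
    } else {
      val lo = present.min
      val hi = present.max
      val width = (hi - lo) / bins
      present.foreach { v =>
        // A zero-width range puts everything in bin 0; the maximum goes in the last bin.
        val idx = if (width == 0.0) 0 else math.min(((v - lo) / width).toInt, bins - 1)
        counts(idx) += 1.0
      }
      // Bin edges, kept in summaryInfo as described for numeric data.
      val splits = Array.tabulate(bins + 1)(i => lo + i * width)
      SimpleDistribution(name, total, total - present.size, counts, splits)
    }
  }
}
```

Keeping the bin edges alongside the counts means a second data set could be binned with the same splits, which is what makes two distributions comparable.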

  2. case class FilteredRawData(cleanedData: DataFrame, featuresToDrop: Array[OPFeature], mapKeysToDrop: Map[String, Set[String]], featureDistributions: Seq[FeatureDistribution]) extends Product with Serializable

    Case class holding the data filtered by RawFeatureFilter (RFF) and the features to drop

    cleanedData

    RFF cleaned data

    featuresToDrop

    raw features dropped by RFF

    mapKeysToDrop

    keys in map features dropped by RFF

    featureDistributions

    feature distributions calculated from the training and scoring data
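As a rough illustration of how `mapKeysToDrop` might be applied, the sketch below filters keys out of map-valued features using plain Scala collections instead of a DataFrame. It assumes the outer map is keyed by feature name and the set holds the keys to remove for that feature; `MapFeatures` and `dropMapKeys` are hypothetical names, not part of the library.

```scala
// Standalone sketch using plain collections in place of a Spark DataFrame.
object FilteredRawDataSketch {

  // Hypothetical row type: map-feature name -> (key -> value).
  type MapFeatures = Map[String, Map[String, Double]]

  // Remove the listed keys from each map feature; unlisted features pass through unchanged.
  def dropMapKeys(row: MapFeatures, mapKeysToDrop: Map[String, Set[String]]): MapFeatures =
    row.map { case (feature, kv) =>
      val drop = mapKeysToDrop.getOrElse(feature, Set.empty[String])
      feature -> kv.filterNot { case (k, _) => drop.contains(k) }
    }
}
```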

  3. class RawFeatureFilter[T] extends Serializable

    Specialized stage that loads data and computes distributions and empty counts on raw features. This information is then used to determine which raw features should be excluded from the workflow DAG.

    Note: Currently, raw features that are not explicitly blacklisted, but are unused because they are inputs to explicitly blacklisted features, are not present as raw features in the model, nor in ModelInsights. However, they are accessible from an OpWorkflowModel via getRawFeatureDistributions().

    T

    datatype of the reader
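One way the empty counts could drive exclusion is a minimum fill-rate check, sketched below without any library dependency. The `Counts` type, the `minFill` threshold, and `featuresToExclude` are hypothetical names chosen for illustration; the actual criteria used by the class are not described on this page.

```scala
// Standalone sketch of excluding sparsely populated features.
object RawFeatureFilterSketch {

  // Hypothetical per-feature counts, mirroring `count` and `nulls` of a distribution.
  final case class Counts(name: String, count: Long, nulls: Long) {
    // Fraction of rows with a non-empty value; 0.0 when no rows were seen.
    def fillRate: Double =
      if (count == 0L) 0.0 else (count - nulls).toDouble / count
  }

  // Names of the features whose fill rate falls below `minFill`, in input order.
  def featuresToExclude(stats: Seq[Counts], minFill: Double): Seq[String] =
    stats.filter(_.fillRate < minFill).map(_.name)
}
```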

  4. case class Summary(min: Double, max: Double, sum: Double, count: Double) extends Product with Serializable

    Class used to get summaries of prepared features to determine distribution binning strategy

    min

    minimum value seen for double, minimum number of tokens in one text for text

    max

    maximum value seen for double, maximum number of tokens in one text for text

    sum

    sum of values for double, total number of tokens for text

    count

    number of doubles for double, number of texts for text
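Because all four fields can be merged pairwise, a summary of this shape can be built incrementally, for example one partition at a time. The sketch below shows that idea with a hypothetical `SimpleSummary` type; `combine`, `of`, `empty`, and `summarize` are illustrative names and are not taken from the library.

```scala
// Standalone sketch of an incrementally mergeable summary.
object SummarySketch {

  // Hypothetical stand-in with the same four fields as the documented class.
  final case class SimpleSummary(min: Double, max: Double, sum: Double, count: Double) {
    // Merge two partial summaries into one covering both inputs.
    def combine(other: SimpleSummary): SimpleSummary =
      SimpleSummary(
        math.min(min, other.min),
        math.max(max, other.max),
        sum + other.sum,
        count + other.count
      )
  }

  object SimpleSummary {
    // Identity element for `combine`: merging with it leaves a summary unchanged.
    val empty: SimpleSummary =
      SimpleSummary(Double.PositiveInfinity, Double.NegativeInfinity, 0.0, 0.0)

    // Summary of a single numeric value.
    def of(value: Double): SimpleSummary = SimpleSummary(value, value, value, 1.0)
  }

  // Fold a sequence of values into one summary.
  def summarize(values: Seq[Double]): SimpleSummary =
    values.map(SimpleSummary.of).foldLeft(SimpleSummary.empty)(_ combine _)
}
```

The resulting `min` and `max` are what an evenly spaced binning strategy would need to place its bin edges for numeric data.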

Value Members

  1. object FeatureDistribution extends Serializable

  2. object RawFeatureFilter extends Serializable

  3. object Summary extends Product with Serializable

