Object

com.salesforce.op.stages.impl.preparators

DerivedFeatureFilterUtils

Related Doc: package preparators

object DerivedFeatureFilterUtils

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def getFeaturesToDrop(stats: Array[ColumnStatistics], minVariance: Double, minCorrelation: Double = 0.0, maxCorrelation: Double = 1.0, maxFeatureCorr: Double = 1.0, maxCramersV: Double = 1.0, maxRuleConfidence: Double = 1.0, minRequiredRuleSupport: Double = 1.0, removeFeatureGroup: Boolean = false, protectTextSharedHash: Boolean = true): Array[(ColumnStatistics, String)]

    Identifies which features to drop based on the input exclusion criteria, and returns an array of the dropped columns, each paired with a message explaining why it was dropped (for logging).

    stats

    ColumnStatistics containing multivariate statistics computed by Spark

    minVariance

    Minimum variance allowed for a feature; features with lower variance are dropped

    minCorrelation

    Minimum correlation with the label; features below this threshold are dropped

    maxCorrelation

    Maximum correlation with the label; features above this threshold are dropped

    maxFeatureCorr

    Maximum correlation allowed between two features; when it is exceeded, the later feature is dropped

    maxCramersV

    Maximum Cramér's V allowed for a categorical feature; features above this threshold are dropped

    maxRuleConfidence

    Maximum allowed confidence of association rules; features whose rules exceed it are dropped

    minRequiredRuleSupport

    Minimum support an association rule must have before it can be used to drop a feature

    removeFeatureGroup

    Whether to remove all features descended from a parent feature when any of its derived features meets the exclusion criteria

    protectTextSharedHash

    Whether an individual text hash is dropped or kept independently of its related null indicators and other hashes

    returns

    columns to drop, with exclusion reasons
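
The threshold logic described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `ColStatsSketch`, `DropSketch`, `reasonsToDrop` and `featuresToDrop` are hypothetical names, only three of the documented statistics are modelled, and the exact comparisons (strict inequalities, absolute value of the correlation) are assumptions.

```scala
// Hypothetical, simplified stand-in for ColumnStatistics: only the fields this sketch needs.
final case class ColStatsSketch(
  name: String,
  variance: Option[Double],
  corrLabel: Option[Double],
  cramersV: Option[Double]
)

object DropSketch {

  // Collects one human-readable reason per violated threshold.
  // Assumption: strict comparisons, and correlation judged by absolute value.
  def reasonsToDrop(
    c: ColStatsSketch,
    minVariance: Double,
    minCorrelation: Double = 0.0,
    maxCorrelation: Double = 1.0,
    maxCramersV: Double = 1.0
  ): Seq[String] = Seq(
    c.variance.filter(_ < minVariance)
      .map(v => s"variance $v below min variance $minVariance"),
    c.corrLabel.filter(r => math.abs(r) < minCorrelation)
      .map(r => s"label correlation $r below min correlation $minCorrelation"),
    c.corrLabel.filter(r => math.abs(r) > maxCorrelation)
      .map(r => s"label correlation $r above max correlation $maxCorrelation"),
    c.cramersV.filter(_ > maxCramersV)
      .map(v => s"Cramer's V $v above max Cramer's V $maxCramersV")
  ).flatten

  // Pairs every offending column with its joined reasons, mirroring the
  // (column, message) shape of the documented return type.
  def featuresToDrop(
    stats: Array[ColStatsSketch],
    minVariance: Double,
    minCorrelation: Double = 0.0,
    maxCorrelation: Double = 1.0,
    maxCramersV: Double = 1.0
  ): Array[(ColStatsSketch, String)] =
    stats
      .map(c => (c, reasonsToDrop(c, minVariance, minCorrelation, maxCorrelation, maxCramersV)))
      .collect { case (c, reasons) if reasons.nonEmpty => (c, reasons.mkString("; ")) }
}
```

Keeping the reasons as strings alongside each column matches the documented intent of logging why a column was dropped. Feature-to-feature correlation, association rules and group removal are left out of the sketch.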

  11. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  12. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  13. def makeColumnStatistics(metaCols: Seq[OpVectorColumnMetadata], statsSummary: MultivariateStatisticalSummary, labelNameAndIndex: Option[(String, Int)] = None, corrsWithLabel: Array[Double] = Array.empty, corrIndices: Array[Int] = Array.empty, categoricalStats: Array[CategoricalGroupStats] = Array.empty, corrMatrix: Option[Matrix] = None): Array[ColumnStatistics]

    Builds an Array of ColumnStatistics objects containing all the statistics calculated for each column (e.g. mean, max, variance, correlation, Cramér's V).

    metaCols

    Sequence of OpVectorColumnMetadata to use for grouping features

    statsSummary

    Multivariate statistics previously computed by Spark

    labelNameAndIndex

    Name of label and index of the column corresponding to the label

    corrsWithLabel

    Array containing correlations between each feature vector element and the label

    corrIndices

    Indices of the columns for which correlations are actually computed (e.g. hashed text features can be ignored)

    categoricalStats

    Array of CategoricalGroupStats for each group of feature vector indices corresponding to a categorical feature

    returns

    Array of ColumnStatistics objects, one for each column in metaCols
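
To make the idea of "one statistics object per column" concrete, the following plain-Scala sketch computes a few per-column statistics directly from rows of feature values. It is only an illustration: `ColumnStatsSketch`, `Summary` and `summarize` are hypothetical names, and the real method consumes statistics already computed by Spark rather than recomputing them as done here.

```scala
object ColumnStatsSketch {

  // Hypothetical per-column summary; the real ColumnStatistics carries more fields.
  final case class Summary(
    name: String,
    mean: Double,
    variance: Double,
    max: Double,
    corrLabel: Option[Double]
  )

  private def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Sample variance (divides by n - 1); 0.0 when there are fewer than two rows.
  private def variance(xs: Seq[Double]): Double =
    if (xs.size < 2) 0.0
    else {
      val m = mean(xs)
      xs.map(x => (x - m) * (x - m)).sum / (xs.size - 1)
    }

  // Pearson correlation; None when either side has zero variance.
  private def pearson(xs: Seq[Double], ys: Seq[Double]): Option[Double] = {
    val mx = mean(xs)
    val my = mean(ys)
    val cov = xs.zip(ys).map { case (x, y) => (x - mx) * (y - my) }.sum
    val sx = xs.map(x => (x - mx) * (x - mx)).sum
    val sy = ys.map(y => (y - my) * (y - my)).sum
    val denom = math.sqrt(sx * sy)
    if (denom == 0.0) None else Some(cov / denom)
  }

  // names: one name per vector element; rows: one feature vector per row;
  // label: optional label values, aligned with rows.
  def summarize(
    names: Seq[String],
    rows: Seq[Array[Double]],
    label: Option[Seq[Double]] = None
  ): Array[Summary] =
    names.zipWithIndex.map { case (name, i) =>
      val col = rows.map(_(i))
      Summary(name, mean(col), variance(col), col.max, label.flatMap(pearson(col, _)))
    }.toArray
}
```

Modelling the label correlation as an `Option` reflects that it may be unavailable for a column, for example when no label is supplied or the column is constant.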

  14. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. final def notify(): Unit

    Definition Classes
    AnyRef
  16. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  17. def removeFeatures(indicesToKeep: Array[Int], removeBadFeatures: Boolean): (OPVector) ⇒ OPVector

    Transformation used in derived feature filters. If removeBadFeatures is false, this is just the identity (does nothing); otherwise, it returns an OPVector containing only the columns in indicesToKeep.

    indicesToKeep

    column indices of derived features to keep

    removeBadFeatures

    whether to remove any features

    returns

    OPVector with bad features dropped if removeBadFeatures is true; the input OPVector unchanged otherwise
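
The shape of this transformation can be sketched on plain arrays. This is a simplified illustration, not the library code: `RemoveFeaturesSketch` is a hypothetical name, `Array[Double]` stands in for OPVector, and the sketch reads removeBadFeatures = true as "apply the filter", as the returns description suggests.

```scala
object RemoveFeaturesSketch {

  // Returns a function that keeps only the values at indicesToKeep when
  // removeBadFeatures is true, and returns its input unchanged otherwise.
  def removeFeatures(
    indicesToKeep: Array[Int],
    removeBadFeatures: Boolean
  ): Array[Double] => Array[Double] =
    if (removeBadFeatures) (v: Array[Double]) => indicesToKeep.map(i => v(i))
    else (v: Array[Double]) => v
}
```

Deciding between the two functions once, up front, means the flag is not re-checked for every vector that is later transformed.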

  18. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  19. def toString(): String

    Definition Classes
    AnyRef → Any
  20. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  21. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
