FeatureLike
Apply OpIndexToStringNoFilter transformer.
A transformer that maps a feature of indices back to a new feature of corresponding text values. The index-string mapping is either from the ML attributes of the input feature, or from user-supplied labels (which take precedence over ML attributes).
Optional array of labels specifying index-string mapping. If not provided or if empty, then metadata from input feature is used instead.
name to give to values that appear at transform time but were not seen during fitting
how to handle values not seen during fitting
deindexed text feature
OpStringIndexerNoFilter for converting text into indices
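The mapping rule described above can be sketched in a few lines of plain Python. This is only an illustration of the documented behavior, not the library's implementation: the function `index_to_string`, its parameter names, and the placeholder value `"unseen"` are all hypothetical.

```python
from typing import List, Optional, Sequence


def index_to_string(
    indices: Sequence[int],
    labels: Optional[Sequence[str]] = None,
    metadata_labels: Optional[Sequence[str]] = None,
    unseen_name: str = "unseen",  # hypothetical placeholder, not a library default
) -> List[str]:
    """Map indices back to text values.

    User-supplied labels take precedence; if they are missing or empty,
    the labels carried in the input feature's metadata are used instead.
    An index with no matching label gets unseen_name rather than being dropped.
    """
    mapping = list(labels) if labels else list(metadata_labels or [])
    return [mapping[i] if 0 <= i < len(mapping) else unseen_name for i in indices]


if __name__ == "__main__":
    print(index_to_string([0, 2, 5], labels=["a", "b", "c"], metadata_labels=["x", "y", "z"]))
    print(index_to_string([1], labels=[], metadata_labels=["x", "y"]))
```

Keeping unmatched indices and renaming them, instead of filtering the rows out, is what the "NoFilter" part of the transformer name suggests.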
FeatureLike
Apply SanityChecker estimator.
Apply SanityChecker estimator. It checks for potential problems with computed features in a supervised learning setting.
feature vector
Rate to downsample the data for statistical calculations (note: actual sampling will not be exact due to Spark's dataset sampling behavior)
Seed to use when sampling
Lower limit on number of samples in downsampled data set (note: sample limit will not be exact, due to Spark's dataset sampling behavior)
Upper limit on number of samples in downsampled data set (note: sample limit will not be exact, due to Spark's dataset sampling behavior)
Maximum correlation (absolute value) allowed between a feature in the feature vector and the label
Minimum correlation (absolute value) allowed between a feature in the feature vector and the label
Which coefficient to use for computing correlation
Minimum amount of variance allowed for each feature and label
If set to true, this will automatically remove all the bad features from the feature vector
If set to true, remove all features descended from a parent feature when one of them is flagged for removal
If set to true, protect text shared hash features from removal triggered by related null indicators and other hashes
Maximum allowed confidence of association rules in categorical variables. A categorical variable will be removed if there is a choice where the maximum confidence is above this threshold, and the support for that choice is above the min rule support parameter, defined below.
Categoricals can be removed if an association rule is found between one of the choices and a categorical label where the confidence of that rule is above maxRuleConfidence and the support fraction of that choice is above minRuleSupport.
Setting for which categories of feature vector columns to exclude from the correlation calculation (e.g., hashed text features)
If true, treat label as categorical. If not set, check number of distinct labels to decide whether a label should be treated as categorical.
sanity checked feature vector
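Two of the checks described by the parameters above, the correlation bounds and the association rule threshold, can be sketched as follows. This is a simplified illustration under stated assumptions, not the SanityChecker implementation: all function names are hypothetical, Pearson correlation stands in for whichever coefficient is configured, and sampling is ignored.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Hashable, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; NaN when either column has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def fails_correlation_check(xs, ys, min_corr: float, max_corr: float) -> bool:
    """Flag a column whose |correlation| with the label is outside [min, max].

    A NaN correlation fails both comparisons, so it is flagged as well.
    """
    c = abs(pearson(xs, ys))
    return not (min_corr <= c <= max_corr)


def rule_stats(choices: Sequence[Hashable], labels: Sequence[Hashable]) -> Dict[Hashable, Tuple[float, float]]:
    """For each choice return (max confidence over labels, support fraction)."""
    n = len(choices)
    choice_counts = Counter(choices)
    pair_counts = Counter(zip(choices, labels))
    stats = {}
    for choice, count in choice_counts.items():
        best = max(pc for (c, _), pc in pair_counts.items() if c == choice)
        stats[choice] = (best / count, count / n)
    return stats


def fails_rule_check(choices, labels, max_rule_confidence: float, min_rule_support: float) -> bool:
    """Flag a categorical if any choice predicts a label too confidently
    while also covering a large enough fraction of the rows."""
    return any(
        conf > max_rule_confidence and support > min_rule_support
        for conf, support in rule_stats(choices, labels).values()
    )
```

In this sketch a rare choice that perfectly predicts the label is not enough to flag the variable; the support threshold has to be exceeded too.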
Apply standard isotonic regression transformer shortcut function.
feature to calibrate against
whether the fitted function is increasing (true, the default) or decreasing (false)
recalibrated feature
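For readers unfamiliar with isotonic regression, the idea can be sketched with the pool-adjacent-violators procedure on equally weighted points. This is a generic textbook sketch, not the transformer's implementation: `isotonic_fit` is a hypothetical name, and the input is assumed to be already sorted by the feature being calibrated.

```python
from typing import List, Sequence


def isotonic_fit(ys: Sequence[float], increasing: bool = True) -> List[float]:
    """Return the monotone fit of ys (points assumed sorted by x, equal weights)."""
    # A decreasing fit is an increasing fit on the negated values.
    sign = 1.0 if increasing else -1.0
    # Each block stores [sum of values, number of points].
    blocks: List[List[float]] = []
    for y in ys:
        blocks.append([sign * y, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted: List[float] = []
    for total, count in blocks:
        # Every point in a block takes the block mean.
        fitted.extend([sign * total / count] * int(count))
    return fitted


if __name__ == "__main__":
    print(isotonic_fit([1, 3, 2, 4]))
    print(isotonic_fit([4, 2, 3, 1], increasing=False))
```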
Apply PercentileBucketizer transformer shortcut function.
Apply PercentileBucketizer transformer shortcut function. Will rescale values into the specified number of bins (default is 100).
number of bins to scale into
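The general idea of quantile-based bucketing can be sketched as below. This is a generic illustration only and may differ from how PercentileBucketizer computes or scales its output: `fit_splits` and `to_bucket` are hypothetical names, and the split points use a simple nearest-rank rule.

```python
from bisect import bisect_right
from typing import List, Sequence


def fit_splits(values: Sequence[float], buckets: int = 100) -> List[float]:
    """Return interior split points at evenly spaced quantiles (nearest rank)."""
    ordered = sorted(values)
    n = len(ordered)
    splits = []
    for k in range(1, buckets):
        idx = min(n - 1, (k * n) // buckets)
        splits.append(ordered[idx])
    # Duplicate split points (from repeated values) collapse into one.
    return sorted(set(splits))


def to_bucket(value: float, splits: Sequence[float]) -> int:
    """Bucket index in [0, len(splits)]: the number of splits <= value."""
    return bisect_right(splits, value)


if __name__ == "__main__":
    data = list(range(1, 11))
    splits = fit_splits(data, buckets=5)
    print(splits)
    print([to_bucket(v, splits) for v in data])
```

With heavily repeated values the number of distinct buckets can end up smaller than the number requested, since identical split points are merged.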
Apply real vectorizer: Converts a sequence of RealNN features into a vector feature.
other features of same type
Z-normalization shortcut function using OpStandardScaler.
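Z-normalization rescales values to zero mean and unit standard deviation. A minimal sketch in plain Python follows; `z_normalize` is a hypothetical helper, and the use of the sample standard deviation (dividing by n - 1) is an assumption about the convention, not a statement about OpStandardScaler.

```python
from math import sqrt
from typing import List, Sequence


def z_normalize(values: Sequence[float]) -> List[float]:
    """Return (v - mean) / std for each value, using the sample std (n - 1)."""
    n = len(values)
    if n < 2:
        # Not enough points to estimate a spread.
        return [0.0] * n
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if std == 0:
        # Constant column: every value equals the mean.
        return [0.0] * n
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    print(z_normalize([1, 2, 3]))
```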
Enrichment functions for real non-nullable (RealNN) features