OPSet Feature
Converts a sequence of OPSet features into a vector, keeping the top K most common values of each OPSet feature (i.e. the final vector has length K * number of OPSet inputs), plus an additional column for "other" values, which captures values that do not make the cut or were not seen in training.
other features to include in the pivot
keep topK values
min occurrences to keep a value
if true, ignores capitalization and punctuation when grouping categories
keep a count of nulls
max fraction of distinct values a categorical feature can have (between 0.0 and 1.0)
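The top-K pivot described above can be illustrated with a minimal, library-independent sketch. It is not the library's implementation: the function names `fit_top_k` and `pivot_row` are hypothetical, and it assumes the "other" and null columns are 0/1 indicators, which the description does not specify.

```python
from collections import Counter

def fit_top_k(rows, top_k, min_support):
    """Pick up to top_k values that occur at least min_support times."""
    counts = Counter(v for row in rows if row for v in row)
    # Sort by descending count, then alphabetically for a deterministic order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [v for v, c in ranked if c >= min_support][:top_k]

def pivot_row(row, kept, track_nulls=True):
    """Encode one set against the kept values, plus 'other' and optional null columns."""
    values = set(row) if row else set()
    vec = [1.0 if v in values else 0.0 for v in kept]
    # Assumption: "other" is an indicator that fires for any value outside the kept list,
    # which covers both low-support values and values unseen during fitting.
    vec.append(1.0 if values - set(kept) else 0.0)
    if track_nulls:
        # Assumption: an empty or missing set is treated as null.
        vec.append(1.0 if not values else 0.0)
    return vec

train = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}, set()]
kept = fit_top_k(train, top_k=2, min_support=2)
vectors = [pivot_row(r, kept) for r in train]
unseen = pivot_row({"z"}, kept)
```

Here `"c"` occurs only once, so it falls below the minimum support and lands in the "other" column, as does the unseen value `"z"`.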
Converts a sequence of OPSet features into a vector, keeping the top K most common values of each OPSet feature (i.e. the final vector has length K * number of OPSet inputs), plus an additional column for "other" values, which captures values that do not make the cut or were not seen in training.
keep topK values
min occurrences to keep a value
if true, ignores capitalization and punctuation when grouping categories
keep a count of nulls
other features to include in the pivot
max fraction of distinct values a categorical feature can have (between 0.0 and 1.0)
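The text-cleaning and cardinality parameters can be sketched in the same spirit. This is an illustration under stated assumptions, not the library's behavior: `clean_text` and `within_cardinality` are hypothetical names, the cleaning rule is taken to be "lowercase and strip ASCII punctuation", and the cardinality ratio is taken to be distinct values divided by number of rows.

```python
import string

def clean_text(value):
    """Lowercase and drop ASCII punctuation so spelling variants group together."""
    stripped = value.translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace left behind after removing punctuation.
    return " ".join(stripped.lower().split())

def within_cardinality(rows, max_pct_cardinality):
    """Assumed check: distinct values / number of rows must not exceed the limit."""
    distinct = {v for row in rows for v in row}
    return len(rows) > 0 and len(distinct) / len(rows) <= max_pct_cardinality

rows = [{"Red!"}, {"red"}, {"RED."}, {"Blue"}]
cleaned = [{clean_text(v) for v in row} for row in rows]

raw_ok = within_cardinality(rows, 0.5)
cleaned_ok = within_cardinality(cleaned, 0.5)
```

Without cleaning, the four rows hold four distinct values and exceed a limit of 0.5; after cleaning, the three spellings of "red" collapse into one category and the feature passes.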
Enrichment functions for OPSet Feature