Absolute value transformer
Plus function truth table (Real as example):
Real.empty + Real.empty = Real.empty
Real.empty + Real(x) = Real(x)
Real(x) + Real.empty = Real(x)
Real(x) + Real(y) = Real(x + y)
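The plus rules can be sketched in plain Scala. This is only an illustration: Real is modelled as Option[Double] (None standing in for Real.empty), and the names PlusSketch and plus are hypothetical, not part of the library.

```scala
object PlusSketch {
  // Assumption: Real is modelled as Option[Double]; None plays the role of Real.empty.
  def plus(a: Option[Double], b: Option[Double]): Option[Double] = (a, b) match {
    case (None, None)       => None        // empty + empty = empty
    case (None, Some(y))    => Some(y)     // empty + x = x
    case (Some(x), None)    => Some(x)     // x + empty = x
    case (Some(x), Some(y)) => Some(x + y) // x + y
  }
}
```

An empty operand behaves like a neutral element here, so the result is empty only when both inputs are empty.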
No-op (identity) alias feature transformer allowing renaming features without applying a transformation on values.
feature type
Class for vectorizing BinaryMap features. Fills missing keys with args.defaultValue, which does not depend on the key, so getFillByKey returns an empty sequence.
input feature type to vectorize into an OPVector
Vectorizes Binary inputs where each input is transformed into 2 vector elements where the first element is [1 -> true] or [0 -> false] and the second element is [1 -> filled value] or [0 -> original value]. The vector representation for each input is concatenated into a final vector representation.
Example:
Data: Seq[(Binary, Binary)] = ((Some(false), None)) => f1, f2
new BinaryVectorizer().setInput(f1, f2).setFillValue(10)
will produce Array(0.0, 0.0, 10.0, 1.0)
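The encoding in this example can be reproduced with a small standalone sketch. It does not use the real BinaryVectorizer. The object and method names are hypothetical, Binary is modelled as Option[Boolean], and the fill value is taken as a Double purely to mirror the numbers in the example.

```scala
object BinaryVectorizerSketch {
  // Each input contributes two elements: (value, wasFilled).
  // value is 1.0 / 0.0 for true / false, or fillValue when the input is missing;
  // wasFilled is 1.0 only when the fill value was used.
  def vectorize(inputs: Seq[Option[Boolean]], fillValue: Double): Array[Double] =
    inputs.flatMap {
      case Some(b) => Seq(if (b) 1.0 else 0.0, 0.0)
      case None    => Seq(fillValue, 1.0)
    }.toArray
}
```

Calling BinaryVectorizerSketch.vectorize(Seq(Some(false), None), 10.0) yields the four elements 0.0, 0.0, 10.0, 1.0, matching the example.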
Ceil transformer
input feature type
Model Combination Strategies
Converts a sequence of DateLists features into a vector feature. Can choose how to pivot the features
Following: http://webspace.ship.edu/pgmarr/geo441/lectures/lec%2016%20-%20directional%20statistics.pdf
Transforms a Date or DateTime field into a cartesian coordinate representation of an extracted time period on the unit circle.
Parameter timePeriod: the time period to extract from the timestamp, one of: DayOfMonth, DayOfWeek, DayOfYear, HourOfDay, MonthOfYear, WeekOfMonth, WeekOfYear.
We extract the timePeriod from the timestamp and map this onto the unit circle containing the number of time periods equally spaced. For example, when timePeriod = HourOfDay, the timestamp 01/01/2018 6:37 maps to the point on the circle with angle radians = 2*math.Pi*6/24. We return the cartesian coordinates of this point: (math.cos(radians), math.sin(radians)).
The first time period always has angle 0.
Note: We use the ISO week date format (https://en.wikipedia.org/wiki/ISO_week_date#First_week): Monday is the first day of the week and the first week of the year is the week with the first Monday after Jan 1.
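The mapping described above can be sketched as follows. The object name, the function name and the two parameters (a zero-based period index and the number of periods on the circle) are hypothetical and chosen for illustration. Extracting the period from a timestamp is left out.

```scala
object UnitCircleSketch {
  // period: zero-based index of the extracted time period (e.g. hour 6 of the day)
  // periodsPerCycle: number of equally spaced periods on the circle (e.g. 24 for HourOfDay)
  def toUnitCircle(period: Int, periodsPerCycle: Int): (Double, Double) = {
    val radians = 2 * math.Pi * period / periodsPerCycle
    (math.cos(radians), math.sin(radians))
  }
}
```

For the HourOfDay example, toUnitCircle(6, 24) gives an angle of a quarter turn, so the point is (approximately) (0, 1). Period 0 gives angle 0 and therefore the point (1, 0).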
Model for DateMapToUnitCircleVectorizer
DateMap type
Class for vectorizing DateMap features. Fills missing keys with args.defaultValue, which does not depend on the key, so getFillByKey returns an empty sequence.
input feature type to vectorize into an OPVector
Following: http://webspace.ship.edu/pgmarr/geo441/lectures/lec%2016%20-%20directional%20statistics.pdf
Transforms a Date or DateTime field into a cartesian coordinate representation of an extracted time period on the unit circle.
Parameter timePeriod: the time period to extract from the timestamp, one of: DayOfMonth, DayOfWeek, DayOfYear, HourOfDay, MonthOfYear, WeekOfMonth, WeekOfYear.
We extract the timePeriod from the timestamp and map this onto the unit circle containing the number of time periods equally spaced. For example, when timePeriod = HourOfDay, the timestamp 01/01/2018 6:37 maps to the point on the circle with angle radians = 2*math.Pi*6/24. We return the cartesian coordinates of this point: (math.cos(radians), math.sin(radians)).
The first time period always has angle 0.
Note: We use the ISO week date format (https://en.wikipedia.org/wiki/ISO_week_date#First_week): Monday is the first day of the week and the first week of the year is the week with the first Monday after Jan 1.
Smart bucketizer for numeric values based on a Decision Tree classifier.
numeric feature type value
numeric feature type
Smart bucketizer for numeric map values based on a Decision Tree classifier.
numeric feature type value
numeric map feature type
A transformer that takes as inputs a feature to descale and a (potentially different) scaled feature which contains the metadata for reconstructing the inverse scaling function. Transforms the 1st input feature by applying the inverse of the scaling function found in the metadata of the 2nd input feature.
- 1st input feature: the feature to descale
- 2nd input feature: the scaled feature containing metadata for constructing the scaling used to make this column
feature type for first input
feature type for the second input
output feature type
Divide function truth table (Real as example):
Real.empty / Real.empty = Real.empty
Real.empty / Real(x) = Real.empty
Real(x) / Real.empty = Real.empty
Real(x) / Real(y) = Real(x / y) filter ("is not NaN or Infinity")
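A minimal sketch of the divide rules, assuming the last row denotes ordinary division of x by y followed by the NaN/Infinity filter. Real is modelled as Option[Double]. DivideSketch and divide are hypothetical names.

```scala
object DivideSketch {
  // Assumption: Real is modelled as Option[Double]; None plays the role of Real.empty.
  def divide(a: Option[Double], b: Option[Double]): Option[Double] = (a, b) match {
    case (Some(x), Some(y)) =>
      // Keep the quotient only if it is a finite number.
      Some(x / y).filter(q => !q.isNaN && !q.isInfinity)
    case _ => None // any empty operand makes the result empty
  }
}
```

Unlike plus, an empty operand always produces an empty result, and so does division by zero, because the quotient is then NaN or infinite.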
Allows columns to be dropped from a feature vector based on properties of the metadata about what is contained in each column (will work only on vectors created with OpVectorMetadata)
Case class for Scaling families that take no parameters
Exp transformer: returns Euler's number e raised to the power of feature value
input feature type
Fill missing values with mean for any numeric feature
Filters maps by keys provided in an allowlist or blocklist
input feature type
Floor transformer
input feature type
Defines the different kinds of gender detection strategies that are possible.
We need to override toString in order to provide serialization during the Spark map and reduce steps, and then the fromString function provides deserialization back to the GenderDetectStrategy class for the companion transformer.
Converts a sequence of Geolocation features into a vector feature. Can choose to fill null values with the mean or a constant
Hashing Algorithms
Hash space strategy
Hashing Parameters
if true, include indices when hashing a feature that has them (OPLists or OPVectors)
if true, prepends the input feature name to each token of that feature
number of features (hashes) to generate
number of inputs
max number of features (hashes)
if true, term frequency vector will be binary such that non-zero term counts will be set to 1.0
hash algorithm to use
strategy to determine whether to use shared hash space for all included features
Unary estimator for identifying whether a single Text column is a name or not. If the column does appear to be a name, a custom map will be returned that contains the guessed gender for each entry (gender detection only supported for English at the moment). If the column does not appear to be a name, then the output will be an empty map.
the FeatureType (subtype of Text) to operate over
Class for vectorizing IntegralMap features. Fills missing keys with the mode for that key.
input feature type to vectorize into an OPVector
Converts a sequence of Integral features into a vector feature. Can choose to fill null values with the mean or a constant
Transformer to determine if a phone number is valid when no country code is available. The default locale will be used for validation. All phone numbers with fewer than 2 characters will be categorized as invalid. All phone numbers that start with "+" will be evaluated with international formatting.
Returns a binary feature: true if the phone number is valid, false if invalid, and none if the phone number is none.
Transformer to determine if a map of phone numbers is valid when no country code is available. The default locale will be used for validation. All phone numbers with fewer than 2 characters will be categorized as invalid. All phone numbers that start with "+" will be evaluated with international formatting.
Returns a binary map feature: true if the phone number is valid, false if invalid, and none if the phone number is none.
Determine whether a phone number is valid given the country's regional code. By default the regional code will be checked against those provided in Google's PhoneNumber library. If the input regional code is not found, the default locale will be used for validation.
If the user provided a country name to code mapping, the phone number can only be validated against the input mapping. This transformer will first match on regional code; failing that, it will select the country with the closest Q-Distance.
All phone numbers with fewer than 2 characters will be categorized as invalid.
All phone numbers that start with "+" will be evaluated with international formatting.
Returns a binary feature: true if the phone number is valid, false if invalid, and none if the phone number is none.
Calculates the Jaccard Similarity between two sets. If both inputs are empty, Jaccard Similarity is defined as 1.0
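As an illustration, Jaccard similarity with the empty-set convention stated above can be written over plain Scala sets. JaccardSketch and jaccard are hypothetical names, and the library's own feature types are not used here.

```scala
object JaccardSketch {
  // |A intersect B| / |A union B|, with the convention that two empty sets have similarity 1.0.
  def jaccard[T](a: Set[T], b: Set[T]): Double =
    if (a.isEmpty && b.isEmpty) 1.0
    else (a intersect b).size.toDouble / (a union b).size
}
```

The special case avoids a division by zero: without it, two empty inputs would produce 0 / 0.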
Transformer that detects the language of the text
A case class representing a linear scaling function
case class containing the slope and intercept of the scaling function
Parameters needed to uniquely define a linear scaling function
the slope of the linear scaler
the x axis intercept of the linear scaler
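The text does not state the formula, so the following is only a sketch under an assumption: that the scaler has the conventional form scaled = slope * x + intercept. The class name LinearScalerSketch and its methods are hypothetical. The real parameterization may differ.

```scala
// Assumption: scaling has the form slope * x + intercept; this is not confirmed by the source.
final case class LinearScalerSketch(slope: Double, intercept: Double) {
  require(slope != 0.0, "slope must be non-zero for the scaling to be invertible")

  // Forward scaling function.
  def scale(x: Double): Double = slope * x + intercept

  // Inverse scaling function, recovering x from a scaled value.
  def descale(y: Double): Double = (y - intercept) / slope
}
```

Two numbers are enough to pin down one member of the linear family. With both of them stored, the inverse can be rebuilt later from the parameters alone.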
A case class representing a logarithmic scaling function
Log base N transformer
input feature type
Detects MIME type for Base64 encoded binary data.
Detects MIME type for Base64Map encoded binary data.
Joins probability score with label from string indexer stage
Map(label -> probability)
Converts a sequence of KeyMultiPickList features into a vector keeping the top K most common occurrences of each key in the maps for that feature (i.e. the final vector has length K * number of keys * number of features). Each key found will also generate an "other" column which will capture values that do not make the cut or were not seen in training. Note that any keys not seen in training will be ignored.
Multiply function truth table (Real as example):
Real.empty * Real.empty = Real.empty
Real.empty * Real(x) = Real.empty
Real(x) * Real.empty = Real.empty
Real(x) * Real(y) = Real(x * y) filter ("is not NaN or Infinity")
Named entity (NameEntityType) text recognizer.
Note: when providing your own analyzer/splitter/tagger, make sure they can work together; for instance, OpenNLP models require their own analyzers to be provided when tokenizing. The returned feature type is a MultiPickListMap which contains sets of entities for all the tokens.
text feature type
Numeric Bucketizer
numeric feature type
Generic hashing vectorizer to convert features of type OPCollection into Vectors
In more detail: it tries to hash entries in the collection using the specified hashing algorithm to build a single vector. If the desired number of features (= hash space size) for all features combined is larger than Integer.Max (the maximal index for a vector), then all the features use the same hash space. There are also options for the user to hash indices with collections where that makes sense (OPLists and OPVectors), and to force a shared hash space, even if the number of features is not high enough to require it.
Abstract base class for a set of transformer wrappers that allow unary transformers between non-collection types to be used on collection types. For example, we can use a UnaryLambdaTransformer[Email, Integer] on a map's values, creating a UnaryLambdaTransformer[EmailMap, IntegralMap]. This base class will be inherited by concrete classes for OPMaps, OPList, and OPSets (in order to enforce not allowing these collection types to be transformed into each other, eg. no MultiPickList to RealMap transformations).
The OP type hierarchy does not allow direct type checking of such transformer wrappers (eg. Real#Value is Option[Double] and RealMap#Value is Map[String, Double], so there's no way to enforce that a RealMap can only hold what is contained in a Real) since the types themselves are not created with typetags for performance reasons. However, we can still enforce that operations like building a UnaryLambdaTransformer[RealMap, StringMap] from a UnaryLambdaTransformer[Real, Integer] is not possible by using the Spark types in validateTypes.
input feature type for supplied non-collection transformer
output feature type for supplied non-collection transformer
input feature type for desired collection transformer
output feature type for desired collection transformer
Base class for vectorizing OPMap[A] features. Individual vectorizers for different feature types need to implement the getFillByKey function (which calculates any fill values that differ by key - means, modes, etc.) and the makeModel function (which specifies which type of model will be returned).
value type for underlying map
input feature type to vectorize into an OPVector
OPMap vectorizer model arguments
all keys per feature
fill values for features
should clean map keys
should clean map values
default value to replace with
add column to track null values for each map key
Wrapper around spark ml CountVectorizer for use with OP pipelines
Wrapper for org.apache.spark.ml.feature.HashingTF
Maps a sequence of terms to their term frequencies using the hashing trick. Currently we use Austin Appleby's MurmurHash 3 algorithm (MurmurHash3_x86_32) to calculate the hash code value for the term object. Since a simple modulo is used to transform the hash function to a column index, it is advisable to use a power of two as the numFeatures parameter; otherwise the features will not be mapped evenly to the columns.
See HashingTF for more info
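The hashing trick described above can be sketched in a few lines. This is not Spark's implementation: Scala's scala.util.hashing.MurmurHash3.stringHash is used as a stand-in for Spark's MurmurHash3_x86_32, so the column indices will generally differ from Spark's. The names HashingTfSketch and termFrequencies are hypothetical.

```scala
import scala.util.hashing.MurmurHash3

object HashingTfSketch {
  // Hash codes may be negative, so map the remainder into [0, mod).
  private def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Returns a sparse term-frequency vector as column index -> count.
  // Stand-in hash: Scala's MurmurHash3.stringHash, not Spark's own hash function.
  def termFrequencies(terms: Seq[String], numFeatures: Int): Map[Int, Double] =
    terms
      .map(term => nonNegativeMod(MurmurHash3.stringHash(term), numFeatures))
      .groupBy(identity)
      .map { case (index, hits) => index -> hits.size.toDouble }
}
```

Distinct terms may collide on the same column, in which case their counts are added together. A larger numFeatures makes such collisions less likely.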
Wrapper for org.apache.spark.ml.feature.IndexToString
NOTE THAT THIS CLASS EITHER FILTERS OUT OR THROWS AN ERROR IF PREVIOUSLY UNSEEN VALUES APPEAR
A transformer that maps a feature of indices back to a new feature of corresponding text values. The index-string mapping is either from the ML attributes of the input feature, or from user-supplied labels (which take precedence over ML attributes).
See OpStringIndexer for converting text into indices
A transformer that maps a feature of indices back to a new feature of corresponding text values. The index-string mapping is either from the ML attributes of the input feature, or from user-supplied labels (which take precedence over ML attributes).
See OpStringIndexerNoFilter for converting text into indices
Wrapper around spark ml LDA (Latent Dirichlet Allocation) for use with OP pipelines
Wrapper for org.apache.spark.ml.feature.NGram
A feature transformer that converts the input array of strings into an array of n-grams. Null values in the input array are ignored. It returns an array of n-grams where each n-gram is represented by a space-separated string of words.
When the input is empty, an empty array is returned. When the input array length is less than n (number of elements per n-gram), no n-grams are returned.
See NGram for more info
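The n-gram behaviour described above (space-separated n-grams, nulls ignored, nothing returned for inputs shorter than n) can be illustrated with plain Scala collections. NGramSketch and ngrams are hypothetical names, and this is not the Spark implementation.

```scala
object NGramSketch {
  def ngrams(tokens: Seq[String], n: Int): Seq[String] = {
    val words = tokens.filter(_ != null) // null values in the input are ignored
    // sliding(n) would return one short window for inputs shorter than n, so guard against it.
    if (words.length < n) Seq.empty
    else words.sliding(n).map(_.mkString(" ")).toList
  }
}
```

For example, ngrams(Seq("a", "b", "c"), 2) gives the two bigrams "a b" and "b c".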
Converts a sequence of features into a vector keeping the top K most common occurrences of each feature (i.e. the final vector has length K * number of inputs), plus an additional column for "other" values, which will capture values that do not make the cut or values not seen in training, and an additional column for empty values unless null tracking is disabled.
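A rough sketch of a top K pivot for a single input, with the "other" and empty-value columns placed last. All names here are hypothetical. The column order and the alphabetical tie-breaking are assumptions made for the illustration, not the library's documented behaviour.

```scala
object TopKPivotSketch {
  // "Training": find the k most frequent non-empty values.
  // Assumption: ties are broken alphabetically.
  def topK(training: Seq[Option[String]], k: Int): Seq[String] =
    training.flatten
      .groupBy(identity)
      .toSeq
      .map { case (value, hits) => (value, hits.size) }
      .sortBy { case (value, count) => (-count, value) }
      .take(k)
      .map(_._1)

  // Encoding: one column per top value, then an "other" column, then a null column.
  def encode(value: Option[String], top: Seq[String]): Array[Double] = {
    val vec = Array.fill(top.size + 2)(0.0)
    value match {
      case None => vec(top.size + 1) = 1.0 // empty value
      case Some(v) =>
        val i = top.indexOf(v)
        vec(if (i >= 0) i else top.size) = 1.0 // top value, or "other" if unseen or cut
    }
    vec
  }
}
```

Values outside the top K and values never seen in training both land in the "other" column, so the vector length is fixed once training is done.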
Wraps Spark's native StandardScaler, which operates on vectors, to enable it to operate directly on scalars.
Converts a sequence of OpSet features into a vector keeping the top K most common occurrences of each feature (i.e. the final vector has length K * number of inputs), plus an additional column for "other" values, which will capture values that do not make the cut or values not seen in training, and an additional column for empty values unless null tracking is disabled.
Wrapper for org.apache.spark.ml.feature.StopWordsRemover
A feature transformer that filters out stop words from input.
Null values from the input array are preserved unless null is explicitly added to stopWords.
See StopWordsRemover for more info
Wrapper for org.apache.spark.ml.feature.StringIndexer
NOTE THAT THIS CLASS EITHER FILTERS OUT OR THROWS AN ERROR IF PREVIOUSLY UNSEEN VALUES APPEAR
A label indexer that maps a text column of labels to an ML feature of label indices. The indices are in [0, numLabels), ordered by label frequencies. So the most frequent label gets index 0.
See OpIndexToString for the inverse transformation
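The frequency-ordered indexing described above can be sketched without Spark. StringIndexerSketch and fit are hypothetical names, and breaking ties alphabetically is an assumption of this sketch, not a documented property of the wrapped class.

```scala
object StringIndexerSketch {
  // Builds label -> index, with indices in [0, numLabels) ordered by descending frequency.
  // Assumption: ties between equally frequent labels are broken alphabetically.
  def fit(labels: Seq[String]): Map[String, Int] =
    labels
      .groupBy(identity)
      .map { case (label, hits) => label -> hits.size }
      .toSeq
      .sortBy { case (label, count) => (-count, label) }
      .map(_._1)
      .zipWithIndex
      .toMap
}
```

With labels a, b, b, c, b, a the most frequent label b gets index 0, a gets 1 and c gets 2. A label missing from the fitted map corresponds to the previously unseen values that the note above warns about.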
A label indexer that maps a text column of labels to an ML feature of label indices. The indices are in [0, numLabels), ordered by label frequencies. So the most frequent label gets index 0.
See OpIndexToStringNoFilter for the inverse transformation
Converts a sequence of Text features into a vector keeping the top K most common occurrences of each feature (i.e. the final vector has length K * number of inputs), plus an additional column for "other" values, which will capture values that do not make the cut or values not seen in training, and an additional column for empty values unless null tracking is disabled.
Wrapper around spark ml word2vec for use with OP pipelines
Transformer to determine if a phone number is valid when no country code is available. The default locale will be used for validation. All phone numbers with fewer than 2 characters will be categorized as invalid. All phone numbers that start with "+" will be evaluated with international formatting.
Returns the stripped number if the number is valid, and None otherwise.
Determine whether a phone number is valid given the country's regional code. By default the regional code will be checked against those provided in Google's PhoneNumber library. If the input regional code is not found, the default locale will be used for validation.
If the user provided a country name to code mapping, the phone number can only be validated against the input mapping. This transformer will first match on regional code; failing that, it will select the country with the closest Q-Distance.
All phone numbers with fewer than 2 characters will be categorized as invalid.
All phone numbers that start with "+" will be evaluated with international formatting.
Returns the stripped number if the number is valid, and None otherwise.
Wraps around org.apache.spark.ml.feature.QuantileDiscretizer
Power transformer
input feature type
Applies to the input column the inverse of the scaling function defined in the Prediction feature metadata.
- 1st input feature is the Prediction feature to descale
- 2nd input feature is scaled Prediction feature containing the metadata for constructing the scaling used to make this column
input feature type
output feature type
Class for vectorizing RealMap features. Fills missing keys with the mean for that key.
input feature type to vectorize into an OPVector
Converts a sequence of real non nullable features into a vector feature
Converts a sequence of Nullable Numeric features into a vector feature. Can choose to fill null values with the mean or a constant
Round digits transformer
input feature type
Round transformer
input feature type
Scalar addition transformer
input feature type
value type
Scalar divide transformer
input feature type
value type
Scalar multiply transformer
input feature type
value type
Scalar subtract transformer
input feature type
value type
A trait for defining a new family of scaling functions
scalingType: a ScalingType Enum for the scaling name
args: A case class containing the args needed to define scaling and inverse scaling functions
scale: The scaling function
descale: The inverse scaling function
To add a new family of scaling functions: Add an entry to the scalingType enum, define a Case class extending Scaler, and add a case statement to both the Scaler and ScalerMetaData case classes
Metadata containing the info needed to reconstruct a Scaler instance
the family of functions containing the scaler
the args uniquely defining a function in the scaling family
Scaling transformer that applies a scaling function to a numerical feature
input feature type
output feature type
A trait to be extended by a case class containing the args needed to define a family of scaling & descaling functions
Compute char ngram distance for MultiPickList features.
Info about each feature within a text map
name of a feature
method to use for text vectorization (either pivot, hashing, or ignoring)
most common values of a feature (only for categoricals)
Convert a sequence of text map features into a vector by detecting categoricals that are disguised as text. A categorical will be represented as a vector consisting of occurrences of top K most common values of that feature plus occurrences of non top k values and a null indicator (if enabled). Non-categoricals will be converted into a vector using the hashing trick. In addition, a null indicator is created for each non-categorical (if enabled).
Detection and removal of names in the input columns can be enabled with the sensitiveFeatureMode param.
Arguments for SmartTextMapVectorizerModel
info about each feature within each text map
should clean feature keys
should clean feature values
should track nulls
hashing function params
Convert a sequence of text features into a vector by detecting categoricals that are disguised as text. A categorical will be represented as a vector consisting of occurrences of top K most common values of that feature plus occurrences of non top k values and a null indicator (if enabled). Non-categoricals will be converted into a vector using the hashing trick. In addition, a null indicator is created for each non-categorical (if enabled).
Detection and removal of names in the input columns can be enabled with the sensitiveFeatureMode param.
Arguments for SmartTextVectorizerModel
method to use for text vectorization (either pivot, hashing, or ignoring)
top values for each feature
should clean text value
should track nulls
hashing function params
Square root transformer
input feature type
Checks if the first input is a substring of the second input
first input feature type
second input feature type
Minus function truth table (Real as example):
Real.empty - Real.empty = Real.empty
Real.empty - Real(x) = Real(-x)
Real(x) - Real.empty = Real(x)
Real(x) - Real(y) = Real(x - y)
Sequence transformer for generating a sequence of text lengths from a sequence of TextList values (eg. tokenized raw text)
Method for computing text lengths
Creates null indicator columns for a sequence of input TextList features, originally for use as a separate stage in null tracking for hashed text features (easier to do outside the hashing vectorizer since we can make a null indicator column for each input feature without having to add lots of complex logic in the hashing vectorizer to deal with metadata for shared vs. separate hash spaces).
Estimator for computing text lengths on fields stored in text maps. Note that because there are no maps from String to TextList, we need to do the tokenization here (unlike the TextLenTransformer).
Creates null indicator columns for a sequence of input TextMap features, originally for use as a separate stage in null tracking for hashed text features (easier to do outside the hashing vectorizer since we can make a null indicator column for each input feature without having to add lots of complex logic in the hashing vectorizer to deal with metadata for shared vs. separate hash spaces).
Converts a sequence of KeyString features into a vector keeping the top K most common occurrences of each key in the maps for that feature (i.e. the final vector has length K * number of keys * number of features). Each key found will also generate an "other" column which will capture values that do not make the cut or were not seen in training. Note that any keys not seen in training will be ignored.
Compute char ngram distance for Text features.
Transformer that takes anything of type Text or lower and returns a TextList of tokens extracted from that text
Special reader/writer class for TextTokenizer stage
Methods of vectorizing text (eg. to be chosen by statistics computed in SmartTextVectorizer)
TimePeriodListTransformer extracts one of a set of time periods from a date/datetime list
input feature type
TimePeriodMapTransformer extracts one of a set of time periods from a date/datetime map
input feature type
TimePeriodTransformer extracts one of a set of time periods from a date/datetime
input feature type
Transformer that converts an input feature of type I into a doolean feature (a Boolean encoded as a double) using a user-specified function that maps object type I to a Boolean
Object type to be mapped to a double (doolean).
Joins probability score with label from string indexer stage, sorts by highest score, returns up to topN, and filters out the class UnseenLabel.
Sorts the label probability map and returns the topN.
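A minimal sketch of the top-N selection over a label-to-probability map. TopNSketch and topN are hypothetical names. The input is assumed to be a plain Map[String, Double] in place of the library's feature types.

```scala
object TopNSketch {
  // Sorts labels by descending probability and keeps at most n of them.
  def topN(probabilities: Map[String, Double], n: Int): Seq[(String, Double)] =
    probabilities.toSeq.sortBy { case (_, p) => -p }.take(n)
}
```

For instance, with probabilities a = 0.1, b = 0.7 and c = 0.2, topN(..., 2) keeps b and then c.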
Param that decides whether or not the values that are considered invalid are tracked
Param that decides whether or not the values that were missing are tracked
Param that decides whether or not lengths of text are tracked during vectorization
Checks if an email is valid
Takes in a sequence of vectors and combines them into a single vector
Enumeration object that contains the option to pivot the DateList feature
1) SinceFirst - replace the feature by the number of days between the first event and reference date
2) SinceLast - replace the feature by the number of days between the last event and reference date
3) ModeDay - replace the feature by a pivot that indicates the mode of the day of the week. Example: if the mode is Monday then it will return (1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
4) ModeMonth - replace the feature by a pivot that indicates the mode of the month
5) ModeHour - replace the feature by a pivot that indicates the mode of the hour of the day.
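The ModeDay pivot can be illustrated with java.time. This sketch assumes dates are given as LocalDate values and that Monday occupies the first position, as in the example for ModeDay. ModeDaySketch and modeDay are hypothetical names. How ties between equally frequent days are resolved is left unspecified here.

```scala
import java.time.LocalDate

object ModeDaySketch {
  // One-hot vector of length 7 marking the most frequent day of the week (Monday first).
  // An empty input yields an all-zero vector in this sketch.
  def modeDay(dates: Seq[LocalDate]): Array[Double] = {
    val vec = Array.fill(7)(0.0)
    if (dates.nonEmpty) {
      val mode = dates.groupBy(_.getDayOfWeek).maxBy(_._2.size)._1
      vec(mode.getValue - 1) = 1.0 // DayOfWeek.getValue: Monday = 1 ... Sunday = 7
    }
    vec
  }
}
```

Two Mondays and one Tuesday produce (1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0), the same vector as in the ModeDay example.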
Scaler instance factory
Tika helper
Absolute value transformer
input feature type