Summary of label distribution for continuous label
min value
max value
mean value
variance of values
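The four fields above can be computed in a single pass over the label values. The following is a minimal sketch, not the library's implementation: the name ContinuousSummary and its field names are hypothetical, and population variance (dividing by n) is assumed because the text does not say which form is used.

```scala
// Hypothetical stand-in for the continuous label summary described above.
// Field names are illustrative and are not taken from the library.
final case class ContinuousSummary(min: Double, max: Double, mean: Double, variance: Double)

object ContinuousSummary {
  // Returns None for an empty input, since min, max and mean are undefined there.
  // Assumption: population variance (divide by n), not the sample (n - 1) form.
  def from(values: Seq[Double]): Option[ContinuousSummary] =
    if (values.isEmpty) None
    else {
      val n = values.size.toDouble
      val mean = values.sum / n
      val variance = values.map(v => math.pow(v - mean, 2)).sum / n
      Some(ContinuousSummary(values.min, values.max, mean, variance))
    }
}
```

For example, ContinuousSummary.from(Seq(1.0, 2.0, 3.0, 4.0)) yields min 1.0, max 4.0, mean 2.5 and variance 1.25.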
Summary of label distribution for discrete label
sequence of all unique values observed in data
probabilities of each unique value observed in data (order is matched to domain order)
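The key property of the two fields above is that the probabilities line up index by index with the unique values. A minimal sketch of that pairing follows; the name DiscreteSummary, its field names, and the choice to sort the domain are assumptions made for illustration only.

```scala
// Hypothetical stand-in for the discrete label summary described above.
final case class DiscreteSummary(domain: Seq[String], prob: Seq[Double])

object DiscreteSummary {
  // Builds the unique values and their empirical probabilities.
  // prob(i) is the relative frequency of domain(i).
  // Assumption: the domain is sorted so that the output is deterministic.
  def from(labels: Seq[String]): DiscreteSummary = {
    val n = labels.size.toDouble
    val counts = labels.groupBy(identity).map { case (value, hits) => value -> hits.size }
    val domain = counts.keys.toSeq.sorted
    DiscreteSummary(domain, domain.map(d => counts(d) / n))
  }
}
```

With the input Seq("b", "a", "b", "b") this gives the domain ("a", "b") and the probabilities (0.25, 0.75).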
History of all stages and origin features used to create a given feature
alphabetically ordered names of the raw features this feature was created from
sequence of the stageNames applied
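To illustrate how such a history might be combined when one feature is built from several parents, here is a small sketch. The type HistorySketch, its merge rule, and the stage names used in the usage note are hypothetical; the text above only states that origin names are alphabetically ordered and that stages form a sequence.

```scala
// Hypothetical model of a feature history: origin feature names plus applied stage names.
final case class HistorySketch(originFeatures: Seq[String], stages: Seq[String]) {
  // Assumed merge rule: union the origin names and keep them alphabetical;
  // append the other history's stages, dropping repeats but preserving order.
  def merge(other: HistorySketch): HistorySketch = HistorySketch(
    originFeatures = (originFeatures ++ other.originFeatures).distinct.sorted,
    stages = (stages ++ other.stages).distinct
  )
}
```

Merging HistorySketch(Seq("name"), Seq("tokenize")) with HistorySketch(Seq("age"), Seq("fillMissing")) produces origins ("age", "name") and stages ("tokenize", "fillMissing").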
Summary of feature insights for all features derived from a given input (raw) feature
name of the raw feature these insights describe
type of the raw feature these insights describe
sequence containing insights for each feature derived from the raw feature
sequence containing metrics computed in RawFeatureFilter
distribution information for the raw feature (if calculated in RawFeatureFilter)
exclusion reasons for the raw feature (if calculated in RawFeatureFilter)
derived information about sensitive field checks (if performed)
Summary of insights for a derived feature
name of derived feature
the stageNames of all stages applied to make the feature from the raw input feature
grouping of this feature if the feature is a pivot
value of the feature if the feature is a numeric encoding of a non-numeric feature or bucket
whether this derived feature was excluded from the model by the sanity checker
the correlation of this feature with the label
the Cramér's V of this feature with the label (when both label and feature are categorical)
the mutual information for this feature (and all features in its grouping) with the label (categorical features only)
the mutual information of this feature with each value of the label (categorical features only)
the counts of the occurrence of this feature with each of the label values (categorical features only)
the contribution of this feature to the model (e.g. feature importance for random forest, weight for logistic regression)
the min value of this feature
the max value of this feature
the mean value of this feature
the variance of this feature
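The categorical statistics listed for a derived feature (Cramér's V, mutual information, and the per-label counts) can all be derived from one contingency table of counts. The sketch below uses the textbook definitions; the object name ContingencyStats, the table layout, and the use of log base 2 are assumptions, and the library may compute these values differently.

```scala
object ContingencyStats {
  // Rows are feature values (for example, the pivot columns in one grouping);
  // columns are label values. Each cell holds a co-occurrence count.
  // The table is assumed to be rectangular.
  type Table = Seq[Seq[Double]]

  private def log2(x: Double): Double = math.log(x) / math.log(2.0)

  // Cramér's V = sqrt(chi2 / (n * (k - 1))), where k is the smaller of the
  // number of non-empty rows and non-empty columns.
  def cramersV(table: Table): Double = {
    val rowSums = table.map(_.sum)
    val colSums = table.transpose.map(_.sum)
    val total = rowSums.sum
    val chi2 = (for {
      (row, i) <- table.zipWithIndex
      (obs, j) <- row.zipWithIndex
      expected = rowSums(i) * colSums(j) / total
      if expected > 0
    } yield math.pow(obs - expected, 2) / expected).sum
    val k = math.min(rowSums.count(_ > 0), colSums.count(_ > 0))
    if (k < 2) 0.0 else math.sqrt(chi2 / (total * (k - 1)))
  }

  // Mutual information in bits: sum over cells of p_ij * log2(p_ij / (p_i * p_j)).
  // Empty cells contribute nothing and are skipped.
  def mutualInfo(table: Table): Double = {
    val rowSums = table.map(_.sum)
    val colSums = table.transpose.map(_.sum)
    val total = rowSums.sum
    (for {
      (row, i) <- table.zipWithIndex
      (obs, j) <- row.zipWithIndex
      if obs > 0
    } yield (obs / total) * log2(obs * total / (rowSums(i) * colSums(j)))).sum
  }
}
```

A perfectly aligned table such as Seq(Seq(10.0, 0.0), Seq(0.0, 10.0)) gives a Cramér's V of 1 and a mutual information of 1 bit, while a uniform table such as Seq(Seq(5.0, 5.0), Seq(5.0, 5.0)) gives 0 for both.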
Common trait for the Continuous and Discrete label distribution summaries
Summary information about label used in model creation (all fields will be empty if no label is found)
name of label feature
names of the raw features that the label is derived from
types of the raw features that the label is derived from
the stageNames of all stages applied to the label before modeling
count of label values used to compute distribution information (this will be the fraction of the data corresponding to the sample rate in the sanity checker)
summary of label distribution (either continuous or discrete)
Summary of all model insights
summary of information about the label
sequence containing insights for each raw feature that fed into the model
summary information about model training and the winning model from the model selector
op parameters used in model training
all stages, with their parameter settings, used to create the model's output features, keyed by stageName
A simple command line app for running an OpWorkflow with Spark. A user needs to implement a run function.
A simple command line app for running an OpWorkflow with Spark. A user needs to implement a runner creation function.
OpParams for passing in command line information
Workflow for TransmogrifAI. Takes the final features that the user wants to generate as inputs and constructs the full DAG needed to generate them from the lineage of those features. It then fits any estimators in the pipeline DAG to create a sequence of transformations that are saved in a workflow model.
Workflow model is a container and executor for the sequence of transformations that have been fit to the data to produce the desired output features
Reads OpWorkflowModelWriter serialized OpWorkflowModel objects by path and JValue. This will only work if the features were serialized in topological order. NOTE: The FeatureGeneratorStages will not be recovered into the Model object, because they are part of each feature.
Writes the OpWorkflowModel to JSON format. For now, the parent of the model is not serialized.
The features/stages must be sorted in topological order
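Topological order here means that every feature or stage appears after everything it depends on. A minimal sketch of producing such an order follows; the object TopoSort, the dependency-map representation, and the node names in the usage note are hypothetical and are not the library's serialization code.

```scala
import scala.annotation.tailrec

object TopoSort {
  // deps maps each node to the set of nodes it depends on (its parents).
  // Returns the nodes with every parent placed before its children,
  // or a Left describing the nodes involved if the graph has a cycle.
  def sort(deps: Map[String, Set[String]]): Either[String, List[String]] = {
    val nodes = deps.keySet ++ deps.values.flatten

    @tailrec
    def loop(remaining: Set[String], done: List[String]): Either[String, List[String]] =
      if (remaining.isEmpty) Right(done.reverse)
      else {
        // A node is ready once none of its parents are still waiting.
        val ready = remaining
          .filter(n => deps.getOrElse(n, Set.empty[String]).forall(p => !remaining(p)))
          .toList
          .sorted
        if (ready.isEmpty) Left(s"cycle among: ${remaining.toList.sorted.mkString(", ")}")
        else loop(remaining -- ready, ready.reverse ::: done)
      }

    loop(nodes, Nil)
  }
}
```

For a graph in which "model" depends on "vectorizer" and "vectorizer" depends on the raw features "age" and "name", the result is Right(List("age", "name", "vectorizer", "model")). Nodes that become ready in the same round are emitted alphabetically so that the output is deterministic.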
A class for running a TransmogrifAI Workflow. Provides methods to train, score, evaluate and computeUpTo for a TransmogrifAI Workflow.
OpWorkflowRunner configuration container
workflow run type
workflow file params location
default params to use if the file params are missing
read locations
write location
model location
metrics location
Reader params
Methods of vectorizing text (e.g. to be chosen by statistics computed in SmartTextVectorizer)
Enrichment functions for an array of arbitrary features
Enrichment functions for an array of arbitrary features
Enrichment functions for a collection of arbitrary features
Enrichment functions for Base64Map features.
Enrichment functions for Binary Feature
Enrichment functions for OPMap Features with Boolean values
Enrichment functions for Date Feature
Enrichment functions for DateList Feature
Enrichment functions for OPMap Features with Date values
Enrichment functions for DateTime Feature
Enrichment functions for DateTimeList Feature
Enrichment functions for OPMap Features with DateTime values
Enrichment functions for EmailMap Features
Enrichment functions for Feature[A]
Enrichment functions for Geolocation Feature
Enrichment functions for OPMap Features with Geolocation values
Enrichment functions for Integral Feature
Enrichment functions for OPMap Features with Long values
Enrichment functions for OPMap Features
Enrichment functions for OPMap Features with String values
Enrichment functions for Numeric Feature
Enrichment functions for OPSet Feature
Enrichment functions for PhoneMap features
Enrichment functions for Prediction Features
Enrichment functions for Real Feature
Enrichment functions for OPMap Features with Double values
Enrichment functions for Real non-nullable Feature
Enrichment functions for MultiPickList Feature
Enrichment functions for OPMap Features with String values. All are pivoted by default except TextMap and TextAreaMap which are defined specially below.
Enrichment functions for TextAreaMap Features (they are hashed by default instead of being pivoted)
Enrichment functions for TextList Feature
Enrichment functions for TextMap Features (they are hashed by default instead of being pivoted)
Enrichment functions for URLMap Features
Enrichment functions for Vector Feature
Enrichment functions for a collection of vector features
A base class for different kinds of SensitiveFeatureInformation. The following three params are required for every kind of SensitiveFeatureInformation.
OpParams factory
Writes the OpWorkflowModel into a specified path
Unique Identifier (UID) generator