Maximum size of the dataset to train on.
Function to set parameters before the data is passed into the validation step, e.g. to balance the data or drop rows based on the labels
Fraction of data to reserve for testing. Default is 0.1
Seed for data splitting
Function used to create the training set and the test set.
Rebalance the training data within the validation step
Add a splitter parameter to name the label column
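
The splitting parameters described above (reserved test fraction, seed, maximum training size) can be illustrated with a minimal sketch. This is not the actual implementation: the function name `split_data`, its argument names, and the default seed of 42 are hypothetical and chosen only for illustration. Only the 0.1 default for the test fraction comes from the descriptions above.

```python
import random


def split_data(rows, reserve_test_fraction=0.1, seed=42, max_training_sample=None):
    """Hypothetical helper: seeded shuffle, then train/test split.

    reserve_test_fraction: fraction of rows held out for testing (default 0.1).
    seed: seed for the shuffle, so the split is reproducible (42 is an assumption).
    max_training_sample: optional cap on the number of training rows.
    """
    if not 0.0 <= reserve_test_fraction < 1.0:
        raise ValueError("reserve_test_fraction must be in [0, 1)")

    # A dedicated Random instance keeps the split independent of global state.
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)

    # The first n_test shuffled rows form the test set, the rest the training set.
    n_test = int(round(len(shuffled) * reserve_test_fraction))
    test = shuffled[:n_test]
    train = shuffled[n_test:]

    # Downsample the training set if a maximum size was requested.
    if max_training_sample is not None:
        train = train[:max_training_sample]

    return train, test


train_rows, test_rows = split_data(range(100), seed=7)
```

Calling the function twice with the same seed returns the same split, which is the purpose of exposing the seed as a parameter. Rebalancing by label is left out of this sketch.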