Package com.salesforce.op.readers

Type Members

  1. class AggregateAvroReader[T <: GenericRecord] extends AvroReader[T] with AggregateDataReader[T]

    Data reader for Avro events, where there may be multiple records for a given key.

  2. class AggregateCSVAutoReader[T <: GenericRecord] extends CSVAutoReader[T] with AggregateDataReader[T]

    Data reader for event-type CSV data, where there may be multiple records for a given key. Each CSV record is automatically converted to an Avro record by inferring a schema.

  3. class AggregateCSVProductReader[T <: Product] extends CSVProductReader[T] with AggregateDataReader[T]

    Data reader for CSV events, where there may be multiple records for a given key. Each CSV record is automatically converted to a type T for which an Encoder is defined.

  4. class AggregateCSVReader[T <: GenericRecord] extends CSVReader[T] with AggregateDataReader[T]

    Data reader for event-type CSV data, where there may be multiple records for a given key. Each CSV record is automatically converted to an Avro record using the provided schema.

  5. abstract class AggregateCustomReader[T] extends CustomReader[T] with AggregateDataReader[T]

    Custom aggregate data reader.

  6. trait AggregateDataReader[T] extends AggregatedReader[T]

    DataReader to use for event-type data with multiple records per key.

  7. case class AggregateParams[T](timeStampFn: Option[(T) ⇒ Long], cutOffTime: CutOffTime) extends Product with Serializable

    Aggregate data reader params.

    timeStampFn

    An optional function for extracting the timestamp of the event

    cutOffTime

    A cut off time to be used for aggregating features extracted from the events

    • Predictor variables will be aggregated from events up until the cut off time
    • Response variables will be aggregated from events following the cut off time
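
    The cut-off semantics described for cutOffTime can be sketched in plain Scala without the library. This is an illustration only: the Event case class, the CutoffSketch object, and the choice of a sum as the aggregation are hypothetical and are not part of com.salesforce.op.readers. Treating an event that falls exactly on the cut-off as a predictor event is also an assumption.

```scala
// Illustrative sketch only: none of these names come from com.salesforce.op.readers.
object CutoffSketch {
  // Hypothetical event: a key, an epoch-millis timestamp, and a numeric value.
  final case class Event(key: String, timestamp: Long, value: Double)

  // Split events into (at or before the cut-off, after the cut-off).
  // Assumption: an event exactly at the cut-off counts as a predictor event.
  def split(events: Seq[Event], cutOff: Long): (Seq[Event], Seq[Event]) =
    events.partition(_.timestamp <= cutOff)

  // Per key: predictor = sum of values up to the cut-off,
  // response = sum of values after the cut-off.
  def aggregate(events: Seq[Event], cutOff: Long): Map[String, (Double, Double)] =
    events.groupBy(_.key).map { case (key, evs) =>
      val (before, after) = split(evs, cutOff)
      (key, (before.map(_.value).sum, after.map(_.value).sum))
    }
}
```

    Calling CutoffSketch.aggregate(events, cutOff) returns one (predictor, response) pair per key, which mirrors the idea of one aggregated row per key.
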
  8. class AggregateParquetProductReader[T <: Product] extends ParquetProductReader[T] with AggregateDataReader[T]

    Data reader for Parquet events, where there may be multiple records for a given key. Each Parquet record is automatically converted to a type T for which an Encoder is defined.

  9. trait AggregatedReader[T] extends DataReader[T]

    Readers that extend this trait can be used as right-hand-side arguments for joins, so they should aggregate on the key and return only a single value per key.
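
    Why a right-hand side needs a single value per key can be shown with a small standard-library sketch. The object and method names below are hypothetical and do not reflect how the library implements joins; the sketch only shows that once the right side is collapsed to one value per key, each left record matches at most one right value.

```scala
// Illustrative sketch only: not the library's join implementation.
object SingleValuePerKeySketch {
  // Collapse multiple (key, value) records into one value per key
  // using a caller-supplied combine function.
  def aggregateByKey[K, V](records: Seq[(K, V)])(combine: (V, V) => V): Map[K, V] =
    records.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2).reduce(combine)) }

  // Left join: every left record is paired with at most one right-hand value.
  def leftJoin[K, L, R](left: Seq[(K, L)], right: Map[K, R]): Seq[(K, L, Option[R])] =
    left.map { case (k, l) => (k, l, right.get(k)) }
}
```

    Without the aggregation step, a key with several right-hand records would duplicate the matching left record in the joined result.
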

  10. class AvroReader[T <: GenericRecord] extends DataReader[T]

    Data reader for Avro data.

  11. class CSVAutoReader[T <: GenericRecord] extends DataReader[T]

    Data reader for CSV data that automatically infers the schema from the CSV data and converts each record to T <: GenericRecord. The schema is inferred using the provided headers param if one is given; otherwise the first row is assumed to be the header line.
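
    The header rule above (use the supplied headers, otherwise take the first row) can be sketched as follows. This covers only the choice of column names, not type inference or Avro conversion. CsvHeaderSketch and resolve are hypothetical names, and treating every row as data when headers are supplied is an assumption.

```scala
// Illustrative sketch only: column-name resolution, not the library's schema inference.
object CsvHeaderSketch {
  // Returns (column names, data rows).
  // Assumption: when headers are supplied, every row is treated as data;
  // otherwise the first row is taken as the header line.
  def resolve(
    rows: Seq[Seq[String]],
    headers: Seq[String] = Seq.empty
  ): (Seq[String], Seq[Seq[String]]) =
    if (headers.nonEmpty) (headers, rows)
    else rows match {
      case first +: rest => (first, rest)
      case _ => (Seq.empty, Seq.empty) // no rows, so no header and no data
    }
}
```
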

  12. class CSVProductReader[T <: Product] extends DataReader[T]

    CSV reader for any type that defines an Encoder. Scala case classes and tuples/products are included automatically.

  13. class CSVReader[T <: GenericRecord] extends DataReader[T]

    Data reader for CSV data. Each CSV record is automatically converted to an Avro record using the provided schema.

  14. class ConditionalAvroReader[T <: GenericRecord] extends AvroReader[T] with ConditionalDataReader[T]

    Data reader for Avro events when computing conditional probabilities.

  15. class ConditionalCSVAutoReader[T <: GenericRecord] extends CSVAutoReader[T] with ConditionalDataReader[T]

    Data reader for event-type CSV data (with schema inference) when computing conditional probabilities. There may be multiple records for a given key. Each CSV record is automatically converted to an Avro record with an inferred schema.

  16. class ConditionalCSVProductReader[T <: Product] extends CSVProductReader[T] with ConditionalDataReader[T]

    Data reader for CSV events when computing conditional probabilities. There may be multiple records for a given key. Each CSV record is automatically converted to a type T for which an Encoder is defined.

  17. class ConditionalCSVReader[T <: GenericRecord] extends CSVReader[T] with ConditionalDataReader[T]

    Data reader for event-type CSV data when computing conditional probabilities. There may be multiple records for a given key. Each CSV record is automatically converted to an Avro record using the provided schema.

  18. abstract class ConditionalCustomReader[T] extends CustomReader[T] with ConditionalDataReader[T]

    Custom conditional aggregate data reader.

  19. trait ConditionalDataReader[T] extends AggregatedReader[T]

    DataReader to use for event-type data when modeling conditional probabilities. Predictor variables are aggregated from events up until the occurrence of the condition. Response variables are aggregated from events following the occurrence of the condition.
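
    The split around the condition can be sketched for a single key in plain Scala. Everything here is hypothetical (the Event case class, the isTarget flag, the sum aggregation). The sketch assumes the earliest occurrence of the condition is used, that the condition event itself counts toward the response, and that no result is produced when the condition never occurs. The library exposes choices for these cases through its params, so this is one possible reading rather than its behavior.

```scala
// Illustrative sketch only: none of these names come from com.salesforce.op.readers.
object ConditionalSketch {
  // Hypothetical event with a flag saying whether the target condition holds.
  final case class Event(key: String, timestamp: Long, value: Double, isTarget: Boolean)

  // For the events of one key: find the earliest time the condition holds,
  // then sum values strictly before it (predictor) and from it onward (response).
  // Returns None when the condition is never met (an assumption of this sketch).
  def aggregate(events: Seq[Event]): Option[(Double, Double)] = {
    val conditionTimes = events.filter(_.isTarget).map(_.timestamp)
    if (conditionTimes.isEmpty) None
    else {
      val t = conditionTimes.min
      val (before, after) = events.partition(_.timestamp < t)
      Some((before.map(_.value).sum, after.map(_.value).sum))
    }
  }
}
```
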

  20. case class ConditionalParams[T](timeStampFn: (T) ⇒ Long, targetCondition: (T) ⇒ Boolean, responseWindow: Option[Duration] = ..., predictorWindow: Option[Duration] = ..., timeStampToKeep: TimeStampToKeep = TimeStampToKeep.Random, cutOffTimeFn: Option[(String, Seq[T]) ⇒ CutOffTime] = None, dropIfTargetConditionNotMet: Boolean = false) extends Product with Serializable

    Conditional data reader params.

    timeStampFn

    function for extracting the timestamp from an event

    targetCondition

    function for identifying if the condition is met

    responseWindow

    optional size of time window over which the response variable is to be aggregated

    predictorWindow

    optional size of time window over which the predictor variables are to be aggregated

    timeStampToKeep

    if a particular key meets the condition multiple times, which of those times to use in the training set

    cutOffTimeFn

    optional function to compute the cutoff value based on key and aggregated sequence of events for that key

    dropIfTargetConditionNotMet

    do not generate feature vectors for keys in training set where the target condition is not met. If set to false, and condition is not met, features for those

  21. class ConditionalParquetProductReader[T <: Product] extends ParquetProductReader[T] with ConditionalDataReader[T]

    Data reader for Parquet events when computing conditional probabilities. There may be multiple records for a given key. Each Parquet record is automatically converted to a type T for which an Encoder is defined.

  22. abstract class CustomReader[T] extends DataReader[T]

    Custom data reader.

  23. trait DataReader[T] extends Reader[T] with ReaderKey[T]

    DataReaders must specify: (1) an optional path to read from; (2) a function for extracting the key from the records being read; (3) the read method to be used for reading the data.
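
    The three requirements can be pictured with a small stand-in trait. SimpleReader below is a hypothetical simplification written for this illustration and is not the library's DataReader: its member names and signatures are assumptions, and it reads from memory instead of from Spark.

```scala
// Illustrative sketch only: SimpleReader is not the library's DataReader.
object ReaderContractSketch {
  // Hypothetical trait mirroring the three requirements listed above.
  trait SimpleReader[T] {
    def readPath: Option[String]            // 1. optional path to read from
    def key: T => String                    // 2. key extraction function
    def read(path: Option[String]): Seq[T]  // 3. read method
    // Convenience: read the data and pair every record with its key.
    def keyed(): Seq[(String, T)] = read(readPath).map(r => (key(r), r))
  }

  final case class Passenger(id: Int, name: String)

  // In-memory reader that ignores the path, used only for illustration.
  val inMemory: SimpleReader[Passenger] = new SimpleReader[Passenger] {
    val readPath: Option[String] = None
    val key: Passenger => String = _.id.toString
    def read(path: Option[String]): Seq[Passenger] =
      Seq(Passenger(1, "Ann"), Passenger(2, "Bob"))
  }
}
```
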

  24. class FileStreamingAvroReader[T <: GenericRecord] extends StreamingReader[T]

    Simple Avro streaming reader that monitors a Hadoop-compatible filesystem for new files.

  25. case class JoinKeys(leftKey: String = KeyFieldName, rightKey: String = KeyFieldName, resultKey: String = CombinedKeyName) extends Product with Serializable

    Join keys to use.

    leftKey

    key to use from left table

    rightKey

    key to use from right table (will always be the aggregation key)

    resultKey

    key of joined result

  26. sealed abstract class JoinType extends EnumEntry with Serializable

  27. class ParquetProductReader[T <: Product] extends DataReader[T]

    Parquet reader for any type that defines an Encoder. Scala case classes and tuples/products are included automatically.

  28. trait Reader[T] extends ReaderType[T]

  29. trait StreamingReader[T] extends ReaderType[T] with ReaderKey[T]

  30. case class TimeBasedFilter(condition: TimeColumn, primary: TimeColumn, timeWindow: Duration) extends Product with Serializable

    Time-based filter for conditional aggregation.

    condition

    condition time column

    primary

    primary time column

    timeWindow

    time window for conditional aggregation

  31. case class TimeColumn(name: String, keep: Boolean) extends Product with Serializable

    Time column for aggregation.

    name

    column name

    keep

    whether to keep the column in the result

  32. sealed abstract class TimeStampToKeep extends EnumEntry with Serializable

Value Members

  1. object CSVDefaults

  2. object DataFrameFieldNames extends Product with Serializable

    The name of the column containing the entity being scored will always be key

  3. object DataReaders

    A convenience factory for data readers.

  4. object JoinTypes extends Enum[JoinType]

  5. object ReaderKey extends Serializable

  6. object StreamingReaders

    A convenience factory for streaming readers.

  7. object TimeStampToKeep extends Enum[TimeStampToKeep] with Serializable
