Class

com.salesforce.op.readers

AggregateCSVReader

class AggregateCSVReader[T <: GenericRecord] extends CSVReader[T] with AggregateDataReader[T]

Data reader for event-type CSV data, where there may be multiple records for a given key. Each CSV record is automatically converted to an Avro record using the provided schema.
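
As a rough illustration of "multiple records for a given key", the sketch below groups event records by a key function in plain Scala. It does not use this library: `Event` and `GroupEventsByKey` are hypothetical names standing in for an Avro record type and for whatever grouping the reader performs internally, which the documentation here does not describe.

```scala
// Hypothetical stand-in for an Avro record parsed from one CSV row.
final case class Event(userId: String, timestamp: Long, amount: Double)

object GroupEventsByKey {
  // Plays the role of the reader's `key` argument: record => String.
  val key: Event => String = _.userId

  // Collect every record that shares a key, ordered by event timestamp.
  def groupByKey(events: Seq[Event]): Map[String, Seq[Event]] =
    events.groupBy(key).map { case (k, es) => (k, es.sortBy(_.timestamp)) }

  def main(args: Array[String]): Unit = {
    val events = Seq(
      Event("a", 2L, 5.0),
      Event("b", 1L, 7.5),
      Event("a", 1L, 2.5)
    )
    // Key "a" ends up with two events, key "b" with one.
    groupByKey(events).toSeq.sortBy(_._1).foreach { case (k, es) =>
      println(s"$k -> ${es.map(_.amount).mkString(", ")}")
    }
  }
}
```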

Linear Supertypes
AggregateDataReader[T], AggregatedReader[T], CSVReader[T], DataReader[T], ReaderKey[T], Reader[T], ReaderType[T], Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new AggregateCSVReader(readPath: Option[String], key: (T) ⇒ String, schema: String, options: CSVOptions = CSVDefaults.CSVOptions, timeZone: String = CSVDefaults.TimeZone, aggregateParams: AggregateParams[T])(implicit arg0: ClassTag[T], arg1: scala.reflect.api.JavaUniverse.WeakTypeTag[T])

    readPath

    default path to data

    key

    function for extracting the key from an Avro record

    schema

    Avro schema. Note that dateTime fields should be of type Long; they are automatically converted to Unix timestamps in milliseconds.

    options

    CSV options

    timeZone

    time zone to be used for any dateTime fields

    aggregateParams

    aggregation parameters, including the function for extracting the timestamp of an event
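
The interaction between `timeZone` and dateTime fields can be sketched with `java.time`. This is only an illustration of turning a zone-less date-time string into a Unix timestamp in milliseconds. The text pattern and the `DateTimeToMillis` helper are assumptions, and the reader's actual parsing logic may differ.

```scala
import java.time.{LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter

object DateTimeToMillis {
  // Assumed text format, chosen for illustration only.
  private val format = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss")

  // Interpret a zone-less date-time string in the given time zone
  // and return the corresponding Unix timestamp in milliseconds.
  def toEpochMillis(text: String, timeZone: String): Long =
    LocalDateTime.parse(text, format).atZone(ZoneId.of(timeZone)).toInstant.toEpochMilli

  def main(args: Array[String]): Unit = {
    // The same wall-clock text maps to different instants in different zones.
    println(toEpochMillis("2017-01-01 00:00:00", "UTC"))
    println(toEpochMillis("2017-01-01 00:00:00", "America/Los_Angeles"))
  }
}
```

The resulting value is a `Long`, which is why the schema is expected to declare dateTime fields as Long.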

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. val aggregateParams: AggregateParams[T]

    aggregation parameters, including the function for extracting the timestamp of an event

    Definition Classes
    AggregateCSVReader → AggregateDataReader
  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def fullTypeName: String

    Full reader input type name

    returns

    full input type name

    Definition Classes
    ReaderType
  11. final def generateDataFrame(rawFeatures: Array[OPFeature], opParams: OpParams = new OpParams())(implicit spark: SparkSession): DataFrame

    Generates the DataFrame that will be used in the OpPipeline calling this method.

    rawFeatures

    features to generate from the dataset read in by this reader

    opParams

    op parameters

    spark

    Spark instance that does the reading and the conversion from RDD to DataFrame

    returns

    a DataFrame containing columns with all of the raw input features expected by the pipeline

    Definition Classes
    AggregatedReader → DataReader → Reader
  12. final def generateRow(key: String, records: Seq[T], rawFeatures: Array[OPFeature]): Option[Row]

    Definition Classes
    AggregateDataReader → AggregatedReader
  13. def generateRow(key: String, record: T, rawFeatures: Array[OPFeature]): Option[Row]

    Attributes
    protected
    Definition Classes
    DataReader
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. final def getFinalReadPath(params: OpParams): String

    Default method for extracting the path used in the read method. The path is taken in the following order of priority: readerPath, then params.

    returns

    final path to use

    Attributes
    protected
    Definition Classes
    DataReader
  16. def getGenStage[I](f: OPFeature): FeatureGeneratorStage[I, _ <: FeatureType]

    Attributes
    protected[com.salesforce.op]
    Definition Classes
    Reader
  17. final def getReaderParams(opParams: OpParams): Option[ReaderParams]

    Default method for extracting this reader's parameters from readerParams in OpParams

    opParams

    contains map of reader type to ReaderParams instances

    returns

    ReaderParams instance if it exists

    Definition Classes
    ReaderType
  18. final def getSchema(rawFeatures: Array[OPFeature]): StructType

    Derives DataFrame schema for raw features.

    rawFeatures

    feature array representing raw feature-data

    returns

    a StructType instance

    Attributes
    protected
    Definition Classes
    DataReader
  19. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  20. final def innerJoin[U](other: DataReader[U], joinKeys: JoinKeys = JoinKeys()): JoinedDataReader[T, U]

    Inner join

    U

    Type of data read by right data reader

    other

    reader from right side of join

    joinKeys

    join keys to use

    returns

    joined reader

    Definition Classes
    Reader
  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. final def join[U](other: DataReader[U], joinType: JoinType, joinKeys: JoinKeys = JoinKeys()): JoinedDataReader[T, U]

    Join readers

    U

    Type of data read by right data reader

    other

    reader from right side of join

    joinType

    type of join to perform

    joinKeys

    join keys to use

    returns

    joined reader

    Attributes
    protected
    Definition Classes
    Reader
  23. val key: (T) ⇒ String

    function for extracting the key from an Avro record

    Definition Classes
    CSVReader → ReaderKey
  24. final def leftOuterJoin[U](other: DataReader[U], joinKeys: JoinKeys = JoinKeys()): JoinedDataReader[T, U]

    Left Outer join

    U

    Type of data read by right data reader

    other

    reader from right side of join

    joinKeys

    join keys to use

    returns

    joined reader

    Definition Classes
    Reader
  25. final def maybeRepartition(data: Dataset[T], params: OpParams): Dataset[T]

    Function to repartition the data based on the op params of this reader

    data

    dataset

    params

    op params

    returns

    maybe repartitioned dataset

    Attributes
    protected
    Definition Classes
    DataReader
  26. final def maybeRepartition(data: RDD[T], params: OpParams): RDD[T]

    Function to repartition the data based on the op params of this reader

    data

    rdd

    params

    op params

    returns

    maybe repartitioned rdd

    Attributes
    protected
    Definition Classes
    DataReader
  27. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  28. final def notify(): Unit

    Definition Classes
    AnyRef
  29. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  30. val options: CSVOptions

    CSV options

    Definition Classes
    CSVReader
  31. final def outerJoin[U](other: DataReader[U], joinKeys: JoinKeys = JoinKeys()): JoinedDataReader[T, U]

    Outer join

    U

    Type of data read by right data reader

    other

    reader from right side of join

    joinKeys

    join keys to use

    returns

    joined reader

    Definition Classes
    Reader
  32. def read(params: OpParams = new OpParams())(implicit spark: SparkSession): Either[RDD[T], Dataset[T]]

    Reads raw data from the specified location for use in DataFrame creation, i.e. by generateDataFrame. Returns either an RDD or a Dataset of the type specified by this reader. It can be overridden to carry out any special logic required for the reader (e.g. filters or joins needed to produce the specified reader type).

    params

    parameters used to carry out specialized logic in the reader (passed in from the workflow)

    spark

    Spark instance that does the reading and the conversion from RDD to DataFrame

    returns

    either an RDD or a Dataset of type T

    Definition Classes
    CSVReader → DataReader
  33. final def readDataset(params: OpParams = new OpParams())(implicit sc: SparkSession, encoder: Encoder[T]): Dataset[T]

    Reads raw data from the specified location for use in DataFrame creation, i.e. by generateDataFrame. Returns a Dataset of the type specified by this reader.

    params

    parameters used to carry out specialized logic in reader (passed in from workflow)

    sc

    spark session

    returns

    Dataset of type T

    Definition Classes
    DataReader
  34. val readPath: Option[String]

    default path to data

    Definition Classes
    CSVReader → DataReader
  35. final def readRDD(params: OpParams = new OpParams())(implicit sc: SparkSession): RDD[T]

    Reads raw data from the specified location for use in DataFrame creation, i.e. by generateDataFrame. Returns an RDD of the type specified by this reader.

    params

    parameters used to carry out specialized logic in reader (passed in from workflow)

    sc

    spark session

    returns

    RDD of type T

    Definition Classes
    DataReader
  36. val schema: String

    Avro schema. Note that dateTime fields should be of type Long; they are automatically converted to Unix timestamps in milliseconds.

    Definition Classes
    CSVReader
  37. implicit val seqEnc: Encoder[Seq[T]]

    Definition Classes
    AggregatedReader
  38. implicit val strEnc: Encoder[String]

    Definition Classes
    AggregatedReader
  39. final def subReaders: Seq[DataReader[_]]

    All the reader's sub readers (used in joins)

    returns

    sub readers

    Definition Classes
    DataReader → Reader
  40. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  41. val timeZone: String

    time zone to be used for any dateTime fields

    Definition Classes
    CSVReader
  42. def toString(): String

    Definition Classes
    AnyRef → Any
  43. implicit val tupEnc: Encoder[(String, Seq[T])]

    Definition Classes
    AggregatedReader
  44. final def typeName: String

    Short reader input type name

    returns

    short reader input type name

    Definition Classes
    ReaderType
  45. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  46. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  47. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  48. implicit val wtt: scala.reflect.api.JavaUniverse.WeakTypeTag[T]

    Reader type tag

    Definition Classes
    CSVReader → ReaderType
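
The join members listed above (innerJoin, leftOuterJoin, outerJoin) follow the usual relational naming. The sketch below shows what those three join types mean for keyed records, using plain Scala maps. It is not the JoinedDataReader implementation: `KeyedJoins` and its methods are hypothetical, and how the library represents missing values on either side is not stated in this page.

```scala
object KeyedJoins {
  // Keyed records from two hypothetical readers: key -> value.
  type Keyed[A] = Map[String, A]

  // Inner join: keep only keys present on both sides.
  def innerJoin[A, B](left: Keyed[A], right: Keyed[B]): Map[String, (A, B)] =
    left.collect { case (k, a) if right.contains(k) => (k, (a, right(k))) }

  // Left outer join: keep every left key; the right value may be absent.
  def leftOuterJoin[A, B](left: Keyed[A], right: Keyed[B]): Map[String, (A, Option[B])] =
    left.map { case (k, a) => (k, (a, right.get(k))) }

  // Outer join: keep keys from either side; both values may be absent.
  def outerJoin[A, B](left: Keyed[A], right: Keyed[B]): Map[String, (Option[A], Option[B])] =
    (left.keySet ++ right.keySet).map(k => (k, (left.get(k), right.get(k)))).toMap

  def main(args: Array[String]): Unit = {
    val left = Map("a" -> 1, "b" -> 2)
    val right = Map("b" -> "x", "c" -> "y")
    println(innerJoin(left, right))
    println(leftOuterJoin(left, right))
    println(outerJoin(left, right))
  }
}
```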
