Class

com.salesforce.op.stages.impl.feature

TextMapHashingVectorizer

Related Doc: package feature

class TextMapHashingVectorizer[T <: OPMap[String]] extends OPMapVectorizer[String, T] with TextParams

Linear Supertypes
  1. TextMapHashingVectorizer
  2. TextParams
  3. OPMapVectorizer
  4. TrackNullsParam
  5. NumericMapDefaultParam
  6. MapVectorizerFuns
  7. CleanTextMapFun
  8. CleanTextFun
  9. MapPivotParams
  10. VectorizerDefaults
  11. SequenceEstimator
  12. OpPipelineStageN
  13. HasOut
  14. HasInN
  15. OpPipelineStage
  16. OpPipelineStageBase
  17. MLWritable
  18. OpPipelineStageParams
  19. InputParams
  20. Estimator
  21. PipelineStage
  22. Logging
  23. Params
  24. Serializable
  25. Serializable
  26. Identifiable
  27. AnyRef
  28. Any

Instance Constructors

  1. new TextMapHashingVectorizer(uid: String = UID[TextMapHashingVectorizer[T]])(implicit tti: scala.reflect.api.JavaUniverse.TypeTag[T])

Type Members

  1. final type InputFeatures = Array[FeatureLike[T]]

    Input Features type

    Definition Classes
    OpPipelineStageN → OpPipelineStage → InputParams
  2. final type OutputFeatures = FeatureLike[OPVector]

    Definition Classes
    OpPipelineStage → OpPipelineStageBase

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. final val allowListKeys: StringArrayParam

    Definition Classes
    MapPivotParams
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. final val blockListKeys: StringArrayParam

    Definition Classes
    MapPivotParams
  8. implicit def booleanToDouble(v: Boolean): Double

    Definition Classes
    VectorizerDefaults
  9. final def checkInputLength(features: Array[_]): Boolean

    Checks the input length

    features

    input features

    returns

    true if the input size is as expected, false otherwise

    Definition Classes
    OpPipelineStageN → InputParams
  10. final def checkSerializable: Try[Unit]

    Check if the stage is serializable

    returns

    Failure if not serializable

    Definition Classes
    SequenceEstimator → OpPipelineStageBase
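    A check of this kind can be sketched with plain Java serialization wrapped in scala.util.Try. This is only an illustration of the idea under that assumption, not the library's implementation; the object name SerializationCheck and the one-argument helper below are hypothetical.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.util.Try

object SerializationCheck {
  // Hypothetical helper: tries to Java-serialize `obj` into an in-memory buffer.
  // Returns Success(()) if serialization works, Failure(exception) otherwise.
  def checkSerializable(obj: AnyRef): Try[Unit] = Try {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try out.writeObject(obj) finally out.close()
  }
}
```

    Returning a Try lets the caller decide whether a non-serializable stage is a hard error or just something to log.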
  11. final val cleanKeys: BooleanParam

    Definition Classes
    MapPivotParams
  12. def cleanMap[V](m: Map[String, V], shouldCleanKey: Boolean, shouldCleanValue: Boolean): Map[String, V]

    Definition Classes
    CleanTextMapFun
  13. final val cleanText: BooleanParam

    Definition Classes
    TextParams
  14. def cleanTextFn(s: String, shouldClean: Boolean): String

    Definition Classes
    CleanTextFun
  15. final def clear(param: Param[_]): TextMapHashingVectorizer.this.type

    Definition Classes
    Params
  16. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  17. val convertFn: (Map[String, String]) ⇒ Map[String, Double]

    Maps the input type into a Map[String, Double] on the way to conversion to OPVector

    Definition Classes
    OPMapVectorizer
  18. final def copy(extra: ParamMap): TextMapHashingVectorizer.this.type

    Makes a copy of this instance with new parameters. Several methods in Spark internals rely on it. The default implementation finds the constructor and makes a copy for any class, as long as all constructor params are vals (this is why type tags are written as implicit vals in base classes).

    Note: the convention in Spark is to have the uid be a constructor argument, so that copies share a uid with the original (developers should follow this convention).

    extra

    new parameters to add to the instance

    returns

    a new instance with the same uid

    Definition Classes
    OpPipelineStageBase → Params
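    The uid convention described above can be illustrated with a toy class. This is a minimal sketch, not the library's or Spark's code: CopySketch, Stage, and the plain Map used in place of ParamMap are all hypothetical stand-ins.

```scala
object CopySketch {
  // Hypothetical stage: the uid and all other constructor parameters are vals,
  // so a copy can be rebuilt from the constructor arguments alone.
  final class Stage(val uid: String, val params: Map[String, Any] = Map.empty) {
    // Returns a new instance that shares the uid and merges in the extra params.
    def copy(extra: Map[String, Any]): Stage = new Stage(uid, params ++ extra)
  }
}
```

    Because the uid is passed through the constructor, the copy is a distinct object that still identifies as the same stage.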
  19. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  20. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  21. final val defaultValue: DoubleParam

    Definition Classes
    NumericMapDefaultParam
  22. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  24. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  25. def explainParams(): String

    Definition Classes
    Params
  26. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  27. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  28. def fillByKey(dataset: Dataset[Seq[Map[String, String]]]): Seq[Map[String, Double]]

    Definition Classes
    OPMapVectorizer
  29. def filterKeys[V](m: Map[String, V], shouldCleanKey: Boolean, shouldCleanValue: Boolean): Map[String, V]

    Attributes
    protected
    Definition Classes
    MapPivotParams
  30. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  31. def fit(dataset: Dataset[_]): SequenceModel[T, OPVector]

    Runs a Spark operation on the dataset to produce the Dataset for the constructor fit function, then turns the output function into a Model

    dataset

    input data for this stage

    returns

    a fitted model that will perform the transformation specified by the function defined in constructor fit

    Definition Classes
    SequenceEstimator → Estimator
  32. def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[SequenceModel[T, OPVector]]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  33. def fit(dataset: Dataset[_], paramMap: ParamMap): SequenceModel[T, OPVector]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  34. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): SequenceModel[T, OPVector]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  35. def fitFn(dataset: Dataset[Seq[Map[String, String]]]): SequenceModel[T, OPVector]

    Function that fits the sequence model

    Definition Classes
    OPMapVectorizer → SequenceEstimator
  36. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  37. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  38. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  39. def getHashSpaceStrategy: HashSpaceStrategy

  40. final def getInputFeature[T <: FeatureType](i: Int): Option[FeatureLike[T]]

    Gets an input feature. Note: this method IS NOT safe to use outside the driver; please use the getTransientFeature method instead.

    returns

    maybe an input feature

    Definition Classes
    InputParams
    Exceptions thrown

    NoSuchElementException if the features are not set

    RuntimeException in case one of the features is null

  41. final def getInputFeatures(): Array[OPFeature]

    Gets the input features. Note: this method IS NOT safe to use outside the driver; please use the getTransientFeatures method instead.

    returns

    array of features

    Definition Classes
    InputParams
    Exceptions thrown

    NoSuchElementException if the features are not set

    RuntimeException in case one of the features is null

  42. final def getInputSchema(): StructType

    Definition Classes
    OpPipelineStageParams
  43. def getKeyValues(in: Dataset[Seq[Map[String, Double]]], shouldCleanKeys: Boolean, shouldCleanValues: Boolean): Seq[Seq[String]]

    Attributes
    protected
    Definition Classes
    MapVectorizerFuns
  44. final def getMetadata(): Metadata

    Definition Classes
    OpPipelineStageParams
  45. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  46. def getOutput(): FeatureLike[OPVector]

    Output features that will be created by this stage

    returns

    feature of type OutputFeatures

    Definition Classes
    HasOut → OpPipelineStageBase
  47. final def getOutputFeatureName: String

    Name of output feature (i.e. column created by this stage)

    Definition Classes
    OpPipelineStage
  48. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  49. final def getTransientFeature(i: Int): Option[TransientFeature]

    Gets an input feature at index i

    i

    input index

    returns

    maybe an input feature

    Definition Classes
    InputParams
  50. final def getTransientFeatures(): Array[TransientFeature]

    Gets the input features

    returns

    input features

    Definition Classes
    InputParams
  51. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  52. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  53. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  54. final val hashSpaceStrategy: Param[String]

  55. final def inN: Array[TransientFeature]

    Attributes
    protected
    Definition Classes
    HasInN
  56. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  57. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  58. final def inputAsArray(in: InputFeatures): Array[OPFeature]

    Function to convert InputFeatures to an Array of FeatureLike

    returns

    an Array of FeatureLike

    Definition Classes
    OpPipelineStageN → InputParams
  59. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  60. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  61. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  62. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  63. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  64. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  65. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  66. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  67. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  68. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  69. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  70. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  71. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  72. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  73. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  74. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  75. def makeModel(args: OPMapVectorizerModelArgs, operationName: String, uid: String): OPMapVectorizerModel[String, T]

    Definition Classes
    TextMapHashingVectorizer → OPMapVectorizer
  76. def makeVectorMetaWithNullIndicators(allKeys: Seq[Seq[String]]): OpVectorMetadata

    Attributes
    protected
    Definition Classes
    MapVectorizerFuns
  77. def makeVectorMetadata(allKeys: Seq[Seq[String]]): OpVectorMetadata

    Attributes
    protected
    Definition Classes
    MapVectorizerFuns
  78. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  79. final def notify(): Unit

    Definition Classes
    AnyRef
  80. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  81. final val numFeatures: IntParam
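    For orientation, the general hashing trick that a parameter like numFeatures controls can be sketched as follows. This is a generic illustration only: it uses String.hashCode for brevity, and the hash function, tokenization, and hash space strategy actually used by this class are not shown on this page. HashingSketch and hashTokens are hypothetical names.

```scala
object HashingSketch {
  // Hypothetical helper, not the library's code: maps each token to a bucket
  // in [0, numFeatures) and counts how many tokens land in each bucket.
  def hashTokens(tokens: Seq[String], numFeatures: Int): Array[Double] = {
    require(numFeatures > 0, "numFeatures must be positive")
    val vec = Array.fill(numFeatures)(0.0)
    tokens.foreach { t =>
      // floorMod keeps the index non-negative even for negative hash codes
      val idx = java.lang.Math.floorMod(t.hashCode, numFeatures)
      vec(idx) += 1.0
    }
    vec
  }
}
```

    A larger numFeatures reduces collisions between distinct tokens at the cost of a wider output vector.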

  82. def onGetMetadata(): Unit

    Function to be called on getMetadata

    Attributes
    protected
    Definition Classes
    OpPipelineStageParams
  83. def onSetInput(): Unit

    Function to be called on setInput

    Definition Classes
    VectorizerDefaults → InputParams
  84. val operationName: String

    Unique name of the operation this stage performs

    Definition Classes
    SequenceEstimator → OpPipelineStageBase
  85. final def outputAsArray(out: OutputFeatures): Array[OPFeature]

    Function to convert OutputFeatures to an Array of FeatureLike

    returns

    an Array of FeatureLike

    Definition Classes
    OpPipelineStage → OpPipelineStageBase
  86. def outputFeatureUid: String

    Attributes
    protected[com.salesforce.op]
    Definition Classes
    OpPipelineStageN → OpPipelineStage
  87. def outputIsResponse: Boolean

    Should output feature be a response? Yes, if any of the input features are.

    returns

    true if the output feature should be a response

    Definition Classes
    OpPipelineStage
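    The rule stated above amounts to an "exists" check over the inputs. A minimal sketch, with ResponseSketch and Feature as hypothetical stand-ins for the real feature types:

```scala
object ResponseSketch {
  // Hypothetical stand-in for an input feature with a response flag
  final case class Feature(name: String, isResponse: Boolean)

  // The output is a response as soon as any one input is a response.
  def outputIsResponse(inputs: Seq[Feature]): Boolean =
    inputs.exists(_.isResponse)
}
```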
  88. def outputVectorMeta: OpVectorMetadata

    Get the metadata describing the output vector

    This does not trigger onGetMetadata()

    returns

    Metadata of output vector

    Attributes
    protected
    Definition Classes
    VectorizerDefaults
  89. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  90. final val prependFeatureName: BooleanParam

  91. def save(path: String): Unit

    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  92. val seqIConvert: FeatureTypeSparkConverter[T]

    Definition Classes
    SequenceEstimator
  93. implicit val seqIEncoder: Encoder[Seq[T.Value]]

    Definition Classes
    SequenceEstimator
  94. final def set(paramPair: ParamPair[_]): TextMapHashingVectorizer.this.type

    Attributes
    protected
    Definition Classes
    Params
  95. final def set(param: String, value: Any): TextMapHashingVectorizer.this.type

    Attributes
    protected
    Definition Classes
    Params
  96. final def set[T](param: Param[T], value: T): TextMapHashingVectorizer.this.type

    Definition Classes
    Params
  97. final def setAllowListKeys(keys: Array[String]): TextMapHashingVectorizer.this.type

    Definition Classes
    MapPivotParams
  98. final def setBlockListKeys(keys: Array[String]): TextMapHashingVectorizer.this.type

    Definition Classes
    MapPivotParams
  99. def setCleanKeys(clean: Boolean): TextMapHashingVectorizer.this.type

    Definition Classes
    MapPivotParams
  100. def setCleanText(clean: Boolean): TextMapHashingVectorizer.this.type

    Definition Classes
    TextParams
  101. final def setDefault(paramPairs: ParamPair[_]*): TextMapHashingVectorizer.this.type

    Attributes
    protected
    Definition Classes
    Params
  102. final def setDefault[T](param: Param[T], value: T): TextMapHashingVectorizer.this.type

    Attributes
    protected
    Definition Classes
    Params
  103. def setDefaultValue(value: Double): TextMapHashingVectorizer.this.type

    Definition Classes
    NumericMapDefaultParam
  104. def setFillWithConstant(value: Double): TextMapHashingVectorizer.this.type

    Definition Classes
    NumericMapDefaultParam
  105. def setHashSpaceStrategy(v: HashSpaceStrategy): TextMapHashingVectorizer.this.type

  106. final def setInput(features: FeatureLike[T]*): TextMapHashingVectorizer.this.type

    Definition Classes
    OpPipelineStageN
  107. final def setInput(features: InputFeatures): TextMapHashingVectorizer.this.type

    Input features that will be used by the stage

    returns

    feature of type InputFeatures

    Definition Classes
    OpPipelineStageBase
  108. final def setInputFeatures[S <: OPFeature](features: Array[S]): TextMapHashingVectorizer.this.type

    Sets input features

    S

    feature like type

    features

    array of input features

    returns

    this stage

    Attributes
    protected
    Definition Classes
    InputParams
  109. final def setMetadata(m: Metadata): TextMapHashingVectorizer.this.type

    Definition Classes
    OpPipelineStageParams
  110. def setNumFeatures(v: Int): TextMapHashingVectorizer.this.type

  111. def setOutputFeatureName(name: String): TextMapHashingVectorizer.this.type

    Definition Classes
    OpPipelineStage
  112. def setPrependFeatureName(v: Boolean): TextMapHashingVectorizer.this.type

  113. def setTrackNulls(v: Boolean): TextMapHashingVectorizer.this.type

    Option to keep track of values that were missing

    Definition Classes
    TrackNullsParam
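    One way to picture null tracking for map inputs is an extra indicator per expected key. The sketch below assumes that a key counts as missing when it is absent or maps to an empty string; that rule, and the names NullTrackingSketch and nullIndicators, are illustrative assumptions rather than the library's documented behavior.

```scala
object NullTrackingSketch {
  // Hypothetical helper: for each expected key, emits 1.0 when the key is
  // absent from the map (or maps to an empty string) and 0.0 otherwise.
  def nullIndicators(m: Map[String, String], keys: Seq[String]): Seq[Double] =
    keys.map(k => if (m.get(k).forall(_.isEmpty)) 1.0 else 0.0)
}
```

    Such indicator columns let a downstream model distinguish "value was missing" from "value hashed to zero counts".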
  114. val shouldCleanValues: Boolean

    Attributes
    protected
    Definition Classes
    OPMapVectorizer
  115. final def stageName: String

    Stage unique name consisting of the stage operation name and uid

    returns

    stage name

    Definition Classes
    OpPipelineStageBase
  116. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  117. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  118. final val trackNulls: BooleanParam

    Definition Classes
    TrackNullsParam
  119. final def transformSchema(schema: StructType): StructType

    Translates the input and output features into the Spark schema checks and changes that will occur on the underlying data frame

    schema

    schema of the input data frame

    returns

    a new schema with the output features added

    Definition Classes
    OpPipelineStageBase
  120. def transformSchema(schema: StructType, logging: Boolean): StructType

    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  121. implicit val tti: scala.reflect.api.JavaUniverse.TypeTag[T]

    type tag for input

    Definition Classes
    SequenceEstimator
  122. implicit val ttiv: scala.reflect.api.JavaUniverse.TypeTag[T.Value]

    type tag for input value

    Definition Classes
    SequenceEstimator
  123. implicit val tto: scala.reflect.api.JavaUniverse.TypeTag[OPVector]

    type tag for output

    Definition Classes
    SequenceEstimator → HasOut
  124. implicit val ttov: scala.reflect.api.JavaUniverse.TypeTag[Value]

    type tag for output value

    Definition Classes
    SequenceEstimator → HasOut
  125. val uid: String

    uid for instance

    Definition Classes
    SequenceEstimator → Identifiable
  126. def vectorMetadataFromInputFeatures: OpVectorMetadata

    Compute the output vector metadata only from the input features. Vectorizers use this to derive the full vector, including pivot columns or indicator features.

    returns

    Vector metadata from input features

    Attributes
    protected
    Definition Classes
    VectorizerDefaults
  127. def vectorMetadataWithNullIndicators: OpVectorMetadata

    Attributes
    protected
    Definition Classes
    VectorizerDefaults
  128. def vectorOutputName: String

    Get the name of the output vector

    returns

    Output vector name as a string

    Attributes
    protected
    Definition Classes
    VectorizerDefaults
  129. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  130. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  131. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  132. final val withConstant: BooleanParam

    Definition Classes
    NumericMapDefaultParam
  133. final def write: MLWriter

    Definition Classes
    OpPipelineStageBase → MLWritable

Inherited from TextParams

Inherited from OPMapVectorizer[String, T]

Inherited from TrackNullsParam

Inherited from NumericMapDefaultParam

Inherited from MapVectorizerFuns[Double, RealMap]

Inherited from CleanTextMapFun

Inherited from CleanTextFun

Inherited from MapPivotParams

Inherited from VectorizerDefaults

Inherited from SequenceEstimator[T, OPVector]

Inherited from OpPipelineStageN[T, OPVector]

Inherited from HasOut[OPVector]

Inherited from HasInN

Inherited from OpPipelineStage[OPVector]

Inherited from OpPipelineStageBase

Inherited from MLWritable

Inherited from OpPipelineStageParams

Inherited from InputParams

Inherited from Estimator[SequenceModel[T, OPVector]]

Inherited from PipelineStage

Inherited from Logging

Inherited from Params

Inherited from Serializable

Inherited from Serializable

Inherited from Identifiable

Inherited from AnyRef

Inherited from Any
