Class

com.salesforce.op.stages.impl.feature

SmartTextVectorizerModel

Related Doc: package feature

final class SmartTextVectorizerModel[T <: Text] extends SequenceModel[T, OPVector] with TextTokenizerParams with HashingFun with OneHotModelFun[Text]

Linear Supertypes
OneHotModelFun[Text], CleanTextFun, HashingFun, TextTokenizerParams, LanguageDetectionParams, SequenceModel[T, OPVector], OpTransformerN[T, OPVector], OpTransformer, OpPipelineStageN[T, OPVector], HasInN, OpPipelineStage[OPVector], OpPipelineStageBase, MLWritable, OpPipelineStageParams, InputParams, Model[SequenceModel[T, OPVector]], Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Type Members

  1. final type InputFeatures = Array[FeatureLike[T]]

    Input Features type

    Definition Classes
    OpPipelineStageN → OpPipelineStage → InputParams
  2. type KeyValue = (String) ⇒ Any

    Feature name (key) -> value lookup, e.g. Row, Map, etc.

    Definition Classes
    OpTransformer
  3. final type OutputFeatures = FeatureLike[OPVector]

    Definition Classes
    OpPipelineStage → OpPipelineStageBase

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. val args: SmartTextVectorizerModelArgs

  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. final val autoDetectLanguage: BooleanParam

    Indicates whether to attempt language detection.

    Definition Classes
    LanguageDetectionParams
  8. final val autoDetectThreshold: DoubleParam

    Language detection threshold. If none of the detected languages has confidence greater than the threshold, defaultLanguage is used.

    Definition Classes
    LanguageDetectionParams
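
    The fallback rule described for autoDetectThreshold can be sketched in plain Scala as follows. This is a minimal illustration, not the library's implementation: the object LanguageFallbackSketch, the function chooseLanguage, and the Map[String, Double] shape of the detector output are all assumptions made for the example.

    ```scala
    // Hypothetical helper, not part of the documented API.
    object LanguageFallbackSketch {
      /** Picks the most confident detected language, or the default.
        *
        * @param detected  language -> confidence pairs (assumed detector output shape)
        * @param threshold plays the role of autoDetectThreshold
        * @param default   plays the role of defaultLanguage
        */
      def chooseLanguage(detected: Map[String, Double], threshold: Double, default: String): String = {
        // Keep only languages whose confidence is strictly greater than the threshold
        val confident = detected.filter { case (_, confidence) => confidence > threshold }
        // Fall back to the default when nothing clears the threshold
        if (confident.isEmpty) default else confident.maxBy(_._2)._1
      }
    }
    ```

    With a high threshold, a low-confidence detection is ignored and the default language is returned; lowering the threshold lets the best-scoring detected language win.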
  9. final def checkInputLength(features: Array[_]): Boolean

    Checks the input length

    features

    input features

    returns

    true if the input size is as expected, false otherwise

    Definition Classes
    OpPipelineStageN → InputParams
  10. final def checkSerializable: Try[Unit]

    Check if the stage is serializable

    returns

    Failure if not serializable

    Definition Classes
    OpTransformerN → OpPipelineStageBase
  11. def cleanTextFn(s: String, shouldClean: Boolean): String

    Definition Classes
    CleanTextFun
  12. final def clear(param: Param[_]): SmartTextVectorizerModel.this.type

    Definition Classes
    Params
  13. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  14. def convertToSet(in: Text): Set[String]

    Attributes
    protected
    Definition Classes
    SmartTextVectorizerModel → OneHotModelFun
  15. final def copy(extra: ParamMap): SmartTextVectorizerModel.this.type

    Makes a copy of the instance with new parameters; several methods in Spark internals call it. The default implementation finds the constructor and makes a copy for any class, as long as all constructor params are vals (this is why type tags are written as implicit vals in base classes).

    Note: the convention in Spark is to have the uid be a constructor argument, so that copies will share a uid with the original (developers should follow this convention).

    extra

    new parameters to add to the instance

    returns

    a new instance with the same uid

    Definition Classes
    OpPipelineStageBase → Params
  16. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  17. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  18. final val defaultLanguage: Param[String]

    Default language to assume in case autoDetectLanguage is disabled or fails to make a good enough prediction.

    Definition Classes
    LanguageDetectionParams
  19. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  20. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  21. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  22. def explainParams(): String

    Definition Classes
    Params
  23. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  24. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  25. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  26. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  27. def getAutoDetectLanguage: Boolean

    Definition Classes
    LanguageDetectionParams
  28. def getAutoDetectThreshold: Double

    Definition Classes
    LanguageDetectionParams
  29. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  30. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  31. def getDefaultLanguage: Language

    Definition Classes
    LanguageDetectionParams
  32. final def getInputFeature[T <: FeatureType](i: Int): Option[FeatureLike[T]]

    Gets an input feature. Note: this method is NOT safe to use outside the driver; use the getTransientFeature method instead.

    returns

    maybe an input feature

    Definition Classes
    InputParams
    Exceptions thrown

    NoSuchElementException if the features are not set

    RuntimeException in case one of the features is null

  33. final def getInputFeatures(): Array[OPFeature]

    Gets the input features. Note: this method is NOT safe to use outside the driver; use the getTransientFeatures method instead.

    returns

    array of features

    Definition Classes
    InputParams
    Exceptions thrown

    NoSuchElementException if the features are not set

    RuntimeException in case one of the features is null

  34. final def getInputSchema(): StructType

    Definition Classes
    OpPipelineStageParams
  35. final def getMetadata(): Metadata

    Definition Classes
    OpPipelineStageParams
  36. def getMinTokenLength: Int

    Definition Classes
    TextTokenizerParams
  37. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  38. def getOutput(): FeatureLike[OPVector]

    Output features that will be created by this stage

    returns

    feature of type OutputFeatures

    Definition Classes
    OpPipelineStageN → OpPipelineStageBase
  39. final def getOutputFeatureName: String

    Name of output feature (i.e. column created by this stage)

    Definition Classes
    OpPipelineStage
  40. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  41. def getToLowercase: Boolean

    Definition Classes
    TextTokenizerParams
  42. final def getTransientFeature(i: Int): Option[TransientFeature]

    Gets an input feature at index i

    i

    input index

    returns

    maybe an input feature

    Definition Classes
    InputParams
  43. final def getTransientFeatures(): Array[TransientFeature]

    Gets the input features

    returns

    input features

    Definition Classes
    InputParams
  44. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  45. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  46. def hasParent: Boolean

    Definition Classes
    Model
  47. def hash[T <: OPCollection](in: Seq[T], features: Array[TransientFeature], params: HashingFunctionParams): OPVector

    Hashes the input sequence of values into an OPVector using the supplied hashing params

    Attributes
    protected
    Definition Classes
    HashingFun
  48. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  49. def hashingTF(params: HashingFunctionParams): HashingTF

    HashingTF instance

    Attributes
    protected
    Definition Classes
    HashingFun
  50. final def inN: Array[TransientFeature]

    Attributes
    protected
    Definition Classes
    HasInN
  51. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  52. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  53. final def inputAsArray(in: InputFeatures): Array[OPFeature]

    Function to convert InputFeatures to an Array of FeatureLike

    returns

    an Array of FeatureLike

    Definition Classes
    OpPipelineStageN → InputParams
  54. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  55. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  56. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  57. def isSharedHashSpace(p: HashingFunctionParams, numFeatures: Option[Int] = None): Boolean

    Determines whether the transformer should use a shared hash space for all features

    returns

    true if the shared hash space is to be used, false otherwise

    Attributes
    protected
    Definition Classes
    HashingFun
  58. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  59. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  60. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  61. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  62. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  63. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  64. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  65. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  66. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  67. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  68. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  69. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  70. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  71. def makeVectorColumnMetadata(features: Array[TransientFeature], params: HashingFunctionParams): Array[OpVectorColumnMetadata]

    Attributes
    protected
    Definition Classes
    HashingFun
  72. def makeVectorMetadata(features: Array[TransientFeature], params: HashingFunctionParams, outputName: String): OpVectorMetadata

    Attributes
    protected
    Definition Classes
    HashingFun
  73. final val minTokenLength: IntParam

    Minimum token length, >= 1.

    Definition Classes
    TextTokenizerParams
  74. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  75. final def notify(): Unit

    Definition Classes
    AnyRef
  76. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  77. def onGetMetadata(): Unit

    Function to be called on getMetadata

    Attributes
    protected
    Definition Classes
    OpPipelineStageParams
  78. def onSetInput(): Unit

    Function to be called on setInput

    Attributes
    protected
    Definition Classes
    OpPipelineStageBase
  79. val operationName: String

    Unique name of the operation this stage performs

    Definition Classes
    SequenceModel → OpPipelineStageBase
  80. final def outputAsArray(out: OutputFeatures): Array[OPFeature]

    Function to convert OutputFeatures to an Array of FeatureLike

    returns

    an Array of FeatureLike

    Definition Classes
    OpPipelineStage → OpPipelineStageBase
  81. def outputFeatureUid: String

    Attributes
    protected[com.salesforce.op]
    Definition Classes
    OpPipelineStageN → OpPipelineStage
  82. def outputIsResponse: Boolean

    Should the output feature be a response? Yes, if any of the input features are.

    returns

    true if the output feature should be a response

    Definition Classes
    OpPipelineStage
  83. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  84. var parent: Estimator[SequenceModel[T, OPVector]]

    Definition Classes
    Model
  85. def pivotFn(topValues: Seq[Seq[String]], shouldCleanText: Boolean, shouldTrackNulls: Boolean): (Seq[Text]) ⇒ OPVector

    Attributes
    protected
    Definition Classes
    OneHotModelFun
  86. def prepare[T <: OPCollection](el: T, shouldHashWithIndex: Boolean, shouldPrependFeatureName: Boolean, featureNameHash: Int): Iterable[Any]

    Function that prepares the input columns to be hashed. Note that the MurmurHash3 hashing algorithm is only defined for primitive types, so tuples need to be converted to strings. MultiPickList sets are hashed as is, since there is no meaningful order in the selected choices. Lists and vectors can be hashed with or without their indices, since order may be important. Maps are hashed as (key,value) strings.

    el

    element we are hashing (e.g. an OPList, OPMap, etc.)

    returns

    an Iterable object corresponding to the hashed element

    Attributes
    protected
    Definition Classes
    HashingFun
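
    The preparation rules described for prepare can be sketched in plain Scala as follows. This is an illustration under stated assumptions, not the library's code: Coll, SetColl, ListColl, MapColl and PrepareSketch are hypothetical stand-ins for the OPCollection types, and the way the feature name hash is prepended (an underscore-joined string) is a guess made for the example.

    ```scala
    // Hypothetical stand-ins for OPCollection subtypes; not the TransmogrifAI classes.
    sealed trait Coll
    final case class SetColl(values: Set[String]) extends Coll
    final case class ListColl(values: Seq[String]) extends Coll
    final case class MapColl(values: Map[String, String]) extends Coll

    object PrepareSketch {
      def prepare(
        el: Coll,
        shouldHashWithIndex: Boolean,
        shouldPrependFeatureName: Boolean,
        featureNameHash: Int
      ): Iterable[Any] = {
        val prepared: Iterable[Any] = el match {
          // Sets carry no meaningful order, so their values are passed through as is
          case SetColl(vs) => vs
          // Lists may keep their position: each (value, index) tuple becomes a string
          case ListColl(vs) if shouldHashWithIndex => vs.zipWithIndex.map(_.toString)
          case ListColl(vs) => vs
          // Maps are turned into "(key,value)" strings
          case MapColl(kvs) => kvs.map { case (k, v) => s"($k,$v)" }
        }
        // Assumed convention: prefix each element with the feature name hash
        if (shouldPrependFeatureName) prepared.map(v => s"${featureNameHash}_$v") else prepared
      }
    }
    ```

    Converting tuples to strings up front means every element handed to the hashing function is a primitive or a string, which is the constraint the description above refers to.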
  87. def save(path: String): Unit

    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  88. final def set(paramPair: ParamPair[_]): SmartTextVectorizerModel.this.type

    Attributes
    protected
    Definition Classes
    Params
  89. final def set(param: String, value: Any): SmartTextVectorizerModel.this.type

    Attributes
    protected
    Definition Classes
    Params
  90. final def set[T](param: Param[T], value: T): SmartTextVectorizerModel.this.type

    Definition Classes
    Params
  91. def setAutoDetectLanguage(value: Boolean): SmartTextVectorizerModel.this.type

    Definition Classes
    LanguageDetectionParams
  92. def setAutoDetectThreshold(value: Double): SmartTextVectorizerModel.this.type

    Definition Classes
    LanguageDetectionParams
  93. final def setDefault(paramPairs: ParamPair[_]*): SmartTextVectorizerModel.this.type

    Attributes
    protected
    Definition Classes
    Params
  94. final def setDefault[T](param: Param[T], value: T): SmartTextVectorizerModel.this.type

    Attributes
    protected
    Definition Classes
    Params
  95. def setDefaultLanguage(value: Language): SmartTextVectorizerModel.this.type

    Definition Classes
    LanguageDetectionParams
  96. final def setInput(features: FeatureLike[T]*): SmartTextVectorizerModel.this.type

    Definition Classes
    OpPipelineStageN
  97. final def setInput(features: InputFeatures): SmartTextVectorizerModel.this.type

    Input features that will be used by the stage

    returns

    feature of type InputFeatures

    Definition Classes
    OpPipelineStageBase
  98. final def setInputFeatures[S <: OPFeature](features: Array[S]): SmartTextVectorizerModel.this.type

    Sets input features

    S

    feature like type

    features

    array of input features

    returns

    this stage

    Attributes
    protected
    Definition Classes
    InputParams
  99. final def setMetadata(m: Metadata): SmartTextVectorizerModel.this.type

    Definition Classes
    OpPipelineStageParams
  100. def setMinTokenLength(value: Int): SmartTextVectorizerModel.this.type

    Definition Classes
    TextTokenizerParams
  101. def setOutputFeatureName(name: String): SmartTextVectorizerModel.this.type

    Definition Classes
    OpPipelineStage
  102. def setParent(parent: Estimator[SequenceModel[T, OPVector]]): SequenceModel[T, OPVector]

    Definition Classes
    Model
  103. def setToLowercase(value: Boolean): SmartTextVectorizerModel.this.type

    Definition Classes
    TextTokenizerParams
  104. final def stageName: String

    Stage unique name consisting of the stage operation name and uid

    returns

    stage name

    Definition Classes
    OpPipelineStageBase
  105. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  106. final val toLowercase: BooleanParam

    Indicates whether to convert all characters to lowercase before tokenizing.

    Definition Classes
    TextTokenizerParams
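
    How the toLowercase and minTokenLength parameters interact can be illustrated with a simplified tokenizer. This is only a sketch: the stage itself tokenizes through a TextAnalyzer (see the tokenize signature), whereas the hypothetical TokenizeSketch below swaps in a plain regex split on non-word characters.

    ```scala
    // Hypothetical simplified tokenizer; a regex split stands in for the real analyzer.
    object TokenizeSketch {
      def tokenize(text: String, toLowercase: Boolean, minTokenLength: Int): Seq[String] = {
        require(minTokenLength >= 1, "minTokenLength must be >= 1")
        // Lowercasing happens before tokenizing, as the toLowercase description states
        val normalized = if (toLowercase) text.toLowerCase else text
        // Split on runs of non-word characters, then drop tokens that are too short
        normalized.split("\\W+").toSeq.filter(_.length >= minTokenLength)
      }
    }
    ```

    For example, TokenizeSketch.tokenize("Hello, a World", toLowercase = true, minTokenLength = 2) drops the one-letter token and lowercases the rest.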
  107. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  108. def tokenize(text: Text, languageDetector: LanguageDetector = TextTokenizer.LanguageDetector, analyzer: TextAnalyzer = TextTokenizer.Analyzer): TextTokenizerResult

    Definition Classes
    TextTokenizerParams
  109. def transform(dataset: Dataset[_]): DataFrame

    Spark operation on the dataset that produces a new output feature column using the defined function

    dataset

    input data for this stage

    returns

    a new dataset containing a column for the transformed feature

    Definition Classes
    OpTransformerN → Transformer
  110. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame

    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  111. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame

    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  112. def transformFn: (Seq[Text]) ⇒ OPVector

    Function used to convert input to output

    Definition Classes
    SmartTextVectorizerModel → OpTransformerN
  113. lazy val transformKeyValue: (KeyValue) ⇒ Any

    Creates a transform function to transform any key/value to a value

    returns

    a transform function to transform any key/value to a value

    Definition Classes
    OpTransformerN → OpTransformer
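
    The KeyValue alias makes the transform independent of the container the values come from: anything that can look up a value by feature name will do. The sketch below shows the idea in plain Scala. KeyValueSketch, the feature names "subject" and "body", and the string-joining transform are hypothetical; the real transform produces the stage's output value.

    ```scala
    // Hypothetical illustration of a key -> value lookup driving a transform.
    object KeyValueSketch {
      // Mirrors the documented alias: type KeyValue = (String) => Any
      type KeyValue = String => Any

      // Made-up transform: looks up two feature names and joins their values
      def transformKeyValue(kv: KeyValue): Any =
        Seq("subject", "body").map(kv).mkString(" ")

      // A Map is already a function from key to value, so it can serve as a KeyValue
      val fromMap: KeyValue = Map[String, Any]("subject" -> "hi", "body" -> "there")
    }
    ```

    Because the lookup is just a function, the same transform can be driven by a Map or by any other name-based accessor, which is what lets transformMap and transformRow share one underlying function.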
  114. def transformMap: (Map[String, Any]) ⇒ Any

    Creates a transform function to transform a Map to a value

    returns

    a transform function to transform Map to a value

    Definition Classes
    OpTransformer
  115. def transformRow: (Row) ⇒ Any

    Creates a transform function to transform a Row to a value

    returns

    a transform function to transform Row to a value

    Definition Classes
    OpTransformer
  116. final def transformSchema(schema: StructType): StructType

    Translates the input and output features into the Spark schema checks and changes that will occur on the underlying data frame

    schema

    schema of the input data frame

    returns

    a new schema with the output features added

    Definition Classes
    OpPipelineStageBase
  117. def transformSchema(schema: StructType, logging: Boolean): StructType

    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  118. implicit val tti: scala.reflect.api.JavaUniverse.TypeTag[T]

    type tag for input

    Definition Classes
    SequenceModel → OpTransformerN
  119. implicit val tto: scala.reflect.api.JavaUniverse.TypeTag[OPVector]

    type tag for output

    Definition Classes
    SequenceModel → OpPipelineStageN
  120. implicit val ttov: scala.reflect.api.JavaUniverse.TypeTag[Value]

    type tag for output value

    Definition Classes
    SequenceModel → OpPipelineStageN
  121. val uid: String

    uid for instance

    Definition Classes
    SequenceModel → Identifiable
  122. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  123. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  124. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  125. final def write: MLWriter

    Definition Classes
    OpPipelineStageBase → MLWritable
