String Indexer
==============

StringIndexer encodes a string column of labels to a column of label indices.

Input
-----

It takes in a DataFrame and transforms it to another DataFrame.

Output
------

It adds a new column to the incoming DataFrame. The new column contains the label indices that encode the string column of labels.

Type
----

ml-transformer

Class
-----

fire.nodes.etl.NodeStringIndexer

Fields
------

.. list-table::
   :widths: 10 5 10
   :header-rows: 1

   * - Name
     - Title
     - Description
   * - inputCol
     - Input Columns
     - Input column
   * - outputCol
     - Output Column
     - Output column

Details
-------

String Indexer Node Details
+++++++++++++++++++++++++++

The String Indexer Node encodes a string column of labels to a column of label indices.

It takes in a DataFrame and transforms it to another DataFrame by adding a new column that holds the label index for each value of the string column.

It takes three parameters: handleInvalid, which controls how invalid entries are handled, inputCol, the input column name, and outputCol, the output column name.

Input Parameters
++++++++++++++++

* HANDLE INVALID: Select whether to skip invalid entries or throw an error on them.
* INPUT COLUMN: Select the column to encode.
* OUTPUT COLUMN: The name of the output column that holds the encoded indices.

Examples
--------

String Indexer Node Example
+++++++++++++++++++++++++++

Consider the below **String Indexer** output for the **color** column:

.. list-table::
   :widths: 5 10 10
   :header-rows: 1

   * - id
     - color
     - encoded_color
   * - 0
     - red
     - 2
   * - 1
     - green
     - 1
   * - 2
     - blue
     - 0
   * - 3
     - purple
     - 3

In this example, the input column is color and the output column is encoded_color. The String Indexer encodes the color column to a column of label indices. handleInvalid is set to skip, so any invalid entries are skipped.
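
The encoding and the skip/error handling described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the node's implementation: the helper names ``fit_string_indexer`` and ``transform`` are hypothetical, the sample data is made up, and the ordering rule (most frequent label gets index 0, ties broken alphabetically) is an assumption of the sketch, so it is not meant to reproduce the indices in the table above.

```python
from collections import Counter


def fit_string_indexer(labels):
    """Map each distinct label to an index (hypothetical helper).

    Assumption of this sketch: the most frequent label gets index 0,
    and ties are broken alphabetically.
    """
    counts = Counter(labels)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: index for index, label in enumerate(ordered)}


def transform(rows, mapping, input_col, output_col, handle_invalid="error"):
    """Return new rows (plain dicts) with output_col added (hypothetical helper)."""
    result = []
    for row in rows:
        label = row[input_col]
        if label not in mapping:
            if handle_invalid == "skip":
                continue  # leave out rows whose label has no index
            raise ValueError(f"Invalid label: {label!r}")
        # Copy the row and add the encoded column; the input row is untouched.
        result.append({**row, output_col: mapping[label]})
    return result


# Made-up sample data: "red" occurs 3 times, "green" twice, "blue" once.
training_labels = ["red", "green", "red", "blue", "red", "green"]
mapping = fit_string_indexer(training_labels)

rows = [
    {"id": 0, "color": "red"},
    {"id": 1, "color": "green"},
    {"id": 2, "color": "purple"},  # has no index in the mapping
]
encoded = transform(rows, mapping, "color", "encoded_color", handle_invalid="skip")
print(encoded)
```

With ``handle_invalid="skip"`` the row containing ``purple`` is left out of the result; with ``handle_invalid="error"`` the same call raises a ``ValueError`` instead.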