One Hot Encoder

Maps a column of label indices to a column of binary vectors, with at most a single one-value.

Input

Takes a DataFrame as input and transforms it into another DataFrame.

Output

The output DataFrame contains a new column holding the binary vectors that the label indices map to, each with at most a single one-value.

Type

ml-transformer

Class

fire.nodes.ml.NodeOneHotEncoder

Fields

Name	Title	Description
inputCols	Input Columns	Input columns for encoding
outputCols	Output Columns	Output columns

Details

One Hot Encoder Node Details

One-hot encoding converts categorical data (such as nominal data) into numerical features of a dataset. It is often a required preprocessing step, since most machine learning models require numerical input.

The One Hot Encoder node maps a column of label indices to a column of binary vectors, with at most a single one-value. This encoding allows algorithms that expect continuous features, such as Logistic Regression, to use categorical features.
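The mapping described above can be sketched in plain Python. This is a minimal illustration, not the node's implementation: the function name `one_hot` and the `drop_last` flag are hypothetical. Dropping the last category mirrors the default of Spark ML's OneHotEncoder and is assumed here because it shows how a vector can contain no one-value at all.

```python
def one_hot(index, num_categories, drop_last=True):
    """Return a dense binary vector for a single label index.

    Assumption: with drop_last=True the last category is encoded as the
    all-zeros vector, so the result has "at most" a single one-value.
    """
    if not 0 <= index < num_categories:
        raise ValueError(f"label index {index} is outside [0, {num_categories})")
    size = num_categories - 1 if drop_last else num_categories
    vector = [0.0] * size
    if index < size:
        # Every category except a dropped last one gets a single 1.0.
        vector[index] = 1.0
    return vector


print(one_hot(1, 3))                   # [0.0, 1.0]
print(one_hot(2, 3))                   # [0.0, 0.0]
print(one_hot(2, 3, drop_last=False))  # [0.0, 0.0, 1.0]
```

Each category gets its own vector position, so a model that expects continuous features does not read an ordering into the label indices.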

Input Parameters

OUTPUT STORAGE LEVEL : Keep this as DEFAULT.
VARIABLES : Add the columns to be one-hot encoded.
Input Columns : Select the categorical field in the input schema for one-hot encoding.
Output Columns : Set the name of the field in the output schema that contains the binary vector result.

Examples

One Hot Encoder Node Example

Assume that we have the following DataFrame, where column v1 has an integer data type:

 v1 |
----|
 1  |
 2  |
 3  |
 4  |
 5  |

Selecting v1 as the Input Columns value and vecOut as the Output Columns name produces the following output DataFrame:

 v1 |     vecOut    |
----|---------------|
 1  | (5,[1],[1.0]) |
 2  | (5,[2],[1.0]) |
 3  | (5,[3],[1.0]) |
 4  | (5,[4],[1.0]) |
 5  | (5,[],[])     |
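
The vecOut values are written in sparse vector notation: (size, [indices of non-zero entries], [values at those indices]). As a reading aid, the sketch below expands each sparse triple from the table into a dense list. The helper name `to_dense` is hypothetical and is not part of the node.

```python
def to_dense(size, indices, values):
    """Expand a sparse (size, indices, values) triple into a dense list."""
    dense = [0.0] * size
    for position, value in zip(indices, values):
        dense[position] = value
    return dense


# The vecOut values from the example table above.
vec_out = [
    (5, [1], [1.0]),
    (5, [2], [1.0]),
    (5, [3], [1.0]),
    (5, [4], [1.0]),
    (5, [], []),
]

for size, indices, values in vec_out:
    print(to_dense(size, indices, values))
# [0.0, 1.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 1.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 1.0, 0.0]
# [0.0, 0.0, 0.0, 0.0, 1.0]
# [0.0, 0.0, 0.0, 0.0, 0.0]
```

The last row, (5,[],[]), has no non-zero entries, which is the case covered by the phrase "at most a single one-value".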