Quantile Discretizer¶
QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical features.
Input¶
It takes in a DataFrame and transforms it to another DataFrame
Output¶
The output DataFrame contains a new column of binned categorical features.
Type¶
ml-estimator
Class¶
fire.nodes.ml.NodeQuantileDiscretizer
Fields¶
Name |
Title |
Description |
|---|---|---|
inputCol |
Input Column |
The Input column name |
outputCol |
Output Column |
Output column name |
numBuckets |
NumBuckets |
Maximum number of buckets (quantiles or categories) into which the data points are grouped. Must be >= 2. |
Details¶
Quantile Discretizer Node Details¶
The Quantile Discretizer Node is used to convert a column with continuous features to a column with binned categorical features. It takes in a DataFrame and transforms it to another DataFrame with a new column of binned categorical features.
It takes in the parameters inputCol, outputCol and numBuckets, which are the input column name, output column name and maximum number of buckets respectively. The input column should be in the format of Double.
Input Parameters¶
INPUT COLUMN: Select the required column for which discretization has to be done.
OUTPUT COLUMN: The name of the output column after discretization.
NUMBUCKETS : Maximum number of buckets (quantiles or categories) into which the data points are grouped. Must be >= 2.
Examples¶
Quantile Discretizer Node Example¶
Consider the below Quantile Discretizer output for the age column
id age age_bucket
0 20 1
1 25 1
2 30 2
3 35 2
In this example, the input column is age and the output column is age_bucket. The quantile discretizer takes the column age and converts it to age_bucket by grouping the data points into 2 buckets. The value of numBuckets is 2, so the data points are grouped into 2 buckets.