Feature Selection With Importance¶

Output¶

Passes the input DataFrame through to the output unchanged and displays the feature importance in table format.

Type¶

transform

Class¶

fire.nodes.featureselection.NodeFeatureSelectionWithImportance

Fields¶

Name	Title	Description
targetCol	TargetCol	Target column to be used when selecting the features
featureCols	Feature Columns	The list of columns for which to calculate the feature importance
mlType	MLType	Type of the target column: Regression or Classification

Details¶

Feature Selection With Importance Node Details¶

This node uses a Random Forest, a powerful model for both regression and classification. A fitted Random Forest also provides its own estimate of feature importance, which can be plotted and used to select the most informative set of features.
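As a rough illustration of how a Random Forest yields feature importances, the sketch below uses scikit-learn as a stand-in; it is not the node's own implementation. The data is synthetic, and the column names are borrowed from the housing example on this page. The 0.05 cut-off used for selection is an arbitrary, hypothetical value.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (an assumption, not real housing data):
# price depends strongly on bathrms, weakly on stories, not at all on bedrooms.
rng = np.random.default_rng(0)
n_rows = 500
feature_cols = ["bathrms", "stories", "bedrooms"]
X = rng.integers(1, 5, size=(n_rows, len(feature_cols))).astype(float)
price = 10.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.5, size=n_rows)

# Fit a Random Forest regressor (the Regression case of mlType).
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, price)

# Impurity-based importances, one per feature, normalized to sum to 1.
importances = model.feature_importances_

# Rank features from most to least important and print them as a small table.
ranked = sorted(zip(feature_cols, importances), key=lambda p: p[1], reverse=True)
for name, score in ranked:
    print(f"{name:10s} {score:.3f}")

# Keep only features whose importance reaches a hypothetical threshold.
threshold = 0.05
selected = [name for name, score in ranked if score >= threshold]
```

For a Classification target, `RandomForestClassifier` from the same module exposes the same `feature_importances_` attribute.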

Input Parameters¶

OUTPUT STORAGE LEVEL : Keep this as DEFAULT.
TARGETCOL : Select the target variable whose relationship to the feature columns we want to explore.
FEATURE COLUMNS :
Available : A list of numeric feature columns derived from the input dataframe schema.
Selected : A list of columns for which the node will compute feature importance with respect to the TARGETCOL.
MLTYPE : Select the type of machine learning model to fit to the input data: Regression or Classification.

Examples¶

Feature Selection With Importance Node Example¶

For a given dataframe having the below housing schema:

price  | bathrms | stories | bedrooms |
Double | Double  | Double  | Double   |
---------------------------------------

We can select price as the target column and compute how important the feature columns bathrms, stories and bedrooms are for predicting it with a Regression model.

This will yield two output sections:

A Feature Importance table showing the importance of each feature column for predicting the target column
The input dataframe, displayed in tabular format