Feature Selection With Importance¶
Output¶
Passes the input DataFrame through to the output and displays the feature importance in table format.
Type¶
transform
Class¶
fire.nodes.featureselection.NodeFeatureSelectionWithImportance
Fields¶
| Name | Title | Description |
|---|---|---|
| targetCol | TargetCol | Target column to be used when selecting the features |
| featureCols | Feature Columns | The list of columns for which to calculate the feature importance |
| mlType | MLType | Target column type, Regression or Classification |
Details¶
Feature Selection With Importance Node Details¶
This node uses Random Forest, a powerful model for both regression and classification. A trained Random Forest also provides its own feature importance scores, which can be plotted and used to select the most informative set of features.
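As a rough illustration of the idea, the sketch below trains a Random Forest and reads its feature importance scores. It uses scikit-learn as a stand-in rather than the node's own implementation, and the data is synthetic: the column names are borrowed from the housing example on this page, while the coefficients and model settings are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption): not taken from any real housing dataset.
rng = np.random.default_rng(0)
n_rows = 500
feature_cols = ["bathrms", "stories", "bedrooms"]
X = rng.integers(1, 5, size=(n_rows, len(feature_cols))).astype(float)

# Hypothetical price: driven mostly by bathrms, a little by stories, not by bedrooms.
price = 50000.0 * X[:, 0] + 10000.0 * X[:, 1] + rng.normal(0.0, 5000.0, n_rows)

# Illustrative settings, not the node's actual configuration.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, price)

# feature_importances_ holds one non-negative score per feature; the scores sum to 1.
importances = dict(zip(feature_cols, model.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

For a Classification target, scikit-learn's `RandomForestClassifier` exposes the same `feature_importances_` attribute, so the rest of the sketch would stay the same.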
Input Parameters¶
OUTPUT STORAGE LEVEL : Keep this as DEFAULT.
TARGETCOL : Select the target variable whose relationship with the feature columns is to be explored.
FEATURE COLUMNS :
Available : A list of numeric feature columns derived from the input dataframe schema.
Selected : A list of columns for which the node will compute feature importance with respect to the TARGETCOL.
MLTYPE : Select the type of machine learning model for the input data: Regression or Classification.
Examples¶
Feature Selection With Importance Node Example¶
For a given dataframe having the below housing schema:
price  | bathrms | stories | bedrooms |
Double | Double  | Double  | Double   |
---------------------------------------
We can select price as the target column and explore how important each of the feature columns bathrms, stories and bedrooms is for predicting it with a Regression model.
This will yield two output sections:
A Feature Importance table showing, as a percentage, the importance of each feature column for predicting the target column
The input dataframe in tabular format
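One plausible way to picture the Feature Importance table is as raw importance scores rescaled to percentages and sorted from most to least important. The sketch below assumes this reading; the helper `importance_table` is hypothetical and the scores are placeholders, not output from the node.

```python
def importance_table(scores):
    """Return (feature, percentage) rows sorted from most to least important."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    rows = [(name, 100.0 * value / total) for name, value in scores.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder scores (assumption), chosen only to show the layout.
example_scores = {"bathrms": 0.5, "stories": 0.3, "bedrooms": 0.2}

for feature, pct in importance_table(example_scores):
    print(f"{feature:<10} {pct:6.2f}%")
```

Features near the bottom of such a table are the natural candidates to drop when narrowing down the feature set.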