Correlation¶
calculates the correlation between two series of data.
Input¶
It takes in a DataFrame and transforms it to another DataFrame
Output¶
The input DataFrame is passed along to the next Processors
Type¶
transform
Class¶
fire.nodes.ml.NodeCorrelation
Fields¶
Name |
Title |
Description |
|---|---|---|
title |
Title |
|
inputCols |
Input Column for Correlation |
Column Names to check correlation |
Details¶
Correlation Node Details¶
Correlation is to measure if two variables or two feature columns tend to move in together in same or opposite direction. The idea is to detect if one variable or feature column can be predicted by another variable or feature column.
The Correlation node uses the method of Pearson’s correlation for checking correlation between two continuous variables (or feature columns)
Input Parameters¶
OUTPUT STORAGE LEVEL : Keep this as DEFAULT.
TITLE : A short description to summarizes what the data depicts.
INPUT COLUMN FOR CORRELATION :
Available : A list of numeric columns derived from the input dataframe schema.
Selected : A list of numeric columns among which the correlation is to be predicted.
Examples¶
Correlation Node Example¶
For a given dataframe having the below schema:
Course | Amount | Discount|
(String)| Double | Double |
----------------------------
We can select the Amount and Discount fields for which we need to find the correlation.
This will yield three separate output sections:
A Correlation Table
A Correlation Matrix &
A sample data values of the input dataframe