Data Wrangling

This node creates a new DataFrame by applying each of the specified rules to the input DataFrame.

Input

It takes a DataFrame as input.

Output

It produces the output DataFrame by applying the provided data wrangling rules.

Type

transform

Class

fire.nodes.etl.NodeDataWrangling

Fields

Name: rules

Title: Rules

Description: Rules to be applied to columns and rows

Details

Rename one column to another

rename col:c1 to c2;

Drop Column

drop col:col4
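
To make the effect of the rename and drop rules concrete, the sketch below mimics them on plain Python data. It only illustrates the intended semantics and is not the node's implementation: rows are modeled as a list of dictionaries, and the helper names `rename_column` and `drop_columns` are hypothetical.

```python
# Illustration only: models a DataFrame as a list of dicts (one dict per row).
# rename_column and drop_columns are hypothetical helpers, not part of the node.

def rename_column(rows, old, new):
    """Return new rows with column `old` renamed to `new`, keeping column order."""
    return [{(new if key == old else key): value for key, value in row.items()}
            for row in rows]

def drop_columns(rows, names):
    """Return new rows without the listed columns."""
    to_drop = set(names)
    return [{key: value for key, value in row.items() if key not in to_drop}
            for row in rows]

rows = [{"c1": 1, "col4": "x"}, {"c1": 2, "col4": "y"}]
renamed = rename_column(rows, "c1", "c2")   # mirrors: rename col:c1 to c2;
dropped = drop_columns(renamed, ["col4"])   # mirrors: drop col:col4
print(dropped)
```

Both helpers return new rows instead of modifying their input, which matches the node's description of creating a new DataFrame.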

Delete using a condition on a column

delete col:col3 > 44
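
The conditional delete can be pictured as a filter. The sketch below assumes the rule removes every row whose col3 value is greater than 44; that reading is an assumption, and `delete_rows` is a hypothetical helper written for illustration.

```python
# Illustration only: rows are dicts; delete_rows is a hypothetical helper.
# Assumption: the rule removes the rows for which the condition is true.

def delete_rows(rows, condition):
    """Return the rows for which `condition(row)` is false."""
    return [row for row in rows if not condition(row)]

rows = [{"col3": 10}, {"col3": 44}, {"col3": 45}]
kept = delete_rows(rows, lambda row: row["col3"] > 44)
print(kept)
```

Under this reading the comparison is strict, so a row with col3 equal to 44 is kept.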

Substring col:col2 0,3

Gets the substring between positions 0 and 3 of the values in column col2.
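
As a rough picture of the substring rule, the sketch below uses Python slicing. It assumes a zero-based start and an exclusive end index, which may differ from the node's actual indexing convention; `substring_column` is a hypothetical helper.

```python
# Illustration only: rows are dicts; substring_column is a hypothetical helper.
# Assumption: start is zero-based and end is exclusive, as in Python slicing.

def substring_column(rows, column, start, end):
    """Return new rows where `column` is replaced by its slice [start:end]."""
    return [{**row, column: row[column][start:end]} for row in rows]

rows = [{"col2": "abcdef"}, {"col2": "xy"}]
result = substring_column(rows, "col2", 0, 3)
print(result)
```

Values shorter than the requested range are returned unchanged by the slice, so no error is raised for short strings.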

Trim Values: Removes leading and trailing whitespace from a string value.

set col:Name value: trim(Name)

Sets the new value of the Name column to trim(Name).
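
The trim rule behaves like a per-row assignment. The sketch below illustrates this with Python's `str.strip`, assuming trim removes the same leading and trailing whitespace characters; `set_column` is a hypothetical helper, not the node's API.

```python
# Illustration only: rows are dicts; set_column is a hypothetical helper.

def set_column(rows, column, compute):
    """Return new rows where `column` is set to `compute(row)`."""
    return [{**row, column: compute(row)} for row in rows]

rows = [{"Name": "  Ada "}, {"Name": "Grace"}]
# Mirrors: set col:Name value: trim(Name)
trimmed = set_column(rows, "Name", lambda row: row["Name"].strip())
print(trimmed)
```

Whitespace inside the value is left untouched; only the ends are trimmed.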

Examples

Example:

Let’s assume we have a DataFrame with the following columns: id, column1, column2, column3, column4, column5, column6, column7, column8.

We want to:

Rename column1 to col2.

Drop columns column1 and column2.

Delete rows where column3 is greater than 100.

Delete rows older than 90 days.

Convert the values in column4 to uppercase.

Convert the values in column5 to lowercase.

Derive a new column avgScore by calculating the average of column6 and column7.

Derive a new column zipCodeSubstring by extracting the last 3 characters from the column8.

Configuration:

rename col:c1 to c2;

drop col:c1,c2;

delete row:(column3 > 100);

delete row:(dateAge > 90);

set col:c4 value:UPPER(c4);

set col:c5 value:LOWER(c5);

derive value: AVERAGE(Score) avgScore;

derive value: SUBSTRING(ZipCode,3,5)

Node Execution:

The node will apply the specified rules to the DataFrame, resulting in a transformed DataFrame with the desired modifications.
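
The eight goals listed in the example can be walked through in plain Python to show the expected shape of the result. This is a sketch of the intended outcome under stated assumptions, not the node's implementation: rows are dictionaries, `wrangle` is a hypothetical function, and `dateAge` (the row's age in days) is an assumed column borrowed from the configuration, because the example DataFrame lists no date column.

```python
# Illustration only: a plain-Python walk-through of the goals in the example.
# Assumption: `dateAge` holds each row's age in days (not in the listed columns).

def wrangle(rows):
    """Apply the example's goals to a list of row dicts and return new rows."""
    out = []
    for row in rows:
        # Delete rows where column3 > 100 or the row is older than 90 days.
        # Filtering first gives the same result as filtering after the rename.
        if row["column3"] > 100 or row["dateAge"] > 90:
            continue
        new = dict(row)
        # Rename column1 to col2.
        new["col2"] = new.pop("column1")
        # Drop column1 (already renamed away, so nothing to remove) and column2.
        new.pop("column1", None)
        new.pop("column2", None)
        # Convert column4 to uppercase and column5 to lowercase.
        new["column4"] = new["column4"].upper()
        new["column5"] = new["column5"].lower()
        # Derive avgScore as the mean of column6 and column7.
        new["avgScore"] = (new["column6"] + new["column7"]) / 2
        # Derive zipCodeSubstring from the last 3 characters of column8.
        new["zipCodeSubstring"] = new["column8"][-3:]
        out.append(new)
    return out

rows = [
    {"id": 1, "column1": "a", "column2": "b", "column3": 50, "column4": "abc",
     "column5": "DEF", "column6": 80, "column7": 90, "column8": "94043",
     "dateAge": 10},
    {"id": 2, "column1": "c", "column2": "d", "column3": 150, "column4": "x",
     "column5": "Y", "column6": 1, "column7": 2, "column8": "10001",
     "dateAge": 5},
]
result = wrangle(rows)
print(result)
```

In this sample, the second row is removed because its column3 value exceeds 100, and the surviving row gains the col2, avgScore, and zipCodeSubstring columns.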