Multi Regex Extractor

This node extracts patterns from the input columns.

Input

This type of node takes in a DataFrame and transforms it into another DataFrame.

Output

This node extracts patterns from the input columns as specified.

Type

transform

Class

fire.nodes.etl.NodeMultiRegexExtractor

Fields

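Name              |    Title               |    Description
-------------------------------------------------------------------------------------
inputColNames     |    InputColumnsName    |    Input columns to extract patterns from
outputColNames    |    OuputColumnsName    |    Names of the output columns
patterns          |    Patterns            |    Regular expression patterns to extract from the input columns
groups            |    Groups              |    Regular expression group number defining which portion of the matching string is returned; capture groups are numbered starting with 1, and 0 (as used in the example below) returns the entire match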
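
The group number can be illustrated with a minimal sketch in Python's `re` module. It assumes the node's regex engine numbers groups the same way (0 for the whole match, 1 and up for parenthesised groups); the two-group pattern here is hypothetical and is not part of the node's configuration.

```python
import re

# Hypothetical pattern with two capture groups:
# a 3-character code, a space, then a run of digits.
pattern = re.compile(r"(\w{3}) (\d+)")
match = pattern.search("USD 200000.00")

whole_match = match.group(0)  # group 0: the entire matched text
currency = match.group(1)     # group 1: first parenthesised group
amount = match.group(2)       # group 2: second parenthesised group

print(whole_match, currency, amount)
```

A pattern without parentheses has no numbered capture groups, so only group 0 is available for it.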

Details

This node extracts data from columns in the incoming DataFrame based on the provided patterns and adds the extracted values as new columns in the outgoing DataFrame.

Examples

The incoming DataFrame has the following rows:

CUST_CD    |    CUST_NAME    |    AGE    |    DATE_OF_JOINING    |    SALARY
-------------------------------------------------------------------------------------
C01        |    MATT         |    50     |    12-02-2002         |    USD 200000.00
C02        |    LISA         |    45     |    15-11-2020         |    GBP 100000.00
C03        |    ROBIN        |    30     |    10-10-2015         |    EUR 15000.00
C04        |    MARCUS       |    35     |    01-01-2021         |    AUD 350000.00

If the MultiRegexExtractor node is configured to extract data with the patterns below:

INPUTCOLUMNSNAME    |    OUPUTCOLUMNSNAME    |    PATTERNS    |    GROUPS
---------------------------------------------------------------------------
CUST_CD             |    Cust_ID             |    \d{1,2}     |    0
DATE_OF_JOINING     |    DOJ_Year            |    \d{4}       |    0
SALARY              |    Currency            |    \w{3}       |    0

then the outgoing DataFrame is created as follows:

CUST_CD    |    CUST_NAME    |    AGE    |    DATE_OF_JOINING    |    SALARY         |    Cust_ID    |    DOJ_Year    |    Currency
------------------------------------------------------------------------------------------------------------------------------------
C01        |    MATT         |    50     |    12-02-2002         |    USD 200000.00  |    01         |    2002        |    USD
C02        |    LISA         |    45     |    15-11-2020         |    GBP 100000.00  |    02         |    2020        |    GBP
C03        |    ROBIN        |    30     |    10-10-2015         |    EUR 15000.00   |    03         |    2015        |    EUR
C04        |    MARCUS       |    35     |    01-01-2021         |    AUD 350000.00  |    04         |    2021        |    AUD
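
The example above can be reproduced outside the node with a short sketch in plain Python. This is not the node's implementation: `multi_regex_extract` is a hypothetical helper, rows are modelled as dictionaries rather than a DataFrame, only the columns used by the patterns are included, and returning an empty string when nothing matches is an assumption.

```python
import re

def multi_regex_extract(rows, rules):
    """Return copies of rows with one new column per rule.

    Each rule is (input_column, output_column, pattern, group).
    """
    compiled = [(inp, out, re.compile(pat), grp) for inp, out, pat, grp in rules]
    result = []
    for row in rows:
        new_row = dict(row)  # keep the original columns untouched
        for inp, out, regex, grp in compiled:
            m = regex.search(str(row[inp]))  # first match anywhere in the value
            # Assumption: a value with no match produces an empty string.
            new_row[out] = m.group(grp) if m else ""
        result.append(new_row)
    return result

rows = [
    {"CUST_CD": "C01", "DATE_OF_JOINING": "12-02-2002", "SALARY": "USD 200000.00"},
    {"CUST_CD": "C02", "DATE_OF_JOINING": "15-11-2020", "SALARY": "GBP 100000.00"},
    {"CUST_CD": "C03", "DATE_OF_JOINING": "10-10-2015", "SALARY": "EUR 15000.00"},
    {"CUST_CD": "C04", "DATE_OF_JOINING": "01-01-2021", "SALARY": "AUD 350000.00"},
]

# Same configuration as the example: group 0 returns the whole match.
rules = [
    ("CUST_CD", "Cust_ID", r"\d{1,2}", 0),
    ("DATE_OF_JOINING", "DOJ_Year", r"\d{4}", 0),
    ("SALARY", "Currency", r"\w{3}", 0),
]

extracted = multi_regex_extract(rows, rules)
for row in extracted:
    print(row["Cust_ID"], row["DOJ_Year"], row["Currency"])
```

Each pattern returns its first match in the value, which is why `\d{4}` picks the year from DATE_OF_JOINING: the day and month are only two digits long.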