Multi Regex Extractor¶

This node is used to extract pattern from input columns

Input¶

This type of node takes in a DataFrame and transforms it to another DataFrame

Output¶

This node extract pattern from input columns as specified

Type¶

transform

Class¶

fire.nodes.etl.NodeMultiRegexExtractor

Fields¶

Name	Title	Description
inputColNames	InputColumnsName	Columns
outputColNames	OuputColumnsName	name of the output column
patterns	Patterns	patterns or regex to extract the input column name
groups	Groups	An regular expression group number starting with 1, defining which portion of the matching string will be returned

Details¶

This node extracts data from columns present in the incoming Dataframe based on provided pattern and add them as new columns in outgoing Dataframe.

Examples¶

Incoming Dataframe has following rows:

CUST_CD    |    CUST_NAME    |    AGE    |    DATE_OF_JOINING    |    SALARY
-------------------------------------------------------------------------------------
C01        |    MATT         |    50     |    12-02-2002         |    USD 200000.00
C02        |    LISA         |    45     |    15-11-2020         |    GBP 100000.00
C03        |    ROBIN        |    30     |    10-10-2015         |    EUR 15000.00
C04        |    MARCUS       |    35     |    01-01-2021         |    AUD 350000.00

If MultiRegexExtractor node is configured to extract data based on patterns as mentioned below:

INPUTCOLUMNSNAME    |    OUPUTCOLUMNSNAME    |    PATTERNS    |    GROUPS
---------------------------------------------------------------------------
CUST_CD             |    Cust_ID             |    \d{1,2}     |    0
DATE_OF_JOINING     |    DOJ_Year            |    \d{4}       |    0
SALARY              |    Currency            |    \w{3}       |    0

then outgoing Dataframe would be created as below:

CUST_CD    |    CUST_NAME    |    AGE    |    DATE_OF_JOINING    |    SALARY         |    Cust_ID    |    DOJ_Year    |    Currency
------------------------------------------------------------------------------------------------------------------------------------
C01        |    MATT         |    50     |    12-02-2002         |    USD 200000.00  |    01         |    2002        |    USD
C02        |    LISA         |    45     |    15-11-2020         |    GBP 100000.00  |    02         |    2020        |    GBP
C03        |    ROBIN        |    30     |    10-10-2015         |    EUR 15000.00   |    03         |    2015        |    EUR
C04        |    MARCUS       |    35     |    01-01-2021         |    AUD 350000.00  |    04         |    2021        |    AUD