Record ID¶
Adds a sequential Record ID column to the dataset, similar to the Alteryx Record ID Tool. Supports grouping, sorting, and formatted incremental IDs.
Input¶
Accepts a DataFrame as input.
Output¶
Returns a DataFrame with a new Record ID column added.
Type¶
transform
Class¶
fire.nodes.etl.NodeGenerateRecordID
Fields¶
| Name | Title | Description |
|---|---|---|
| outputCol | Output Column Name | Name of the column that will contain generated Record IDs. |
| startValue | Starting Value | The first value in the sequence. Default is 1. |
| dataType | Data Type | Select the data type of the Record ID column. |
| size | Size (String Only) | Applicable only when Data Type is String. Pads the value with leading zeros to match this length. |
| position | Column Position | Position of the new column in the output DataFrame. |
| generationScope | Record ID Generation Scope | Generate IDs across the entire dataset or restart numbering within each group. |
| groupByCols | Group By Columns | Columns to group by when generating Record IDs within groups. |
| orderByCols | Order Rows Within Each Group | Optional columns to define the order of rows within each group before numbering. Supports ascending or descending order. |
Details¶
Record ID Node Details¶
The Record ID Node generates a sequential identifier for each row in a DataFrame, similar to the Alteryx Record ID Tool. IDs are unique across the entire dataset or, when numbering restarts per group, unique within each group. The node supports grouping, sorting, and formatting, which makes it useful for adding row indexes or identifying records across or within groups.
General:¶
Output Column Name:¶
Specifies the name of the new column that will contain generated Record IDs. Example: “RecordID” or “RowNumber”.
Starting Value:¶
Specifies the initial number in the sequence. Default is 1. Example: Setting 1000 will start IDs from 1000.
Data Type:¶
Determines the data type of the Record ID column. Options include:
Int32: Generates 32-bit integer IDs.
Int64: Generates 64-bit integer IDs.
String: Generates IDs as strings (supports zero-padding when “Size” is specified).
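When choosing between the two integer options, it can help to check whether the last ID in the sequence still fits in 32 bits. The sketch below is only an illustration: the helper `smallest_int_type` is hypothetical, it assumes both types are signed, and how the node itself behaves when an ID exceeds the chosen type is not described here.

```python
INT32_MAX = 2**31 - 1  # largest signed 32-bit integer: 2,147,483,647
INT64_MAX = 2**63 - 1  # largest signed 64-bit integer


def smallest_int_type(start_value: int, row_count: int) -> str:
    """Suggest the narrowest integer type that can hold every generated ID.

    Hypothetical helper for illustration; not part of the node's API.
    """
    # IDs run from start_value up to start_value + row_count - 1.
    last_id = start_value + row_count - 1
    if last_id <= INT32_MAX:
        return "Int32"
    if last_id <= INT64_MAX:
        return "Int64"
    raise OverflowError("sequence does not fit in a 64-bit integer")


print(smallest_int_type(1, 1_000))              # Int32
print(smallest_int_type(2_147_483_000, 1_000))  # Int64
```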
Size (String Only):¶
Specifies the number of characters for each Record ID when Data Type is set to “String”. IDs are padded with leading zeros to match this length. Example: Size=6 produces IDs like “000001”, “000002”, etc.
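The padding rule above can be illustrated with a minimal Python sketch. The helper name `format_record_id` is hypothetical and only mirrors the documented rule. What the node does when an ID has more digits than Size is not stated here, so the sketch assumes such values are left as they are.

```python
def format_record_id(value: int, size: int) -> str:
    """Left-pad a numeric ID with zeros to `size` characters.

    Hypothetical helper for illustration; not part of the node's API.
    """
    # str.zfill pads on the left with '0' and never truncates longer values.
    return str(value).zfill(size)


ids = [format_record_id(n, 6) for n in range(1, 4)]
print(ids)  # ['000001', '000002', '000003']
```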
Column Position:¶
Defines where to insert the Record ID column in the output DataFrame. Options:
first: Adds Record ID as the first column.
last: Adds Record ID as the last column (default).
Record ID Generation Scope:¶
Specifies how Record IDs are generated across the dataset:
entire_table: IDs are assigned sequentially across all rows.
within_group: IDs restart from the Starting Value within each group defined by Group By Columns.
Group By Columns:¶
Specifies one or more columns used to group rows when the generation scope is within_group. Record IDs reset for each group. Example: “Department” groups data by department before generating IDs.
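The difference between the two scopes can be sketched in plain Python. This is an illustration of the documented behavior under simple assumptions (rows as dictionaries, numbering in input order), not the node's actual implementation. The function `assign_ids` and its parameters are hypothetical names; the data is taken from Example 2 on this page.

```python
from collections import defaultdict


def assign_ids(rows, output_col, start=1, scope="entire_table", group_by=()):
    """Return copies of `rows` with a sequential ID column added (illustrative only)."""
    # Next ID to hand out for each group key; every key begins at `start`.
    counters = defaultdict(lambda: start)
    result = []
    for row in rows:
        # entire_table shares one key for all rows; within_group uses one key per group.
        key = tuple(row[c] for c in group_by) if scope == "within_group" else ()
        result.append({**row, output_col: counters[key]})
        counters[key] += 1
    return result


rows = [
    {"Name": "John", "Department": "HR"},
    {"Name": "Mary", "Department": "HR"},
    {"Name": "Steve", "Department": "IT"},
    {"Name": "Alice", "Department": "IT"},
]

table_ids = [r["EmpID"] for r in assign_ids(rows, "EmpID")]
group_ids = [
    r["EmpID"]
    for r in assign_ids(rows, "EmpID", scope="within_group", group_by=["Department"])
]
print(table_ids)  # [1, 2, 3, 4]
print(group_ids)  # [1, 2, 1, 2]
```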
Order Rows Within Each Group:¶
Specifies columns used to order rows before assigning IDs. Each column can be followed by asc or desc to set its sort direction, e.g., “date asc, amount desc”.
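As a rough illustration of how an ordering string such as “date asc, amount desc” can be interpreted, the sketch below parses it into column and direction pairs and applies it to a list of rows. The function names and the sample data are hypothetical. Treating a column with no direction as ascending is an assumption, and column names containing spaces are not handled.

```python
def parse_order_spec(spec):
    """Turn 'date asc, amount desc' into [('date', False), ('amount', True)]."""
    parsed = []
    for part in spec.split(","):
        tokens = part.split()
        if not tokens:
            continue  # skip empty fragments such as a trailing comma
        column = tokens[0]
        # Assumption: a missing direction means ascending.
        direction = tokens[1].lower() if len(tokens) > 1 else "asc"
        if direction not in ("asc", "desc"):
            raise ValueError(f"Unknown sort direction: {direction!r}")
        parsed.append((column, direction == "desc"))
    return parsed


def sort_rows(rows, spec):
    """Sort by several keys using stable sorts, applied from the last key to the first."""
    ordered = list(rows)
    for column, descending in reversed(parse_order_spec(spec)):
        ordered.sort(key=lambda r: r[column], reverse=descending)
    return ordered


# Hypothetical sample data.
rows = [
    {"date": "2024-01-02", "amount": 10},
    {"date": "2024-01-01", "amount": 5},
    {"date": "2024-01-01", "amount": 7},
]
result = sort_rows(rows, "date asc, amount desc")
print([r["amount"] for r in result])  # [7, 5, 10]
```

Sorting from the least significant key to the most significant one works because Python's sort is stable, so earlier passes survive as tie-breakers.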
Output:¶
The node returns a DataFrame containing all input columns plus one additional column with the generated Record IDs, placed according to Column Position. The IDs reflect the configured grouping and ordering.
Examples¶
Example: Record ID Node¶
Example 1: Sequential IDs for Entire Dataset¶
Input:¶
| Name | Department |
|--------|-------------|
| John | HR |
| Mary | IT |
| Steve | HR |
Configuration:¶
Output Column Name: RecordID
Starting Value: 1
Data Type: Int64
Column Position: last
Generation Scope: entire_table
Output:¶
| Name | Department | RecordID |
|--------|-------------|----------|
| John | HR | 1 |
| Mary | IT | 2 |
| Steve | HR | 3 |
---
Example 2: Group-wise Record ID Generation¶
Input:¶
| Name | Department |
|--------|-------------|
| John | HR |
| Mary | HR |
| Steve | IT |
| Alice | IT |
Configuration:¶
Output Column Name: EmpID
Starting Value: 1
Data Type: Int32
Generation Scope: within_group
Group By Columns: Department
Output:¶
| Name | Department | EmpID |
|--------|-------------|-------|
| John | HR | 1 |
| Mary | HR | 2 |
| Steve | IT | 1 |
| Alice | IT | 2 |
---
Example 3: String-formatted Record IDs with Zero Padding¶
Input:¶
| Product | Price |
|----------|--------|
| Pen | 10 |
| Pencil | 5 |
| Eraser | 3 |
Configuration:¶
Output Column Name: RowID
Starting Value: 1
Data Type: String
Size: 4
Column Position: first
Output:¶
| RowID | Product | Price |
|--------|----------|--------|
| 0001 | Pen | 10 |
| 0002 | Pencil | 5 |
| 0003 | Eraser | 3 |
---
Example 4: Ordered Record ID Generation Within Groups¶
Input:¶
| Dept | Salary | Name |
|-------|---------|------|
| HR | 50000 | John |
| HR | 60000 | Mary |
| IT | 80000 | Alice |
| IT | 75000 | Bob |
Configuration:¶
Output Column Name: RankID
Starting Value: 1
Data Type: Int64
Generation Scope: within_group
Group By Columns: Dept
Order Rows Within Each Group: Salary desc
Output:¶
| Dept | Salary | Name | RankID |
|-------|---------|------|--------|
| HR | 60000 | Mary | 1 |
| HR | 50000 | John | 2 |
| IT | 80000 | Alice | 1 |
| IT | 75000 | Bob | 2 |
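The result of Example 4 can be reproduced with a short Python sketch that sorts the rows, groups them, and numbers each group from the starting value. This only illustrates the documented outcome and is not how the node is implemented. Ordering the groups themselves by Dept is an assumption that happens to match the table above.

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"Dept": "HR", "Salary": 50000, "Name": "John"},
    {"Dept": "HR", "Salary": 60000, "Name": "Mary"},
    {"Dept": "IT", "Salary": 80000, "Name": "Alice"},
    {"Dept": "IT", "Salary": 75000, "Name": "Bob"},
]
start_value = 1

# Sort by Salary descending, then stably by Dept, so each group is
# contiguous and its rows are already in salary order.
ordered = sorted(rows, key=itemgetter("Salary"), reverse=True)
ordered.sort(key=itemgetter("Dept"))

output = []
for _, members in groupby(ordered, key=itemgetter("Dept")):
    # Numbering restarts at start_value for every group.
    for rank_id, row in enumerate(members, start=start_value):
        output.append({**row, "RankID": rank_id})

for row in output:
    print(row["Dept"], row["Salary"], row["Name"], row["RankID"])
# HR 60000 Mary 1
# HR 50000 John 2
# IT 80000 Alice 1
# IT 75000 Bob 2
```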