Sort By¶
Sorts the entire DataFrame by one or more columns, with ascending or descending order set independently for each column. Useful for ranked reports, leaderboards, time-series ordering, and ordering data ahead of window functions or exports.
Type¶
transform
Class¶
fire.nodes.etl.NodeSortBy
Fields¶
| Name | Title | Description |
|---|---|---|
| sortByColNames | Columns | Select one or more columns to sort by. Drag to reorder: the top column is the primary sort key, followed by secondary, tertiary, etc. |
| ascDesc | Sorting Order | Individually set ASC (ascending) or DESC (descending) for each selected column. Example: Salary → DESC, then Department → ASC, then Name → ASC |
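To illustrate how the two fields pair up (one direction per selected column, in priority order), here is a minimal plain-Python sketch. It is not the node's Spark implementation; the `sort_rows` helper and the sample rows are hypothetical and only mimic the ordering semantics described above.

```python
from operator import itemgetter


def sort_rows(rows, columns, orders):
    """Sort a list of dicts by several columns, each with its own ASC/DESC order.

    Hypothetical helper: `columns` plays the role of sortByColNames and
    `orders` the role of ascDesc.
    """
    if len(columns) != len(orders):
        raise ValueError("each column needs exactly one sort order")
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant column first
    # and the primary column last produces a correct multi-level ordering.
    for column, order in reversed(list(zip(columns, orders))):
        result.sort(key=itemgetter(column), reverse=(order.upper() == "DESC"))
    return result


# Made-up sample data for illustration only.
employees = [
    {"name": "Ann", "department": "Ops", "salary": 90},
    {"name": "Bob", "department": "Eng", "salary": 120},
    {"name": "Cid", "department": "Eng", "salary": 90},
    {"name": "Dee", "department": "Ops", "salary": 120},
]

ranked = sort_rows(
    employees,
    columns=["salary", "department", "name"],
    orders=["DESC", "ASC", "ASC"],
)
names = [row["name"] for row in ranked]
print(names)
```

The two lists must have the same length, which mirrors the one-order-per-column configuration of the node.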
Details¶
Sort By Node – Full Control Over Data Ordering¶
The Sort By node gives you precise control over row order in your DataFrame. Whether you need top-10 customers by revenue, a chronological transaction history, or a multi-level ranking (e.g., Country DESC → Revenue DESC → Customer Name ASC), you configure it as an ordered list of columns, each with its own sort direction.
Perfect For¶
Leaderboards and rankings
Monthly/quarterly reports that must show highest → lowest
Time-series data (sort by date descending for latest first)
Preparing data before window functions (rank(), row_number(), etc.)
Excel/CSV exports where the business expects a specific order
Top-N + Others grouping (sort first, then add row number, then filter)
Key Features¶
Multi-level sorting on any number of columns
Independent ASC/DESC per column
Drag-and-drop reordering of sort priority
Works on strings, numbers, dates, and timestamps; with Spark's default null ordering, nulls sort first in ASC and last in DESC
Fully distributed: the sort runs across the cluster, so it scales to very large DataFrames
Pro Tips¶
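Always place Sort By before Limit when creating “Top 100” reports; limiting first keeps whichever 100 rows happen to come first, not the top 100
For a deterministic order across runs, include a unique ID as the last sort column (ASC)
With Spark's default null ordering, nulls are treated as smaller than any value: they appear first in ASC order and last in DESC order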
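The Sort By before Limit tip can be checked with a minimal plain-Python sketch. The sample scores are made up, and list slicing stands in for a Limit step; this is an illustration of the ordering logic, not of the node's internals.

```python
# Made-up (player, score) rows in arbitrary arrival order.
scores = [("a", 10), ("b", 50), ("c", 30), ("d", 40)]
TOP_N = 2

# Limit first, then sort: only the first two arrivals are ever considered.
limit_then_sort = sorted(scores[:TOP_N], key=lambda r: r[1], reverse=True)

# Sort first, then limit: the two highest scores overall.
sort_then_limit = sorted(scores, key=lambda r: r[1], reverse=True)[:TOP_N]

print(limit_then_sort)
print(sort_then_limit)
```

Only the second variant is a true "Top N": the first one ranks rows that merely happened to arrive first.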
Examples¶
Sort By Node – Real Business Examples¶
Example 1 – Top Earners Report (Classic Use Case)¶
Goal: Show employees from highest to lowest salary, then alphabetically by name when salaries tie¶
Configuration¶
Columns: SALARY, EMP_NAME
Order: DESC, ASC
Result¶
| EMP_CD | EMP_NAME | SALARY |
|--------|----------|--------------|
| C04 | MARCUS | AUD 350000.00|
| C01 | MATT | USD 200000.00|
| C02 | LISA | GBP 100000.00|
| C03 | ROBIN | EUR 15000.00 |
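The result above is ordered by the numeric amount, ignoring the currency code. One caveat worth sketching: if SALARY were stored as text with the currency prefix (an assumption here, the source does not state the column type), a DESC sort would compare the strings character by character instead. The plain-Python sketch below uses the four rows from the table; the `amount` helper is hypothetical.

```python
# The four rows from the result table, as (EMP_CD, EMP_NAME, SALARY) tuples.
rows = [
    ("C01", "MATT", "USD 200000.00"),
    ("C02", "LISA", "GBP 100000.00"),
    ("C03", "ROBIN", "EUR 15000.00"),
    ("C04", "MARCUS", "AUD 350000.00"),
]


def amount(salary_text):
    """Hypothetical helper: extract the numeric part, e.g. 'AUD 350000.00' -> 350000.0."""
    return float(salary_text.split()[1])


# Sorting the raw text DESC orders by currency code, not by amount.
by_text = sorted(rows, key=lambda r: r[2], reverse=True)

# Sorting by amount DESC, then EMP_NAME ASC, reproduces the table above.
by_amount = sorted(rows, key=lambda r: (-amount(r[2]), r[1]))

print([r[0] for r in by_text])
print([r[0] for r in by_amount])
```

If the column is text, deriving a numeric column before the Sort By node is one way to get the order shown in the table.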
Example 2 – Latest Transactions First (Most Recent on Top)¶
Configuration¶
Columns: TRANSACTION_DATE, TRANSACTION_ID
Order: DESC, DESC
Result¶
The latest transactions appear at the top, with TRANSACTION_ID (DESC) breaking ties between transactions that share a date. This suits monitoring dashboards.
Example 3 – Multi-Level Sales Ranking¶
Goal: Rank by Region (A-Z), then by Revenue (high→low), then by Customer Name (A-Z)¶
Configuration¶
Columns: REGION, TOTAL_REVENUE, CUSTOMER_NAME
Order: ASC, DESC, ASC
Result¶
Regions are listed alphabetically; within each region, customers appear from highest to lowest revenue, and customers with equal revenue are ordered by name.
Example 4 – Prepare for Top-10 + Others¶
Step 1: Sort By → REVENUE (DESC)¶
Step 2: Add Row Number column¶
Step 3: Use Expression or Filter: case when row_num <= 10 then CUSTOMER else 'Others' end¶
Result¶
Top 10 customers shown individually, rest grouped as “Others”.
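The three steps can be sketched in plain Python as follows. This is an illustration of the logic only: the customer data is made up, `TOP_N` is set to 3 instead of 10 to keep the sample short, and the final summing of revenue per label is an assumed follow-up step that the source does not spell out.

```python
# Made-up (CUSTOMER, REVENUE) rows.
customers = [
    ("Acme", 500.0),
    ("Birch", 120.0),
    ("Cobalt", 900.0),
    ("Dune", 75.0),
    ("Ember", 310.0),
]
TOP_N = 3  # the example in the text uses 10

# Step 1: sort by REVENUE DESC (customer name ASC as a tie-breaker).
ranked = sorted(customers, key=lambda r: (-r[1], r[0]))

# Steps 2 and 3: number the rows, then relabel everything past TOP_N as "Others".
totals = {}
for row_num, (customer, revenue) in enumerate(ranked, start=1):
    label = customer if row_num <= TOP_N else "Others"
    totals[label] = totals.get(label, 0.0) + revenue

print(totals)
```

The row number is only meaningful because the sort runs first; numbering unsorted rows would label arbitrary customers as the top group.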
Example 5 – Alphabetical Customer List with Null Handling¶
Data has some missing LAST_NAME¶
Configuration¶
Columns: LAST_NAME, FIRST_NAME
Order: ASC, ASC
Result¶
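With Spark's default null ordering, rows with null LAST_NAME appear at the top in ASC order (nulls sort first); they would appear at the bottom in DESC order. Check the output if nulls must land at a specific end.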
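Where the null rows land depends on the engine's null ordering. The plain-Python sketch below assumes Spark SQL's documented defaults (NULLS FIRST for ascending, NULLS LAST for descending) and shows how an explicit override changes the result. The `sort_with_nulls` helper and the sample names are hypothetical, and whether the node exposes such an override is not stated in the source.

```python
def sort_with_nulls(values, descending=False, nulls_last=None):
    """Sort values that may contain None, placing the nulls explicitly.

    Hypothetical helper. When nulls_last is None it follows the assumed
    Spark SQL defaults: nulls first for ASC, nulls last for DESC.
    """
    if nulls_last is None:
        nulls_last = descending
    present = sorted((v for v in values if v is not None), reverse=descending)
    nulls = [v for v in values if v is None]
    return present + nulls if nulls_last else nulls + present


last_names = ["Smith", None, "Adams", "Lee"]

asc_default = sort_with_nulls(last_names)
desc_default = sort_with_nulls(last_names, descending=True)
asc_nulls_last = sort_with_nulls(last_names, nulls_last=True)

print(asc_default)
print(desc_default)
print(asc_nulls_last)
```

If missing names must always sit at the bottom of an ascending list, one option is to replace nulls with a placeholder value that sorts last before the Sort By node.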
Example 6 – Stable Sorting Across Multiple Runs¶
Configuration¶
Columns: SCORE, CUSTOMER_ID
Order: DESC, ASC
Result¶
Even when scores are tied, the same customer always appears in the same position because CUSTOMER_ID acts as a unique tie-breaker.
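A minimal plain-Python sketch of why the tie-breaker matters: with a unique last key, the output no longer depends on the order in which rows arrive. The rows are made up, and shuffling the list stands in for a distributed engine delivering rows in a different order on each run.

```python
import random

# Made-up (CUSTOMER_ID, SCORE) rows with tied scores.
rows = [(101, 88), (102, 95), (103, 88), (104, 95), (105, 70)]


def rank_with_tiebreak(data):
    # SCORE DESC, then CUSTOMER_ID ASC as the unique tie-breaker.
    return sorted(data, key=lambda r: (-r[1], r[0]))


def rank_score_only(data):
    # SCORE DESC only: tied rows keep whatever order they arrived in.
    return sorted(data, key=lambda r: -r[1])


shuffled = rows[:]
random.Random(7).shuffle(shuffled)  # simulate a different arrival order

stable_a = rank_with_tiebreak(rows)
stable_b = rank_with_tiebreak(shuffled)
print(stable_a == stable_b)
print(stable_a)
```

With `rank_score_only`, the relative position of customers 102 and 104 (or 101 and 103) can change between runs, which is the situation the extra sort column avoids.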