Sort By

Sorts the entire DataFrame by one or more columns, with ascending or descending order set independently for each column. Useful for ranked reports, leaderboards, time-series ordering, and preparing data for window functions or exports.

Type

transform

Class

fire.nodes.etl.NodeSortBy

Fields

| Name | Title | Description |
|------|-------|-------------|
| sortByColNames | Columns | Select one or more columns to sort by. Drag to reorder: the top column is the primary sort key, followed by secondary, tertiary, and so on. |
| ascDesc | Sorting Order | Set ASC (ascending) or DESC (descending) individually for each selected column. Example: Salary → DESC, then Department → ASC, then Name → ASC. |
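The two fields can be pictured as parallel lists: the first entry in Sorting Order applies to the first entry in Columns, and so on. The plain-Python sketch below illustrates those semantics only. The `sort_by` helper and the sample rows are hypothetical and say nothing about how the node is implemented on Spark.

```python
from typing import Any


def sort_by(rows: list[dict[str, Any]], columns: list[str],
            orders: list[str]) -> list[dict[str, Any]]:
    """Sort rows by several columns, each with its own ASC/DESC order."""
    if len(columns) != len(orders):
        raise ValueError("each column needs exactly one sort order")
    result = list(rows)
    # Apply the least significant key first. Python's sort is stable, so the
    # primary key, applied last, has the final say and ties keep the order
    # established by the lower-priority keys.
    for col, order in reversed(list(zip(columns, orders))):
        result.sort(key=lambda r, c=col: r[c],
                    reverse=(order.upper() == "DESC"))
    return result


employees = [
    {"name": "Ann", "department": "Sales", "salary": 90000},
    {"name": "Bob", "department": "Engineering", "salary": 120000},
    {"name": "Cid", "department": "Engineering", "salary": 90000},
    {"name": "Dee", "department": "Sales", "salary": 120000},
]

# Salary DESC, then Department ASC, then Name ASC.
ranked = sort_by(employees,
                 ["salary", "department", "name"],
                 ["DESC", "ASC", "ASC"])
names = [r["name"] for r in ranked]
print(names)
```

Reordering the entries in `columns` changes which key is primary, which mirrors dragging a column to the top of the list.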

Details

Sort By Node – Full Control Over Data Ordering

The Sort By node gives you precise control over row order in your DataFrame. Whether you need the top 10 customers by revenue, a chronological transaction history, or a multi-level ranking (e.g., Country DESC → Revenue DESC → Customer Name ASC), this node handles it in a single step.

Perfect For

  • Leaderboards and rankings

  • Monthly/quarterly reports that must show highest → lowest

  • Time-series data (sort by date descending for latest first)

  • Preparing data before window functions (rank(), row_number(), etc.)

  • Excel/CSV exports where the business expects a specific order

  • Top-N + Others grouping (sort first, then add row number, then filter)

Key Features

  • Multi-level sorting on as many columns as needed

  • Independent ASC/DESC per column

  • Drag-and-drop reordering of sort priority

  • Works on strings, numbers, dates, timestamps, and nulls (by Spark's default, nulls come first in ASC order and last in DESC order)

  • Fully distributed, so it scales to very large DataFrames (billions of rows)

Pro Tips

  • Always place Sort By before Limit when creating “Top 100” reports

  • For stable sorting across runs, include a unique ID as the last sort column (ASC)

  • By Spark's default, nulls are treated as smaller than any non-null value: they appear first in ASC order and last in DESC order
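The first tip matters because limiting before sorting keeps whichever rows happen to arrive first, not the highest-ranked ones. The plain-Python sketch below shows the difference on a small list; the sample data and the `by_score_desc` helper are hypothetical, and a list slice stands in for the Limit step.

```python
scores = [("Ann", 72), ("Bob", 95), ("Cid", 88), ("Dee", 91), ("Eve", 60)]


def by_score_desc(rows):
    """Return rows ordered from highest to lowest score."""
    return sorted(rows, key=lambda r: r[1], reverse=True)


# Sort first, then limit: the true top 2.
sort_then_limit = by_score_desc(scores)[:2]

# Limit first, then sort: only the first 2 input rows, merely reordered.
limit_then_sort = by_score_desc(scores[:2])

print(sort_then_limit)
print(limit_then_sort)
```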

Examples

Sort By Node – Real Business Examples

Example 1 – Top Earners Report (Classic Use Case)

Goal: Show employees from highest to lowest salary, then alphabetically by name when salaries tie

Configuration

  • Columns: SALARY, EMP_NAME

  • Order: DESC, ASC

Result

| EMP_CD | EMP_NAME | SALARY        |
|--------|----------|---------------|
| C04    | MARCUS   | AUD 350000.00 |
| C01    | MATT     | USD 200000.00 |
| C02    | LISA     | GBP 100000.00 |
| C03    | ROBIN    | EUR 15000.00  |

Note: the order shown follows the numeric amount. If SALARY is stored as a string that includes the currency code, a DESC sort compares text instead and would place USD before GBP, EUR, and AUD, so sort on a numeric amount column.

Example 2 – Latest Transactions First (Most Recent on Top)

Configuration

  • Columns: TRANSACTION_DATE, TRANSACTION_ID

  • Order: DESC, DESC

Result

The latest transactions appear at the top, which suits monitoring dashboards.

Example 3 – Multi-Level Sales Ranking

Goal: Rank by Region (A-Z), then by Revenue (high→low), then by Customer Name (A-Z)

Configuration

  • Columns: REGION, TOTAL_REVENUE, CUSTOMER_NAME

  • Order: ASC, DESC, ASC

Result

A clean, readable regional sales report, ordered as the business expects.

Example 4 – Prepare for Top-10 + Others

Step 1: Sort By → REVENUE (DESC)

Step 2: Add Row Number column

Step 3: Use Expression or Filter: case when row_num <= 10 then CUSTOMER else 'Others' end

Result

The top 10 customers are shown individually; the rest are grouped as “Others”.
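The three steps can be sketched in plain Python as follows. The `top_n_with_others` helper and the sample revenue figures are hypothetical, and summing the revenue of the “Others” rows is an assumption about what the grouping should produce. The sketch uses n = 3 to keep the data small.

```python
def top_n_with_others(revenue_by_customer: dict[str, float],
                      n: int) -> dict[str, float]:
    """Keep the top-n customers by revenue and fold the rest into 'Others'."""
    # Step 1: sort by revenue DESC (customer name breaks ties).
    ranked = sorted(revenue_by_customer.items(),
                    key=lambda kv: (-kv[1], kv[0]))
    result: dict[str, float] = {}
    # Step 2: assign a row number starting at 1.
    for row_num, (customer, revenue) in enumerate(ranked, start=1):
        # Step 3: case when row_num <= n then CUSTOMER else 'Others' end
        label = customer if row_num <= n else "Others"
        result[label] = result.get(label, 0.0) + revenue
    return result


revenue = {"Acme": 500.0, "Bolt": 300.0, "Cyan": 200.0,
           "Dune": 50.0, "Echo": 25.0}
summary = top_n_with_others(revenue, n=3)
print(summary)
```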

Example 5 – Alphabetical Customer List with Null Handling

The data has some rows with a missing LAST_NAME.

Configuration

  • Columns: LAST_NAME, FIRST_NAME

  • Order: ASC, ASC

Result

Rows with null LAST_NAME appear at the top, because Spark places nulls first in ascending order by default.
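Where the rows with a missing LAST_NAME land depends on the null-ordering rule in effect; in Apache Spark, a plain ascending sort places nulls first unless nulls-last is requested. The plain-Python sketch below models both placements. The `sort_names` helper and the sample customers are hypothetical, and the sketch only illustrates the ordering, not how the node runs it.

```python
def sort_names(rows, nulls_first=True):
    """Sort (last_name, first_name) pairs ascending with explicit null placement."""
    def key(row):
        last, first = row
        is_null = last is None
        # The boolean decides which side the nulls land on (False sorts
        # before True). The empty string is only a placeholder so that
        # None is never compared with a string.
        null_rank = (not is_null) if nulls_first else is_null
        return (null_rank, last or "", first)
    return sorted(rows, key=key)


customers = [("Smith", "Zoe"), (None, "Amy"), ("Adams", "Bob"), (None, "Carl")]

nulls_on_top = sort_names(customers, nulls_first=True)
nulls_on_bottom = sort_names(customers, nulls_first=False)
print(nulls_on_top)
print(nulls_on_bottom)
```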

Example 6 – Stable Sorting Across Multiple Runs

Configuration

  • Columns: SCORE, CUSTOMER_ID

  • Order: DESC, ASC

Result

Even when scores are tied, the same customer always appears in the same position because of the unique ID tie-breaker.
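A small plain-Python sketch of why the tie-breaker helps: the input is shuffled several times to imitate rows arriving in a different order on each run, yet sorting on (SCORE DESC, CUSTOMER_ID ASC) always yields one ordering. The sample rows and the `rank` helper are hypothetical.

```python
import random

rows = [
    {"customer_id": 101, "score": 88},
    {"customer_id": 102, "score": 95},
    {"customer_id": 103, "score": 88},
    {"customer_id": 104, "score": 95},
]


def rank(rows):
    """SCORE DESC, then CUSTOMER_ID ASC as the unique tie-breaker."""
    return sorted(rows, key=lambda r: (-r["score"], r["customer_id"]))


# Collect the distinct orderings produced from differently shuffled inputs.
orderings = set()
for seed in range(5):
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    orderings.add(tuple(r["customer_id"] for r in rank(shuffled)))

print(orderings)
```

Without the CUSTOMER_ID key, rows with equal scores would keep whatever relative order the input happened to have.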