Repartition¶
This node repartitions the incoming DataFrame into a specified number of partitions.
Input¶
It accepts a DataFrame as input from the previous node.
Type¶
transform
Class¶
fire.nodes.etl.NodeRepartition
Fields¶
| Name | Title | Description |
|---|---|---|
| numPartitions | Number of Partitions | Number of Partitions |
Details¶
Repartition Node
Overview:
This node repartitions a DataFrame, redistributing its rows across the specified number of partitions so that downstream steps can process them in parallel.
Configuration:
Number of Partitions: Specifies the desired number of partitions.
Output Storage Level: Sets the storage level for the repartitioned DataFrame.
Examples¶
Example:
If you have a DataFrame with 1000 rows and you set the number of partitions to 10, the node distributes the data across 10 partitions of roughly 100 rows each. Choosing a suitable number of partitions can improve performance for operations such as joins, aggregations, and sorting.
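The arithmetic in this example can be illustrated with a small, self-contained Python sketch. It is not the node's implementation: it assumes a simple round-robin assignment of rows to partitions, and `round_robin_partition` is a hypothetical helper introduced only for illustration.

```python
def round_robin_partition(rows, num_partitions):
    """Assign rows to partitions in turn: 0, 1, ..., n-1, 0, 1, ...

    Hypothetical helper for illustration only; the Repartition node
    itself may distribute rows differently.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    partitions = [[] for _ in range(num_partitions)]
    for index, row in enumerate(rows):
        # The row's position modulo the partition count picks its partition.
        partitions[index % num_partitions].append(row)
    return partitions


# 1000 rows split into 10 partitions, as in the example above.
rows = list(range(1000))
partitions = round_robin_partition(rows, 10)
sizes = [len(p) for p in partitions]
print(sizes)  # ten partitions of 100 rows each
```

When the row count is not a multiple of the partition count, partition sizes under this scheme differ by at most one row.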