Hive Incremental

This node incrementally reads data from a Hive table.

Output

It creates a DataFrame containing the latest data from the selected Hive table.

Type

dataset

Class

fire.nodes.etl.NodeHiveIncremental

Fields

Name	Title	Description
database	HIVE Database	Hive database from which to read the data.
table	HIVE Table	Hive table from which to read the data.
path	Watermark File Path	Path of the watermark file.
filterFields	Incremental Load Fields	Comma-separated list of field names used to filter data for the incremental load.
outputColNames	Column Names of the database table	Column names of the database table.
outputColTypes	Column Types of the database table	Column types of the database table.

Details

Hive Incremental Node Details

This node reads a table from Hive and creates a DataFrame with the schema and data of the specified table. Because the read is configured as an incremental load, each run returns only the latest data, and a watermark file tracks the last load.

Parameters to be set:

OUTPUT STORAGE LEVEL: Keep this as DEFAULT.
HIVE DATABASE: Specify the Hive database from which data is to be read.
HIVE TABLE: Specify the table in the Hive database from which data is to be read incrementally.
WATERMARK FILE PATH: Specify the path of the watermark file, which tracks the last load timestamp.
INCREMENTAL LOAD FIELDS: Specify the fields that will be used for incremental loading (e.g., timestamp or ID fields).
SCHEMA COLUMNS: Refresh the schema to load column names and types from the database table.
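
The watermark idea behind these parameters can be sketched in plain Python. This is only an illustration, not the node's implementation: the watermark file format (a single value stored as text), the helper names `read_watermark`, `write_watermark` and `incremental_read`, and the in-memory rows standing in for a Hive table are all assumptions made for this example.

```python
import os
import tempfile


def read_watermark(path):
    """Return the stored watermark string, or None if no watermark exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as f:
        value = f.read().strip()
    return value or None


def write_watermark(path, value):
    """Overwrite the watermark file with the newest value seen so far."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(value)


def incremental_read(rows, field, watermark_path):
    """Return only rows whose `field` is greater than the stored watermark.

    Assumption: values of `field` compare correctly as strings
    (for example ISO dates such as 2024-01-31).
    """
    last = read_watermark(watermark_path)
    new_rows = [r for r in rows if last is None or r[field] > last]
    if new_rows:
        # Advance the watermark to the largest value just loaded.
        write_watermark(watermark_path, max(r[field] for r in new_rows))
    return new_rows


# Hypothetical rows standing in for the Hive table.
rows = [
    {"id": 1, "sale_date": "2024-01-01"},
    {"id": 2, "sale_date": "2024-01-02"},
]

with tempfile.TemporaryDirectory() as tmp:
    wm = os.path.join(tmp, "sales_data_watermark.txt")
    first = incremental_read(rows, "sale_date", wm)   # no watermark yet: all rows
    second = incremental_read(rows, "sale_date", wm)  # nothing newer than the watermark
    rows.append({"id": 3, "sale_date": "2024-01-03"})
    third = incremental_read(rows, "sale_date", wm)   # only the newly added row
```

In this sketch the first run loads everything, and later runs load only rows newer than the stored value, which is the behavior the watermark file and incremental load fields are meant to support.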

Examples

Hive Incremental Node Examples

Example of Connection Values

OUTPUT STORAGE LEVEL: DEFAULT
HIVE DATABASE: retail_db
HIVE TABLE: sales_data
WATERMARK FILE PATH: /user/hive/watermark/sales_data_watermark.txt
INCREMENTAL LOAD FIELDS: sale_date
SCHEMA COLUMNS: Click “Refresh Schema” to load columns from the specified Hive table.
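
To make the example values concrete, the sketch below shows one way they could translate into a filtered Hive query. It is a hedged illustration only: the function `build_incremental_query`, the watermark value `2024-01-31`, and the choice to combine multiple fields with `AND` are assumptions, and the source does not state what query the node actually issues.

```python
def build_incremental_query(database, table, filter_fields, watermarks):
    """Build a SELECT that keeps only rows newer than the known watermarks.

    filter_fields: comma-separated field names, as in INCREMENTAL LOAD FIELDS.
    watermarks: hypothetical mapping of field name to last loaded value.
    Values are interpolated directly for readability; this is not safe
    for untrusted input.
    """
    fields = [f.strip() for f in filter_fields.split(",") if f.strip()]
    query = f"SELECT * FROM {database}.{table}"
    conditions = [
        f"{name} > '{watermarks[name]}'" for name in fields if name in watermarks
    ]
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query


# First run: no watermark recorded yet, so the whole table is read.
full_load = build_incremental_query("retail_db", "sales_data", "sale_date", {})

# Later run: only rows after the (hypothetical) last load date are read.
delta_load = build_incremental_query(
    "retail_db", "sales_data", "sale_date", {"sale_date": "2024-01-31"}
)
```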