Create Text Embedding

This node creates embeddings for text data and outputs the result as a DataFrame.

Input

It takes a DataFrame as input.

Output

It outputs a DataFrame that contains additional embedding columns.

Type

transform

Class

fire.nodes.gai.NodeTextEmbedder

Fields

| Name | Title | Description |
|---|---|---|
| createEmbedding | Create Embedding | |
| embeddingMethod | Embedding Method | Select the embedding method. |
| llmConnection | Select Connection | Select Connection |
| chunkSize | Chunk Size | Size of each text chunk. |
| chunkOverlap | Chunk Overlap | Overlap size between consecutive chunks. |
| contentCol | Vector Indexer | Column name for content. |
| queryEmbedding | Query Embedding | |
| userQuery | User Query | User provided query. |
| huggingface | Hugging Face | |
| hfModelName | Hugging Face Model | Hugging Face embedding model. |

Details

Text Embedder Node Details

This node creates embeddings for text data, which can then be indexed into a vector database using the specified configuration. It supports multiple embedding providers, including OpenAI, Bedrock, and HuggingFace.

Parameters to be set:

### General:

  • Embedding Method: Choose the embedding method from HuggingFace, OpenAI, or Bedrock.

  • Chunk Size: Define the size of text chunks for processing large inputs (default: 1024).

  • Chunk Overlap: Set the overlap size between consecutive chunks (default: 100).

  • Content Column: Select the column that contains the text content (default: content).

  • File Name Column: Select the column that contains file names (default: fileName).

  • Page Number Column: Select the column that contains page numbers (default: pageNumber).

  • Base64 Image Column: Select the column that contains Base64 encoded images (default: base64Image).

  • User Query: Provide a user query string for search or processing.

  • Directory Path Column: Set the column name for directory paths (default: directoryPath).
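To illustrate how Chunk Size and Chunk Overlap interact, here is a minimal sketch of overlapping chunking. It assumes sizes are measured in characters and that chunks are taken as a simple sliding window. The node's actual splitting strategy is not documented here, so treat `chunk_text` as a hypothetical helper rather than the node's implementation.

```python
from __future__ import annotations


def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 100) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap.

    Assumption: character-based sliding window; the real node may split differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    # Each new chunk starts (chunk_size - chunk_overlap) characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# With the defaults, a 2500-character input yields three chunks.
lengths = [len(c) for c in chunk_text("x" * 2500)]
print(lengths)  # [1024, 1024, 652]
```

With the default values, each chunk repeats the last 100 characters of the previous one, so text that straddles a chunk boundary still appears whole in at least one chunk. A larger overlap preserves more context at the cost of producing more chunks to embed.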

### Service-Specific Configurations:

#### OpenAI:

  • API Key: Provide your OpenAI API key.

  • Embedding Model: Specify the OpenAI embedding model.

  • Max Retries: Set the maximum number of retries for API calls (default: 6).

  • Embedding Context Length: Define the context length for embeddings (default: 8191).

#### Bedrock:

  • Service Name: Specify the Bedrock service name.

  • Region Name: Provide the AWS region name.

  • AWS Access Key ID: Provide your AWS Access Key ID.

  • AWS Secret Access Key: Provide your AWS Secret Access Key.

  • Embedding Model: Specify the Bedrock embedding model.

#### HuggingFace:

  • Model Name: Specify the HuggingFace embedding model (default: all-MiniLM-L6-v2).