Manage your Knowledge Bases.
Knowledge Bases (KBs) are collections of documents and data that provide context and information to AI agents. They serve as the foundation for enabling AI to access, understand, and utilize both structured and unstructured information.
A Knowledge Base acts as a centralized repository that organizes information in a way that’s optimized for AI consumption and retrieval.
You can create a new Knowledge Base within a group using the Add Knowledge Base endpoint. This returns the created KB object with its unique identifier.
The platform provides several ways to add documents to a Knowledge Base:
Upload files directly using the Upload Train endpoint
Import from external sources like Google Drive, OneDrive, or SharePoint using the Generic Train endpoint
Add documents with custom metadata using the Add Knowledge Base Document endpoint
Adding a document will NOT automatically trigger the ingestion process.
These operations support:
Document management is handled through various endpoints that allow you to:
Knowledge Base details, including document status, can be retrieved through the Get Knowledge Base endpoint. You can also update KB properties using the Update Knowledge Base endpoint.
You can monitor processing through the Running Jobs endpoint that provides detailed information about ongoing and completed operations.
Direct file uploads (PDFs, text files, images) with support for batch uploading
Import documents and files directly from your Google Drive
Access and import documents stored in Microsoft OneDrive
Access and import documents from SharePoint repositories
Web crawling with configurable depth and URL patterns
Slack, Notion, and more
The platform offers other integrations as well, but only the sources listed above are currently supported for Knowledge Bases.
Knowledge Bases come in two primary types:
Entity Knowledge Bases: permanent Knowledge Bases that are managed through the Knowledge Base sections and can be referenced by multiple agents or workflows.
Temporary Knowledge Bases: created within the Workflow Editor for specific workflow use cases, with the option to promote them to Entity KBs.
Entity Knowledge Bases are permanent repositories that:
These are the standard Knowledge Bases that most users will interact with for long-term knowledge storage and retrieval.
Temporary Knowledge Bases are workflow-specific repositories that:
Temporary KBs are ideal for workflow-specific data that may not need to be part of your permanent knowledge repository. If you later decide the knowledge is valuable for broader use, you can promote it to an Entity KB without losing any data.
Knowledge Bases have the following status values that indicate their overall state:
Status | Description |
---|---|
created | KB has been created but no documents have been added yet |
training | KB has documents that are currently being processed |
trained | All documents in the KB have been successfully processed |
error | All documents in the KB have failed processing |
The KB status is automatically updated based on the status of its documents.
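The roll-up from document statuses to a KB status can be pictured with a small sketch. This is not the platform's implementation: the function name `derive_kb_status` is hypothetical, and the handling of mixed cases (for example, some documents `trained` and others `error`) is an assumption, since the table above only defines the all-or-nothing cases.

```python
from collections import Counter
from typing import Iterable


def derive_kb_status(doc_statuses: Iterable[str]) -> str:
    """Roll document statuses up into one KB status (illustrative rules only)."""
    counts = Counter(doc_statuses)
    total = sum(counts.values())

    if total == 0:
        return "created"   # no documents have been added yet
    if counts["training"] > 0:
        return "training"  # at least one document is still being processed
    if counts["error"] == total:
        return "error"     # every document failed processing
    if counts["trained"] == total:
        return "trained"   # every document was processed successfully

    # ASSUMPTION: mixed or not-yet-started cases are not specified in this
    # guide; this sketch reports them as "training".
    return "training"


if __name__ == "__main__":
    print(derive_kb_status(["trained", "training"]))
```

Treat the fallback branch as a placeholder and check the status the Get Knowledge Base endpoint actually returns for mixed cases.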
Individual documents within a Knowledge Base have their own status values:
Status | Description |
---|---|
uploaded | Document has been uploaded but processing hasn’t started |
training | Document is currently being processed (chunked, embedded, etc.) |
trained | Document has been successfully processed and is available for queries |
error | Document processing failed |
You can filter documents by status using the Get Documents by Status endpoint.
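Because a document moves from `uploaded` through `training` to either `trained` or `error`, a client will often poll until a terminal status is reached. The sketch below shows one way to do that. The helper `wait_for_document` and its parameters are hypothetical, and the status lookup is passed in as a callable so that no endpoint URL or response shape is assumed. It also assumes ingestion has already been triggered, since adding a document does not start it automatically.

```python
import time
from typing import Callable

# Terminal document statuses, taken from the table above.
TERMINAL_STATUSES = {"trained", "error"}


def wait_for_document(
    fetch_status: Callable[[], str],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns 'trained' or 'error'.

    fetch_status is a caller-supplied function (hypothetical) that returns
    the current status string of one document.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"document still not processed after {max_attempts} checks")


if __name__ == "__main__":
    # Simulated status sequence standing in for real API calls.
    simulated = iter(["uploaded", "training", "trained"])
    print(wait_for_document(lambda: next(simulated), interval_s=0.0))
```

In real use, `fetch_status` would wrap whichever call you use to read a document's status, such as the Get Knowledge Base endpoint.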
Knowledge Bases are processed through a series of automated workflows that handle document ingestion, processing, and indexing. Understanding these workflows can help you optimize your Knowledge Base usage.
When you add documents to a Knowledge Base, they go through the following processing steps:
Document Upload
Documents are uploaded to secure storage and registered in the Knowledge Base with ‘uploaded’ status
Text Extraction
Text is extracted from various file formats (PDF, DOCX, images, etc.) using specialized parsers
Chunking
Documents are split into smaller, semantically meaningful chunks for better retrieval
Embedding Generation
Vector embeddings are created for each chunk to enable semantic search
Indexing
Chunks and their embeddings are stored in a vector database for efficient retrieval
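To make the chunking, embedding, and indexing steps concrete, here is a toy, in-memory sketch. It is not the platform's pipeline. The function names, the word-window chunker, and the hashed bag-of-words "embedding" are all assumptions chosen to keep the example self-contained; a real system would use semantic chunking, a learned embedding model, and a vector database.

```python
import hashlib
import math
from typing import List, Tuple


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into overlapping word windows (a stand-in for semantic chunking)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(chunk: str, dims: int = 16) -> List[float]:
    """Hashed bag-of-words vector, L2-normalized (NOT a real embedding model)."""
    vec = [0.0] * dims
    for word in chunk.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(text: str) -> List[Tuple[str, List[float]]]:
    """Store each chunk next to its vector (a list stands in for a vector database)."""
    return [(chunk, toy_embed(chunk)) for chunk in chunk_text(text)]


def search(index: List[Tuple[str, List[float]]], query: str, top_k: int = 3) -> List[str]:
    """Return the top_k chunks ranked by cosine similarity to the query."""
    q = toy_embed(query)
    ranked = sorted(
        index,
        key=lambda item: sum(a * b for a, b in zip(q, item[1])),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:top_k]]


if __name__ == "__main__":
    idx = build_index("Refunds are issued within five business days of approval.")
    print(search(idx, "refunds", top_k=1))
```

The overlap between windows is a common way to avoid cutting a relevant sentence in half at a chunk boundary.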
If any step in the document processing flow fails:
Knowledge Bases can be integrated with AI agents to provide context for conversations:
For large document sets, the platform supports batch processing:
For optimal Knowledge Base management:
To get the best performance from your Knowledge Bases:
Knowledge Bases can process a wide variety of file formats to accommodate different content types and sources. Understanding which file types are supported helps ensure successful document ingestion.
Category | Supported Formats |
---|---|
Documents | PDF, DOCX, DOC, PPTX, PPT |
Text Files | TXT, HTML, MD, JSON |
Images | JPG/JPEG, PNG |
Archives | ZIP |
Google Workspace | Google Docs, Google Slides |
Our platform supports all major document formats including PDF, Microsoft Word (DOCX, DOC), and PowerPoint (PPTX, PPT) files.
Process plain text (TXT), web pages (HTML), documentation (MD), and structured data (JSON) with full text extraction.
Extract text from images (JPG/JPEG, PNG) using advanced OCR technology to make visual content searchable.
Upload ZIP archives containing multiple documents for batch processing, with automatic extraction and organization.
File Type | Maximum Size | Notes |
---|---|---|
Documents | 50 MB | Includes PDF, DOCX, DOC, etc. |
Images | 20 MB | Text will be extracted using OCR |
Archives | 100 MB | Contents will be extracted and processed individually |
Very large files may take longer to process and could impact system performance. Consider splitting large documents into smaller, more focused files for optimal results.
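A client-side pre-check against the format and size tables above can catch rejected uploads early. The following is a minimal sketch, not part of the platform's API. The function `check_upload` is hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and categories without a stated limit (such as Text Files) are treated as having no client-side limit.

```python
from pathlib import PurePath
from typing import Tuple

MB = 1024 * 1024  # ASSUMPTION: the guide does not say whether MB is binary or decimal

# Extension-to-category mapping, taken from the supported formats table.
CATEGORY_BY_EXTENSION = {
    ".pdf": "Documents", ".docx": "Documents", ".doc": "Documents",
    ".pptx": "Documents", ".ppt": "Documents",
    ".txt": "Text Files", ".html": "Text Files", ".md": "Text Files",
    ".json": "Text Files",
    ".jpg": "Images", ".jpeg": "Images", ".png": "Images",
    ".zip": "Archives",
}

# Size limits from the table above; Text Files has no stated limit.
MAX_BYTES_BY_CATEGORY = {
    "Documents": 50 * MB,
    "Images": 20 * MB,
    "Archives": 100 * MB,
}


def check_upload(filename: str, size_bytes: int) -> Tuple[bool, str]:
    """Return (ok, detail): detail is the category if ok, else the reason."""
    ext = PurePath(filename).suffix.lower()
    category = CATEGORY_BY_EXTENSION.get(ext)
    if category is None:
        return False, f"unsupported file type: {ext or '(no extension)'}"
    limit = MAX_BYTES_BY_CATEGORY.get(category)
    if limit is not None and size_bytes > limit:
        return False, f"{category} are limited to {limit // MB} MB"
    return True, category


if __name__ == "__main__":
    print(check_upload("handbook.pdf", 12 * MB))
    print(check_upload("scan.png", 25 * MB))
```

A check like this only mirrors the documented limits; the server remains the authority on what it accepts.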
The platform uses specialized parsers to extract text from different file types: