Knowledge Bases support a wide range of file formats and can sync content from cloud storage integrations.

Supported File Formats

Format | Extensions | Features
PDF | .pdf | Full text extraction, image processing, page-level tracking, OCR for scanned documents
Word | .docx, .doc | Text and formatting preservation, table extraction
Plain Text | .txt, .md | Direct ingestion, markdown rendering
HTML | .html, .htm | Web pages and formatted content, link preservation
Rich Text | .rtf | Formatted text documents
Excel | .xlsx, .xls | Per-sheet processing, table extraction, formula values
CSV | .csv | Tabular data ingestion, header detection
PowerPoint | .pptx, .ppt | Slide text extraction, speaker notes, image processing
Images | .png, .jpg, .jpeg, .webp | OCR text extraction, vision-based understanding, chart interpretation
Scanned PDFs | .pdf | Automatic OCR or vision model processing
ZIP | .zip | Extract and process all contents, maintain folder structure
Email | .eml | Email messages with metadata, attachment processing, thread preservation
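
Before uploading a large batch, it can help to check locally which files fall under a listed format. The following is a minimal sketch, not part of the platform: the `SUPPORTED_FORMATS` mapping and the `detect_format` helper are hypothetical names, and the extensions are simply copied from the list above.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported-formats list above.
# The mapping and helper below are illustrative, not a platform API.
SUPPORTED_FORMATS = {
    "PDF": {".pdf"},
    "Word": {".docx", ".doc"},
    "Plain Text": {".txt", ".md"},
    "HTML": {".html", ".htm"},
    "Rich Text": {".rtf"},
    "Excel": {".xlsx", ".xls"},
    "CSV": {".csv"},
    "PowerPoint": {".pptx", ".ppt"},
    "Images": {".png", ".jpg", ".jpeg", ".webp"},
    "ZIP": {".zip"},
    "Email": {".eml"},
}


def detect_format(filename: str) -> Optional[str]:
    """Return the format name for a file, or None if its extension is not listed."""
    suffix = Path(filename).suffix.lower()  # case-insensitive comparison
    for name, extensions in SUPPORTED_FORMATS.items():
        if suffix in extensions:
            return name
    return None


if __name__ == "__main__":
    for candidate in ["Report.PDF", "notes.md", "movie.mp4"]:
        print(candidate, "->", detect_format(candidate))
```

The check looks only at the final extension, so a scanned PDF and a text PDF both map to "PDF"; telling them apart would require inspecting the file contents.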

Cloud Storage Integrations

Google Drive
Features:
  • Connect to Google Drive
  • Select folders to import
  • Import files from selected folders
  • Preserve folder structure
Setup:
  1. Connect Google Drive integration
  2. Select folders to import
  3. Import files into knowledge base
Files are imported at the time of selection. They do not automatically sync when updated in Google Drive.

OneDrive and SharePoint
Features:
  • Connect to OneDrive or SharePoint
  • Select document libraries or folders
  • Import files from selected locations
  • Preserve folder structure
Setup:
  1. Connect OneDrive/SharePoint integration
  2. Select folders or document libraries
  3. Import files into knowledge base
Files are imported at the time of selection. They do not automatically sync when updated in OneDrive or SharePoint.

Website Crawling
Features:
  • Crawl entire websites
  • Follow links automatically
  • Extract clean text
  • Preserve page structure
  • Schedule regular updates
Setup:
  1. Provide starting URL
  2. Configure crawl depth
  3. Set URL patterns to include/exclude
  4. Schedule refresh frequency
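
To make the four crawl settings concrete, here is a rough sketch of how crawl depth and include/exclude patterns could combine to decide whether a page is in scope. Everything in it is an assumption for illustration: the `CrawlConfig` fields, the `should_crawl` helper, the glob-style pattern syntax, and the same-host restriction are hypothetical and may not match how the platform evaluates these settings.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List
from urllib.parse import urlparse


@dataclass
class CrawlConfig:
    # Hypothetical field names mirroring the four setup steps above.
    start_url: str
    max_depth: int = 2
    include_patterns: List[str] = field(default_factory=lambda: ["*"])
    exclude_patterns: List[str] = field(default_factory=list)
    refresh: str = "weekly"  # assumed label for the refresh schedule


def should_crawl(config: CrawlConfig, url: str, depth: int) -> bool:
    """Decide whether `url`, found `depth` links from the start URL, is in scope."""
    if depth > config.max_depth:
        return False
    # Assumption: the crawl stays on the starting URL's host.
    if urlparse(url).netloc != urlparse(config.start_url).netloc:
        return False
    path = urlparse(url).path or "/"
    # Exclusions take priority over inclusions in this sketch.
    if any(fnmatchcase(path, pattern) for pattern in config.exclude_patterns):
        return False
    return any(fnmatchcase(path, pattern) for pattern in config.include_patterns)


if __name__ == "__main__":
    config = CrawlConfig(
        start_url="https://example.com/docs/",
        include_patterns=["/docs/*"],
        exclude_patterns=["/docs/archive/*"],
    )
    print(should_crawl(config, "https://example.com/docs/setup", depth=1))
    print(should_crawl(config, "https://example.com/docs/archive/2019", depth=1))
```

Checking exclusions first means a narrow exclude pattern can carve a subtree out of a broad include pattern, which is usually the intent when both are set.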

Upload Methods

Via Platform UI:
  • Drag-and-drop files
  • Bulk upload multiple files
  • Organize into folders
  • Add metadata tags
Uploads can be individual files, multiple files at once, or ZIP archives, which are extracted automatically.
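
Because ZIP archives are extracted on upload, it can be useful to preview an archive's folder layout beforehand. The sketch below uses only Python's standard `zipfile` module and runs locally; the `list_zip_entries` helper is a hypothetical name, and how the platform itself maps archive folders is not shown here.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import List, Tuple


def list_zip_entries(zip_bytes: bytes) -> List[Tuple[str, str]]:
    """Return sorted (folder, filename) pairs for every file in a ZIP archive.

    Files at the top level of the archive are reported with folder ".".
    """
    entries = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, keep only files
            path = PurePosixPath(info.filename)  # ZIP paths use forward slashes
            entries.append((str(path.parent), path.name))
    return sorted(entries)


if __name__ == "__main__":
    # Build a small archive in memory to demonstrate the helper.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("policies/hr/leave.pdf", b"placeholder")
        archive.writestr("readme.txt", b"placeholder")
    for folder, name in list_zip_entries(buffer.getvalue()):
        print(folder, name)
```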

Metadata

Automatically captured:
Field | Description
Filename | Original file name
Upload date | When document was added
File size | Document size in bytes
Page count | Number of pages (PDFs)
Folder location | Organization path
Add your own fields:
Common Fields:
  • Author, Department, Document type
  • Version, Tags/categories
  • Creation date, Expiration date
Use Cases:
  • Filter searches by metadata
  • Organize documents
  • Track document lifecycle
  • Enable advanced queries
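
As an illustration of the first and third use cases, the sketch below filters a list of document records by custom metadata and drops expired ones. It models documents as plain dictionaries; the key names (`department`, `expiration_date`, and so on) and both helper functions are hypothetical, chosen to match the common fields listed above rather than any platform schema.

```python
from datetime import date
from typing import Any, Dict, List

Document = Dict[str, Any]


def filter_by_metadata(docs: List[Document], **criteria: Any) -> List[Document]:
    """Keep documents whose metadata equals every given key/value pair."""
    return [
        doc for doc in docs
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


def unexpired(docs: List[Document], today: date) -> List[Document]:
    """Keep documents with no expiration date, or one on or after `today`."""
    return [
        doc for doc in docs
        if doc.get("expiration_date") is None or doc["expiration_date"] >= today
    ]


if __name__ == "__main__":
    # Sample records invented for this example.
    docs = [
        {"filename": "leave-policy.pdf", "department": "HR",
         "expiration_date": date(2030, 1, 1)},
        {"filename": "old-handbook.pdf", "department": "HR",
         "expiration_date": date(2020, 1, 1)},
        {"filename": "pricing.xlsx", "department": "Sales"},
    ]
    current_hr = unexpired(filter_by_metadata(docs, department="HR"), date(2025, 6, 1))
    print([doc["filename"] for doc in current_hr])
```

Treating a missing expiration date as "never expires" is a design choice in this sketch; a stricter lifecycle policy might require every document to carry one.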

Advanced Settings

Configure search methods and retrieval settings