
Supported File Formats
Text Documents
| Format | Extensions | Features |
|---|---|---|
| PDF | .pdf | Full text extraction, image processing, page-level tracking, OCR for scanned documents |
| Word | .docx, .doc | Text and formatting preservation, table extraction |
| Plain Text | .txt, .md | Direct ingestion, markdown rendering |
| HTML | .html, .htm | Web pages and formatted content, link preservation |
| Rich Text | .rtf | Formatted text documents |
Spreadsheets
| Format | Extensions | Features |
|---|---|---|
| Excel | .xlsx, .xls | Per-sheet processing, table extraction, formula values |
| CSV | .csv | Tabular data ingestion, header detection |
Presentations
| Format | Extensions | Features |
|---|---|---|
| PowerPoint | .pptx, .ppt | Slide text extraction, speaker notes, image processing |
Images
| Format | Extensions | Features |
|---|---|---|
| Images | .png, .jpg, .jpeg, .webp | OCR text extraction, vision-based understanding, chart interpretation |
| Scanned PDFs | .pdf | Automatic OCR or vision model processing |
Archives & Email
| Format | Extensions | Features |
|---|---|---|
| ZIP | .zip | Extract and process all contents, maintain folder structure |
| Email | .eml | Email messages with metadata, attachment processing, thread preservation |
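As a rough illustration of the format tables above, the following Python sketch checks whether a file's extension is on the supported list before it is uploaded. The `SUPPORTED_EXTENSIONS` map and the `classify` helper are hypothetical names introduced here, not part of the platform's API, and the `.pdf` entry is assumed from the PDF rows.

```python
from pathlib import Path
from typing import Optional

# Extension-to-category map transcribed from the tables above.
# Assumption: PDFs use the ".pdf" extension.
SUPPORTED_EXTENSIONS = {
    ".pdf": "Text Documents",
    ".docx": "Text Documents",
    ".doc": "Text Documents",
    ".txt": "Text Documents",
    ".md": "Text Documents",
    ".html": "Text Documents",
    ".htm": "Text Documents",
    ".rtf": "Text Documents",
    ".xlsx": "Spreadsheets",
    ".xls": "Spreadsheets",
    ".csv": "Spreadsheets",
    ".pptx": "Presentations",
    ".ppt": "Presentations",
    ".png": "Images",
    ".jpg": "Images",
    ".jpeg": "Images",
    ".webp": "Images",
    ".zip": "Archives & Email",
    ".eml": "Archives & Email",
}


def classify(filename: str) -> Optional[str]:
    """Return the format category for a file name, or None if unsupported."""
    # Compare case-insensitively so "REPORT.DOCX" is treated like "report.docx".
    return SUPPORTED_EXTENSIONS.get(Path(filename).suffix.lower())
```

A local pre-check like this can catch unsupported files early; the platform's own validation remains the source of truth.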
Cloud Storage Integrations
Google Drive
Features:
- Connect to Google Drive
- Select folders to import
- Import files from selected folders
- Preserve folder structure
Setup:
- Connect the Google Drive integration
- Select the folders to import
- Import the files into the knowledge base
Files are imported at the time of selection. They do not automatically sync when updated in Google Drive.
Microsoft OneDrive / SharePoint
Website Scraping
Features:
- Crawl entire websites
- Follow links automatically
- Extract clean text
- Preserve page structure
- Schedule regular updates
Setup:
- Provide a starting URL
- Configure the crawl depth
- Set URL patterns to include or exclude
- Schedule the refresh frequency
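The crawl options listed above (starting URL, crawl depth, include/exclude patterns) can be pictured with a small filter function. This is only a sketch: `CrawlConfig`, `should_crawl`, and the glob-style pattern syntax are assumptions made for illustration, since the platform's actual setting names and pattern syntax are not specified here.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List


@dataclass
class CrawlConfig:
    """Hypothetical container for the crawl settings described above."""
    start_url: str
    max_depth: int = 2
    include: List[str] = field(default_factory=list)  # glob patterns (assumed syntax)
    exclude: List[str] = field(default_factory=list)  # glob patterns (assumed syntax)


def should_crawl(url: str, depth: int, config: CrawlConfig) -> bool:
    """Decide whether a discovered link should be fetched."""
    # Stop following links once the configured depth is exceeded.
    if depth > config.max_depth:
        return False
    # Exclude patterns take priority over include patterns.
    if any(fnmatchcase(url, pattern) for pattern in config.exclude):
        return False
    # With no include patterns, every non-excluded URL is allowed.
    if config.include and not any(fnmatchcase(url, p) for p in config.include):
        return False
    return True


config = CrawlConfig(
    start_url="https://example.com/docs/",
    max_depth=2,
    include=["https://example.com/docs/*"],
    exclude=["*/private/*"],
)
```

Giving exclude rules precedence is a design choice in this sketch; it keeps a broad include pattern from accidentally pulling in pages that were meant to stay out.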
Upload Methods
- Manual Upload
- API
- SDK
- Flows
Via Platform UI:
- Drag-and-drop files
- Bulk upload multiple files
- Organize into folders
- Add metadata tags
Manual upload supports individual files, multiple files at once, and ZIP archives (auto-extracted).
Metadata
Standard Metadata
Automatically captured:
| Field | Description |
|---|---|
| Filename | Original file name |
| Upload date | When document was added |
| File size | Document size in bytes |
| Page count | Number of pages (PDFs) |
| Folder location | Organization path |
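To make the fields in the table concrete, the sketch below builds a comparable record for a local file. The function name, the dictionary keys, and the `folder` argument are hypothetical; the platform captures these values itself, and its real field names may differ. Page count is left as `None` because computing it would require a PDF parser, which this sketch deliberately avoids.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict


def standard_metadata(path: str, folder: str = "") -> Dict[str, Any]:
    """Build a record mirroring the standard metadata table (illustrative only)."""
    file_path = Path(path)
    return {
        "filename": file_path.name,                              # original file name
        "upload_date": datetime.now(timezone.utc).isoformat(),   # time of ingestion
        "file_size": file_path.stat().st_size,                   # size in bytes
        "page_count": None,                                      # PDFs only; not computed here
        "folder_location": folder,                               # organization path
    }
```

Calling `standard_metadata("notes.txt", folder="Team/Notes")` on an existing file returns a dictionary with these five keys.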
Custom Metadata
Add your own fields.
Common fields:
- Author, Department, Document type
- Version, Tags/categories
- Creation date, Expiration date
Use custom metadata to:
- Filter searches by metadata
- Organize documents
- Track document lifecycle
- Enable advanced queries
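The filtering and lifecycle uses of custom metadata can be sketched in a few lines of Python. Everything here is an assumption for illustration: the record layout, the field names (`department`, `document_type`, `expiration_date`), and the helper functions are hypothetical and do not represent the platform's query API.

```python
from datetime import date

# Hypothetical document records; field names are illustrative only.
DOCUMENTS = [
    {"filename": "policy.pdf",
     "metadata": {"department": "HR", "document_type": "policy",
                  "expiration_date": date(2030, 1, 1)}},
    {"filename": "old-handbook.docx",
     "metadata": {"department": "HR", "document_type": "handbook",
                  "expiration_date": date(2020, 1, 1)}},
    {"filename": "roadmap.pptx",
     "metadata": {"department": "Product", "document_type": "plan"}},
]


def filter_by_metadata(docs, **criteria):
    """Keep documents whose metadata matches every key/value pair given."""
    return [d for d in docs
            if all(d["metadata"].get(k) == v for k, v in criteria.items())]


def not_expired(docs, today):
    """Drop documents that expired before `today`; undated documents are kept."""
    return [d for d in docs
            if d["metadata"].get("expiration_date") is None
            or d["metadata"]["expiration_date"] >= today]


hr_docs = filter_by_metadata(DOCUMENTS, department="HR")
current_hr = not_expired(hr_docs, today=date(2025, 1, 1))
print([d["filename"] for d in current_hr])  # prints ['policy.pdf']
```

Combining an equality filter with an expiration check is one simple way to picture lifecycle tracking: expired documents stay stored but drop out of search results.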
Advanced Settings
Configure search methods and retrieval settings