# Supported Files & Integrations

> File formats and data sources for knowledge bases

Knowledge Bases accept a wide range of file formats and can import content from cloud storage integrations and crawled websites.

<img src="https://mintcdn.com/spot-16018069/aIe-Q8xSXPJ4vFiE/images/kbs/kbs.png?fit=max&auto=format&n=aIe-Q8xSXPJ4vFiE&q=85&s=37399dd88d2e249f61ad1188c31e4de6" alt="Knowledge Base Interface" width="1085" height="660" data-path="images/kbs/kbs.png" />

## Supported File Formats

<AccordionGroup>
  <Accordion title="Text Documents" icon="file-text">
    | Format         | Extensions  | Features                                                                               |
    | :------------- | :---------- | :------------------------------------------------------------------------------------- |
    | **PDF**        | .pdf        | Full text extraction, image processing, page-level tracking, OCR for scanned documents |
    | **Word**       | .docx, .doc | Text and formatting preservation, table extraction                                     |
    | **Plain Text** | .txt, .md   | Direct ingestion, markdown rendering                                                   |
    | **HTML**       | .html, .htm | Web pages and formatted content, link preservation                                     |
    | **Rich Text**  | .rtf        | Formatted text documents                                                               |
  </Accordion>

  <Accordion title="Spreadsheets" icon="table">
    | Format    | Extensions  | Features                                               |
    | :-------- | :---------- | :----------------------------------------------------- |
    | **Excel** | .xlsx, .xls | Per-sheet processing, table extraction, formula values |
    | **CSV**   | .csv        | Tabular data ingestion, header detection               |
  </Accordion>

  <Accordion title="Presentations" icon="presentation">
    | Format         | Extensions  | Features                                               |
    | :------------- | :---------- | :----------------------------------------------------- |
    | **PowerPoint** | .pptx, .ppt | Slide text extraction, speaker notes, image processing |
  </Accordion>

  <Accordion title="Images" icon="image">
    | Format           | Extensions               | Features                                                              |
    | :--------------- | :----------------------- | :-------------------------------------------------------------------- |
    | **Images**       | .png, .jpg, .jpeg, .webp | OCR text extraction, vision-based understanding, chart interpretation |
    | **Scanned PDFs** | .pdf                     | Automatic OCR or vision model processing                              |
  </Accordion>

  <Accordion title="Archives & Email" icon="archive">
    | Format    | Extensions | Features                                                                 |
    | :-------- | :--------- | :----------------------------------------------------------------------- |
    | **ZIP**   | .zip       | Extract and process all contents, maintain folder structure              |
    | **Email** | .eml       | Email messages with metadata, attachment processing, thread preservation |
  </Accordion>
</AccordionGroup>
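
The tables above can be turned into a simple pre-upload check. The following is a minimal sketch and not part of the Noxus SDK: `SUPPORTED_EXTENSIONS` and `classify_file` are hypothetical names, the extension list is copied from this page, and case-insensitive matching is an assumption about how the platform treats file names.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the tables on this page, grouped by category.
# The mapping and helper names are illustrative, not part of any Noxus API.
SUPPORTED_EXTENSIONS = {
    "Text Documents": {".pdf", ".docx", ".doc", ".txt", ".md", ".html", ".htm", ".rtf"},
    "Spreadsheets": {".xlsx", ".xls", ".csv"},
    "Presentations": {".pptx", ".ppt"},
    "Images": {".png", ".jpg", ".jpeg", ".webp"},
    "Archives & Email": {".zip", ".eml"},
}


def classify_file(path: str) -> Optional[str]:
    """Return the category for a file's extension, or None if unsupported."""
    # Assumption: extensions are matched case-insensitively.
    suffix = Path(path).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None
```

Calling `classify_file` on each candidate file before uploading makes it possible to skip or report unsupported files locally instead of discovering them during ingestion.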

## Cloud Storage Integrations

<AccordionGroup>
  <Accordion title="Google Drive" icon="cloud">
    **Features:**

    * Connect to Google Drive
    * Select folders to import
    * Import files from selected folders
    * Preserve folder structure

    **Setup:**

    1. Connect Google Drive integration
    2. Select folders to import
    3. Import files into knowledge base

    <Note>
      Files are imported at the time of selection. They do not automatically sync when updated in Google Drive.
    </Note>
  </Accordion>

  <Accordion title="Microsoft OneDrive / SharePoint" icon="cloud">
    **Features:**

    * Connect to OneDrive or SharePoint
    * Select document libraries or folders
    * Import files from selected locations
    * Preserve folder structure

    **Setup:**

    1. Connect OneDrive/SharePoint integration
    2. Select folders or document libraries
    3. Import files into knowledge base

    <Note>
      Files are imported at the time of selection. They do not automatically sync when updated in OneDrive or SharePoint.
    </Note>
  </Accordion>

  <Accordion title="Website Scraping" icon="globe">
    **Features:**

    * Crawl entire websites
    * Follow links automatically
    * Extract clean text
    * Preserve page structure
    * Schedule regular updates

    **Setup:**

    1. Provide a starting URL
    2. Configure the crawl depth
    3. Set URL patterns to include or exclude
    4. Schedule the refresh frequency
  </Accordion>
</AccordionGroup>
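
For website scraping, the crawl depth and the include/exclude patterns together decide which pages are ingested. The sketch below illustrates that decision locally. It assumes glob-style patterns, an exclude-wins rule, and that an empty include list means "include everything"; none of these is confirmed by this page, and `should_crawl` is a hypothetical helper rather than a Noxus function.

```python
from fnmatch import fnmatchcase
from typing import Sequence


def should_crawl(
    url: str,
    depth: int,
    max_depth: int,
    include: Sequence[str],
    exclude: Sequence[str],
) -> bool:
    """Decide whether a discovered URL would be crawled.

    Assumptions (not confirmed by the platform docs): patterns are
    glob-style, an exclude match always wins, and an empty include
    list means "include everything".
    """
    # Pages deeper than the configured crawl depth are never visited.
    if depth > max_depth:
        return False
    # Any matching exclude pattern rejects the URL outright.
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    if not include:
        return True
    return any(fnmatchcase(url, pattern) for pattern in include)
```

With `include=["https://example.com/docs/*"]` and `exclude=["*/changelog/*"]`, a page under `/docs/` is accepted unless its path contains `/changelog/` or it lies beyond `max_depth`.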

## Upload Methods

<Tabs>
  <Tab title="Manual Upload" icon="upload">
    **Via Platform UI:**

    * Drag-and-drop files
    * Bulk upload multiple files
    * Organize into folders
    * Add metadata tags

    <Info>
      Manual upload supports individual files, multiple files at once, and ZIP archives, which are extracted automatically.
    </Info>
  </Tab>

  <Tab title="API" icon="code">
    **Programmatic upload via REST API:**

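    ```bash
    # Upload document.pdf into the "Product Docs" folder of a knowledge base.
    # API_KEY must be set in the environment; kb_123 is a placeholder ID.
    curl -X POST https://api.noxus.ai/v1/knowledge-bases/kb_123/documents \
      -H "Authorization: Bearer ${API_KEY}" \
      -F "file=@document.pdf" \
      -F "folder=Product Docs"
    ```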
  </Tab>

  <Tab title="SDK" icon="python">
    **Upload via Python SDK:**

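    ```python
    from noxus_sdk.client import Client

    # Replace with your API key; avoid committing real keys to source control.
    client = Client(api_key="your_api_key")

    # Fetch the knowledge base by ID ("kb_123" is a placeholder).
    kb = client.knowledge_bases.get("kb_123")

    # Upload a local file into a folder and attach custom metadata.
    kb.upload_document(
        file_path="document.pdf",
        folder="Product Docs",
        metadata={"author": "John Doe", "version": "2.0"},
    )
    ```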
  </Tab>

  <Tab title="Flows" icon="workflow">
    **Automate uploads with flows:**

    **Use Cases:**

    * Automate document ingestion
    * Process email attachments
    * Sync from external sources
    * Schedule batch uploads
  </Tab>
</Tabs>
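
Bulk ingestion from a local directory can be sketched by pairing each file with a folder name derived from its relative path. This is an illustration under assumptions: `plan_uploads` is a hypothetical helper, and this page does not state whether the `folder` argument accepts nested paths such as `guides/setup` or whether `metadata` is optional. The commented loop at the end reuses the `upload_document` call from the SDK tab.

```python
from pathlib import Path
from typing import Dict, List


def plan_uploads(root: str, default_folder: str = "Uploads") -> List[Dict[str, str]]:
    """Map every file under `root` to keyword arguments for an upload call.

    Files at the top level go to `default_folder`; files in
    subdirectories use their relative directory as the folder name.
    """
    root_path = Path(root)
    plan = []
    for path in sorted(root_path.rglob("*")):
        if not path.is_file():
            continue  # skip directories
        relative_dir = path.parent.relative_to(root_path)
        # An empty `parts` tuple means the file sits directly under root.
        folder = relative_dir.as_posix() if relative_dir.parts else default_folder
        plan.append({"file_path": str(path), "folder": folder})
    return plan


# Usage with the SDK objects from the SDK tab (requires a real knowledge base,
# and assumes `metadata` may be omitted):
# for job in plan_uploads("./docs"):
#     kb.upload_document(**job)
```

Building the plan first, and uploading second, keeps the directory walk testable without network access and makes it easy to review what will be sent.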

## Metadata

<AccordionGroup>
  <Accordion title="Standard Metadata" icon="tag">
    **Automatically captured:**

    | Field           | Description                           |
    | :-------------- | :------------------------------------ |
    | Filename        | Original file name                    |
    | Upload date     | When the document was added           |
    | File size       | Document size in bytes                |
    | Page count      | Number of pages (PDFs)                |
    | Folder location | Folder path within the knowledge base |
  </Accordion>

  <Accordion title="Custom Metadata" icon="tags">
    **Add your own fields:**

    **Common Fields:**

    * Author, Department, Document type
    * Version, Tags/categories
    * Creation date, Expiration date

    **Use Cases:**

    * Filter searches by metadata
    * Organize documents
    * Track document lifecycle
    * Enable advanced queries
  </Accordion>
</AccordionGroup>
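
To make the filtering and lifecycle use cases concrete, the sketch below builds a custom metadata dictionary, checks an expiration date, and filters a list of metadata records locally. The field name `expiration_date`, the ISO 8601 date format, and all three helper functions are assumptions chosen for illustration; this page does not specify a metadata schema or the platform's own filter syntax.

```python
from datetime import date
from typing import Dict, List, Optional


def build_metadata(author: str, version: str, expires: Optional[date] = None) -> Dict[str, str]:
    """Build a flat string-to-string metadata dictionary."""
    metadata = {"author": author, "version": version}
    if expires is not None:
        # Assumption: dates are stored as ISO 8601 strings (YYYY-MM-DD).
        metadata["expiration_date"] = expires.isoformat()
    return metadata


def is_expired(metadata: Dict[str, str], today: date) -> bool:
    """Return True if the recorded expiration date is strictly before `today`."""
    raw = metadata.get("expiration_date")
    if raw is None:
        return False  # no expiration recorded
    return date.fromisoformat(raw) < today


def filter_by(documents: List[Dict[str, str]], **criteria: str) -> List[Dict[str, str]]:
    """Keep records whose metadata matches every key/value in `criteria`."""
    return [
        doc for doc in documents
        if all(doc.get(key) == value for key, value in criteria.items())
    ]
```

A dictionary produced by `build_metadata` has the same shape as the `metadata` argument passed to `upload_document` in the SDK example, so it could be supplied there directly.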

<Card title="Advanced Settings" icon="settings" href="/platform/knowledge-bases/advanced-settings">
  Configure search methods and retrieval settings
</Card>
