# Backup & Recovery

> Enterprise backup strategies, disaster recovery, and data resilience for Noxus

Noxus implements a multi-layered backup and recovery strategy to ensure that your AI infrastructure is resilient to data loss, service failures, and regional outages.

## Backup Strategy

Our backup philosophy is built on the principle of **continuous availability**. We categorize data into three tiers, each with its own backup method and scope.

<CardGroup cols={3}>
  <Card title="Persistence" icon="database">
    **PostgreSQL**

    * **Method**: Daily snapshots + Point-in-Time Recovery (PITR).
    * **Scope**: User data, flow definitions, and knowledge base metadata.
  </Card>

  <Card title="Object Storage" icon="box-archive">
    **S3 / GCS / MinIO**

    * **Method**: Versioning + Cross-region replication.
    * **Scope**: Knowledge base documents, run-level logs, and artifacts.
  </Card>

  <Card title="Configuration" icon="gear">
    **Secrets & Env**

    * **Method**: Infrastructure-as-Code (IaC) versioning + Secret Manager backups.
    * **Scope**: `auth_config`, API keys, and deployment parameters.
  </Card>
</CardGroup>

***

## Data Layer Resilience

### PostgreSQL (Persistence Layer)

PostgreSQL is the source of truth for the platform. For production environments, we recommend:

* **Automated Snapshots**: Daily full snapshots with a minimum 30-day retention.
* **PITR**: Continuous transaction log (WAL) archiving to allow recovery to any specific second within the retention window.
* **Multi-AZ Failover**: Deploy a synchronous standby in a separate availability zone so that failover is automatic and downtime is kept to a minimum.
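
The snapshot recommendations above can be checked mechanically. The following is a minimal sketch, not part of Noxus: `check_snapshot_policy` is a hypothetical helper, and it assumes you can list snapshot timestamps from your database provider. The 26-hour freshness limit is an assumed allowance for a daily schedule with some slack.

```python
from datetime import datetime, timedelta, timezone

MIN_DAILY_SNAPSHOTS = 30                 # 30-day retention with one snapshot per day
MAX_SNAPSHOT_AGE = timedelta(hours=26)   # assumption: daily cadence plus 2 hours of slack


def check_snapshot_policy(snapshot_times, now):
    """Return a list of policy violations; an empty list means healthy."""
    if not snapshot_times:
        return ["no snapshots found"]
    problems = []
    newest = max(snapshot_times)
    # The most recent snapshot should be roughly one day old at most.
    if now - newest > MAX_SNAPSHOT_AGE:
        problems.append("latest snapshot is older than one day")
    # With daily snapshots, 30 days of retention means at least 30 snapshots.
    if len(snapshot_times) < MIN_DAILY_SNAPSHOTS:
        problems.append("fewer than 30 daily snapshots retained")
    return problems


if __name__ == "__main__":
    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    snapshots = [now - timedelta(days=i, hours=1) for i in range(30)]
    print(check_snapshot_policy(snapshots, now))
```

A check like this fits naturally into a scheduled job that feeds your alerting system.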

### Object Storage (Liquid Data)

As part of our **Liquid Data** architecture, object storage handles the bulk of your AI assets:

* **Versioning**: Enable bucket versioning to protect against accidental deletions or overwrites.
* **Lifecycle Policies**: Automatically transition older logs and artifacts to lower-cost storage classes (e.g., S3 Glacier or GCS Coldline) to reduce storage costs.
* **Replication**: For mission-critical deployments, enable cross-region replication to ensure data availability even during a total cloud region failure.
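
As an illustration of the lifecycle policy described above, here is a minimal sketch that builds an S3-style lifecycle configuration as a plain dictionary. The helper name `build_lifecycle_rules`, the `logs/` prefix, and the 90-day threshold are assumptions for illustration; adjust them to your own bucket layout and provider.

```python
def build_lifecycle_rules(prefix, cold_after_days):
    """Build an S3-style lifecycle configuration for one key prefix."""
    if cold_after_days <= 0:
        raise ValueError("cold_after_days must be positive")
    rule_name = prefix.strip("/") or "all"
    return {
        "Rules": [
            {
                "ID": f"archive-{rule_name}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Move objects to a colder storage class once they reach this age.
                "Transitions": [
                    {"Days": cold_after_days, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical prefix: archive run-level logs after 90 days.
    print(build_lifecycle_rules("logs/", cold_after_days=90))
```

On AWS, a dictionary of this shape can be passed as `LifecycleConfiguration` to the boto3 S3 client's `put_bucket_lifecycle_configuration` call. GCS and MinIO use their own configuration formats.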

***

## Recovery Objectives (RTO/RPO)

Noxus is designed to help you meet strict enterprise recovery targets:

| Objective                | Target       | Description                                                         |
| :----------------------- | :----------- | :------------------------------------------------------------------ |
| **RPO (Recovery Point)** | \< 5 Minutes | The maximum amount of data loss you can tolerate (driven by PITR).  |
| **RTO (Recovery Time)**  | \< 1 Hour    | The maximum time allowed to restore the platform to full operation. |
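
To make the two targets concrete, the sketch below scores a single incident or restore drill against them. `evaluate_recovery` and its parameter names are hypothetical. It assumes you record three timestamps: the latest point PITR could restore to, the moment of failure, and the moment service was fully restored.

```python
from datetime import datetime, timedelta

RPO_TARGET = timedelta(minutes=5)  # target from the table above
RTO_TARGET = timedelta(hours=1)    # target from the table above


def evaluate_recovery(last_recoverable_at, failure_at, restored_at):
    """Compare one incident's data loss and downtime with the RPO/RTO targets."""
    data_loss = failure_at - last_recoverable_at  # writes made in this window are lost
    downtime = restored_at - failure_at           # time the platform was unavailable
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss < RPO_TARGET,
        "rto_met": downtime < RTO_TARGET,
    }


if __name__ == "__main__":
    result = evaluate_recovery(
        last_recoverable_at=datetime(2024, 1, 1, 11, 57),
        failure_at=datetime(2024, 1, 1, 12, 0),
        restored_at=datetime(2024, 1, 1, 12, 45),
    )
    print(result["rpo_met"], result["rto_met"])
```

In this example the data loss is 3 minutes and the downtime is 45 minutes, so both targets are met.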

***

## Disaster Recovery (DR) Patterns

Depending on your deployment model, you can implement several DR patterns:

<AccordionGroup>
  <Accordion title="Active-Passive (Warm Standby)" icon="snowflake">
    Maintain a secondary deployment in a different region. Data is continuously replicated, and the secondary stack can be scaled up rapidly during a failover.
  </Accordion>

  <Accordion title="Active-Active (Multi-Region)" icon="fire">
    Run Noxus services in multiple regions simultaneously. Traffic is routed to the nearest healthy region, providing the highest level of availability and lowest latency for global users.
  </Accordion>

  <Accordion title="Air-Gapped Recovery" icon="shield-halved">
    For isolated environments, backups are stored on encrypted, physically separate media and recovered using verified offline procedures.
  </Accordion>
</AccordionGroup>
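
The routing decisions behind the first two patterns can be sketched in a few lines. This is an illustrative model only: the `Region` type, the function names, and the latency figures are assumptions, and in practice this logic usually lives in your DNS or global load balancer rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    healthy: bool
    latency_ms: float


def pick_active_passive(primary, standby):
    """Active-Passive: serve from the primary, fail over only when it is unhealthy."""
    return primary if primary.healthy else standby


def pick_active_active(regions):
    """Active-Active: route to the lowest-latency region that is currently healthy."""
    healthy = [r for r in regions if r.healthy]
    if not healthy:
        raise RuntimeError("no healthy region available")
    return min(healthy, key=lambda r: r.latency_ms)


if __name__ == "__main__":
    eu = Region("eu-west", healthy=False, latency_ms=20.0)
    us = Region("us-east", healthy=True, latency_ms=95.0)
    print(pick_active_passive(eu, us).name)
    print(pick_active_active([eu, us]).name)
```

Both functions return `us-east` here because the nearer region is marked unhealthy.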

***

## Operational Readiness

<Warning>
  **The Restore Drill**: A backup is only as good as its last successful restore. We recommend performing quarterly restoration drills in a non-production environment to validate your runbooks.
</Warning>

<Steps>
  <Step title="Automate Everything">
    Use the Terraform and Helm assets in `noxus-infra` to automate the provisioning of backup resources.
  </Step>

  <Step title="Monitor Backup Health">
    Set up alerts for failed snapshots or replication lag in your monitoring dashboard.
  </Step>

  <Step title="Document the Runbook">
    Maintain a clear, step-by-step recovery guide that includes DNS switching and secret restoration.
  </Step>
</Steps>
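
The monitoring step above can be reduced to a small rule set. The sketch below is a hypothetical example, not a Noxus API: `BackupStatus` and `backup_alerts` are illustrative names, and the 300-second lag threshold is an assumption chosen to match a 5-minute RPO.

```python
from dataclasses import dataclass

MAX_REPLICATION_LAG_SECONDS = 300  # assumption: aligned with a 5-minute RPO


@dataclass
class BackupStatus:
    last_snapshot_ok: bool
    replication_lag_seconds: float


def backup_alerts(status):
    """Return the names of the alerts that should fire for this status."""
    alerts = []
    if not status.last_snapshot_ok:
        alerts.append("snapshot_failed")
    if status.replication_lag_seconds > MAX_REPLICATION_LAG_SECONDS:
        alerts.append("replication_lag_high")
    return alerts


if __name__ == "__main__":
    print(backup_alerts(BackupStatus(last_snapshot_ok=False, replication_lag_seconds=420)))
```

You may prefer a lower lag threshold so that the alert fires before the RPO is actually breached.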

<CardGroup cols={2}>
  <Card title="Storage Architecture" icon="database" href="/deployment/configuration/storage">
    Understand the three storage layers being backed up.
  </Card>

  <Card title="noxus-infra Repo" icon="github" href="https://github.com/noxus-ai/noxus-infra">
    Access automation scripts for backup and recovery.
  </Card>
</CardGroup>
