Kubernetes is the recommended production model for organizations requiring robust autoscaling, zone resilience, and platform-level operations. Noxus is fully compatible with standard Kubernetes distributions as well as Red Hat OpenShift.

Infrastructure as Code

Production-ready deployment assets are maintained in the noxus-infra repository.

Terraform Modules

We provide Terraform modules to bootstrap the necessary cloud infrastructure, including:
  • Managed Kubernetes clusters (EKS, GKE, AKS).
  • Networking and VPC configurations.
  • Managed data services (PostgreSQL, Redis, Object Storage).
  • IAM roles and service account integrations.

Helm Charts

The core platform is deployed using our official Helm charts. These charts are designed to be flexible and support:
  • OpenShift Compatibility: Support for Routes and specific security contexts.
  • Granular Component Control: Independent configuration for Backend, Frontend, Workers, and Relays.
  • Advanced Scaling: Built-in support for HPA and KEDA-driven worker scaling.

Helm Components

Helm section   Noxus component    Description
backend        Noxus Backend      Core API and orchestration service.
frontend       Noxus Frontend     Web-based user interface.
worker.pools   Noxus Workers      Scalable execution engines for flows and agents.
beat           Noxus Beat         Periodic task scheduler and heartbeat service.
relay          Noxus Relays       Webhook and event receiver endpoints.
ingress        Ingress / Routes   Public routing, TLS termination, and OpenShift Routes.
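As an illustrative sketch of how these sections fit together, a minimal values.yaml might look like the following (key names and defaults are assumptions and depend on the chart version):

```yaml
# Hypothetical values.yaml sketch; actual keys depend on the chart version.
backend:
  replicaCount: 2
frontend:
  replicaCount: 2
worker:
  pools:
    default:
      replicaCount: 1
beat:
  enabled: true
relay:
  enabled: true
ingress:
  enabled: true
  className: nginx
  tls: true
```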

Kubernetes Runtime Diagram


Autoscaling Model

Backend and Frontend

Standard Horizontal Pod Autoscaler (HPA) manages replicas based on CPU and memory utilization.
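As a sketch of this model, a standard autoscaling/v2 HPA targeting the backend might look like the following (the Deployment name and thresholds are illustrative):

```yaml
# Standard autoscaling/v2 HPA for the backend Deployment
# (the name "noxus-backend" and the thresholds are illustrative).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: noxus-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: noxus-backend
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```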

Workers

Worker pools support two advanced scaling modes:
  • Resource-based: Standard HPA for CPU/Memory.
  • Queue-driven: KEDA-powered scaling based on task queue depth, allowing for scale-to-zero and rapid bursts.
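The queue-driven mode can be sketched with a KEDA ScaledObject like the one below. The trigger type, queue name, and target Deployment are assumptions for illustration (a Redis-backed task queue is assumed here, matching the managed Redis provisioned by the Terraform modules):

```yaml
# Illustrative KEDA ScaledObject for a worker pool; the trigger type,
# addresses, and names are assumptions, not the chart's actual output.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: noxus-worker-batch
spec:
  scaleTargetRef:
    name: noxus-worker-batch   # worker Deployment (illustrative name)
  minReplicaCount: 0           # scale-to-zero when the queue is empty
  maxReplicaCount: 50          # headroom for rapid bursts
  triggers:
    - type: redis
      metadata:
        address: redis:6379
        listName: noxus-tasks
        listLength: "10"       # target tasks per replica
```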

Advanced Worker Pool Management

The Kubernetes deployment model offers sophisticated control over how AI workloads are isolated and scaled.

Workspace Mapping & Namespace Isolation

Worker pools can be logically or physically isolated to meet strict security requirements:
  • Workspace Mapping: You can map specific worker pools to individual workspaces, ensuring that a team’s workloads run on dedicated compute resources.
  • Namespace Isolation: For maximum security, worker pools can be deployed in separate Kubernetes namespaces. This allows for strict network policies and resource quotas between different projects or departments.
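Namespace isolation can be enforced with standard Kubernetes primitives. The sketch below (namespace and quota values are hypothetical) pairs a ResourceQuota that caps a team's compute with a default-deny NetworkPolicy that blocks cross-namespace ingress:

```yaml
# Illustrative isolation for a dedicated worker-pool namespace
# (all names and limits are hypothetical).
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: noxus-workers-team-a
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: noxus-workers-team-a
spec:
  podSelector: {}        # applies to all pods in the namespace
  policyTypes:
    - Ingress            # no ingress rules listed, so all ingress is denied
```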

Workload-Specific Scaling

Not all AI tasks are created equal. You can configure independent scaling policies for different pools:
  • Real-time Pools: Optimized for low-latency agent responses with higher minimum replica counts.
  • Batch Pools: Configured with KEDA to scale to zero when idle and burst rapidly for high-volume data processing.
  • GPU Pools: Targeted scaling for AI-intensive operations like model inference or embedding generation.
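A values.yaml fragment for these three pool profiles might look like the following sketch (pool names, keys, and limits are assumptions that depend on the chart version):

```yaml
# Hypothetical worker.pools configuration showing independent scaling
# policies per pool (key names are illustrative, not the chart's schema).
worker:
  pools:
    realtime:
      minReplicas: 3          # kept warm for low-latency agent responses
      maxReplicas: 20
      scaling: hpa
    batch:
      minReplicas: 0          # KEDA scale-to-zero when idle
      maxReplicas: 100
      scaling: keda
    gpu:
      minReplicas: 0
      maxReplicas: 8
      scaling: keda
      nodeSelector:
        nvidia.com/gpu.present: "true"
      resources:
        limits:
          nvidia.com/gpu: 1   # schedule onto GPU nodes for inference work
```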

Secret Isolation

Security can be hardened at the pool level by injecting secrets directly into specific worker deployments. This ensures that sensitive credentials (like proprietary API keys or database strings) are only accessible to the workers that actually require them, providing robust secret isolation across your organization.
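As a sketch of pool-level secret injection (secret and pool names are hypothetical), each pool can reference only the Kubernetes Secrets it needs, so a credential never reaches pods outside its pool:

```yaml
# Illustrative per-pool secret injection; secret names are hypothetical.
# Only the gpu pool receives the inference API key, and only the batch
# pool receives the warehouse credentials.
worker:
  pools:
    gpu:
      envFrom:
        - secretRef:
            name: inference-api-keys
    batch:
      envFrom:
        - secretRef:
            name: warehouse-db-credentials
```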

Typical Deployment Steps

1. Bootstrap Infrastructure: Use Terraform from noxus-infra to provision the cluster, database, and storage.
2. Install Cluster Operators: Set up the required add-ons, including Ingress controllers, cert-manager, KEDA, and monitoring tools.
3. Configure Environment: Define your values.yaml with environment-specific secrets and connection strings.
4. Deploy via Helm: Install the Noxus Helm chart and verify component health and connectivity.
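The steps above map to a command sequence along these lines; the repository layout, Helm repositories, and chart reference are illustrative, not the exact paths shipped in noxus-infra:

```shell
# 1. Bootstrap infrastructure (module path is illustrative)
cd noxus-infra/terraform/aws
terraform init && terraform apply

# 2. Install cluster operators
helm install keda kedacore/keda --namespace keda --create-namespace
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace --set installCRDs=true

# 3 & 4. Configure and deploy Noxus (chart reference is illustrative)
helm upgrade --install noxus ./charts/noxus \
  --namespace noxus --create-namespace \
  -f values.yaml

# Verify component health
kubectl get pods -n noxus
```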