Task Routing
Queue Types (WORKER_SUBSCRIBE)
Each worker pool subscribes to one or more task types via the WORKER_SUBSCRIBE environment variable.
| Queue Type | Description |
|---|---|
| all | Process all task types |
| all_but_kb | Everything except knowledge-base ingestion |
| flow | Workflow execution only |
| chat | Conversational AI / agent tasks only |
| kb | Knowledge-base ingestion only |
Tenant & Workspace Filtering
Workers can be scoped to specific tenants and/or workspaces using comma-separated ID lists:

| Variable | Description |
|---|---|
| WORKER_SUBSCRIBE_TENANTS | Comma-separated tenant IDs this worker processes (empty = all) |
| WORKER_SUBSCRIBE_WORKSPACES | Comma-separated workspace IDs this worker processes (empty = all) |
For example, a worker set to workerSubscribe: "flow" with workerSubscribeTenants: "tenant-abc,tenant-xyz" will only process workflow tasks for those two tenants.
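Expressed as Helm values, such a worker might be configured like this sketch (the pool name tenant-flows is illustrative; the field names follow the pool configuration reference below):

```yaml
worker:
  pools:
    tenant-flows:                      # hypothetical pool name
      enabled: true
      workerSubscribe: "flow"          # workflow tasks only
      workerSubscribeTenants: "tenant-abc,tenant-xyz"
```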
Worker Pools
Define multiple pools under worker.pools in your Helm values. Each pool creates an independent Kubernetes Deployment.
Pool Configuration Reference
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | — | Enable or disable this pool |
| replicaCount | int | 1 | Static replica count (ignored when autoscaling is enabled) |
| workerSubscribe | string | "all_but_kb" | Queue type subscription |
| workerSubscribeTenants | string | "" | Comma-separated tenant IDs (empty = all) |
| workerSubscribeWorkspaces | string | "" | Comma-separated workspace IDs (empty = all) |
| envSecretRef | string | "" | Name of an additional K8s Secret to layer on top of the shared app-env |
| resources | object | — | CPU/memory requests and limits |
| autoscaling | object | — | HPA or KEDA autoscaling config |
| podDisruptionBudget | object | — | PDB settings |
| affinity | object | {} | Pod affinity/anti-affinity rules |
| nodeSelector | object | {} | Node selector constraints |
| tolerations | list | [] | Node tolerations |
| topologySpreadConstraints | list | [] | Topology spread rules |
Basic Multi-Pool Example
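A minimal two-pool sketch, using the fields from the reference above (pool names, replica counts, and resource figures are illustrative):

```yaml
worker:
  pools:
    default:
      enabled: true
      workerSubscribe: "all_but_kb"    # everything except KB ingestion
      replicaCount: 2
    kb:
      enabled: true
      workerSubscribe: "kb"            # knowledge-base ingestion only
      replicaCount: 1
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          memory: 4Gi
```

Each pool becomes its own Deployment, so the kb pool can be sized and scheduled independently of the default pool.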
Tenant Isolation
Use workerSubscribeTenants and workerSubscribeWorkspaces to dedicate worker pools to specific tenants or workspaces. This is useful for:
- Noisy-neighbor isolation — prevent one tenant’s heavy workloads from starving others
- SLA tiers — dedicated capacity for premium tenants
- Data residency — pin certain tenants to workers in specific regions or nodes
Per-Pool Secrets
By default, all worker pools share the same Kubernetes Secret ({release}-app-env). When different pools need different environment variables — such as separate LLM API keys per tenant, different Redis databases, or pool-specific feature flags — use envSecretRef to layer an additional Secret on top. The referenced Secret is applied after the shared one in envFrom, so its values take precedence for any overlapping keys.
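For example, a premium-tenant pool might layer its own Secret over the shared one (the pool and Secret names here are hypothetical):

```yaml
worker:
  pools:
    premium:
      enabled: true
      workerSubscribe: "flow"
      workerSubscribeTenants: "tenant-premium"
      envSecretRef: premium-worker-env   # layered over {release}-app-env
```

The referenced Secret must already exist in the release namespace, e.g. created out of band with kubectl create secret generic premium-worker-env.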
Multi-Namespace Deployment
To run worker groups in different namespaces (e.g., for resource quotas or network policy isolation), deploy separate Helm releases that share the same backend infrastructure.
Deploy the primary release
The primary release deploys backend, frontend, and the default worker pool.
All worker releases must connect to the same PostgreSQL and Redis
instances. Redis coordinates task distribution — workers in any namespace pick
up tasks from their subscribed queues regardless of where they run.
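Concretely, this might look like two releases of the same chart (the release names, namespaces, chart path, and values files below are illustrative):

```shell
# Primary release: backend, frontend, and the default worker pool
helm install platform ./chart -n platform -f values-primary.yaml

# Worker-only release in its own namespace; its values file points at the
# same PostgreSQL and Redis endpoints as the primary release
helm install platform-workers ./chart -n workers -f values-workers.yaml
```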
Cross-Namespace Considerations
- Secrets: Each namespace gets its own K8s Secret. Use External Secrets Operator or a shared values file to keep credentials in sync.
- Service Account: Worker-only releases still need a ServiceAccount with IRSA annotations for S3 access.
- KEDA: ScaledObjects are namespace-scoped. Each release creates its own KEDA resources; the cluster-wide KEDA operator discovers them automatically.
- Network Policies: Ensure worker namespaces can reach PostgreSQL, Redis, external APIs, and storage endpoints.
Autoscaling
- KEDA (Queue-Based)
- HPA (Resource-Based)

KEDA is best for scaling based on actual queue depth, and it supports scale-to-zero: KEDA polls PostgreSQL to count queued/running tasks and adjusts replicas to maintain a target ratio. Individual pools can override the global KEDA query and target:
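A pool-level override might look like the following sketch (the autoscaling field names and the SQL query are assumptions about the chart schema, not its actual keys):

```yaml
worker:
  pools:
    flow:
      workerSubscribe: "flow"
      autoscaling:
        keda:
          enabled: true
          minReplicaCount: 0    # scale to zero when the queue is empty
          maxReplicaCount: 10
          # Override the global query/target for this pool only
          query: >-
            SELECT COUNT(*) FROM tasks
            WHERE status IN ('queued', 'running') AND task_type = 'flow'
          targetValue: "5"
```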
Run Archiving
After each workflow run completes, the worker archives run data (state, progress, node IO, logs, content streams) from Redis to object storage (S3/GCS/Azure Blob). This ensures long-term persistence but adds per-run overhead from token acquisition and upload latency.
Archive Mode (ARCHIVE_MODE)
| Mode | Behavior | Use case |
|---|---|---|
| sync | Archives immediately after each run completes (default) | Production — guarantees data is persisted before the worker moves on |
| async | Skips per-run archiving; a periodic Celery task sweeps Redis and batch-archives completed runs | High-throughput deployments where per-run archiving creates bottlenecks |
| disabled | No archiving — data stays in Redis until TTL expires or the orphan cleanup task runs | Benchmarking and stress testing |
Configuration
| Variable | Default | Description |
|---|---|---|
| ARCHIVE_MODE | sync | Archive strategy: sync, async, or disabled |
| ARCHIVE_BATCH_INTERVAL | 60 | Seconds between batch archive sweeps (only used in async mode) |
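As a sketch, switching a high-throughput deployment to batch archiving could mean setting the following (the 30-second interval is illustrative; how these variables reach the worker depends on your Secret/values layout):

```shell
ARCHIVE_MODE=async
ARCHIVE_BATCH_INTERVAL=30
```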
Async Mode
In async mode, a periodic Celery task (batch_archive_runs) runs every ARCHIVE_BATCH_INTERVAL seconds and archives all completed runs that have been idle in Redis for at least that duration. This eliminates per-run cloud storage token acquisition and reduces GCP/AWS auth pressure under high concurrency.
Disabled Mode
Use disabled for pure performance benchmarking. Run data remains in Redis (subject to TTL) and can still be retrieved by the backend. The existing archive_orphaned_data periodic task (runs every 30 minutes) acts as a safety net and will eventually archive idle data regardless of mode.
Verification
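One way to spot-check the result, assuming a release named platform in namespace platform with a pool named flow (all hypothetical names):

```shell
# Each enabled pool should appear as its own Deployment
kubectl get deploy -n platform

# Confirm the pool's queue subscription and tenant scoping
kubectl exec -n platform deploy/platform-worker-flow -- env | grep WORKER_SUBSCRIBE
```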
Each worker pool's Deployment should be running and subscribed to the expected queues and tenants.

See also:
- Environment: Platform environment variables and configuration layers
- Secrets: Secret management, per-pool injection, and rotation
- Scaling: General scaling strategies and capacity planning
- Kubernetes: Kubernetes deployment guide