# Introduction

> Automated testing and quality assurance for your flows

Tests (also called Evaluations) are repeatable checks that verify your flows produce correct, high-quality outputs. By combining **tests**, **evaluators**, and **cases**, you can catch regressions, compare versions, and build confidence before deploying changes.

<img src="https://mintcdn.com/spot-16018069/r-GexOMmWfMkx6yW/images/tests/evaluations-overview.png?fit=max&auto=format&n=r-GexOMmWfMkx6yW&q=85&s=8eaba7db434dbc46295adaea06984202" alt="Tests overview" width="2796" height="1652" data-path="images/tests/evaluations-overview.png" />

## Key Concepts

### Tests

A test is a named collection of evaluators and cases scoped to a single flow. Each test acts as an independent group that can be run on its own or together with other tests.

A test contains:

* **Evaluators** -- rules or criteria that score each case's output.
* **Cases** -- specific input scenarios to run against your flow.

### Evaluators

Evaluators are the scoring functions applied to each case's output. Each evaluator returns a **pass/fail** result that indicates whether the output meets its criteria.

<Note>
  Only text outputs can be evaluated at this time. File outputs, images, and other non-text types are not supported by evaluators.
</Note>

Noxus provides two categories of evaluators:

* **Deterministic evaluators** -- rule-based checks like regex matching, string comparison, and JSON validation. Fast, predictable, and free.
* **AI evaluators** -- LLM-powered assessments that score outputs against natural language rules or multi-criteria rubrics. Flexible but consume model tokens.

See [Evaluators](/platform/tests/evaluators) for the full list of available evaluators and their configuration options.
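
To make the idea of a deterministic evaluator concrete, here is a minimal sketch in plain Python. It is not Noxus code: the function names and signatures are hypothetical and only illustrate how a rule-based check maps a text output to a pass/fail result.

```python
import json
import re


def regex_match(output: str, pattern: str) -> bool:
    """Pass when the pattern is found anywhere in the output."""
    return re.search(pattern, output) is not None


def equals(output: str, expected: str) -> bool:
    """Pass when the output matches the expected text exactly."""
    return output == expected


def is_valid_json(output: str) -> bool:
    """Pass when the output parses as JSON."""
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return False
    return True
```

Each function takes only the output and its own parameters, with no model call involved, which is why checks of this kind are fast and give the same result on every run.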

### Cases

A case defines the inputs your flow will receive during an evaluation run.

Cases can optionally include **Evaluator Values** -- parameters that an evaluator needs to perform its check. You can override some of these values per case when a specific scenario requires different criteria.

<Tip>
  For example, the `Equals` evaluator requires a target value to compare against, and that target may differ from one case to another.
</Tip>
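
One way to picture per-case overrides is as a merge in which the case's values take precedence over the evaluator's defaults. The sketch below assumes exactly that; the `Evaluator` and `Case` classes, the `resolve_values` helper, and the `case_sensitive` parameter are all hypothetical and do not describe how Noxus stores this data.

```python
from dataclasses import dataclass, field


@dataclass
class Evaluator:
    name: str
    defaults: dict = field(default_factory=dict)  # values used when a case has no override


@dataclass
class Case:
    inputs: dict
    evaluator_values: dict = field(default_factory=dict)  # overrides keyed by evaluator name


def resolve_values(evaluator: Evaluator, case: Case) -> dict:
    """Merge the evaluator's defaults with the case's overrides; the case wins."""
    overrides = case.evaluator_values.get(evaluator.name, {})
    return {**evaluator.defaults, **overrides}


# Hypothetical Equals evaluator with a default target and a made-up option.
equals = Evaluator("Equals", {"expected": "Paris", "case_sensitive": True})
france = Case({"question": "Capital of France?"})
italy = Case({"question": "Capital of Italy?"}, {"Equals": {"expected": "Rome"}})

print(resolve_values(equals, france))  # uses the default target
print(resolve_values(equals, italy))   # target replaced for this case only
```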

You can create cases manually or generate them from a previous successful run.

See [Cases](/platform/tests/test-cases) for details on creating and managing test cases.

## How It Works

1. **Create a test** for your flow.
2. **Add evaluators** that define what "correct" means -- string matches, JSON validation, LLM-based scoring, or any combination.
3. **Add cases** with the inputs you want to verify.
4. **Run the test** against the current version or a specific version of your flow.
5. **Review results** -- each case shows a pass/fail status per evaluator, an overall score, and detailed feedback.
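
The steps above can be sketched as a small loop. This is an illustration only, not the Noxus implementation: the flow is modeled as an ordinary Python callable, evaluators as functions that return `True` or `False`, and `run_test`, `greet`, and the dictionary keys are hypothetical names.

```python
def run_test(flow, evaluators, cases):
    """Run every case through the flow and apply every evaluator to its output."""
    results = []
    for case in cases:
        try:
            output = flow(**case["inputs"])
        except Exception as exc:
            # The flow failed, so no evaluator can run for this case.
            results.append({"case": case["name"], "status": "Error", "detail": str(exc)})
            continue
        checks = {name: check(output, case) for name, check in evaluators.items()}
        status = "Passed" if all(checks.values()) else "Failed"
        results.append({"case": case["name"], "status": status, "checks": checks})
    return results


def greet(name):
    """Stand-in for a flow: raises on empty input, otherwise returns a greeting."""
    if not name:
        raise ValueError("empty name")
    return f"Hello, {name}!"


evaluators = {
    "contains_name": lambda output, case: case["inputs"]["name"] in output,
    "equals": lambda output, case: output == case["expected"],
}
cases = [
    {"name": "alice", "inputs": {"name": "Alice"}, "expected": "Hello, Alice!"},
    {"name": "bob", "inputs": {"name": "Bob"}, "expected": "Hi, Bob!"},
    {"name": "empty", "inputs": {"name": ""}, "expected": ""},
]

results = run_test(greet, evaluators, cases)
```

In this sketch the second case fails because only one of its two evaluators passes, and the third ends in an error because the flow raises before any evaluator runs.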

When your flow definition changes after a run, results are automatically flagged as **outdated** so you know to re-run.

<img src="https://mintcdn.com/spot-16018069/r-GexOMmWfMkx6yW/images/tests/run-test.png?fit=max&auto=format&n=r-GexOMmWfMkx6yW&q=85&s=91619a2aab1ebe32bbf6fdae0c3d17f6" alt="Running a test" width="2796" height="1652" data-path="images/tests/run-test.png" />

## Score Ring

Each test displays a score ring summarizing the latest results at a glance:

* **Green** -- passed cases
* **Red** -- failed cases (evaluator assertions did not pass)
* **Gray** -- cases not yet run, or cases that encountered an execution error

The percentage shown is the overall pass rate across all cases.

<img src="https://mintcdn.com/spot-16018069/r-GexOMmWfMkx6yW/images/tests/score-ring.png?fit=max&auto=format&n=r-GexOMmWfMkx6yW&q=85&s=17bcfe1c9bb871a3c619de4aeba424ff" alt="Score ring" width="450" height="153" data-path="images/tests/score-ring.png" />
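
As a rough sketch of the arithmetic behind the ring, the snippet below counts cases per color and computes a pass rate. It assumes that gray cases (not run or errored) count toward the total, which is one reading of "across all cases"; the status strings and the `ring_segments` helper are hypothetical.

```python
from collections import Counter


def ring_segments(statuses):
    """Count cases per ring color and compute the pass rate over all cases."""
    counts = Counter(statuses)
    total = len(statuses)
    green = counts["passed"]
    red = counts["failed"]
    gray = total - green - red  # everything that neither passed nor failed
    pass_rate = round(100 * green / total) if total else 0
    return {"green": green, "red": red, "gray": gray, "pass_rate": pass_rate}


summary = ring_segments(["passed", "passed", "failed", "error", "not_run"])
print(summary)
```

Under this assumption, two passing cases out of five give a 40% pass rate, even though only three of the five cases produced an evaluator verdict.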

## Statuses

Cases can have the following statuses:

| Status                                                                                                                                                                                                                     | Meaning                                                                               |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
| <span className="inline-flex items-center px-2.5 py-0.5 text-xs font-medium rounded-md bg-green-100 text-green-800">Passed</span>                                                                                          | All evaluators passed for this case.                                                  |
| <span className="inline-flex items-center px-2.5 py-0.5 text-xs font-medium rounded-md bg-red-50 text-red-600">Failed</span>                                                                                               | One or more evaluators did not pass.                                                  |
| <span className="inline-flex items-center gap-1.5 px-2.5 py-0.5 text-xs font-medium rounded-md border border-gray-200 !bg-[#FFFFFF] !text-[#1E1E1E]">Error</span> <Icon icon="triangle-alert" size={14} color="#eab308" /> | The flow failed to execute before evaluators could run.                               |
| <span className="inline-flex items-center gap-1 px-2.5 py-0.5 text-xs font-medium rounded-md border border-gray-200 !bg-[#FFFFFF] !text-[#1E1E1E]"><Icon icon="loader-circle" size={12} color="#1E1E1E" /> Running</span>  | The evaluation is currently in progress.                                              |
| <span className="inline-flex items-center px-2.5 py-0.5 text-xs font-medium rounded-md border border-gray-200 !bg-[#FFFFFF] !text-[#1E1E1E]">Not run</span>                                                                | No evaluation results exist for this case yet.                                        |
| <span className="inline-flex items-center px-2.5 py-0.5 text-xs font-medium rounded-md border border-gray-200 !bg-[#FFFFFF] !text-[#1E1E1E]">Cancelled</span>                                                              | The evaluation run was cancelled by a user.                                           |
| <span className="inline-flex items-center gap-1 text-xs font-medium text-gray-700 dark:text-gray-300">— <Icon icon="triangle-alert" size={14} color="#eab308" /></span>                                                    | Outdated: the flow, evaluators, or case data changed after the last run.              |
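
The Passed, Failed, Error, and Not run rows follow directly from the evaluator results, which the sketch below makes explicit. The `case_status` function and its parameters are hypothetical, and the sketch leaves out Running, Cancelled, and outdated results because those depend on the run's lifecycle rather than on evaluator outcomes.

```python
def case_status(evaluator_results, flow_failed=False):
    """Derive a case status from per-evaluator pass/fail results.

    evaluator_results: list of booleans, or None when the case has no results.
    flow_failed: True when the flow raised before evaluators could run.
    """
    if flow_failed:
        return "Error"
    if not evaluator_results:
        return "Not run"
    return "Passed" if all(evaluator_results) else "Failed"
```

A single failing evaluator is enough to mark the whole case as Failed, matching the "one or more evaluators did not pass" rule in the table.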

## Next Steps

* [Evaluators](/platform/tests/evaluators) -- explore all available evaluator types and their configuration.
* [Cases](/platform/tests/test-cases) -- learn how to create and manage test cases.
* [Running Tests](/platform/tests/running-tests) -- understand how to run evaluations and interpret results.
