A guide to designing automated test cases for evaluating LLM prompts
Documentation of test case design, assertion operations, and computation methods (code-based and LLM-based) for evaluating LLM prompts in Asserto AI.
Introduction
In Asserto AI, testing and evaluation are organized around Test Cases and Assertions. Assertions are checks that verify the output matches your expectations.
Tests are data-driven and require no code: you define input data and expected outputs, and the system checks whether the prompt output matches your expectations.
The term automated tests refers to the automated validation and scoring of the performed tests and evaluations: no human is needed to validate the outputs. It does not mean that tests are created automatically; the system helps you create and run them.
Test Cases
A test case defines two things, illustrated in the sketch below:
- Input data: the values for the individual parameters used in your prompt templates.
- Assertions: one or more assertions to check against the prompt output.
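Putting these together, a hypothetical test case might look like the following. The structure is illustrative only, not Asserto AI's actual schema, and the parameter names and `computation` field are invented for this sketch:

```python
test_case = {
    # Input data: values for the parameters in the prompt template.
    "input_data": {"product": "mobile banking app", "tone": "formal"},
    # Assertions: checks to run against the prompt output.
    "assertions": [
        {"operation": "contains", "value": "login"},
        {"operation": "greater", "computation": "character_count", "value": 100},
    ],
}
```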
Assertions
Supported operations (illustrated in the sketch after this list):
- equal
- contains
- greater
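For illustration, here is a minimal sketch of how these operations might behave. The function names and signatures are hypothetical, not the Asserto AI API:

```python
def assert_equal(output: str, expected: str) -> bool:
    # Passes only when the output matches the expected value exactly.
    return output == expected

def assert_contains(output: str, fragment: str) -> bool:
    # Passes when the expected fragment appears anywhere in the output.
    return fragment in output

def assert_greater(value: float, threshold: float) -> bool:
    # Passes when a numeric value (often produced by a computation)
    # exceeds the given threshold.
    return value > threshold
```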
Often the LLM output does not expose a direct value to assert on. In that case, a computation can be used as an intermediate step in the assertion, as in the sketch below.
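For example, a simple character-count computation turns free-form text into a number that the greater operation can check. A minimal sketch (the threshold of 20 is an arbitrary example):

```python
output = "The login flow crashes whenever the user taps the login button twice."

# Computation: derive a numeric value from the free-form output.
char_count = len(output)

# Assertion: the "greater" operation now has a concrete value to compare.
assert char_count > 20, "output is shorter than expected"
```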
Computations
Computations can be code-based or LLM-based (also known as LLM-as-a-Judge).
Both reference-free and reference-based computations are supported. For a reference-based computation, you need to provide a target value (the ground truth), as sketched below.
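As a rough illustration (the signatures are hypothetical, not the Asserto AI API), the difference lies in whether a computation sees only the output or also a target value:

```python
def character_count(output: str) -> int:
    # Reference-free: scores the output on its own; no ground truth needed.
    return len(output)

def exact_match(output: str, target: str) -> float:
    # Reference-based: compares the output against the target value
    # (the ground truth) provided with the test case.
    return 1.0 if output.strip() == target.strip() else 0.0
```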
Code-based computations
- Character count: a simple measure of the output's length.
- Semantic Textual Similarity: based on the cross-encoder/stsb-distilroberta-base model. Use it when you need accurate similarity scores for sentence pairs, for example:
- “My app crashes when clicking login”
- “App crashes after pressing login button”
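A minimal sketch of such a similarity computation using the sentence-transformers library, applied to the pair above (assuming the library is installed; the 0.8 threshold is an arbitrary example):

```python
from sentence_transformers import CrossEncoder

# Load the cross-encoder fine-tuned on the STS benchmark.
model = CrossEncoder("cross-encoder/stsb-distilroberta-base")

# Score the two bug reports from the example above.
score = model.predict([(
    "My app crashes when clicking login",
    "App crashes after pressing login button",
)])[0]

# STSb cross-encoders return a similarity score between 0 and 1;
# assert that the pair is semantically close.
assert score > 0.8, f"similarity too low: {score:.2f}"
```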