Eval Cookbooks
Here are some cookbooks we've prepared to make it easy to set up and run evals using Athina.
- Run a preset eval: This cookbook shows you how to run a single eval on your dataset.
- Run an eval suite: This cookbook shows you how to run a suite of evals.
- Run an experiment: This cookbook shows how to run an eval using Athina and also log the experiment configuration. This is very similar to the first cookbook, but you also describe an AthinaExperiment object, so the experiment is logged to your develop dashboard along with its metadata and experiment parameters (like the prompt).
Custom grading criteria are the easiest way to create your own eval.
These evals take the format: "If X, then fail. Otherwise, pass."
The criteria are wrapped inside our CoT (chain-of-thought) prompt, which enforces a JSON output of pass / fail along with a reason.
This is best used for very simple conditional evals (like the one below).
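To make the wrapping step concrete, here is a minimal sketch of how an "If X, then fail. Otherwise, pass" criterion could be placed inside a chain-of-thought prompt and how the JSON verdict could be parsed. This is not Athina's actual prompt or API: the template text, the function names (`build_prompt`, `parse_verdict`, `grade`), the JSON keys (`result`, `reason`), and the `fake_llm` stand-in are all assumptions made for illustration.

```python
import json
from typing import Callable

# Hypothetical template: Athina's real CoT prompt is not shown in this document.
PROMPT_TEMPLATE = """You are grading an LLM response.
Grading criteria: {criteria}

Response to grade:
{response}

Think step by step, then answer with JSON only:
{{"result": "Pass" or "Fail", "reason": "<short explanation>"}}"""


def build_prompt(criteria: str, response: str) -> str:
    """Wrap the user's grading criteria and the response in the grading prompt."""
    return PROMPT_TEMPLATE.format(criteria=criteria, response=response)


def parse_verdict(raw: str) -> dict:
    """Parse the model output and enforce a pass / fail result plus a reason."""
    data = json.loads(raw)
    result = str(data.get("result", "")).strip().lower()
    if result not in ("pass", "fail"):
        raise ValueError(f"expected Pass or Fail, got {data.get('result')!r}")
    return {"passed": result == "pass", "reason": str(data.get("reason", ""))}


def grade(criteria: str, response: str, llm: Callable[[str], str]) -> dict:
    """Run one grading-criteria eval using any callable that maps prompt to text."""
    return parse_verdict(llm(build_prompt(criteria, response)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: applies a fixed rule to the graded response."""
    graded = prompt.split("Response to grade:", 1)[1].rsplit("Think step by step", 1)[0]
    failed = "as an ai" in graded.lower()
    return json.dumps({
        "result": "Fail" if failed else "Pass",
        "reason": "Mentions being an AI." if failed else "No AI disclaimer found.",
    })


if __name__ == "__main__":
    criteria = "If the response says it is an AI language model, then fail. Otherwise, pass."
    print(grade(criteria, "As an AI language model, I cannot help.", fake_llm))
    print(grade(criteria, "Paris is the capital of France.", fake_llm))
```

In practice `fake_llm` would be replaced by a call to a real model. Rejecting any output that is not strictly pass or fail keeps the eval result unambiguous, which is one plausible reason to enforce a JSON format.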