Why Athina Evals

Why should I use Athina's Eval framework instead of writing my own evals?

You could build your own eval system from scratch, but here's why Athina might be better for you.

  • Athina provides plug-and-play preset evals that have been well tested.
  • Athina evals can run in both development and production, giving you consistent metrics for evaluating model performance and drift.
  • Athina removes the need for your team to write boilerplate loaders, implement LLM calls, normalize data formats, etc.
  • Athina offers a modular, extensible framework for writing and running evals.
  • Athina calculates analytics like pass rate and flakiness, and lets you batch-run evals against live production data or dev datasets.
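
To make the two analytics named above concrete, here is a minimal sketch of how pass rate and flakiness could be computed by hand. It does not use Athina's API, and the definitions are assumptions: pass rate is taken as the fraction of eval runs that passed, and flakiness as the fraction of datapoints whose repeated runs disagree with each other. The function names and the `(datapoint_id, passed)` result format are hypothetical.

```python
from collections import defaultdict


def pass_rate(results):
    """Fraction of eval runs that passed.

    `results` is a list of (datapoint_id, passed) pairs (hypothetical format).
    """
    if not results:
        return 0.0
    return sum(1 for _, passed in results if passed) / len(results)


def flakiness(results):
    """Fraction of datapoints whose repeated runs gave different outcomes.

    This definition is an assumption, not necessarily the one Athina uses.
    """
    outcomes = defaultdict(set)
    for datapoint_id, passed in results:
        outcomes[datapoint_id].add(bool(passed))
    if not outcomes:
        return 0.0
    flaky = sum(1 for seen in outcomes.values() if len(seen) > 1)
    return flaky / len(outcomes)


# Each datapoint is evaluated twice; only "q2" gives inconsistent results.
runs = [
    ("q1", True), ("q1", True),
    ("q2", True), ("q2", False),
    ("q3", False), ("q3", False),
]

print(f"pass rate: {pass_rate(runs):.2f}")
print(f"flakiness: {flakiness(runs):.2f}")
```

Even in this small form, the bookkeeping (grouping repeated runs, handling empty inputs) is the kind of boilerplate an eval framework is meant to take off your hands.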

Athina Develop: A UI for your iterations

  • Athina Evals also integrate automatically with a UI where you can view results, metrics, and historical records in a user-friendly dashboard.
  • Your experiments are tracked automatically, so you can view a historical record of previous eval runs, including a history of your prompts, models, datasets, and more.

The Athina Team is here for you

  • We are always improving our eval system.
  • We work closely with our users, and can even help design custom evals.

If you want to talk, book a call with a founder directly.