Log via API – Athina

Log via API Request

Using OpenAI with Python? Just follow our quick start guide to get started in just a few lines of code.

Using LiteLLM? Follow this guide (opens in a new tab) to get set up in just a few lines of code.

You can log your inference calls to Athina via a simple API request. The logging request should be made just after you receive a response from the LLM.

Method: POST
Endpoint: https://log.athina.ai/api/v1/log/inference
Headers:
- athina-api-key: YOUR_ATHINA_API_KEY
- Content-Type: application/json

Request Body

Here's a simple logging request:

{
    "language_model_id": "gpt-4",
    "prompt": [{ "role": "user", "content": "Translate this to french: hello, world" }],
    "response": chatCompletionResponse
}

language_model_id (string) is the ID of the language model used for inference. See the list of supported language models.
prompt is the prompt used for inference. This can be either a string or the messages array sent to OpenAI.
response is the response from the LLM. This can be either a string or the ChatCompletion response object from OpenAI.

💡

Tip: Include your logging code in a try/catch block to ensure that your application doesn't crash if the logging request fails.

Eval Fields

For evals, you must also log these fields:

{
    // ...otherFields,
    "user_query": user_query,
    "context": context
}

user_query (string) is the user's query. For conversational applications, this is the user's last message.
context is the context used as information for the prompt. For RAG applications, this is the "retrieved" data.
- You may log context as a string or as an object (dictionary).

Usage Metadata Fields

You should also add the following fields so that we can track and display analytics around token usage, cost and response time

{
  // ...otherFields,
  "prompt_tokens": 50, // tokens in the prompt
  "completion_tokens": 30, // tokens in the completion response
  "total_tokens": 80, // prompt_tokens + completion_tokens
  "response_time": 1208, // in milliseconds
  "cost": 0.0123
}

response_time (integer) is the response time in milliseconds. This is useful for segmenting inference calls by response time.

💡

Tip: If you log the entire OpenAI ChatCompletion response object to us, we'll automatically extract the token counts and cost.

Segmentation Fields

Optionally, you can also add the following fields for better segmentation on the dashboard

{
  // ...otherFields,
  "prompt_slug": "customer_support",
  "environment": "production",
  "customer_id": "stripe",
  "customer_user_id": "shiv@athina.ai",
  "session_id": "c45g-1234-s6g4-43d3",
  "external_reference_id": "5e838eaf-7dd0-4b6f-a32c-26110dd54e58"
}

prompt_slug (string) is the identifier for the prompt used for inference. This is useful for segmenting inference calls by prompt.
environment (string) is the environment your app is running in (ex: production, staging, etc). This is useful for segmenting inference calls by environment.
customer_id (string) is your customer ID. This is useful for segmenting inference calls by customer.
customer_user_id (string) is the end user ID. This is useful for segmenting inference calls by the end user.
cost (number) is the cost incurred for this LLM inference.
session_id (string) is the session or conversation ID. This is used for grouping different inferences into a conversation or chain. Read more
external_reference_id (string) is useful if you want to associate your own internal identifier with the inference logged to Athina.

Custom Attributes

Optionally, you can also log custom attributes with your prompt. You can pass attribute name and attribute value as key-value pair in the custom_attributes object.

Note:- A prompt run cannot have duplicate attribute names

{
  // ...otherFields,
  "custom_attributes": {
    "loggedBy": "John Doe",
    "age": 24,
    "isAdmin": true,
    "references": null
    //any other attribute to be logged
  }
}

Grounded Evals

For grounded evals like Answer Similarity, you must also log a reference response (string) to compare against:

{
    // ...otherFields,
    "expected_response": expected_response
}

expected_response (string) is the reference response to compare against for evaluation purposes. This is useful for segmenting inference calls by expected response.

Function Calling

For function calling, you must also log the following fields:

{
  // ...otherFields,
  "functions": function_call_request,
  "function_call_response": function_call_response,
}

Tools

For tools, you must log the following field

{
  // ...otherFields,
  "tools": tools, // can be any json
  "tool_calls": tool_calls // can be any json
}

If tool_calls field is not present, we extract it from the openai completion response and log it in our database

💡

Tip: To avoid adding any latency to your application, log your inference as a fire-and-forget request.

All other models Log via Python SDK