OpenAI Chat Completion
If you're using OpenAI chat completions in Python, you can get set up in just 2 minutes.
1. Install the Python SDK
Run pip install athina-logger
2. Import Athina Logger
Replace your import openai statement with this:
import os

from athina_logger.api_key import AthinaApiKey
from athina_logger.athina_meta import AthinaMeta
from athina_logger.openai_wrapper import openai

openai.api_key = os.getenv('OPENAI_API_KEY')
3. Set Athina API key
# Initialize the Athina API key somewhere in your code
AthinaApiKey.set_api_key(os.getenv('ATHINA_API_KEY'))
4. Use OpenAI ChatCompletion requests as you normally would
Non-streaming example:
# Use openai.ChatCompletion just as you would normally
# Add fields to AthinaMeta for better segmentation of your data
openai.ChatCompletion.create(
    model="gpt-4",
    messages=messages,
    stream=False,
    athina_meta=AthinaMeta(
        prompt_slug="yc_rag_v1",
        user_query="How much funding does Y Combinator provide?",  # For RAG Q&A systems, log the user's query
        context={"information": retrieved_documents},  # Your retrieved documents
        session_id=session_id,  # Conversation ID
        customer_id=customer_id,  # Your Customer's ID
        customer_user_id=customer_user_id,  # Your End User's ID
        environment=environment,  # Environment (production, staging, dev, etc)
        external_reference_id="ext_ref_123456",
        custom_attributes={
            "name": "John",
            "age": 30,
            "city": "New York",
        },  # Your custom attributes
    ),
)
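The wrapper returns the standard OpenAI response object, so you read the completion exactly as you would without Athina. A minimal sketch, assuming the legacy openai (pre-1.0) response shape used above:

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=messages,
    stream=False,
    athina_meta=AthinaMeta(prompt_slug="yc_rag_v1"),
)
# Response shape is unchanged; Athina logging happens separately in the background
print(response["choices"][0]["message"]["content"])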
Streaming example:
stream = openai.ChatCompletion.create(
    model="gpt-4",
    messages=messages,
    stream=True,
    athina_meta=AthinaMeta(
        prompt_slug="yc_rag_v1",
        user_query="How much funding does Y Combinator provide?",  # For RAG Q&A systems, log the user's query
        context={"information": retrieved_documents},  # Your retrieved documents
        session_id=session_id,  # Conversation ID
        customer_id=customer_id,  # Your Customer's ID
        customer_user_id=customer_user_id,  # Your End User's ID
        environment=environment,  # Environment (production, staging, dev, etc)
        external_reference_id="ext_ref_123456",
        custom_attributes={
            "name": "John",
            "age": 30,
            "city": "New York",
        },  # Your custom attributes
    ),
)

for chunk in stream:
    content = chunk["choices"][0].get("delta", {}).get("content")
    if content is not None:
        print(content, end='')
Note: We support both stream=True and stream=False for OpenAI chat completions. OpenAI doesn't provide usage statistics such as prompt and completion tokens when streaming. However, we overcome this limitation with the tiktoken package, which is designed to work with all tokenized OpenAI GPT models.
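For reference, this is roughly how token counts can be derived with tiktoken when the API doesn't return usage stats. This is a simplified sketch, not Athina's exact accounting (real prompt-token counts also include a small per-message overhead), and streamed_response_text is assumed to be the text you assembled from the stream chunks:

import tiktoken

def count_tokens(text: str, model: str = "gpt-4") -> int:
    # Look up the tokenizer used by this model and count tokens in the text
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

prompt_tokens = sum(count_tokens(m["content"]) for m in messages)
completion_tokens = count_tokens(streamed_response_text)  # assumed: text joined from stream chunks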
Frequently Asked Questions
Q. What is AthinaMeta?
The AthinaMeta fields are used to segment your data on the dashboard. All of these fields are optional, but highly recommended.
class AthinaMeta:
    prompt_slug: Optional[str] = None
    context: Optional[dict] = None
    customer_id: Optional[str] = None
    customer_user_id: Optional[str] = None
    session_id: Optional[str] = None
    user_query: Optional[str] = None
    environment: Optional[str] = None
    external_reference_id: Optional[str] = None
    response_time: Optional[int] = None
    custom_attributes: Optional[dict] = None
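For example, if you want to record latency yourself, you can time the call and pass response_time. A hedged sketch: the timing code and some_llm_call are illustrative, not part of the SDK, and we assume response_time is in milliseconds:

import time

from athina_logger.athina_meta import AthinaMeta

start = time.time()
response = some_llm_call()  # hypothetical call you want to time
response_time_ms = int((time.time() - start) * 1000)

meta = AthinaMeta(
    prompt_slug="yc_rag_v1",
    environment="production",
    response_time=response_time_ms,  # assumed unit: milliseconds
)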
Q. Is this SDK going to make a proxy request to OpenAI through Athina?
Nope! We know how important your OpenAI inference call is, so we don't want to interfere with that or increase response times.
Instead, we simply make a logging API request to Athina, which is separate from your OpenAI request.
Q. Will this SDK increase my latency?
Nope! The logging call is made in a background thread as a fire-and-forget request, so it adds almost no latency (< 5 ms).
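Conceptually, the pattern looks like this. This is a minimal illustration of fire-and-forget logging, not the SDK's actual internals, and send_to_athina is a hypothetical network call:

import threading

def _post_log(payload: dict) -> None:
    # Swallow errors so logging failures never affect the main request path
    try:
        send_to_athina(payload)  # hypothetical network call
    except Exception:
        pass

def log_in_background(payload: dict) -> None:
    # Daemon thread: the caller returns immediately and never blocks on logging
    threading.Thread(target=_post_log, args=(payload,), daemon=True).start()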