VoiceRun Developer Guide
This guide explains how to create agent functions using the VoiceRun framework. Agent functions are the core building blocks that define how your AI agent behaves and responds to various events.
Overview
VoiceRun is a framework for building conversational AI agents. It provides a simple, event-driven architecture that makes it easy to create sophisticated conversational experiences. The framework handles the complexity of speech-to-text, text-to-speech, and conversation management, allowing you to focus on defining your agent's behavior through Python functions.
Basic Structure
Every agent function follows this pattern:
from primfunctions.events import Event, StartEvent, TextEvent, StopEvent, TextToSpeechEvent, TimeoutEvent
from primfunctions.context import Context


async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        # Handle session start
        yield TextToSpeechEvent(text="Hello!", voice="nova")

    if isinstance(event, TextEvent):
        # Handle text input
        user_message = event.data.get("text", "")
        yield TextToSpeechEvent(text=f"You said: {user_message}", voice="nova")

    if isinstance(event, TimeoutEvent):
        # Handle timeout
        yield TextToSpeechEvent(text="Are you still there?", voice="nova")

    if isinstance(event, StopEvent):
        # Handle session end
        yield TextToSpeechEvent(text="Goodbye!", voice="nova")

Events
Input Events
| Event Type | Description | Data Format |
|---|---|---|
| StartEvent | Session begins | { } |
| TextEvent | User sends text | { "source": "text/speech", "text": "message" } |
| TimeoutEvent | Session timeout | { "count": 1, "ms_since_input": 1000 } |
| StopEvent | Session ends | { } |
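The data fields in the table can drive the handler's branching. The following is a minimal sketch, not the framework's own code: the event classes are hypothetical stand-ins defined locally so the example runs without primfunctions, and the reading of "count" as the number of timeouts so far, along with the thresholds used, is an assumption.

```python
import asyncio
from dataclasses import dataclass, field


# Hypothetical stand-ins for the primfunctions event classes,
# defined here only so the sketch runs on its own.
@dataclass
class Event:
    data: dict = field(default_factory=dict)


class TextEvent(Event):
    pass


class TimeoutEvent(Event):
    pass


@dataclass
class TextToSpeechEvent:
    text: str
    voice: str = "nova"


async def handler(event, context=None):
    if isinstance(event, TextEvent):
        # "source" is "text" or "speech"; "text" holds the message itself.
        source = event.data.get("source", "text")
        message = event.data.get("text", "")
        yield TextToSpeechEvent(text=f"Got {source} input: {message}")
    elif isinstance(event, TimeoutEvent):
        # Assumption: "count" is the number of timeouts so far and
        # "ms_since_input" is the idle time in milliseconds.
        count = event.data.get("count", 0)
        idle_ms = event.data.get("ms_since_input", 0)
        if count >= 3 or idle_ms >= 30000:
            yield TextToSpeechEvent(text="Goodbye!")
        else:
            yield TextToSpeechEvent(text="Are you still there?")


async def collect(event):
    # Drain the async generator and keep only the spoken text.
    return [out.text async for out in handler(event)]
```

Calling `asyncio.run(collect(TimeoutEvent(data={"count": 1, "ms_since_input": 1000})))` drives the handler once and returns the list of prompts it yielded.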
Event Example
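from primfunctions.events import Event, StartEvent, TextToSpeechEvent, AudioEvent, SilenceEvent
from primfunctions.context import Context


async def handler(event: Event, context: Context):
    # Output events are yielded from inside a handler
    if isinstance(event, StartEvent):
        # Speak text with a specific voice
        yield TextToSpeechEvent(
            text="Hello, how can I help you?",
            voice="nova",
            cache=True,  # optional, defaults to True
            interruptable=True,  # optional, defaults to True
        )

        # Play an audio file
        yield AudioEvent(path="/path/to/audio.mp3")

        # Wait for 2 seconds (duration is given in milliseconds)
        yield SilenceEvent(duration=2000)

Context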
The context object provides access to session state and utilities:
# Access session variables
user_name = context.get_data("user_name", "Guest")

# Set session variables
context.set_data("user_name", "John")

# Access environment/organization variables
api_key = context.variables.get("OPENAI_API_KEY")

Tests
Tests allow you to run A/B experiments to optimize your agent's performance:
async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        # Configure A/B/C test for greeting style
        context.add_test("greeting_variant", {
            "formal": 0.33,
            "casual": 0.33,
            "friendly": 0.34
        }, stop={
            "max_iterations": 500,
            "max_confidence": 95,
            "target_outcome": "user_satisfied",
            "default": "friendly"
        })

        variant = context.get_test("greeting_variant")
        if variant == "formal":
            yield TextToSpeechEvent(text="Good day! How may I assist you?", voice="nova")
        elif variant == "casual":
            yield TextToSpeechEvent(text="Hey there! What's up?", voice="nova")
        else:  # friendly
            yield TextToSpeechEvent(text="Hi! I'm here to help!", voice="nova")

Outcomes
Outcomes let you track and optimize for key metrics in your tests, such as conversion rates or user satisfaction.
async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "").lower()

        # Increment conversion rate if user expresses purchase intent
        if "buy" in user_message or "purchase" in user_message:
            current_rate = context.get_outcome("conversion_rate", 0.0)
            context.set_outcome("conversion_rate", current_rate + 0.1)

Advanced Features
Custom Events
from primfunctions.events import CustomEvent, Event, TextToSpeechEvent
from primfunctions.context import Context

# Create custom event
custom_event = CustomEvent("payment_processed", {"amount": 100.00})


# Handle custom events inside the handler
async def handler(event: Event, context: Context):
    if isinstance(event, CustomEvent):
        if event.name == "payment_processed":
            payment_amt = event.data["amount"]
            yield TextToSpeechEvent(text=f"Payment of {payment_amt} processed", voice="nova")

Stop Conditions
# Configure a test with stop conditions for confidence and iterations
async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        context.add_test(
            "button_color",
            {
                "red": 0.5,
                "blue": 0.5
            },
            stop={
                "max_iterations": 1000,
                "max_confidence": 95,
                "target_outcome": "conversion_rate",
                "default": "blue"
            }
        )

        variant = context.get_test("button_color")
        if variant == "red":
            yield TextToSpeechEvent(text="The button is red.", voice="nova")
        else:
            yield TextToSpeechEvent(text="The button is blue.", voice="nova")

Background Tasks
Background tasks allow you to perform time-consuming operations without blocking the main conversation flow. This is essential for creating responsive agents that can handle complex workflows while maintaining natural conversation.
Creating Background Tasks
Background tasks are async generator functions that yield events. They run independently of the main conversation and can perform operations like API calls, database queries, or complex calculations.
Use context.create_task() to launch background tasks. This immediately returns control to the main conversation while the task runs asynchronously.
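The "returns control immediately" behavior can be illustrated with plain asyncio. This is only an analogy: it uses `asyncio.create_task` from the standard library, and whether `context.create_task()` schedules work the same way internally is an assumption. The names `slow_lookup` and `state` are hypothetical.

```python
import asyncio


async def slow_lookup(state: dict) -> None:
    # Stands in for a slow API call or database query.
    await asyncio.sleep(0.05)
    state["result"] = "ready"


async def main() -> list:
    state = {}
    timeline = []

    # Scheduling the task returns at once; the work has not run yet.
    task = asyncio.create_task(slow_lookup(state))
    timeline.append(("after create_task", state.get("result")))

    # The caller is free to do other things here. Awaiting the task
    # is only needed in this sketch to observe the finished state.
    await task
    timeline.append(("after task finished", state.get("result")))
    return timeline


timeline = asyncio.run(main())
```

The first entry records `None` because the task has only been scheduled at that point; the second records the value the task wrote once it completed.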
Background tasks can share state with the main conversation using context.get_data() and context.set_data():
import asyncio
import random
import time

from primfunctions.context import Context
# LogEvent is assumed to be importable alongside the other event classes
from primfunctions.events import Event, LogEvent, StartEvent, TextEvent, TextToSpeechEvent


# Background task: an async generator that yields events while it works
async def background_task(context: Context):
    yield LogEvent("Processing background task...")

    # Set initial state
    context.set_data("task_completed", False)

    # Do work (simulated here with a random delay of up to 10 seconds)
    await asyncio.sleep(random.random() * 10)

    # Update state so the main conversation can see the result
    context.set_data("task_completed", True)
    context.set_data("completion_time", time.time())

    yield LogEvent("Background task done")


async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        yield TextToSpeechEvent(
            text="Hello! I'll start processing your data in the background.",
            voice="brooke"
        )
        context.create_task(background_task(context))

    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "").lower()

        if context.get_data("task_completed", False):
            completion_seconds_ago = int(time.time() - context.get_data("completion_time", 0))
            yield TextToSpeechEvent(
                text=f"The data is done processing. Completion was {completion_seconds_ago} seconds ago.",
                voice="brooke"
            )
            yield TextToSpeechEvent(
                text="Starting new task...",
                voice="brooke"
            )
            context.create_task(background_task(context))
        else:
            yield TextToSpeechEvent(
                text="The data is still processing.",
                voice="brooke"
            )

Background Task Monitoring
Background tasks appear in the Agent Debugger interface with special visual indicators:
- Orange background and border to distinguish from regular events
- Task name display for easy identification
When to Use Background Tasks
Use background tasks for operations that might take more than a few seconds:
- API calls to external services
- Database queries and updates
- File processing and data analysis
- Complex calculations or model inference
- Any operation that could delay the conversation
💡 Pro Tip: Background Task Best Practices
- Always use a meaningful background_task_name for easy identification
- Log progress updates to keep users informed
- Use context for state management between tasks
- Keep background tasks focused on a single responsibility
- Handle errors gracefully in background tasks
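Several of these practices (logging progress, sharing state through the context, handling errors) can be combined in one task. The sketch below is illustrative only: `LogEvent` and `FakeContext` are hypothetical stand-ins defined locally so the example runs without the framework, and `fetch_report` is a made-up operation that always fails in order to exercise the error path.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class LogEvent:  # hypothetical stand-in for the framework's LogEvent
    message: str


class FakeContext:  # hypothetical stand-in exposing only get_data/set_data
    def __init__(self):
        self._data = {}

    def get_data(self, key, default=None):
        return self._data.get(key, default)

    def set_data(self, key, value):
        self._data[key] = value


async def fetch_report():
    # Made-up slow operation that fails, to exercise the error path.
    await asyncio.sleep(0)
    raise TimeoutError("upstream service timed out")


async def report_task(context):
    yield LogEvent("report: started")
    context.set_data("report_status", "running")
    try:
        report = await fetch_report()
    except Exception as exc:
        # Record the failure in shared state so the handler can react,
        # and log it instead of letting the task die silently.
        context.set_data("report_status", "failed")
        context.set_data("report_error", str(exc))
        yield LogEvent(f"report: failed ({exc})")
        return
    context.set_data("report_status", "done")
    context.set_data("report", report)
    yield LogEvent("report: done")


async def run(context):
    # Drain the task's events, as the framework presumably would.
    return [event.message async for event in report_task(context)]


context = FakeContext()
messages = asyncio.run(run(context))
```

A handler could then check `context.get_data("report_status")` on the next TextEvent and tell the user whether the report is ready, still running, or failed.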
Outbound Calling
You can start a phone session (outbound call) by creating an API key in the Prim Voices dashboard and then calling the session start endpoint for your agent.
Create an API key
Issue an API key from Prim Voices → Profile → API Keys.
Start an outbound call
curl 'https://api.primvoices.com/v1/agents/<AGENT_ID>/sessions/start' \
  -X 'POST' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <API_KEY>' \
  --data-raw '{"inputType":"phone","inputParameters":{"phoneNumber":"<NUMBER_TO_DIAL>"},"environment":"<ENVIRONMENT_NAME>","parameters":{}}'

Parameters
- inputParameters.phoneNumber is the phone number to dial.
- environment is the environment name to run the agent in.
- parameters is a dictionary of values passed into the session context when the session starts. Each value is accessible from the handler using context.get_data("key").
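The same request can be assembled from Python. The sketch below mirrors the curl command using only the standard library; the helper name `build_outbound_call_request` and all argument values (agent ID, key, phone number, environment name, `customer_name`) are placeholders chosen for illustration, and the response format is not covered here because the guide does not describe it.

```python
import json
import urllib.request

API_BASE = "https://api.primvoices.com/v1"


def build_outbound_call_request(agent_id, api_key, phone_number, environment, parameters=None):
    """Build (but do not send) the session start request for an outbound call."""
    body = {
        "inputType": "phone",
        "inputParameters": {"phoneNumber": phone_number},
        "environment": environment,
        # Values passed here become available in the session context.
        "parameters": parameters or {},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/agents/{agent_id}/sessions/start",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values; substitute your own agent ID, key, number, and environment.
request = build_outbound_call_request(
    agent_id="AGENT_ID",
    api_key="API_KEY",
    phone_number="+15555550100",
    environment="ENVIRONMENT_NAME",
    parameters={"customer_name": "John"},
)

# To place the call, send the request:
# urllib.request.urlopen(request)
```

With these parameters, the handler would read the value with `context.get_data("customer_name")`.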
This guide provides the foundation for creating sophisticated agent functions. The modular design allows for complex behaviors while maintaining simplicity and testability.
Ready to see examples in action? Check out our Examples section for complete, working agent implementations.