The SAMMY Three package implements a tool system that lets the Live API agent perform actions beyond conversation. Tools are type-safe, event-driven, and integrated with Google’s Gemini Live API.
Architecture Overview
System Components
The tools system consists of several integrated components:
Core Components
- ToolManager: Central registry for managing tool registration, execution, and events
- Tool Definitions: Type-safe tool declarations with handlers and metadata
- Event System: Type-safe event emission and subscription for tool interactions
- Live API Integration: Integration with Google’s Gemini Live API
Tool Definition Structure
Every tool in the system follows a consistent structure defined by the ToolDefinition interface.
Components Breakdown
- Declaration
- Handler Function
- Tool Context
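The three parts listed above could fit together roughly as follows. This is a minimal sketch, not the package's actual interface: the field names (`declaration`, `handler`), the `ToolContext` contents, and the `echoTool` example are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real ToolDefinition interface may differ.
interface ToolDeclaration {
  name: string;
  description: string;
  parameters: Record<string, { type: string; description: string }>;
}

// Stand-in for whatever services and state the agent exposes to handlers.
interface ToolContext {
  sessionId: string;
}

interface ToolDefinition<Args, Result> {
  declaration: ToolDeclaration; // schema advertised to the Live API
  handler: (args: Args, context: ToolContext) => Promise<Result>; // tool logic
}

// Illustrative tool that echoes its input back, prefixed with the session ID.
const echoTool: ToolDefinition<{ text: string }, { echoed: string }> = {
  declaration: {
    name: "echo",
    description: "Returns the text it was given.",
    parameters: { text: { type: "string", description: "Text to echo back" } },
  },
  handler: async (args, context) => ({
    echoed: `${context.sessionId}: ${args.text}`,
  }),
};
```

Keeping the declaration separate from the handler means the schema can be sent to the model without exposing any implementation details.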
The declaration defines the tool’s schema for the Live API.
Tool Lifecycle
Registration Phase
1. Tool Registration: Tools are registered during agent initialization.
2. Automatic Setup: The ToolManager automatically:
- Registers default tools from DefaultToolDefinitions
- Registers custom tools passed via the constructor
- Creates type-safe event emitters
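The registration steps above can be sketched as a small registry class. Everything here is an assumption for illustration: the `ToolManager` constructor signature, the contents of `DefaultToolDefinitions`, and the rule that a custom tool overrides a default with the same name. Event emitter creation is left out to keep the sketch short.

```typescript
type Handler = (args: Record<string, unknown>) => unknown;

interface ToolDefinition {
  name: string;
  handler: Handler;
}

// Hypothetical stand-in for the package's DefaultToolDefinitions.
const DefaultToolDefinitions: ToolDefinition[] = [
  { name: "endSession", handler: () => ({ ended: true }) },
  { name: "getContext", handler: () => ({ context: {} }) },
];

class ToolManager {
  private readonly tools = new Map<string, ToolDefinition>();

  constructor(customTools: ToolDefinition[] = []) {
    // Defaults are registered first, then custom tools.
    // Assumption: a custom tool replaces a default with the same name.
    for (const tool of DefaultToolDefinitions.concat(customTools)) {
      this.tools.set(tool.name, tool);
    }
  }

  has(name: string): boolean {
    return this.tools.has(name);
  }

  names(): string[] {
    return Array.from(this.tools.keys());
  }
}

const manager = new ToolManager([
  { name: "lookupOrder", handler: () => ({ status: "shipped" }) },
]);
```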
Configuration Phase
During agent startup, tools are integrated with the Live API.
Execution Phase
1. Message Reception: The Live API sends a tool call message.
2. Event Emission: The GenAI client emits a ‘toolcall’ event.
3. Handler Routing: The ToolManager routes the call to the appropriate handler.
4. Execution: The tool handler executes with its context.
5. Response: The function response is sent back to the Live API.
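The five steps above can be traced end to end with a stand-in client. This is only a sketch under stated assumptions: `FakeLiveClient`, its `receive` and `sendResponses` methods, and the handler signature are hypothetical and do not reflect the real GenAI client or ToolManager API.

```typescript
interface FunctionCall { id: string; name: string; args: Record<string, unknown>; }
interface FunctionResponse { id: string; name: string; response: Record<string, unknown>; }
interface ToolContext { sessionId: string; }
type Handler = (args: Record<string, unknown>, context: ToolContext) => Record<string, unknown>;

// Hypothetical stand-in for the GenAI client.
class FakeLiveClient {
  readonly sent: FunctionResponse[] = [];
  private listeners: Array<(calls: FunctionCall[]) => void> = [];

  on(_event: "toolcall", listener: (calls: FunctionCall[]) => void): void {
    this.listeners.push(listener);
  }

  // Steps 1-2: a tool call message arrives and is emitted as 'toolcall'.
  receive(calls: FunctionCall[]): void {
    this.listeners.forEach((listener) => listener(calls));
  }

  // Step 5: responses go back to the Live API (recorded here instead).
  sendResponses(responses: FunctionResponse[]): void {
    this.sent.push(...responses);
  }
}

const handlers = new Map<string, Handler>();
handlers.set("getContext", (_args, context) => ({ sessionId: context.sessionId }));

const client = new FakeLiveClient();
client.on("toolcall", (calls) => {
  const responses = calls.map((call) => {
    const handler = handlers.get(call.name); // step 3: route by tool name
    const response = handler
      ? handler(call.args, { sessionId: "demo" }) // step 4: run with context
      : { error: "tool not found" };
    return { id: call.id, name: call.name, response };
  });
  client.sendResponses(responses);
});

client.receive([{ id: "1", name: "getContext", args: {} }]);
```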
Built-in Tools
SAMMY Three includes several pre-configured tools that handle common scenarios:
- End Session Tool
- Get Context Tool
- Escalate Tool
Session Management
Purpose: Manages agent session termination gracefully
Category: Action
Behavior: Non-blocking
Automatic Triggers:
- User says goodbye or wants to end chat
- Task has been completed successfully
- User explicitly requests to stop
When invoked, the handler:
- Validates the confirmation parameter
- Emits a session end event if confirmed
- Saves conversation state
- Triggers cleanup processes
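A handler following the behavior listed above might look like the sketch below. The parameter name `confirmed`, the context methods, and the return shape are assumptions for illustration; the built-in End Session tool may be implemented differently.

```typescript
interface EndSessionArgs {
  confirmed?: unknown; // untrusted input from the model
}

// Hypothetical context; the real tool context may expose different members.
interface EndSessionContext {
  emitSessionEnd: () => void;
  saveConversation: () => void;
  cleanup: () => void;
}

function handleEndSession(
  args: EndSessionArgs,
  context: EndSessionContext,
): { ended: boolean; reason?: string } {
  // Validate the confirmation parameter before doing anything else.
  if (typeof args.confirmed !== "boolean") {
    return { ended: false, reason: "confirmed must be a boolean" };
  }
  if (!args.confirmed) {
    return { ended: false, reason: "session end was not confirmed" };
  }
  context.emitSessionEnd(); // notify subscribers
  context.saveConversation(); // persist conversation state
  context.cleanup(); // release resources
  return { ended: true };
}

// Usage with a context that records which steps ran.
const steps: string[] = [];
const demoContext: EndSessionContext = {
  emitSessionEnd: () => { steps.push("emit"); },
  saveConversation: () => { steps.push("save"); },
  cleanup: () => { steps.push("cleanup"); },
};
const endResult = handleEndSession({ confirmed: true }, demoContext);
```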
Creating Custom Tools
Step 1: Define Tool Types
Add your tool to the enum and event map.
Step 2: Create Tool Definition
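Steps 1 and 2 could look roughly like this. All names are hypothetical (`ToolName`, `ToolEventMap`, `lookupOrderTool`, and the handler signature with an `emit` callback); they only illustrate how an enum entry, an event map entry, and a definition can be tied together by type.

```typescript
// Step 1 (sketch): add the tool to the enum and the event map.
enum ToolName {
  EndSession = "endSession",
  LookupOrder = "lookupOrder", // new custom tool
}

interface ToolEventMap {
  [ToolName.EndSession]: { confirmed: boolean };
  [ToolName.LookupOrder]: { orderId: string; status: string };
}

// Step 2 (sketch): create the definition, typed by the enum member.
interface ToolDefinition<N extends ToolName> {
  declaration: { name: N; description: string };
  handler: (
    args: Record<string, unknown>,
    emit: (payload: ToolEventMap[N]) => void,
  ) => Promise<Record<string, unknown>>;
}

const lookupOrderTool: ToolDefinition<ToolName.LookupOrder> = {
  declaration: {
    name: ToolName.LookupOrder,
    description: "Looks up the status of an order by its ID.",
  },
  handler: async (args, emit) => {
    const orderId = String(args["orderId"] ?? "");
    if (orderId === "") {
      return { error: "orderId is required" };
    }
    const status = "shipped"; // placeholder for a real lookup
    emit({ orderId, status }); // payload is checked against ToolEventMap
    return { orderId, status };
  },
};
```

Because the payload type is derived from the enum member, emitting an event with the wrong shape fails at compile time rather than at runtime.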
Tool Execution Flow
Parallel Processing
When a tool call contains multiple function calls, the ToolManager processes them in parallel rather than one after another.
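One common way to get this behavior in TypeScript is `Promise.all`, sketched below. Whether the ToolManager uses exactly this approach is an assumption; `runTool` and `handleToolCalls` are hypothetical names, and the `trace` array exists only to make the interleaving visible.

```typescript
interface FunctionCall { id: string; name: string; }
interface FunctionResponse { id: string; name: string; response: { output: string }; }

const trace: string[] = [];

async function runTool(call: FunctionCall): Promise<FunctionResponse> {
  trace.push(`start:${call.name}`);
  await Promise.resolve(); // stand-in for real asynchronous work
  trace.push(`end:${call.name}`);
  return { id: call.id, name: call.name, response: { output: `${call.name} done` } };
}

async function handleToolCalls(calls: FunctionCall[]): Promise<FunctionResponse[]> {
  // Every handler is started before any of them is awaited.
  // Promise.all preserves the order of the input array in its result.
  return Promise.all(calls.map((call) => runTool(call)));
}
```

Note that `Promise.all` rejects as soon as any handler rejects, so each handler should convert its own failures into an error response instead of throwing.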
Error Handling
Errors are handled at three levels:
1. Missing Tool Name: Returns a structured error response
2. Missing Handler: Returns a “tool not found” error
3. Handler Errors: Catches and formats exceptions
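The three levels above can be sketched as a single guarded execution function. The function name `executeCall`, the error messages, and the response shape are assumptions for illustration, not the package's actual output.

```typescript
interface FunctionCall { id: string; name?: string; args?: Record<string, unknown>; }
interface FunctionResponse { id: string; name: string; response: Record<string, unknown>; }
type Handler = (args: Record<string, unknown>) => Promise<Record<string, unknown>>;

async function executeCall(
  call: FunctionCall,
  handlers: Map<string, Handler>,
): Promise<FunctionResponse> {
  // Level 1: the call carries no tool name.
  if (!call.name) {
    return { id: call.id, name: "unknown", response: { error: "Tool call is missing a name" } };
  }
  // Level 2: no handler is registered under that name.
  const handler = handlers.get(call.name);
  if (!handler) {
    return { id: call.id, name: call.name, response: { error: `Tool not found: ${call.name}` } };
  }
  // Level 3: the handler itself throws or rejects.
  try {
    return { id: call.id, name: call.name, response: await handler(call.args ?? {}) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { id: call.id, name: call.name, response: { error: message } };
  }
}
```

Since every path returns a response instead of throwing, one failing tool does not prevent the other calls in the same batch from being answered.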
Event System
Type-Safe Events
Compile-Time Safety
The event system provides compile-time type safety:
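As an illustration of what compile-time safety means for events, the sketch below ties each event name to a payload type through an event map. `TypedEmitter` and `ToolEvents` are hypothetical names written for this example and are not the package's own emitter.

```typescript
// Hypothetical event map: event name -> payload type.
interface ToolEvents {
  sessionEnd: { confirmed: boolean };
  escalate: { reason: string };
}

class TypedEmitter<Events> {
  // Listeners are stored untyped internally; the public methods enforce types.
  private readonly listeners = new Map<keyof Events, Array<(payload: unknown) => void>>();

  on<K extends keyof Events>(event: K, listener: (payload: Events[K]) => void): void {
    const list: Array<(payload: unknown) => void> = this.listeners.get(event) ?? [];
    list.push(listener as (payload: unknown) => void);
    this.listeners.set(event, list);
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    for (const listener of this.listeners.get(event) ?? []) {
      listener(payload);
    }
  }
}

const events = new TypedEmitter<ToolEvents>();
const reasons: string[] = [];

// payload is inferred as { reason: string } from the event name.
events.on("escalate", (payload) => { reasons.push(payload.reason); });
events.emit("escalate", { reason: "user asked for a human" });

// These would be rejected by the compiler:
// events.emit("escalate", { confirmed: true }); // wrong payload shape
// events.on("unknownEvent", () => {});          // event not in the map
```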
Event Flow
Error Response Structure
All errors return consistent FunctionResponse objects:
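A helper that builds such an object might look like the sketch below. The field layout is an assumption: the local `FunctionResponse` type is a simplified stand-in for the SDK type, and the `success`/`error` keys inside `response` are hypothetical. The `SILENT` value reflects the scheduling this section describes for errors.

```typescript
// Local stand-ins, not the SDK's own types.
type Scheduling = "SILENT" | "WHEN_IDLE" | "INTERRUPT";

interface FunctionResponse {
  id: string;
  name: string;
  response: { success: boolean; error?: string };
  scheduling: Scheduling;
}

function createErrorResponse(id: string, name: string, error: unknown): FunctionResponse {
  // Normalize anything that was thrown into a plain message string.
  const message = error instanceof Error ? error.message : String(error);
  return {
    id,
    name,
    response: { success: false, error: message },
    scheduling: "SILENT", // report the failure without interrupting the conversation
  };
}

const sample = createErrorResponse("call-1", "lookupOrder", new Error("timeout"));
```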
Error Types
- Missing Tool Name: Tool call without a name
- Missing Handler: Tool not registered
- Handler Exceptions: Runtime errors in tool logic
Errors use SILENT scheduling to avoid interrupting conversation flow.
Best Practices
Tool Design
1. Single Responsibility: Each tool should have one clear purpose
2. Clear Descriptions: Provide comprehensive descriptions for the LLM
3. Parameter Validation: Always validate input parameters
4. Error Handling: Implement robust error handling
Async Operations
- Use Behavior.NON_BLOCKING for async operations
- Handle timing with appropriate FunctionResponseScheduling
- Consider user experience when choosing scheduling
Event Usage
- Emit events for significant state changes
- Use type-safe event definitions
- Document event contracts
Performance
- Process multiple calls in parallel when possible
- Avoid blocking operations in tool handlers
- Use appropriate scheduling for response timing
Advanced Features
Asynchronous Function Calling
- Scheduling Options
- Custom Event Maps
For non-blocking operations, tools can use different scheduling options:
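As a rough sketch of how a tool might pick between them, the example below maps an urgency level to a scheduling value. The enum is a local stand-in that mirrors the names `SILENT`, `WHEN_IDLE`, and `INTERRUPT`; real code would presumably use the SDK's own `FunctionResponseScheduling`. The `Urgency` type, both function names, and the placement of `scheduling` on the response object are assumptions for illustration.

```typescript
// Local stand-in mirroring the scheduling names; not imported from the SDK.
enum FunctionResponseScheduling {
  SILENT = "SILENT",
  WHEN_IDLE = "WHEN_IDLE",
  INTERRUPT = "INTERRUPT",
}

// Hypothetical classification of how soon the user should hear about a result.
type Urgency = "background" | "normal" | "urgent";

function chooseScheduling(urgency: Urgency): FunctionResponseScheduling {
  switch (urgency) {
    case "urgent":
      // Surface the result right away, even if the model is mid-response.
      return FunctionResponseScheduling.INTERRUPT;
    case "normal":
      // Wait until the model has finished what it is currently doing.
      return FunctionResponseScheduling.WHEN_IDLE;
    case "background":
      // Let the model take note of the result without announcing it.
      return FunctionResponseScheduling.SILENT;
  }
}

function buildResponse(
  id: string,
  name: string,
  result: Record<string, unknown>,
  urgency: Urgency,
) {
  return { id, name, response: result, scheduling: chooseScheduling(urgency) };
}

const reply = buildResponse("call-7", "lookupOrder", { status: "shipped" }, "normal");
```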
Service Integration
Tools have full access to the service layer.
State Access
Tools can access current agent state.
Integration with Live API
The tools system works alongside other Live API capabilities:
- Function Calling: Direct integration with Gemini’s function calling
- Code Execution: Can be combined with code execution tools
- Google Search: Compatible with search grounding
- Multi-modal: Supports audio, text, and visual contexts
Summary
The tool system extends what the agent can do while maintaining:
- Type Safety: Full TypeScript support throughout
- Performance: Parallel processing and optimal scheduling
- Developer Experience: Clean API and clear patterns
- Extensibility: Easy to add custom tools without breaking existing ones