Introduction

What is SAMMY?

SAMMY (@sammy-labs/sammy-three) is a package for building screen-aware AI agents that can:

Engage in real-time voice conversations
Capture and process screen content, with render-based and video-based options
Manage conversation memory with semantic search and context injection
Handle complex workflows with interactive guides and walkthroughs
Extend capabilities through custom tools and MCP integration
Monitor performance with comprehensive observability and analytics

SAMMY is built for production use, with a worker-based architecture, advanced error handling, and performance optimization.

Key Features

Real-Time Voice

Advanced audio processing with noise suppression, noise gate, and environment presets

Screen Capture

Intelligent screen capture with DOM-to-image conversion and performance optimization

Memory Management

Automatic context tracking with semantic search and conversation continuity
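
To make "semantic search" concrete, here is a minimal sketch of ranking stored conversation snippets by cosine similarity to a query embedding. It is not SAMMY's implementation or API: the MemoryEntry type, the topK function, and the three-dimensional toy vectors are all hypothetical, and a real system would get its embeddings from an embedding model.

```typescript
// Hypothetical shape for a stored memory; not part of the SAMMY API.
interface MemoryEntry {
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k entries whose embeddings are most similar to the query.
function topK(query: number[], entries: MemoryEntry[], k: number): MemoryEntry[] {
  return entries
    .map((entry) => ({ entry, score: cosineSimilarity(query, entry.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.entry);
}

// Toy data: the embeddings are made up for illustration.
const memories: MemoryEntry[] = [
  { text: "User asked how to export a report", embedding: [0.9, 0.1, 0.0] },
  { text: "User changed the billing plan", embedding: [0.0, 0.2, 0.9] },
  { text: "User opened the reports dashboard", embedding: [0.8, 0.3, 0.1] },
];

const relevant = topK([1, 0, 0], memories, 2);
```

Here relevant holds the two report-related entries, which are the snippets an agent might inject back into the conversation as context.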

Interactive Guides

URL-based walkthrough system for onboarding and feature tours

Custom Tools

Extensible tool system with built-in tools and MCP support

Observability

Comprehensive event tracking, analytics, and performance monitoring

Error Handling

Robust error handling with automatic recovery and user-friendly fallbacks

Voice Activity Detection

Configurable VAD settings to prevent stuttering and false interruptions
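
As an illustration of how VAD settings can reduce stuttering and false interruptions, the sketch below implements a simple energy-based detector with two guards: speech starts only after several consecutive loud frames, and ends only after several consecutive quiet frames (a "hangover"). This is an assumption about the general technique, not SAMMY's detector; VadSettings, SimpleVad, and all field names are hypothetical.

```typescript
// Hypothetical settings; SAMMY's actual VAD options may differ.
interface VadSettings {
  energyThreshold: number; // RMS level at or above which a frame counts as loud
  minSpeechFrames: number; // consecutive loud frames needed to start speech
  hangoverFrames: number;  // consecutive quiet frames needed to end speech
}

// Root-mean-square energy of one audio frame (samples in [-1, 1]).
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < frame.length; i++) {
    sum += frame[i] * frame[i];
  }
  return Math.sqrt(sum / frame.length);
}

class SimpleVad {
  private settings: VadSettings;
  private speaking = false;
  private loudRun = 0;
  private quietRun = 0;

  constructor(settings: VadSettings) {
    this.settings = settings;
  }

  // Feed one frame; returns whether the detector currently reports speech.
  process(frame: Float32Array): boolean {
    const loud = rms(frame) >= this.settings.energyThreshold;
    if (loud) {
      this.loudRun++;
      this.quietRun = 0;
    } else {
      this.quietRun++;
      this.loudRun = 0;
    }
    if (!this.speaking && this.loudRun >= this.settings.minSpeechFrames) {
      this.speaking = true; // enough sustained energy: treat as speech
    } else if (this.speaking && this.quietRun >= this.settings.hangoverFrames) {
      this.speaking = false; // enough sustained silence: speech has ended
    }
    return this.speaking;
  }
}

const vad = new SimpleVad({ energyThreshold: 0.1, minSpeechFrames: 2, hangoverFrames: 3 });
const loudFrame = new Float32Array(160).fill(0.5);
const quietFrame = new Float32Array(160).fill(0);
const decisions = [
  loudFrame, quietFrame, loudFrame, loudFrame, quietFrame, quietFrame, quietFrame,
].map((frame) => vad.process(frame));
// decisions: [false, false, false, true, true, true, false]
```

The single loud frame at the start is ignored (no false interruption), and the short pauses after speech begins do not end it immediately (no stuttering). Raising minSpeechFrames or hangoverFrames trades responsiveness for stability.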

Architecture Overview

SAMMY uses a modern, performance-focused architecture:

Worker-based processing for non-blocking operations
Event-driven system with type-safe event handling
Modular design with composable hooks and services
CSP-compliant, using data URLs for worker loading
Memory-efficient with automatic cleanup and optimization
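
To show what "type-safe event handling" can look like in practice, here is a minimal sketch of an event bus whose event names and payloads are checked at compile time. It is a generic pattern under stated assumptions, not SAMMY's event system: AgentEvents, TypedEventBus, and the event names are all hypothetical.

```typescript
// Hypothetical event map; these event names are illustrative, not SAMMY's.
type AgentEvents = {
  "audio:started": { sampleRate: number };
  "capture:frame": { width: number; height: number };
  "error": { message: string; recoverable: boolean };
};

type Handler<T> = (payload: T) => void;

class TypedEventBus<Events extends Record<string, unknown>> {
  private handlers = new Map<keyof Events, Set<Handler<any>>>();

  // Subscribe to an event; returns an unsubscribe function for cleanup.
  on<K extends keyof Events>(event: K, handler: Handler<Events[K]>): () => void {
    const existing = this.handlers.get(event);
    const set = existing ?? new Set<Handler<any>>();
    if (!existing) {
      this.handlers.set(event, set);
    }
    set.add(handler);
    return () => {
      set.delete(handler);
    };
  }

  // Emit an event; the payload type must match the event name.
  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    const set = this.handlers.get(event);
    if (!set) return;
    set.forEach((handler) => handler(payload));
  }
}

const bus = new TypedEventBus<AgentEvents>();
const log: string[] = [];

// The payload type is inferred from the event name.
const off = bus.on("error", (e) => {
  log.push(`${e.recoverable ? "recovering" : "fatal"}: ${e.message}`);
});

bus.emit("error", { message: "microphone unavailable", recoverable: true });
off(); // unsubscribe so the handler can be garbage collected
bus.emit("error", { message: "ignored", recoverable: false });
// log: ["recovering: microphone unavailable"]
```

With this pattern, a call such as bus.emit("error", { sampleRate: 16000 }) fails to compile, and the unsubscribe function returned by on gives each subscriber a simple way to clean up after itself.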

Getting Started

Ready to build your first screen-aware AI agent? Start with our quickstart guide:

Quick Start Guide

Get up and running with SAMMY in under 5 minutes

Or explore specific features:

Installation

Install and configure the package

Authentication

Set up JWT authentication

Basic Usage

Your first screen-aware AI agent

Use Cases

SAMMY is well suited to:

Customer support with voice-enabled help systems
Product onboarding with interactive guided tours
Internal tools with voice-controlled interfaces
Educational platforms with conversational learning
Accessibility features with voice navigation
Complex workflows requiring contextual assistance

What’s Next?

Installation

Follow our quickstart guide to install and configure SAMMY

Configuration

Learn about authentication and basic configuration options

Features

Explore advanced features like audio processing and screen capture

Customization

Add custom tools and MCP integrations for your specific needs

Need help? Check out our troubleshooting guide or explore the comprehensive API reference.
