
Quick Start Guide

Get your screen-aware AI agent running in a few minutes with this step-by-step guide.

Installation

Install the SAMMY package using your preferred package manager:
npm install @sammy-labs/sammy-three
For UI components, also install the companion UI kit:
npm install @sammy-labs/sammy-three-ui-kit

Authentication

SAMMY requires JWT authentication. Set up your authentication configuration:
1. Get Your JWT Token

Obtain a JWT token from your authentication system or the SAMMY Labs API.
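
Before handing a token to the agent, it can help to check whether it has already expired. The sketch below decodes the payload segment of a JWT and reads the standard `exp` claim (seconds since the Unix epoch, as defined in RFC 7519). The helper names `decodeJwtPayload` and `isTokenExpired` and the 30-second clock-skew allowance are illustrative assumptions, not part of the SAMMY package, and the code does not verify the token's signature.

```javascript
// Decode the payload (second segment) of a JWT without verifying its signature.
function decodeJwtPayload(token) {
  const parts = String(token).split('.');
  if (parts.length !== 3) {
    throw new Error('Expected a JWT with three dot-separated segments');
  }
  // JWT segments are base64url-encoded: map to standard base64 and restore padding.
  const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, '=');
  // atob yields a binary string; convert to bytes so UTF-8 claims decode correctly.
  const bytes = Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Returns true when the exp claim is in the past (allowing for clock skew).
// skewSeconds is an assumed safety margin, not a SAMMY setting.
function isTokenExpired(token, nowMs = Date.now(), skewSeconds = 30) {
  const { exp } = decodeJwtPayload(token);
  if (typeof exp !== 'number') return false; // no exp claim: treat as non-expiring
  return nowMs >= (exp - skewSeconds) * 1000;
}
```

A check like this is only a client-side convenience; the server remains the authority on whether a token is valid.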
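2. Create Auth Hook

import { useEffect, useState } from 'react';

// fetchAuthToken and refreshAuthToken are placeholders for your own
// application's token functions; they are not part of the SAMMY package.
const useAuth = () => {
  const [token, setToken] = useState(null);

  useEffect(() => {
    // Fetch your JWT token once on mount
    fetchAuthToken()
      .then(setToken)
      .catch((error) => console.error('Failed to fetch auth token:', error));
  }, []);

  return {
    token,
    baseUrl: process.env.NEXT_PUBLIC_SAMMY_API_BASE_URL || 'https://api.sammylabs.com',
    onTokenExpired: async () => {
      // Handle token refresh
      const newToken = await refreshAuthToken();
      setToken(newToken);
    },
  };
};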
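
If several requests fail at once, `onTokenExpired` may fire more than once in quick succession. One way to avoid duplicate refresh calls is to share a single in-flight promise. This is a minimal sketch: `createSingleFlight` is a hypothetical helper, and `refreshAuthToken` stands for your own refresh function, as in the hook above.

```javascript
// Wrap an async function so that overlapping calls share one pending promise.
function createSingleFlight(fn) {
  let inFlight = null;
  return function run() {
    if (inFlight === null) {
      // The async wrapper turns synchronous throws from fn into rejections.
      inFlight = (async () => fn())().finally(() => {
        inFlight = null; // allow a fresh call once this one settles
      });
    }
    return inFlight;
  };
}

// Usage sketch (refreshAuthToken is a placeholder for your own function):
//   const refreshOnce = createSingleFlight(refreshAuthToken);
//   onTokenExpired: async () => setToken(await refreshOnce()),
```

Callers that arrive while a refresh is pending all receive the same promise, so the refresh endpoint is hit once per expiry rather than once per caller.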
3. Environment Variables

Add these environment variables to your .env.local:
NEXT_PUBLIC_SAMMY_API_BASE_URL=https://your-api-url.com
NEXT_PUBLIC_APP_VERSION=1.0.0
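
The auth hook falls back to `https://api.sammylabs.com` when `NEXT_PUBLIC_SAMMY_API_BASE_URL` is unset. A small helper can make that fallback explicit and catch malformed values early. `resolveBaseUrl` is a hypothetical helper name, and trimming trailing slashes is an assumption about what is convenient, not a documented SAMMY requirement. The helper takes the value rather than the whole `process.env` object because Next.js inlines `NEXT_PUBLIC_` variables only where they are referenced by their full name.

```javascript
const DEFAULT_BASE_URL = 'https://api.sammylabs.com'; // fallback used in the auth hook

// Validate a base URL string and fall back to the default when it is empty.
function resolveBaseUrl(value) {
  const raw = typeof value === 'string' ? value.trim() : '';
  if (raw === '') return DEFAULT_BASE_URL;

  let parsed;
  try {
    parsed = new URL(raw);
  } catch (error) {
    throw new Error(`Base URL is not a valid URL: ${raw}`);
  }
  if (parsed.protocol !== 'https:' && parsed.protocol !== 'http:') {
    throw new Error(`Base URL must use http or https: ${raw}`);
  }
  return raw.replace(/\/+$/, ''); // drop trailing slashes (assumed convention)
}

// Usage sketch:
//   baseUrl: resolveBaseUrl(process.env.NEXT_PUBLIC_SAMMY_API_BASE_URL),
```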

Basic Usage

1. Wrap Your App

Import the required components and wrap your application:
import {
  SammyAgentProvider,
  useSammyAgentContext,
} from '@sammy-labs/sammy-three';
import '@sammy-labs/sammy-three/styles.css';

function App() {
  const auth = useAuth();

  if (!auth.token) {
    return <div>Loading authentication...</div>;
  }

  return (
    <SammyAgentProvider
      config={{
        auth: auth,
        captureMethod: 'render', // or 'video'
        debugLogs: true,
        model: 'models/gemini-2.5-flash-preview-native-audio-dialog',
      }}
      onError={(error) => console.error('Agent error:', error)}
      onTokenExpired={auth.onTokenExpired}
    >
      <YourApp />
    </SammyAgentProvider>
  );
}

2. Create Your First Screen-aware AI Agent

Use the context hook to interact with the agent:
function ChatComponent() {
  const {
    startAgent,
    stopAgent,
    sendMessage,
    toggleMuted,
    agentStatus,
    agentVolume,
    userVolume,
  } = useSammyAgentContext();

  const handleStart = async () => {
    const success = await startAgent({
      agentMode: 'user', // 'admin', 'user', or 'sammy'
    });
    
    if (success) {
      console.log('Agent started successfully');
    }
  };

  return (
    <div>
      <button 
        onClick={handleStart} 
        disabled={agentStatus === 'connecting'}
      >
        {agentStatus === 'connecting' ? 'Starting...' : 'Start Agent'}
      </button>
      
      <button 
        onClick={stopAgent} 
        disabled={agentStatus === 'disconnected'}
      >
        Stop Agent
      </button>
      
      <button onClick={() => sendMessage('Hello!')}>
        Send Message
      </button>
      
      <button onClick={toggleMuted}>
        Mute/Unmute
      </button>

      <div>Status: {agentStatus}</div>
      <div>Agent Volume: {Math.round(agentVolume * 100)}%</div>
      <div>User Volume: {Math.round(userVolume * 100)}%</div>
    </div>
  );
}
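
The button logic in `ChatComponent` can be pulled into plain functions so it is easy to test outside React. The sketch below relies only on the two status values used above, `'connecting'` and `'disconnected'`; any other status strings the agent may report are treated generically. `getControlState` and `toPercent` are hypothetical helper names, and the 0 to 1 volume range is an assumption inferred from the percentage display above.

```javascript
// Derive button state from the agent status string.
function getControlState(agentStatus) {
  const connecting = agentStatus === 'connecting';
  return {
    startDisabled: connecting,
    startLabel: connecting ? 'Starting...' : 'Start Agent',
    stopDisabled: agentStatus === 'disconnected',
  };
}

// Convert a volume level (assumed 0..1) to a whole percentage, tolerating bad input.
function toPercent(level) {
  if (!Number.isFinite(level)) return 0;
  const clamped = Math.min(1, Math.max(0, level));
  return Math.round(clamped * 100);
}
```

Inside the component, `getControlState(agentStatus)` would feed the `disabled` props and the start button label, and `toPercent(agentVolume)` would replace the inline `Math.round` calls.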

Enhanced Configuration

Audio Processing

Add advanced audio processing for better voice quality:
const config = {
  auth: authConfig,
  captureMethod: 'render',
  
  // Audio processing configuration
  audioConfig: {
    noiseSuppression: {
      enabled: true,
      enhancementLevel: 'medium', // 'light', 'medium', 'aggressive'
    },
    noiseGate: {
      enabled: true,
      threshold: 0.04,
      attackTime: 30,
      holdTime: 400,
      releaseTime: 150,
    },
    environmentPreset: 'office', // 'office', 'home', 'noisy', 'studio'
  },
};
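
To build intuition for the `noiseGate` parameters, here is a conceptual model of how a gate with a threshold, attack, hold, and release typically behaves. It is not SAMMY's implementation. It assumes the times are in milliseconds and the threshold is a linear level between 0 and 1, neither of which this guide states; `createNoiseGate` is a hypothetical name.

```javascript
// Conceptual noise gate: returns a gain between 0 (closed) and 1 (open).
// Assumptions: times are in milliseconds, levels are linear amplitudes in 0..1.
function createNoiseGate({ threshold, attackTime, holdTime, releaseTime }) {
  let gain = 0;           // gate starts closed
  let quietMs = Infinity; // time since the level was last at or above threshold

  // level: current input level; dtMs: time elapsed since the previous call
  return function process(level, dtMs) {
    if (level >= threshold) {
      quietMs = 0;
      gain = Math.min(1, gain + dtMs / attackTime); // open over attackTime
    } else {
      quietMs += dtMs;
      if (quietMs > holdTime) {
        gain = Math.max(0, gain - dtMs / releaseTime); // close over releaseTime
      }
      // while quietMs <= holdTime the gate stays at its current gain
    }
    return gain;
  };
}
```

In this model, a higher `threshold` rejects more background noise at the risk of clipping quiet speech, while a longer `holdTime` keeps the gate open across short pauses between words.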

Learn More About Audio

Explore advanced audio processing features including noise suppression, noise gates, and environment presets.

Screen Capture

Configure screen capture for visual context:
const config = {
  // ... other config
  captureMethod: 'render', // Recommended for web apps
  captureConfig: {
    quality: 0.8, // JPEG quality (0.0-1.0)
    enableAudioAdaptation: true, // Reduce captures during audio
  },
};
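
Because `quality` is a fraction between 0.0 and 1.0, it can be worth clamping user-supplied values before building the config. The sketch below is an assumption about reasonable input handling, not documented SAMMY behavior: `normalizeCaptureConfig` is a hypothetical helper, and the defaults simply mirror the values shown above.

```javascript
// Clamp quality into 0..1 and fill in the values shown in this guide as defaults.
function normalizeCaptureConfig(input = {}) {
  const quality = Number.isFinite(input.quality)
    ? Math.min(1, Math.max(0, input.quality))
    : 0.8; // default mirrors the example above (assumption)
  return {
    quality,
    // Treat anything other than an explicit false as enabled (assumption).
    enableAudioAdaptation: input.enableAudioAdaptation !== false,
  };
}

// Usage sketch:
//   captureConfig: normalizeCaptureConfig({ quality: userQuality }),
```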

Screen Capture Guide

Learn about render-based and video-based screen capture options.

Next Steps

Now that you have a basic agent running, the sections below cover common configurations and troubleshooting.

Common Configuration Examples

Production Setup

const productionConfig = {
  auth: {
    token: jwtToken,
    baseUrl: process.env.NEXT_PUBLIC_SAMMY_API_BASE_URL,
    onTokenExpired: handleTokenRefresh,
  },
  captureMethod: 'render',
  debugLogs: false, // Disable in production
  
  // Audio optimized for office environment
  audioConfig: {
    environmentPreset: 'office',
  },
  
  // Enable observability
  observability: {
    enabled: true,
    useWorker: true,
    includeAudioData: false, // Exclude large audio data
    includeImageData: true,
  },
};

Development Setup

const developmentConfig = {
  auth: {
    token: jwtToken,
    baseUrl: 'http://localhost:3000',
    onTokenExpired: handleTokenRefresh,
  },
  captureMethod: 'render',
  debugLogs: true, // Enable debug logging
  debugAudioPerformance: true, // Enable audio debugging
  
  // Aggressive audio filtering for noisy dev environment
  audioConfig: {
    environmentPreset: 'noisy',
  },
  
  // Full observability for debugging
  observability: {
    enabled: true,
    logToConsole: true,
    includeAudioData: true,
    includeImageData: true,
  },
};
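
The two setups above differ only in a handful of fields, so they can be produced by one function keyed on the environment. This is a sketch that reuses only the fields shown in this guide; `buildAgentConfig` and its `isProduction` parameter are hypothetical names, not part of the SAMMY API.

```javascript
// Build a production or development config from the same inputs.
function buildAgentConfig({ token, baseUrl, onTokenExpired, isProduction }) {
  return {
    auth: { token, baseUrl, onTokenExpired },
    captureMethod: 'render',
    debugLogs: !isProduction,
    // Only set the audio debugging flag in development, as in the examples above.
    ...(isProduction ? {} : { debugAudioPerformance: true }),
    audioConfig: {
      environmentPreset: isProduction ? 'office' : 'noisy',
    },
    observability: isProduction
      ? { enabled: true, useWorker: true, includeAudioData: false, includeImageData: true }
      : { enabled: true, logToConsole: true, includeAudioData: true, includeImageData: true },
  };
}

// Usage sketch:
//   const config = buildAgentConfig({
//     token: jwtToken,
//     baseUrl: process.env.NEXT_PUBLIC_SAMMY_API_BASE_URL,
//     onTokenExpired: handleTokenRefresh,
//     isProduction: process.env.NODE_ENV === 'production',
//   });
```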

Troubleshooting

Common Issues

Authentication

Token expired errors:
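// Ensure onTokenExpired is implemented and stores the refreshed token
const config = {
  auth: {
    token: jwtToken,
    onTokenExpired: async () => {
      // refreshToken is your app's own refresh function; keep the new token
      // in state (as in the useAuth hook) so the provider receives it
      const newToken = await refreshToken();
      setToken(newToken);
    },
  },
};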
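
When token errors persist, a quick sanity check of the `auth` object before passing it to the provider can narrow down the cause. The sketch below checks only the fields used in this guide; `findAuthConfigProblems` is a hypothetical helper, and the checks are assumptions about common mistakes rather than SAMMY's own validation.

```javascript
// Return a list of human-readable problems found in an auth config object.
function findAuthConfigProblems(auth) {
  if (auth === null || typeof auth !== 'object') {
    return ['auth config is missing'];
  }
  const problems = [];
  if (typeof auth.token !== 'string' || auth.token.length === 0) {
    problems.push('auth.token is empty or not a string');
  }
  if (typeof auth.onTokenExpired !== 'function') {
    problems.push('auth.onTokenExpired is not a function');
  }
  return problems;
}

// Usage sketch:
//   const problems = findAuthConfigProblems(config.auth);
//   if (problems.length > 0) console.warn('Auth config issues:', problems);
```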
If you encounter issues, check the error handling guide for comprehensive troubleshooting steps.

What’s Next?

1. Explore Features

Check out advanced features like VAD configuration and MCP integration.

2. Add Tools

Learn how to create custom tools for your specific use case.

3. Monitor Performance

Set up observability to track your agent’s performance.

4. Handle Errors

Implement robust error handling for production use.

Need Help?

Visit our comprehensive API reference or explore specific feature documentation for detailed implementation guides.