Text Mode & Tour System

The Text Mode agent is a lightweight, text-only alternative to the voice agent. It includes a Tour System that visually guides users through web interfaces by highlighting DOM elements and showing contextual instructions.
Text mode suits environments where audio isn't appropriate, users who prefer typing, and deployments that need lower bandwidth.

Quick Start

Enable text mode in your Sammy Agent by setting the mode prop:
Basic Text Mode Setup
import { SammyAgentProvider, useSammyAgent } from '@sammy-labs/sammy-three';

function App() {
  return (
    <SammyAgentProvider
      config={yourConfig}
      mode="text"           // Enable text mode
      enableTour={true}     // Enable tour feature
    >
      <YourChatComponent />
    </SammyAgentProvider>
  );
}

Text Mode Features

Streaming Responses

Text mode provides real-time streaming of agent responses through the streamingMessage property:
Streaming Messages
function ChatComponent() {
  const { sendMessage, streamingMessage } = useSammyAgent();
  
  return (
    <div>
      {streamingMessage && (
        <div className="streaming-response">
          {streamingMessage}
        </div>
      )}
    </div>
  );
}
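
When rendering a chat, the in-flight streamingMessage usually has to be shown alongside messages that have already completed. The sketch below is one way to merge the two into a single render list. The ChatMessage and TranscriptRow shapes and the buildTranscript helper are hypothetical and not part of the SDK.

```typescript
// Hypothetical message shape; the SDK may expose a different one.
interface ChatMessage {
  type: 'user' | 'agent';
  content: string;
}

// A row ready for rendering; `streaming` marks the in-flight agent response.
interface TranscriptRow extends ChatMessage {
  streaming: boolean;
}

// Combine completed messages with the current streamingMessage value.
// A null or empty streamingMessage adds no extra row.
function buildTranscript(
  messages: ChatMessage[],
  streamingMessage: string | null
): TranscriptRow[] {
  const rows: TranscriptRow[] = messages.map((m) => ({ ...m, streaming: false }));
  if (streamingMessage) {
    rows.push({ type: 'agent', content: streamingMessage, streaming: true });
  }
  return rows;
}
```

A component could call buildTranscript(messages, streamingMessage) on each render and style rows with streaming set to true differently, for example with a typing indicator.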

Starting a Text Session

Starting Text Agent
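import { useSammyAgent, AgentMode } from '@sammy-labs/sammy-three';

// Call the hook inside a component rendered under SammyAgentProvider
const { startAgent } = useSammyAgent();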

// Start with specific source tracking
await startAgent({
  agentMode: AgentMode.USER,
  source: 'chat'  // or 'tour' for tour-initiated conversations
});

Message Handling

Text mode provides a simple interface for sending and receiving messages:
Message Management
import { useState } from 'react';
import { useSammyAgent } from '@sammy-labs/sammy-three';

function Chat() {
  const { sendMessage, activeSession } = useSammyAgent();
  const [messages, setMessages] = useState([]);

  const handleSend = (text) => {
    // Send message to agent
    sendMessage(text);
    
    // Add to local message history
    setMessages(prev => [...prev, {
      type: 'user',
      content: text,
      timestamp: new Date()
    }]);
  };
  
  // Handles completed agent responses; register it as the onTurnComplete callback
  const onTurnComplete = ({ agent, user, source }) => {
    if (source !== 'tour') {  // Filter out tour messages
      setMessages(prev => [...prev, {
        type: 'agent',
        content: agent,
        timestamp: new Date()
      }]);
    }
  };
}
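
The source filter in onTurnComplete can also be written as a pure function, which is easier to unit test than a state setter. This is a minimal sketch that assumes the user and agent fields of the turn payload are plain strings. The CompletedTurn and HistoryEntry types and the appendAgentTurn helper are illustrative names, not SDK exports.

```typescript
type TurnSource = 'chat' | 'tour';

// Assumed payload shape, mirroring the fields destructured in onTurnComplete.
interface CompletedTurn {
  user: string;
  agent: string;
  source: TurnSource;
}

interface HistoryEntry {
  type: 'user' | 'agent';
  content: string;
  timestamp: Date;
}

// Return a new history with the agent reply appended.
// Tour-initiated turns are skipped so they stay out of the chat transcript.
function appendAgentTurn(
  history: readonly HistoryEntry[],
  turn: CompletedTurn,
  now: Date = new Date()
): HistoryEntry[] {
  if (turn.source === 'tour') {
    return [...history];
  }
  return [...history, { type: 'agent', content: turn.agent, timestamp: now }];
}
```

Inside the handler this would be used as setMessages(prev => appendAgentTurn(prev, turn)).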

Tour System

The Tour System enables agents to create interactive guides by highlighting elements on the page and providing step-by-step instructions.
Tours are perfect for onboarding, troubleshooting, form assistance, and interactive tutorials.

How Tours Work

  1. Agent identifies elements - The agent receives DOM context with numbered interactive elements
  2. Highlights target - Purple border appears around the selected element
  3. Shows instructions - Popover displays contextual guidance
  4. Waits for user - User performs action or asks questions
  5. Proceeds to next step - Process repeats for multi-step workflows
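
The loop above can be modeled as a small state machine. The sketch below assumes a tour with a known number of steps, which is a simplification because the agent may decide steps dynamically. The TourProgress and TourEvent types and the advanceTour function are hypothetical and do not reflect the library's internals.

```typescript
// Events a tour step can receive while waiting for the user.
type TourEvent = 'target-clicked' | 'cancelled' | 'question-asked';

interface TourProgress {
  stepIndex: number;   // zero-based index of the highlighted step
  totalSteps: number;  // assumed to be known up front for this sketch
  status: 'waiting' | 'done' | 'cancelled';
}

// Compute the next tour state for an event without mutating the input.
function advanceTour(progress: TourProgress, event: TourEvent): TourProgress {
  // Finished or cancelled tours ignore further events.
  if (progress.status !== 'waiting') {
    return progress;
  }
  if (event === 'cancelled') {
    return { ...progress, status: 'cancelled' };
  }
  if (event === 'question-asked') {
    // Questions are answered in the popover; the step does not change.
    return progress;
  }
  // The user clicked the highlighted element: move on or finish.
  const next = progress.stepIndex + 1;
  if (next >= progress.totalSteps) {
    return { ...progress, status: 'done' };
  }
  return { ...progress, stepIndex: next };
}
```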

Tour Tool Usage

The agent can trigger tours using the built-in tour() tool:
Tour Tool Schema
tour({
  element: 1,                           // Numeric ID from DOM context
  title: "Click Submit",                // Short title
  instruction: "Click here to submit"   // Clear instruction
})
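
If you log or intercept tool calls, it can help to validate the arguments before acting on them. The type guard below is a sketch based only on the three fields shown above. It assumes element IDs are non-negative integers and that title and instruction must be non-empty. TourArgs and isTourArgs are illustrative names, not SDK exports.

```typescript
// Argument shape taken from the tour() example above.
interface TourArgs {
  element: number;
  title: string;
  instruction: string;
}

// Runtime check that an unknown value looks like valid tour() arguments.
function isTourArgs(value: unknown): value is TourArgs {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  const element = v['element'];
  const title = v['title'];
  const instruction = v['instruction'];
  return (
    typeof element === 'number' &&
    Number.isInteger(element) &&
    element >= 0 && // assumption: IDs are non-negative integers
    typeof title === 'string' &&
    title.trim().length > 0 &&
    typeof instruction === 'string' &&
    instruction.trim().length > 0
  );
}
```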

Tour Callbacks

Set up callbacks to handle tour interactions:
Tour Callbacks
import { useEffect } from 'react';
import { setTourCallbacks } from '@sammy-labs/sammy-three';

// `agent` is assumed to be an agent instance already in scope

useEffect(() => {
  setTourCallbacks(
    // Handle user actions (clicks, cancellations)
    (action: string) => {
      console.log('User action:', action);
      agent.sendUserAction(action);
    },
    // Handle chat messages from tour popover
    (message: string) => {
      console.log('Tour chat:', message);
      agent.sendMessage(message);
    }
  );
}, [agent]);

Manual Tour Creation

You can also create tours programmatically:
Manual Tour
import { createTourHighlight, clearExistingHighlight } from '@sammy-labs/sammy-three';

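// Create a tour highlight (skipped if the element is not in the DOM yet)
const submitButton = document.querySelector<HTMLElement>('#submit-button');
if (submitButton) {
  createTourHighlight({
    element: submitButton,
    title: 'Submit Form',
    description: 'Click this button to submit your application'
  });
}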

// Clear when done
clearExistingHighlight();

Advanced Usage

Direct Core Usage

For advanced scenarios, use the text agent core directly:
Text Agent Core
import { SammyTextAgentCore, AgentMode } from '@sammy-labs/sammy-three';

const textAgent = new SammyTextAgentCore({
  config: yourConfig,
  callbacks: {
    onTextResponse: (text) => {
      console.log('Streaming:', text);
    },
    onTurnComplete: ({ user, agent, source }) => {
      console.log('Turn complete:', { user, agent, source });
    }
  },
  enableTour: true,
  updateAgentState: (key, value) => {
    // State management
  }
});

// Start session
await textAgent.start({
  agentMode: AgentMode.USER,
  source: 'chat'
});

// Send messages
textAgent.sendMessage('Help me fill out this form');

// Clean up
await textAgent.destroy();

Context Injection

Text mode supports all context injection features:
Context Management
const textAgent = new SammyTextAgentCore({
  config: {
    contextInjection: {
      enabled: true,
      pageTracking: true,    // Track page navigation
      memorySearch: true,    // Search relevant memories
      clickTracking: false   // No click tracking in text mode
    }
  }
});

// Inject custom context
await textAgent.injectContext(
  "User is on checkout page with 3 items in cart",
  { source: 'custom' }
);

// Update page context
await textAgent.updatePageContext({
  title: 'Checkout - Payment',
  url: window.location.href,
  path: '/checkout/payment'
});
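
The updatePageContext call above takes a title, a url, and a path. When only the URL is at hand, the path can be derived from it with the standard URL API. The PageContext type and buildPageContext helper below are hypothetical conveniences, not part of the SDK.

```typescript
// Shape of the object passed to updatePageContext in the example above.
interface PageContext {
  title: string;
  url: string;
  path: string;
}

// Derive the path from an absolute URL so the two fields cannot drift apart.
// new URL() throws a TypeError if `url` is not a valid absolute URL.
function buildPageContext(title: string, url: string): PageContext {
  return { title, url, path: new URL(url).pathname };
}
```

In the browser this could be called as buildPageContext(document.title, window.location.href) and the result passed to updatePageContext.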

Screen Capture

Text mode can capture screenshots for visual context:
Screen Capture Config
const config = {
  captureMethod: 'render',  // Periodic screenshots
  // Screen capture runs every 2 seconds by default
};

Configuration Reference

Provider Props

  • mode ('text' | 'voice', default: 'voice') - Selects between the text-only and voice agent modes
  • enableTour (boolean, default: false) - Enables the tour tool for interactive guidance (text mode only)
  • streamingMessage (string | null) - Current streaming agent response (available via the useSammyAgent hook)

Text Start Options

  • agentMode (AgentMode, required) - The agent mode (typically AgentMode.USER)
  • source ('chat' | 'tour', default: 'chat') - Identifies the source of the conversation for analytics
  • guideId (string) - Optional guide ID for guided experiences

Tour Configuration

  • element (HTMLElement, required) - The DOM element to highlight
  • title (string, required) - Short title displayed in the popover header
  • description (string, required) - Detailed instruction or description for the user

Tour UI Components

The tour system includes several UI elements:

Highlight Border

  • Purple border (#8b5cf6) around target element
  • 8px padding for visibility
  • Dark overlay on rest of page
  • Smooth transitions when moving between elements
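
As an illustration of the 8px padding, the sketch below computes the rectangle a highlight border would occupy around a target element. The Box type and highlightBox function are hypothetical. The library's actual rendering code is not shown in this document.

```typescript
// Minimal rectangle in viewport pixels (a subset of what getBoundingClientRect returns).
interface Box {
  top: number;
  left: number;
  width: number;
  height: number;
}

const HIGHLIGHT_PADDING_PX = 8;

// Grow the target box by `padding` pixels on every side.
function highlightBox(target: Box, padding: number = HIGHLIGHT_PADDING_PX): Box {
  return {
    top: target.top - padding,
    left: target.left - padding,
    width: target.width + 2 * padding,
    height: target.height + 2 * padding,
  };
}
```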

Instruction Popover

  • Smart positioning - Automatically positions to avoid viewport edges
  • Arrow indicator - Points to highlighted element
  • Chat interface - Users can ask questions directly
  • Response display - Shows agent responses in real-time
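
Smart positioning can be approximated by checking how much free space surrounds the highlighted element and picking the first side where the popover fits. The sketch below shows one such heuristic. The preference order, the 12px gap, and the names choosePlacement, Placement, TargetBox, and Size are all assumptions, not the library's actual algorithm.

```typescript
type Placement = 'bottom' | 'top' | 'right' | 'left';

interface TargetBox {
  top: number;
  left: number;
  width: number;
  height: number;
}

interface Size {
  width: number;
  height: number;
}

// Pick the first side of the target with enough room for the popover plus a gap.
function choosePlacement(
  target: TargetBox,
  popover: Size,
  viewport: Size,
  gap: number = 12
): Placement {
  const spaceBelow = viewport.height - (target.top + target.height);
  const spaceAbove = target.top;
  const spaceRight = viewport.width - (target.left + target.width);
  const spaceLeft = target.left;

  if (spaceBelow >= popover.height + gap) return 'bottom';
  if (spaceAbove >= popover.height + gap) return 'top';
  if (spaceRight >= popover.width + gap) return 'right';
  if (spaceLeft >= popover.width + gap) return 'left';
  // Nothing fits: fall back to below the target and let the page scroll.
  return 'bottom';
}
```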

User Interactions

  • Click detection - Tracks when users click highlighted elements
  • Cancellation - Detects when users click outside to cancel
  • Chat input - Allows questions at any step
  • Progress tracking - Maintains state across steps

Best Practices

For Text Mode

1. Choose the right mode

Use text mode when:
  • Audio is not suitable (quiet environments, accessibility)
  • Users prefer typing over speaking
  • Lower bandwidth is required
  • You need precise text logs for compliance

2. Handle streaming properly

  • Show typing indicators during streaming
  • Update UI smoothly as text arrives
  • Handle partial responses gracefully

3. Manage conversation history

  • Keep reasonable history limits (e.g., last 50 messages)
  • Persist important conversations
  • Clear history on mode switches
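
The history limit from step 3 takes only a few lines. This is a minimal sketch in which capHistory is a hypothetical helper and 50 is simply the example limit mentioned above.

```typescript
// Keep only the most recent `limit` messages, preserving their order.
function capHistory<T>(messages: readonly T[], limit: number = 50): T[] {
  if (limit <= 0) {
    return []; // guard: slice(-0) would return the whole array
  }
  return messages.slice(-limit);
}
```

With React state this could be applied on every append, for example setMessages(prev => capHistory([...prev, next])).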

For Tours

1. Write clear instructions

  • Use imperative language (“Click the Submit button”)
  • Be specific about what users should do
  • Keep instructions concise and actionable

2. Handle dynamic content

  • Wait for elements to load before highlighting
  • Gracefully handle missing elements
  • Provide fallback text instructions

3. Test thoroughly

  • Verify tours work on different screen sizes
  • Test with various page layouts
  • Ensure z-index doesn’t conflict with your app
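
For the "Handle dynamic content" step, one option is to resolve the target first and fall back to a plain text instruction when it is missing. The sketch below keeps the lookup generic so it does not depend on the DOM. TourStepRequest, ResolvedStep, and resolveTourStep are hypothetical names, and the way numeric IDs map to elements is an assumption.

```typescript
interface TourStepRequest {
  element: number; // numeric ID from the DOM context
  title: string;
  instruction: string;
}

// Either a target to highlight or a text-only fallback.
type ResolvedStep<T> =
  | { kind: 'highlight'; target: T; title: string; instruction: string }
  | { kind: 'fallback'; text: string };

// Look up the target; if it is missing, degrade to a text instruction.
function resolveTourStep<T>(
  step: TourStepRequest,
  lookup: (id: number) => T | null | undefined
): ResolvedStep<T> {
  const target = lookup(step.element);
  if (target === null || target === undefined) {
    return { kind: 'fallback', text: `${step.title}: ${step.instruction}` };
  }
  return {
    kind: 'highlight',
    target,
    title: step.title,
    instruction: step.instruction,
  };
}
```

A caller could show the fallback text as an ordinary chat message instead of a popover when kind is 'fallback'.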

Examples

Complete Chat Implementation

import { useState, useEffect } from 'react';
import { 
  SammyAgentProvider, 
  useSammyAgent,
  AgentMode 
} from '@sammy-labs/sammy-three';

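// Log completed turns; replace with your own handling
function handleTurnComplete({ user, agent, source }) {
  console.log('Turn complete:', { user, agent, source });
}

function ChatApp() {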
  return (
    <SammyAgentProvider
      config={{
        auth: { token: 'YOUR_TOKEN' },
        contextInjection: { enabled: true }
      }}
      mode="text"
      enableTour={true}
      onTurnComplete={handleTurnComplete}
    >
      <ChatInterface />
    </SammyAgentProvider>
  );
}

function ChatInterface() {
  const { 
    startAgent, 
    sendMessage, 
    streamingMessage, 
    agentStatus 
  } = useSammyAgent();
  
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  
  useEffect(() => {
    // Auto-start agent
    if (agentStatus === 'disconnected') {
      startAgent({ 
        agentMode: AgentMode.USER,
        source: 'chat'
      });
    }
  }, [agentStatus]);
  
  const handleSend = () => {
    if (input.trim()) {
      sendMessage(input);
      setMessages(prev => [...prev, {
        type: 'user',
        content: input
      }]);
      setInput('');
    }
  };
  
  return (
    <div className="chat-container">
      <div className="messages">
        {messages.map((msg, i) => (
          <div key={i} className={`message ${msg.type}`}>
            {msg.content}
          </div>
        ))}
        {streamingMessage && (
          <div className="message agent streaming">
            {streamingMessage}
          </div>
        )}
      </div>
      <input
        value={input}
        onChange={(e) => setInput(e.target.value)}
        onKeyDown={(e) => e.key === 'Enter' && handleSend()}
        placeholder="Type your message..."
      />
    </div>
  );
}

Tour-Enabled Form Assistant

Form Assistant with Tours
import { useSammyAgent } from '@sammy-labs/sammy-three';

function FormAssistant() {
  const { sendMessage } = useSammyAgent();
  
  const requestHelp = () => {
    // Agent will use tour() to highlight form fields
    sendMessage(
      "Help me fill out this application form step by step"
    );
  };
  
  return (
    <div>
      <button onClick={requestHelp}>
        Get Help with Form
      </button>
      
      <form id="application-form">
        <input id="name" placeholder="Full Name" />
        <input id="email" placeholder="Email" />
        <textarea id="experience" placeholder="Experience" />
        <button id="submit">Submit</button>
      </form>
    </div>
  );
}

Troubleshooting

Common issues and their solutions

Tour not highlighting elements

  • Check DOM context: Ensure elements are numbered in context
  • Wait for render: Elements must exist before highlighting
  • Verify z-index: Tour overlays need high z-index (999999+)

Streaming messages not appearing

  • Check connection: Ensure agent is connected before sending
  • Verify mode: Streaming only works in text mode
  • Handle errors: Check onError callback for issues

Chat messages mixed with tour messages

  • Filter by source: Check source field in onTurnComplete
  • Separate handlers: Use different callbacks for chat vs tour

Migration Guide

From Voice to Text Mode

Migration Example
// Before - Voice mode
<SammyAgentProvider
  config={config}
  enableHighlighting={true}  // Voice highlighting
>

// After - Text mode
<SammyAgentProvider
  config={config}
  mode="text"
  enableTour={true}          // Text tour system
  enableHighlighting={false}  // Not used in text mode
>

Adding Tours to Existing Text Implementation

Adding Tours
// Step 1: Enable tour in provider
<SammyAgentProvider
  mode="text"
  enableTour={true}  // Add this
>

// Step 2: Set up tour callbacks
import { setTourCallbacks } from '@sammy-labs/sammy-three';

useEffect(() => {
  setTourCallbacks(
    (action) => agent.sendUserAction(action),
    (message) => agent.sendMessage(message)
  );
}, [agent]);

Performance

Text mode offers several performance advantages:

Lower Memory Usage

About 30% less memory than voice mode, because there are no audio buffers

Reduced Bandwidth

The text-only WebSocket typically uses less than 1 KB per message

Minimal CPU

No audio processing overhead, just text handling

Fast Response

Typical message round-trip latency is under 200 ms

Summary

Text mode with the Tour System provides a powerful combination for creating accessible, bandwidth-efficient conversational experiences with visual guidance capabilities. It’s ideal for:
  • Customer support - Guide users through troubleshooting
  • Onboarding - Interactive product tours
  • Form assistance - Step-by-step form completion
  • Training - Interactive tutorials and learning paths
  • Accessibility - Text-based alternative for all users
The tour feature transforms static help into dynamic, contextual guidance that adapts to your application’s current state.