The Simple Render Capture System is a streamlined screen capture implementation that captures DOM content every second and sends it to the Gemini AI service. This is the current active implementation used in the sammy-three package.
Performance Optimization
Optimize for Your Use Case: Screen capture performance can be tuned based on your specific needs. The system offers flexible configuration options to balance quality, frequency, and performance.
Performance Tuning Options
Architecture Overview
System Features
The system uses a simplified approach for reliable screen capture:
- 1-second capture interval by default (configurable)
- domToPng from modern-screenshot library for DOM-to-image conversion
- Fallback mechanism when capture times out
- Configurable DOM change detection and performance settings
- Support for targetElement to capture specific DOM elements
Component Flow
Configuration Flow
- Provider Config
- Hook Flow
Capture Element Resolution
The system determines which element to capture using a priority-based resolution system.
Priority Order
1. targetElement (Highest Priority)
   Can be: CSS selector string, HTMLElement, or RefObject<HTMLElement>
   Examples: '#my-capture-area', document.getElementById('app'), useRef()
2. scope: 'context'
   Uses the contextElementRef from the hook. Internal wrapper element managed by sammy-three.
3. scope: 'document' (Default)
   Searches for common app containers in order:
   - #root
   - #app
   - [data-testid="app"]
   - <main>
   - document.body (last resort)
Resolution Logic
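The priority order above can be sketched as a single lookup function. This is a minimal illustration, not the package's actual code: the name resolveCaptureElement, the QueryRoot interface, and the option shapes are assumptions made so the sketch can run without a browser (in a real page, document itself satisfies QueryRoot). It also assumes that a missing contextElementRef falls through to the document search.

```typescript
// Minimal shape of the DOM lookups this sketch needs (hypothetical; `document` fits it).
interface QueryRoot<E> {
  querySelector(selector: string): E | null;
  body: E;
}

type RefLike<E> = { current: E | null };
type TargetElement<E> = string | E | RefLike<E>;

interface ResolveOptions<E extends object> {
  targetElement?: TargetElement<E>;
  scope?: "context" | "document";
  contextElementRef?: RefLike<E>;
}

// Container selectors searched in order for scope: 'document'.
const APP_CONTAINER_SELECTORS = ["#root", "#app", '[data-testid="app"]', "main"];

function isRef<E extends object>(value: E | RefLike<E>): value is RefLike<E> {
  return "current" in value;
}

function resolveCaptureElement<E extends object>(
  root: QueryRoot<E>,
  options: ResolveOptions<E> = {},
): E {
  const { targetElement, scope = "document", contextElementRef } = options;

  // 1. targetElement wins when it resolves to an element.
  if (targetElement !== undefined) {
    const target =
      typeof targetElement === "string"
        ? root.querySelector(targetElement)
        : isRef(targetElement)
          ? targetElement.current
          : targetElement;
    if (target) return target;
  }

  // 2. scope: 'context' uses the wrapper element when it is mounted.
  if (scope === "context" && contextElementRef?.current) {
    return contextElementRef.current;
  }

  // 3. scope: 'document' walks the common app containers, then falls back to body.
  for (const selector of APP_CONTAINER_SELECTORS) {
    const found = root.querySelector(selector);
    if (found) return found;
  }
  return root.body;
}
```

In a browser, a call such as resolveCaptureElement(document, { targetElement: '#my-capture-area' }) would return the matching element, or fall back through the remaining steps when the selector matches nothing.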
Capture Process
Interval Setup
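A minimal sketch of the interval setup, assuming a 1000 ms default as described earlier. The names startCaptureLoop and Schedule are hypothetical. The in-flight guard is an addition of this sketch rather than something the documentation states: because a single capture may take longer than the interval, it skips a tick while the previous capture is still running.

```typescript
// A scheduler starts a repeating tick and returns a function that cancels it.
type Schedule = (tick: () => void, intervalMs: number) => () => void;

// Default scheduler: plain setInterval wrapped so the caller gets a cancel function.
const intervalSchedule: Schedule = (tick, intervalMs) => {
  const id = setInterval(tick, intervalMs);
  return () => clearInterval(id);
};

function startCaptureLoop(
  captureOnce: () => Promise<void>,
  intervalMs = 1000,
  schedule: Schedule = intervalSchedule,
): () => void {
  let inFlight = false;
  const markDone = () => {
    inFlight = false;
  };

  return schedule(() => {
    if (inFlight) return; // previous capture still running: skip this tick
    inFlight = true;
    // Reset the flag on success and on failure so one bad frame cannot stall the loop.
    captureOnce().then(markDone, markDone);
  }, intervalMs);
}
```

The returned function stops the loop, which suits a cleanup callback such as the return value of a React effect.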
Capture Function Steps
1. Pre-flight Checks
   - Verify isCapturing is true
   - Verify clientRef.current exists
   - Verify captureElement exists
2. DOM to PNG Conversion
3. Fallback Mechanism
   Used when the domToPng conversion times out.
4. Data Transmission
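The four steps above could fit together roughly as follows. This is a sketch under assumptions, not the package's implementation: captureOnce, CaptureDeps, sendFrame, renderToPng, and createFallback are hypothetical names, with renderToPng standing in for a wrapper around domToPng and createFallback for whatever produces the fallback canvas image. The 2000 ms default matches the timeout listed under Performance Characteristics.

```typescript
interface CaptureDeps<E> {
  isCapturing: boolean;
  client: { sendFrame(dataUrl: string): void } | null; // hypothetical client shape
  captureElement: E | null;
  renderToPng(element: E): Promise<string>; // e.g. a wrapper around domToPng
  createFallback(): string; // returns the fallback image as a data URL
  timeoutMs?: number;
}

type CaptureResult = "skipped" | "sent" | "sent-fallback";

async function captureOnce<E>(deps: CaptureDeps<E>): Promise<CaptureResult> {
  const { isCapturing, client, captureElement, timeoutMs = 2000 } = deps;

  // 1. Pre-flight checks: bail out quietly if anything is missing.
  if (!isCapturing || !client || !captureElement) return "skipped";

  // 2. DOM to PNG conversion, raced against a timeout that resolves to null.
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
  });

  let dataUrl: string | null;
  try {
    dataUrl = await Promise.race([deps.renderToPng(captureElement), timeout]);
  } catch {
    dataUrl = null; // conversion errors are treated like a timeout
  } finally {
    clearTimeout(timer);
  }

  // 3. Fallback when the conversion timed out or failed.
  const usedFallback = dataUrl === null;

  // 4. Data transmission.
  client.sendFrame(dataUrl ?? deps.createFallback());
  return usedFallback ? "sent-fallback" : "sent";
}
```

Sending the fallback instead of skipping the frame keeps the stream of frames continuous, which is the behavior the Fallback Canvas section describes.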
Fallback Canvas
The fallback canvas with the “Screen capture in progress…” message appears under these conditions:
domToPng Timeout
- Large or complex DOM structure
- Heavy CSS animations or transforms
- Many external resources
- Browser performance issues
domToPng Errors
- CORS issues with external resources
- Invalid DOM structure
- Memory constraints
- Browser security restrictions
Element Issues
- Element has zero dimensions
- Element is hidden or off-screen
- Element contains problematic content (iframes, canvas, video)
Fallback Canvas Details
Canvas Specifications
Modern Screenshot Process
Detailed Steps
1. DOM Cloning: Deep clone with computed styles
2. Resource Embedding: Inline external resources
3. SVG Creation: Wrap in ForeignObject SVG
4. Data URL Conversion: Convert SVG to data URL
5. Image Loading: Create image from data URL
6. Canvas Rendering: Draw image to canvas
7. PNG Export: Export canvas as PNG data URL
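To make steps 3 and 4 concrete, here is a simplified illustration of wrapping already-serialized markup in a foreignObject SVG and turning it into a data URL. It shows the general technique only and is not the code used inside modern-screenshot; the function names are hypothetical, and the input is assumed to be well-formed XHTML with styles already inlined (steps 1 and 2).

```typescript
// Step 3: wrap serialized XHTML in an SVG <foreignObject> of the target size.
function wrapInForeignObjectSvg(xhtml: string, width: number, height: number): string {
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}" ` +
    `viewBox="0 0 ${width} ${height}">` +
    `<foreignObject x="0" y="0" width="100%" height="100%">` +
    `<div xmlns="http://www.w3.org/1999/xhtml">${xhtml}</div>` +
    `</foreignObject></svg>`
  );
}

// Step 4: percent-encode the SVG so it can be used as an image source.
function svgToDataUrl(svg: string): string {
  return `data:image/svg+xml;charset=utf-8,${encodeURIComponent(svg)}`;
}

const svgDataUrl = svgToDataUrl(wrapInForeignObjectSvg("<p>Hello</p>", 320, 200));
```

Steps 5 to 7 need a browser: the data URL is assigned to an image's src, the loaded image is drawn onto a canvas with drawImage, and the canvas is exported with toDataURL('image/png').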
Usage Examples
Basic Usage
With Target Element
Configuration Options
Complete Configuration Reference
- Capture method. Use 'video' to avoid DOM cloning entirely.
- How often to capture in milliseconds. Adjust based on your update frequency needs.
- JPEG quality from 0.0 to 1.0. Balance quality and file size for your use case.
- Maximum width for captured images. Optimize based on your UI requirements.
- Maximum height for captured images. Adjust for your display needs.
- Minimum time between captures in milliseconds when changes are detected.
- Maximum time between captures in milliseconds when no changes are detected.
- Enable smart deduplication using image hashing to skip identical frames.
- Scope of capture. Use 'context' to capture only wrapped content.
- Debounce time after DOM mutations stop before capturing.
- MutationObserver configuration for detecting DOM changes.
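One way to read the minimum and maximum interval settings described above is as a single decision rule: capture as soon as the minimum has elapsed if the DOM changed, and capture anyway once the maximum has elapsed. The sketch below encodes that interpretation. The function name isCaptureDue and the IntervalLimits shape are hypothetical, and the real scheduling logic may differ.

```typescript
interface IntervalLimits {
  minInterval: number; // ms, applies when changes were detected
  maxInterval: number; // ms, applies when nothing changed
}

function isCaptureDue(
  nowMs: number,
  lastCaptureMs: number,
  domChanged: boolean,
  limits: IntervalLimits,
): boolean {
  const elapsed = nowMs - lastCaptureMs;
  // Always refresh once the maximum interval has passed, even without changes.
  if (elapsed >= limits.maxInterval) return true;
  // Otherwise capture only if something changed and the minimum gap is respected.
  return domChanged && elapsed >= limits.minInterval;
}
```

With limits of 500 ms and 5000 ms, for example, a changed DOM is captured at most twice per second, while an idle page is still refreshed every five seconds.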
Target Element Options
The targetElement option accepts three types of values:
Type | Example | Description |
---|---|---|
string | '#my-div', '.capture-area' | CSS selector string |
HTMLElement | document.getElementById('my-div') | Direct element reference |
RefObject<HTMLElement> | useRef<HTMLDivElement>() | React ref object |
Configuration Presets
- Performance Focused
- Balanced
- Quality Focused
Optimized for maximum performance and efficiency
Key Differences from Complex Render
Feature | Simple Render | Complex Render |
---|---|---|
Capture Interval | Fixed 1s | Dynamic (100ms-2s) |
DOM Change Detection | No | Yes (MutationObserver) |
Audio Adaptation | No | Yes (throttling) |
Frame Hashing | No | Yes (deduplication) |
Critical Renders | No | Yes (conversation events) |
Worker Threads | No | Yes (optional) |
Performance Mode | Single | Multiple (worker/main) |
Capture Method | domToPng only | domToCanvas with fallbacks |
Debug Logging
When debugLogs: true is set, the system provides extensive logging.
Performance Characteristics
Timing Analysis
- Capture Frequency: Every 1000ms (1 FPS)
- domToPng Timeout: 2000ms maximum
- Fallback Canvas: ~5-10ms creation
- Base64 Encoding: ~10-20ms
Resource Usage
- CPU: Medium (spikes during capture)
- Memory: Low-Medium (temporary canvas/image)
- Network: ~50-200KB per frame (compressed PNG)
- UI Impact: Minimal (no worker threads)
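The network figure above translates into sustained bandwidth as follows. This is simple arithmetic on the stated numbers, using decimal units (1 MB = 1000 KB); the helper name estimateBandwidth is hypothetical.

```typescript
interface BandwidthEstimate {
  kilobytesPerSecond: number;
  megabitsPerSecond: number;
  megabytesPerMinute: number;
}

function estimateBandwidth(frameKilobytes: number, intervalMs: number): BandwidthEstimate {
  const framesPerSecond = 1000 / intervalMs;
  const kilobytesPerSecond = frameKilobytes * framesPerSecond;
  return {
    kilobytesPerSecond,
    megabitsPerSecond: (kilobytesPerSecond * 8) / 1000,
    megabytesPerMinute: (kilobytesPerSecond * 60) / 1000,
  };
}

// At one frame per second, 50-200 KB frames cost roughly 0.4-1.6 Mbit/s,
// or about 3-12 MB per minute.
const lowEstimate = estimateBandwidth(50, 1000);
const highEstimate = estimateBandwidth(200, 1000);
```

Raising the interval to 5000 ms divides these figures by five, which is why the interval is the first setting to adjust on constrained connections.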
Error Handling
Timeout Recovery
Client Loss Protection
Capture Element Validation
Benefits of Target Element Capture
- Reduced Size: Capture only the relevant part of your UI
- Better Performance: Smaller capture area means faster processing
- Focused Context: AI agent sees only the important content
- Flexible Integration: Works with any existing DOM structure
Important Notes
Keep these points in mind when using screen capture:
- Only works with captureMethod: 'render' (explicit render capture)
- The target element must exist in the DOM when capture starts
- If the target element is not found, the system falls back to the default scope behavior
- Enable debugLogs: true to see which element is being captured in the console
Optimization Guide
Performance Tuning
Optimize for slower devices
When to use: Targeting older devices or complex UIs
Optimizations:
- Increase checkInterval to 5000ms or higher
- Switch to method: 'video' for native browser capture
- Reduce maxWidth and maxHeight to match your UI needs
- Lower jpegQuality to 0.3 for smaller files
- Set domChangeDetection.observerConfig.subtree: false
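The slower-device settings above could be collected into one set of overrides. The sketch below is illustrative only: the CaptureConfig shape is an assumption built from the option names mentioned in this guide, not the package's exported type, and applyOverrides is a hypothetical helper. maxWidth and maxHeight are left out because suitable values depend on your UI.

```typescript
interface ObserverConfig {
  childList?: boolean;
  attributes?: boolean;
  characterData?: boolean;
  subtree?: boolean;
}

// Assumed config shape, pieced together from the option names used in this guide.
interface CaptureConfig {
  method?: "render" | "video";
  checkInterval?: number;
  jpegQuality?: number;
  maxWidth?: number;
  maxHeight?: number;
  domChangeDetection?: { debounceMs?: number; observerConfig?: ObserverConfig };
}

// Values taken from the list above.
const slowDeviceOverrides: CaptureConfig = {
  method: "video",
  checkInterval: 5000,
  jpegQuality: 0.3,
  domChangeDetection: { observerConfig: { subtree: false } },
};

// Merge overrides into a base config without dropping nested settings.
function applyOverrides(base: CaptureConfig, overrides: CaptureConfig): CaptureConfig {
  return {
    ...base,
    ...overrides,
    domChangeDetection: {
      ...base.domChangeDetection,
      ...overrides.domChangeDetection,
      observerConfig: {
        ...base.domChangeDetection?.observerConfig,
        ...overrides.domChangeDetection?.observerConfig,
      },
    },
  };
}
```

Merging the nested objects explicitly matters here: a plain top-level spread would replace the whole domChangeDetection object and silently discard an existing debounceMs.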
Optimize capture responsiveness
When to use: Need faster updates for dynamic content
Optimizations:
- Reduce domChangeDetection.debounceMs to 200ms
- Enable more mutation observer options
- Decrease checkInterval for more frequent captures
Optimize bandwidth usage
When to use: Limited bandwidth or high traffic applications
Optimizations:
- Enable useHashing: true for smart deduplication
- Increase minInterval to reduce capture frequency
- Use restrictive domChangeDetection.observerConfig
- Lower jpegQuality to reduce file sizes
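Hash-based deduplication, as enabled by useHashing, can be pictured like this: hash each encoded frame and skip it when the hash matches the previous one. The documentation does not say which hash is used, so the 32-bit FNV-1a function below is a stand-in chosen for brevity, and FrameDeduplicator is a hypothetical name.

```typescript
// 32-bit FNV-1a over the UTF-16 code units of the string.
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force an unsigned result
}

class FrameDeduplicator {
  private lastHash: number | null = null;

  // Returns true when the frame differs from the last one that was sent.
  shouldSend(frameDataUrl: string): boolean {
    const hash = fnv1a32(frameDataUrl);
    if (hash === this.lastHash) return false;
    this.lastHash = hash;
    return true;
  }
}
```

A 32-bit hash can collide, in which case a changed frame would be skipped once. That is usually acceptable for periodic captures, since the next differing frame is sent as normal.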
Performance Monitoring
Best Practices
- Choose the right method: Use video for simplicity, render for DOM-specific features
- Profile your specific use case: Use Chrome DevTools to understand your app's needs
- Test across devices: Validate performance on your target device range
- Monitor performance metrics: Enable debugLogs to track capture timing
- Enable smart optimizations: Use useHashing: true for automatic deduplication
- Tune DOM observation: Configure observerConfig based on your UI update patterns
Summary
The Screen Capture system provides a flexible, configurable approach to screen capture that can be tuned for your specific performance and quality requirements. Choose from multiple presets or create custom configurations to match your application's needs.
- Flexible intervals from 1-10+ seconds based on your needs
- High-quality captures using modern DOM-to-image conversion
- Reliable fallback mechanism ensures continuous operation
- Targeted capture support for focused UI regions
- Performance tuning options for various device capabilities
- Multiple configuration presets for common use cases