Voice capabilities (microphone input and spoken responses) are optional. You can build a text-only integration by skipping the Voice Input and Voice Output steps during configuration.
Creating an SDK / API Deployment
Navigate to your assistant, click Configure Assistant, then select Deployments from the sidebar. Click Add Deployment and choose SDK / API. The SDK deployment wizard walks you through three steps:
General Experience
Define how the assistant greets users and handles session lifecycle.
Required fields:
- Greeting — Opening message sent when a session starts. Supports {{variable}} syntax for dynamic content passed as query parameters
- Error Message — Fallback message sent when an unexpected error occurs
- Idle Silence Timeout — Duration of user silence before Rapida sends a prompt (15-120 seconds, default: 30s)
- Idle Timeout Backoff — How many times the idle timeout multiplies before ending the session (0-5, default: 2)
- Idle Message — Message sent when the user hasn’t responded (default: “Are you there?”)
- Maximum Session Duration — Hard limit before the session is automatically ended (180-600 seconds, default: 300s)
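To illustrate how {{variable}} placeholders in a greeting resolve against query parameters, here is a minimal sketch. The renderGreeting helper is illustrative only, not part of the Rapida SDK; Rapida performs this substitution server-side.

```typescript
// Illustrative helper: substitute {{variable}} placeholders in a greeting
// with values passed as query parameters. Not part of the Rapida SDK.
function renderGreeting(template: string, params: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    name in params ? params[name] : match // leave unknown placeholders intact
  );
}

const greeting = renderGreeting("Hi {{name}}, welcome to {{product}}!", {
  name: "Ada",
  product: "Rapida",
});
console.log(greeting); // → "Hi Ada, welcome to Rapida!"
```

Unknown placeholders are left intact rather than replaced with empty strings, so a missing query parameter is easy to spot during testing.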
Voice Input (Speech-to-Text) — Optional
Enable microphone-based voice input. When configured, the SDK streams browser audio to Rapida for real-time transcription.
If enabled:
- STT Provider — Deepgram, AssemblyAI, Google, Azure, OpenAI Whisper, AWS Transcribe, Cartesia, Rev.ai, Speechmatics, Sarvam, Groq, or Nvidia
- Model — Provider-specific transcription model
- Language — Primary transcription language
- Encoding — Audio encoding format
- Sample Rate — Audio sample rate
- Voice Activity Detection (VAD) — Silero VAD with configurable threshold (0.0-1.0, default: 0.8)
- Background Noise Removal — RNNoise for ambient noise removal
- End of Speech Detection — Silence-based EOS with configurable timeout (default: 1000ms)
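As a sketch, the voice-input settings above can be modeled and range-checked before sending them to a deployment. The field names below are assumptions for illustration; the ranges and defaults match the documented configuration.

```typescript
// Illustrative shape for voice-input settings. Field names are assumptions;
// ranges and defaults follow the documented configuration.
interface VoiceInputConfig {
  vadThreshold: number; // Silero VAD sensitivity, 0.0-1.0 (default 0.8)
  eosTimeoutMs: number; // end-of-speech silence timeout (default 1000ms)
  sampleRate: number;   // audio sample rate in Hz
}

function validateVoiceInput(cfg: VoiceInputConfig): string[] {
  const errors: string[] = [];
  if (cfg.vadThreshold < 0 || cfg.vadThreshold > 1) {
    errors.push("vadThreshold must be between 0.0 and 1.0");
  }
  if (cfg.eosTimeoutMs <= 0) {
    errors.push("eosTimeoutMs must be a positive number of milliseconds");
  }
  if (cfg.sampleRate <= 0) {
    errors.push("sampleRate must be a positive number of Hz");
  }
  return errors;
}
```

Validating client-side keeps misconfigurations (for example, a VAD threshold of 1.5) from silently degrading transcription quality.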
Voice Output (Text-to-Speech) — Optional
Enable spoken audio responses. When configured, the SDK receives audio streams from Rapida and plays them through the browser.
If enabled:
- TTS Provider — ElevenLabs, Deepgram, Azure, Google, OpenAI, AWS Polly, Cartesia, Resemble, Rime, Sarvam, Neuphonic, MiniMax, Groq, Speechmatics, or Nvidia
- Model — Provider-specific voice model
- Language — Output speech language
- Voice ID — The specific voice from your TTS provider
- Pronunciation Dictionaries — Custom pronunciation for domain-specific terms
- Conjunction Boundaries — Natural pause points for more human-like speech
- Pause Duration — Length of pause at conjunction boundaries (100-300ms, default: 240ms)
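To make conjunction-boundary pausing concrete, the sketch below splits text at common conjunctions and tags each boundary with a pause, clamped to the documented 100-300ms range. This is an illustration of the idea, not the actual Rapida synthesis pipeline.

```typescript
// Illustrative sketch: split a sentence at common conjunctions and tag each
// boundary with a pause (default 240ms, clamped to 100-300ms).
// Not the actual Rapida synthesis pipeline.
const CONJUNCTIONS = ["and", "but", "or", "so"];

interface SpeechSegment {
  text: string;
  pauseAfterMs: number;
}

function insertPauses(text: string, pauseMs = 240): SpeechSegment[] {
  const clamped = Math.min(300, Math.max(100, pauseMs));
  const segments: SpeechSegment[] = [];
  let current: string[] = [];
  for (const word of text.split(/\s+/)) {
    current.push(word);
    if (CONJUNCTIONS.includes(word.toLowerCase())) {
      segments.push({ text: current.join(" "), pauseAfterMs: clamped });
      current = [];
    }
  }
  if (current.length > 0) {
    segments.push({ text: current.join(" "), pauseAfterMs: 0 });
  }
  return segments;
}
```

Brief pauses at these natural boundaries are what make long synthesized sentences sound less rushed and more conversational.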
Integration Methods
After deployment, you can integrate via the React SDK or a public URL.
React SDK
Install the Rapida React SDK:
Public URL
Every SDK / API deployment generates a public URL you can share or embed in an iframe. Query parameters appended to the URL populate dynamic greeting content through {{variable}} syntax.
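A minimal sketch of assembling a shareable URL with query parameters that feed {{variable}} placeholders. The base URL below is a placeholder, not a real Rapida endpoint; use the URL shown on your deployment's detail page.

```typescript
// Build a shareable deployment URL whose query parameters populate
// {{variable}} placeholders in the greeting. The base URL is a placeholder.
function buildDeploymentUrl(base: string, params: Record<string, string>): string {
  const url = new URL(base);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

const shareUrl = buildDeploymentUrl("https://example.com/deployment/abc123", {
  name: "Ada",
  product: "Rapida",
});
// e.g. https://example.com/deployment/abc123?name=Ada&product=Rapida
```

Using the URL and URLSearchParams APIs handles percent-encoding for you, so values with spaces or special characters stay valid.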
Input and Output Modes
| Voice Input | Voice Output | Integration Style |
|---|---|---|
| Disabled | Disabled | Text-only chat via SDK |
| Enabled | Disabled | Users speak, assistant replies with text |
| Disabled | Enabled | Users type, assistant replies with voice + text |
| Enabled | Enabled | Full voice conversation with real-time transcripts |
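The table above reduces to a simple lookup on the two voice toggles, sketched here for clarity (the function name is illustrative, not an SDK API):

```typescript
// Illustrative lookup mirroring the input/output mode table above.
function integrationStyle(voiceInput: boolean, voiceOutput: boolean): string {
  if (voiceInput && voiceOutput) return "Full voice conversation with real-time transcripts";
  if (voiceInput) return "Users speak, assistant replies with text";
  if (voiceOutput) return "Users type, assistant replies with voice + text";
  return "Text-only chat via SDK";
}
```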
Use Cases
Custom Voice Interface
Build a fully branded voice experience inside your own React application with complete UI control.
In-App Support
Add voice-powered support directly inside your SaaS product without redirecting users.
Voice Search
Implement voice-activated search for content-rich applications.
Accessibility
Enhance web accessibility with voice navigation for visually impaired users.
Related
- Create an Assistant — Set up your assistant before deploying
- Web Widget — Pre-built embeddable widget (no custom UI needed)
- API Reference — SDK installation and authentication
- Conversation Logs — Monitor sessions and transcripts