An assistant is the core unit of Rapida. It packages everything needed to run a production voice AI conversation — the LLM and prompt, the voice pipeline (STT, VAD, TTS), knowledge sources, tools, and deployment channels — into a single versioned object. One assistant configuration drives every channel. The same prompt, model, and voice settings that handle your inbound phone calls also power your web widget and WhatsApp deployment. Change something once and it propagates everywhere.
Assistants are version-controlled. Every prompt or model change creates a new draft version. Versions must be explicitly released — live deployments are never changed automatically.

Anatomy of an assistant

Prompt & Model

Define persona and behavior with a system prompt, then choose and tune the LLM model per version.

Voice Pipeline

Tune STT, VAD, noise cancellation, end-of-speech detection, and TTS per deployment channel.

Knowledge Bases

Attach one or more knowledge bases for retrieval-augmented responses during live calls.

Tools

Let the LLM call tools mid-conversation: knowledge, APIs, endpoint prompts, hold, and end session.

Deployments

Run the same assistant across phone, web, and WhatsApp with channel-specific voice settings.

Webhooks

Stream conversation events, transcripts, and metadata to your external systems in real time.

Post-call Analysis

Run post-call analysis for sentiment, intent, compliance, CSAT, and custom metrics.
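Webhook events arrive at your systems as they happen, so a consumer is typically a small HTTP handler that routes on event type. The sketch below shows that routing logic in isolation; the event names and payload fields ("type", "conversation_id", "data") are hypothetical placeholders for illustration, not Rapida's actual schema — check the Webhooks reference for the real contract.

```python
import json

def handle_event(raw: bytes) -> str:
    """Route one webhook delivery by its (hypothetical) event type."""
    event = json.loads(raw)
    etype = event.get("type", "unknown")
    if etype == "transcript.updated":
        # e.g. append the new turn to your own conversation store
        return f"turn from {event['data']['role']}"
    if etype == "conversation.completed":
        # e.g. kick off downstream CRM updates or analysis jobs
        return f"completed {event['conversation_id']}"
    return f"ignored {etype}"

payload = json.dumps({
    "type": "conversation.completed",
    "conversation_id": "conv_123",
}).encode()
print(handle_event(payload))  # → completed conv_123
```

Keeping the routing pure (bytes in, result out) like this makes it easy to unit-test independently of whatever web framework receives the POST.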

How a voice conversation works

Every conversation follows the same pipeline from audio-in to audio-out. Understanding this flow helps you tune latency, accuracy, and behavior at each stage.
The EOS timeout (default 700ms) is the primary latency control between the caller finishing speaking and the assistant beginning to respond. Reduce it for snappy IVR-style interactions; increase it for conversational use cases where callers pause mid-thought.

Deployment channels

The same assistant is deployable across every channel. Each deployment is independently configured for voice settings and conversation experience while sharing the assistant’s prompt, model, and tools.

Phone

Inbound and outbound PSTN calls via Twilio, Vonage, Exotel, Asterisk, or SIP.

Web Widget

Embeddable widget for text and voice on any site with a script tag.

Web App (React SDK)

Native React SDK integration with direct WebRTC audio streaming.

WhatsApp

Deploy on WhatsApp Business with multi-turn context across messages.

API / SDK

Programmatically create calls, pass variables, stream transcripts, and consume webhooks.

Debugger

Test live conversations in-browser before release, with real-time logs and tool traces.
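For the API / SDK channel, creating an outbound call usually amounts to posting a small request that names the assistant, the destination number, and any variables to merge into the prompt. The sketch below only builds such a payload; the field names (`assistant_id`, `to`, `variables`) are hypothetical placeholders — consult the API / SDK reference for the real endpoint and schema.

```python
import json

def build_call_request(assistant_id: str, phone: str, variables: dict) -> dict:
    """Assemble a (hypothetical) outbound-call request body."""
    return {
        "assistant_id": assistant_id,  # which assistant's live version to run
        "to": phone,                   # destination PSTN number
        "variables": variables,        # merged into the prompt at call start
    }

req = build_call_request("asst_abc", "+15550100", {"customer_name": "Ada"})
print(json.dumps(req, indent=2))
```

The same variables passed here are what the released version's prompt template can reference per call, keeping one assistant configuration reusable across callers.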

The assistant lifecycle

1. Create — Define the assistant with an LLM provider and initial system prompt. The first version (v1) is created automatically in draft state.
2. Configure — Attach knowledge bases, add tools, tune the voice pipeline, and set up deployments. Each deployment channel has its own voice settings and experience configuration.
3. Test — Use the built-in Debugger deployment to run live conversations before any real traffic hits the assistant. Inspect transcripts, latency breakdowns, and tool invocations.
4. Release — Promote a version from draft to live. All active deployments switch to the released version immediately.
5. Monitor — Every conversation generates structured logs: full transcript, per-turn latency, tool call results, LLM token usage, and EOS timing. Webhook events and analysis pipeline outputs flow to your downstream systems.
6. Iterate — Create a new version with updated prompt or model parameters. Test in the Debugger. Release when confident. Previous versions are preserved and can be re-released for instant rollback.
Use separate assistants for distinct products or personas rather than a single assistant with complex conditional logic in the prompt. Assistants are cheap to create — isolation keeps prompts focused and version history clean.

In this section

Create an Assistant

Step-by-step guide through creation, prompt setup, model configuration, voice pipeline, and tools.

Voice Activity Detection

Configure VAD providers (Silero, FireRed) — speech detection sensitivity, barge-in timing, and noise handling.

End of Speech Detection

Configure EOS providers (Silence-Based, Pipecat Smart Turn) — turn detection, latency tuning, and parameter guidance.

Version Control

Create, compare, and release new versions. Roll back instantly if something goes wrong.

Tools

Add knowledge retrieval, API calls, endpoint invocations, hold, and end-of-conversation tools.

Knowledge

Attach knowledge bases and tune retrieval settings for your use case.

Webhooks

Configure event-driven delivery of transcripts and call data to external systems.

Logs

Browse conversation transcripts, tool call traces, LLM token usage, and latency breakdowns for every session.

Post-call Analysis

Run LLM-powered analysis pipelines against completed conversations.

AgentKit

Replace Rapida’s built-in LLM with your own gRPC backend — LangChain, CrewAI, or custom logic.

Twilio Integration

Step-by-step guide for connecting a Twilio phone number to your assistant.