AgentKit lets you run your own backend for live voice conversations. Rapida handles telephony, speech-to-text (STT), and text-to-speech (TTS), while your server handles reasoning and tool execution over a bidirectional gRPC stream. This lets you connect any LLM backend to Rapida.

Quickstart

Step 1: Create an AgentKit assistant in Rapida

AgentKit assistant configuration in Rapida
  1. Open your Rapida dashboard and create (or edit) an assistant.
  2. In provider/model settings, choose AgentKit as the LLM backend.
  3. Configure the AgentKit connection:
    • URL: your gRPC server address (for example: your-host:50051)
    • Auth token: if your server enforces token auth
    • TLS certificate settings: if your server uses TLS
  4. Save the assistant version and release it.
  5. Attach your voice deployment channel (phone/web/etc.) to this version.
Your AgentKit server must be reachable from the Rapida network. A localhost address works only in local development setups where connectivity is bridged (see Local Development Setup (ngrok)).
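Before saving the URL, it can help to confirm that the address accepts TCP connections at all. The following is a minimal sketch using only the Python standard library; `is_reachable` is a hypothetical helper name, not part of any Rapida SDK. It checks the TCP handshake only (not gRPC, TLS, or token auth), and it runs from your machine, so a success does not prove reachability from the Rapida network.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `is_reachable("your-host", 50051)` returns False while the server is down or the port is blocked.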

Step 2: Start from official examples

Use the Python or Node.js starter depending on your backend stack.
git clone https://github.com/rapidaai/rapida-python-recipes.git
cd rapida-python-recipes
pip install -r requirements.txt

cd examples/python/agentkit
export OPENAI_API_KEY="<your_key>"
python main.py --host 0.0.0.0 --port 50051
Node.js reference implementation: agentkit/index.js.

Step 3: Implement required stream flow

Implement Talk(...) using the lifecycle contract described in Stream Lifecycle below.
Minimum required order:
  1. handle initialization first and acknowledge it
  2. handle optional configuration updates
  3. process message turns and stream assistant/tool outputs
def Talk(self, request_iterator, context):
    # One loop iteration per inbound frame on the long-lived stream.
    for request in request_iterator:
        if self.is_initialization_request(request):
            # Handshake: acknowledge before handling anything else.
            yield self.initialization_response(request.initialization)

        elif self.is_configuration_request(request):
            # Optional runtime configuration update.
            yield self.configuration_response(request.configuration)

        elif self.is_message_request(request):
            if not self.is_text_message(request):
                # This example handles text turns only.
                yield self.error_response(501, "Audio not implemented")
                continue

            msg_id = self.get_message_id(request)
            # Stream a partial chunk, then close the turn with completed=True.
            yield self.assistant_response(msg_id, "Working on it... ", completed=False)
            yield self.assistant_response(msg_id, "Done.", completed=True)

        else:
            yield self.error_response(400, "Unknown request type")
If initialization is not handled first, conversations can fail with request-shape errors.
Node.js servers implement the same flow by extending AgentKitAgent and writing responses to the gRPC stream:
import { AgentKitAgent, AgentKitServer } from "@rapidaai/nodejs";

class EchoAgent extends AgentKitAgent {
  talk(call) {
    call.on("data", (request) => {
      if (this.isInitializationRequest(request)) {
        call.write(this.initializationResponse(request.getInitialization()));
        return;
      }

      if (this.isConfigurationRequest(request)) {
        call.write(this.configurationResponse(request.getConfiguration()));
        return;
      }

      if (this.isTextMessage(request)) {
        const messageId = this.getMessageId(request);
        const text = this.getUserText(request);

        call.write(this.assistantResponse(messageId, `You said: ${text}`, false));
        call.write(
          this.assistantResponse(
            messageId,
            "Thanks, I received your message.",
            true
          )
        );
      }
    });

    call.on("end", () => call.end());
  }
}

const server = new AgentKitServer({
  agent: new EchoAgent(),
  port: Number(process.env.AGENTKIT_PORT || 50051),
});

await server.start();
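
The initialization-first requirement of this step can also be enforced explicitly rather than relying on Rapida's frame order. The sketch below illustrates the idea with plain dictionaries standing in for TalkInput and TalkOutput frames. The dictionary shapes and the function name `talk` are assumptions for illustration only; a real server would use the SDK helpers shown in the examples above.

```python
from typing import Dict, Iterable, Iterator


def talk(requests: Iterable[Dict]) -> Iterator[Dict]:
    """Yield response frames, rejecting any frame that precedes initialization."""
    initialized = False
    for request in requests:
        kind = request.get("type")
        if kind == "initialization":
            initialized = True
            yield {"type": "initialization", "code": 200}
        elif not initialized:
            # Anything before the handshake is answered with an error frame.
            yield {"type": "error", "code": 400,
                   "message": "initialization required first"}
        elif kind == "configuration":
            yield {"type": "configuration", "code": 200}
        elif kind == "message":
            # Stand-in reply; a real backend would call its LLM here.
            yield {"type": "assistant", "id": request["id"],
                   "text": "Done.", "completed": True}
        else:
            yield {"type": "error", "code": 400, "message": "Unknown request type"}
```

Keeping the guard as a single boolean per stream keeps the check cheap, and it makes an out-of-order frame visible as an explicit error rather than a later request-shape failure.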

Step 4: Implement tool lifecycle

For each tool execution in your backend:
  1. emit tool_call(...)
  2. execute tool locally
  3. emit tool_call_result(...)
  4. continue assistant response (or send call directive)
import uuid

tool_id = str(uuid.uuid4())

yield self.tool_call(msg_id, tool_id, "get_weather", {"city": "London"})
result = {"city": "London", "temp_c": 22}
yield self.tool_call_result(msg_id, tool_id, "get_weather", result, success=True)
yield self.assistant_response(msg_id, "In London it's 22°C.", completed=True)
For call-ending directives (END_CONVERSATION / TRANSFER_CONVERSATION), include toolId and name in the directive payload.
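
The four tool lifecycle steps can be wrapped in one generator so that every tool run emits a matched pair of frames, even when the tool raises. This is a minimal sketch with plain dictionaries standing in for the frames that tool_call(...) and tool_call_result(...) build. `run_tool` and the dictionary keys are hypothetical, and `get_weather` is a stub with a canned answer.

```python
import uuid
from typing import Any, Callable, Dict, Iterator


def run_tool(msg_id: str, name: str, args: Dict[str, Any],
             tool: Callable[..., Any]) -> Iterator[Dict[str, Any]]:
    """Yield a tool-call frame, run the tool, then yield a tool-result frame."""
    tool_id = str(uuid.uuid4())  # one id shared by both frames
    yield {"type": "toolCall", "id": msg_id, "toolId": tool_id,
           "name": name, "args": args}
    try:
        result, success = tool(**args), True
    except Exception as exc:
        # Report the failure in the result frame instead of breaking the stream.
        result, success = {"error": str(exc)}, False
    yield {"type": "toolCallResult", "id": msg_id, "toolId": tool_id,
           "name": name, "result": result, "success": success}


def get_weather(city: str) -> Dict[str, Any]:
    # Hypothetical tool returning a fixed value for illustration.
    return {"city": city, "temp_c": 22}
```

Inside a Talk handler this would be used as `yield from run_tool(msg_id, "get_weather", {"city": "London"}, get_weather)`, followed by the final assistant response for the turn.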

Step 5: Configure security

Token auth:
import os

server = AgentKitServer(
    agent=MyAgent(),
    host="0.0.0.0",
    port=50051,
    # Read the token from the environment rather than hardcoding it.
    auth_config={"enabled": True, "token": os.getenv("AGENTKIT_TOKEN")},
)
TLS:
server = AgentKitServer(
    agent=MyAgent(),
    host="0.0.0.0",
    port=50051,
    ssl_config={
        "cert_path": "/etc/ssl/server.crt",
        "key_path": "/etc/ssl/server.key",
    },
)
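
One way to keep secrets and certificate paths out of source code is to build both option dictionaries from environment variables. The sketch below is an illustration under assumptions: `security_kwargs` is a hypothetical helper, AGENTKIT_TLS_CERT and AGENTKIT_TLS_KEY are made-up variable names, and it assumes auth_config and ssl_config can be passed to the same AgentKitServer call, which the two separate examples above do not confirm.

```python
import os
from typing import Any, Dict, Mapping, Optional


def security_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build auth_config / ssl_config keyword arguments from environment variables."""
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}

    token = env.get("AGENTKIT_TOKEN")
    if token:
        kwargs["auth_config"] = {"enabled": True, "token": token}

    # AGENTKIT_TLS_CERT / AGENTKIT_TLS_KEY are hypothetical variable names.
    cert, key = env.get("AGENTKIT_TLS_CERT"), env.get("AGENTKIT_TLS_KEY")
    if bool(cert) != bool(key):
        raise ValueError("set both AGENTKIT_TLS_CERT and AGENTKIT_TLS_KEY, or neither")
    if cert and key:
        kwargs["ssl_config"] = {"cert_path": cert, "key_path": key}

    return kwargs
```

The result could then be unpacked into the constructor, for example `AgentKitServer(agent=MyAgent(), host="0.0.0.0", port=50051, **security_kwargs())`. Failing fast when only one TLS variable is set avoids starting a server that silently falls back to plaintext.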

Step 6: Validate before customer traffic

Use the test client:
python examples/python/executor/agentkit/test_client.py localhost:50051
Validate:
  • initialization acknowledgement
  • assistant chunks and final response
  • tool call and tool call result events (for tooling flows)

Stream Lifecycle

The AgentKit stream is long-lived and stateful. Treat it as a lifecycle, not as isolated RPC requests.

Lifecycle phases

| Phase | Inbound frame from Rapida | Outbound frame from your backend | Required | Purpose |
| --- | --- | --- | --- | --- |
| 1. Handshake | initialization | initialization_response(...) | Yes | Establish conversation identity and runtime context. |
| 2. Runtime config | configuration | configuration_response(...) | Optional | Apply stream-mode/runtime updates. |
| 3. User turn | message | assistant_response(...) chunks + final | Yes (per turn) | Return assistant output for each user turn. |
| 4. Tool execution | Internal tool decision during turn | tool_call(...) then tool_call_result(...) | Optional | Report tool lifecycle and results. |
| 5. Call directive | Internal decision to transfer/end | toolCall with action + toolId + name | Optional | Signal call transfer or termination. |
| 6. Termination | Client/server closes stream | none (or final error/event) | Yes | End stream cleanly and release resources. |

Within each user turn:
  1. Receive message and extract msg_id + text.
  2. Emit one or more assistant_response(..., completed=False) chunks.
  3. If tool is needed:
    • emit tool_call(msg_id, tool_id, name, args)
    • execute tool
    • emit tool_call_result(msg_id, tool_id, name, result, success=...)
  4. Emit final assistant_response(..., completed=True) for that turn.
  5. Repeat for next message frame until stream closes.
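
When the assistant text arrives as a stream of chunks (for example, tokens from an LLM), the backend does not know in advance which chunk is the last one. A one-item lookahead handles this: hold back each chunk until the next one arrives, and mark whatever is left as final. The following is a minimal sketch with dictionaries standing in for assistant frames; `stream_turn` is a hypothetical name.

```python
from typing import Dict, Iterable, Iterator, Optional


def stream_turn(msg_id: str, chunks: Iterable[str]) -> Iterator[Dict]:
    """Yield every chunk as a non-final frame except the last, which is final."""
    previous: Optional[str] = None
    for chunk in chunks:
        if previous is not None:
            yield {"id": msg_id, "text": previous, "completed": False}
        previous = chunk
    # Always close the turn, even when the source produced no text at all.
    yield {"id": msg_id, "text": previous or "", "completed": True}
```

Because the final frame is emitted after the loop, the turn is closed with completed=True even if the chunk source is empty.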

State and correlation rules

  • initialization is always first and must be acknowledged.
  • assistant.id, toolCall.id, and toolCallResult.id should match the current message.id.
  • tool_id must be stable between tool_call and tool_call_result.
  • Final assistant frame per user turn must set completed=True.
  • For call-ending/transfer directives, include toolId and name for correlation.
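
These rules lend themselves to an automated check over the frames a backend emitted for one turn, for instance in a unit test. The sketch below assumes frames recorded as plain dictionaries with hypothetical keys (`type`, `id`, `toolId`, `completed`). It covers ordinary tool and assistant frames only; turns that end with a transfer or end-conversation directive are out of scope.

```python
from typing import Dict, List


def check_turn(message_id: str, frames: List[Dict]) -> List[str]:
    """Return a list of rule violations for the frames emitted in one user turn."""
    problems: List[str] = []
    open_tool_ids = set()  # tool calls still waiting for their result

    for frame in frames:
        kind = frame.get("type")
        if frame.get("id") != message_id:
            problems.append(f"{kind} frame id does not match message id")
        if kind == "toolCall":
            open_tool_ids.add(frame["toolId"])
        elif kind == "toolCallResult":
            if frame["toolId"] in open_tool_ids:
                open_tool_ids.remove(frame["toolId"])
            else:
                problems.append(f"toolCallResult for unknown toolId {frame['toolId']}")

    problems += [f"toolCall {tool_id} has no result" for tool_id in sorted(open_tool_ids)]

    last = frames[-1] if frames else {}
    if last.get("type") != "assistant" or not last.get("completed"):
        problems.append("turn does not end with assistant completed=True")
    return problems
```

An empty list means the recorded turn satisfies the id, tool_id, and completion rules as modelled here.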

Request Types (TalkInput)

| Request Type | What it is | Why it exists | What your backend should do |
| --- | --- | --- | --- |
| initialization | First frame of every stream (assistantConversationId, assistant metadata, runtime args). | Establishes conversation/session context before user turns start. | Always acknowledge using initialization_response(request.initialization). |
| configuration | Optional runtime configuration update frame. | Allows stream behavior updates mid-session. | Acknowledge with configuration_response(request.configuration). |
| message | User turn payload (id, text/audio, completed, time). | Carries the actual user input to process. | Parse with helpers (is_text_message, get_user_text, get_message_id) and emit assistant/tool responses. |

Response Types (TalkOutput)

| Response Type | What it is | Why it exists | When to emit |
| --- | --- | --- | --- |
| code / success | Base response envelope fields. | Standard status signaling for each frame. | On every response (200/true by default). |
| initialization | Initialization acknowledgement payload. | Confirms handshake accepted. | Via initialization_response(...). |
| assistant | Assistant chunk/final text (id, text, completed). | Streams voice response content back to Rapida. | Via assistant_response(...) for chunks and final frame. |
| toolCall | Tool lifecycle or directive event (id, toolId, name, action, args). | Announces tool start or call-level action (transfer/end). | Via tool_call(...) or explicit directive payload. |
| toolCallResult | Tool execution result (id, toolId, name, result). | Closes the tool lifecycle and reports output. | Via tool_call_result(...) after tool execution. |
| interruption | Optional interruption signal. | Represents interruption events in stream. | When interruption handling is implemented. |
| error | Error frame (errorCode, errorMessage). | Signals unrecoverable backend failures. | Via error_response(code, message). |

Keep the same tool_id between tool_call and tool_call_result. For call-ending directives, include toolId and name for correlation.

Utilities

Response Builder Utilities (AgentKitAgent)

| Utility | Input | What it returns | Why/When to use |
| --- | --- | --- | --- |
| response(code=200, success=True, **kwargs) | Base status + payload fields | TalkOutput | Low-level custom frame builder. |
| initialization_response(initialization) | ConversationInitialization | TalkOutput.initialization | Required ack for first frame. |
| configuration_response(configuration) | ConversationConfiguration | TalkOutput ack | Optional config ack frame. |
| assistant_response(msg_id, content, completed=False) | message id + text + completion flag | TalkOutput.assistant | Streaming/final assistant text frames. |
| error_response(code, message) | code + message | TalkOutput.error | Error signaling in stream. |
| tool_call(msg_id, tool_id, name, args) | tool start metadata | TalkOutput.toolCall | Start tool lifecycle event. |
| tool_call_result(msg_id, tool_id, name, result, success=True) | tool result metadata | TalkOutput.toolCallResult | Finish tool lifecycle event. |
| transfer_call(msg_id, args) | transfer args | TalkOutput.toolCall directive | Transfer call action. |
| terminate_call(msg_id, args) | termination args | TalkOutput.toolCall directive | End-conversation action. |

Request Helper Utilities (AgentKitAgent)

| Utility | What it checks/returns | Why/When to use |
| --- | --- | --- |
| is_initialization_request(request) | True if initialization frame | First branch in Talk. |
| is_configuration_request(request) | True if configuration frame | Handle optional config updates. |
| is_message_request(request) | True if message frame | Main user-turn branch. |
| is_text_message(request) | True for text turn | Process user text safely. |
| is_audio_message(request) | True for audio turn | Handle/decline audio mode explicitly. |
| get_user_text(request) | User text or None | Extract text payload. |
| get_message_id(request) | Message id or None | Correlate assistant/tool frames. |
| get_conversation_id(request) | Conversation id or None | Conversation-scoped state mapping. |
| get_assistant_id(request) | Assistant id or None | Logging/routing/debug context. |
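
The conversation id is a natural key for conversation-scoped state such as chat history. The sketch below shows one possible in-memory store; `ConversationState` and `ConversationStore` are hypothetical names, the state is lost when the process restarts, and a production backend might prefer an external store.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ConversationState:
    # Chat history as role/content pairs, the shape many LLM APIs accept.
    history: List[Dict[str, str]] = field(default_factory=list)


class ConversationStore:
    """In-memory state keyed by conversation id."""

    def __init__(self) -> None:
        self._states: Dict[str, ConversationState] = {}

    def get(self, conversation_id: str) -> ConversationState:
        # Create the state lazily the first time a conversation is seen.
        return self._states.setdefault(conversation_id, ConversationState())

    def close(self, conversation_id: str) -> Optional[ConversationState]:
        # Call when the stream terminates so the memory is released.
        return self._states.pop(conversation_id, None)
```

A handler could call `store.get(conversation_id)` on each message frame and `store.close(conversation_id)` once the stream ends, which matches the termination phase's goal of releasing resources.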

Local Development Setup (ngrok)

For local development, expose your AgentKit server to Rapida with ngrok:
ngrok tcp 50051
ngrok prints a forwarding endpoint like:
  • tcp://0.tcp.ngrok.io:12345
Use the host and port (0.tcp.ngrok.io:12345) as the AgentKit URL in your Rapida assistant configuration.
Development notes:
  • keep the local AgentKit server running on port 50051 during testing
  • if auth is enabled, set the same token in Rapida AgentKit provider config
  • if ngrok restarts, the endpoint changes unless a reserved domain is configured
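
Since the endpoint can change on every restart, converting the forwarding line into the host:port form is worth scripting. The following is a small sketch using the standard library's urlsplit; `agentkit_url_from_ngrok` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def agentkit_url_from_ngrok(forwarding: str) -> str:
    """Turn 'tcp://host:port' into the 'host:port' form used as the AgentKit URL."""
    parts = urlsplit(forwarding)
    if parts.scheme != "tcp" or not parts.hostname or parts.port is None:
        raise ValueError(f"not a tcp://host:port endpoint: {forwarding!r}")
    return f"{parts.hostname}:{parts.port}"
```

For the endpoint shown above, `agentkit_url_from_ngrok("tcp://0.tcp.ngrok.io:12345")` returns `"0.tcp.ngrok.io:12345"`.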

Example Matrix (GitHub)

Python base path:
  • https://github.com/rapidaai/rapida-python-recipes/tree/main
Node.js base path:
  • https://github.com/rapidaai/rapida-nodejs-recipes/tree/main
| Example | Purpose |
| --- | --- |
| agentkit/index.js | Node.js EchoAgent gRPC server baseline |
| agentkit/main.py | Minimal production-style AgentKit baseline |
| executor/agentkit/openai-gpt.py | OpenAI chat + tool lifecycle |
| executor/agentkit/azure-openai-gpt.py | Azure OpenAI + tool lifecycle |
| executor/agentkit/anthropic-claude.py | Anthropic Claude + tool lifecycle |
| executor/agentkit/gemini-example.py | Google Gemini + function/tool flow |
| executor/agentkit/langchain-agent.py | LangChain orchestrated backend |
| executor/agentkit/crewai-multi-agent.py | CrewAI multi-agent orchestration |
| executor/agentkit/autogen-collaborative.py | AutoGen collaborative agents |
| executor/agentkit/n8n-workflow.py | n8n workflow automation integration |
| executor/agentkit/example-with-ssl-option.py | TLS/auth focused server setup |
| executor/agentkit/test_client.py | Local stream validation client |

Production Checklist

  • handle initialization first on every stream
  • always send final assistant_response(..., completed=True)
  • keep tool_id consistent between tool_call and tool_call_result
  • include toolId + name for call-ending directives
  • do not hardcode secrets in source code
  • configure TLS + auth for production