Softphone Implementation with Twilio & OpenAI Realtime
Overview
This implementation adds voice calling to the platform, using Twilio for telephony and the OpenAI Realtime API for AI-assisted calls. The softphone is available globally through a Vue component, and call state is synchronized over a WebSocket connection.
Architecture
Backend (NestJS + Fastify)
Core Components
- VoiceModule (backend/src/voice/)
  - voice.module.ts - Module configuration
  - voice.gateway.ts - WebSocket gateway for real-time signaling
  - voice.service.ts - Business logic for call orchestration
  - voice.controller.ts - REST endpoints and Twilio webhooks
  - dto/ - Data transfer objects for type safety
  - interfaces/ - TypeScript interfaces for configuration
- Database Schema
  - Central Database: integrationsConfig JSON field in Tenant model (encrypted)
  - Tenant Database: calls table for call history and metadata
- WebSocket Gateway
  - Namespace: /voice
  - Authentication: JWT token validation in handshake
  - Tenant Context: Extracted from JWT payload
  - Events: call:initiate, call:accept, call:reject, call:end, call:dtmf
  - AI Events: ai:transcript, ai:suggestion, ai:action
- Twilio Integration
  - SDK: twilio npm package
  - Features: Outbound calls, TwiML responses, Media Streams, webhooks
  - Credentials: Stored encrypted per tenant in integrationsConfig.twilio
- OpenAI Realtime Integration
  - Connection: WebSocket to wss://api.openai.com/v1/realtime
  - Features: Real-time transcription, AI suggestions, tool calling
  - Credentials: Stored encrypted per tenant in integrationsConfig.openai
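As a rough illustration of the OpenAI Realtime connection listed above, the sketch below builds the WebSocket URL and headers from a tenant's decrypted openai config. The function and interface names are hypothetical, and the OpenAI-Beta header is an assumption based on the preview version of the Realtime API; newer API versions may not require it.

```typescript
// Shape of integrationsConfig.openai as shown in the tenant configuration.
interface OpenAiConfig {
  apiKey: string;
  model: string;
  voice: string;
}

// Hypothetical return type: everything a WebSocket client needs to connect.
interface RealtimeConnection {
  url: string;
  headers: Record<string, string>;
}

function buildRealtimeConnection(config: OpenAiConfig): RealtimeConnection {
  const url = new URL("wss://api.openai.com/v1/realtime");
  // The model is selected through a query parameter.
  url.searchParams.set("model", config.model);
  return {
    url: url.toString(),
    headers: {
      Authorization: `Bearer ${config.apiKey}`,
      // Assumption: required by the preview (beta) Realtime API.
      "OpenAI-Beta": "realtime=v1",
    },
  };
}
```

The result could then be handed to the ws client listed under Dependencies, for example as new WebSocket(connection.url, { headers: connection.headers }).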
Frontend (Nuxt 3 + Vue 3)
Core Components
- useSoftphone Composable (frontend/composables/useSoftphone.ts)
  - Module-level shared state for global access
  - WebSocket connection management with auto-reconnect
  - Call state management (current call, incoming call)
  - Audio management (ringtone playback)
  - Event handlers for call lifecycle and AI events
- SoftphoneDialog Component (frontend/components/SoftphoneDialog.vue)
  - Global dialog accessible from anywhere
  - Features:
    - Dialer with numeric keypad
    - Incoming call notifications with ringtone
    - Active call controls (mute, DTMF, hang up)
    - Real-time transcript display
    - AI suggestions panel
    - Recent call history
- Integration in Layout (frontend/layouts/default.vue)
  - SoftphoneDialog included globally
  - Sidebar button with incoming call indicator
- Settings Page (frontend/pages/settings/integrations.vue)
  - Configure Twilio credentials
  - Configure OpenAI API settings
  - Encrypted storage via backend API
Configuration
Environment Variables
Backend (.env)
BACKEND_URL=http://localhost:3000
ENCRYPTION_KEY=your-32-byte-hex-key
Frontend (.env)
VITE_BACKEND_URL=http://localhost:3000
Tenant Configuration
Integrations are configured per tenant via the settings UI or API:
{
  "twilio": {
    "accountSid": "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "authToken": "your-auth-token",
    "phoneNumber": "+1234567890"
  },
  "openai": {
    "apiKey": "sk-...",
    "model": "gpt-4o-realtime-preview",
    "voice": "alloy"
  }
}
This configuration is encrypted using AES-256-CBC and stored in the central database.
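To make the encryption step concrete, here is a minimal sketch of AES-256-CBC encryption and decryption of the integrations config using Node's built-in crypto module. The helper names and the "iv:ciphertext" hex layout are assumptions for illustration, not necessarily what the backend uses; the key is expected as 64 hex characters (32 bytes), matching ENCRYPTION_KEY.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const ALGORITHM = "aes-256-cbc";

/** Parse a 64-character hex string into the 32-byte key AES-256 requires. */
function loadKey(hexKey: string): Buffer {
  const key = Buffer.from(hexKey, "hex");
  if (key.length !== 32) {
    throw new Error("ENCRYPTION_KEY must be 32 bytes (64 hex characters)");
  }
  return key;
}

/** Encrypt a config object; a fresh random IV is generated on every call. */
function encryptConfig(config: unknown, hexKey: string): string {
  const iv = randomBytes(16); // CBC mode uses a 16-byte IV
  const cipher = createCipheriv(ALGORITHM, loadKey(hexKey), iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(config), "utf8"),
    cipher.final(),
  ]);
  // Assumed storage layout: hex IV and hex ciphertext joined by ":".
  return `${iv.toString("hex")}:${ciphertext.toString("hex")}`;
}

/** Reverse encryptConfig and parse the JSON back into an object. */
function decryptConfig<T>(payload: string, hexKey: string): T {
  const [ivHex, dataHex] = payload.split(":");
  if (!ivHex || !dataHex) {
    throw new Error("Malformed encrypted payload");
  }
  const decipher = createDecipheriv(ALGORITHM, loadKey(hexKey), Buffer.from(ivHex, "hex"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(dataHex, "hex")),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString("utf8")) as T;
}
```

Note that CBC on its own provides confidentiality but no integrity check; detecting tampering would need an added HMAC or an authenticated mode.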
API Endpoints
REST Endpoints
- POST /api/voice/call - Initiate outbound call
- GET /api/voice/calls - Get call history
- POST /api/voice/twiml/outbound - TwiML for outbound calls
- POST /api/voice/twiml/inbound - TwiML for inbound calls
- POST /api/voice/webhook/status - Twilio status webhook
- POST /api/voice/webhook/recording - Twilio recording webhook
- GET /api/tenant/integrations - Get integrations config (masked)
- PUT /api/tenant/integrations - Update integrations config
WebSocket Events
Client → Server
- call:initiate - Initiate outbound call
- call:accept - Accept incoming call
- call:reject - Reject incoming call
- call:end - End active call
- call:dtmf - Send DTMF tone
Server → Client
- call:incoming - Incoming call notification
- call:initiated - Call initiation confirmed
- call:accepted - Call accepted
- call:rejected - Call rejected
- call:ended - Call ended
- call:update - Call status update
- call:error - Call error
- call:state - Full call state sync
- ai:transcript - AI transcription update
- ai:suggestion - AI suggestion
- ai:action - AI action executed
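One way a client might fold these server events into the "current call" and "incoming call" state is a small reducer, sketched below. Only the event names come from the list above; the payload fields (callSid, from, to, status) and the SoftphoneState shape are assumptions, and only a subset of events is handled.

```typescript
// Hypothetical payload shape; the real payloads may carry different fields.
interface CallInfo {
  callSid: string;
  from: string;
  to: string;
  status: string;
}

type ServerEvent =
  | { type: "call:incoming"; call: CallInfo }
  | { type: "call:initiated"; call: CallInfo }
  | { type: "call:accepted"; callSid: string }
  | { type: "call:rejected"; callSid: string }
  | { type: "call:ended"; callSid: string }
  | { type: "call:update"; callSid: string; status: string };

interface SoftphoneState {
  currentCall: CallInfo | null;
  incomingCall: CallInfo | null;
}

/** Pure state transition: returns a new state and never mutates the input. */
function reduceCallEvent(state: SoftphoneState, event: ServerEvent): SoftphoneState {
  switch (event.type) {
    case "call:incoming":
      return { ...state, incomingCall: event.call };
    case "call:initiated":
      return { ...state, currentCall: event.call };
    case "call:accepted":
      // Promote the ringing call to the active call.
      if (state.incomingCall?.callSid === event.callSid) {
        return {
          currentCall: { ...state.incomingCall, status: "in-progress" },
          incomingCall: null,
        };
      }
      return state;
    case "call:rejected":
      return state.incomingCall?.callSid === event.callSid
        ? { ...state, incomingCall: null }
        : state;
    case "call:ended":
      // Clear whichever slot holds the call that ended.
      return {
        currentCall: state.currentCall?.callSid === event.callSid ? null : state.currentCall,
        incomingCall: state.incomingCall?.callSid === event.callSid ? null : state.incomingCall,
      };
    case "call:update":
      return state.currentCall?.callSid === event.callSid
        ? { ...state, currentCall: { ...state.currentCall, status: event.status } }
        : state;
  }
}
```

Keeping the transition logic pure makes it easy to unit test separately from the socket wiring and the ringtone side effects.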
Database Schema
Central Database - Tenant Model
model Tenant {
  id                 String   @id @default(cuid())
  name               String
  slug               String   @unique
  dbHost             String
  dbPort             Int      @default(3306)
  dbName             String
  dbUsername         String
  dbPassword         String   // Encrypted
  integrationsConfig Json?    // NEW: Encrypted JSON config
  status             String   @default("active")
  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt
  domains            Domain[]
}
Tenant Database - Calls Table
CREATE TABLE calls (
  id VARCHAR(36) PRIMARY KEY,
  call_sid VARCHAR(100) UNIQUE NOT NULL,
  direction ENUM('inbound', 'outbound') NOT NULL,
  from_number VARCHAR(20) NOT NULL,
  to_number VARCHAR(20) NOT NULL,
  status ENUM('queued', 'ringing', 'in-progress', 'completed', 'busy', 'failed', 'no-answer', 'canceled'),
  duration_seconds INT UNSIGNED,
  recording_url VARCHAR(500),
  ai_transcript TEXT,
  ai_summary TEXT,
  ai_insights JSON,
  user_id VARCHAR(36) NOT NULL,
  started_at TIMESTAMP,
  ended_at TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
  INDEX idx_call_sid (call_sid),
  INDEX idx_user_id (user_id),
  INDEX idx_status (status),
  INDEX idx_direction (direction),
  INDEX idx_created_user (created_at, user_id)
);
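The status webhook presumably has to translate Twilio's status callback fields into updates on this table. The sketch below shows one possible mapping. CallSid, CallStatus, and CallDuration are standard Twilio status callback parameters; the function name, the CallUpdate shape, and the decision to store Twilio's "initiated" status as "queued" (the ENUM above has no matching value) are assumptions.

```typescript
// Status values allowed by the calls.status ENUM.
type CallStatus =
  | "queued" | "ringing" | "in-progress" | "completed"
  | "busy" | "failed" | "no-answer" | "canceled";

const CALL_STATUSES = new Set<string>([
  "queued", "ringing", "in-progress", "completed",
  "busy", "failed", "no-answer", "canceled",
]);

// Statuses after which the call is over and ended_at can be set.
const TERMINAL_STATUSES = new Set<string>([
  "completed", "busy", "failed", "no-answer", "canceled",
]);

// Subset of the form fields Twilio posts to a status callback.
interface TwilioStatusPayload {
  CallSid: string;
  CallStatus: string;
  CallDuration?: string; // seconds as a string, sent once the call completes
}

// Hypothetical update object for the calls row identified by callSid.
interface CallUpdate {
  callSid: string;
  status: CallStatus;
  durationSeconds: number | null;
  ended: boolean;
}

function toCallUpdate(payload: TwilioStatusPayload): CallUpdate {
  // Assumption: "initiated" has no ENUM counterpart, so it is stored as "queued".
  const raw = payload.CallStatus === "initiated" ? "queued" : payload.CallStatus;
  if (!CALL_STATUSES.has(raw)) {
    throw new Error(`Unexpected CallStatus: ${payload.CallStatus}`);
  }
  const duration =
    payload.CallDuration !== undefined ? Number.parseInt(payload.CallDuration, 10) : Number.NaN;
  return {
    callSid: payload.CallSid,
    status: raw as CallStatus,
    durationSeconds: Number.isNaN(duration) ? null : duration,
    ended: TERMINAL_STATUSES.has(raw),
  };
}
```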
Usage
For Developers
- Install Dependencies
    cd backend && npm install
    cd ../frontend && npm install
- Configure Environment
  - Set ENCRYPTION_KEY in backend .env
  - Ensure BACKEND_URL matches your deployment
- Run Migrations
    cd backend
    # Central database migration is handled by Prisma
    npm run migrate:all-tenants  # Run tenant migrations
- Start Services
    # Backend
    cd backend && npm run start:dev
    # Frontend
    cd frontend && npm run dev
For Users
- Configure Integrations
  - Navigate to Settings → Integrations
  - Enter Twilio credentials (Account SID, Auth Token, Phone Number)
  - Enter OpenAI API key
  - Click "Save Configuration"
- Make a Call
  - Click the "Softphone" button in the sidebar
  - Enter a phone number (E.164 format: +1234567890)
  - Click "Call"
- Receive Calls
  - Configure Twilio webhook URLs to point to your backend
  - Incoming calls will trigger a notification and ringtone
  - Click "Accept" to answer or "Reject" to decline
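Since the dialer expects E.164 numbers, a small validation step before placing a call can catch malformed input early. The sketch below is one way to do it; the helper names are hypothetical and the softphone may validate differently.

```typescript
// E.164: a leading "+", a non-zero first digit, and at most 15 digits in total.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(input: string): boolean {
  return E164_PATTERN.test(input);
}

/**
 * Strip spaces, parentheses, dots, and hyphens, then validate.
 * Returns the cleaned number, or null if it is still not valid E.164.
 */
function normalizePhoneNumber(input: string): string | null {
  const stripped = input.replace(/[\s().-]/g, "");
  return isE164(stripped) ? stripped : null;
}
```

For example, normalizePhoneNumber("+1 (234) 567-890") yields "+1234567890", while a number without the leading "+" yields null.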
Advanced Features
AI-Assisted Calling
The OpenAI Realtime API provides:
- Real-time Transcription - Live speech-to-text during calls
- AI Suggestions - Contextual suggestions for agents
- Tool Calling - CRM actions via AI (search contacts, create tasks, etc.)
Tool Definitions
The system includes predefined tools for AI:
- search_contact - Search CRM for contacts
- create_task - Create follow-up tasks
- update_contact - Update contact information
Tools automatically respect RBAC permissions as they call existing protected services.
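For orientation, the sketch below shows how these three tools might be declared and registered on a Realtime session through a session.update event. The tool names come from the list above; the descriptions' wording, the parameter schemas, and the buildSessionUpdate helper are assumptions for illustration.

```typescript
// Function-tool shape used by the Realtime API: type, name, description, JSON Schema parameters.
interface RealtimeTool {
  type: "function";
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, unknown>;
    required: string[];
  };
}

// Parameter schemas below are hypothetical.
const tools: RealtimeTool[] = [
  {
    type: "function",
    name: "search_contact",
    description: "Search CRM for contacts",
    parameters: {
      type: "object",
      properties: { query: { type: "string", description: "Name, email, or phone number" } },
      required: ["query"],
    },
  },
  {
    type: "function",
    name: "create_task",
    description: "Create a follow-up task",
    parameters: {
      type: "object",
      properties: {
        title: { type: "string" },
        dueDate: { type: "string", description: "ISO 8601 date" },
      },
      required: ["title"],
    },
  },
  {
    type: "function",
    name: "update_contact",
    description: "Update contact information",
    parameters: {
      type: "object",
      properties: {
        contactId: { type: "string" },
        fields: { type: "object", description: "Fields to change" },
      },
      required: ["contactId", "fields"],
    },
  },
];

/** Build the session.update event that registers the tools and voice on a session. */
function buildSessionUpdate(voice: string) {
  return {
    type: "session.update",
    session: { voice, tools, tool_choice: "auto" },
  };
}
```

The event would be sent as JSON over the Realtime WebSocket, for instance with socket.send(JSON.stringify(buildSessionUpdate("alloy"))).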
Call Recording
- Automatic recording via Twilio
- Recording URLs stored in call records
- Accessible via API for playback
Security
- Encryption - All credentials encrypted using AES-256-CBC
- Authentication - JWT-based auth for WebSocket and REST
- Tenant Isolation - Multi-tenant architecture with database-per-tenant
- RBAC - Permission-based access control (future: add voice-specific permissions)
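To illustrate JWT validation in the WebSocket handshake and tenant extraction from the payload, here is a minimal HS256 sketch built on Node's crypto module. It is a simplified stand-in: a NestJS gateway would more likely delegate verification to its existing JWT library. The tenantId claim name, the helper names, and the use of HS256 are assumptions; Socket.IO clients pass the token via the auth option, which the server reads from socket.handshake.auth.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim layout; the real payload may name the tenant differently.
interface VoiceJwtPayload {
  sub: string;
  tenantId: string;
  exp?: number; // expiry as seconds since the epoch
}

/** Demo-only signer so the sketch can be exercised end to end. */
function signHs256(claims: VoiceJwtPayload, secret: string): string {
  const encode = (value: unknown) => Buffer.from(JSON.stringify(value)).toString("base64url");
  const signingInput = `${encode({ alg: "HS256", typ: "JWT" })}.${encode(claims)}`;
  const signature = createHmac("sha256", secret).update(signingInput).digest("base64url");
  return `${signingInput}.${signature}`;
}

/** Verify algorithm, signature, and expiry, then return the claims. */
function verifyHs256(token: string, secret: string): VoiceJwtPayload {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("Malformed token");
  const [header, payload, signature] = parts as [string, string, string];

  const { alg } = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
  if (alg !== "HS256") throw new Error("Unsupported algorithm");

  const expected = createHmac("sha256", secret).update(`${header}.${payload}`).digest();
  const actual = Buffer.from(signature, "base64url");
  // Constant-time comparison; lengths must match before calling timingSafeEqual.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    throw new Error("Invalid signature");
  }

  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as VoiceJwtPayload;
  if (claims.exp !== undefined && claims.exp * 1000 <= Date.now()) {
    throw new Error("Token expired");
  }
  return claims;
}

/** Resolve the tenant for a socket from its handshake auth object. */
function tenantFromHandshake(handshake: { auth: { token?: string } }, secret: string): string {
  const token = handshake.auth.token;
  if (!token) throw new Error("Missing token");
  const claims = verifyHs256(token, secret);
  if (!claims.tenantId) throw new Error("Token carries no tenant");
  return claims.tenantId;
}
```

Rejecting the connection when tenantFromHandshake throws keeps unauthenticated sockets from ever joining a tenant's event stream.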
Limitations & Future Enhancements
Current Limitations
- Media Streaming - Twilio Media Streams WebSocket not fully implemented
- Call Routing - No intelligent routing for inbound calls yet
- Queue Management - Basic call handling, no queue system
- Audio Muting - UI placeholder, actual audio muting not implemented
- RBAC Permissions - Voice-specific permissions not yet added
Planned Enhancements
- Media Streams - Full bidirectional audio between Twilio ↔ OpenAI ↔ User
- Call Routing - Route calls based on availability, skills, round-robin
- Queue System - Call queuing with BullMQ integration
- Call Analytics - Dashboard with call metrics and insights
- RBAC Integration - Add voice.make_calls, voice.receive_calls permissions
- WebRTC - Direct browser-to-Twilio audio (bypass backend)
Troubleshooting
WebSocket Connection Issues
- Verify BACKEND_URL environment variable
- Check CORS settings in backend
- Ensure JWT token is valid and includes tenant information
Twilio Webhook Errors
- Ensure webhook URLs are publicly accessible
- Verify Twilio credentials in integrations config
- Check backend logs for webhook processing errors
OpenAI Connection Issues
- Verify OpenAI API key has Realtime API access
- Check network connectivity to OpenAI endpoints
- Monitor backend logs for WebSocket errors
Testing
Manual Testing
- Outbound Calls
    # Open softphone dialog
    # Enter test number (use Twilio test credentials)
    # Click Call
    # Verify call status updates
- Inbound Calls
    # Configure Twilio number webhook
    # Call the Twilio number from external phone
    # Verify incoming call notification
    # Accept call and verify connection
- AI Features
    # Make a call with OpenAI configured
    # Speak during the call
    # Verify transcript appears in UI
    # Check for AI suggestions
Dependencies
Backend
- @nestjs/websockets - WebSocket support
- @nestjs/platform-socket.io - Socket.IO adapter
- @fastify/websocket - Fastify WebSocket plugin
- socket.io - WebSocket library
- twilio - Twilio SDK
- openai - OpenAI SDK (for Realtime API)
- ws - WebSocket client
Frontend
- socket.io-client - WebSocket client
- lucide-vue-next - Icons
- vue-sonner - Toast notifications
Support
For issues or questions:
- Check backend logs for error details
- Verify tenant integrations configuration
- Test Twilio/OpenAI connectivity independently
- Review WebSocket connection in browser DevTools
License
Same as project license.