# Softphone Implementation with Twilio & OpenAI Realtime

## Overview

This implementation adds comprehensive voice calling functionality to the platform using Twilio for telephony and OpenAI Realtime API for AI-assisted calls. The softphone is accessible globally through a Vue component, with call state managed via WebSocket connections.

## Architecture

### Backend (NestJS + Fastify)

#### Core Components

1. **VoiceModule** (`backend/src/voice/`)
   - `voice.module.ts` - Module configuration
   - `voice.gateway.ts` - WebSocket gateway for real-time signaling
   - `voice.service.ts` - Business logic for call orchestration
   - `voice.controller.ts` - REST endpoints and Twilio webhooks
   - `dto/` - Data transfer objects for type safety
   - `interfaces/` - TypeScript interfaces for configuration

2. **Database Schema**
   - **Central Database**: `integrationsConfig` JSON field in Tenant model (encrypted)
   - **Tenant Database**: `calls` table for call history and metadata

3. **WebSocket Gateway**
   - Namespace: `/voice`
   - Authentication: JWT token validation in handshake
   - Tenant Context: Extracted from JWT payload
   - Events: `call:initiate`, `call:accept`, `call:reject`, `call:end`, `call:dtmf`
   - AI Events: `ai:transcript`, `ai:suggestion`, `ai:action`

4. **Twilio Integration**
   - SDK: `twilio` npm package
   - Features: Outbound calls, TwiML responses, Media Streams, webhooks
   - Credentials: Stored encrypted per tenant in `integrationsConfig.twilio`

5. **OpenAI Realtime Integration**
   - Connection: WebSocket to `wss://api.openai.com/v1/realtime`
   - Features: Real-time transcription, AI suggestions, tool calling
   - Credentials: Stored encrypted per tenant in `integrationsConfig.openai`

### Frontend (Nuxt 3 + Vue 3)

#### Core Components

1. **useSoftphone Composable** (`frontend/composables/useSoftphone.ts`)
   - Module-level shared state for global access
   - WebSocket connection management with auto-reconnect
   - Call state management (current call, incoming call)
   - Audio management (ringtone playback)
   - Event handlers for call lifecycle and AI events

2. **SoftphoneDialog Component** (`frontend/components/SoftphoneDialog.vue`)
   - Global dialog accessible from anywhere
   - Features:
     - Dialer with numeric keypad
     - Incoming call notifications with ringtone
     - Active call controls (mute, DTMF, hang up)
     - Real-time transcript display
     - AI suggestions panel
     - Recent call history

3. **Integration in Layout** (`frontend/layouts/default.vue`)
   - SoftphoneDialog included globally
   - Sidebar button with incoming call indicator

4. **Settings Page** (`frontend/pages/settings/integrations.vue`)
   - Configure Twilio credentials
   - Configure OpenAI API settings
   - Encrypted storage via backend API

## Configuration

### Environment Variables

#### Backend (.env)

```env
BACKEND_URL=http://localhost:3000
ENCRYPTION_KEY=your-32-byte-hex-key
```

#### Frontend (.env)

```env
VITE_BACKEND_URL=http://localhost:3000
```

### Tenant Configuration

Integrations are configured per tenant via the settings UI or API:

```json
{
  "twilio": {
    "accountSid": "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "authToken": "your-auth-token",
    "phoneNumber": "+1234567890"
  },
  "openai": {
    "apiKey": "sk-...",
    "model": "gpt-4o-realtime-preview",
    "voice": "alloy"
  }
}
```

This configuration is encrypted using AES-256-CBC and stored in the central database.
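As an illustration of the AES-256-CBC step described above, here is a minimal sketch using Node's built-in `crypto` module. The names `encryptConfig`, `decryptConfig`, and `IntegrationsConfig`, and the `iv:ciphertext` hex storage format, are assumptions made for this example; the project's actual encryption helper may differ.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical shape mirroring the JSON example above.
interface IntegrationsConfig {
  twilio?: { accountSid: string; authToken: string; phoneNumber: string };
  openai?: { apiKey: string; model?: string; voice?: string };
}

const ALGORITHM = "aes-256-cbc";
const IV_LENGTH = 16; // AES block size in bytes

// AES-256 needs a 32-byte key, i.e. 64 hex characters in ENCRYPTION_KEY.
function parseKey(hexKey: string): Buffer {
  const key = Buffer.from(hexKey, "hex");
  if (key.length !== 32) {
    throw new Error("ENCRYPTION_KEY must be 32 bytes (64 hex characters)");
  }
  return key;
}

// Serializes the config and encrypts it with a fresh random IV.
// Output format (an assumption of this sketch): "<iv hex>:<ciphertext hex>".
function encryptConfig(config: IntegrationsConfig, hexKey: string): string {
  const iv = randomBytes(IV_LENGTH);
  const cipher = createCipheriv(ALGORITHM, parseKey(hexKey), iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(config), "utf8"),
    cipher.final(),
  ]);
  return `${iv.toString("hex")}:${ciphertext.toString("hex")}`;
}

// Reverses encryptConfig: splits off the IV, decrypts, and parses the JSON.
function decryptConfig(payload: string, hexKey: string): IntegrationsConfig {
  const [ivHex, dataHex] = payload.split(":");
  if (!ivHex || !dataHex) {
    throw new Error("Malformed encrypted payload");
  }
  const decipher = createDecipheriv(
    ALGORITHM,
    parseKey(hexKey),
    Buffer.from(ivHex, "hex"),
  );
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(dataHex, "hex")),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString("utf8")) as IntegrationsConfig;
}
```

A service could call `encryptConfig(config, process.env.ENCRYPTION_KEY)` before writing `integrationsConfig` and `decryptConfig` after reading it. Generating a new IV per write means the same config never produces the same ciphertext twice. Note that CBC mode on its own does not detect tampering with the stored value.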
## API Endpoints

### REST Endpoints

- `POST /api/voice/call` - Initiate outbound call
- `GET /api/voice/calls` - Get call history
- `POST /api/voice/twiml/outbound` - TwiML for outbound calls
- `POST /api/voice/twiml/inbound` - TwiML for inbound calls
- `POST /api/voice/webhook/status` - Twilio status webhook
- `POST /api/voice/webhook/recording` - Twilio recording webhook
- `GET /api/tenant/integrations` - Get integrations config (masked)
- `PUT /api/tenant/integrations` - Update integrations config

### WebSocket Events

#### Client → Server

- `call:initiate` - Initiate outbound call
- `call:accept` - Accept incoming call
- `call:reject` - Reject incoming call
- `call:end` - End active call
- `call:dtmf` - Send DTMF tone

#### Server → Client

- `call:incoming` - Incoming call notification
- `call:initiated` - Call initiation confirmed
- `call:accepted` - Call accepted
- `call:rejected` - Call rejected
- `call:ended` - Call ended
- `call:update` - Call status update
- `call:error` - Call error
- `call:state` - Full call state sync
- `ai:transcript` - AI transcription update
- `ai:suggestion` - AI suggestion
- `ai:action` - AI action executed

## Database Schema

### Central Database - Tenant Model

```prisma
model Tenant {
  id                 String   @id @default(cuid())
  name               String
  slug               String   @unique
  dbHost             String
  dbPort             Int      @default(3306)
  dbName             String
  dbUsername         String
  dbPassword         String   // Encrypted
  integrationsConfig Json?    // NEW: Encrypted JSON config
  status             String   @default("active")
  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt
  domains            Domain[]
}
```

### Tenant Database - Calls Table

```sql
CREATE TABLE calls (
  id VARCHAR(36) PRIMARY KEY,
  call_sid VARCHAR(100) UNIQUE NOT NULL,
  direction ENUM('inbound', 'outbound') NOT NULL,
  from_number VARCHAR(20) NOT NULL,
  to_number VARCHAR(20) NOT NULL,
  status ENUM('queued', 'ringing', 'in-progress', 'completed', 'busy', 'failed', 'no-answer', 'canceled'),
  duration_seconds INT UNSIGNED,
  recording_url VARCHAR(500),
  ai_transcript TEXT,
  ai_summary TEXT,
  ai_insights JSON,
  user_id VARCHAR(36) NOT NULL,
  started_at TIMESTAMP,
  ended_at TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
  INDEX idx_call_sid (call_sid),
  INDEX idx_user_id (user_id),
  INDEX idx_status (status),
  INDEX idx_direction (direction),
  INDEX idx_created_user (created_at, user_id)
);
```

## Usage

### For Developers

1. **Install Dependencies**

   ```bash
   cd backend && npm install
   cd ../frontend && npm install
   ```

2. **Configure Environment**
   - Set `ENCRYPTION_KEY` in backend `.env`
   - Ensure `BACKEND_URL` matches your deployment

3. **Run Migrations**

   ```bash
   cd backend
   # Central database migration is handled by Prisma
   npm run migrate:all-tenants  # Run tenant migrations
   ```

4. **Start Services**

   ```bash
   # Backend
   cd backend && npm run start:dev

   # Frontend
   cd frontend && npm run dev
   ```

### For Users

1. **Configure Integrations**
   - Navigate to Settings → Integrations
   - Enter Twilio credentials (Account SID, Auth Token, Phone Number)
   - Enter OpenAI API key
   - Click "Save Configuration"

2. **Make a Call**
   - Click the "Softphone" button in the sidebar
   - Enter a phone number (E.164 format: +1234567890)
   - Click "Call"

3. **Receive Calls**
   - Configure Twilio webhook URLs to point to your backend
   - Incoming calls will trigger a notification and ringtone
   - Click "Accept" to answer or "Reject" to decline

## Advanced Features

### AI-Assisted Calling

The OpenAI Realtime API provides:

1. **Real-time Transcription** - Live speech-to-text during calls
2. **AI Suggestions** - Contextual suggestions for agents
3. **Tool Calling** - CRM actions via AI (search contacts, create tasks, etc.)

### Tool Definitions

The system includes predefined tools for AI:

- `search_contact` - Search CRM for contacts
- `create_task` - Create follow-up tasks
- `update_contact` - Update contact information

Tools automatically respect RBAC permissions as they call existing protected services.

### Call Recording

- Automatic recording via Twilio
- Recording URLs stored in call records
- Accessible via API for playback

## Security

1. **Encryption** - All credentials encrypted using AES-256-CBC
2. **Authentication** - JWT-based auth for WebSocket and REST
3. **Tenant Isolation** - Multi-tenant architecture with database-per-tenant
4. **RBAC** - Permission-based access control (future: add voice-specific permissions)

## Limitations & Future Enhancements

### Current Limitations

1. **Media Streaming** - Twilio Media Streams WebSocket not fully implemented
2. **Call Routing** - No intelligent routing for inbound calls yet
3. **Queue Management** - Basic call handling, no queue system
4. **Audio Muting** - UI placeholder, actual audio muting not implemented
5. **RBAC Permissions** - Voice-specific permissions not yet added

### Planned Enhancements

1. **Media Streams** - Full bidirectional audio between Twilio ↔ OpenAI ↔ User
2. **Call Routing** - Route calls based on availability, skills, round-robin
3. **Queue System** - Call queuing with BullMQ integration
4. **Call Analytics** - Dashboard with call metrics and insights
5. **RBAC Integration** - Add `voice.make_calls`, `voice.receive_calls` permissions
6. **WebRTC** - Direct browser-to-Twilio audio (bypass backend)

## Troubleshooting

### WebSocket Connection Issues

- Verify `BACKEND_URL` environment variable
- Check CORS settings in backend
- Ensure JWT token is valid and includes tenant information

### Twilio Webhook Errors

- Ensure webhook URLs are publicly accessible
- Verify Twilio credentials in integrations config
- Check backend logs for webhook processing errors

### OpenAI Connection Issues

- Verify OpenAI API key has Realtime API access
- Check network connectivity to OpenAI endpoints
- Monitor backend logs for WebSocket errors

## Testing

### Manual Testing

1. **Outbound Calls**

   ```bash
   # Open softphone dialog
   # Enter test number (use Twilio test credentials)
   # Click Call
   # Verify call status updates
   ```

2. **Inbound Calls**

   ```bash
   # Configure Twilio number webhook
   # Call the Twilio number from external phone
   # Verify incoming call notification
   # Accept call and verify connection
   ```

3. **AI Features**

   ```bash
   # Make a call with OpenAI configured
   # Speak during the call
   # Verify transcript appears in UI
   # Check for AI suggestions
   ```

## Dependencies

### Backend

- `@nestjs/websockets` - WebSocket support
- `@nestjs/platform-socket.io` - Socket.IO adapter
- `@fastify/websocket` - Fastify WebSocket plugin
- `socket.io` - WebSocket library
- `twilio` - Twilio SDK
- `openai` - OpenAI SDK (for Realtime API)
- `ws` - WebSocket client

### Frontend

- `socket.io-client` - WebSocket client
- `lucide-vue-next` - Icons
- `vue-sonner` - Toast notifications

## Support

For issues or questions:

1. Check backend logs for error details
2. Verify tenant integrations configuration
3. Test Twilio/OpenAI connectivity independently
4. Review WebSocket connection in browser DevTools

## License

Same as project license.