neo/docs/SOFTPHONE_IMPLEMENTATION.md
2026-01-03 07:55:07 +01:00

Softphone Implementation with Twilio & OpenAI Realtime

Overview

This implementation adds voice calling to the platform, using Twilio for telephony and the OpenAI Realtime API for AI-assisted calls. The softphone is available from anywhere in the app through a global Vue component, and call state is synchronized with the backend over a WebSocket connection.

Architecture

Backend (NestJS + Fastify)

Core Components

  1. VoiceModule (backend/src/voice/)

    • voice.module.ts - Module configuration
    • voice.gateway.ts - WebSocket gateway for real-time signaling
    • voice.service.ts - Business logic for call orchestration
    • voice.controller.ts - REST endpoints and Twilio webhooks
    • dto/ - Data transfer objects for type safety
    • interfaces/ - TypeScript interfaces for configuration
  2. Database Schema

    • Central Database: integrationsConfig JSON field in Tenant model (encrypted)
    • Tenant Database: calls table for call history and metadata
  3. WebSocket Gateway

    • Namespace: /voice
    • Authentication: JWT token validation in handshake
    • Tenant Context: Extracted from JWT payload
    • Events: call:initiate, call:accept, call:reject, call:end, call:dtmf
    • AI Events: ai:transcript, ai:suggestion, ai:action
  4. Twilio Integration

    • SDK: twilio npm package
    • Features: Outbound calls, TwiML responses, Media Streams, webhooks
    • Credentials: Stored encrypted per tenant in integrationsConfig.twilio
  5. OpenAI Realtime Integration

    • Connection: WebSocket to wss://api.openai.com/v1/realtime
    • Features: Real-time transcription, AI suggestions, tool calling
    • Credentials: Stored encrypted per tenant in integrationsConfig.openai
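
The handshake check described for the WebSocket gateway (validate the JWT, then read the tenant from its payload) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: it assumes HS256-signed tokens and the hypothetical claim names `sub` and `tenantId`, and it uses Node's built-in crypto module instead of a JWT library so that it runs standalone. `signHs256` exists only so the verifier can be exercised.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim names; the real payload shape depends on the auth module.
interface VoiceJwtPayload {
  sub: string; // user id
  tenantId: string; // tenant identifier (assumed claim name)
  exp?: number; // expiry, seconds since epoch
}

const toBase64Url = (input: string | Buffer): string =>
  (typeof input === "string" ? Buffer.from(input, "utf8") : input).toString("base64url");

// Signs a payload with HS256. Included only to make the sketch testable.
function signHs256(payload: VoiceJwtPayload, secret: string): string {
  const header = toBase64Url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = toBase64Url(JSON.stringify(payload));
  const signature = createHmac("sha256", secret).update(`${header}.${body}`).digest();
  return `${header}.${body}.${toBase64Url(signature)}`;
}

// Returns the payload when signature, algorithm, claims, and expiry check out; otherwise null.
function verifyHandshakeToken(
  token: string,
  secret: string,
  nowSec: number = Math.floor(Date.now() / 1000),
): VoiceJwtPayload | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;

  const expected = createHmac("sha256", secret).update(`${header}.${body}`).digest();
  const actual = Buffer.from(signature, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;

  let payload: VoiceJwtPayload;
  try {
    const parsedHeader = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (parsedHeader.alg !== "HS256") return null;
    payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as VoiceJwtPayload;
  } catch {
    return null;
  }

  if (typeof payload.sub !== "string" || typeof payload.tenantId !== "string") return null;
  if (payload.exp !== undefined && payload.exp <= nowSec) return null;
  return payload;
}
```

In a NestJS gateway, the token would typically be read from the Socket.IO handshake (for example `client.handshake.auth.token`) and verified with whatever JWT service the application already uses; a connection whose token yields `null` would be disconnected.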

Frontend (Nuxt 3 + Vue 3)

Core Components

  1. useSoftphone Composable (frontend/composables/useSoftphone.ts)

    • Module-level shared state for global access
    • WebSocket connection management with auto-reconnect
    • Call state management (current call, incoming call)
    • Audio management (ringtone playback)
    • Event handlers for call lifecycle and AI events
  2. SoftphoneDialog Component (frontend/components/SoftphoneDialog.vue)

    • Global dialog accessible from anywhere
    • Features:
      • Dialer with numeric keypad
      • Incoming call notifications with ringtone
      • Active call controls (mute, DTMF, hang up)
      • Real-time transcript display
      • AI suggestions panel
      • Recent call history
  3. Integration in Layout (frontend/layouts/default.vue)

    • SoftphoneDialog included globally
    • Sidebar button with incoming call indicator
  4. Settings Page (frontend/pages/settings/integrations.vue)

    • Configure Twilio credentials
    • Configure OpenAI API settings
    • Encrypted storage via backend API

Configuration

Environment Variables

Backend (.env)

BACKEND_URL=http://localhost:3000
ENCRYPTION_KEY=your-32-byte-hex-key

Frontend (.env)

VITE_BACKEND_URL=http://localhost:3000

Tenant Configuration

Integrations are configured per tenant via the settings UI or API:

{
  "twilio": {
    "accountSid": "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "authToken": "your-auth-token",
    "phoneNumber": "+1234567890"
  },
  "openai": {
    "apiKey": "sk-...",
    "model": "gpt-4o-realtime-preview",
    "voice": "alloy"
  }
}

This configuration is encrypted using AES-256-CBC and stored in the central database.
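
As an illustration of the AES-256-CBC step, the sketch below encrypts and decrypts a config object with Node's built-in crypto module. Several details are assumptions rather than facts about this codebase: that `ENCRYPTION_KEY` is a 64-character hex string (32 bytes), that a fresh random IV is generated per write, and that the stored value has the form `ivHex:ciphertextHex`. The function names are hypothetical.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const ALGORITHM = "aes-256-cbc";

// Parses the hex key and enforces the 32-byte length AES-256 requires.
function parseKey(hexKey: string): Buffer {
  const key = Buffer.from(hexKey, "hex");
  if (key.length !== 32) {
    throw new Error("ENCRYPTION_KEY must be 64 hex characters (32 bytes)");
  }
  return key;
}

// Serializes the config to JSON and encrypts it with a fresh 16-byte IV.
// The "ivHex:ciphertextHex" output format is an assumption of this sketch.
function encryptConfig(config: unknown, hexKey: string): string {
  const iv = randomBytes(16);
  const cipher = createCipheriv(ALGORITHM, parseKey(hexKey), iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(config), "utf8"),
    cipher.final(),
  ]);
  return `${iv.toString("hex")}:${ciphertext.toString("hex")}`;
}

// Reverses encryptConfig and parses the JSON back into an object.
function decryptConfig<T>(payload: string, hexKey: string): T {
  const [ivHex, dataHex] = payload.split(":");
  if (!ivHex || !dataHex) throw new Error("Malformed encrypted payload");
  const decipher = createDecipheriv(ALGORITHM, parseKey(hexKey), Buffer.from(ivHex, "hex"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(dataHex, "hex")),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString("utf8")) as T;
}
```

Storing the IV alongside the ciphertext keeps each record self-describing, so decryption needs only the stored string and the key.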

API Endpoints

REST Endpoints

  • POST /api/voice/call - Initiate outbound call
  • GET /api/voice/calls - Get call history
  • POST /api/voice/twiml/outbound - TwiML for outbound calls
  • POST /api/voice/twiml/inbound - TwiML for inbound calls
  • POST /api/voice/webhook/status - Twilio status webhook
  • POST /api/voice/webhook/recording - Twilio recording webhook
  • GET /api/tenant/integrations - Get integrations config (masked)
  • PUT /api/tenant/integrations - Update integrations config
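
The GET endpoint above returns a masked config. One way such masking might look is sketched below; which fields are treated as secret (`authToken`, `apiKey`) and how many trailing characters stay visible are assumptions, and the helper names are hypothetical.

```typescript
interface IntegrationsConfig {
  twilio?: { accountSid: string; authToken: string; phoneNumber: string };
  openai?: { apiKey: string; model?: string; voice?: string };
}

// Replaces all but the last `visible` characters with asterisks.
// Values no longer than `visible` are masked entirely.
function maskSecret(value: string, visible = 4): string {
  if (value.length <= visible) return "*".repeat(value.length);
  return "*".repeat(value.length - visible) + value.slice(-visible);
}

// Returns a copy of the config that is safe to send to the browser.
function maskIntegrations(config: IntegrationsConfig): IntegrationsConfig {
  return {
    twilio: config.twilio && {
      ...config.twilio,
      authToken: maskSecret(config.twilio.authToken),
    },
    openai: config.openai && {
      ...config.openai,
      apiKey: maskSecret(config.openai.apiKey),
    },
  };
}
```

With this approach the settings page can show that a credential is set, and roughly which one, without the decrypted value ever leaving the backend.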

WebSocket Events

Client → Server

  • call:initiate - Initiate outbound call
  • call:accept - Accept incoming call
  • call:reject - Reject incoming call
  • call:end - End active call
  • call:dtmf - Send DTMF tone

Server → Client

  • call:incoming - Incoming call notification
  • call:initiated - Call initiation confirmed
  • call:accepted - Call accepted
  • call:rejected - Call rejected
  • call:ended - Call ended
  • call:update - Call status update
  • call:error - Call error
  • call:state - Full call state sync
  • ai:transcript - AI transcription update
  • ai:suggestion - AI suggestion
  • ai:action - AI action executed
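
To show how a client might fold the server events listed above into call state, here is a small reducer sketch. The payload shapes (a `callSid` on each call event, a `message` on errors) and the status names are assumptions for illustration; only the event names come from the lists above, and only a subset of them is handled.

```typescript
type SoftphoneStatus = "idle" | "ringing" | "dialing" | "in-progress" | "ended";

interface CallState {
  status: SoftphoneStatus;
  callSid?: string;
  error?: string;
}

// Assumed payloads for a subset of the Server → Client events.
type ServerEvent =
  | { type: "call:incoming"; callSid: string }
  | { type: "call:initiated"; callSid: string }
  | { type: "call:accepted"; callSid: string }
  | { type: "call:rejected"; callSid: string }
  | { type: "call:ended"; callSid: string }
  | { type: "call:error"; message: string };

const isBusy = (state: CallState): boolean =>
  state.status === "ringing" || state.status === "dialing" || state.status === "in-progress";

// Pure function: returns the next state without mutating the current one.
function reduceCallState(state: CallState, event: ServerEvent): CallState {
  switch (event.type) {
    case "call:incoming":
      // Ignore a second incoming call while one is already being handled.
      return isBusy(state) ? state : { status: "ringing", callSid: event.callSid };
    case "call:initiated":
      return { status: "dialing", callSid: event.callSid };
    case "call:accepted":
      return { status: "in-progress", callSid: event.callSid };
    case "call:rejected":
    case "call:ended":
      // Only the call currently tracked may end the session.
      return state.callSid === event.callSid
        ? { status: "ended", callSid: event.callSid }
        : state;
    case "call:error":
      return { status: "ended", callSid: state.callSid, error: event.message };
  }
}
```

Keeping the transition logic in a pure function makes it easy to test without a socket; a Socket.IO listener would call it once per event and store the result in the shared state.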

Database Schema

Central Database - Tenant Model

model Tenant {
  id                 String   @id @default(cuid())
  name               String
  slug               String   @unique
  dbHost             String
  dbPort             Int      @default(3306)
  dbName             String
  dbUsername         String
  dbPassword         String   // Encrypted
  integrationsConfig Json?    // NEW: Encrypted JSON config
  status             String   @default("active")
  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt
  
  domains            Domain[]
}

Tenant Database - Calls Table

CREATE TABLE calls (
  id VARCHAR(36) PRIMARY KEY,
  call_sid VARCHAR(100) UNIQUE NOT NULL,
  direction ENUM('inbound', 'outbound') NOT NULL,
  from_number VARCHAR(20) NOT NULL,
  to_number VARCHAR(20) NOT NULL,
  status ENUM('queued', 'ringing', 'in-progress', 'completed', 'busy', 'failed', 'no-answer', 'canceled'),
  duration_seconds INT UNSIGNED,
  recording_url VARCHAR(500),
  ai_transcript TEXT,
  ai_summary TEXT,
  ai_insights JSON,
  user_id VARCHAR(36) NOT NULL,
  started_at TIMESTAMP,
  ended_at TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
  INDEX idx_call_sid (call_sid),
  INDEX idx_user_id (user_id),
  INDEX idx_status (status),
  INDEX idx_direction (direction),
  INDEX idx_created_user (created_at, user_id)
);
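
As a sketch of how a Twilio status webhook could map onto this table, the function below converts the form parameters Twilio posts (`CallSid`, `CallStatus`, `CallDuration`) into a partial row update. The function name, the decision to set `ended_at` when a terminal status arrives, and the choice to reject statuses outside the ENUM are assumptions, not the project's confirmed behavior.

```typescript
// Mirrors the ENUM in the calls table above.
const CALL_STATUSES = [
  "queued", "ringing", "in-progress", "completed",
  "busy", "failed", "no-answer", "canceled",
] as const;
type CallStatus = (typeof CALL_STATUSES)[number];

// Statuses after which the call will not change again.
const TERMINAL_STATUSES = new Set<CallStatus>([
  "completed", "busy", "failed", "no-answer", "canceled",
]);

interface CallRowUpdate {
  call_sid: string;
  status: CallStatus;
  duration_seconds: number | null;
  ended_at: Date | null;
}

function isCallStatus(value: string | undefined): value is CallStatus {
  return (CALL_STATUSES as readonly string[]).includes(value ?? "");
}

// Returns null when the payload lacks a CallSid or carries a status the table cannot store.
function toCallRowUpdate(
  body: Record<string, string | undefined>,
  now: Date = new Date(),
): CallRowUpdate | null {
  const sid = body.CallSid;
  const status = body.CallStatus;
  if (!sid || !isCallStatus(status)) return null;

  // CallDuration arrives as a string of seconds and may be absent.
  const duration = body.CallDuration !== undefined ? Number.parseInt(body.CallDuration, 10) : NaN;
  return {
    call_sid: sid,
    status,
    duration_seconds: Number.isNaN(duration) ? null : duration,
    ended_at: TERMINAL_STATUSES.has(status) ? now : null,
  };
}
```

The handler for POST /api/voice/webhook/status could then apply the returned object in an `UPDATE calls ... WHERE call_sid = ?` statement, which the unique `call_sid` column makes unambiguous.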

Usage

For Developers

  1. Install Dependencies

    cd backend && npm install
    cd ../frontend && npm install
    
  2. Configure Environment

    • Set ENCRYPTION_KEY in backend .env
    • Ensure BACKEND_URL matches your deployment
  3. Run Migrations

    cd backend
    # Central database migration is handled by Prisma
    npm run migrate:all-tenants  # Run tenant migrations
    
  4. Start Services

    # Backend
    cd backend && npm run start:dev
    
    # Frontend
    cd frontend && npm run dev
    

For Users

  1. Configure Integrations

    • Navigate to Settings → Integrations
    • Enter Twilio credentials (Account SID, Auth Token, Phone Number)
    • Enter OpenAI API key
    • Click "Save Configuration"
  2. Make a Call

    • Click the "Softphone" button in the sidebar
    • Enter a phone number in E.164 format (for example, +1234567890)
    • Click "Call"
  3. Receive Calls

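    • In the Twilio console, point the phone number's voice webhook at POST /api/voice/twiml/inbound and its status callback at POST /api/voice/webhook/status on your backend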
    • Incoming calls will trigger a notification and ringtone
    • Click "Accept" to answer or "Reject" to decline
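
The E.164 format required in step 2 can be checked before a call is attempted. The sketch below is one possible validator; the helper names are hypothetical, and stripping spaces, parentheses, dots, and hyphens before validating is an assumption about what the dialer should tolerate.

```typescript
// E.164: a plus sign, a non-zero first digit, and at most 15 digits in total.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(input: string): boolean {
  return E164_PATTERN.test(input);
}

// Removes common visual separators, then validates.
// Returns the cleaned number, or null if it is still not E.164.
function normalizeDialInput(input: string): string | null {
  const stripped = input.replace(/[\s().-]/g, "");
  return isE164(stripped) ? stripped : null;
}
```

Running this in the dialer before emitting `call:initiate` would let the UI reject malformed numbers immediately instead of waiting for a `call:error` from the server.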

Advanced Features

AI-Assisted Calling

The OpenAI Realtime API provides:

  1. Real-time Transcription - Live speech-to-text during calls
  2. AI Suggestions - Contextual suggestions for agents
  3. Tool Calling - CRM actions via AI (search contacts, create tasks, etc.)

Tool Definitions

The system includes predefined tools for AI:

  • search_contact - Search CRM for contacts
  • create_task - Create follow-up tasks
  • update_contact - Update contact information

Because each tool calls existing protected services, the current user's RBAC permissions are enforced automatically.
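
For illustration, the three tools could be declared in the function-tool format that the Realtime API accepts in a `session.update` event. The parameter schemas below (`query`, `title`, `dueDate`, `contactId`, `fields`) and the helper name are hypothetical; only the tool names come from the list above.

```typescript
interface RealtimeTool {
  type: "function";
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
}

// Parameter names here are illustrative assumptions, not the project's schemas.
const CRM_TOOLS: RealtimeTool[] = [
  {
    type: "function",
    name: "search_contact",
    description: "Search CRM for contacts",
    parameters: {
      type: "object",
      properties: { query: { type: "string", description: "Name, email, or phone" } },
      required: ["query"],
    },
  },
  {
    type: "function",
    name: "create_task",
    description: "Create follow-up tasks",
    parameters: {
      type: "object",
      properties: {
        title: { type: "string" },
        dueDate: { type: "string", description: "ISO 8601 date" },
      },
      required: ["title"],
    },
  },
  {
    type: "function",
    name: "update_contact",
    description: "Update contact information",
    parameters: {
      type: "object",
      properties: {
        contactId: { type: "string" },
        fields: { type: "object", description: "Fields to change" },
      },
      required: ["contactId", "fields"],
    },
  },
];

// Builds the event to send over the Realtime WebSocket once it is open.
function buildSessionUpdate(tools: RealtimeTool[], voice: string) {
  return {
    type: "session.update" as const,
    session: { voice, tools, tool_choice: "auto" as const },
  };
}
```

The backend would send `JSON.stringify(buildSessionUpdate(CRM_TOOLS, "alloy"))` on the open socket, then dispatch each function call the model requests to the matching protected service.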

Call Recording

  • Automatic recording via Twilio
  • Recording URLs stored in call records
  • Accessible via API for playback
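
One way recording could be enabled is through the `record` and `recordingStatusCallback` attributes of TwiML's `<Dial>` verb. The sketch below builds such a response as a plain string so that it runs without dependencies. It assumes the outbound TwiML endpoint dials the destination directly, which this document does not confirm, and the function names are hypothetical.

```typescript
// Escapes the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

interface OutboundTwimlOptions {
  to: string; // destination number, E.164
  callerId: string; // the tenant's Twilio phone number
  backendUrl: string; // public base URL, e.g. the BACKEND_URL value
}

// Returns TwiML that dials `to` and records the call from the moment it is answered.
function buildOutboundTwiml(options: OutboundTwimlOptions): string {
  const callback = `${options.backendUrl}/api/voice/webhook/recording`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Dial callerId="${escapeXml(options.callerId)}" record="record-from-answer" recordingStatusCallback="${escapeXml(callback)}">`,
    `    <Number>${escapeXml(options.to)}</Number>`,
    "  </Dial>",
    "</Response>",
  ].join("\n");
}
```

In the backend itself, the `VoiceResponse` class from the twilio package would normally generate this XML; the recording webhook would then store the posted recording URL in `calls.recording_url`.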

Security

  1. Encryption - All credentials encrypted using AES-256-CBC
  2. Authentication - JWT-based auth for WebSocket and REST
  3. Tenant Isolation - Multi-tenant architecture with database-per-tenant
  4. RBAC - Permission-based access control (voice-specific permissions are planned but not yet added)

Limitations & Future Enhancements

Current Limitations

  1. Media Streaming - Twilio Media Streams WebSocket not fully implemented
  2. Call Routing - No intelligent routing for inbound calls yet
  3. Queue Management - Basic call handling, no queue system
  4. Audio Muting - UI placeholder, actual audio muting not implemented
  5. RBAC Permissions - Voice-specific permissions not yet added

Planned Enhancements

  1. Media Streams - Full bidirectional audio between Twilio ↔ OpenAI ↔ User
  2. Call Routing - Route calls based on availability, skills, round-robin
  3. Queue System - Call queuing with BullMQ integration
  4. Call Analytics - Dashboard with call metrics and insights
  5. RBAC Integration - Add voice.make_calls, voice.receive_calls permissions
  6. WebRTC - Direct browser-to-Twilio audio (bypass backend)

Troubleshooting

WebSocket Connection Issues

  • Verify that BACKEND_URL (backend) and VITE_BACKEND_URL (frontend) both point to the running backend
  • Check CORS settings in backend
  • Ensure JWT token is valid and includes tenant information

Twilio Webhook Errors

  • Ensure webhook URLs are publicly accessible
  • Verify Twilio credentials in integrations config
  • Check backend logs for webhook processing errors

OpenAI Connection Issues

  • Verify OpenAI API key has Realtime API access
  • Check network connectivity to OpenAI endpoints
  • Monitor backend logs for WebSocket errors

Testing

Manual Testing

  1. Outbound Calls

    # Open softphone dialog
    # Enter test number (use Twilio test credentials)
    # Click Call
    # Verify call status updates
    
  2. Inbound Calls

    # Configure Twilio number webhook
    # Call the Twilio number from external phone
    # Verify incoming call notification
    # Accept call and verify connection
    
  3. AI Features

    # Make a call with OpenAI configured
    # Speak during the call
    # Verify transcript appears in UI
    # Check for AI suggestions
    

Dependencies

Backend

  • @nestjs/websockets - WebSocket support
  • @nestjs/platform-socket.io - Socket.IO adapter
  • @fastify/websocket - Fastify WebSocket plugin
  • socket.io - Socket.IO server library
  • twilio - Twilio SDK
  • openai - OpenAI SDK (for Realtime API)
  • ws - WebSocket client

Frontend

  • socket.io-client - Socket.IO client
  • lucide-vue-next - Icons
  • vue-sonner - Toast notifications

Support

For issues or questions:

  1. Check backend logs for error details
  2. Verify tenant integrations configuration
  3. Test Twilio/OpenAI connectivity independently
  4. Review WebSocket connection in browser DevTools

License

Same as project license.