Add twilio softphone with integrated AI assistant

Francisco Gaona
2026-01-04 08:48:43 +01:00
parent 16907aadf8
commit 5f3fcef1ec
42 changed files with 5689 additions and 923 deletions

@@ -0,0 +1,370 @@
# Softphone Implementation with Twilio & OpenAI Realtime
## Overview
This implementation adds voice calling to the platform, using Twilio for telephony and the OpenAI Realtime API for AI-assisted calls. The softphone is a single Vue dialog mounted in the default layout, so it is available on every page, and call state is synchronized with the backend over a WebSocket connection.
## Architecture
### Backend (NestJS + Fastify)
#### Core Components
1. **VoiceModule** (`backend/src/voice/`)
- `voice.module.ts` - Module configuration
- `voice.gateway.ts` - WebSocket gateway for real-time signaling
- `voice.service.ts` - Business logic for call orchestration
- `voice.controller.ts` - REST endpoints and Twilio webhooks
- `dto/` - Data transfer objects for type safety
- `interfaces/` - TypeScript interfaces for configuration
2. **Database Schema**
- **Central Database**: `integrationsConfig` JSON field in Tenant model (encrypted)
- **Tenant Database**: `calls` table for call history and metadata
3. **WebSocket Gateway**
- Namespace: `/voice`
- Authentication: JWT token validation in handshake
- Tenant Context: Extracted from JWT payload
- Events: `call:initiate`, `call:accept`, `call:reject`, `call:end`, `call:dtmf`
- AI Events: `ai:transcript`, `ai:suggestion`, `ai:action`
4. **Twilio Integration**
- SDK: `twilio` npm package
- Features: Outbound calls, TwiML responses, webhooks, Media Streams (partial, see Current Limitations)
- Credentials: Stored encrypted per tenant in `integrationsConfig.twilio`
5. **OpenAI Realtime Integration**
- Connection: WebSocket to `wss://api.openai.com/v1/realtime`
- Features: Real-time transcription, AI suggestions, tool calling
- Credentials: Stored encrypted per tenant in `integrationsConfig.openai`
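To make the "TwiML responses" item under Twilio Integration concrete, here is a minimal sketch of the XML an outbound TwiML endpoint could return. The helper names `escapeXml` and `buildOutboundTwiml` are hypothetical, and the actual controller may well use the TwiML builder that ships with the `twilio` package (`twilio.twiml.VoiceResponse`) instead of string assembly.

```typescript
// Hypothetical helpers, not taken from the commit: they only illustrate
// the shape of a <Dial> response for an outbound call.

/** Escape the five XML special characters so values are safe inside TwiML. */
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

/** Build TwiML that dials `to`, presenting the tenant's number as caller ID. */
function buildOutboundTwiml(callerId: string, to: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Dial callerId="${escapeXml(callerId)}">`,
    `    <Number>${escapeXml(to)}</Number>`,
    "  </Dial>",
    "</Response>",
  ].join("\n");
}
```

Here `callerId` would presumably be the `phoneNumber` stored in `integrationsConfig.twilio`.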
### Frontend (Nuxt 3 + Vue 3)
#### Core Components
1. **useSoftphone Composable** (`frontend/composables/useSoftphone.ts`)
- Module-level shared state for global access
- WebSocket connection management with auto-reconnect
- Call state management (current call, incoming call)
- Audio management (ringtone playback)
- Event handlers for call lifecycle and AI events
2. **SoftphoneDialog Component** (`frontend/components/SoftphoneDialog.vue`)
- Global dialog accessible from anywhere
- Features:
- Dialer with numeric keypad
- Incoming call notifications with ringtone
- Active call controls (DTMF, hang up, and a mute button that is UI-only for now)
- Real-time transcript display
- AI suggestions panel
- Recent call history
3. **Integration in Layout** (`frontend/layouts/default.vue`)
- SoftphoneDialog included globally
- Sidebar button with incoming call indicator
4. **Settings Page** (`frontend/pages/settings/integrations.vue`)
- Configure Twilio credentials
- Configure OpenAI API settings
- Encrypted storage via backend API
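The call state handling described for `useSoftphone` can be pictured as a small state machine driven by server events. The sketch below is framework-free and covers only the inbound flow. The event names come from this document, while the payload fields (`callSid`, `from`) and the `reduceCall` function are assumptions rather than the composable's real API.

```typescript
// Illustrative only: the real composable keeps this state in module-level
// Vue refs. Payload shapes here are assumed, not taken from the commit.
type SoftphoneStatus = "idle" | "incoming" | "active";

interface SoftphoneState {
  status: SoftphoneStatus;
  callSid: string | null;
  from: string | null;
}

type CallEvent =
  | { type: "call:incoming"; callSid: string; from: string }
  | { type: "call:accepted"; callSid: string }
  | { type: "call:rejected"; callSid: string }
  | { type: "call:ended"; callSid: string };

const initialState: SoftphoneState = { status: "idle", callSid: null, from: null };

function reduceCall(state: SoftphoneState, event: CallEvent): SoftphoneState {
  switch (event.type) {
    case "call:incoming":
      // Ignore a second incoming call while one is already being handled.
      if (state.status !== "idle") return state;
      return { status: "incoming", callSid: event.callSid, from: event.from };
    case "call:accepted":
      // Only the call we are tracking can become active.
      if (state.callSid !== event.callSid) return state;
      return { ...state, status: "active" };
    case "call:rejected":
    case "call:ended":
      // Reset only when the event refers to the tracked call.
      return state.callSid === event.callSid ? initialState : state;
  }
}
```

Keeping the transitions in one pure function makes them easy to unit test independently of the WebSocket connection and the ringtone side effects.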
## Configuration
### Environment Variables
#### Backend (.env)
```env
BACKEND_URL=http://localhost:3000
# AES-256 needs a 32-byte key; supply it hex-encoded (64 hex characters)
ENCRYPTION_KEY=your-32-byte-hex-key
```
#### Frontend (.env)
```env
VITE_BACKEND_URL=http://localhost:3000
```
### Tenant Configuration
Integrations are configured per tenant via the settings UI or API:
```json
{
  "twilio": {
    "accountSid": "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "authToken": "your-auth-token",
    "phoneNumber": "+1234567890"
  },
  "openai": {
    "apiKey": "sk-...",
    "model": "gpt-4o-realtime-preview",
    "voice": "alloy"
  }
}
```
This configuration is encrypted using AES-256-CBC and stored in the central database.
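As a rough illustration of the AES-256-CBC step, the sketch below encrypts and decrypts a config object with Node's built-in `crypto` module. The function names, the `iv:ciphertext` hex layout, and the assumption that `ENCRYPTION_KEY` is a 64-character hex string are all hypothetical, so the backend's actual storage format may differ.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Sketch under stated assumptions: hex-encoded 32-byte key, and the output
// stored as "<iv hex>:<ciphertext hex>". Neither is confirmed by the commit.
function encryptConfig(config: unknown, keyHex: string): string {
  const key = Buffer.from(keyHex, "hex");
  if (key.length !== 32) {
    throw new Error("ENCRYPTION_KEY must decode to 32 bytes (64 hex characters)");
  }
  const iv = randomBytes(16); // CBC needs a fresh 16-byte IV for every encryption
  const cipher = createCipheriv("aes-256-cbc", key, iv);
  const encrypted = Buffer.concat([
    cipher.update(JSON.stringify(config), "utf8"),
    cipher.final(),
  ]);
  return `${iv.toString("hex")}:${encrypted.toString("hex")}`;
}

function decryptConfig(payload: string, keyHex: string): unknown {
  const separator = payload.indexOf(":");
  if (separator === -1) throw new Error("Malformed payload: expected iv:ciphertext");
  const iv = Buffer.from(payload.slice(0, separator), "hex");
  const data = Buffer.from(payload.slice(separator + 1), "hex");
  const decipher = createDecipheriv("aes-256-cbc", Buffer.from(keyHex, "hex"), iv);
  const decrypted = Buffer.concat([decipher.update(data), decipher.final()]);
  return JSON.parse(decrypted.toString("utf8"));
}
```

Storing the IV next to the ciphertext is a common convention because the IV is not secret, but it must be available again at decryption time.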
## API Endpoints
### REST Endpoints
- `POST /api/voice/call` - Initiate outbound call
- `GET /api/voice/calls` - Get call history
- `POST /api/voice/twiml/outbound` - TwiML for outbound calls
- `POST /api/voice/twiml/inbound` - TwiML for inbound calls
- `POST /api/voice/webhook/status` - Twilio status webhook
- `POST /api/voice/webhook/recording` - Twilio recording webhook
- `GET /api/tenant/integrations` - Get integrations config (masked)
- `PUT /api/tenant/integrations` - Update integrations config
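The "(masked)" note on `GET /api/tenant/integrations` suggests that secrets are never returned in full. One way this could look is sketched below. The helper names, the choice to keep the last four characters visible, and the exact set of masked fields are assumptions, not the backend's confirmed behavior.

```typescript
// Hypothetical masking helpers; the config shape follows the tenant
// configuration example in this document.
interface IntegrationsConfig {
  twilio?: { accountSid: string; authToken: string; phoneNumber: string };
  openai?: { apiKey: string; model?: string; voice?: string };
}

/** Replace all but the last `visible` characters with asterisks. */
function maskSecret(secret: string, visible: number = 4): string {
  if (secret.length <= visible) return "*".repeat(secret.length);
  return "*".repeat(secret.length - visible) + secret.slice(-visible);
}

/** Return a copy of the config that is safe to send to the browser. */
function maskIntegrations(config: IntegrationsConfig): IntegrationsConfig {
  const masked: IntegrationsConfig = {};
  if (config.twilio) {
    masked.twilio = { ...config.twilio, authToken: maskSecret(config.twilio.authToken) };
  }
  if (config.openai) {
    masked.openai = { ...config.openai, apiKey: maskSecret(config.openai.apiKey) };
  }
  return masked;
}
```

With this approach the settings page can show that a credential is configured without ever receiving the secret itself.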
### WebSocket Events
#### Client → Server
- `call:initiate` - Initiate outbound call
- `call:accept` - Accept incoming call
- `call:reject` - Reject incoming call
- `call:end` - End active call
- `call:dtmf` - Send DTMF tone
#### Server → Client
- `call:incoming` - Incoming call notification
- `call:initiated` - Call initiation confirmed
- `call:accepted` - Call accepted
- `call:rejected` - Call rejected
- `call:ended` - Call ended
- `call:update` - Call status update
- `call:error` - Call error
- `call:state` - Full call state sync
- `ai:transcript` - AI transcription update
- `ai:suggestion` - AI suggestion
- `ai:action` - AI action executed
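The event names above lend themselves to literal types, so that a typo in an event name fails at compile time rather than silently never firing. The sketch below is one possible encoding. The event names are copied from the lists above, whereas the guard functions and the DTMF digit rule (keypad characters `0-9`, `*`, `#`) are assumptions.

```typescript
// Event names mirror the documented lists; everything else is illustrative.
const CLIENT_EVENTS = [
  "call:initiate", "call:accept", "call:reject", "call:end", "call:dtmf",
] as const;

const SERVER_EVENTS = [
  "call:incoming", "call:initiated", "call:accepted", "call:rejected",
  "call:ended", "call:update", "call:error", "call:state",
  "ai:transcript", "ai:suggestion", "ai:action",
] as const;

type ClientEvent = (typeof CLIENT_EVENTS)[number];
type ServerEvent = (typeof SERVER_EVENTS)[number];

/** Narrow an arbitrary string to a known client-to-server event name. */
function isClientEvent(name: string): name is ClientEvent {
  return (CLIENT_EVENTS as readonly string[]).indexOf(name) !== -1;
}

/** Narrow an arbitrary string to a known server-to-client event name. */
function isServerEvent(name: string): name is ServerEvent {
  return (SERVER_EVENTS as readonly string[]).indexOf(name) !== -1;
}

/** AI events share the `ai:` prefix, which makes them easy to route separately. */
function isAiEvent(name: ServerEvent): boolean {
  return name.indexOf("ai:") === 0;
}

/** Assumed validation for a `call:dtmf` payload: one keypad character. */
function isDtmfDigit(digit: string): boolean {
  return /^[0-9*#]$/.test(digit);
}
```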
## Database Schema
### Central Database - Tenant Model
```prisma
model Tenant {
  id                 String   @id @default(cuid())
  name               String
  slug               String   @unique
  dbHost             String
  dbPort             Int      @default(3306)
  dbName             String
  dbUsername         String
  dbPassword         String   // Encrypted
  integrationsConfig Json?    // NEW: Encrypted JSON config
  status             String   @default("active")
  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt
  domains            Domain[]
}
```
### Tenant Database - Calls Table
```sql
CREATE TABLE calls (
  id VARCHAR(36) PRIMARY KEY,
  call_sid VARCHAR(100) UNIQUE NOT NULL,
  direction ENUM('inbound', 'outbound') NOT NULL,
  from_number VARCHAR(20) NOT NULL,
  to_number VARCHAR(20) NOT NULL,
  status ENUM('queued', 'ringing', 'in-progress', 'completed', 'busy', 'failed', 'no-answer', 'canceled'),
  duration_seconds INT UNSIGNED,
  recording_url VARCHAR(500),
  ai_transcript TEXT,
  ai_summary TEXT,
  ai_insights JSON,
  user_id VARCHAR(36) NOT NULL,
  started_at TIMESTAMP,
  ended_at TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
  INDEX idx_call_sid (call_sid),
  INDEX idx_user_id (user_id),
  INDEX idx_status (status),
  INDEX idx_direction (direction),
  INDEX idx_created_user (created_at, user_id)
);
```
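The `status` ENUM uses the same values Twilio reports in its status callbacks, so the status webhook can map a callback almost directly onto a row update. The sketch below assumes the webhook body has already been parsed into a plain object. `CallSid`, `CallStatus` and `CallDuration` are parameters Twilio sends, while `parseStatusWebhook` and the `CallUpdate` shape are hypothetical.

```typescript
// Status values mirror the ENUM in the `calls` table.
const CALL_ROW_STATUSES = [
  "queued", "ringing", "in-progress", "completed",
  "busy", "failed", "no-answer", "canceled",
] as const;
type CallRowStatus = (typeof CALL_ROW_STATUSES)[number];

// Assumed shape of the update applied to the `calls` row.
interface CallUpdate {
  callSid: string;
  status: CallRowStatus;
  durationSeconds: number | null;
}

function parseStatusWebhook(body: Record<string, string | undefined>): CallUpdate {
  const callSid = body["CallSid"];
  const rawStatus = body["CallStatus"];
  const rawDuration = body["CallDuration"];
  if (!callSid) throw new Error("Missing CallSid");

  // Reject anything the ENUM column could not store.
  const status = CALL_ROW_STATUSES.find((candidate) => candidate === rawStatus);
  if (!status) throw new Error(`Unknown CallStatus: ${String(rawStatus)}`);

  // Duration is only present once a call has finished.
  const duration = rawDuration !== undefined ? Number.parseInt(rawDuration, 10) : Number.NaN;
  return { callSid, status, durationSeconds: Number.isNaN(duration) ? null : duration };
}
```

Validating the status before writing keeps an unexpected value from surfacing later as a database error.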
## Usage
### For Developers
1. **Install Dependencies**
```bash
cd backend && npm install
cd ../frontend && npm install
```
2. **Configure Environment**
- Set `ENCRYPTION_KEY` in backend `.env`
- Ensure `BACKEND_URL` matches your deployment
3. **Run Migrations**
```bash
cd backend
# Central database migration is handled by Prisma
npm run migrate:all-tenants # Run tenant migrations
```
4. **Start Services**
```bash
# Backend
cd backend && npm run start:dev
# Frontend
cd frontend && npm run dev
```
### For Users
1. **Configure Integrations**
- Navigate to Settings → Integrations
- Enter Twilio credentials (Account SID, Auth Token, Phone Number)
- Enter OpenAI API key
- Click "Save Configuration"
2. **Make a Call**
- Click the "Softphone" button in the sidebar
- Enter a phone number (E.164 format: +1234567890)
- Click "Call"
3. **Receive Calls**
- Configure Twilio webhook URLs to point to your backend
- Incoming calls will trigger a notification and ringtone
- Click "Accept" to answer or "Reject" to decline
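Since the dialer expects E.164 numbers, it can help to normalize and validate input before starting a call. The following sketch uses the usual E.164 shape (a `+`, a non-zero leading digit, at most 15 digits in total). The function name and the set of separators it strips are assumptions.

```typescript
// E.164: "+" followed by 2 to 15 digits, the first of which is not 0.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

/**
 * Strip common visual separators and return the number in E.164 form,
 * or null if the result is not a valid E.164 number.
 */
function normalizePhoneNumber(input: string): string | null {
  const compact = input.replace(/[\s().-]/g, "");
  return E164_PATTERN.test(compact) ? compact : null;
}
```

For example, `normalizePhoneNumber("+1 (234) 567-890")` yields `"+1234567890"`, while a number without the leading `+` is rejected.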
## Advanced Features
### AI-Assisted Calling
The OpenAI Realtime API provides:
1. **Real-time Transcription** - Live speech-to-text during calls
2. **AI Suggestions** - Contextual suggestions for agents
3. **Tool Calling** - CRM actions via AI (search contacts, create tasks, etc.)
### Tool Definitions
The system includes predefined tools for AI:
- `search_contact` - Search CRM for contacts
- `create_task` - Create follow-up tasks
- `update_contact` - Update contact information
The tools call existing protected services, so they inherit the RBAC checks those services already enforce.
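To give a feel for how these tools might be declared and routed, the sketch below defines them in the flat function-tool shape the OpenAI Realtime API accepts in its session configuration, plus a small dispatcher. The tool names come from the list above. The parameter schemas, the `dispatchToolCall` function, and the handler signature are assumptions, and real handlers would call the existing CRM services.

```typescript
// Illustrative definitions; parameter schemas are guesses, not the commit's.
interface ToolDefinition {
  type: "function";
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
}

const tools: ToolDefinition[] = [
  {
    type: "function",
    name: "search_contact",
    description: "Search CRM for contacts",
    parameters: { type: "object", properties: { query: { type: "string" } }, required: ["query"] },
  },
  {
    type: "function",
    name: "create_task",
    description: "Create follow-up tasks",
    parameters: { type: "object", properties: { title: { type: "string" } }, required: ["title"] },
  },
  {
    type: "function",
    name: "update_contact",
    description: "Update contact information",
    parameters: { type: "object", properties: { contactId: { type: "string" } }, required: ["contactId"] },
  },
];

// A handler may return a plain value or a Promise; the dispatcher passes it through.
type ToolHandler = (args: Record<string, unknown>) => unknown;

/** Route a tool call (name plus JSON-encoded arguments) to its handler. */
function dispatchToolCall(
  handlers: Record<string, ToolHandler>,
  name: string,
  argsJson: string,
): unknown {
  const handler = handlers[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(JSON.parse(argsJson) as Record<string, unknown>);
}
```

Because each handler would delegate to an existing service, permission checks stay in one place rather than being duplicated in the AI layer.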
### Call Recording
- Automatic recording via Twilio
- Recording URLs stored in call records
- Accessible via API for playback
## Security
1. **Encryption** - All credentials encrypted using AES-256-CBC
2. **Authentication** - JWT-based auth for WebSocket and REST
3. **Tenant Isolation** - Multi-tenant architecture with database-per-tenant
4. **RBAC** - Permission-based access control (future: add voice-specific permissions)
## Limitations & Future Enhancements
### Current Limitations
1. **Media Streaming** - Twilio Media Streams WebSocket not fully implemented
2. **Call Routing** - No intelligent routing for inbound calls yet
3. **Queue Management** - Basic call handling, no queue system
4. **Audio Muting** - UI placeholder, actual audio muting not implemented
5. **RBAC Permissions** - Voice-specific permissions not yet added
### Planned Enhancements
1. **Media Streams** - Full bidirectional audio between Twilio ↔ OpenAI ↔ User
2. **Call Routing** - Route calls based on availability, skills, round-robin
3. **Queue System** - Call queuing with BullMQ integration
4. **Call Analytics** - Dashboard with call metrics and insights
5. **RBAC Integration** - Add `voice.make_calls`, `voice.receive_calls` permissions
6. **WebRTC** - Direct browser-to-Twilio audio (bypass backend)
## Troubleshooting
### WebSocket Connection Issues
- Verify that `BACKEND_URL` (backend) and `VITE_BACKEND_URL` (frontend) point to the same reachable backend address
- Check CORS settings in backend
- Ensure JWT token is valid and includes tenant information
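When debugging a rejected handshake, it can be useful to look inside the token the browser is sending. The helper below decodes a JWT payload without verifying the signature, so it is suitable for diagnostics only. The `tenantId` claim name is an assumption (use whichever claim the backend actually issues), and `inspectToken` is a hypothetical name.

```typescript
// Diagnostic sketch for Node.js: decodes the payload, does NOT verify it.
interface TokenCheck {
  expired: boolean;
  hasTenant: boolean;
}

function inspectToken(
  token: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): TokenCheck {
  const segments = token.split(".");
  const payloadSegment = segments[1];
  if (segments.length !== 3 || !payloadSegment) {
    throw new Error("Not a JWT: expected three dot-separated segments");
  }
  // JWT segments are base64url-encoded JSON.
  const json = Buffer.from(payloadSegment, "base64url").toString("utf8");
  const payload = JSON.parse(json) as { exp?: unknown; tenantId?: unknown };
  return {
    // `exp` is a standard claim: seconds since the Unix epoch.
    expired: typeof payload.exp === "number" && payload.exp <= nowSeconds,
    // Assumed claim name for the tenant context.
    hasTenant: typeof payload.tenantId === "string" && payload.tenantId.length > 0,
  };
}
```

If `expired` is true or `hasTenant` is false, the gateway would be expected to refuse the connection regardless of CORS or URL settings.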
### Twilio Webhook Errors
- Ensure webhook URLs are publicly accessible
- Verify Twilio credentials in integrations config
- Check backend logs for webhook processing errors
### OpenAI Connection Issues
- Verify OpenAI API key has Realtime API access
- Check network connectivity to OpenAI endpoints
- Monitor backend logs for WebSocket errors
## Testing
### Manual Testing
1. **Outbound Calls**
- Open the softphone dialog
- Enter a test number (use Twilio test credentials)
- Click "Call"
- Verify that the call status updates
2. **Inbound Calls**
- Configure the Twilio number webhook
- Call the Twilio number from an external phone
- Verify the incoming call notification
- Accept the call and verify the connection
3. **AI Features**
- Make a call with OpenAI configured
- Speak during the call
- Verify that the transcript appears in the UI
- Check for AI suggestions
## Dependencies
### Backend
- `@nestjs/websockets` - WebSocket support
- `@nestjs/platform-socket.io` - Socket.IO adapter
- `@fastify/websocket` - Fastify WebSocket plugin
- `socket.io` - Socket.IO server (real-time transport used by the gateway)
- `twilio` - Twilio SDK
- `openai` - OpenAI SDK (for Realtime API)
- `ws` - WebSocket client
### Frontend
- `socket.io-client` - Socket.IO client
- `lucide-vue-next` - Icons
- `vue-sonner` - Toast notifications
## Support
For issues or questions:
1. Check backend logs for error details
2. Verify tenant integrations configuration
3. Test Twilio/OpenAI connectivity independently
4. Review WebSocket connection in browser DevTools
## License
Same as project license.