# Softphone Implementation with Twilio & OpenAI Realtime

## Overview

This implementation adds comprehensive voice calling functionality to the platform using Twilio for telephony and OpenAI Realtime API for AI-assisted calls. The softphone is accessible globally through a Vue component, with call state managed via WebSocket connections.

## Architecture

### Backend (NestJS + Fastify)

#### Core Components

1. **VoiceModule** (`backend/src/voice/`)
   - `voice.module.ts` - Module configuration
   - `voice.gateway.ts` - WebSocket gateway for real-time signaling
   - `voice.service.ts` - Business logic for call orchestration
   - `voice.controller.ts` - REST endpoints and Twilio webhooks
   - `dto/` - Data transfer objects for type safety
   - `interfaces/` - TypeScript interfaces for configuration

2. **Database Schema**
   - **Central Database**: `integrationsConfig` JSON field in Tenant model (encrypted)
   - **Tenant Database**: `calls` table for call history and metadata

3. **WebSocket Gateway**
   - Namespace: `/voice`
   - Authentication: JWT token validation in handshake
   - Tenant Context: Extracted from JWT payload
   - Events: `call:initiate`, `call:accept`, `call:reject`, `call:end`, `call:dtmf`
   - AI Events: `ai:transcript`, `ai:suggestion`, `ai:action`

4. **Twilio Integration**
   - SDK: `twilio` npm package
   - Features: Outbound calls, TwiML responses, Media Streams, webhooks
   - Credentials: Stored encrypted per tenant in `integrationsConfig.twilio`

5. **OpenAI Realtime Integration**
   - Connection: WebSocket to `wss://api.openai.com/v1/realtime`
   - Features: Real-time transcription, AI suggestions, tool calling
   - Credentials: Stored encrypted per tenant in `integrationsConfig.openai`

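The gateway's handshake step (validate the JWT, then read the tenant from its payload) could look roughly like the sketch below. It is not the project's code: it assumes HS256-signed tokens and a hypothetical `tenantId` claim, and it uses only `node:crypto` so that it runs on its own. A real gateway would more likely delegate to the JWT service the backend already uses.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Claims the gateway is assumed to read; `tenantId` is a hypothetical claim name.
interface VoiceJwtPayload {
  sub: string;
  tenantId: string;
  exp?: number; // expiry, in seconds since the Unix epoch
}

const encode = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString("base64url");

const hmac = (data: string, secret: string): Buffer =>
  createHmac("sha256", secret).update(data).digest();

// Issue an HS256 token. Included only so the example is self-contained.
function signToken(payload: VoiceJwtPayload, secret: string): string {
  const unsigned = `${encode({ alg: "HS256", typ: "JWT" })}.${encode(payload)}`;
  return `${unsigned}.${hmac(unsigned, secret).toString("base64url")}`;
}

// Return the verified claims, or null if the token is malformed, forged, or expired.
function verifyHandshakeToken(
  token: string,
  secret: string,
  nowSec: number = Math.floor(Date.now() / 1000),
): VoiceJwtPayload | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;

  // Compare signatures in constant time before trusting any decoded content.
  const expected = hmac(`${header}.${body}`, secret);
  const given = Buffer.from(signature, "base64url");
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;

  try {
    const { alg } = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (alg !== "HS256") return null;
    const payload = JSON.parse(
      Buffer.from(body, "base64url").toString("utf8"),
    ) as VoiceJwtPayload;
    if (typeof payload.tenantId !== "string") return null;
    if (payload.exp !== undefined && payload.exp <= nowSec) return null;
    return payload;
  } catch {
    return null; // header or body was not valid JSON
  }
}
```

With Socket.IO, the token would typically be read from `socket.handshake.auth` when the client connects, and the connection refused when this function returns `null`.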

### Frontend (Nuxt 3 + Vue 3)

#### Core Components

1. **useSoftphone Composable** (`frontend/composables/useSoftphone.ts`)
   - Module-level shared state for global access
   - WebSocket connection management with auto-reconnect
   - Call state management (current call, incoming call)
   - Audio management (ringtone playback)
   - Event handlers for call lifecycle and AI events

2. **SoftphoneDialog Component** (`frontend/components/SoftphoneDialog.vue`)
   - Global dialog accessible from anywhere
   - Features:
     - Dialer with numeric keypad
     - Incoming call notifications with ringtone
     - Active call controls (mute, DTMF, hang up)
     - Real-time transcript display
     - AI suggestions panel
     - Recent call history

3. **Integration in Layout** (`frontend/layouts/default.vue`)
   - SoftphoneDialog included globally
   - Sidebar button with incoming call indicator

4. **Settings Page** (`frontend/pages/settings/integrations.vue`)
   - Configure Twilio credentials
   - Configure OpenAI API settings
   - Encrypted storage via backend API

## Configuration

### Environment Variables

#### Backend (.env)
```env
BACKEND_URL=http://localhost:3000
ENCRYPTION_KEY=your-32-byte-hex-key
```

#### Frontend (.env)
```env
VITE_BACKEND_URL=http://localhost:3000
```

### Tenant Configuration

Integrations are configured per tenant via the settings UI or API:

```json
{
  "twilio": {
    "accountSid": "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "authToken": "your-auth-token",
    "phoneNumber": "+1234567890"
  },
  "openai": {
    "apiKey": "sk-...",
    "model": "gpt-4o-realtime-preview",
    "voice": "alloy"
  }
}
```

This configuration is encrypted using AES-256-CBC and stored in the central database.

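As an illustration of the encryption step, here is a minimal sketch of AES-256-CBC encryption of a config object with `node:crypto`. The `ivHex:cipherHex` storage format and the function names are assumptions made for this example, not the backend's actual implementation. Note that a 32-byte key written as hex is 64 characters long.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Parse ENCRYPTION_KEY: AES-256 needs 32 bytes, i.e. 64 hex characters.
function parseKey(hexKey: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hexKey)) {
    throw new Error("ENCRYPTION_KEY must be 64 hex characters (32 bytes)");
  }
  return Buffer.from(hexKey, "hex");
}

// Encrypt a config object. The "ivHex:cipherHex" output format is an assumption.
function encryptConfig(config: unknown, key: Buffer): string {
  const iv = randomBytes(16); // fresh random IV for every encryption
  const cipher = createCipheriv("aes-256-cbc", key, iv);
  const data = Buffer.concat([
    cipher.update(JSON.stringify(config), "utf8"),
    cipher.final(),
  ]);
  return `${iv.toString("hex")}:${data.toString("hex")}`;
}

// Reverse of encryptConfig: split off the IV, decrypt, and parse the JSON.
function decryptConfig<T>(stored: string, key: Buffer): T {
  const [ivHex, dataHex] = stored.split(":");
  if (!ivHex || !dataHex) throw new Error("Malformed ciphertext");
  const decipher = createDecipheriv("aes-256-cbc", key, Buffer.from(ivHex, "hex"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(dataHex, "hex")),
    decipher.final(),
  ]);
  return JSON.parse(plain.toString("utf8")) as T;
}
```

Because the IV is random, encrypting the same config twice yields different stored strings, and both decrypt to the same object.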

## API Endpoints

### REST Endpoints

- `POST /api/voice/call` - Initiate outbound call
- `GET /api/voice/calls` - Get call history
- `POST /api/voice/twiml/outbound` - TwiML for outbound calls
- `POST /api/voice/twiml/inbound` - TwiML for inbound calls
- `POST /api/voice/webhook/status` - Twilio status webhook
- `POST /api/voice/webhook/recording` - Twilio recording webhook
- `GET /api/tenant/integrations` - Get integrations config (masked)
- `PUT /api/tenant/integrations` - Update integrations config

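For the TwiML endpoints, one plausible response body for an outbound call is a `<Dial>` wrapping a `<Number>`. The sketch below builds that XML as a plain string so that it runs without dependencies. The function names are hypothetical, the actual endpoint may return different verbs (for example a Media Streams connection), and a real handler would probably use the TwiML helpers in the `twilio` package instead.

```typescript
// Escape the five XML special characters so caller-supplied values cannot break the markup.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build TwiML that dials `to`, presenting the tenant's Twilio number as caller ID.
function outboundTwiml(to: string, callerId: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Dial callerId="${escapeXml(callerId)}">`,
    `    <Number>${escapeXml(to)}</Number>`,
    "  </Dial>",
    "</Response>",
  ].join("\n");
}
```

The handler would return this string with a `text/xml` content type.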

### WebSocket Events

#### Client → Server
- `call:initiate` - Initiate outbound call
- `call:accept` - Accept incoming call
- `call:reject` - Reject incoming call
- `call:end` - End active call
- `call:dtmf` - Send DTMF tone

#### Server → Client
- `call:incoming` - Incoming call notification
- `call:initiated` - Call initiation confirmed
- `call:accepted` - Call accepted
- `call:rejected` - Call rejected
- `call:ended` - Call ended
- `call:update` - Call status update
- `call:error` - Call error
- `call:state` - Full call state sync
- `ai:transcript` - AI transcription update
- `ai:suggestion` - AI suggestion
- `ai:action` - AI action executed

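To show how a client might fold these server events into call state, here is a small reducer sketch covering a subset of the events. The payload fields (`callSid`, `from`, `text`) and the state shape are assumptions for illustration; the real `useSoftphone` composable may be organised differently.

```typescript
type UiCallStatus = "idle" | "ringing" | "in-progress" | "ended";

interface CallState {
  status: UiCallStatus;
  callSid: string | null;
  transcript: string[];
}

// Subset of the server → client events; payload fields are assumed, not documented.
type ServerEvent =
  | { type: "call:incoming"; callSid: string; from: string }
  | { type: "call:initiated"; callSid: string }
  | { type: "call:accepted"; callSid: string }
  | { type: "call:rejected"; callSid: string }
  | { type: "call:ended"; callSid: string }
  | { type: "ai:transcript"; callSid: string; text: string };

const initialState: CallState = { status: "idle", callSid: null, transcript: [] };

// Pure function: returns the next state and ignores events for other calls.
function reduce(state: CallState, event: ServerEvent): CallState {
  switch (event.type) {
    case "call:incoming":
    case "call:initiated":
      return { status: "ringing", callSid: event.callSid, transcript: [] };
    case "call:accepted":
      return state.callSid === event.callSid
        ? { ...state, status: "in-progress" }
        : state;
    case "call:rejected":
    case "call:ended":
      return state.callSid === event.callSid ? { ...state, status: "ended" } : state;
    case "ai:transcript":
      return state.callSid === event.callSid
        ? { ...state, transcript: [...state.transcript, event.text] }
        : state;
    default:
      return state;
  }
}
```

Keeping the transition logic in a pure function makes it testable without a socket; each Socket.IO event handler would only need to call `reduce` and store the result.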

## Database Schema

### Central Database - Tenant Model

```prisma
model Tenant {
  id                 String   @id @default(cuid())
  name               String
  slug               String   @unique
  dbHost             String
  dbPort             Int      @default(3306)
  dbName             String
  dbUsername         String
  dbPassword         String   // Encrypted
  integrationsConfig Json?    // NEW: Encrypted JSON config
  status             String   @default("active")
  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt

  domains            Domain[]
}
```

### Tenant Database - Calls Table

```sql
CREATE TABLE calls (
  id VARCHAR(36) PRIMARY KEY,
  call_sid VARCHAR(100) UNIQUE NOT NULL,
  direction ENUM('inbound', 'outbound') NOT NULL,
  from_number VARCHAR(20) NOT NULL,
  to_number VARCHAR(20) NOT NULL,
  status ENUM('queued', 'ringing', 'in-progress', 'completed', 'busy', 'failed', 'no-answer', 'canceled'),
  duration_seconds INT UNSIGNED,
  recording_url VARCHAR(500),
  ai_transcript TEXT,
  ai_summary TEXT,
  ai_insights JSON,
  user_id VARCHAR(36) NOT NULL,
  started_at TIMESTAMP,
  ended_at TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,

  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
  INDEX idx_call_sid (call_sid),
  INDEX idx_user_id (user_id),
  INDEX idx_status (status),
  INDEX idx_direction (direction),
  INDEX idx_created_user (created_at, user_id)
);
```

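The status webhook has to translate Twilio's form parameters into an update of a `calls` row. The sketch below shows one way to do that mapping. `CallSid`, `CallStatus`, and `CallDuration` are Twilio's status callback parameters, while the `CallUpdate` shape and the function name are hypothetical.

```typescript
// Status values allowed by the `calls.status` ENUM.
const CALL_STATUSES = [
  "queued", "ringing", "in-progress", "completed",
  "busy", "failed", "no-answer", "canceled",
] as const;
type DbCallStatus = (typeof CALL_STATUSES)[number];

// Statuses after which the call is over and `ended_at` can be set.
const TERMINAL = new Set<DbCallStatus>([
  "completed", "busy", "failed", "no-answer", "canceled",
]);

// Hypothetical shape of the update applied to the row matching `call_sid`.
interface CallUpdate {
  callSid: string;
  status: DbCallStatus;
  durationSeconds: number | null;
  endedAt: Date | null;
}

// Map webhook form fields to a row update, or null if the payload is unusable.
// Statuses outside the ENUM (Twilio also reports "initiated") are skipped here.
function toCallUpdate(
  params: Record<string, string | undefined>,
  now: Date = new Date(),
): CallUpdate | null {
  const callSid = params.CallSid;
  const status = params.CallStatus as DbCallStatus | undefined;
  if (!callSid || !status || !CALL_STATUSES.includes(status)) return null;

  const duration =
    params.CallDuration !== undefined ? Number.parseInt(params.CallDuration, 10) : NaN;

  return {
    callSid,
    status,
    durationSeconds: Number.isNaN(duration) ? null : duration,
    endedAt: TERMINAL.has(status) ? now : null,
  };
}
```

Rejecting unknown statuses before writing keeps a value outside the ENUM from reaching the database.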

## Usage

### For Developers

1. **Install Dependencies**
   ```bash
   cd backend && npm install
   cd ../frontend && npm install
   ```

2. **Configure Environment**
   - Set `ENCRYPTION_KEY` in backend `.env`
   - Ensure `BACKEND_URL` matches your deployment

3. **Run Migrations**
   ```bash
   cd backend
   # Central database migration is handled by Prisma
   npm run migrate:all-tenants  # Run tenant migrations
   ```

4. **Start Services**
   ```bash
   # Backend
   cd backend && npm run start:dev

   # Frontend
   cd frontend && npm run dev
   ```

### For Users

1. **Configure Integrations**
   - Navigate to Settings → Integrations
   - Enter Twilio credentials (Account SID, Auth Token, Phone Number)
   - Enter OpenAI API key
   - Click "Save Configuration"

2. **Make a Call**
   - Click the "Softphone" button in the sidebar
   - Enter a phone number (E.164 format: +1234567890)
   - Click "Call"

3. **Receive Calls**
   - Configure Twilio webhook URLs to point to your backend
   - Incoming calls will trigger a notification and ringtone
   - Click "Accept" to answer or "Reject" to decline

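Since the dialer expects E.164 numbers, a validation step before `call:initiate` is useful. The following is a minimal sketch; the function name and the set of formatting characters it strips are assumptions for illustration.

```typescript
// E.164: a leading "+", a non-zero first digit, and at most 15 digits in total.
// This pattern also requires at least two digits.
const E164 = /^\+[1-9]\d{1,14}$/;

// Strip common formatting characters, then validate. Returns null if invalid.
function normalizeE164(input: string): string | null {
  const cleaned = input.replace(/[\s().-]/g, "");
  return E164.test(cleaned) ? cleaned : null;
}
```

For example, `normalizeE164("+1 (234) 567-890")` returns `"+1234567890"`, while `normalizeE164("1234567890")` returns `null` because the leading `+` is missing.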

## Advanced Features

### AI-Assisted Calling

The OpenAI Realtime API provides:

1. **Real-time Transcription** - Live speech-to-text during calls
2. **AI Suggestions** - Contextual suggestions for agents
3. **Tool Calling** - CRM actions via AI (search contacts, create tasks, etc.)

### Tool Definitions

The system includes predefined tools for AI:

- `search_contact` - Search CRM for contacts
- `create_task` - Create follow-up tasks
- `update_contact` - Update contact information

Tools automatically respect RBAC permissions as they call existing protected services.

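The sketch below illustrates how these tools might be declared and dispatched. The tool names come from the list above; the parameter schemas, the permission names (`contacts.read` and so on), and the stub handlers are hypothetical stand-ins for the existing protected services. The flat `type: "function"` shape follows the function-tool format of the Realtime API's session configuration, and tool arguments are assumed to arrive as a JSON string.

```typescript
// Function-tool declaration; `parameters` is a JSON Schema object.
interface ToolDefinition {
  type: "function";
  name: string;
  description: string;
  parameters: Record<string, unknown>;
}

const str = { type: "string" };
const tools: ToolDefinition[] = [
  { type: "function", name: "search_contact", description: "Search CRM for contacts",
    parameters: { type: "object", properties: { query: str }, required: ["query"] } },
  { type: "function", name: "create_task", description: "Create follow-up tasks",
    parameters: { type: "object", properties: { title: str }, required: ["title"] } },
  { type: "function", name: "update_contact", description: "Update contact information",
    parameters: { type: "object", properties: { contactId: str }, required: ["contactId"] } },
];

type Permissions = ReadonlySet<string>;
type ToolHandler = (args: Record<string, unknown>, perms: Permissions) => unknown;

// Stand-in for the permission check an existing protected service performs.
function requirePermission(perms: Permissions, needed: string): void {
  if (!perms.has(needed)) throw new Error(`Forbidden: missing ${needed}`);
}

// Stub handlers: each checks a (hypothetical) permission, then echoes its input.
const handlers: Record<string, ToolHandler> = {
  search_contact: (args, perms) => {
    requirePermission(perms, "contacts.read");
    return { query: String(args.query ?? ""), results: [] };
  },
  create_task: (args, perms) => {
    requirePermission(perms, "tasks.create");
    return { created: true, title: String(args.title ?? "") };
  },
  update_contact: (args, perms) => {
    requirePermission(perms, "contacts.update");
    return { updated: true, contactId: String(args.contactId ?? "") };
  },
};

// Route a tool call from the model to its handler, acting as the calling user.
function dispatchToolCall(name: string, argsJson: string, perms: Permissions): unknown {
  const handler = handlers[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(JSON.parse(argsJson) as Record<string, unknown>, perms);
}
```

Because the permission check sits in the service layer rather than in the dispatcher, the model cannot perform an action that the calling user could not perform directly.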

### Call Recording

- Automatic recording via Twilio
- Recording URLs stored in call records
- Accessible via API for playback

## Security

1. **Encryption** - All credentials encrypted using AES-256-CBC
2. **Authentication** - JWT-based auth for WebSocket and REST
3. **Tenant Isolation** - Multi-tenant architecture with database-per-tenant
4. **RBAC** - Permission-based access control (future: add voice-specific permissions)

## Limitations & Future Enhancements

### Current Limitations

1. **Media Streaming** - Twilio Media Streams WebSocket not fully implemented
2. **Call Routing** - No intelligent routing for inbound calls yet
3. **Queue Management** - Basic call handling, no queue system
4. **Audio Muting** - UI placeholder, actual audio muting not implemented
5. **RBAC Permissions** - Voice-specific permissions not yet added

### Planned Enhancements

1. **Media Streams** - Full bidirectional audio between Twilio ↔ OpenAI ↔ User
2. **Call Routing** - Route calls based on availability, skills, round-robin
3. **Queue System** - Call queuing with BullMQ integration
4. **Call Analytics** - Dashboard with call metrics and insights
5. **RBAC Integration** - Add `voice.make_calls`, `voice.receive_calls` permissions
6. **WebRTC** - Direct browser-to-Twilio audio (bypass backend)

## Troubleshooting

### WebSocket Connection Issues

- Verify `BACKEND_URL` environment variable
- Check CORS settings in backend
- Ensure JWT token is valid and includes tenant information

### Twilio Webhook Errors

- Ensure webhook URLs are publicly accessible
- Verify Twilio credentials in integrations config
- Check backend logs for webhook processing errors

### OpenAI Connection Issues

- Verify OpenAI API key has Realtime API access
- Check network connectivity to OpenAI endpoints
- Monitor backend logs for WebSocket errors

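When debugging the OpenAI connection, it can help to reproduce the connection parameters outside the app. The sketch below only builds the URL, headers, and first client event, so it runs without network access. The function names are hypothetical, and the `OpenAI-Beta` header is an assumption based on the preview-era Realtime API matching the `gpt-4o-realtime-preview` model above; newer API versions may not need it.

```typescript
interface RealtimeConnection {
  url: string;
  headers: Record<string, string>;
}

// Build the WebSocket URL and headers from the tenant's `openai` config values.
function realtimeConnection(apiKey: string, model: string): RealtimeConnection {
  const url = new URL("wss://api.openai.com/v1/realtime");
  url.searchParams.set("model", model);
  return {
    url: url.toString(),
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "OpenAI-Beta": "realtime=v1", // assumption: preview-era header
    },
  };
}

// First client event after connecting: apply the configured voice to the session.
function sessionUpdate(voice: string): { type: string; session: { voice: string } } {
  return { type: "session.update", session: { voice } };
}
```

With the `ws` package, these values would be passed as `new WebSocket(url, { headers })`. A handshake that fails with an authorization error points at the API key rather than at the network.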

## Testing

### Manual Testing

1. **Outbound Calls**
   ```bash
   # Open softphone dialog
   # Enter test number (use Twilio test credentials)
   # Click Call
   # Verify call status updates
   ```

2. **Inbound Calls**
   ```bash
   # Configure Twilio number webhook
   # Call the Twilio number from external phone
   # Verify incoming call notification
   # Accept call and verify connection
   ```

3. **AI Features**
   ```bash
   # Make a call with OpenAI configured
   # Speak during the call
   # Verify transcript appears in UI
   # Check for AI suggestions
   ```

## Dependencies

### Backend
- `@nestjs/websockets` - WebSocket support
- `@nestjs/platform-socket.io` - Socket.IO adapter
- `@fastify/websocket` - Fastify WebSocket plugin
- `socket.io` - WebSocket library
- `twilio` - Twilio SDK
- `openai` - OpenAI SDK (for Realtime API)
- `ws` - WebSocket client

### Frontend
- `socket.io-client` - WebSocket client
- `lucide-vue-next` - Icons
- `vue-sonner` - Toast notifications

## Support

For issues or questions:
1. Check backend logs for error details
2. Verify tenant integrations configuration
3. Test Twilio/OpenAI connectivity independently
4. Review WebSocket connection in browser DevTools

## License

Same as project license.