Add twilio softphone with integrated AI assistant
docs/SOFTPHONE_IMPLEMENTATION.md
# Softphone Implementation with Twilio & OpenAI Realtime

## Overview

This implementation adds voice calling to the platform, using Twilio for telephony and the OpenAI Realtime API for AI-assisted calls. The softphone is available on every page through a global Vue component, and call state is synchronized over a WebSocket connection.

## Architecture

### Backend (NestJS + Fastify)

#### Core Components

1. **VoiceModule** (`backend/src/voice/`)
   - `voice.module.ts` - Module configuration
   - `voice.gateway.ts` - WebSocket gateway for real-time signaling
   - `voice.service.ts` - Business logic for call orchestration
   - `voice.controller.ts` - REST endpoints and Twilio webhooks
   - `dto/` - Data transfer objects for type safety
   - `interfaces/` - TypeScript interfaces for configuration

2. **Database Schema**
   - **Central Database**: `integrationsConfig` JSON field in Tenant model (encrypted)
   - **Tenant Database**: `calls` table for call history and metadata

3. **WebSocket Gateway**
   - Namespace: `/voice`
   - Authentication: JWT token validation in handshake
   - Tenant Context: Extracted from JWT payload
   - Events: `call:initiate`, `call:accept`, `call:reject`, `call:end`, `call:dtmf`
   - AI Events: `ai:transcript`, `ai:suggestion`, `ai:action`

4. **Twilio Integration**
   - SDK: `twilio` npm package
   - Features: Outbound calls, TwiML responses, Media Streams (partially implemented; see Current Limitations), webhooks
   - Credentials: Stored encrypted per tenant in `integrationsConfig.twilio`

5. **OpenAI Realtime Integration**
   - Connection: WebSocket to `wss://api.openai.com/v1/realtime`
   - Features: Real-time transcription, AI suggestions, tool calling
   - Credentials: Stored encrypted per tenant in `integrationsConfig.openai`

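The handshake step of the WebSocket gateway (validate the JWT, then take the tenant from its payload) can be sketched as below. This is a minimal illustration using only Node's `crypto` module and assuming HS256-signed tokens; the function name `verifyHandshakeToken` and the claim names `sub` and `tenantId` are assumptions, not taken from the codebase, and the real gateway would more likely delegate to the project's existing JWT service.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim layout; the real payload may use different names.
interface VoiceJwtClaims {
  sub: string;
  tenantId: string;
  exp?: number;
}

// Verifies an HS256 JWT and returns the claims needed for tenant context.
function verifyHandshakeToken(
  token: string,
  secret: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): VoiceJwtClaims {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("Malformed token");
  const [header, payload, signature] = parts as [string, string, string];

  // Reject tokens that do not declare the algorithm we verify with.
  const parsedHeader = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
  if (parsedHeader.alg !== "HS256") throw new Error("Unsupported algorithm");

  // Recompute the signature and compare in constant time.
  const expected = createHmac("sha256", secret).update(`${header}.${payload}`).digest();
  const actual = Buffer.from(signature, "base64url");
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    throw new Error("Invalid signature");
  }

  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  if (typeof claims.sub !== "string" || typeof claims.tenantId !== "string") {
    throw new Error("Token is missing user or tenant information");
  }
  if (typeof claims.exp === "number" && claims.exp <= nowSeconds) {
    throw new Error("Token expired");
  }
  return { sub: claims.sub, tenantId: claims.tenantId, exp: claims.exp };
}
```

A gateway would call something like this in its connection handler and disconnect the socket when it throws, so that every later event handler can rely on a known tenant.
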
### Frontend (Nuxt 3 + Vue 3)

#### Core Components

1. **useSoftphone Composable** (`frontend/composables/useSoftphone.ts`)
   - Module-level shared state for global access
   - WebSocket connection management with auto-reconnect
   - Call state management (current call, incoming call)
   - Audio management (ringtone playback)
   - Event handlers for call lifecycle and AI events

2. **SoftphoneDialog Component** (`frontend/components/SoftphoneDialog.vue`)
   - Global dialog accessible from anywhere
   - Features:
     - Dialer with numeric keypad
     - Incoming call notifications with ringtone
     - Active call controls (mute, DTMF, hang up)
     - Real-time transcript display
     - AI suggestions panel
     - Recent call history

3. **Integration in Layout** (`frontend/layouts/default.vue`)
   - SoftphoneDialog included globally
   - Sidebar button with incoming call indicator

4. **Settings Page** (`frontend/pages/settings/integrations.vue`)
   - Configure Twilio credentials
   - Configure OpenAI API settings
   - Encrypted storage via backend API

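The auto-reconnect behaviour mentioned for `useSoftphone` is typically an exponential backoff with a ceiling. `socket.io-client` handles this itself through its `reconnectionDelay` and `reconnectionDelayMax` options, so the function below is only an illustration of the schedule; the name `reconnectDelayMs` and the default values are assumptions, not the composable's actual settings.

```typescript
// Delay before reconnect attempt number `attempt` (0-based):
// doubles each time, starting at `baseMs`, and never exceeds `maxMs`.
function reconnectDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const safeAttempt = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * 2 ** safeAttempt);
}

// Print the first few delays of the schedule.
for (let attempt = 0; attempt < 6; attempt++) {
  console.log(`attempt ${attempt}: wait ${reconnectDelayMs(attempt)} ms`);
}
```

Capping the delay keeps a long outage from pushing the next attempt minutes away, which matters for a softphone that should pick up incoming calls soon after the network returns.
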
## Configuration

### Environment Variables

#### Backend (.env)
```env
BACKEND_URL=http://localhost:3000
ENCRYPTION_KEY=your-32-byte-hex-key
```

#### Frontend (.env)
```env
VITE_BACKEND_URL=http://localhost:3000
```

### Tenant Configuration

Integrations are configured per tenant via the settings UI or API:

```json
{
  "twilio": {
    "accountSid": "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "authToken": "your-auth-token",
    "phoneNumber": "+1234567890"
  },
  "openai": {
    "apiKey": "sk-...",
    "model": "gpt-4o-realtime-preview",
    "voice": "alloy"
  }
}
```

This configuration is encrypted using AES-256-CBC and stored in the central database.

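A minimal sketch of AES-256-CBC encryption for the integrations config, using Node's built-in `crypto` module, might look like the following. The function names and the `iv:ciphertext` hex storage format are assumptions for illustration; the actual service may serialize the value differently. The key is read as 64 hex characters, matching a 32-byte `ENCRYPTION_KEY`.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const ALGORITHM = "aes-256-cbc";
const IV_LENGTH = 16; // AES block size in bytes

// Decodes the hex key and checks it is exactly 32 bytes (AES-256).
function parseKey(hexKey: string): Buffer {
  const key = Buffer.from(hexKey, "hex");
  if (key.length !== 32) {
    throw new Error("ENCRYPTION_KEY must be 64 hex characters (32 bytes)");
  }
  return key;
}

// Encrypts a JSON-serializable config with a fresh random IV per call.
// Assumed storage format: "<iv hex>:<ciphertext hex>".
function encryptConfig(config: unknown, hexKey: string): string {
  const iv = randomBytes(IV_LENGTH);
  const cipher = createCipheriv(ALGORITHM, parseKey(hexKey), iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(config), "utf8"),
    cipher.final(),
  ]);
  return `${iv.toString("hex")}:${ciphertext.toString("hex")}`;
}

// Reverses encryptConfig; throws on a malformed value or wrong key.
function decryptConfig(stored: string, hexKey: string): unknown {
  const [ivHex, dataHex] = stored.split(":");
  if (!ivHex || !dataHex) throw new Error("Malformed encrypted value");
  const decipher = createDecipheriv(ALGORITHM, parseKey(hexKey), Buffer.from(ivHex, "hex"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(dataHex, "hex")),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString("utf8"));
}
```

Generating a new IV for every write means the same config never produces the same ciphertext twice. CBC on its own does not detect tampering, so a deployment that needs integrity checks would pair it with a MAC or use an authenticated mode.
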
## API Endpoints

### REST Endpoints

- `POST /api/voice/call` - Initiate outbound call
- `GET /api/voice/calls` - Get call history
- `POST /api/voice/twiml/outbound` - TwiML for outbound calls
- `POST /api/voice/twiml/inbound` - TwiML for inbound calls
- `POST /api/voice/webhook/status` - Twilio status webhook
- `POST /api/voice/webhook/recording` - Twilio recording webhook
- `GET /api/tenant/integrations` - Get integrations config (masked)
- `PUT /api/tenant/integrations` - Update integrations config

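The "masked" response of `GET /api/tenant/integrations` could be produced along these lines. This is a sketch under assumptions: the interface mirrors the tenant configuration JSON shown earlier, while `maskSecret`, `maskIntegrations`, and the choice to leave the last four characters visible are illustrative and not confirmed by the source.

```typescript
interface IntegrationsConfig {
  twilio?: { accountSid: string; authToken: string; phoneNumber: string };
  openai?: { apiKey: string; model?: string; voice?: string };
}

// Replaces all but the last `visible` characters with asterisks.
// Values no longer than `visible` are masked completely.
function maskSecret(value: string, visible = 4): string {
  if (value.length <= visible) return "*".repeat(value.length);
  return "*".repeat(value.length - visible) + value.slice(-visible);
}

// Returns a copy of the config that is safe to send to the browser:
// only the secret fields (authToken, apiKey) are masked.
function maskIntegrations(config: IntegrationsConfig): IntegrationsConfig {
  const masked: IntegrationsConfig = {};
  if (config.twilio) {
    masked.twilio = { ...config.twilio, authToken: maskSecret(config.twilio.authToken) };
  }
  if (config.openai) {
    masked.openai = { ...config.openai, apiKey: maskSecret(config.openai.apiKey) };
  }
  return masked;
}
```

With masking like this, the `PUT` handler has to treat an unchanged masked value as "keep the stored secret"; otherwise saving the settings form would overwrite the real credential with asterisks.
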
### WebSocket Events

#### Client → Server
- `call:initiate` - Initiate outbound call
- `call:accept` - Accept incoming call
- `call:reject` - Reject incoming call
- `call:end` - End active call
- `call:dtmf` - Send DTMF tone

#### Server → Client
- `call:incoming` - Incoming call notification
- `call:initiated` - Call initiation confirmed
- `call:accepted` - Call accepted
- `call:rejected` - Call rejected
- `call:ended` - Call ended
- `call:update` - Call status update
- `call:error` - Call error
- `call:state` - Full call state sync
- `ai:transcript` - AI transcription update
- `ai:suggestion` - AI suggestion
- `ai:action` - AI action executed

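One way to picture how the client reacts to the server events above is a small state reducer. The event names come from the list; the payload fields (`callSid`, `message`), the `CallState` shape, and the transition rules are assumptions made for this sketch rather than the composable's actual logic.

```typescript
type CallStatus = "idle" | "incoming" | "dialing" | "active" | "ended";

interface CallState {
  status: CallStatus;
  callSid: string | null;
  error: string | null;
}

// Subset of server events, with hypothetical payloads.
type ServerEvent =
  | { type: "call:incoming"; callSid: string }
  | { type: "call:initiated"; callSid: string }
  | { type: "call:accepted"; callSid: string }
  | { type: "call:rejected"; callSid: string }
  | { type: "call:ended"; callSid: string }
  | { type: "call:error"; message: string };

const initialState: CallState = { status: "idle", callSid: null, error: null };

// Pure transition function: returns the next state for a server event.
function reduceCallState(state: CallState, event: ServerEvent): CallState {
  switch (event.type) {
    case "call:incoming":
      // Ignore a second incoming call while one is already in progress.
      return state.status === "idle" || state.status === "ended"
        ? { status: "incoming", callSid: event.callSid, error: null }
        : state;
    case "call:initiated":
      return { status: "dialing", callSid: event.callSid, error: null };
    case "call:accepted":
      return state.callSid === event.callSid ? { ...state, status: "active" } : state;
    case "call:rejected":
    case "call:ended":
      return state.callSid === event.callSid ? { ...state, status: "ended" } : state;
    case "call:error":
      return { ...state, status: "ended", error: event.message };
  }
}
```

Keeping the transitions in one pure function makes them easy to unit test without a socket, and a `call:state` sync can simply replace the whole state object.
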
## Database Schema

### Central Database - Tenant Model

```prisma
model Tenant {
  id                 String   @id @default(cuid())
  name               String
  slug               String   @unique
  dbHost             String
  dbPort             Int      @default(3306)
  dbName             String
  dbUsername         String
  dbPassword         String   // Encrypted
  integrationsConfig Json?    // NEW: Encrypted JSON config
  status             String   @default("active")
  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt

  domains            Domain[]
}
```

### Tenant Database - Calls Table

```sql
CREATE TABLE calls (
  id VARCHAR(36) PRIMARY KEY,
  call_sid VARCHAR(100) UNIQUE NOT NULL,
  direction ENUM('inbound', 'outbound') NOT NULL,
  from_number VARCHAR(20) NOT NULL,
  to_number VARCHAR(20) NOT NULL,
  status ENUM('queued', 'ringing', 'in-progress', 'completed', 'busy', 'failed', 'no-answer', 'canceled'),
  duration_seconds INT UNSIGNED,
  recording_url VARCHAR(500),
  ai_transcript TEXT,
  ai_summary TEXT,
  ai_insights JSON,
  user_id VARCHAR(36) NOT NULL,
  started_at TIMESTAMP,
  ended_at TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,

  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
  INDEX idx_call_sid (call_sid),
  INDEX idx_user_id (user_id),
  INDEX idx_status (status),
  INDEX idx_direction (direction),
  INDEX idx_created_user (created_at, user_id)
);
```

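When the status webhook writes to this table, the incoming status string has to fit the `status` enum. The sketch below shows one possible mapping. Twilio can also report `initiated`, which the enum above does not include; folding it into `queued` is an assumption of this example, as are the helper names `toCallStatus` and `isTerminal`.

```typescript
// Mirrors the ENUM values of calls.status.
const CALL_STATUSES = [
  "queued",
  "ringing",
  "in-progress",
  "completed",
  "busy",
  "failed",
  "no-answer",
  "canceled",
] as const;

type CallStatus = (typeof CALL_STATUSES)[number];

// Maps a status string from the webhook to a value the table accepts.
// Returns null for anything unrecognized so the caller can log and skip it.
function toCallStatus(twilioStatus: string): CallStatus | null {
  const normalized = twilioStatus.trim().toLowerCase();
  if (normalized === "initiated") return "queued"; // assumption: no separate enum value
  return (CALL_STATUSES as readonly string[]).includes(normalized)
    ? (normalized as CallStatus)
    : null;
}

// True once the call can no longer change state, i.e. ended_at can be set.
function isTerminal(status: CallStatus): boolean {
  return ["completed", "busy", "failed", "no-answer", "canceled"].includes(status);
}
```

Returning `null` instead of throwing keeps an unexpected status from failing the webhook request, which would otherwise cause the provider to retry it.
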
## Usage

### For Developers

1. **Install Dependencies**
   ```bash
   cd backend && npm install
   cd ../frontend && npm install
   ```

2. **Configure Environment**
   - Set `ENCRYPTION_KEY` in backend `.env`
   - Ensure `BACKEND_URL` matches your deployment

3. **Run Migrations**
   ```bash
   cd backend
   # Central database migration is handled by Prisma
   npm run migrate:all-tenants  # Run tenant migrations
   ```

4. **Start Services**
   ```bash
   # Backend
   cd backend && npm run start:dev

   # Frontend
   cd frontend && npm run dev
   ```

### For Users

1. **Configure Integrations**
   - Navigate to Settings → Integrations
   - Enter Twilio credentials (Account SID, Auth Token, Phone Number)
   - Enter OpenAI API key
   - Click "Save Configuration"

2. **Make a Call**
   - Click the "Softphone" button in the sidebar
   - Enter a phone number (E.164 format: +1234567890)
   - Click "Call"

3. **Receive Calls**
   - Configure Twilio webhook URLs to point to your backend
   - Incoming calls will trigger a notification and ringtone
   - Click "Accept" to answer or "Reject" to decline

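Because the dialer expects E.164 numbers, input can be checked before a call is initiated. The following is a small sketch; `normalizeDialInput` and `isE164` are hypothetical helper names, and the check is purely structural (a leading `+`, a non-zero first digit, at most 15 digits in total), so it does not confirm that a number is actually reachable.

```typescript
// E.164: "+" followed by 2 to 15 digits, the first of which is not 0.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Strips common formatting characters (spaces, parentheses, dots, dashes).
function normalizeDialInput(input: string): string {
  return input.replace(/[\s().-]/g, "");
}

// Structural check only; does not look up country codes.
function isE164(phoneNumber: string): boolean {
  return E164_PATTERN.test(phoneNumber);
}

const typed = "+1 (234) 567-890";
const normalized = normalizeDialInput(typed);
console.log(normalized, isE164(normalized));
```

Running the same check on the backend before calling the telephony API gives a clearer error than a rejected call request.
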
## Advanced Features

### AI-Assisted Calling

The OpenAI Realtime API provides:

1. **Real-time Transcription** - Live speech-to-text during calls
2. **AI Suggestions** - Contextual suggestions for agents
3. **Tool Calling** - CRM actions via AI (search contacts, create tasks, etc.)

### Tool Definitions

The system includes predefined tools for AI:

- `search_contact` - Search CRM for contacts
- `create_task` - Create follow-up tasks
- `update_contact` - Update contact information

Tools respect RBAC permissions automatically because they call the existing protected services.

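As an illustration, the three tools could be declared as function-style tool definitions with JSON Schema parameters and routed through a small dispatcher. Only the tool names come from this document; the parameter fields (`query`, `title`, `dueDate`, `contactId`, `fields`), the `ToolHandler` type, and `dispatchToolCall` are assumptions made for the sketch. In a real setup each handler would call the existing protected service, which is where the permission check happens.

```typescript
interface ToolDefinition {
  type: "function";
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

// Hypothetical parameter schemas for the three documented tools.
const tools: ToolDefinition[] = [
  {
    type: "function",
    name: "search_contact",
    description: "Search CRM for contacts",
    parameters: {
      type: "object",
      properties: { query: { type: "string", description: "Name, email, or phone number" } },
      required: ["query"],
    },
  },
  {
    type: "function",
    name: "create_task",
    description: "Create follow-up tasks",
    parameters: {
      type: "object",
      properties: {
        title: { type: "string", description: "Short task title" },
        dueDate: { type: "string", description: "ISO 8601 due date" },
      },
      required: ["title"],
    },
  },
  {
    type: "function",
    name: "update_contact",
    description: "Update contact information",
    parameters: {
      type: "object",
      properties: {
        contactId: { type: "string", description: "ID of the contact to update" },
        fields: { type: "object", description: "Fields to change" },
      },
      required: ["contactId"],
    },
  },
];

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Looks up the handler for a tool call and runs it with the parsed arguments.
async function dispatchToolCall(
  name: string,
  rawArguments: string,
  handlers: Record<string, ToolHandler>,
): Promise<unknown> {
  const handler = handlers[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  const args = JSON.parse(rawArguments) as Record<string, unknown>;
  return handler(args);
}
```

Rejecting unknown tool names up front means a model-generated call can only reach code paths that were explicitly registered.
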
### Call Recording

- Automatic recording via Twilio
- Recording URLs stored in call records
- Accessible via API for playback

## Security

1. **Encryption** - All credentials encrypted using AES-256-CBC
2. **Authentication** - JWT-based auth for WebSocket and REST
3. **Tenant Isolation** - Multi-tenant architecture with database-per-tenant
4. **RBAC** - Permission-based access control (future: add voice-specific permissions)

## Limitations & Future Enhancements

### Current Limitations

1. **Media Streaming** - Twilio Media Streams WebSocket not fully implemented
2. **Call Routing** - No intelligent routing for inbound calls yet
3. **Queue Management** - Basic call handling, no queue system
4. **Audio Muting** - UI placeholder, actual audio muting not implemented
5. **RBAC Permissions** - Voice-specific permissions not yet added

### Planned Enhancements

1. **Media Streams** - Full bidirectional audio between Twilio ↔ OpenAI ↔ User
2. **Call Routing** - Route calls based on availability, skills, round-robin
3. **Queue System** - Call queuing with BullMQ integration
4. **Call Analytics** - Dashboard with call metrics and insights
5. **RBAC Integration** - Add `voice.make_calls`, `voice.receive_calls` permissions
6. **WebRTC** - Direct browser-to-Twilio audio (bypass backend)

## Troubleshooting

### WebSocket Connection Issues

- Verify `BACKEND_URL` environment variable
- Check CORS settings in backend
- Ensure JWT token is valid and includes tenant information

### Twilio Webhook Errors

- Ensure webhook URLs are publicly accessible
- Verify Twilio credentials in integrations config
- Check backend logs for webhook processing errors

### OpenAI Connection Issues

- Verify OpenAI API key has Realtime API access
- Check network connectivity to OpenAI endpoints
- Monitor backend logs for WebSocket errors

## Testing

### Manual Testing

1. **Outbound Calls**
   ```bash
   # Open softphone dialog
   # Enter test number (use Twilio test credentials)
   # Click Call
   # Verify call status updates
   ```

2. **Inbound Calls**
   ```bash
   # Configure Twilio number webhook
   # Call the Twilio number from external phone
   # Verify incoming call notification
   # Accept call and verify connection
   ```

3. **AI Features**
   ```bash
   # Make a call with OpenAI configured
   # Speak during the call
   # Verify transcript appears in UI
   # Check for AI suggestions
   ```

## Dependencies

### Backend
- `@nestjs/websockets` - WebSocket support
- `@nestjs/platform-socket.io` - Socket.IO adapter
- `@fastify/websocket` - Fastify WebSocket plugin
- `socket.io` - WebSocket library
- `twilio` - Twilio SDK
- `openai` - OpenAI SDK (for Realtime API)
- `ws` - WebSocket client

### Frontend
- `socket.io-client` - WebSocket client
- `lucide-vue-next` - Icons
- `vue-sonner` - Toast notifications

## Support

For issues or questions:
1. Check backend logs for error details
2. Verify tenant integrations configuration
3. Test Twilio/OpenAI connectivity independently
4. Review WebSocket connection in browser DevTools

## License

Same as project license.