# Softphone AI Assistant - Complete Implementation
## 🎉 Features Implemented
### ✅ Real-time AI Call Assistant
- **OpenAI Realtime API Integration** - Listens to live calls and provides suggestions
- **Audio Streaming** - Twilio Media Streams forks call audio to the backend for AI processing
- **Real-time Transcription** - Speech-to-text during calls
- **Smart Suggestions** - AI analyzes the conversation and advises the agent
## 🔧 Architecture
### Backend Flow
```
Inbound Call → TwiML (<Start><Stream> + <Dial>)
→ Media Stream WebSocket → OpenAI Realtime API
→ AI Processing → Socket.IO → Frontend
```
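The hop in the middle of this flow is a small bridge that reads Twilio Media Stream frames from a WebSocket and forwards the base64 μ-law audio to the OpenAI Realtime API. The sketch below illustrates the idea only; `bridgeMediaStream` and the simplified message shape are assumptions, not the actual `main.ts` handler.
```typescript
// Minimal sketch of the media-stream bridge; names and shapes are illustrative, not the real handler.
import WebSocket from 'ws';

interface TwilioMediaMessage {
  event: 'connected' | 'start' | 'media' | 'stop';
  media?: { payload: string }; // base64-encoded 8 kHz mu-law audio frame
}

export function bridgeMediaStream(twilioSocket: WebSocket, openAiSocket: WebSocket): void {
  twilioSocket.on('message', (raw) => {
    const msg: TwilioMediaMessage = JSON.parse(raw.toString());

    if (msg.event === 'media' && msg.media) {
      // Push each audio frame into the Realtime API's input buffer.
      openAiSocket.send(
        JSON.stringify({ type: 'input_audio_buffer.append', audio: msg.media.payload }),
      );
    }

    if (msg.event === 'stop') {
      // Caller hung up or the stream ended; release the AI session.
      openAiSocket.close();
    }
  });
}
```
If the Realtime session is configured for `g711_ulaw` input, the Twilio payload can be forwarded as-is without transcoding.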
### Key Components
1. **TwiML Structure** (`voice.controller.ts:226-234`)
- `<Start><Stream>` - Forks call audio for AI processing (a hedged TwiML sketch follows this list)
- `<Dial><Client>` - Connects the call to the agent's softphone
2. **OpenAI Integration** (`voice.service.ts:431-519`)
- WebSocket connection to `wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01`
- Session config with custom instructions for agent assistance
- Handles transcripts and generates suggestions
3. **AI Message Handler** (`voice.service.ts:609-707`)
- Processes OpenAI events (transcripts, suggestions, audio)
- Routes suggestions to frontend via Socket.IO
- Saves transcripts to database
4. **Voice Gateway** (`voice.gateway.ts:272-289`)
- `notifyAiTranscript()` - Real-time transcript chunks
- `notifyAiSuggestion()` - AI suggestions to agent
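For item 1, the TwiML can be built with the Twilio Node helper library roughly as follows. This is a hedged sketch; the function name, parameters, and stream URL handling are assumptions rather than the actual `voice.controller.ts:226-234` code.
```typescript
// Illustrative sketch of the <Start><Stream> + <Dial><Client> TwiML, not the actual controller code.
import * as twilio from 'twilio';

export function buildInboundTwiml(agentIdentity: string, streamUrl: string): string {
  const response = new twilio.twiml.VoiceResponse();

  // <Start><Stream> forks the raw call audio to the backend WebSocket for AI processing.
  const start = response.start();
  start.stream({ url: streamUrl, track: 'both_tracks' });

  // <Dial><Client> rings the agent's browser softphone.
  const dial = response.dial({ answerOnBridge: true });
  dial.client(agentIdentity);

  return response.toString();
}
```
`<Start>` is non-blocking, so the forked stream runs in parallel with the `<Dial>`; that is what lets the AI listen without joining the call leg.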
### Frontend Components
1. **Softphone Dialog** (`SoftphoneDialog.vue:104-135`)
- AI Assistant section with badge showing suggestion count
- Color-coded suggestions (blue=response, green=action, purple=insight)
- Animated highlight for newest suggestion
2. **Softphone Composable** (`useSoftphone.ts:515-535`)
- Socket.IO event handlers for `ai:suggestion` and `ai:transcript`
- Maintains history of last 10 suggestions
- Maintains history of last 50 transcript items
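A minimal sketch of those handlers, assuming payload shapes and a `socket.io-client` connection that are not confirmed by the source; see `useSoftphone.ts:515-535` for the real implementation.
```typescript
// Illustrative only: payload shapes and the socket setup are assumptions, not copied from useSoftphone.ts.
import { ref } from 'vue';
import { io } from 'socket.io-client';

interface AiSuggestion { type: 'response' | 'action' | 'insight'; text: string; timestamp: string; }
interface AiTranscriptItem { speaker: string; text: string; timestamp: string; }

const suggestions = ref<AiSuggestion[]>([]);
const transcript = ref<AiTranscriptItem[]>([]);

const socket = io(); // the real composable reuses an existing authenticated connection

socket.on('ai:suggestion', (payload: AiSuggestion) => {
  // Newest first, capped at the last 10 suggestions.
  suggestions.value = [payload, ...suggestions.value].slice(0, 10);
});

socket.on('ai:transcript', (payload: AiTranscriptItem) => {
  // Rolling window of the last 50 transcript items.
  transcript.value = [...transcript.value, payload].slice(-50);
});
```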
## 📋 AI Prompt Configuration
The AI is instructed to:
- **Listen, not talk** - It advises the agent, not the caller
- **Provide concise suggestions** - 1-2 sentences max
- **Use formatted output**:
- `💡 Suggestion: [advice]`
- `⚠️ Alert: [important notice]`
- `📋 Action: [CRM action]`
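These instructions are delivered to the Realtime API as part of the session configuration. The sketch below shows one way such a `session.update` could look; the instruction wording and session options are illustrative assumptions, not the exact contents of `voice.service.ts`.
```typescript
// Sketch of configuring the Realtime session with the agent-assist instructions.
// The instruction wording and option values are illustrative assumptions.
import WebSocket from 'ws';

const openAiSocket = new WebSocket(
  'wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01',
  { headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}`, 'OpenAI-Beta': 'realtime=v1' } },
);

openAiSocket.on('open', () => {
  openAiSocket.send(JSON.stringify({
    type: 'session.update',
    session: {
      instructions: [
        'You are a silent assistant for a call-center agent. Advise the agent, never the caller.',
        'Keep every suggestion to 1-2 sentences.',
        'Prefix output with 💡 Suggestion:, ⚠️ Alert:, or 📋 Action: as appropriate.',
      ].join(' '),
      input_audio_format: 'g711_ulaw',                    // matches Twilio's 8 kHz mu-law stream
      input_audio_transcription: { model: 'whisper-1' },  // enables transcript events
      turn_detection: { type: 'server_vad' },             // let the API segment speech turns
    },
  }));
});
```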
## 🎨 UI Features
### Suggestion Types
- **Response** (Blue) - Suggested replies or approaches
- **Action** (Green) - Recommended CRM actions
- **Insight** (Purple) - Important alerts or observations
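In code this classification reduces to a small lookup; the sketch below is illustrative and not the component's actual implementation.
```typescript
// Illustrative mapping of suggestion types to badge colors; not the actual component code.
type SuggestionType = 'response' | 'action' | 'insight';

export const suggestionColor: Record<SuggestionType, string> = {
  response: 'blue',   // suggested replies or approaches
  action: 'green',    // recommended CRM actions
  insight: 'purple',  // important alerts or observations
};
```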
### Visual Feedback
- Badge showing number of suggestions
- Newest suggestion pulses for attention
- Auto-scrolling suggestion list
- Timestamp on each suggestion
## 🔍 How to Monitor
### 1. Backend Logs
```bash
# Watch for AI events
docker logs -f neo-backend-1 | grep -E "AI|OpenAI|transcript|suggestion"
```
Key log markers:
- `📝 Transcript chunk:` - Real-time speech detection
- `✅ Final transcript:` - Complete transcript saved
- `💡 AI Suggestion:` - AI-generated advice
### 2. Database
```sql
-- View call transcripts
SELECT call_sid, ai_transcript, created_at
FROM calls
ORDER BY created_at DESC
LIMIT 5;
```
### 3. Frontend Console
- Open the browser DevTools console
- Watch for `AI suggestion:` and `AI transcript:` log lines
## 🚀 Testing
1. **Make a test call** to your Twilio number
2. **Accept the call** in the softphone dialog
3. **Talk during the call** - Say something like "I need to schedule a follow-up"
4. **Watch the UI** - AI suggestions appear in real-time
5. **Check logs** - See transcription and suggestion generation
## 📊 Current Status
**Working**:
- ✅ Inbound calls ring softphone
- ✅ Media stream forks audio to backend
- ✅ OpenAI processes audio (1300+ packets/call)
- ✅ AI generates suggestions
- ✅ Suggestions appear in frontend
- ✅ Transcripts saved to database
## 🔧 Configuration
### Required Environment Variables
```env
# OpenAI API Key (set in tenant integrations config)
OPENAI_API_KEY=sk-...
# Optional overrides
OPENAI_MODEL=gpt-4o-realtime-preview-2024-10-01
OPENAI_VOICE=alloy
```
### Tenant Configuration
Set in Settings > Integrations:
- OpenAI API Key
- Model (optional)
- Voice (optional)
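A plausible shape for that per-tenant configuration, mirroring the environment variables above; the interface and field names are assumptions for illustration only.
```typescript
// Hypothetical shape of the tenant integrations config; field names are not confirmed by the source.
export interface TenantOpenAiConfig {
  openaiApiKey: string;   // required, equivalent of OPENAI_API_KEY
  openaiModel?: string;   // optional, e.g. gpt-4o-realtime-preview-2024-10-01
  openaiVoice?: string;   // optional, e.g. alloy
}
```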
## 🎯 Next Steps (Optional Enhancements)
1. **CRM Tool Execution** - Implement actual tool calls (search contacts, create tasks)
2. **Audio Response** - Send OpenAI audio back to caller (two-way AI interaction)
3. **Sentiment Analysis** - Track call sentiment in real-time
4. **Call Summary** - Generate post-call summary automatically
5. **Custom Prompts** - Allow agents to customize AI instructions per call type
## 🐛 Troubleshooting
### No suggestions appearing?
1. Check OpenAI API key is configured
2. Verify WebSocket connection logs show "OpenAI Realtime connected"
3. Check frontend Socket.IO connection is established
4. Verify user ID matches between backend and frontend
### Transcripts not saving?
1. Check tenant database connection
2. Verify `calls` table has `ai_transcript` column
3. Check logs for "Failed to update transcript" errors
### OpenAI connection fails?
1. Verify API key is valid
2. Check model name is correct
3. Review WebSocket close codes in logs
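To surface those close codes, a `close` listener on the Realtime WebSocket is enough. A minimal sketch, assuming the `ws` package and an existing connection:
```typescript
// Minimal sketch: log why the Realtime WebSocket closed (assumes the `ws` package).
import WebSocket from 'ws';

// Stand-in for the existing Realtime connection created elsewhere in voice.service.ts.
declare const openAiSocket: WebSocket;

openAiSocket.on('close', (code: number, reason: Buffer) => {
  console.warn(`OpenAI Realtime socket closed: code=${code} reason=${reason.toString() || 'n/a'}`);
});

openAiSocket.on('error', (err: Error) => {
  console.error('OpenAI Realtime socket error:', err.message);
});
```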
## 📝 Files Modified
**Backend:**
- `/backend/src/voice/voice.service.ts` - OpenAI integration & AI message handling
- `/backend/src/voice/voice.controller.ts` - TwiML generation with stream fork
- `/backend/src/voice/voice.gateway.ts` - Socket.IO event emission
- `/backend/src/main.ts` - Media stream WebSocket handler
**Frontend:**
- `/frontend/components/SoftphoneDialog.vue` - AI suggestions UI
- `/frontend/composables/useSoftphone.ts` - Socket.IO event handlers