neo/SOFTPHONE_AI_ASSISTANT.md
2026-01-04 08:48:43 +01:00


Softphone AI Assistant - Complete Implementation

🎉 Features Implemented

Real-time AI Call Assistant

  • OpenAI Realtime API Integration - Listens to live calls and provides suggestions
  • Audio Streaming - Twilio Media Streams forks call audio to the backend for AI processing
  • Real-time Transcription - Speech-to-text during calls
  • Smart Suggestions - AI analyzes conversation and advises the agent

🔧 Architecture

Backend Flow

Inbound Call → TwiML (<Start><Stream> + <Dial>) 
→ Media Stream WebSocket → OpenAI Realtime API 
→ AI Processing → Socket.IO → Frontend
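The media-stream leg of this flow can be sketched as follows. Twilio Media Streams sends JSON frames over the WebSocket: "start" (with call and stream SIDs), "media" (base64-encoded audio in `payload`), and "stop". This is a minimal illustrative handler, not the actual `main.ts` code; forwarding to OpenAI is stubbed via a callback.

```typescript
// Hedged sketch of handling Twilio Media Streams frames on the backend
// WebSocket. Frame shapes follow Twilio's documented protocol; the
// forwardAudio callback stands in for the real OpenAI Realtime bridge.
type MediaStreamMessage =
  | { event: 'start'; start: { callSid: string; streamSid: string } }
  | { event: 'media'; media: { payload: string } }
  | { event: 'stop' };

function handleMediaStreamMessage(
  raw: string,
  forwardAudio: (b64Audio: string) => void,
): string {
  const msg = JSON.parse(raw) as MediaStreamMessage;
  switch (msg.event) {
    case 'start':
      return `stream started for call ${msg.start.callSid}`;
    case 'media':
      forwardAudio(msg.media.payload); // base64 audio chunk -> AI pipeline
      return 'media';
    case 'stop':
      return 'stream stopped';
  }
}
```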

Key Components

  1. TwiML Structure (voice.controller.ts:226-234)

    • <Start><Stream> - Forks audio for AI processing
    • <Dial><Client> - Connects call to agent's softphone
  2. OpenAI Integration (voice.service.ts:431-519)

    • WebSocket connection to wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01
    • Session config with custom instructions for agent assistance
    • Handles transcripts and generates suggestions
  3. AI Message Handler (voice.service.ts:609-707)

    • Processes OpenAI events (transcripts, suggestions, audio)
    • Routes suggestions to frontend via Socket.IO
    • Saves transcripts to database
  4. Voice Gateway (voice.gateway.ts:272-289)

    • notifyAiTranscript() - Real-time transcript chunks
    • notifyAiSuggestion() - AI suggestions to agent
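The TwiML structure from component 1 can be sketched without the Twilio SDK: `<Start><Stream>` forks the caller's audio to the media-stream WebSocket while `<Dial><Client>` rings the agent's softphone. The URL and client identity below are illustrative placeholders, not values from the real controller.

```typescript
// Minimal sketch of the TwiML shape generated in voice.controller.ts.
// Plain string-building for illustration; the real code likely uses the
// Twilio helper library.
function buildInboundTwiml(streamUrl: string, agentIdentity: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    `  <Start><Stream url="${streamUrl}" /></Start>`,
    `  <Dial><Client>${agentIdentity}</Client></Dial>`,
    '</Response>',
  ].join('\n');
}
```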

Frontend Components

  1. Softphone Dialog (SoftphoneDialog.vue:104-135)

    • AI Assistant section with badge showing suggestion count
    • Color-coded suggestions (blue=response, green=action, purple=insight)
    • Animated highlight for newest suggestion
  2. Softphone Composable (useSoftphone.ts:515-535)

    • Socket.IO event handlers for ai:suggestion and ai:transcript
    • Maintains history of last 10 suggestions
    • Maintains history of last 50 transcript items
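The bounded histories in the composable can be sketched like this. Names and shapes are hypothetical; only the limits (10 suggestions, 50 transcript items) come from the description above.

```typescript
// Hypothetical handlers mirroring useSoftphone.ts: keep newest-first
// suggestion history capped at 10, transcript history capped at 50.
interface AiSuggestion {
  type: 'response' | 'action' | 'insight';
  text: string;
  at: number; // timestamp (ms)
}

const suggestions: AiSuggestion[] = [];
const transcript: string[] = [];

function onAiSuggestion(s: AiSuggestion): void {
  suggestions.unshift(s); // newest first, so the UI can highlight it
  if (suggestions.length > 10) suggestions.length = 10;
}

function onAiTranscript(chunk: string): void {
  transcript.push(chunk); // chronological order
  if (transcript.length > 50) transcript.splice(0, transcript.length - 50);
}
```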

📋 AI Prompt Configuration

The AI is instructed to:

  • Listen, not talk - It advises the agent, not the caller
  • Provide concise suggestions - 1-2 sentences max
  • Use formatted output:
    • 💡 Suggestion: [advice]
    • ⚠️ Alert: [important notice]
    • 📋 Action: [CRM action]
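These instructions reach the model via a `session.update` event on the Realtime connection. A hedged sketch of that payload, with illustrative wording (the event shape follows the public Realtime API; the exact instructions and transcription settings here are assumptions, not the real `voice.service.ts` config):

```typescript
// Illustrative session.update payload carrying the "listen, don't talk"
// instructions above. Field names follow the OpenAI Realtime API.
const sessionUpdate = {
  type: 'session.update',
  session: {
    instructions: [
      'You are a silent assistant for a call-center agent.',
      'Never speak to the caller; advise the agent only.',
      'Keep suggestions to 1-2 sentences.',
      'Prefix output with 💡 Suggestion:, ⚠️ Alert:, or 📋 Action:.',
    ].join(' '),
    // Enable input transcription so transcript chunks can be forwarded
    input_audio_transcription: { model: 'whisper-1' },
  },
};
```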

🎨 UI Features

Suggestion Types

  • Response (Blue) - Suggested replies or approaches
  • Action (Green) - Recommended CRM actions
  • Insight (Purple) - Important alerts or observations
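Mapping the formatted AI output prefixes onto these color-coded types could look like the hypothetical classifier below (the prefix-to-type mapping is inferred from the prompt format and color legend above, not taken from the actual frontend code):

```typescript
type SuggestionKind = 'response' | 'action' | 'insight';

// Hypothetical mapping from the AI's formatted prefixes to the
// color-coded suggestion types: 📋 Action -> action (green),
// ⚠️ Alert -> insight (purple), everything else -> response (blue).
function classifySuggestion(text: string): SuggestionKind {
  if (text.startsWith('📋 Action:')) return 'action';
  if (text.startsWith('⚠️ Alert:')) return 'insight';
  return 'response'; // 💡 Suggestion: and any unprefixed text
}
```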

Visual Feedback

  • Badge showing number of suggestions
  • Newest suggestion pulses for attention
  • Auto-scrolling suggestion list
  • Timestamp on each suggestion

🔍 How to Monitor

1. Backend Logs

# Watch for AI events
docker logs -f neo-backend-1 | grep -E "AI|OpenAI|transcript|suggestion"

Key log markers:

  • 📝 Transcript chunk: - Real-time speech detection
  • ✅ Final transcript: - Complete transcript saved
  • 💡 AI Suggestion: - AI-generated advice

2. Database

-- View call transcripts
SELECT call_sid, ai_transcript, created_at 
FROM calls 
ORDER BY created_at DESC 
LIMIT 5;

3. Frontend Console

  • Open browser DevTools Console
  • Watch for: "AI suggestion:", "AI transcript:"

🚀 Testing

  1. Make a test call to your Twilio number
  2. Accept the call in the softphone dialog
  3. Talk during the call - Say something like "I need to schedule a follow-up"
  4. Watch the UI - AI suggestions appear in real-time
  5. Check logs - See transcription and suggestion generation

📊 Current Status

Working:

  • Inbound calls ring softphone
  • Media stream forks audio to backend
  • OpenAI processes audio (1300+ packets/call)
  • AI generates suggestions
  • Suggestions appear in frontend
  • Transcripts saved to database

🔧 Configuration

Required Environment Variables

# OpenAI API Key (set in tenant integrations config)
OPENAI_API_KEY=sk-...

# Optional overrides
OPENAI_MODEL=gpt-4o-realtime-preview-2024-10-01
OPENAI_VOICE=alloy
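Resolving these variables with their documented defaults might look like this sketch (the function name is illustrative; only the variable names and fallback values come from the list above):

```typescript
// Sketch of reading the environment variables above, applying the
// documented defaults for the optional overrides.
function resolveOpenAiConfig(env: Record<string, string | undefined>) {
  return {
    apiKey: env.OPENAI_API_KEY ?? '',
    model: env.OPENAI_MODEL ?? 'gpt-4o-realtime-preview-2024-10-01',
    voice: env.OPENAI_VOICE ?? 'alloy',
  };
}
```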

Tenant Configuration

Set in Settings > Integrations:

  • OpenAI API Key
  • Model (optional)
  • Voice (optional)

🎯 Next Steps (Optional Enhancements)

  1. CRM Tool Execution - Implement actual tool calls (search contacts, create tasks)
  2. Audio Response - Send OpenAI audio back to caller (two-way AI interaction)
  3. Sentiment Analysis - Track call sentiment in real-time
  4. Call Summary - Generate post-call summary automatically
  5. Custom Prompts - Allow agents to customize AI instructions per call type

🐛 Troubleshooting

No suggestions appearing?

  1. Check OpenAI API key is configured
  2. Verify WebSocket connection logs show "OpenAI Realtime connected"
  3. Check frontend Socket.IO connection is established
  4. Verify user ID matches between backend and frontend

Transcripts not saving?

  1. Check tenant database connection
  2. Verify calls table has ai_transcript column
  3. Check logs for "Failed to update transcript" errors

OpenAI connection fails?

  1. Verify API key is valid
  2. Check model name is correct
  3. Review WebSocket close codes in logs

📝 Files Modified

Backend:

  • /backend/src/voice/voice.service.ts - OpenAI integration & AI message handling
  • /backend/src/voice/voice.controller.ts - TwiML generation with stream fork
  • /backend/src/voice/voice.gateway.ts - Socket.IO event emission
  • /backend/src/main.ts - Media stream WebSocket handler

Frontend:

  • /frontend/components/SoftphoneDialog.vue - AI suggestions UI
  • /frontend/composables/useSoftphone.ts - Socket.IO event handlers