# Adaptive Quality Service

**Phase 3 - Voice Mode v4.1**

Dynamic voice processing quality management based on network conditions and system load.

## Overview

The Adaptive Quality Service monitors network performance and system load in real time, automatically adjusting voice processing quality to maintain an optimal user experience within latency budgets.

```
┌────────────────────────────────────────────────────────────────┐
│                    Adaptive Quality Service                    │
│                                                                │
│  Network Metrics ───▶ ┌─────────────┐ ───▶ Quality Level       │
│  (RTT, bandwidth,     │   Quality   │      (ULTRA→MINIMAL)     │
│   packet loss)        │  Adjuster   │                          │
│                       └─────────────┘                          │
│                              │                                 │
│  Latency Budget ────────────▶│◀─────── User Preferences        │
│  (per component)             │                                 │
│                              ▼                                 │
│                       ┌─────────────┐                          │
│                       │  Settings   │                          │
│                       │  Generator  │                          │
│                       └─────────────┘                          │
│                              │                                 │
│                              ▼                                 │
│        STT Model, TTS Model, Bitrate, Features                 │
└────────────────────────────────────────────────────────────────┘
```

## Features

- **5 Quality Levels**: ULTRA, HIGH, MEDIUM, LOW, MINIMAL
- **Network Monitoring**: RTT, bandwidth, packet loss, jitter
- **Latency Budgets**: Per-component budget tracking
- **Graceful Degradation**: Automatic quality reduction
- **Load Testing**: Built-in test utilities
- **Hysteresis**: Prevents quality flapping

## Quality Levels

| Level       | Target Latency | STT Model        | TTS Model       | Features             |
| ----------- | -------------- | ---------------- | --------------- | -------------------- |
| **ULTRA**   | 800ms          | whisper-large-v3 | eleven_turbo_v2 | All enabled          |
| **HIGH**    | 600ms          | whisper-1        | eleven_turbo_v2 | All enabled          |
| **MEDIUM**  | 500ms          | whisper-1        | tts-1           | Sentiment, Language  |
| **LOW**     | 400ms          | whisper-1        | tts-1           | None                 |
| **MINIMAL** | 300ms          | whisper-1        | tts-1           | None, reduced tokens |

### Detailed Settings per Level

```python
QUALITY_PRESETS = {
    QualityLevel.ULTRA: QualitySettings(
        stt_model="whisper-large-v3",
        tts_model="eleven_turbo_v2",
        audio_bitrate_kbps=128,
        sample_rate_hz=48000,
        max_context_tokens=8000,
        max_response_tokens=2000,
        enable_speaker_diarization=True,
        enable_sentiment_analysis=True,
        enable_language_detection=True,
    ),
    QualityLevel.HIGH: QualitySettings(
        stt_model="whisper-1",
        tts_model="eleven_turbo_v2",
        audio_bitrate_kbps=96,
        sample_rate_hz=24000,
        max_context_tokens=6000,
        max_response_tokens=1500,
        # ...
    ),
    # ... and so on
}
```

## Network Conditions

| Condition     | RTT       | Bandwidth  | Packet Loss | Auto Level |
| ------------- | --------- | ---------- | ----------- | ---------- |
| **EXCELLENT** | <50ms     | >10 Mbps   | <0.1%       | ULTRA      |
| **GOOD**      | 50-150ms  | 2-10 Mbps  | <1%         | HIGH       |
| **FAIR**      | 150-300ms | 0.5-2 Mbps | <5%         | MEDIUM     |
| **POOR**      | 300-500ms | <0.5 Mbps  | <10%        | LOW        |
| **CRITICAL**  | >500ms    | Very low   | >10%        | MINIMAL    |
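The table reads as a waterfall: a sample must clear every threshold of a bucket to land in it, otherwise it falls through to the next. A minimal sketch of that mapping, assuming the `NetworkCondition` enum members listed above; `classify_condition` is a hypothetical helper for illustration, since the service applies these thresholds internally via `NetworkMetrics.condition`:

```python
from app.services.adaptive_quality_service import NetworkCondition, NetworkMetrics

def classify_condition(m: NetworkMetrics) -> NetworkCondition:
    # Hypothetical helper: thresholds transcribed from the table above
    if m.rtt_ms < 50 and m.bandwidth_kbps > 10_000 and m.packet_loss_pct < 0.1:
        return NetworkCondition.EXCELLENT
    if m.rtt_ms < 150 and m.bandwidth_kbps > 2_000 and m.packet_loss_pct < 1:
        return NetworkCondition.GOOD
    if m.rtt_ms < 300 and m.bandwidth_kbps > 500 and m.packet_loss_pct < 5:
        return NetworkCondition.FAIR
    if m.rtt_ms < 500 and m.packet_loss_pct < 10:
        return NetworkCondition.POOR
    return NetworkCondition.CRITICAL
```

The "Auto Level" column is then a direct lookup: each condition bucket selects the corresponding quality level unless a user preference caps it.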
## Feature Flag

```yaml
# flag_definitions.yaml
backend.voice_v4_adaptive_quality:
  default: false
  description: "Enable adaptive quality management"
```

## Basic Usage

### Initialize Session

```python
from app.services.adaptive_quality_service import (
    get_adaptive_quality_service,
    QualityLevel,
)

service = get_adaptive_quality_service()
await service.initialize()

# Start session with initial quality
state = await service.init_session(
    session_id="voice-123",
    initial_level=QualityLevel.HIGH,
    user_preference=QualityLevel.MEDIUM,  # Optional override
)

print(f"Quality: {state.current_level.value}")
print(f"Target latency: {state.current_settings.target_latency_ms}ms")
```

### Update Network Metrics

```python
from app.services.adaptive_quality_service import NetworkMetrics

# Measure network conditions
metrics = NetworkMetrics(
    rtt_ms=150,
    bandwidth_kbps=5000,
    packet_loss_pct=0.5,
    jitter_ms=15,
)

# Update service (may trigger quality change)
state = await service.update_network_metrics("voice-123", metrics)

print(f"Network: {state.network_condition.value}")
print(f"Quality: {state.current_level.value}")
```

### Record Component Latency

```python
# Track latency for budget monitoring
budget = service.record_latency("voice-123", "stt", latency_ms=180)
budget = service.record_latency("voice-123", "llm", latency_ms=250)
budget = service.record_latency("voice-123", "tts", latency_ms=120)

print(f"Total: {budget.total_actual_ms}ms / {budget.total_budget_ms}ms")
print(f"Exceeded: {budget.is_exceeded}")
```

### Get Current Settings

```python
settings = service.get_current_settings("voice-123")

# Use settings in voice pipeline
stt_response = await stt_service.transcribe(
    audio=audio_data,
    model=settings.stt_model,
    sample_rate=settings.sample_rate_hz,
)

tts_response = await tts_service.synthesize(
    text=response_text,
    model=settings.tts_model,
    bitrate=settings.audio_bitrate_kbps,
)
```

## Latency Budget System

### Budget Allocation

```
Total Budget (e.g., 600ms for HIGH)
├── STT:     25% (150ms)
├── LLM:     35% (210ms)
├── TTS:     25% (150ms)
└── Network: 15% (90ms)
```

### LatencyBudget Class

```python
@dataclass
class LatencyBudget:
    total_budget_ms: int
    stt_budget_ms: int
    llm_budget_ms: int
    tts_budget_ms: int
    network_budget_ms: int

    # Actual measurements
    stt_actual_ms: float = 0
    llm_actual_ms: float = 0
    tts_actual_ms: float = 0
    network_actual_ms: float = 0

    @property
    def total_actual_ms(self) -> float:
        # Sum of the recorded component latencies
        return (
            self.stt_actual_ms
            + self.llm_actual_ms
            + self.tts_actual_ms
            + self.network_actual_ms
        )

    @property
    def is_exceeded(self) -> bool:
        return self.total_actual_ms > self.total_budget_ms
```
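Given the fixed 25/35/25/15 split above, a budget can be derived from a level's target latency alone. A minimal sketch assuming the `LatencyBudget` fields shown here; `budget_for` is a hypothetical helper, not part of the service's public API:

```python
def budget_for(total_ms: int) -> LatencyBudget:
    # Hypothetical helper: apply the 25/35/25/15 allocation from the diagram
    return LatencyBudget(
        total_budget_ms=total_ms,
        stt_budget_ms=int(total_ms * 0.25),
        llm_budget_ms=int(total_ms * 0.35),
        tts_budget_ms=int(total_ms * 0.25),
        network_budget_ms=int(total_ms * 0.15),
    )

# HIGH targets 600ms -> 150ms STT, 210ms LLM, 150ms TTS, 90ms network
high_budget = budget_for(600)
assert high_budget.stt_budget_ms == 150 and high_budget.llm_budget_ms == 210
```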
### Automatic Degradation

When the latency budget is exceeded:

```python
# In voice pipeline
budget = service.record_latency(session_id, "llm", 350)  # Over budget

if budget.is_exceeded:
    # Service automatically triggers degradation
    # Quality: HIGH → MEDIUM
    # New budget: 600ms → 500ms
    ...
```

## Quality Change Callbacks

```python
def on_quality_change(state: QualityState, event: DegradationEvent):
    print(f"Quality changed: {event.from_level} → {event.to_level}")
    print(f"Reason: {event.reason}")

    # Update UI
    send_to_frontend({
        "type": "quality_change",
        "level": state.current_level.value,
        "reason": event.reason,
    })

service.on_quality_change(on_quality_change)
```

## Hysteresis Logic

The service prevents quality flapping with hysteresis:

```python
# Downgrade: Requires 2+ poor samples in last 3
def _should_downgrade(history):
    recent = history[-3:]
    poor_count = sum(1 for m in recent if m.condition in ["poor", "critical"])
    return poor_count >= 2

# Upgrade: Requires 4+ good samples in last 5
def _should_upgrade(history):
    recent = history[-5:]
    if len(recent) < 5:
        return False
    good_count = sum(1 for m in recent if m.condition in ["excellent", "good"])
    return good_count >= 4
```

## Load Testing

### Concurrent Session Test

```python
from app.services.adaptive_quality_service import get_load_test_runner

runner = get_load_test_runner()

result = await runner.run_concurrent_session_test(
    num_sessions=50,
    duration_seconds=120,
    requests_per_second=10,
)

print(f"Success rate: {result.success_rate}%")
print(f"P95 latency: {result.p95_latency_ms}ms")
print(f"Degradations: {result.degradations_triggered}")
```

### Degradation Behavior Test

```python
# Test quality degradation under poor network
events = await runner.run_degradation_test(
    session_id="test-session",
    simulate_poor_network=True,
)

for event in events:
    print(f"{event.from_level} → {event.to_level}: {event.reason}")
```

### LoadTestResult

```python
@dataclass
class LoadTestResult:
    test_name: str
    concurrent_sessions: int
    duration_seconds: float
    total_requests: int
    successful_requests: int
    failed_requests: int
    avg_latency_ms: float
    p50_latency_ms: float
    p95_latency_ms: float
    p99_latency_ms: float
    degradations_triggered: int

    @property
    def success_rate(self) -> float:
        return self.successful_requests / self.total_requests * 100
```
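Load-test results are easiest to consume as a pass/fail gate before rollout. A minimal sketch over the `LoadTestResult` fields above; the thresholds and the `meets_slo` helper are illustrative, not part of the service:

```python
def meets_slo(result: LoadTestResult) -> bool:
    # Hypothetical deployment gate; tune thresholds to your own SLOs
    return (
        result.success_rate >= 99.0        # at most 1% failed requests
        and result.p95_latency_ms <= 600   # HIGH-level budget at p95
        and result.degradations_triggered <= result.concurrent_sessions
    )

result = await runner.run_concurrent_session_test(
    num_sessions=50, duration_seconds=120, requests_per_second=10
)
if not meets_slo(result):
    raise RuntimeError(f"Load test failed: p95={result.p95_latency_ms}ms")
```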
## Override Thresholds

### Custom Network Thresholds

```python
# Define custom condition thresholds
class CustomNetworkMetrics(NetworkMetrics):
    @property
    def condition(self) -> NetworkCondition:
        # Stricter thresholds for healthcare
        if self.rtt_ms < 30 and self.packet_loss_pct < 0.05:
            return NetworkCondition.EXCELLENT
        elif self.rtt_ms < 100 and self.packet_loss_pct < 0.5:
            return NetworkCondition.GOOD
        # ... etc
```

### Custom Quality Presets

```python
# Override preset settings
custom_presets = QUALITY_PRESETS.copy()
custom_presets[QualityLevel.HIGH] = QualitySettings(
    level=QualityLevel.HIGH,
    stt_model="whisper-1",
    tts_model="eleven_multilingual_v2",  # Custom TTS
    target_latency_ms=500,  # Tighter budget
    # ... other settings
)
```

## Frontend Integration

### QualityBadge Component

```tsx
interface QualityBadgeProps {
  level: "ultra" | "high" | "medium" | "low" | "minimal";
  showLabel?: boolean;
}

function QualityBadge({ level, showLabel = true }: QualityBadgeProps) {
  const colors = {
    ultra: "bg-purple-500",
    high: "bg-green-500",
    medium: "bg-yellow-500",
    low: "bg-orange-500",
    minimal: "bg-red-500",
  };

  return (
    <span className={colors[level]}>
      {showLabel && <span>{level}</span>}
    </span>
  );
}
```

### Real-time Quality Updates

```tsx
import { useEffect, useState } from "react";

function useQualityState(sessionId: string) {
  const [quality, setQuality] = useState(null);

  useEffect(() => {
    const es = new EventSource(`/api/voice/${sessionId}/quality`);
    es.onmessage = (event) => {
      setQuality(JSON.parse(event.data));
    };
    return () => es.close();
  }, [sessionId]);

  return quality;
}
```
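The hook expects a server-sent-events endpoint at `/api/voice/{sessionId}/quality`. This document does not define the server side; what follows is a minimal FastAPI sketch of what it could look like, where the route wiring, the polling interval, and the `get_state` accessor are all assumptions:

```python
# Hypothetical server side of the SSE stream consumed by useQualityState.
import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

from app.services.adaptive_quality_service import get_adaptive_quality_service

app = FastAPI()

@app.get("/api/voice/{session_id}/quality")
async def quality_stream(session_id: str) -> StreamingResponse:
    service = get_adaptive_quality_service()

    async def events():
        while True:
            state = service.get_state(session_id)  # hypothetical accessor
            payload = {"level": state.current_level.value}
            yield f"data: {json.dumps(payload)}\n\n"  # SSE frame format
            await asyncio.sleep(5)

    return StreamingResponse(events(), media_type="text/event-stream")
```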
## Metrics and Monitoring

### Prometheus Metrics

```
# Exposed metrics
voice_quality_level{session_id, level}
voice_latency_budget_exceeded{session_id}
voice_degradation_total{from_level, to_level, reason}
voice_network_condition{session_id, condition}
```

### Logging

```python
# Quality change log
logger.info(
    "Quality level changed",
    extra={
        "session_id": session_id,
        "from_level": old_level,
        "to_level": new_level,
        "reason": reason,
        "network_condition": condition,
    },
)
```

## Best Practices

1. **Initialize early**: Call `init_session` at voice mode start
2. **Update frequently**: Send network metrics every 5-10 seconds
3. **Record all latencies**: Track STT, LLM, TTS, and network
4. **Handle callbacks**: Update UI when quality changes
5. **Clean up**: Call `end_session` when voice mode ends
6. **Test degradation**: Use load tests before deployment

An end-to-end sketch combining these steps appears after the links below.

## Related Documentation

- [Voice Mode v4 Overview](./voice-mode-v4-overview.md)
- [Latency Budgets Guide](./latency-budgets-guide.md)
- [Speaker Diarization Service](./speaker-diarization-service.md)
- [FHIR Streaming Service](./fhir-streaming-service.md)
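As referenced from the best practices, here is a condensed session loop tying the steps together. It is a minimal sketch assuming the APIs shown earlier; `session_is_active` is a hypothetical liveness check, and `end_session` (named in practice 5 but not shown elsewhere) is assumed to be an async call taking the session ID:

```python
import asyncio

from app.services.adaptive_quality_service import (
    NetworkMetrics,
    QualityLevel,
    get_adaptive_quality_service,
)

async def run_voice_session(session_id: str) -> None:
    service = get_adaptive_quality_service()
    await service.initialize()

    # 1. Initialize early; 4. handle quality-change callbacks
    await service.init_session(session_id, initial_level=QualityLevel.HIGH)
    service.on_quality_change(lambda state, event: print(event.reason))

    try:
        while session_is_active(session_id):  # hypothetical liveness check
            # 2. Update network metrics every 5-10 seconds
            metrics = NetworkMetrics(rtt_ms=80, bandwidth_kbps=8000,
                                     packet_loss_pct=0.2, jitter_ms=10)
            await service.update_network_metrics(session_id, metrics)

            # 3. Record per-component latencies as the pipeline runs
            service.record_latency(session_id, "stt", latency_ms=140)
            service.record_latency(session_id, "llm", latency_ms=200)
            service.record_latency(session_id, "tts", latency_ms=130)

            await asyncio.sleep(5)
    finally:
        # 5. Clean up when voice mode ends (assumed async)
        await service.end_session(session_id)
```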