# Adaptive Quality Service

**Phase 3 - Voice Mode v4.1**

Dynamic voice processing quality management based on network conditions and system load.

## Overview

The Adaptive Quality Service monitors network performance and system load in real time, automatically adjusting voice processing quality to maintain an optimal user experience within latency budgets.

```
┌────────────────────────────────────────────────────────────────┐
│                    Adaptive Quality Service                    │
│                                                                │
│  Network Metrics ───▶ ┌─────────────┐ ───▶ Quality Level       │
│  (RTT, bandwidth,     │   Quality   │      (ULTRA→MINIMAL)     │
│   packet loss)        │  Adjuster   │                          │
│                       └─────────────┘                          │
│                              │                                 │
│  Latency Budget ────────────▶│◀─────── User Preferences        │
│  (per component)             │                                 │
│                              ▼                                 │
│                       ┌─────────────┐                          │
│                       │  Settings   │                          │
│                       │  Generator  │                          │
│                       └─────────────┘                          │
│                              │                                 │
│                              ▼                                 │
│        STT Model, TTS Model, Bitrate, Features                 │
└────────────────────────────────────────────────────────────────┘
```

## Features

- **5 Quality Levels**: ULTRA, HIGH, MEDIUM, LOW, MINIMAL
- **Network Monitoring**: RTT, bandwidth, packet loss, jitter
- **Latency Budgets**: Per-component budget tracking
- **Graceful Degradation**: Automatic quality reduction
- **Load Testing**: Built-in test utilities
- **Hysteresis**: Prevents quality flapping

## Quality Levels

| Level       | Target Latency | STT Model        | TTS Model       | Features             |
| ----------- | -------------- | ---------------- | --------------- | -------------------- |
| **ULTRA**   | 800ms          | whisper-large-v3 | eleven_turbo_v2 | All enabled          |
| **HIGH**    | 600ms          | whisper-1        | eleven_turbo_v2 | All enabled          |
| **MEDIUM**  | 500ms          | whisper-1        | tts-1           | Sentiment, Language  |
| **LOW**     | 400ms          | whisper-1        | tts-1           | None                 |
| **MINIMAL** | 300ms          | whisper-1        | tts-1           | None, reduced tokens |

### Detailed Settings per Level

```python
QUALITY_PRESETS = {
    QualityLevel.ULTRA: QualitySettings(
        stt_model="whisper-large-v3",
        tts_model="eleven_turbo_v2",
        audio_bitrate_kbps=128,
        sample_rate_hz=48000,
        max_context_tokens=8000,
        max_response_tokens=2000,
        enable_speaker_diarization=True,
        enable_sentiment_analysis=True,
        enable_language_detection=True,
    ),
    QualityLevel.HIGH: QualitySettings(
        stt_model="whisper-1",
        tts_model="eleven_turbo_v2",
        audio_bitrate_kbps=96,
        sample_rate_hz=24000,
        max_context_tokens=6000,
        max_response_tokens=1500,
        # ...
    ),
    # ... and so on
}
```

## Network Conditions

| Condition     | RTT       | Bandwidth  | Packet Loss | Auto Level |
| ------------- | --------- | ---------- | ----------- | ---------- |
| **EXCELLENT** | <50ms     | >10 Mbps   | <0.1%       | ULTRA      |
| **GOOD**      | 50-150ms  | 2-10 Mbps  | <1%         | HIGH       |
| **FAIR**      | 150-300ms | 0.5-2 Mbps | <5%         | MEDIUM     |
| **POOR**      | 300-500ms | <0.5 Mbps  | <10%        | LOW        |
| **CRITICAL**  | >500ms    | Very low   | >10%        | MINIMAL    |
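The table reads as a waterfall: a sample must clear every threshold of a bucket to land in it, otherwise it falls through to the next. A minimal sketch of that mapping, assuming the `NetworkCondition` enum members listed above; `classify_condition` is a hypothetical helper for illustration, since the service applies these thresholds internally via `NetworkMetrics.condition`:

```python
from app.services.adaptive_quality_service import NetworkCondition, NetworkMetrics

def classify_condition(m: NetworkMetrics) -> NetworkCondition:
    # Hypothetical helper: thresholds transcribed from the table above
    if m.rtt_ms < 50 and m.bandwidth_kbps > 10_000 and m.packet_loss_pct < 0.1:
        return NetworkCondition.EXCELLENT
    if m.rtt_ms < 150 and m.bandwidth_kbps > 2_000 and m.packet_loss_pct < 1:
        return NetworkCondition.GOOD
    if m.rtt_ms < 300 and m.bandwidth_kbps > 500 and m.packet_loss_pct < 5:
        return NetworkCondition.FAIR
    if m.rtt_ms < 500 and m.packet_loss_pct < 10:
        return NetworkCondition.POOR
    return NetworkCondition.CRITICAL
```

The "Auto Level" column is then a direct lookup: each condition bucket selects the corresponding quality level unless a user preference caps it.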
## Feature Flag

```yaml
# flag_definitions.yaml
backend.voice_v4_adaptive_quality:
  default: false
  description: "Enable adaptive quality management"
```

## Basic Usage

### Initialize Session

```python
from app.services.adaptive_quality_service import (
    get_adaptive_quality_service,
    QualityLevel,
)

service = get_adaptive_quality_service()
await service.initialize()

# Start session with initial quality
state = await service.init_session(
    session_id="voice-123",
    initial_level=QualityLevel.HIGH,
    user_preference=QualityLevel.MEDIUM,  # Optional override
)

print(f"Quality: {state.current_level.value}")
print(f"Target latency: {state.current_settings.target_latency_ms}ms")
```

### Update Network Metrics

```python
from app.services.adaptive_quality_service import NetworkMetrics

# Measure network conditions
metrics = NetworkMetrics(
    rtt_ms=150,
    bandwidth_kbps=5000,
    packet_loss_pct=0.5,
    jitter_ms=15,
)

# Update service (may trigger quality change)
state = await service.update_network_metrics("voice-123", metrics)

print(f"Network: {state.network_condition.value}")
print(f"Quality: {state.current_level.value}")
```

### Record Component Latency

```python
# Track latency for budget monitoring
budget = service.record_latency("voice-123", "stt", latency_ms=180)
budget = service.record_latency("voice-123", "llm", latency_ms=250)
budget = service.record_latency("voice-123", "tts", latency_ms=120)

print(f"Total: {budget.total_actual_ms}ms / {budget.total_budget_ms}ms")
print(f"Exceeded: {budget.is_exceeded}")
```

### Get Current Settings

```python
settings = service.get_current_settings("voice-123")

# Use settings in voice pipeline
stt_response = await stt_service.transcribe(
    audio=audio_data,
    model=settings.stt_model,
    sample_rate=settings.sample_rate_hz,
)

tts_response = await tts_service.synthesize(
    text=response_text,
    model=settings.tts_model,
    bitrate=settings.audio_bitrate_kbps,
)
```

## Latency Budget System

### Budget Allocation

```
Total Budget (e.g., 600ms for HIGH)
├── STT:     25% (150ms)
├── LLM:     35% (210ms)
├── TTS:     25% (150ms)
└── Network: 15% (90ms)
```

### LatencyBudget Class

```python
@dataclass
class LatencyBudget:
    total_budget_ms: int
    stt_budget_ms: int
    llm_budget_ms: int
    tts_budget_ms: int
    network_budget_ms: int

    # Actual measurements
    stt_actual_ms: float = 0
    llm_actual_ms: float = 0
    tts_actual_ms: float = 0
    network_actual_ms: float = 0

    @property
    def total_actual_ms(self) -> float:
        # Sum of the recorded component latencies
        return (
            self.stt_actual_ms
            + self.llm_actual_ms
            + self.tts_actual_ms
            + self.network_actual_ms
        )

    @property
    def is_exceeded(self) -> bool:
        return self.total_actual_ms > self.total_budget_ms
```
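Given the fixed 25/35/25/15 split above, a budget can be derived from a level's target latency alone. A minimal sketch assuming the `LatencyBudget` fields shown here; `budget_for` is a hypothetical helper, not part of the service's public API:

```python
def budget_for(total_ms: int) -> LatencyBudget:
    # Hypothetical helper: apply the 25/35/25/15 allocation from the diagram
    return LatencyBudget(
        total_budget_ms=total_ms,
        stt_budget_ms=int(total_ms * 0.25),
        llm_budget_ms=int(total_ms * 0.35),
        tts_budget_ms=int(total_ms * 0.25),
        network_budget_ms=int(total_ms * 0.15),
    )

# HIGH targets 600ms -> 150ms STT, 210ms LLM, 150ms TTS, 90ms network
high_budget = budget_for(600)
assert high_budget.stt_budget_ms == 150 and high_budget.llm_budget_ms == 210
```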
### Automatic Degradation

When the latency budget is exceeded:

```python
# In voice pipeline
budget = service.record_latency(session_id, "llm", 350)  # Over budget

if budget.is_exceeded:
    # Service automatically triggers degradation
    # Quality: HIGH → MEDIUM
    # New budget: 600ms → 500ms
    ...
```

## Quality Change Callbacks

```python
def on_quality_change(state: QualityState, event: DegradationEvent):
    print(f"Quality changed: {event.from_level} → {event.to_level}")
    print(f"Reason: {event.reason}")

    # Update UI
    send_to_frontend({
        "type": "quality_change",
        "level": state.current_level.value,
        "reason": event.reason,
    })

service.on_quality_change(on_quality_change)
```

## Hysteresis Logic

The service prevents quality flapping with hysteresis:

```python
# Downgrade: Requires 2+ poor samples in last 3
def _should_downgrade(history):
    recent = history[-3:]
    poor_count = sum(1 for m in recent if m.condition in ["poor", "critical"])
    return poor_count >= 2

# Upgrade: Requires 4+ good samples in last 5
def _should_upgrade(history):
    recent = history[-5:]
    if len(recent) < 5:
        return False
    good_count = sum(1 for m in recent if m.condition in ["excellent", "good"])
    return good_count >= 4
```

## Load Testing

### Concurrent Session Test

```python
from app.services.adaptive_quality_service import get_load_test_runner

runner = get_load_test_runner()

result = await runner.run_concurrent_session_test(
    num_sessions=50,
    duration_seconds=120,
    requests_per_second=10,
)

print(f"Success rate: {result.success_rate}%")
print(f"P95 latency: {result.p95_latency_ms}ms")
print(f"Degradations: {result.degradations_triggered}")
```

### Degradation Behavior Test

```python
# Test quality degradation under poor network
events = await runner.run_degradation_test(
    session_id="test-session",
    simulate_poor_network=True,
)

for event in events:
    print(f"{event.from_level} → {event.to_level}: {event.reason}")
```

### LoadTestResult

```python
@dataclass
class LoadTestResult:
    test_name: str
    concurrent_sessions: int
    duration_seconds: float
    total_requests: int
    successful_requests: int
    failed_requests: int
    avg_latency_ms: float
    p50_latency_ms: float
    p95_latency_ms: float
    p99_latency_ms: float
    degradations_triggered: int

    @property
    def success_rate(self) -> float:
        return self.successful_requests / self.total_requests * 100
```
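Load-test results are easiest to consume as a pass/fail gate before rollout. A minimal sketch over the `LoadTestResult` fields above; the thresholds and the `meets_slo` helper are illustrative, not part of the service:

```python
def meets_slo(result: LoadTestResult) -> bool:
    # Hypothetical deployment gate; tune thresholds to your own SLOs
    return (
        result.success_rate >= 99.0        # at most 1% failed requests
        and result.p95_latency_ms <= 600   # HIGH-level budget at p95
        and result.degradations_triggered <= result.concurrent_sessions
    )

result = await runner.run_concurrent_session_test(
    num_sessions=50, duration_seconds=120, requests_per_second=10
)
if not meets_slo(result):
    raise RuntimeError(f"Load test failed: p95={result.p95_latency_ms}ms")
```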
## Override Thresholds

### Custom Network Thresholds

```python
# Define custom condition thresholds
class CustomNetworkMetrics(NetworkMetrics):
    @property
    def condition(self) -> NetworkCondition:
        # Stricter thresholds for healthcare
        if self.rtt_ms < 30 and self.packet_loss_pct < 0.05:
            return NetworkCondition.EXCELLENT
        elif self.rtt_ms < 100 and self.packet_loss_pct < 0.5:
            return NetworkCondition.GOOD
        # ... etc
```

### Custom Quality Presets

```python
# Override preset settings
custom_presets = QUALITY_PRESETS.copy()
custom_presets[QualityLevel.HIGH] = QualitySettings(
    level=QualityLevel.HIGH,
    stt_model="whisper-1",
    tts_model="eleven_multilingual_v2",  # Custom TTS
    target_latency_ms=500,  # Tighter budget
    # ... other settings
)
```

## Frontend Integration

### QualityBadge Component

```tsx
interface QualityBadgeProps {
  level: "ultra" | "high" | "medium" | "low" | "minimal";
  showLabel?: boolean;
}

function QualityBadge({ level, showLabel = true }: QualityBadgeProps) {
  const colors = {
    ultra: "bg-purple-500",
    high: "bg-green-500",
    medium: "bg-yellow-500",
    low: "bg-orange-500",
    minimal: "bg-red-500",
  };

  return (
    <span className={colors[level]}>
      {showLabel && <span>{level}</span>}
    </span>
  );
}
```

### Real-time Quality Updates

```tsx
import { useEffect, useState } from "react";

function useQualityState(sessionId: string) {
  const [quality, setQuality] = useState(null);

  useEffect(() => {
    const es = new EventSource(`/api/voice/${sessionId}/quality`);
    es.onmessage = (event) => {
      setQuality(JSON.parse(event.data));
    };
    return () => es.close();
  }, [sessionId]);

  return quality;
}
```
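The hook expects a server-sent-events endpoint at `/api/voice/{sessionId}/quality`. This document does not define the server side; what follows is a minimal FastAPI sketch of what it could look like, where the route wiring, the polling interval, and the `get_state` accessor are all assumptions:

```python
# Hypothetical server side of the SSE stream consumed by useQualityState.
import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

from app.services.adaptive_quality_service import get_adaptive_quality_service

app = FastAPI()

@app.get("/api/voice/{session_id}/quality")
async def quality_stream(session_id: str) -> StreamingResponse:
    service = get_adaptive_quality_service()

    async def events():
        while True:
            state = service.get_state(session_id)  # hypothetical accessor
            payload = {"level": state.current_level.value}
            yield f"data: {json.dumps(payload)}\n\n"  # SSE frame format
            await asyncio.sleep(5)

    return StreamingResponse(events(), media_type="text/event-stream")
```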
## Metrics and Monitoring

### Prometheus Metrics

```
# Exposed metrics
voice_quality_level{session_id, level}
voice_latency_budget_exceeded{session_id}
voice_degradation_total{from_level, to_level, reason}
voice_network_condition{session_id, condition}
```

### Logging

```python
# Quality change log
logger.info(
    "Quality level changed",
    extra={
        "session_id": session_id,
        "from_level": old_level,
        "to_level": new_level,
        "reason": reason,
        "network_condition": condition,
    },
)
```

## Best Practices

1. **Initialize early**: Call `init_session` at voice mode start
2. **Update frequently**: Send network metrics every 5-10 seconds
3. **Record all latencies**: Track STT, LLM, TTS, and network
4. **Handle callbacks**: Update UI when quality changes
5. **Clean up**: Call `end_session` when voice mode ends
6. **Test degradation**: Use load tests before deployment

An end-to-end sketch combining these steps appears after the links below.

## Related Documentation

- [Voice Mode v4 Overview](./voice-mode-v4-overview.md)
- [Latency Budgets Guide](./latency-budgets-guide.md)
- [Speaker Diarization Service](./speaker-diarization-service.md)
- [FHIR Streaming Service](./fhir-streaming-service.md)
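As referenced from the best practices, here is a condensed session loop tying the steps together. It is a minimal sketch assuming the APIs shown earlier; `session_is_active` is a hypothetical liveness check, and `end_session` (named in practice 5 but not shown elsewhere) is assumed to be an async call taking the session ID:

```python
import asyncio

from app.services.adaptive_quality_service import (
    NetworkMetrics,
    QualityLevel,
    get_adaptive_quality_service,
)

async def run_voice_session(session_id: str) -> None:
    service = get_adaptive_quality_service()
    await service.initialize()

    # 1. Initialize early; 4. handle quality-change callbacks
    await service.init_session(session_id, initial_level=QualityLevel.HIGH)
    service.on_quality_change(lambda state, event: print(event.reason))

    try:
        while session_is_active(session_id):  # hypothetical liveness check
            # 2. Update network metrics every 5-10 seconds
            metrics = NetworkMetrics(rtt_ms=80, bandwidth_kbps=8000,
                                     packet_loss_pct=0.2, jitter_ms=10)
            await service.update_network_metrics(session_id, metrics)

            # 3. Record per-component latencies as the pipeline runs
            service.record_latency(session_id, "stt", latency_ms=140)
            service.record_latency(session_id, "llm", latency_ms=200)
            service.record_latency(session_id, "tts", latency_ms=130)

            await asyncio.sleep(5)
    finally:
        # 5. Clean up when voice mode ends (assumed async)
        await service.end_session(session_id)
```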