# PHI-Aware STT Routing

Voice Mode v4.1 introduces PHI-aware speech-to-text routing to ensure Protected Health Information (PHI) remains on-premises when required for HIPAA compliance.

## Overview

The PHI-aware STT router intelligently routes audio based on content sensitivity:

```
┌────────────────────────────────────────────────────────────────┐
│                          Audio Input                           │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│   ┌──────────────┐     ┌───────────────────┐                   │
│   │ PHI Detector │────▶│ Sensitivity Score │                   │
│   └──────────────┘     └───────────────────┘                   │
│                                  │                             │
│              ┌───────────────────┼───────────────────┐         │
│              ▼                   ▼                   ▼         │
│         Score < 0.3      0.3 ≤ Score < 0.7      Score ≥ 0.7    │
│              │                   │                   │         │
│              ▼                   ▼                   ▼         │
│       ┌─────────────┐     ┌─────────────┐     ┌─────────────┐  │
│       │  Cloud STT  │     │ Hybrid Mode │     │Local Whisper│  │
│       │ (OpenAI/GCP)│     │ (Redacted)  │     │  (On-Prem)  │  │
│       └─────────────┘     └─────────────┘     └─────────────┘  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```

### Thinker-Talker Pipeline Integration

```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant VoicePipeline
    participant PHIRouter
    participant Thinker as Thinker (LLM)
    participant Talker as Talker (TTS)
    participant Telemetry

    User->>Frontend: Speaks audio
    Frontend->>VoicePipeline: Audio stream
    VoicePipeline->>PHIRouter: route(audio_context)
    Note over PHIRouter: PHI Detection & Scoring
    PHIRouter->>Telemetry: update_routing_state()
    Telemetry-->>Frontend: PHI mode indicator (🛡️/🔒/☁️)

    alt PHI Score >= 0.7
        PHIRouter->>VoicePipeline: route="local"
        Note over VoicePipeline: Use Local Whisper
    else PHI Score 0.3-0.7
        PHIRouter->>VoicePipeline: route="hybrid"
        Note over VoicePipeline: Use Cloud + Redaction
    else PHI Score < 0.3
        PHIRouter->>VoicePipeline: route="cloud"
        Note over VoicePipeline: Use Cloud STT
    end

    VoicePipeline->>Thinker: transcript + context
    Thinker-->>VoicePipeline: response_stream
    VoicePipeline->>Talker: text_stream
    Talker-->>Frontend: audio_chunks
    Frontend-->>User: Plays response
```

### Routing Priority Order

```mermaid
flowchart TD
    A[Audio Input] --> B{Session has prior PHI?}
    B -->|Yes| L[LOCAL<br/>🛡️ On-device Whisper]
    B -->|No| C{PHI Score >= 0.7?}
    C -->|Yes| L
    C -->|No| D{PHI Score >= 0.3?}
    D -->|Yes| H[HYBRID<br/>🔒 Cloud + Redaction]
    D -->|No| E{Medical Context?}
    E -->|Yes| H
    E -->|No| CL[CLOUD<br/>☁️ Standard STT]
    L --> T[Thinker-Talker Pipeline]
    H --> T
    CL --> T

    style L fill:#90EE90
    style H fill:#FFE4B5
    style CL fill:#ADD8E6
```
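The priority order in the flowchart can be summarized in a few lines. The sketch below is illustrative only; the argument names are assumptions, and the real checks live inside `PHISTTRouter`:

```python
# Illustrative summary of the routing priority above (not the PHISTTRouter implementation).
def decide_route(has_prior_phi: bool, phi_score: float, medical_context: bool) -> str:
    if has_prior_phi:
        return "local"   # 🛡️ session already contains PHI: stay on-device
    if phi_score >= 0.7:
        return "local"   # 🛡️ high-sensitivity content
    if phi_score >= 0.3:
        return "hybrid"  # 🔒 cloud STT with entity redaction
    if medical_context:
        return "hybrid"
    return "cloud"       # ☁️ standard cloud STT
```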
## PHI Detection

### Detection Signals

The PHI detector analyzes multiple signals to score content sensitivity:

| Signal                   | Weight | Examples                                 |
| ------------------------ | ------ | ---------------------------------------- |
| Medical entity detection | 0.4    | "My doctor said...", "I take metformin"  |
| Personal identifiers     | 0.3    | Names, DOB, SSN patterns                 |
| Appointment context      | 0.2    | "My appointment at...", "Dr. Smith"      |
| Session history          | 0.1    | Previous PHI in conversation             |

### Sensitivity Scores

| Score Range | Classification        | Routing Decision       |
| ----------- | --------------------- | ---------------------- |
| 0.0 - 0.29  | General               | Cloud STT (fastest)    |
| 0.3 - 0.69  | Potentially Sensitive | Hybrid mode (redacted) |
| 0.7 - 1.0   | PHI Detected          | Local Whisper (secure) |
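A minimal sketch of how the weighted signals could combine into a sensitivity score, assuming each signal contributes its full weight when it fires (the signal names and the boolean interface are illustrative, not the production detector):

```python
# Hypothetical combination of the weighted detection signals from the table above.
SIGNAL_WEIGHTS = {
    "medical_entity": 0.4,
    "personal_identifier": 0.3,
    "appointment_context": 0.2,
    "session_history": 0.1,
}

def sensitivity_score(signals: dict[str, bool]) -> float:
    """Sum the weights of the signals that fired, capped at 1.0."""
    return min(sum(w for name, w in SIGNAL_WEIGHTS.items() if signals.get(name)), 1.0)

# Example: a medication mention plus prior PHI in the session scores 0.5 (hybrid range).
print(sensitivity_score({"medical_entity": True, "session_history": True}))
```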
## Routing Strategies

### 1. Cloud STT (Default)

For general queries with no PHI indicators:

```python
from app.services.phi_stt_router import PHISTTRouter

router = PHISTTRouter()

# General query - routes to cloud
result = await router.transcribe(
    audio_data=audio_bytes,
    session_id="session_123"
)
# result.provider = "openai_whisper"
# result.phi_score = 0.15
# result.routing = "cloud"
```

### 2. Local Whisper (Secure)

For queries with high PHI probability:

```python
# PHI detected - routes to local Whisper
result = await router.transcribe(
    audio_data=audio_bytes,
    session_id="session_123",
    context={"has_prior_phi": True}  # Session context
)
# result.provider = "local_whisper"
# result.phi_score = 0.85
# result.routing = "local"
# result.phi_entities = ["medication", "condition"]
```

### 3. Hybrid Mode (Redacted)

For borderline cases, audio is processed with entity redaction:

```python
# Borderline - uses hybrid with redaction
result = await router.transcribe(
    audio_data=audio_bytes,
    session_id="session_123"
)
# result.provider = "openai_whisper_redacted"
# result.phi_score = 0.45
# result.routing = "hybrid"
# result.redacted_entities = ["name", "date"]
```
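The redaction step itself is not shown above. As a rough, text-level illustration of what hybrid mode does before content leaves the trust boundary (the patterns and entity labels are examples only; the production redactor is not shown in this guide and may operate earlier in the pipeline):

```python
import re

# Example-only redaction pass, not the production redactor.
REDACTION_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

def redact(text: str) -> tuple[str, list[str]]:
    """Replace matched entities with placeholders and report which entity types fired."""
    redacted_entities = []
    for label, pattern in REDACTION_PATTERNS.items():
        if pattern.search(text):
            redacted_entities.append(label)
            text = pattern.sub(f"[{label.upper()}]", text)
    return text, redacted_entities
```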
## Configuration

### Environment Variables

```bash
# Enable PHI-aware routing
VOICE_V4_PHI_ROUTING=true

# Local Whisper model path
WHISPER_MODEL_PATH=/opt/voiceassist/models/whisper-large-v3
WHISPER_MODEL_SIZE=large-v3

# Cloud STT provider (fallback)
STT_PROVIDER=openai  # openai, google, azure

# PHI detection thresholds
PHI_THRESHOLD_LOCAL=0.7
PHI_THRESHOLD_HYBRID=0.3

# Session context window (for PHI history)
PHI_SESSION_CONTEXT_WINDOW=10  # messages
```

### Feature Flag

```python
# Check if PHI routing is enabled
from app.core.feature_flags import feature_flag_service

if await feature_flag_service.is_enabled("backend.voice_v4_phi_routing"):
    router = PHISTTRouter()
else:
    router = StandardSTTRouter()
```
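As a sketch of how these variables might map onto a settings object (the class name, field names, and defaults below are illustrative; the service's actual configuration loader is not shown in this guide):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class PHIRoutingSettings:
    """Illustrative holder for the environment variables documented above."""
    enabled: bool = os.environ.get("VOICE_V4_PHI_ROUTING", "false").lower() == "true"
    threshold_local: float = float(os.environ.get("PHI_THRESHOLD_LOCAL", "0.7"))
    threshold_hybrid: float = float(os.environ.get("PHI_THRESHOLD_HYBRID", "0.3"))
    whisper_model_path: str = os.environ.get("WHISPER_MODEL_PATH", "")
    session_context_window: int = int(os.environ.get("PHI_SESSION_CONTEXT_WINDOW", "10"))

settings = PHIRoutingSettings()
```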
## Local Whisper Setup

### Installation

```bash
# Install faster-whisper (optimized inference)
pip install faster-whisper

# Download model
python -c "
from faster_whisper import WhisperModel
model = WhisperModel('large-v3', device='cuda', compute_type='float16')
print('Model downloaded successfully')
"
```

### Model Options

| Model    | Size   | VRAM  | RTF\* | Quality |
| -------- | ------ | ----- | ----- | ------- |
| tiny     | 39 MB  | 1 GB  | 0.03  | Basic   |
| base     | 74 MB  | 1 GB  | 0.05  | Good    |
| small    | 244 MB | 2 GB  | 0.08  | Better  |
| medium   | 769 MB | 5 GB  | 0.15  | Great   |
| large-v3 | 1.5 GB | 10 GB | 0.25  | Best    |

\*Real-time factor (lower is faster)

### GPU Requirements

- **Minimum**: NVIDIA GPU with 4GB VRAM (small model)
- **Recommended**: NVIDIA GPU with 10GB VRAM (large-v3)
- **CPU Fallback**: Available but 5-10x slower
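Once the model is downloaded, local transcription follows standard `faster-whisper` usage. A minimal sketch, assuming the model size matches `WHISPER_MODEL_SIZE` above (the audio file name is a placeholder):

```python
from faster_whisper import WhisperModel

# Load once at startup; loading large-v3 onto the GPU takes several seconds.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# Transcribe a single utterance; segments is a lazy generator, so iterate to run inference.
segments, info = model.transcribe("utterance.wav", beam_size=5, language="en")
transcript = " ".join(segment.text.strip() for segment in segments)
print(f"[{info.language} p={info.language_probability:.2f}] {transcript}")
```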
## UI Integration

### PHI Indicator Component

```tsx
import { PHIIndicator } from "@/components/voice/PHIIndicator";

<PHIIndicator sessionId={sessionId} />;
```

### Visual States

| Routing | Icon | Color  | Tooltip                      |
| ------- | ---- | ------ | ---------------------------- |
| cloud   | ☁️   | Blue   | "Using cloud transcription"  |
| hybrid  | 🔒   | Yellow | "Sensitive content detected" |
| local   | 🛡️   | Green  | "Secure local processing"    |

### Subscribing to PHI Routing Updates (Frontend)

The `PHITelemetryService` provides real-time PHI routing state to the frontend via WebSocket events and a polling API.

#### Option 1: WebSocket Subscription

```tsx
import { useEffect, useState } from "react";
import { useWebSocket } from "@/hooks/useWebSocket";

interface PHIState {
  sessionId: string;
  phiMode: "local" | "hybrid" | "cloud";
  phiScore: number;
  isSecureMode: boolean;
  hasPriorPhi: boolean;
  indicatorColor: "green" | "yellow" | "blue";
  indicatorIcon: "shield" | "lock" | "cloud";
  tooltip: string;
}

function usePHIRoutingState(sessionId: string) {
  const [phiState, setPHIState] = useState<PHIState | null>(null);
  const { subscribe, unsubscribe } = useWebSocket();

  useEffect(() => {
    // Subscribe to PHI telemetry events
    const handlePHIEvent = (event: { type: string; data: PHIState }) => {
      if (event.type === "phi.routing_decision" || event.type === "phi.mode_change") {
        setPHIState(event.data);
      }
    };

    subscribe(`phi.${sessionId}`, handlePHIEvent);
    return () => unsubscribe(`phi.${sessionId}`, handlePHIEvent);
  }, [sessionId, subscribe, unsubscribe]);

  return phiState;
}
```

#### Option 2: REST API Polling

```tsx
// GET /api/voice/phi-state/{session_id}
// Returns current PHI routing state for the session
async function fetchPHIState(sessionId: string): Promise<PHIState> {
  const response = await fetch(`/api/voice/phi-state/${sessionId}`);
  return response.json();
}

// Example usage in a component
function PHIIndicator({ sessionId }: { sessionId: string }) {
  const [state, setState] = useState<PHIState | null>(null);

  useEffect(() => {
    const interval = setInterval(async () => {
      const newState = await fetchPHIState(sessionId);
      setState(newState);
    }, 1000); // Poll every second
    return () => clearInterval(interval);
  }, [sessionId]);

  if (!state) return null;
  return (
    <div title={state.tooltip}>
      {getIcon(state.indicatorIcon)} {state.tooltip}
    </div>
  );
}
```

#### Backend API for Frontend State

```python
# In your FastAPI router
from app.services.phi_stt_router import get_phi_stt_router

@router.get("/api/voice/phi-state/{session_id}")
async def get_phi_state(session_id: str):
    """Get current PHI routing state for frontend indicator."""
    router = get_phi_stt_router()
    state = router.get_frontend_state(session_id)
    if state is None:
        raise HTTPException(404, "Session not found")
    return state
```

### Telemetry Event Types

| Event Type             | Description                            | Payload                        |
| ---------------------- | -------------------------------------- | ------------------------------ |
| `phi.routing_decision` | New routing decision made              | Full PHI state + previous mode |
| `phi.mode_change`      | PHI mode changed (e.g., cloud → local) | From/to modes, reason          |
| `phi.phi_detected`     | PHI entities detected in audio         | Score, entity types            |
| `phi.session_start`    | New PHI session initialized            | Initial state                  |
| `phi.session_end`      | PHI session ended                      | Final mode, had PHI flag       |

## Audit Logging

All PHI routing decisions are logged for compliance:

```python
logger.info("PHI routing decision", extra={
    "session_id": session_id,
    "phi_score": 0.85,
    "routing_decision": "local",
    "detection_signals": ["medication_mention", "condition_name"],
    "provider": "local_whisper",
    "processing_time_ms": 234,
    "model": "whisper-large-v3"
})
```

### Prometheus Metrics

```python
# Routing distribution
stt_routing_total.labels(routing="local").inc()
stt_routing_total.labels(routing="cloud").inc()
stt_routing_total.labels(routing="hybrid").inc()

# PHI detection accuracy
phi_detection_score_histogram.observe(phi_score)

# Latency by routing type
stt_latency_ms.labels(routing="local").observe(234)
```
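The snippet above assumes the metrics are already registered. A minimal sketch of how they could be declared with `prometheus_client` (metric names match the usage above; bucket boundaries are illustrative):

```python
from prometheus_client import Counter, Histogram

stt_routing_total = Counter(
    "stt_routing_total",
    "STT requests by routing decision",
    ["routing"],
)

phi_detection_score_histogram = Histogram(
    "phi_detection_score",
    "Distribution of PHI sensitivity scores",
    buckets=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
)

stt_latency_ms = Histogram(
    "stt_latency_ms",
    "STT latency in milliseconds by routing type",
    ["routing"],
    buckets=[50, 100, 200, 400, 800, 1600, 3200],
)
```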
## Testing

### Unit Tests

```python
import pytest

from app.services.phi_stt_router import PHISTTRouter


@pytest.mark.asyncio
async def test_phi_routing_high_score():
    """High PHI score routes to local Whisper."""
    router = PHISTTRouter()

    # Mock audio with PHI content
    audio = generate_test_audio("I take metformin for my diabetes")

    result = await router.transcribe(audio)

    assert result.routing == "local"
    assert result.phi_score >= 0.7
    assert result.provider == "local_whisper"


@pytest.mark.asyncio
async def test_phi_routing_low_score():
    """Low PHI score routes to cloud."""
    router = PHISTTRouter()

    # Mock audio without PHI
    audio = generate_test_audio("What is the weather today?")

    result = await router.transcribe(audio)

    assert result.routing == "cloud"
    assert result.phi_score < 0.3
```

### Integration Tests

```bash
# Run PHI routing tests
pytest tests/services/test_phi_stt_router.py -v

# Test with real audio samples
pytest tests/integration/test_phi_routing_e2e.py -v --audio-samples ./test_audio/
```

## Best Practices

1. **Default to local for medical context**: If a session involves health topics, bias toward local processing
2. **Cache PHI decisions per session**: Avoid re-evaluating the same session repeatedly (see the sketch after this list)
3. **Monitor latency impact**: Local Whisper adds ~200ms; account for this in latency budgets
4. **Regular model updates**: Update the Whisper model quarterly for accuracy improvements
5. **Audit trail**: Maintain logs of all routing decisions for compliance audits
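A minimal sketch of the per-session caching suggested in practice 2, assuming an in-process dict keyed by session ID (the real router tracks this in its session context, which is not shown here):

```python
# Illustrative per-session routing cache: once a session has gone local, keep it local.
_session_route_cache: dict[str, str] = {}

def route_with_cache(session_id: str, phi_score: float) -> str:
    if _session_route_cache.get(session_id) == "local":
        return "local"  # prior PHI in this session pins routing on-prem
    route = "local" if phi_score >= 0.7 else "hybrid" if phi_score >= 0.3 else "cloud"
    if route == "local":
        _session_route_cache[session_id] = "local"
    return route
```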
## Related Documentation

- [Voice Mode v4.1 Overview](./voice-mode-v4-overview.md)
- [Latency Budgets Guide](./latency-budgets-guide.md)
- [HIPAA Compliance Matrix](../HIPAA_COMPLIANCE_MATRIX.md)