# PHI-Aware STT Routing

Voice Mode v4.1 introduces PHI-aware speech-to-text routing to ensure Protected Health Information (PHI) remains on-premises when required for HIPAA compliance.

## Overview

The PHI-aware STT router intelligently routes audio based on content sensitivity:

```
┌────────────────────────────────────────────────────────────────┐
│                          Audio Input                           │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│   ┌──────────────┐     ┌───────────────────┐                   │
│   │ PHI Detector │────▶│ Sensitivity Score │                   │
│   └──────────────┘     └───────────────────┘                   │
│                                  │                             │
│              ┌───────────────────┼───────────────────┐         │
│              ▼                   ▼                   ▼         │
│         Score < 0.3      0.3 ≤ Score < 0.7      Score ≥ 0.7    │
│              │                   │                   │         │
│              ▼                   ▼                   ▼         │
│       ┌─────────────┐     ┌─────────────┐     ┌─────────────┐  │
│       │  Cloud STT  │     │ Hybrid Mode │     │Local Whisper│  │
│       │ (OpenAI/GCP)│     │ (Redacted)  │     │  (On-Prem)  │  │
│       └─────────────┘     └─────────────┘     └─────────────┘  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```

### Thinker-Talker Pipeline Integration

```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant VoicePipeline
    participant PHIRouter
    participant Thinker as Thinker (LLM)
    participant Talker as Talker (TTS)
    participant Telemetry

    User->>Frontend: Speaks audio
    Frontend->>VoicePipeline: Audio stream
    VoicePipeline->>PHIRouter: route(audio_context)
    Note over PHIRouter: PHI Detection & Scoring
    PHIRouter->>Telemetry: update_routing_state()
    Telemetry-->>Frontend: PHI mode indicator (🛡️/🔒/☁️)

    alt PHI Score >= 0.7
        PHIRouter->>VoicePipeline: route="local"
        Note over VoicePipeline: Use Local Whisper
    else PHI Score 0.3-0.7
        PHIRouter->>VoicePipeline: route="hybrid"
        Note over VoicePipeline: Use Cloud + Redaction
    else PHI Score < 0.3
        PHIRouter->>VoicePipeline: route="cloud"
        Note over VoicePipeline: Use Cloud STT
    end

    VoicePipeline->>Thinker: transcript + context
    Thinker-->>VoicePipeline: response_stream
    VoicePipeline->>Talker: text_stream
    Talker-->>Frontend: audio_chunks
    Frontend-->>User: Plays response
```

### Routing Priority Order

```mermaid
flowchart TD
    A[Audio Input] --> B{Session has prior PHI?}
    B -->|Yes| L[LOCAL<br/>🛡️ On-device Whisper]
    B -->|No| C{PHI Score >= 0.7?}
    C -->|Yes| L
    C -->|No| D{PHI Score >= 0.3?}
    D -->|Yes| H[HYBRID<br/>🔒 Cloud + Redaction]
    D -->|No| E{Medical Context?}
    E -->|Yes| H
    E -->|No| CL[CLOUD<br/>☁️ Standard STT]
    L --> T[Thinker-Talker Pipeline]
    H --> T
    CL --> T

    style L fill:#90EE90
    style H fill:#FFE4B5
    style CL fill:#ADD8E6
```
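The priority order in the flowchart can be summarized in a few lines. The sketch below is illustrative only; the argument names are assumptions, and the real checks live inside `PHISTTRouter`:

```python
# Illustrative summary of the routing priority above (not the PHISTTRouter implementation).
def decide_route(has_prior_phi: bool, phi_score: float, medical_context: bool) -> str:
    if has_prior_phi:
        return "local"   # 🛡️ session already contains PHI: stay on-device
    if phi_score >= 0.7:
        return "local"   # 🛡️ high-sensitivity content
    if phi_score >= 0.3:
        return "hybrid"  # 🔒 cloud STT with entity redaction
    if medical_context:
        return "hybrid"
    return "cloud"       # ☁️ standard cloud STT
```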
## PHI Detection

### Detection Signals

The PHI detector analyzes multiple signals to score content sensitivity:

| Signal                   | Weight | Examples                                 |
| ------------------------ | ------ | ---------------------------------------- |
| Medical entity detection | 0.4    | "My doctor said...", "I take metformin"  |
| Personal identifiers     | 0.3    | Names, DOB, SSN patterns                 |
| Appointment context      | 0.2    | "My appointment at...", "Dr. Smith"      |
| Session history          | 0.1    | Previous PHI in conversation             |

### Sensitivity Scores

| Score Range | Classification        | Routing Decision       |
| ----------- | --------------------- | ---------------------- |
| 0.0 - 0.29  | General               | Cloud STT (fastest)    |
| 0.3 - 0.69  | Potentially Sensitive | Hybrid mode (redacted) |
| 0.7 - 1.0   | PHI Detected          | Local Whisper (secure) |
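A minimal sketch of how the weighted signals could combine into a sensitivity score, assuming each signal contributes its full weight when it fires (the signal names and the boolean interface are illustrative, not the production detector):

```python
# Hypothetical combination of the weighted detection signals from the table above.
SIGNAL_WEIGHTS = {
    "medical_entity": 0.4,
    "personal_identifier": 0.3,
    "appointment_context": 0.2,
    "session_history": 0.1,
}

def sensitivity_score(signals: dict[str, bool]) -> float:
    """Sum the weights of the signals that fired, capped at 1.0."""
    return min(sum(w for name, w in SIGNAL_WEIGHTS.items() if signals.get(name)), 1.0)

# Example: a medication mention plus prior PHI in the session scores 0.5 (hybrid range).
print(sensitivity_score({"medical_entity": True, "session_history": True}))
```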
## Routing Strategies

### 1. Cloud STT (Default)

For general queries with no PHI indicators:

```python
from app.services.phi_stt_router import PHISTTRouter

router = PHISTTRouter()

# General query - routes to cloud
result = await router.transcribe(
    audio_data=audio_bytes,
    session_id="session_123"
)
# result.provider = "openai_whisper"
# result.phi_score = 0.15
# result.routing = "cloud"
```

### 2. Local Whisper (Secure)

For queries with high PHI probability:

```python
# PHI detected - routes to local Whisper
result = await router.transcribe(
    audio_data=audio_bytes,
    session_id="session_123",
    context={"has_prior_phi": True}  # Session context
)
# result.provider = "local_whisper"
# result.phi_score = 0.85
# result.routing = "local"
# result.phi_entities = ["medication", "condition"]
```

### 3. Hybrid Mode (Redacted)

For borderline cases, audio is processed with entity redaction:

```python
# Borderline - uses hybrid with redaction
result = await router.transcribe(
    audio_data=audio_bytes,
    session_id="session_123"
)
# result.provider = "openai_whisper_redacted"
# result.phi_score = 0.45
# result.routing = "hybrid"
# result.redacted_entities = ["name", "date"]
```
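The redaction step itself is not shown above. As a rough, text-level illustration of what hybrid mode does before content leaves the trust boundary (the patterns and entity labels are examples only; the production redactor is not shown in this guide and may operate earlier in the pipeline):

```python
import re

# Example-only redaction pass, not the production redactor.
REDACTION_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

def redact(text: str) -> tuple[str, list[str]]:
    """Replace matched entities with placeholders and report which entity types fired."""
    redacted_entities = []
    for label, pattern in REDACTION_PATTERNS.items():
        if pattern.search(text):
            redacted_entities.append(label)
            text = pattern.sub(f"[{label.upper()}]", text)
    return text, redacted_entities
```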
## Configuration

### Environment Variables

```bash
# Enable PHI-aware routing
VOICE_V4_PHI_ROUTING=true

# Local Whisper model path
WHISPER_MODEL_PATH=/opt/voiceassist/models/whisper-large-v3
WHISPER_MODEL_SIZE=large-v3

# Cloud STT provider (fallback)
STT_PROVIDER=openai  # openai, google, azure

# PHI detection thresholds
PHI_THRESHOLD_LOCAL=0.7
PHI_THRESHOLD_HYBRID=0.3

# Session context window (for PHI history)
PHI_SESSION_CONTEXT_WINDOW=10  # messages
```

### Feature Flag

```python
# Check if PHI routing is enabled
from app.core.feature_flags import feature_flag_service

if await feature_flag_service.is_enabled("backend.voice_v4_phi_routing"):
    router = PHISTTRouter()
else:
    router = StandardSTTRouter()
```
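As a sketch of how these variables might map onto a settings object (the class name, field names, and defaults below are illustrative; the service's actual configuration loader is not shown in this guide):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class PHIRoutingSettings:
    """Illustrative holder for the environment variables documented above."""
    enabled: bool = os.environ.get("VOICE_V4_PHI_ROUTING", "false").lower() == "true"
    threshold_local: float = float(os.environ.get("PHI_THRESHOLD_LOCAL", "0.7"))
    threshold_hybrid: float = float(os.environ.get("PHI_THRESHOLD_HYBRID", "0.3"))
    whisper_model_path: str = os.environ.get("WHISPER_MODEL_PATH", "")
    session_context_window: int = int(os.environ.get("PHI_SESSION_CONTEXT_WINDOW", "10"))

settings = PHIRoutingSettings()
```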
## Local Whisper Setup

### Installation

```bash
# Install faster-whisper (optimized inference)
pip install faster-whisper

# Download model
python -c "
from faster_whisper import WhisperModel
model = WhisperModel('large-v3', device='cuda', compute_type='float16')
print('Model downloaded successfully')
"
```

### Model Options

| Model    | Size   | VRAM  | RTF\* | Quality |
| -------- | ------ | ----- | ----- | ------- |
| tiny     | 39 MB  | 1 GB  | 0.03  | Basic   |
| base     | 74 MB  | 1 GB  | 0.05  | Good    |
| small    | 244 MB | 2 GB  | 0.08  | Better  |
| medium   | 769 MB | 5 GB  | 0.15  | Great   |
| large-v3 | 1.5 GB | 10 GB | 0.25  | Best    |

\*Real-time factor (lower is faster)

### GPU Requirements

- **Minimum**: NVIDIA GPU with 4GB VRAM (small model)
- **Recommended**: NVIDIA GPU with 10GB VRAM (large-v3)
- **CPU Fallback**: Available but 5-10x slower
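Once the model is downloaded, local transcription follows standard `faster-whisper` usage. A minimal sketch, assuming the model size matches `WHISPER_MODEL_SIZE` above (the audio file name is a placeholder):

```python
from faster_whisper import WhisperModel

# Load once at startup; loading large-v3 onto the GPU takes several seconds.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# Transcribe a single utterance; segments is a lazy generator, so iterate to run inference.
segments, info = model.transcribe("utterance.wav", beam_size=5, language="en")
transcript = " ".join(segment.text.strip() for segment in segments)
print(f"[{info.language} p={info.language_probability:.2f}] {transcript}")
```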
## UI Integration

### PHI Indicator Component

```tsx
import { PHIIndicator } from "@/components/voice/PHIIndicator";

<PHIIndicator sessionId={sessionId} />;
```

### Visual States

| Routing | Icon | Color  | Tooltip                      |
| ------- | ---- | ------ | ---------------------------- |
| cloud   | ☁️   | Blue   | "Using cloud transcription"  |
| hybrid  | 🔒   | Yellow | "Sensitive content detected" |
| local   | 🛡️   | Green  | "Secure local processing"    |

### Subscribing to PHI Routing Updates (Frontend)

The `PHITelemetryService` provides real-time PHI routing state to the frontend via WebSocket events and a polling API.

#### Option 1: WebSocket Subscription

```tsx
import { useEffect, useState } from "react";
import { useWebSocket } from "@/hooks/useWebSocket";

interface PHIState {
  sessionId: string;
  phiMode: "local" | "hybrid" | "cloud";
  phiScore: number;
  isSecureMode: boolean;
  hasPriorPhi: boolean;
  indicatorColor: "green" | "yellow" | "blue";
  indicatorIcon: "shield" | "lock" | "cloud";
  tooltip: string;
}

function usePHIRoutingState(sessionId: string) {
  const [phiState, setPHIState] = useState<PHIState | null>(null);
  const { subscribe, unsubscribe } = useWebSocket();

  useEffect(() => {
    // Subscribe to PHI telemetry events
    const handlePHIEvent = (event: { type: string; data: PHIState }) => {
      if (event.type === "phi.routing_decision" || event.type === "phi.mode_change") {
        setPHIState(event.data);
      }
    };

    subscribe(`phi.${sessionId}`, handlePHIEvent);
    return () => unsubscribe(`phi.${sessionId}`, handlePHIEvent);
  }, [sessionId, subscribe, unsubscribe]);

  return phiState;
}
```

#### Option 2: REST API Polling

```tsx
// GET /api/voice/phi-state/{session_id}
// Returns current PHI routing state for the session
async function fetchPHIState(sessionId: string): Promise<PHIState> {
  const response = await fetch(`/api/voice/phi-state/${sessionId}`);
  return response.json();
}

// Example usage in a component
function PHIIndicator({ sessionId }: { sessionId: string }) {
  const [state, setState] = useState<PHIState | null>(null);

  useEffect(() => {
    const interval = setInterval(async () => {
      const newState = await fetchPHIState(sessionId);
      setState(newState);
    }, 1000); // Poll every second
    return () => clearInterval(interval);
  }, [sessionId]);

  if (!state) return null;
  return (
    <div title={state.tooltip}>
      {getIcon(state.indicatorIcon)} {state.tooltip}
    </div>
  );
}
```

#### Backend API for Frontend State

```python
# In your FastAPI router
from app.services.phi_stt_router import get_phi_stt_router

@router.get("/api/voice/phi-state/{session_id}")
async def get_phi_state(session_id: str):
    """Get current PHI routing state for frontend indicator."""
    router = get_phi_stt_router()
    state = router.get_frontend_state(session_id)
    if state is None:
        raise HTTPException(404, "Session not found")
    return state
```

### Telemetry Event Types

| Event Type             | Description                            | Payload                        |
| ---------------------- | -------------------------------------- | ------------------------------ |
| `phi.routing_decision` | New routing decision made              | Full PHI state + previous mode |
| `phi.mode_change`      | PHI mode changed (e.g., cloud → local) | From/to modes, reason          |
| `phi.phi_detected`     | PHI entities detected in audio         | Score, entity types            |
| `phi.session_start`    | New PHI session initialized            | Initial state                  |
| `phi.session_end`      | PHI session ended                      | Final mode, had PHI flag       |

## Audit Logging

All PHI routing decisions are logged for compliance:

```python
logger.info("PHI routing decision", extra={
    "session_id": session_id,
    "phi_score": 0.85,
    "routing_decision": "local",
    "detection_signals": ["medication_mention", "condition_name"],
    "provider": "local_whisper",
    "processing_time_ms": 234,
    "model": "whisper-large-v3"
})
```

### Prometheus Metrics

```python
# Routing distribution
stt_routing_total.labels(routing="local").inc()
stt_routing_total.labels(routing="cloud").inc()
stt_routing_total.labels(routing="hybrid").inc()

# PHI detection accuracy
phi_detection_score_histogram.observe(phi_score)

# Latency by routing type
stt_latency_ms.labels(routing="local").observe(234)
```
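The snippet above assumes the metrics are already registered. A minimal sketch of how they could be declared with `prometheus_client` (metric names match the usage above; bucket boundaries are illustrative):

```python
from prometheus_client import Counter, Histogram

stt_routing_total = Counter(
    "stt_routing_total",
    "STT requests by routing decision",
    ["routing"],
)

phi_detection_score_histogram = Histogram(
    "phi_detection_score",
    "Distribution of PHI sensitivity scores",
    buckets=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
)

stt_latency_ms = Histogram(
    "stt_latency_ms",
    "STT latency in milliseconds by routing type",
    ["routing"],
    buckets=[50, 100, 200, 400, 800, 1600, 3200],
)
```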
## Testing

### Unit Tests

```python
import pytest

from app.services.phi_stt_router import PHISTTRouter


@pytest.mark.asyncio
async def test_phi_routing_high_score():
    """High PHI score routes to local Whisper."""
    router = PHISTTRouter()

    # Mock audio with PHI content
    audio = generate_test_audio("I take metformin for my diabetes")

    result = await router.transcribe(audio)

    assert result.routing == "local"
    assert result.phi_score >= 0.7
    assert result.provider == "local_whisper"


@pytest.mark.asyncio
async def test_phi_routing_low_score():
    """Low PHI score routes to cloud."""
    router = PHISTTRouter()

    # Mock audio without PHI
    audio = generate_test_audio("What is the weather today?")

    result = await router.transcribe(audio)

    assert result.routing == "cloud"
    assert result.phi_score < 0.3
```

### Integration Tests

```bash
# Run PHI routing tests
pytest tests/services/test_phi_stt_router.py -v

# Test with real audio samples
pytest tests/integration/test_phi_routing_e2e.py -v --audio-samples ./test_audio/
```

## Best Practices

1. **Default to local for medical context**: If a session involves health topics, bias toward local processing
2. **Cache PHI decisions per session**: Avoid re-evaluating the same session repeatedly (see the sketch after this list)
3. **Monitor latency impact**: Local Whisper adds ~200ms; account for this in latency budgets
4. **Regular model updates**: Update the Whisper model quarterly for accuracy improvements
5. **Audit trail**: Maintain logs of all routing decisions for compliance audits
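A minimal sketch of the per-session caching suggested in practice 2, assuming an in-process dict keyed by session ID (the real router tracks this in its session context, which is not shown here):

```python
# Illustrative per-session routing cache: once a session has gone local, keep it local.
_session_route_cache: dict[str, str] = {}

def route_with_cache(session_id: str, phi_score: float) -> str:
    if _session_route_cache.get(session_id) == "local":
        return "local"  # prior PHI in this session pins routing on-prem
    route = "local" if phi_score >= 0.7 else "hybrid" if phi_score >= 0.3 else "cloud"
    if route == "local":
        _session_route_cache[session_id] = "local"
    return route
```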
## Related Documentation

- [Voice Mode v4.1 Overview](./voice-mode-v4-overview.md)
- [Latency Budgets Guide](./latency-budgets-guide.md)
- [HIPAA Compliance Matrix](../HIPAA_COMPLIANCE_MATRIX.md)