# Voice Mode Enhancement - 10 Phase Implementation

> **Status**: ✅ COMPLETE (2025-12-03)
> **All 10 phases implemented with full backend-frontend integration**

This document describes the comprehensive 10-phase enhancement to VoiceAssist's voice mode, transforming it from a functional voice assistant into a human-like conversational partner with medical dictation capabilities.

## Executive Summary

**Primary Goals Achieved:**

1. ✅ Natural, human-like voice interactions
2. ✅ Contextual memory across conversations
3. ✅ Professional medical dictation
4. ✅ Natural backchanneling
5. ✅ Session analytics and feedback collection

**Key External Services:**

- **Hume AI** - Emotion detection from audio (HIPAA BAA available)
- **Deepgram Nova-3 Medical** - Upgraded STT for medical vocabulary
- **ElevenLabs** - TTS with backchanneling support

---

## Phase Implementation Status

| Phase | Name                             | Status | Backend Service                                                                                                   | Frontend Handler            |
| ----- | -------------------------------- | ------ | ----------------------------------------------------------------------------------------------------------------- | --------------------------- |
| 1     | Emotional Intelligence           | ✅     | `emotion_detection_service.py`                                                                                      | `emotion.detected`          |
| 2     | Backchanneling System            | ✅     | `backchannel_service.py`                                                                                            | `backchannel.trigger`       |
| 3     | Prosody Analysis                 | ✅     | `prosody_analysis_service.py`                                                                                       | Integrated                  |
| 4     | Memory & Context                 | ✅     | `memory_context_service.py`                                                                                         | `memory.context_loaded`     |
| 5     | Advanced Turn-Taking             | ✅     | Integrated in pipeline                                                                                              | `turn.state`                |
| 6     | Variable Response Timing         | ✅     | Integrated in pipeline                                                                                              | Timing controls             |
| 7     | Conversational Repair            | ✅     | `repair_strategy_service.py`                                                                                        | Repair flows                |
| 8     | Medical Dictation Core           | ✅     | `dictation_service.py`, `voice_command_service.py`, `note_formatter_service.py`, `medical_vocabulary_service.py`   | `dictation.*`               |
| 9     | Patient Context Integration      | ✅     | `patient_context_service.py`, `dictation_phi_monitor.py`                                                            | `patient.*`, `phi.*`        |
| 10    | Frontend Integration & Analytics | ✅     | `session_analytics_service.py`, `feedback_service.py`                                                               | `analytics.*`, `feedback.*` |

---

## Architecture Overview

```
                              ENHANCED VOICE PIPELINE

User Audio ──┬──> Deepgram Nova-3 (Medical STT) ──> Transcript ──┐
             ├──> Hume AI (Emotion) ──────────────> Emotion ─────┼──> Context Builder
             └──> Prosody Analyzer (from Deepgram) ──> Urgency ──┘

Context Builder ──┬──> Short-term (Redis) ─────────┐
                  ├──> Medium-term (PostgreSQL) ───┼──> Memory Service
                  └──> Long-term (Qdrant vectors) ─┘

Memory + Emotion + Transcript ──> Thinker (GPT-4o) ──> Response

Response ──> Turn Manager ──> TTS (ElevenLabs) ──> User
                 │
                 └──> Backchannel Service (parallel audio)

Session Analytics ──> Metrics + Latency Tracking ──> Feedback Prompts
```

---

## Phase 1: Emotional Intelligence

**Goal:** Detect user emotions from speech and adapt responses accordingly.

### Backend Service

**Location:** `services/api-gateway/app/services/emotion_detection_service.py`

```python
class EmotionDetectionService:
    """
    Wraps Hume AI Expression Measurement API.

    - Analyzes audio chunks (500ms) in parallel with STT
    - Returns: valence, arousal, discrete emotions
    - Caches recent emotion states for trending
    """

    async def analyze_audio_chunk(self, audio: bytes) -> EmotionResult
    async def get_emotion_trend(self, session_id: str) -> EmotionTrend
    def map_emotion_to_response_style(self, emotion: str) -> VoiceStyle
```

### WebSocket Message

```typescript
{
  type: "emotion.detected",
  data: { emotion: string, confidence: number, valence: number, arousal: number }
}
```

### Frontend Handler

In `useThinkerTalkerSession.ts`:

```typescript
onEmotionDetected?: (event: TTEmotionDetectedEvent) => void;
```

### Latency Impact: +50-100ms (parallel, non-blocking)
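To make the emotion-to-style mapping concrete, the following is a minimal sketch of how the `emotion.detected` payload (emotion, valence, arousal) could be translated into delivery adjustments. The `VoiceStyle` fields, emotion labels, and thresholds shown here are illustrative assumptions, not the actual `map_emotion_to_response_style` implementation.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the VoiceStyle returned by the real service.
@dataclass
class VoiceStyle:
    tone: str             # e.g. "calm", "warm", "neutral", "upbeat"
    speaking_rate: float  # multiplier applied to the TTS voice
    empathy_prefix: bool  # whether the response should acknowledge feelings first


def map_emotion_to_response_style(emotion: str, valence: float, arousal: float) -> VoiceStyle:
    """Map a detected emotion onto response-delivery adjustments (illustrative)."""
    if emotion in {"distress", "anxiety", "fear"} or (valence < -0.3 and arousal > 0.5):
        # Anxious or distressed users get a slower, calmer delivery.
        return VoiceStyle(tone="calm", speaking_rate=0.9, empathy_prefix=True)
    if emotion in {"sadness", "disappointment"} or valence < -0.3:
        return VoiceStyle(tone="warm", speaking_rate=0.95, empathy_prefix=True)
    if emotion in {"joy", "excitement"} or (valence > 0.4 and arousal > 0.5):
        return VoiceStyle(tone="upbeat", speaking_rate=1.05, empathy_prefix=False)
    return VoiceStyle(tone="neutral", speaking_rate=1.0, empathy_prefix=False)


print(map_emotion_to_response_style("anxiety", valence=-0.5, arousal=0.7))
```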
---

## Phase 2: Backchanneling System

**Goal:** Natural verbal acknowledgments during user speech.

### Backend Service

**Location:** `services/api-gateway/app/services/backchannel_service.py`

```python
class BackchannelService:
    """
    Generates and manages backchanneling audio.

    - Pre-caches common phrases per voice
    - Triggers based on VAD pause detection
    """

    PHRASES = {
        "en": ["uh-huh", "mm-hmm", "I see", "right", "got it"],
        "ar": ["اها", "نعم", "صح"]
    }

    async def get_backchannel_audio(self, phrase: str, voice_id: str) -> bytes
    def should_trigger(self, session_state: SessionState) -> bool
```

### Timing Logic

- Trigger after 2-3 seconds of continuous user speech
- Only during natural pauses (150-300ms silence)
- Minimum 5 seconds between backchannels
- Never interrupt mid-sentence

### WebSocket Message

```typescript
{
  type: "backchannel.trigger",
  data: { phrase: string, audio_base64: string }
}
```

### Latency Impact: ~0ms (pre-cached audio)
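The timing rules above compose into a simple gate. Below is a minimal, self-contained sketch of that gate; the function and parameter names are hypothetical and this is not the actual `should_trigger` implementation, which operates on the pipeline's `SessionState`.

```python
import time

# Thresholds taken from the timing rules above; the constant names are assumptions.
MIN_SPEECH_BEFORE_TRIGGER_S = 2.0      # user must have been speaking for ~2-3 s
PAUSE_WINDOW_S = (0.150, 0.300)        # only during natural 150-300 ms pauses
MIN_GAP_BETWEEN_BACKCHANNELS_S = 5.0   # at least 5 s between backchannels


def should_trigger(speech_started_at: float,
                   current_pause_s: float,
                   last_backchannel_at: float | None,
                   now: float | None = None) -> bool:
    """Apply the Phase 2 timing rules to decide whether to play a backchannel."""
    now = time.monotonic() if now is None else now
    spoken_long_enough = (now - speech_started_at) >= MIN_SPEECH_BEFORE_TRIGGER_S
    in_natural_pause = PAUSE_WINDOW_S[0] <= current_pause_s <= PAUSE_WINDOW_S[1]
    cooled_down = (last_backchannel_at is None
                   or (now - last_backchannel_at) >= MIN_GAP_BETWEEN_BACKCHANNELS_S)
    return spoken_long_enough and in_natural_pause and cooled_down


# Example: 4 s of continuous speech, 200 ms pause, no prior backchannel -> True
print(should_trigger(speech_started_at=0.0, current_pause_s=0.2,
                     last_backchannel_at=None, now=4.0))
```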
---

## Phase 3: Prosody Analysis

**Goal:** Analyze speech patterns for better intent understanding.

### Backend Service

**Location:** `services/api-gateway/app/services/prosody_analysis_service.py`

```python
@dataclass
class ProsodyAnalysis:
    speech_rate_wpm: float       # Words per minute
    pitch_variance: float        # Emotion indicator
    loudness: float              # Urgency indicator
    pause_patterns: List[float]  # Hesitation detection
    urgency_score: float         # Derived 0-1 score
    confidence_score: float      # Speaker certainty
```

### Integration

- Parses Deepgram's prosody/topics metadata
- Matches response speech rate to user's rate
- Detects uncertainty from pitch patterns

### Latency Impact: +0ms (data from Deepgram)

---

## Phase 4: Memory & Context System

**Goal:** Conversation memory across turns and sessions.

### Backend Service

**Location:** `services/api-gateway/app/services/memory_context_service.py`

```python
class MemoryContextService:
    """Three-tier memory management."""

    async def store_turn_context(self, user_id, session_id, turn) -> None
        # Redis: last 10 turns, TTL = session duration

    async def get_recent_context(self, user_id, session_id, turns=5) -> list
        # Retrieve from Redis

    async def summarize_session(self, session_id) -> SessionContext
        # LLM-generated summary at session end

    async def store_long_term_memory(self, user_id, memory) -> str
        # Store in PostgreSQL + Qdrant vector

    async def retrieve_relevant_memories(self, user_id, query, top_k=5) -> list
        # Semantic search over Qdrant

    async def build_context_window(self, user_id, session_id, query) -> str
        # Assemble optimized context for LLM (max 4K tokens)
```

### WebSocket Message

```typescript
{
  type: "memory.context_loaded",
  data: { memories: Memory[], relevance_scores: number[] }
}
```
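The sketch below illustrates how `build_context_window` might assemble the three tiers under a token budget: recent turns first, then the session summary, then semantically relevant long-term memories. The token heuristic, tier ordering, and parameter names are assumptions for illustration, not the real service internals.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token.
    return max(1, len(text) // 4)


def build_context_window(recent_turns: list[str],
                         session_summary: str,
                         long_term_memories: list[str],
                         max_tokens: int = 4000) -> str:
    """Assemble an LLM context window from the three memory tiers (illustrative)."""
    parts: list[str] = []
    budget = max_tokens

    # 1. Short-term tier (Redis): up to the last 10 turns, oldest to newest.
    for turn in recent_turns[-10:]:
        cost = estimate_tokens(turn)
        if cost > budget:
            break
        parts.append(turn)
        budget -= cost

    # 2. Medium-term tier (PostgreSQL): the LLM-generated session summary.
    if session_summary and estimate_tokens(session_summary) <= budget:
        parts.append(f"[Session summary] {session_summary}")
        budget -= estimate_tokens(session_summary)

    # 3. Long-term tier (Qdrant): relevant memories, best-scoring first.
    for memory in long_term_memories:
        cost = estimate_tokens(memory)
        if cost > budget:
            break
        parts.append(f"[Memory] {memory}")
        budget -= cost

    return "\n".join(parts)
```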
---

## Phase 5 & 6: Turn-Taking and Response Timing

**Goal:** Fluid conversation flow with natural turn transitions and human-like timing.

### Turn States

```python
class TurnTakingState(Enum):
    USER_TURN = "user_turn"
    TRANSITION = "transition"  # Brief transition window
    AI_TURN = "ai_turn"
    OVERLAP = "overlap"        # Both speaking (barge-in)
```

### Response Timing Configuration

```python
RESPONSE_TIMING = {
    "urgent": {"delay_ms": 0, "use_filler": False},     # Medical emergency
    "simple": {"delay_ms": 200, "use_filler": False},   # Yes/no, confirmations
    "complex": {"delay_ms": 600, "use_filler": True},   # Multi-part questions
    "clarification": {"delay_ms": 0, "use_filler": False}
}
```

### WebSocket Message

```typescript
{
  type: "turn.state",
  data: { state: "user_turn" | "transition" | "ai_turn" }
}
```

---

## Phase 7: Conversational Repair

**Goal:** Graceful handling of misunderstandings.

### Backend Service

**Location:** `services/api-gateway/app/services/repair_strategy_service.py`

```python
class RepairStrategy(Enum):
    ECHO_CHECK = "echo_check"              # "So you're asking about X?"
    CLARIFY_SPECIFIC = "clarify_specific"  # "Did you mean X or Y?"
    REQUEST_REPHRASE = "request_rephrase"  # "Could you say that differently?"
    PARTIAL_ANSWER = "partial_answer"      # "I'm not sure, but..."
```

### Features

- Confidence scoring for responses
- Clarifying questions when confidence < 0.7
- Natural upward inflection for questions (SSML)
- Frustration detection from repeated corrections

---

## Phase 8: Medical Dictation Core

**Goal:** Hands-free clinical documentation.

### Backend Services

**Location:** `services/api-gateway/app/services/`

#### `dictation_service.py`

```python
class DictationState(Enum):
    IDLE = "idle"
    LISTENING = "listening"
    PROCESSING = "processing"
    PAUSED = "paused"
    REVIEWING = "reviewing"

class NoteType(Enum):
    SOAP = "soap"          # Subjective, Objective, Assessment, Plan
    HP = "h_and_p"         # History and Physical
    PROGRESS = "progress"  # Progress Note
    PROCEDURE = "procedure"
    CUSTOM = "custom"
```

#### `voice_command_service.py`

```python
# Navigation
"go to subjective", "move to objective", "next section", "previous section"

# Formatting
"new paragraph", "bullet point", "number one/two/three"

# Editing
"delete that", "scratch that", "read that back", "undo"

# Clinical
"check interactions", "what's the dosing for", "show labs", "show medications"

# Control
"start dictation", "pause", "stop dictation", "save note"
```

#### `note_formatter_service.py`

- LLM-assisted note formatting
- Grammar correction preserving medical terminology
- Auto-punctuation and abbreviation handling

#### `medical_vocabulary_service.py`

- Specialty-specific keyword sets
- User-customizable vocabulary
- Medical abbreviation expansion

### WebSocket Messages

```typescript
{ type: "dictation.state", data: { state: DictationState, note_type: NoteType } }
{ type: "dictation.section_update", data: { section: string, content: string } }
{ type: "dictation.section_change", data: { previous: string, current: string } }
{ type: "dictation.command", data: { command: string, executed: boolean } }
```
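As a rough illustration of how the command phrases above could be routed, here is a minimal prefix-matching sketch. The category names, dictionary layout, and matching strategy are assumptions for this sketch and not the actual `voice_command_service.py` logic.

```python
# Illustrative command matcher for the phrase groups listed above.
COMMAND_GROUPS: dict[str, list[str]] = {
    "navigation": ["go to subjective", "move to objective", "next section", "previous section"],
    "formatting": ["new paragraph", "bullet point"],
    "editing": ["delete that", "scratch that", "read that back", "undo"],
    "clinical": ["check interactions", "what's the dosing for", "show labs", "show medications"],
    "control": ["start dictation", "pause", "stop dictation", "save note"],
}


def match_command(transcript: str) -> tuple[str, str] | None:
    """Return (category, command) if the utterance starts with a known command phrase."""
    text = transcript.lower().strip()
    for category, phrases in COMMAND_GROUPS.items():
        for phrase in phrases:
            if text.startswith(phrase):
                return category, phrase
    return None  # Not a command: treat as dictated note content.


print(match_command("go to subjective please"))    # ('navigation', 'go to subjective')
print(match_command("patient denies chest pain"))  # None
```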
---

## Phase 9: Patient Context Integration

**Goal:** Context-aware clinical assistance with HIPAA compliance.

### Backend Services

#### `patient_context_service.py`

```python
class PatientContextService:
    async def get_context_for_dictation(self, user_id, patient_id) -> DictationContext
    def generate_context_prompts(self, context) -> List[str]
        # "I see 3 recent lab results. Would you like me to summarize them?"
```

#### `dictation_phi_monitor.py`

- Real-time PHI detection during dictation
- Alert if unexpected PHI spoken outside patient context

### HIPAA Audit Events

```python
# Added to audit_service.py
DICTATION_STARTED = "dictation_started"
PATIENT_CONTEXT_ACCESSED = "patient_context_accessed"
NOTE_SAVED = "note_saved"
PHI_DETECTED = "phi_detected"
```

### WebSocket Messages

```typescript
{ type: "patient.context_loaded", data: { patientId: string, context: PatientContext } }
{ type: "phi.alert", data: { severity: string, message: string, detected_phi: string[] } }
```
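To show how a `phi.alert` payload fits together, here is a deliberately simplified sketch that flags PHI-like patterns spoken outside an active patient context. The two regexes and the helper name are assumptions; the real `dictation_phi_monitor.py` uses a richer detector than this.

```python
import re

# Illustrative PHI patterns only; not the production detector.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def check_for_phi(transcript: str, has_patient_context: bool) -> dict | None:
    """Return a phi.alert-style payload when PHI appears outside a patient context."""
    detected = [name for name, pattern in PHI_PATTERNS.items() if pattern.search(transcript)]
    if detected and not has_patient_context:
        return {
            "type": "phi.alert",
            "data": {
                "severity": "warning",
                "message": "Possible PHI spoken outside an active patient context.",
                "detected_phi": detected,
            },
        }
    return None


print(check_for_phi("callback number is 555-867-5309", has_patient_context=False))
```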
---

## Phase 10: Frontend Integration & Analytics

**Goal:** Session analytics, feedback collection, and full frontend integration.

### Backend Services

#### `session_analytics_service.py`

**Location:** `services/api-gateway/app/services/session_analytics_service.py`

```python
class SessionAnalyticsService:
    """
    Comprehensive voice session analytics tracking.

    Tracks:
    - Latency metrics (STT, LLM, TTS, E2E) with percentiles
    - Interaction counts (utterances, responses, tool calls, barge-ins)
    - Quality metrics (confidence scores, turn-taking, repairs)
    - Dictation-specific metrics
    """

    def create_session(self, session_id: str, user_id: Optional[str], mode: str,
                       on_analytics_update: Optional[Callable]) -> SessionAnalytics
    def record_latency(self, session_id: str, latency_type: str, latency_ms: float) -> None
    def record_interaction(self, session_id: str, interaction_type: InteractionType,
                           word_count: int, duration_ms: float) -> None
    def record_emotion(self, session_id: str, emotion: str, valence: float, arousal: float) -> None
    def record_barge_in(self, session_id: str) -> None
    def record_repair(self, session_id: str) -> None
    def record_error(self, session_id: str, error_type: str, message: str) -> None
    def end_session(self, session_id: str) -> Optional[Dict[str, Any]]
```

#### `feedback_service.py`

**Location:** `services/api-gateway/app/services/feedback_service.py`

```python
class FeedbackService:
    """
    User feedback collection for voice sessions.

    Features:
    - Quick thumbs up/down during session
    - Detailed session ratings with categories
    - Bug reports and suggestions
    - Feedback prompts based on session context
    """

    def record_quick_feedback(self, session_id: str, user_id: Optional[str] = None,
                              thumbs_up: bool = True,
                              message_id: Optional[str] = None) -> FeedbackItem
    def record_session_rating(self, session_id: str, user_id: Optional[str] = None,
                              rating: int = 5, categories: Optional[Dict[str, int]] = None,
                              comment: Optional[str] = None) -> List[FeedbackItem]
    def get_feedback_prompts(self, session_id: str, session_duration_ms: float = 0,
                             interaction_count: int = 0,
                             has_errors: bool = False) -> List[FeedbackPrompt]
    def generate_analytics_report(self, session_ids: Optional[List[str]] = None) -> Dict[str, Any]
```

### Analytics Data Structure

```typescript
interface TTSessionAnalytics {
  sessionId: string;
  userId: string | null;
  phase: string;
  mode: string;
  timing: {
    startedAt: string;
    endedAt: string | null;
    durationMs: number;
  };
  latency: {
    stt: { count: number; total: number; min: number; max: number; p50: number; p95: number; p99: number };
    llm: { count: number; total: number; min: number; max: number; p50: number; p95: number; p99: number };
    tts: { count: number; total: number; min: number; max: number; p50: number; p95: number; p99: number };
    e2e: { count: number; total: number; min: number; max: number; p50: number; p95: number; p99: number };
  };
  interactions: {
    counts: Record<string, number>;
    words: { user: number; assistant: number };
    speakingTimeMs: { user: number; assistant: number };
  };
  quality: {
    sttConfidence: { count: number; total: number; min: number; max: number };
    aiConfidence: { count: number; total: number; min: number; max: number };
    emotion: { dominant: string | null; valence: number; arousal: number };
    turnTaking: { bargeIns: number; overlaps: number; smoothTransitions: number };
    repairs: number;
  };
  dictation: {
    sectionsEdited: string[];
    commandsExecuted: number;
    wordsTranscribed: number;
  } | null;
  errors: {
    count: number;
    details: Array<{ timestamp: string; type: string; message: string }>;
  };
}
```

### WebSocket Messages

```typescript
// Analytics
{ type: "analytics.update", data: TTSessionAnalytics }
{ type: "analytics.session_ended", data: TTSessionAnalytics }

// Feedback
{ type: "feedback.prompts", data: { prompts: TTFeedbackPrompt[] } }
{ type: "feedback.recorded", data: { thumbsUp: boolean, messageId: string | null } }
```

### Frontend Handlers

In `useThinkerTalkerSession.ts`:

```typescript
// Phase 10 callbacks
onAnalyticsUpdate?: (analytics: TTSessionAnalytics) => void;
onSessionEnded?: (analytics: TTSessionAnalytics) => void;
onFeedbackPrompts?: (event: TTFeedbackPromptsEvent) => void;
onFeedbackRecorded?: (event: TTFeedbackRecordedEvent) => void;
```
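For intuition on the latency fields in `TTSessionAnalytics`, here is a minimal sketch of per-stage latency tracking with nearest-rank percentiles. The `LatencyTracker` class is a standalone illustration, not the internals of `SessionAnalyticsService.record_latency`.

```python
from collections import defaultdict


def percentile(sorted_values: list[float], p: float) -> float:
    """Nearest-rank percentile over an already-sorted list."""
    if not sorted_values:
        return 0.0
    k = round(p / 100 * (len(sorted_values) - 1))
    return sorted_values[int(k)]


class LatencyTracker:
    """Collect per-stage latency samples and summarize them like the analytics payload."""

    def __init__(self) -> None:
        self._samples: dict[str, list[float]] = defaultdict(list)

    def record(self, latency_type: str, latency_ms: float) -> None:
        self._samples[latency_type].append(latency_ms)

    def summary(self, latency_type: str) -> dict[str, float]:
        values = sorted(self._samples[latency_type])
        return {
            "count": len(values),
            "total": sum(values),
            "min": values[0] if values else 0.0,
            "max": values[-1] if values else 0.0,
            "p50": percentile(values, 50),
            "p95": percentile(values, 95),
            "p99": percentile(values, 99),
        }


tracker = LatencyTracker()
for ms in (120, 140, 180, 95, 210):
    tracker.record("stt", ms)
print(tracker.summary("stt"))
```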
---

## Complete WebSocket Protocol

### All Message Types

| Phase | Message Type               | Direction       | Description                    |
| ----- | -------------------------- | --------------- | ------------------------------ |
| 1     | `emotion.detected`         | Server → Client | User emotion detected          |
| 2     | `backchannel.trigger`      | Server → Client | Play backchannel audio         |
| 4     | `memory.context_loaded`    | Server → Client | Relevant memories loaded       |
| 5     | `turn.state`               | Server → Client | Turn state changed             |
| 8     | `dictation.state`          | Server → Client | Dictation state changed        |
| 8     | `dictation.section_update` | Server → Client | Section content updated        |
| 8     | `dictation.section_change` | Server → Client | Current section changed        |
| 8     | `dictation.command`        | Server → Client | Voice command executed         |
| 9     | `patient.context_loaded`   | Server → Client | Patient context loaded         |
| 9     | `phi.alert`                | Server → Client | PHI detected alert             |
| 10    | `analytics.update`         | Server → Client | Session analytics update       |
| 10    | `analytics.session_ended`  | Server → Client | Final session analytics        |
| 10    | `feedback.prompts`         | Server → Client | Feedback prompts               |
| 10    | `feedback.recorded`        | Server → Client | Feedback recorded confirmation |

---

## Integration Points

### Voice Pipeline Service

**Location:** `services/api-gateway/app/services/voice_pipeline_service.py`

The voice pipeline service orchestrates all 10 phases:

```python
class VoicePipelineService:
    # Phase 1-9 services
    _emotion_detector: EmotionDetectionService
    _backchannel_service: BackchannelService
    _prosody_analyzer: ProsodyAnalysisService
    _memory_service: MemoryContextService
    _repair_service: RepairStrategyService
    _dictation_service: DictationService
    _voice_command_service: VoiceCommandService
    _note_formatter: NoteFormatterService
    _medical_vocabulary: MedicalVocabularyService
    _patient_context_service: PatientContextService
    _phi_monitor: DictationPHIMonitor

    # Phase 10 services
    _analytics: SessionAnalytics
    _analytics_service: SessionAnalyticsService
    _feedback_service: FeedbackService

    async def start(self):
        # Initialize analytics session
        self._analytics = self._analytics_service.create_session(
            session_id=self.session_id,
            user_id=self.user_id,
            mode="dictation" if self.config.mode == PipelineMode.DICTATION else "conversation",
            on_analytics_update=self._send_analytics_update,
        )

    async def stop(self):
        # Send feedback prompts
        prompts = self._feedback_service.get_feedback_prompts(...)
        await self._on_message(PipelineMessage(type="feedback.prompts", ...))

        # Finalize analytics
        final_analytics = self._analytics_service.end_session(self.session_id)
        await self._on_message(PipelineMessage(type="analytics.session_ended", ...))
```

### Frontend Hook

**Location:** `apps/web-app/src/hooks/useThinkerTalkerSession.ts`

All 10 phases integrated with callbacks:

```typescript
export interface UseThinkerTalkerSessionOptions {
  // ... existing options ...

  // Phase 1: Emotion
  onEmotionDetected?: (event: TTEmotionDetectedEvent) => void;

  // Phase 2: Backchanneling
  onBackchannelTrigger?: (event: TTBackchannelTriggerEvent) => void;

  // Phase 4: Memory
  onMemoryContextLoaded?: (event: TTMemoryContextLoadedEvent) => void;

  // Phase 5: Turn-taking
  onTurnStateChange?: (event: TTTurnStateChangeEvent) => void;

  // Phase 8: Dictation
  onDictationStateChange?: (event: TTDictationStateChangeEvent) => void;
  onDictationSectionUpdate?: (event: TTDictationSectionUpdateEvent) => void;
  onDictationSectionChange?: (event: TTDictationSectionChangeEvent) => void;
  onDictationCommand?: (event: TTDictationCommandEvent) => void;

  // Phase 9: Patient Context
  onPatientContextLoaded?: (event: TTPatientContextLoadedEvent) => void;
  onPHIAlert?: (event: TTPHIAlertEvent) => void;

  // Phase 10: Analytics & Feedback
  onAnalyticsUpdate?: (analytics: TTSessionAnalytics) => void;
  onSessionEnded?: (analytics: TTSessionAnalytics) => void;
  onFeedbackPrompts?: (event: TTFeedbackPromptsEvent) => void;
  onFeedbackRecorded?: (event: TTFeedbackRecordedEvent) => void;
}
```

---

## File Reference

### Backend Services (New)

| File                            | Phase | Purpose                   |
| ------------------------------- | ----- | ------------------------- |
| `emotion_detection_service.py`  | 1     | Hume AI emotion detection |
| `backchannel_service.py`        | 2     | Natural acknowledgments   |
| `prosody_analysis_service.py`   | 3     | Speech pattern analysis   |
| `memory_context_service.py`     | 4     | Three-tier memory system  |
| `repair_strategy_service.py`    | 7     | Conversational repair     |
| `dictation_service.py`          | 8     | Medical dictation state   |
| `voice_command_service.py`      | 8     | Voice command processing  |
| `note_formatter_service.py`     | 8     | Note formatting           |
| `medical_vocabulary_service.py` | 8     | Medical terminology       |
| `patient_context_service.py`    | 9     | Patient context           |
| `dictation_phi_monitor.py`      | 9     | PHI monitoring            |
| `session_analytics_service.py`  | 10    | Session analytics         |
| `feedback_service.py`           | 10    | User feedback             |

### Backend Services (Modified)

| File                        | Changes                                            |
| --------------------------- | -------------------------------------------------- |
| `voice_pipeline_service.py` | Orchestrates all 10 phases, analytics integration  |
| `thinker_service.py`        | Emotion context, repair strategies                 |
| `talker_service.py`         | Variable timing, backchanneling                    |
| `streaming_stt_service.py`  | Nova-3 Medical, prosody features                   |
| `audit_service.py`          | Dictation audit events                             |

### Frontend

| File                         | Purpose                   |
| ---------------------------- | ------------------------- |
| `useThinkerTalkerSession.ts` | All message type handlers |

---

## Success Metrics

| Metric                     | Target                | Measurement              |
| -------------------------- | --------------------- | ------------------------ |
| Response latency           | <200ms                | P95 from analytics       |
| Emotion detection accuracy | >80%                  | Manual validation        |
| User satisfaction          | >4.2/5                | Feedback ratings         |
| Dictation word accuracy    | >95%                  | Medical vocabulary tests |
| Memory retrieval relevance | >0.7                  | Cosine similarity        |
| Turn-taking smoothness     | <5% interruption rate | Session analytics        |

---

## Related Documentation

- [VOICE_MODE_PIPELINE.md](./VOICE_MODE_PIPELINE.md) - Core pipeline architecture
- [VOICE_MODE_SETTINGS_GUIDE.md](./VOICE_MODE_SETTINGS_GUIDE.md) - User settings
- [VOICE_STATE_2025-11-29.md](./VOICE_STATE_2025-11-29.md) - Voice state snapshot

---

_Last updated: 2025-12-03_
_All 10 phases implemented and integrated_