# Voice Mode v4.1.2 Release Notes

**Version:** 4.1.2 (Feature Release)
**Date:** December 2025
**Type:** G2P enhancement and lexicon expansion

---

## Summary

Voice Mode v4.1.2 delivers enhanced grapheme-to-phoneme (G2P) conversion with a multi-source fallback chain and significantly expands multi-language lexicon support with 1,384 total pronunciation entries.

---

## New Features

### EnhancedG2PService

A new G2P service with an intelligent fallback chain for accurate pronunciation generation:

**Fallback Chain (in priority order):**

1. **Medical Pronunciation Cache** (50+ terms, confidence: 0.95)
   - Pre-computed IPA for common medical terms
   - Includes drugs, conditions, and procedures
2. **CMUdict** (English, confidence: 0.9)
   - Carnegie Mellon Pronouncing Dictionary
   - 134,000+ English words with ARPABET phonemes
   - Automatic ARPABET-to-IPA conversion
3. **gruut** (Multi-language, confidence: 0.8)
   - Pure Python G2P for multiple languages
   - Supports: English, German, French, Spanish, Russian, Polish
4. **espeak-ng** (Fallback, confidence: 0.7)
   - System TTS fallback for unsupported terms
   - Broad language coverage including Arabic, CJK
5. **Raw Term Fallback** (Last resort, confidence: 0.3)
   - Returns term wrapped in slashes: /term/
   - Ensures no silent failures

**Key Features:**

- Runtime caching for repeated lookups
- Batch generation for multiple terms
- Language-aware processing
- Comprehensive statistics API

**Usage:**

```python
from app.services.enhanced_g2p_service import EnhancedG2PService

# generate() is awaited, so call it from inside an async function or an async REPL.
g2p = EnhancedG2PService()
result = await g2p.generate("metformin", "en")
# G2PResult(term='metformin', phonemes='mɛtfɔːrmɪn', source='medical_cache', confidence=0.95)
```

### ARPABET-to-IPA Conversion

100+ phoneme mappings for converting CMUdict ARPABET to IPA:

| ARPABET | IPA | Example            |
| ------- | --- | ------------------ |
| AA      | ɑ   | father             |
| AE      | æ   | cat                |
| IY      | i   | beet               |
| SH      | ʃ   | ship               |
| TH      | θ   | think              |
| AA1     | ˈɑ  | (primary stress)   |
| AA2     | ˌɑ  | (secondary stress) |

### Lexicon Expansion

Total coverage increased to 1,384 pronunciation entries:

| Language | Terms   | Status   | Notes                             |
| -------- | ------- | -------- | --------------------------------- |
| Arabic   | 485     | Complete | Quranic vocabulary                |
| English  | 852+334 | Complete | General + Quranic transliteration |
| Spanish  | 210     | Complete | Medical terminology               |
| Chinese  | 160     | Complete | Medical terminology               |
| Japanese | 55      | Expanded | Medical + common terms            |
| Korean   | 55      | Expanded | Medical + common terms            |
| Polish   | 55      | Expanded | Medical + common terms            |
| Russian  | 55      | Expanded | Medical + common terms            |
| Turkish  | 55      | Expanded | Medical + common terms            |

---

## Documentation Updates

### Getting Started Guide

Added a comprehensive Getting Started section to What's New v4.1:

- Voice-First Input Bar usage
- VAD preset selection guide
- Quality badge and PHI indicator explanations
- Thinking feedback configuration

### Screenshot Requirements

Created `docs/voice/screenshots/README.md` with:

- Capture status table for 5 required screenshots
- Annotation guidelines and tools
- Resolution and format requirements

### VAD Preset Terminology

Aligned terminology across all documentation:

| Old Term | New Term  |
| -------- | --------- |
| Quiet    | Sensitive |
| Normal   | Balanced  |
| Noisy    | Relaxed   |

---

## Bug Fixes

- **Turkish lexicon typo**: Fixed "zatürree" → "zatürre" (pneumonia)
- **ElevenLabs test**: Updated model count assertion for new Flash/Turbo v2.5 models

---

## Test Results

```
======================== 869 passed, 34 skipped ========================
```

**New Tests Added:**

- `test_enhanced_g2p_service.py`: 34 integration tests
  - Medical cache tests (4)
  - CMUdict tests (2)
  - Multi-language tests (5)
  - Fallback chain tests (7)
  - Caching tests (2)
  - Batch generation tests (2)
  - ARPABET conversion tests (2)
  - Edge case tests (5)
  - Statistics tests (1)
  - Result dataclass tests (4)

---

## Dependencies

New dependencies added to `requirements.txt`:

```
cmudict>=1.0.12  # CMU Pronouncing Dictionary for English
gruut>=2.4.0     # Multi-language G2P with pure Python implementation
```

---

## Installation

```bash
cd services/api-gateway
pip install -r requirements.txt
```

---

## Upgrade Notes

This release is fully backward compatible with v4.1.1. No configuration changes required.
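
After upgrading, it can be useful to confirm which of the G2P backends named in the fallback chain are present in the environment. The following is a minimal sketch and not part of VoiceAssist: the helper name `available_g2p_backends` is hypothetical, and it assumes that the import names `cmudict` and `gruut` match the packages listed in `requirements.txt` and that the system fallback is exposed as an `espeak-ng` executable on `PATH`.

```python
import importlib.util
import shutil


def available_g2p_backends() -> dict[str, bool]:
    """Report which G2P backends appear to be installed (hypothetical helper)."""
    return {
        # Python packages: check that the top-level module can be located.
        "cmudict": importlib.util.find_spec("cmudict") is not None,
        "gruut": importlib.util.find_spec("gruut") is not None,
        # System package: check for the executable on PATH (assumed name).
        "espeak-ng": shutil.which("espeak-ng") is not None,
    }


if __name__ == "__main__":
    for name, present in available_g2p_backends().items():
        print(f"{name}: {'available' if present else 'missing'}")
```

A backend reported as missing is not necessarily an error: according to the fallback chain above, lookups should fall through to the next source and end at the raw term fallback.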

**Optional Enhancements:**

- Install `espeak-ng` system package for broader language fallback support
- Configure medical pronunciation cache for domain-specific terms

---

## Related Links

- [GitHub Release](https://github.com/mohammednazmy/VoiceAssist/releases/tag/v4.1.2)
- [What's New in Voice Mode v4.1](../voice/whats-new-v4-1.md)
- [Post-v4.1 Roadmap](../voice/roadmap/voice-mode-post-v41-roadmap.md)
- [G2P Alternatives Evaluation](../voice/design/g2p-alternatives-evaluation.md)

---

**Released:** December 4, 2025
**Commit:** 7047d4d
**PR:** #165