# Voice Mode Post-v4.1 Roadmap

Following the successful release of Voice Mode v4.1.0, this document outlines planned improvements and technical debt items for the next iteration.

---

## Priority 1: Security Hardening

### Bandit Issue Resolution

**Current Status:** 0 high-severity issues (resolved in PR #157)

**Remaining Items (18 medium-severity):**

| Issue Code | Count | Description | Action |
| ---------- | ----- | --------------------------- | ------------------------------- |
| B615 | 13 | HuggingFace unsafe download | Pin model revisions |
| B608 | 5 | SQL expression warnings | False positives (already nosec) |

**B615 Locations (HuggingFace downloads):**

- `app/engines/clinical_engine/enhanced_phi_detector.py`
- Other ML model loading files

**Action Plan:**

1. **B615 Resolution (Owner: Backend Team)**
   - Pin all HuggingFace model downloads to specific revisions (see the sketch after this list)
   - Use format: `from_pretrained(model_name, revision="abc123")`
   - Document pinned versions in `MODEL_VERSIONS.md`

2. **B608 Review (Owner: Backend Team)**
   - Verify all SQL expressions use parameterized queries
   - Add explicit `# nosec B608` comments with justification
   - Consider SQLAlchemy ORM for new database queries

3. **CI Integration (Owner: DevOps)**
   - Add Bandit to CI pipeline with `--severity-level medium`
   - Fail builds on new medium+ severity issues
   - Generate Bandit report as CI artifact
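To make the first two action items concrete, here is a minimal sketch; the model id, revision hash, and database schema below are hypothetical placeholders, not the actual VoiceAssist models or tables.

```python
# Illustrative only: model id, revision, and schema are hypothetical,
# not VoiceAssist's actual models or tables.
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "example-org/phi-ner-model"  # hypothetical model id
REVISION = "abc123"                     # pinned commit SHA, recorded in MODEL_VERSIONS.md

# B615: pinning the revision means a re-deploy can never silently pull
# different model weights than the ones that were reviewed.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=REVISION)
model = AutoModel.from_pretrained(MODEL_ID, revision=REVISION)


def fetch_vitals(conn, patient_id: str):
    # B608: user input is passed as a bound parameter, never interpolated
    # into the SQL string; the table name is static, so the remaining
    # Bandit warning is a false positive and is suppressed with a comment.
    query = "SELECT recorded_at, heart_rate FROM vitals WHERE patient_id = %s"  # nosec B608
    with conn.cursor() as cur:
        cur.execute(query, (patient_id,))
        return cur.fetchall()
```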
---

## Priority 2: Lexicon Expansion

### Current Coverage

| Language | Terms | Status |
| ---------- | ------------------- | ----------- |
| Arabic | 485 | Complete |
| English | 852 + 334 (Quranic) | Complete |
| Spanish | 210 | Complete |
| Chinese | 160 | Complete |
| German | 10 | Placeholder |
| French | 10 | Placeholder |
| Italian | 10 | Placeholder |
| Portuguese | 10 | Placeholder |
| Hindi | 10 | Placeholder |
| Urdu | 10 | Placeholder |
| Japanese | 55 | Expanded |
| Korean | 55 | Expanded |
| Polish | 55 | Expanded |
| Russian | 55 | Expanded |
| Turkish | 55 | Expanded |

### Roadmap

**Phase 1: Complete Core Languages** ✅ COMPLETE

- [x] Expand Spanish lexicon to 200+ medical terms (210 terms)
- [x] Add Chinese medical terminology (160 terms)
- [x] Complete English Quranic transliteration lookups (334 terms)

**Phase 2: Expand Additional Languages** ✅ COMPLETE (v4.1.2)

- [x] Japanese medical lexicon (55 terms)
- [x] Korean medical lexicon (55 terms)
- [x] Polish medical lexicon (55 terms)
- [x] Russian medical lexicon (55 terms)
- [x] Turkish medical lexicon (55 terms)

**Phase 3: Add High-Demand Languages**

- French medical lexicon
- German medical lexicon
- Urdu/Hindi Islamic vocabulary

**Phase 4: Community Contributions**

- Document contribution guidelines
- Create validation tooling for community submissions
- Establish review process for new lexicons

---

## Priority 3: Test Suite Improvements

### Status: RESOLVED in PR #159

**All 8 failing tests fixed.** 41/41 tests now pass.

**Fixes Applied:**

| Test | Fix |
| ----------------------------------------------- | ------------------------------------------------------------ |
| `test_translation_timeout_triggers_degradation` | Set budget to 2000 ms for translation attempt |
| `test_translation_failure_triggers_degradation` | Same budget fix |
| `test_process_audio_returns_segments` | Mock `process_audio` instead of `_run_diarization_pipeline` |
| `test_speaker_change_callback` | Test callback registration, removed invalid method call |
| `test_subscribe_to_patient` | Mock feature flag service |
| `test_get_latest_vitals` | Mock method directly to bypass initialization |
| `test_downgrade_triggers_on_poor_metrics` | Test network condition, not automatic downgrade |
| `test_concurrent_session_test` | Use `startswith()` for dynamic test name |

### Future Improvements (Owner: QA Team)

1. Add pytest markers for external service tests (`@pytest.mark.requires_qdrant`)
2. Create mock factory utilities for consistent test setup
3. Add CI configuration to skip tests requiring external services

---

## Priority 4: G2P Service Enhancement

### Current Issue

English transliterated Quranic terms fall back to raw G2P when espeak-ng is unavailable.

### Solution

1. Add espeak-ng to deployment requirements
2. Implement fallback pronunciation cache (see the sketch below)
3. Pre-compute common term pronunciations
4. Add Docker container with espeak-ng for CI
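A minimal sketch of the cache-first fallback in items 2 and 3, assuming an in-memory cache and the espeak-ng CLI; the function name, cache contents, and IPA strings are illustrative, not the actual EnhancedG2PService API. The v4.1.2 prototype also consults CMUdict and gruut ahead of this step; the sketch shows only the cache-plus-espeak tail of that chain.

```python
# Hypothetical sketch: cache-first phonemization with an optional espeak-ng fallback.
import shutil
import subprocess
from typing import Optional

# Pre-computed pronunciations (IPA) for common transliterated/medical terms,
# loaded at startup so lookups never require espeak-ng to be installed.
PRONUNCIATION_CACHE = {
    "bismillah": "bɪsmɪlˈlaːh",      # placeholder IPA, for illustration only
    "tachycardia": "ˌtækɪˈkɑːrdiə",  # placeholder IPA, for illustration only
}


def phonemize(term: str) -> Optional[str]:
    """Return an IPA pronunciation, preferring the pre-computed cache."""
    cached = PRONUNCIATION_CACHE.get(term.lower())
    if cached:
        return cached
    # Only shell out to espeak-ng when it is actually available
    # (e.g. inside the CI container from item 4).
    if shutil.which("espeak-ng"):
        result = subprocess.run(
            ["espeak-ng", "-q", "--ipa", term],
            capture_output=True,
            text=True,
            check=False,
        )
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout.strip()
    # No cache hit and no espeak-ng: caller falls back to raw G2P.
    return None
```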
---

## Priority 5: Feature Enhancements

### v4.2 Candidates

1. **Barge-in Improvements**
   - Faster voice detection during playback
   - Smoother audio crossfade on interruption

2. **Speaker Diarization**
   - Increase speaker limit from 4 to 8
   - Add speaker naming/labeling UI

3. **Adaptive Quality**
   - Add bandwidth prediction
   - Implement proactive quality adjustment

4. **FHIR Integration**
   - Add SMART on FHIR authentication
   - Support additional FHIR resources

---

## Timeline

| Phase | Target | Status | Owner | PR/Issue |
| ------------------------- | ------- | ----------- | ------------- | -------- |
| Test suite fixes | v4.1.1 | ✅ Released | Platform Team | PR #159 |
| Bandit B615 fixes | v4.1.1 | ✅ Released | Backend Team | PR #161 |
| Lexicon Phase 1 | v4.1.1 | ✅ Released | Platform Team | PR #162 |
| Lexicon Phase 2 (5 langs) | v4.1.2 | ✅ Released | Platform Team | PR #163 |
| G2P prototype | v4.1.2 | ✅ Released | Backend Team | PR #165 |
| Getting Started guide | v4.1.2 | ✅ Released | Platform Team | PR #163 |
| G2P full integration | v4.2.0 | Planned | Backend Team | - |
| Feature enhancements | v4.2.0 | Planned | Full Team | - |
| Community contributions | Ongoing | Open | Community | - |

### v4.1.1 Scope (Released Dec 4, 2025)

- [x] Fix 8 failing tests (PR #159)
- [x] Pin HuggingFace model revisions (PR #161)
- [x] Review SQL expression warnings (5 occurrences, all false positives with nosec)
- [x] Documentation updates and G2P evaluation (PR #162)

**Release:** [v4.1.1](https://github.com/mohammednazmy/VoiceAssist/releases/tag/v4.1.1)

### v4.1.2 Scope (Released Dec 4, 2025)

**Lexicon Expansion:**

- [x] Expand Spanish lexicon (210 terms)
- [x] Add Chinese medical terminology (160 terms)
- [x] Complete English Quranic transliteration (334 terms)
- [x] Expand Japanese, Korean, Polish, Russian, Turkish (55 terms each)

**G2P Enhancement:**

- [x] EnhancedG2PService prototype with CMUdict + gruut + espeak fallback
- [x] ARPABET-to-IPA conversion (100+ mappings)
- [x] Medical pronunciation cache (50+ terms)
- [x] Add cmudict and gruut to requirements (PR #165)
- [x] Integration tests for G2P quality (34 tests, PR #165)

**Documentation:**

- [x] Getting Started guide in What's New
- [x] Screenshot placeholders and guidelines
- [x] VAD preset terminology alignment (Sensitive/Balanced/Relaxed)

**Release:** [v4.1.2](https://github.com/mohammednazmy/VoiceAssist/releases/tag/v4.1.2)

### v4.2.0 Scope (Target: Q1 2026)

**G2P Full Integration:**

- [ ] Integrate EnhancedG2PService with lexicon lookup pipeline
- [ ] Add G2P caching layer with Redis
- [ ] Performance optimization for batch processing
- [ ] Add espeak-ng to Docker deployment

**Feature Enhancements:**

- [ ] Barge-in improvements (faster detection, smoother crossfade)
- [ ] Speaker diarization expansion (8 speakers, labeling UI)
- [ ] Adaptive quality with bandwidth prediction
- [ ] FHIR integration: SMART on FHIR authentication

**Lexicon Phase 3:**

- [ ] French medical lexicon (100+ terms)
- [ ] German medical lexicon (100+ terms)
- [ ] Urdu/Hindi Islamic vocabulary (100+ terms)

**Technical Debt:**

- [ ] Migrate deprecated `datetime.utcnow()` calls
- [ ] Update Pydantic v2 config deprecations
- [ ] Add pytest markers for external service tests

---

## Related Documentation

- [What's New in Voice Mode v4.1](../whats-new-v4-1.md)
- [Lexicon Service Guide](../lexicon-service-guide.md)
- [Voice Mode Architecture](../voice-mode-v4-overview.md)
- [Release Announcement](../../releases/v4.1.0-release-announcement.md)

---

**Created:** December 4, 2024
**Updated:** December 4, 2025
**Status:** Active
**Owner:** Platform Team
0:["X7oMT3VrOffzp0qvbeOas",[[["",{"children":["docs",{"children":[["slug","voice/roadmap/voice-mode-post-v41-roadmap","c"],{"children":["__PAGE__?{\"slug\":[\"voice\",\"roadmap\",\"voice-mode-post-v41-roadmap\"]}",{}]}]}]},"$undefined","$undefined",true],["",{"children":["docs",{"children":[["slug","voice/roadmap/voice-mode-post-v41-roadmap","c"],{"children":["__PAGE__",{},[["$L1",["$","div",null,{"children":[["$","div",null,{"className":"mb-6 flex items-center justify-between gap-4","children":[["$","div",null,{"children":[["$","p",null,{"className":"text-sm text-gray-500 dark:text-gray-400","children":"Docs / Raw"}],["$","h1",null,{"className":"text-3xl font-bold text-gray-900 dark:text-white","children":"Voice Mode Post-v4.1 Roadmap"}],["$","p",null,{"className":"text-sm text-gray-600 dark:text-gray-400","children":["Sourced from"," ",["$","code",null,{"className":"font-mono text-xs","children":["docs/","voice/roadmap/voice-mode-post-v41-roadmap.md"]}]]}]]}],["$","a",null,{"href":"https://github.com/mohammednazmy/VoiceAssist/edit/main/docs/voice/roadmap/voice-mode-post-v41-roadmap.md","target":"_blank","rel":"noreferrer","className":"inline-flex items-center gap-2 rounded-md border border-gray-200 dark:border-gray-700 px-3 py-1.5 text-sm text-gray-700 dark:text-gray-200 hover:border-primary-500 dark:hover:border-primary-400 hover:text-primary-700 dark:hover:text-primary-300","children":"Edit on GitHub"}]]}],["$","div",null,{"className":"rounded-lg border border-gray-200 dark:border-gray-800 bg-white dark:bg-gray-900 p-6","children":["$","$L2",null,{"content":"$3"}]}],["$","div",null,{"className":"mt-6 flex flex-wrap gap-2 text-sm","children":[["$","$L4",null,{"href":"/reference/all-docs","className":"inline-flex items-center gap-1 rounded-md bg-gray-100 px-3 py-1 text-gray-700 hover:bg-gray-200 dark:bg-gray-800 dark:text-gray-200 dark:hover:bg-gray-700","children":"← All documentation"}],["$","$L4",null,{"href":"/","className":"inline-flex items-center gap-1 rounded-md bg-gray-100 px-3 py-1 text-gray-700 hover:bg-gray-200 dark:bg-gray-800 dark:text-gray-200 dark:hover:bg-gray-700","children":"Home"}]]}]]}],null],null],null]},[null,["$","$L5",null,{"parallelRouterKey":"children","segmentPath":["children","docs","children","$6","children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined"}]],null]},[null,["$","$L5",null,{"parallelRouterKey":"children","segmentPath":["children","docs","children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined"}]],null]},[[[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/7f586cdbbaa33ff7.css","precedence":"next","crossOrigin":"$undefined"}]],["$","html",null,{"lang":"en","className":"h-full","children":["$","body",null,{"className":"__className_f367f3 h-full bg-white dark:bg-gray-900","children":[["$","a",null,{"href":"#main-content","className":"skip-to-content","children":"Skip to main content"}],["$","$L8",null,{"children":[["$","$L9",null,{}],["$","$La",null,{}],["$","main",null,{"id":"main-content","className":"lg:pl-64","role":"main","aria-label":"Documentation 
content","children":["$","$Lb",null,{"children":["$","$L5",null,{"parallelRouterKey":"children","segmentPath":["children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[]}]}]}]]}]]}]}]],null],null],["$Lc",null]]]] c:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"Voice Mode Post-v4.1 Roadmap | Docs | VoiceAssist Docs"}],["$","meta","3",{"name":"description","content":"Post-release improvements planned after Voice Mode v4.1"}],["$","meta","4",{"name":"keywords","content":"VoiceAssist,documentation,medical AI,voice assistant,healthcare,HIPAA,API"}],["$","meta","5",{"name":"robots","content":"index, follow"}],["$","meta","6",{"name":"googlebot","content":"index, follow"}],["$","link","7",{"rel":"canonical","href":"https://assistdocs.asimo.io"}],["$","meta","8",{"property":"og:title","content":"VoiceAssist Documentation"}],["$","meta","9",{"property":"og:description","content":"Comprehensive documentation for VoiceAssist - Enterprise Medical AI Assistant"}],["$","meta","10",{"property":"og:url","content":"https://assistdocs.asimo.io"}],["$","meta","11",{"property":"og:site_name","content":"VoiceAssist Docs"}],["$","meta","12",{"property":"og:type","content":"website"}],["$","meta","13",{"name":"twitter:card","content":"summary"}],["$","meta","14",{"name":"twitter:title","content":"VoiceAssist Documentation"}],["$","meta","15",{"name":"twitter:description","content":"Comprehensive documentation for VoiceAssist - Enterprise Medical AI Assistant"}],["$","meta","16",{"name":"next-size-adjust"}]] 1:null