VoiceAssist V2 - Start Here
Welcome to VoiceAssist V2 - A HIPAA-compliant voice-enabled clinical decision support system.
This document is your entry point to the project. Choose your path below based on your role and experience level.
Status update: All 16 project phases (0-15) are complete. Backend, infrastructure, admin panel, and web app (through Phase 3.5) are production-ready. Voice Mode Enhancement (10-phase plan) completed 2025-12-03 - includes emotion detection, medical dictation, session analytics, and feedback collection. See Implementation Status for the authoritative component status.
Quick Start
For New Developers
- Read What is VoiceAssist V2? (5 min)
- Follow Getting Started (30 min)
- Review Documentation Map to understand what's available
- Start with PHASE_00_INITIALIZATION.md
For Experienced Developers
- Review UNIFIED_ARCHITECTURE.md for complete system design
- Check ARCHITECTURE_DIAGRAMS.md for visual diagrams
- Set up local environment: LOCAL_DEVELOPMENT.md
- Jump to Development Roadmap to see phases
For Clinicians
- Read WEB_APP_SPECS.md to understand clinical workflows
- Review User Settings you'll be able to configure
- Understand HIPAA protections built into the system
For Security Reviewers
- Start with SECURITY_COMPLIANCE.md
- Review PHI Detection & Routing
- Check Audit Logging requirements
For System Administrators
- Read ADMIN_PANEL_SPECS.md for admin interface
- Review System Settings you'll configure
- Follow INFRASTRUCTURE_SETUP.md for deployment
Choosing API References
- Use API_REFERENCE.md for a high-level overview of endpoint groups, concepts, and quick lookups.
- Use api-reference/rest-api.md for endpoint-by-endpoint request/response details and examples.
- See ../services/api-gateway/README.md for the canonical backend service guide.
What is VoiceAssist V2?
VoiceAssist V2 is a HIPAA-compliant voice-enabled clinical decision support system designed for healthcare providers. It enables doctors to ask clinical questions using voice input and receive evidence-based answers with citations from authoritative medical sources.
Key Features
- Voice-First Interface: Push-to-talk and voice-activated modes
- HIPAA Compliant: PHI detection, audit logging, encrypted storage
- Evidence-Based: Searches UpToDate, PubMed, guidelines, and local knowledge base
- Hybrid AI: Local Llama for PHI queries, cloud models for general clinical questions
- Clinical Workflows: Quick Consult, Case Workspace, Differential Diagnosis, Drug Reference
- Admin Panel: Knowledge base management, user administration, analytics
- Secure Architecture: Separate Nextcloud stack for PHI document storage
Architecture Overview
    ┌───────────────────────────────────────────────────────────────┐
    │                     VoiceAssist V2 Stack                      │
    ├───────────────────────────────────────────────────────────────┤
    │  Web App (Vite/React)            Admin Panel (Vite/React)     │
    │           │                                 │                 │
    │  ┌─────────────────────────────────────────────────┐          │
    │  │            FastAPI Backend (Python)             │          │
    │  │  - RAG Engine   - Auth     - PHI Detection      │          │
    │  │  - AI Router    - Search   - Audit Logs         │          │
    │  └─────────────────────────────────────────────────┘          │
    │         │                 │                 │                 │
    │    PostgreSQL       Qdrant Vector      Redis Cache            │
    │   (Users/Logs)      (Embeddings)       (Sessions)             │
    └───────────────────────────────────────────────────────────────┘
                                    │
    ┌───────────────────────────────────────────────────────────────┐
    │              Separate Nextcloud Stack (PHI Docs)              │
    │  - Document Storage   - WebDAV API   - Encryption at Rest     │
    └───────────────────────────────────────────────────────────────┘
Documentation Map
All documentation is in the docs/ directory. Here's the complete index:
Overview & Planning
| Document | Purpose | Audience |
|---|---|---|
| START_HERE.md ⭐ | This file - project orientation | Everyone |
| UNIFIED_ARCHITECTURE.md ⭐ | Canonical architecture reference | Developers, Architects, DevOps |
| architecture/ARCHITECTURE_DIAGRAMS.md ⭐ NEW | Visual architecture diagrams (Mermaid) | Developers, Architects |
| ARCHITECTURE_V2.md | System architecture, Docker Compose-first approach (reference) | Developers, DevOps |
| PROJECT_SUMMARY.md | High-level overview, tech stack, cost estimates | Stakeholders, PMs |
| ROADMAP.md | Product roadmap and feature timeline | Product, Management |
| ENHANCEMENT_SUMMARY.md | Summary of documentation enhancements | Contributors |
Getting Started
| Document | Purpose | Audience |
|---|---|---|
| LOCAL_DEVELOPMENT.md ⭐ | Complete local dev setup guide | Developers |
| INFRASTRUCTURE_SETUP.md | Production server deployment | DevOps |
| COMPOSE_TO_K8S_MIGRATION.md | Migration guide from Compose to K8s | DevOps |
Frontend Specifications
| Document | Purpose | Audience |
|---|---|---|
| WEB_APP_SPECS.md ⭐ | Doctor-facing web app specs, clinical workflows | Frontend devs, UX |
| ADMIN_PANEL_SPECS.md ⭐ | Admin panel specs, system management | Frontend devs, Admins |
| DOCUMENTATION_SITE_SPECS.md | User-facing docs site specs | Technical writers |
Backend & Services
| Document | Purpose | Audience |
|---|---|---|
| SERVICE_CATALOG.md ⭐ | Complete catalog of all 10 microservices | All developers, DevOps |
| SEMANTIC_SEARCH_DESIGN.md ⭐ | Knowledge base, vector search, RAG pipeline | Backend devs, ML |
| api-reference/rest-api.md | Endpoint-by-endpoint REST reference | Backend devs |
| API_REFERENCE.md | High-level API overview and endpoint groups | Backend devs, stakeholders |
| ../services/api-gateway/README.md | Canonical API Gateway service guide | Backend devs |
| server/README.md | ⚠️ DEPRECATED - Legacy backend (use api-gateway) | Reference only |
| apps/web-app/README.md | Web app implementation details | Frontend devs |
| apps/admin-panel/README.md | Admin panel implementation details | Frontend devs |
| apps/docs-site/README.md | Documentation site implementation | Frontend devs |
Shared packages: ../packages/api-client/README.md, ../packages/config/README.md, ../packages/design-tokens/README.md, ../packages/telemetry/README.md, ../packages/types/README.md, ../packages/ui/README.md, ../packages/utils/README.md
Security & Compliance
| Document | Purpose | Audience |
|---|---|---|
| SECURITY_COMPLIANCE.md ⭐ | HIPAA compliance, PHI handling, audit logs | Security, Compliance |
| NEXTCLOUD_INTEGRATION.md | Separate Nextcloud stack for PHI docs | DevOps, Security |
Infrastructure & Deployment
| Document | Purpose | Audience |
|---|---|---|
| INFRASTRUCTURE_SETUP.md | Ubuntu server setup, production deployment | DevOps |
| COMPOSE_TO_K8S_MIGRATION.md | Kubernetes migration guide | DevOps |
Voice Features
| Document | Purpose | Audience |
|---|---|---|
| VOICE_MODE_PIPELINE.md ⭐ | Core voice pipeline architecture, WebSocket protocol | Backend/Frontend devs |
| VOICE_MODE_ENHANCEMENT_10_PHASE.md ⭐ | 10-phase enhancement: emotion, dictation, memory, analytics | All developers |
| VOICE_MODE_SETTINGS_GUIDE.md | User voice settings configuration | Frontend devs |
| frontend/thinker-talker-hooks.md | Thinker-Talker React hooks | Frontend devs |
| api-reference/voice-pipeline-ws.md | Voice pipeline WebSocket API reference | Backend/Frontend devs |
For AI Assistants / Automation
Quick Links for AI Agents:
- Agent Onboarding - Start here
- Implementation Status - Component status (source of truth)
- Agent API Reference - Machine-readable endpoints
- Agent Task Index - Common tasks and relevant docs
| Document | Purpose | Audience |
|---|---|---|
| Agent Onboarding ⭐ | Quick start guide for AI coding assistants | Claude Code, AI assistants |
| Agent API Reference ⭐ | Machine-readable JSON endpoints for agents | Claude Code, AI assistants |
| Agent Task Index | Common AI agent tasks and relevant documentation | Claude Code, AI assistants |
| CLAUDE_EXECUTION_GUIDE.md | Session startup, branching, safety rules, quality checks | Claude Code, AI assistants |
| CLAUDE_PROMPTS.md | Ready-to-use prompts for common development tasks | Claude Code, AI assistants |
Machine-Readable Endpoints (web):
- `GET /agent/index.json` - Documentation system metadata
- `GET /agent/docs.json` - Full document list with filtering
- `GET /search-index.json` - Full-text search index (Fuse.js format)
Phase Documents (Development Plan)
All phases are in docs/phases/. The project has 16 phases (0-15):
| Phase | Name | Status | Focus | File |
|---|---|---|---|---|
| Phase 0 | Initialization | Complete | Read all specs, understand architecture | PHASE_00_INITIALIZATION.md ⭐ |
| Phase 1 | Local Environment | Complete | Docker Compose, PostgreSQL, Redis, Qdrant | PHASE_01_*.md |
| Phase 2 | Database Schema | Complete | SQLAlchemy models, Alembic migrations | PHASE_02_*.md |
| Phase 3 | Authentication | Complete | JWT, user management, RBAC | PHASE_03_*.md |
| Phase 4 | Document Ingestion | Complete | PDF/DOCX parsing, vector embeddings | PHASE_04_*.md |
| Phase 5 | Semantic Search | Complete | Qdrant integration, RAG pipeline | PHASE_05_*.md |
| Phase 6 | PHI Detection | Complete | Presidio integration, routing logic | PHASE_06_*.md |
| Phase 7 | AI Router | Complete | Llama local, OpenAI cloud, cost tracking | PHASE_07_*.md |
| Phase 8 | External Search | Complete | PubMed, UpToDate APIs | PHASE_08_*.md |
| Phase 9 | Nextcloud Integration | Complete | WebDAV, PHI document storage | PHASE_09_*.md |
| Phase 10 | WebSocket & Voice | Complete | Real-time chat, voice transcription | PHASE_10_*.md |
| Phase 11 | Security & HIPAA | Complete | Security hardening, compliance | PHASE_11_*.md |
| Phase 12 | HA/DR | Complete | High availability, disaster recovery | PHASE_12_*.md |
| Phase 13 | Testing & Docs | Complete | Pytest, Prometheus, documentation | PHASE_13_*.md |
| Phase 14 | Production Deployment | Complete | Ubuntu server, systemd, backups | PHASE_14_*.md |
| Phase 15 | Final Review | Complete | Final review, handoff, validation | PHASE_15_*.md |
Note: Web App frontend development follows a separate milestone plan (Phases 0-8) tracked in Implementation Status.
Development Roadmap
Project Phases (0-15) - Complete ✅
All 16 project phases have been completed:
- ✅ Phases 0-3: Foundation (environment, database, auth)
- ✅ Phases 4-8: Core functionality (ingestion, search, AI)
- ✅ Phases 9-10: Integration (Nextcloud, voice backend)
- ✅ Phases 11-12: Security, HA/DR
- ✅ Phases 13-15: Testing, deployment, final review
Deliverable: Production-ready backend and infrastructure
Web App Frontend Milestones - Phase 3.5 Complete ✅
The web app follows its own milestone plan:
- ✅ Phase 0: Foundation (monorepo, shared packages)
- ✅ Phase 1: Auth & Layout
- ✅ Phase 2: Chat Interface
- ✅ Phase 3: Voice Features
- ✅ Phase 3.5: Unified Chat/Voice UI
- Phases 4-8: Files, medical features, admin, polish (planned)
See Implementation Status for current progress.
Future: Kubernetes Migration (Optional)
Goal: Scale to multi-node K8s cluster
- Follow COMPOSE_TO_K8S_MIGRATION.md
- Convert Docker Compose to K8s manifests or Helm charts
- Add auto-scaling, load balancing, multi-region
Key Decisions & Rationale
1. Docker Compose First, Kubernetes Later
Decision: Build with Docker Compose, deploy to single Ubuntu server first, migrate to K8s when needed
Rationale:
- Faster development iteration
- Simpler debugging and local testing
- Cost-effective for initial deployment
- Easy migration path when scaling needs arise
2. Separate Nextcloud Stack
Decision: Run Nextcloud in separate Docker Compose stack with its own database
Rationale:
- PHI isolation (separate audit logs, backups, encryption keys)
- Independent scaling and maintenance
- Clear security boundary
- Easier compliance audits
3. HIPAA Compliance from Day 1
Decision: Build HIPAA controls into every component from the start
Rationale:
- Retrofitting compliance is expensive and risky
- PHI detection must be part of core routing logic
- Audit logging must be comprehensive from start
- Encryption and access controls easier to add early
4. Hybrid AI Model
Decision: Use local Llama for PHI queries, cloud models for general questions
Rationale:
- Keeps PHI on-premises for HIPAA compliance
- Leverages cloud model quality when safe
- Reduces cloud costs by routing appropriately
- Provides fallback options
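The routing decision described above can be sketched in a few lines. This is a minimal illustration, not the project's router: the regular expressions are a deliberately naive stand-in for a real PHI detector (the phase table lists Presidio for that role), and `Route`, `contains_phi`, and `route_query` are hypothetical names introduced here.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately naive patterns standing in for a real PHI detector.
PHI_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # US SSN-like number
    re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),  # medical record number
    re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),        # date, e.g. a birth date
]


@dataclass(frozen=True)
class Route:
    target: str  # "local" or "cloud"
    reason: str


def contains_phi(text: str) -> bool:
    """Return True if any illustrative PHI pattern matches the text."""
    return any(pattern.search(text) for pattern in PHI_PATTERNS)


def route_query(text: str) -> Route:
    """Send possible-PHI queries to the local model, all others to the cloud."""
    if contains_phi(text):
        return Route("local", "possible PHI detected")
    return Route("cloud", "no PHI detected")
```

A conservative router would likely also fall back to the local model whenever detection itself fails, so that an error never sends PHI off-premises.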
5. Phase-Based Development
Decision: Break project into 16 sequential phases (0-15)
Rationale:
- Each phase is independently completable
- Clear exit criteria and verification
- Easy progress tracking
- Suitable for AI-assisted development with Claude Code
Getting Started
Prerequisites
- macOS (or Linux) with Docker Desktop
- Python 3.11+
- Node.js 18+ with pnpm
- 16GB RAM minimum
- Basic knowledge of FastAPI, React (Next.js for the docs site), and Docker
Step 1: Set Up Environment
    # Navigate to project root
    cd ~/VoiceAssist  # or your project directory

    # Read the local development guide
    cat docs/LOCAL_DEVELOPMENT.md

    # Follow the setup instructions:
    # - Install Docker Desktop
    # - Create .env files
    # - Start Docker Compose services
Step 2: Understand the Architecture
    # Read architecture document
    cat docs/ARCHITECTURE_V2.md

    # Review key specifications
    cat docs/WEB_APP_SPECS.md
    cat docs/ADMIN_PANEL_SPECS.md
    cat docs/SEMANTIC_SEARCH_DESIGN.md
Step 3: Start Phase 0
    # Read Phase 0 instructions
    cat docs/phases/PHASE_00_INITIALIZATION.md

    # This phase ensures you understand:
    # - System architecture
    # - Clinical workflows
    # - Security requirements
    # - Development approach
Step 4: Continue Through Phases
Follow phases sequentially, verifying exit criteria before moving to the next phase.
Learning Path
Week 1: Foundation
- Day 1-2: Read all specifications, understand architecture
- Day 3: Set up local environment (Phase 1)
- Day 4: Create database schema (Phase 2)
- Day 5: Implement authentication (Phase 3)
Week 2: Core Features
- Day 1-2: Document ingestion pipeline (Phase 4)
- Day 3-4: Semantic search and RAG (Phase 5)
- Day 5: PHI detection (Phase 6)
Week 3: AI & Integration
- Day 1-2: AI model router (Phase 7)
- Day 3: External search APIs (Phase 8)
- Day 4: Nextcloud integration (Phase 9)
- Day 5: WebSocket and voice (Phase 10)
Week 4: Security & Reliability
- Day 1-4: Security hardening and HIPAA compliance (Phase 11)
- Day 5: High availability and disaster recovery (Phase 12)
Week 5: Production
- Day 1-2: Testing and monitoring (Phase 13)
- Day 3-4: Production deployment (Phase 14)
- Day 5: Final review, handoff, and validation (Phase 15)
Development Workflow
Daily Workflow
- Start services: `docker compose up -d`
- Check logs: `docker compose logs -f`
- Work on current phase: Follow phase document
- Run tests: `pytest` or `pnpm test`
- Verify functionality: Manual testing
- Commit changes: Git commit with clear message
- Update phase status: Mark tasks complete
Working with Claude Code
I want to work on Phase [N]. Please:
1. Read ~/VoiceAssist/docs/phases/PHASE_[N]_*.md
2. Check all prerequisites are met
3. Complete all tasks in order
4. Run all tests and verify functionality
5. Update documentation
6. Verify exit criteria
7. Commit the changes
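The `[N]` placeholder in the prompt above can be resolved to a concrete file with a small helper. The sketch below assumes phase documents follow the zero-padded `PHASE_NN_*.md` naming seen in `PHASE_00_INITIALIZATION.md`; `phase_pattern` and `find_phase_doc` are hypothetical names, not part of the project.

```python
from pathlib import Path
from typing import Optional


def phase_pattern(phase: int) -> str:
    """Build the glob pattern for a phase document, e.g. PHASE_03_*.md."""
    if not 0 <= phase <= 15:
        raise ValueError("phase must be between 0 and 15")
    return f"PHASE_{phase:02d}_*.md"


def find_phase_doc(phases_dir: Path, phase: int) -> Optional[Path]:
    """Return the first matching phase document, or None if none exists."""
    matches = sorted(phases_dir.glob(phase_pattern(phase)))
    return matches[0] if matches else None
```

For example, `find_phase_doc(Path.home() / "VoiceAssist/docs/phases", 3)` would return the Phase 3 document if one matching that pattern is present.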
Troubleshooting
- Check service logs: `docker compose logs [service-name]`
- Verify environment variables: `cat .env`
- Review LOCAL_DEVELOPMENT.md troubleshooting section
- Check phase-specific troubleshooting in phase documents
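Printing `.env` with `cat` also prints any secrets it holds, so a check that reports only which variables are missing can be a safer first step. The sketch below is one way to do that; the names in `REQUIRED_KEYS` are hypothetical placeholders, and the real list should come from LOCAL_DEVELOPMENT.md.

```python
from pathlib import Path

# Hypothetical variable names for illustration only.
REQUIRED_KEYS = ("POSTGRES_PASSWORD", "REDIS_URL", "OPENAI_API_KEY")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        # Only key names are printed, never values.
        print("Missing:", missing_keys(env_file.read_text()))
```

The parser handles only plain `KEY=VALUE` lines; multi-line values and `export` prefixes are out of scope for this sketch.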
Project Status
Current Status: Backend, Infrastructure, Admin Panel, and Web App (Phase 3.5) Production-Ready.
Phase Completion: All 16 project phases (0-15) complete. Web app Phase 3.5 (Unified Chat/Voice UI) complete.
Implementation Reference: See Implementation Status for detailed component status.
Target Deployment: Ubuntu server with Docker Compose (production-ready)
Support & Resources
Documentation
- All specs: `docs/`
- Phase docs: `docs/phases/`
- Applications: `apps/{web-app,admin-panel,docs-site}/`
- Services: `services/api-gateway/`
- Server: `server/`
Key Technologies
- Backend: FastAPI, SQLAlchemy, Alembic, LangChain
- Frontend: Vite + React (web-app, admin-panel), Next.js 14 (docs-site), TailwindCSS, shadcn/ui
- AI/ML: OpenAI GPT-4o, Qdrant, ElevenLabs TTS, Deepgram STT
- Infrastructure: Docker Compose, PostgreSQL, Redis, Nextcloud
Getting Help
- Check phase troubleshooting section
- Review specification documents
- Search logs for errors
- Ask Claude Code for assistance with specific issues
Machine-Readable Documentation API
For AI assistants and automated tooling, VoiceAssist provides machine-readable JSON endpoints:
Web API Endpoints
Base URL: https://assistdocs.asimo.io
| Endpoint | Purpose |
|---|---|
| `/agent/index.json` | Documentation system metadata and discovery |
| `/agent/docs.json` | Full document list with metadata for filtering |
| `/search-index.json` | Full-text search index (Fuse.js format) |
| `/sitemap.xml` | XML sitemap for crawlers |
Usage by AI Agents:
- Fetch `/agent/index.json` to understand available endpoints and schema
- Fetch `/agent/docs.json` to get all documents with metadata
- Filter client-side by `status`, `audience`, `tags`, etc.
- Use `/search-index.json` with Fuse.js for full-text search
See the Agent API Reference for complete details.
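The fetch-then-filter flow can be sketched as follows. The payload shape is an assumption: this sketch treats `/agent/docs.json` as either a JSON list or an object with a `docs` list, and treats `audience` and `tags` as lists on each entry. Check `/agent/index.json` for the actual schema before relying on it. `fetch_docs` and `filter_docs` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://assistdocs.asimo.io"  # base URL given in this document


def fetch_docs(base_url: str = BASE_URL) -> list:
    """Download /agent/docs.json and return its document entries.

    Assumption: the payload is a JSON list, or an object with a "docs" list.
    """
    with urlopen(f"{base_url}/agent/docs.json", timeout=10) as resp:
        payload = json.load(resp)
    return payload["docs"] if isinstance(payload, dict) else payload


def filter_docs(docs, status=None, audience=None, tag=None):
    """Filter entries client-side; a criterion left as None is ignored."""
    def keep(doc):
        if status is not None and doc.get("status") != status:
            return False
        if audience is not None and audience not in doc.get("audience", []):
            return False
        if tag is not None and tag not in doc.get("tags", []):
            return False
        return True

    return [doc for doc in docs if keep(doc)]
```

Keeping the filter separate from the download means it can be exercised against a saved copy of the JSON without network access.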
DOC_INDEX.yml (Legacy)
Location: docs/DOC_INDEX.yml
Purpose: Canonical registry of all project documentation with metadata. This YAML file is still available for local tooling but the web JSON endpoints are preferred for programmatic access.
Let's Build!
You now have a comprehensive understanding of VoiceAssist V2. The project is structured to be built phase-by-phase, with clear specifications and requirements at every step.
Ready to start? Open PHASE_00_INITIALIZATION.md and begin your journey.
Good luck!
Legacy V1 Materials
The following documents describe the original 20-phase V1 plan. They are preserved for historical reference only and are not canonical for V2 development:
- DEVELOPMENT_PHASES.md - Original 20-phase plan (V1)
- ALL_PHASES_SUMMARY.md - Original phase summaries (V1)
- ROADMAP.md - Original roadmap (V1)
- ARCHITECTURE.md - Original architecture (V1)
- ../PHASE_STATUS.md - Original phase tracking (V1)
Note: All V1 documents have been marked with a legacy banner directing readers to the current V2 documentation.