
Start Here

Sourced from docs/START_HERE.md

🚀 VoiceAssist V2 - Start Here

Welcome to VoiceAssist V2 - A HIPAA-compliant voice-enabled clinical decision support system.

This document is your entry point to the project. Choose your path below based on your role and experience level.

Status update: All 16 project phases (0-15) are complete. Backend, infrastructure, admin panel, and web app (through Phase 3.5) are production-ready. The Voice Mode Enhancement (10-phase plan) was completed on 2025-12-03 and includes emotion detection, medical dictation, session analytics, and feedback collection. See Implementation Status for the authoritative component status.


🎯 Quick Start

For New Developers

  1. Read What is VoiceAssist V2? (5 min)
  2. Follow Getting Started (30 min)
  3. Review Documentation Map to understand what's available
  4. Start with PHASE_00_INITIALIZATION.md

For Experienced Developers

  1. Review UNIFIED_ARCHITECTURE.md for complete system design
  2. Check ARCHITECTURE_DIAGRAMS.md for visual diagrams
  3. Set up local environment: LOCAL_DEVELOPMENT.md
  4. Jump to Development Roadmap to see phases

For Clinicians

  1. Read WEB_APP_SPECS.md to understand clinical workflows
  2. Review User Settings you'll be able to configure
  3. Understand HIPAA protections built into the system

For Security Reviewers

  1. Start with SECURITY_COMPLIANCE.md
  2. Review PHI Detection & Routing
  3. Check Audit Logging requirements

For System Administrators

  1. Read ADMIN_PANEL_SPECS.md for admin interface
  2. Review System Settings you'll configure
  3. Follow INFRASTRUCTURE_SETUP.md for deployment

Choosing API References


📖 What is VoiceAssist V2?

VoiceAssist V2 is a HIPAA-compliant voice-enabled clinical decision support system designed for healthcare providers. It enables doctors to ask clinical questions using voice input and receive evidence-based answers with citations from authoritative medical sources.

Key Features

  • 🎤 Voice-First Interface: Push-to-talk and voice-activated modes
  • 🔒 HIPAA Compliant: PHI detection, audit logging, encrypted storage
  • 📚 Evidence-Based: Searches UpToDate, PubMed, guidelines, and local knowledge base
  • 🤖 Hybrid AI: Local Llama for PHI queries, cloud models for general clinical questions
  • 📋 Clinical Workflows: Quick Consult, Case Workspace, Differential Diagnosis, Drug Reference
  • 📊 Admin Panel: Knowledge base management, user administration, analytics
  • 🔐 Secure Architecture: Separate Nextcloud stack for PHI document storage

Architecture Overview

┌──────────────────────────────────────────────────────────────────┐
│                    VoiceAssist V2 Stack                          │
├──────────────────────────────────────────────────────────────────┤
│  Web App (Vite/React)       Admin Panel (Vite/React)             │
│       ↓                            ↓                             │
│  ┌─────────────────────────────────────────────────┐             │
│  │         FastAPI Backend (Python)                │             │
│  │  - RAG Engine    - Auth         - PHI Detection │             │
│  │  - AI Router     - Search       - Audit Logs    │             │
│  └─────────────────────────────────────────────────┘             │
│       ↓           ↓              ↓                               │
│  PostgreSQL   Qdrant Vector   Redis Cache                        │
│  (Users/Logs)   (Embeddings)   (Sessions)                        │
└──────────────────────────────────────────────────────────────────┘
                        ↕
┌──────────────────────────────────────────────────────────────────┐
│               Separate Nextcloud Stack (PHI Docs)                │
│  - Document Storage  - WebDAV API  - Encryption at Rest          │
└──────────────────────────────────────────────────────────────────┘

📚 Documentation Map

All documentation is in the docs/ directory. Here's the complete index:

🎯 Overview & Planning

| Document | Purpose | Audience |
| --- | --- | --- |
| START_HERE.md ⭐ | This file - project orientation | Everyone |
| UNIFIED_ARCHITECTURE.md ⭐ | Canonical architecture reference | Developers, Architects, DevOps |
| architecture/ARCHITECTURE_DIAGRAMS.md ⭐ NEW | Visual architecture diagrams (Mermaid) | Developers, Architects |
| ARCHITECTURE_V2.md | System architecture, Docker Compose-first approach (reference) | Developers, DevOps |
| PROJECT_SUMMARY.md | High-level overview, tech stack, cost estimates | Stakeholders, PMs |
| ROADMAP.md | Product roadmap and feature timeline | Product, Management |
| ENHANCEMENT_SUMMARY.md | Summary of documentation enhancements | Contributors |

🛠️ Getting Started

| Document | Purpose | Audience |
| --- | --- | --- |
| LOCAL_DEVELOPMENT.md ⭐ | Complete local dev setup guide | Developers |
| INFRASTRUCTURE_SETUP.md | Production server deployment | DevOps |
| COMPOSE_TO_K8S_MIGRATION.md | Migration guide from Compose to K8s | DevOps |

🖥️ Frontend Specifications

| Document | Purpose | Audience |
| --- | --- | --- |
| WEB_APP_SPECS.md ⭐ | Doctor-facing web app specs, clinical workflows | Frontend devs, UX |
| ADMIN_PANEL_SPECS.md ⭐ | Admin panel specs, system management | Frontend devs, Admins |
| DOCUMENTATION_SITE_SPECS.md | User-facing docs site specs | Technical writers |

🔧 Backend & Services

| Document | Purpose | Audience |
| --- | --- | --- |
| SERVICE_CATALOG.md ⭐ | Complete catalog of all 10 microservices | All developers, DevOps |
| SEMANTIC_SEARCH_DESIGN.md ⭐ | Knowledge base, vector search, RAG pipeline | Backend devs, ML |
| api-reference/rest-api.md | Endpoint-by-endpoint REST reference | Backend devs |
| API_REFERENCE.md | High-level API overview and endpoint groups | Backend devs, stakeholders |
| ../services/api-gateway/README.md | Canonical API Gateway service guide | Backend devs |
| server/README.md | ⚠️ DEPRECATED - Legacy backend (use api-gateway) | Reference only |
| apps/web-app/README.md | Web app implementation details | Frontend devs |
| apps/admin-panel/README.md | Admin panel implementation details | Frontend devs |
| apps/docs-site/README.md | Documentation site implementation | Frontend devs |

Shared packages: ../packages/api-client/README.md, ../packages/config/README.md, ../packages/design-tokens/README.md, ../packages/telemetry/README.md, ../packages/types/README.md, ../packages/ui/README.md, ../packages/utils/README.md

🔒 Security & Compliance

| Document | Purpose | Audience |
| --- | --- | --- |
| SECURITY_COMPLIANCE.md ⭐ | HIPAA compliance, PHI handling, audit logs | Security, Compliance |
| NEXTCLOUD_INTEGRATION.md | Separate Nextcloud stack for PHI docs | DevOps, Security |

🚀 Infrastructure & Deployment

| Document | Purpose | Audience |
| --- | --- | --- |
| INFRASTRUCTURE_SETUP.md | Ubuntu server setup, production deployment | DevOps |
| COMPOSE_TO_K8S_MIGRATION.md | Kubernetes migration guide | DevOps |

🎤 Voice Features

| Document | Purpose | Audience |
| --- | --- | --- |
| VOICE_MODE_PIPELINE.md ⭐ | Core voice pipeline architecture, WebSocket protocol | Backend/Frontend devs |
| VOICE_MODE_ENHANCEMENT_10_PHASE.md ⭐ | 10-phase enhancement: emotion, dictation, memory, analytics | All developers |
| VOICE_MODE_SETTINGS_GUIDE.md | User voice settings configuration | Frontend devs |
| frontend/thinker-talker-hooks.md | Thinker-Talker React hooks | Frontend devs |
| api-reference/voice-pipeline-ws.md | Voice pipeline WebSocket API reference | Backend/Frontend devs |

🤖 For AI Assistants / Automation

Quick Links for AI Agents:

| Document | Purpose | Audience |
| --- | --- | --- |
| Agent Onboarding ⭐ | Quick start guide for AI coding assistants | Claude Code, AI assistants |
| Agent API Reference ⭐ | Machine-readable JSON endpoints for agents | Claude Code, AI assistants |
| Agent Task Index | Common AI agent tasks and relevant documentation | Claude Code, AI assistants |
| CLAUDE_EXECUTION_GUIDE.md | Session startup, branching, safety rules, quality checks | Claude Code, AI assistants |
| CLAUDE_PROMPTS.md | Ready-to-use prompts for common development tasks | Claude Code, AI assistants |

Machine-Readable Endpoints (web):

  • GET /agent/index.json - Documentation system metadata
  • GET /agent/docs.json - Full document list with filtering
  • GET /search-index.json - Full-text search index (Fuse.js format)

📋 Phase Documents (Development Plan)

All phases are in docs/phases/. The project has 16 phases (0-15):

| Phase | Name | Status | Focus | File |
| --- | --- | --- | --- | --- |
| Phase 0 | Initialization | Complete | Read all specs, understand architecture | PHASE_00_INITIALIZATION.md ⭐ |
| Phase 1 | Local Environment | Complete | Docker Compose, PostgreSQL, Redis, Qdrant | PHASE_01_*.md |
| Phase 2 | Database Schema | Complete | SQLAlchemy models, Alembic migrations | PHASE_02_*.md |
| Phase 3 | Authentication | Complete | JWT, user management, RBAC | PHASE_03_*.md |
| Phase 4 | Document Ingestion | Complete | PDF/DOCX parsing, vector embeddings | PHASE_04_*.md |
| Phase 5 | Semantic Search | Complete | Qdrant integration, RAG pipeline | PHASE_05_*.md |
| Phase 6 | PHI Detection | Complete | Presidio integration, routing logic | PHASE_06_*.md |
| Phase 7 | AI Router | Complete | Llama local, OpenAI cloud, cost tracking | PHASE_07_*.md |
| Phase 8 | External Search | Complete | PubMed, UpToDate APIs | PHASE_08_*.md |
| Phase 9 | Nextcloud Integration | Complete | WebDAV, PHI document storage | PHASE_09_*.md |
| Phase 10 | WebSocket & Voice | Complete | Real-time chat, voice transcription | PHASE_10_*.md |
| Phase 11 | Security & HIPAA | Complete | Security hardening, compliance | PHASE_11_*.md |
| Phase 12 | HA/DR | Complete | High availability, disaster recovery | PHASE_12_*.md |
| Phase 13 | Testing & Docs | Complete | Pytest, Prometheus, documentation | PHASE_13_*.md |
| Phase 14 | Production Deployment | Complete | Ubuntu server, systemd, backups | PHASE_14_*.md |
| Phase 15 | Final Review | Complete | Final review, handoff, validation | PHASE_15_*.md |

Note: Web App frontend development follows a separate milestone plan (Phases 0-8) tracked in Implementation Status.


🗺️ Development Roadmap

Project Phases (0-15) - Complete ✅

All 16 project phases have been completed:

  • ✅ Phases 0-3: Foundation (environment, database, auth)
  • ✅ Phases 4-8: Core functionality (ingestion, search, AI)
  • ✅ Phases 9-10: Integration (Nextcloud, voice backend)
  • ✅ Phases 11-12: Security, HA/DR
  • ✅ Phases 13-15: Testing, deployment, final review

Deliverable: Production-ready backend and infrastructure

Web App Frontend Milestones - Phase 3.5 Complete ✅

The web app follows its own milestone plan:

  • ✅ Phase 0: Foundation (monorepo, shared packages)
  • ✅ Phase 1: Auth & Layout
  • ✅ Phase 2: Chat Interface
  • ✅ Phase 3: Voice Features
  • ✅ Phase 3.5: Unified Chat/Voice UI
  • 📋 Phases 4-8: Files, medical features, admin, polish (planned)

See Implementation Status for current progress.

Future: Kubernetes Migration (Optional)

Goal: Scale to multi-node K8s cluster

  • Follow COMPOSE_TO_K8S_MIGRATION.md
  • Convert Docker Compose to K8s manifests or Helm charts
  • Add auto-scaling, load balancing, multi-region

🔑 Key Decisions & Rationale

1. Docker Compose First, Kubernetes Later

Decision: Build with Docker Compose, deploy to single Ubuntu server first, migrate to K8s when needed

Rationale:

  • Faster development iteration
  • Simpler debugging and local testing
  • Cost-effective for initial deployment
  • Easy migration path when scaling needs arise

2. Separate Nextcloud Stack

Decision: Run Nextcloud in separate Docker Compose stack with its own database

Rationale:

  • PHI isolation (separate audit logs, backups, encryption keys)
  • Independent scaling and maintenance
  • Clear security boundary
  • Easier compliance audits

3. HIPAA Compliance from Day 1

Decision: Build HIPAA controls into every component from the start

Rationale:

  • Retrofitting compliance is expensive and risky
  • PHI detection must be part of core routing logic
  • Audit logging must be comprehensive from start
  • Encryption and access controls easier to add early

4. Hybrid AI Model

Decision: Use local Llama for PHI queries, cloud models for general questions

Rationale:

  • Keeps PHI on-premises for HIPAA compliance
  • Leverages cloud model quality when safe
  • Reduces cloud costs by routing appropriately
  • Provides fallback options
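
The routing rule behind this decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's router: the regular expressions stand in for a real PHI detector (the phase table lists Presidio for Phase 6), and the names `looks_like_phi`, `route_query`, `LOCAL_MODEL`, and `CLOUD_MODEL` are hypothetical.

```python
import re

# Hypothetical backend labels; the real router's identifiers may differ.
LOCAL_MODEL = "local-llama"
CLOUD_MODEL = "cloud"

# Toy patterns standing in for a real PHI detector. They catch only a few
# obvious identifiers and are NOT sufficient for HIPAA purposes.
_PHI_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # SSN-like number
    re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),  # medical record number
    re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),        # date such as a DOB
]


def looks_like_phi(text: str) -> bool:
    """Return True if any toy pattern matches the text."""
    return any(p.search(text) for p in _PHI_PATTERNS)


def route_query(text: str) -> str:
    """Send anything that may contain PHI to the local model."""
    return LOCAL_MODEL if looks_like_phi(text) else CLOUD_MODEL
```

The sketch defaults to the cloud model only when no pattern matches. A production router would more likely fail closed, routing to the local model whenever the detector is unavailable or unsure.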

5. Phase-Based Development

Decision: Break project into 16 sequential phases (0-15)

Rationale:

  • Each phase is independently completable
  • Clear exit criteria and verification
  • Easy progress tracking
  • Suitable for AI-assisted development with Claude Code

🏁 Getting Started

Prerequisites

  • macOS (or Linux) with Docker Desktop
  • Python 3.11+
  • Node.js 18+ with pnpm
  • 16GB RAM minimum
  • Basic knowledge of FastAPI, React (Next.js for the docs site), and Docker

Step 1: Set Up Environment

# Navigate to project root
cd ~/VoiceAssist  # or your project directory

# Read the local development guide
cat docs/LOCAL_DEVELOPMENT.md

# Follow the setup instructions
# - Install Docker Desktop
# - Create .env files
# - Start Docker Compose services

Step 2: Understand the Architecture

# Read architecture document
cat docs/ARCHITECTURE_V2.md

# Review key specifications
cat docs/WEB_APP_SPECS.md
cat docs/ADMIN_PANEL_SPECS.md
cat docs/SEMANTIC_SEARCH_DESIGN.md

Step 3: Start Phase 0

# Read Phase 0 instructions
cat docs/phases/PHASE_00_INITIALIZATION.md

# This phase ensures you understand:
# - System architecture
# - Clinical workflows
# - Security requirements
# - Development approach

Step 4: Continue Through Phases

Follow phases sequentially, verifying exit criteria before moving to the next phase.


🧭 Learning Path

Week 1: Foundation

  • Day 1-2: Read all specifications, understand architecture
  • Day 3: Set up local environment (Phase 1)
  • Day 4: Create database schema (Phase 2)
  • Day 5: Implement authentication (Phase 3)

Week 2: Core Features

  • Day 1-2: Document ingestion pipeline (Phase 4)
  • Day 3-4: Semantic search and RAG (Phase 5)
  • Day 5: PHI detection (Phase 6)

Week 3: AI & Integration

  • Day 1-2: AI model router (Phase 7)
  • Day 3: External search APIs (Phase 8)
  • Day 4: Nextcloud integration (Phase 9)
  • Day 5: WebSocket and voice (Phase 10)

Week 4: Security & Resilience

  • Day 1-4: Security hardening and HIPAA compliance (Phase 11)
  • Day 5: High availability and disaster recovery (Phase 12)

Week 5: Production

  • Day 1-2: Testing and monitoring (Phase 13)
  • Day 3-4: Production deployment (Phase 14)
  • Day 5: Final review, verification, and documentation (Phase 15)

📝 Development Workflow

Daily Workflow

  1. Start services: docker compose up -d
  2. Check logs: docker compose logs -f
  3. Work on current phase: Follow phase document
  4. Run tests: pytest or pnpm test
  5. Verify functionality: Manual testing
  6. Commit changes: Git commit with clear message
  7. Update phase status: Mark tasks complete

Working with Claude Code

I want to work on Phase [N]. Please:
1. Read ~/VoiceAssist/docs/phases/PHASE_[N]_*.md
2. Check all prerequisites are met
3. Complete all tasks in order
4. Run all tests and verify functionality
5. Update documentation
6. Verify exit criteria
7. Commit the changes
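
To fill in the `[N]` placeholder, a small helper can resolve a phase number to its document. This is a sketch that assumes phase files use a zero-padded two-digit number, as `PHASE_00_INITIALIZATION.md` suggests. The function name `find_phase_docs` is hypothetical.

```python
from pathlib import Path


def find_phase_docs(phases_dir: str, phase: int) -> list[Path]:
    """Return phase documents matching PHASE_<NN>_*.md, sorted by name."""
    if not 0 <= phase <= 15:
        raise ValueError("phase must be between 0 and 15")
    # Assumed naming scheme: zero-padded phase number between underscores.
    pattern = f"PHASE_{phase:02d}_*.md"
    return sorted(Path(phases_dir).glob(pattern))
```

For example, `find_phase_docs("docs/phases", 0)` should return the Phase 0 document if the directory follows that naming scheme, and an empty list otherwise.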

Troubleshooting

  • Check service logs: docker compose logs [service-name]
  • Verify environment variables: cat .env
  • Review LOCAL_DEVELOPMENT.md troubleshooting section
  • Check phase-specific troubleshooting in phase documents
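
As an alternative to `cat .env`, the environment check can be scripted so that only the names of missing keys are printed, not secret values. The sketch below assumes a plain `KEY=VALUE` file format. The helper names and any keys passed to them are hypothetical, since this document does not list the project's required variables.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and simple quoting around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required: list[str]) -> list[str]:
    """Return required keys that are absent or empty, in the given order."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]
```

A typical call would read the file contents and pass the keys your setup needs, for example `missing_keys(open(".env").read(), ["DATABASE_URL"])`, where `DATABASE_URL` is only an illustrative name.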

📊 Project Status

Current Status: Backend, Infrastructure, Admin Panel, and Web App (Phase 3.5) Production-Ready.

Phase Completion: All 16 project phases (0-15) complete. Web app Phase 3.5 (Unified Chat/Voice UI) complete.

Implementation Reference: See Implementation Status for detailed component status.

Target Deployment: Ubuntu server with Docker Compose (production-ready)


🆘 Support & Resources

Documentation

  • All specs: docs/
  • Phase docs: docs/phases/
  • Applications: apps/{web-app,admin-panel,docs-site}/
  • Services: services/api-gateway/
  • Server: server/

Key Technologies

  • Backend: FastAPI, SQLAlchemy, Alembic, LangChain
  • Frontend: Vite + React (web-app, admin-panel), Next.js 14 (docs-site), TailwindCSS, shadcn/ui
  • AI/ML: OpenAI GPT-4o, Qdrant, ElevenLabs TTS, Deepgram STT
  • Infrastructure: Docker Compose, PostgreSQL, Redis, Nextcloud

Getting Help

  1. Check phase troubleshooting section
  2. Review specification documents
  3. Search logs for errors
  4. Ask Claude Code for assistance with specific issues

📇 Machine-Readable Documentation API

For AI assistants and automated tooling, VoiceAssist provides machine-readable JSON endpoints:

Web API Endpoints

Base URL: https://assistdocs.asimo.io

| Endpoint | Purpose |
| --- | --- |
| /agent/index.json | Documentation system metadata and discovery |
| /agent/docs.json | Full document list with metadata for filtering |
| /search-index.json | Full-text search index (Fuse.js format) |
| /sitemap.xml | XML sitemap for crawlers |

Usage by AI Agents:

  1. Fetch /agent/index.json to understand available endpoints and schema
  2. Fetch /agent/docs.json to get all documents with metadata
  3. Filter client-side by status, audience, tags, etc.
  4. Use /search-index.json with Fuse.js for full-text search
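
The fetch-then-filter steps above might look like the following sketch. The response schema is an assumption: it treats each document as a dictionary with a `status` string and `audience` and `tags` lists, which the real `/agent/docs.json` may not match, so check `/agent/index.json` for the actual schema first. The helpers `fetch_json` and `filter_docs` are hypothetical names.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://assistdocs.asimo.io"


def fetch_json(path: str) -> object:
    """Fetch and decode a JSON document from the docs site."""
    with urlopen(BASE_URL + path, timeout=10) as resp:
        return json.load(resp)


def filter_docs(docs, status=None, audience=None, tag=None):
    """Keep documents matching every criterion that is not None.

    Assumed document shape: {"status": str, "audience": [...], "tags": [...]}.
    """
    def matches(doc):
        if status is not None and doc.get("status") != status:
            return False
        if audience is not None and audience not in doc.get("audience", []):
            return False
        if tag is not None and tag not in doc.get("tags", []):
            return False
        return True

    return [doc for doc in docs if matches(doc)]
```

Filtering happens client-side because the endpoints are static JSON files. How the document list is nested inside the `/agent/docs.json` response is not specified here, so extract the list according to the published schema before passing it to `filter_docs`.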

See the Agent API Reference for complete details.

DOC_INDEX.yml (Legacy)

Location: docs/DOC_INDEX.yml

Purpose: Canonical registry of all project documentation with metadata. This YAML file is still available for local tooling but the web JSON endpoints are preferred for programmatic access.


🎉 Let's Build!

You now have a comprehensive understanding of VoiceAssist V2. The project is structured to be built phase-by-phase, with clear specifications and requirements at every step.

Ready to start? Open PHASE_00_INITIALIZATION.md and begin your journey.

Good luck! 🚀


📜 Legacy V1 Materials

The following documents describe the original 20-phase V1 plan. They are preserved for historical reference only and are not canonical for V2 development:

Note: All V1 documents have been marked with a legacy banner directing readers to the current V2 documentation.
