# AI Agent Quick Start
Last Updated: 2025-12-01
This guide helps AI coding assistants (Claude, GPT, Copilot, etc.) quickly understand and work on VoiceAssist.
## Project Overview
VoiceAssist is a HIPAA-compliant medical AI assistant platform with:
- Voice Mode: Thinker-Talker pipeline (Deepgram STT → GPT-4o → ElevenLabs TTS)
- Text Mode: Streaming chat with citations
- Knowledge Base: Medical textbooks, guidelines, literature
- Admin Panel: User management, analytics, KB administration
## Key Directories

```
/home/asimo/VoiceAssist/
├── apps/
│   ├── web-app/          # React frontend (main app)
│   ├── admin-panel/      # Admin dashboard
│   └── docs-site/        # Next.js documentation site
├── packages/
│   ├── api-client/       # Type-safe API client
│   ├── types/            # Shared TypeScript types
│   ├── ui/               # Shared UI components
│   └── utils/            # Shared utilities
├── services/
│   └── api-gateway/      # FastAPI backend
│       └── app/
│           ├── api/      # REST endpoints
│           ├── services/ # Business logic (Thinker, Talker, etc.)
│           └── models/   # Database models
└── docs/                 # Documentation source
```
## Critical Files

### Voice Pipeline (Thinker-Talker)

| File | Purpose |
|---|---|
| `services/api-gateway/app/services/thinker_service.py` | LLM orchestration |
| `services/api-gateway/app/services/talker_service.py` | TTS synthesis |
| `services/api-gateway/app/services/thinker_talker_websocket_handler.py` | WebSocket handling |
| `services/api-gateway/app/services/sentence_chunker.py` | Text chunking for TTS |
| `apps/web-app/src/hooks/useThinkerTalkerSession.ts` | Frontend voice hook |
### API Endpoints

| File | Purpose |
|---|---|
| `services/api-gateway/app/api/conversations.py` | Chat CRUD |
| `services/api-gateway/app/api/voice.py` | Voice session management |
| `services/api-gateway/app/api/realtime.py` | Chat WebSocket |
| `services/api-gateway/app/api/auth.py` | Authentication |
### Frontend Components

| File | Purpose |
|---|---|
| `apps/web-app/src/components/ChatView.tsx` | Main chat interface |
| `apps/web-app/src/components/VoicePanel.tsx` | Voice mode UI |
| `apps/web-app/src/hooks/useChatSession.ts` | Chat state management |
## Documentation Index

### Architecture
- THINKER_TALKER_PIPELINE.md - Voice pipeline architecture
- UNIFIED_ARCHITECTURE.md - System overview
- BACKEND_ARCHITECTURE.md - API Gateway design
- FRONTEND_ARCHITECTURE.md - React app structure
### API Reference
- API_REFERENCE.md - Endpoint overview
- api-reference/rest-api.md - Complete REST docs
- api-reference/voice-pipeline-ws.md - Voice WebSocket protocol
- WEBSOCKET_PROTOCOL.md - Chat WebSocket protocol
### Services
- services/thinker-service.md - ThinkerService API
- services/talker-service.md - TalkerService API
### Data
- DATA_MODEL.md - Database schema
- CONFIGURATION_REFERENCE.md - Environment variables
## Machine-Readable Endpoints

The docs site provides JSON endpoints for programmatic access:

| Endpoint | Description |
|---|---|
| `GET /agent/index.json` | Documentation system metadata |
| `GET /agent/docs.json` | Full document list with metadata |
| `GET /search-index.json` | Full-text search index |

Base URL: https://assistdocs.asimo.io

See the Agent API Reference for details.
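For example, an agent can pull the full document list as shown in this minimal sketch. It assumes the endpoints are publicly readable; since the payload schema is documented in the Agent API Reference rather than here, only the top-level shape of the response is inspected.

```python
# Minimal sketch: fetch the machine-readable document list from the docs site.
# Assumes public, unauthenticated access; see the Agent API Reference for the
# actual payload schema.
import requests

BASE_URL = "https://assistdocs.asimo.io"

response = requests.get(f"{BASE_URL}/agent/docs.json", timeout=10)
response.raise_for_status()
docs = response.json()

# Print only the top-level shape, since the field layout is not described here.
print(type(docs).__name__, len(docs))
```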
## Common Tasks

### Adding a New API Endpoint

- Create route in `services/api-gateway/app/api/<module>.py`
- Add schema in `services/api-gateway/app/schemas/<module>.py`
- Register in `services/api-gateway/app/main.py`
- Add TypeScript types in `packages/types/src/`
- Update API client in `packages/api-client/src/`
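A minimal sketch of the first three items (schema, route, registration), collapsed into one file for illustration; the `notes` module and every name in it are hypothetical, not existing VoiceAssist code.

```python
# Hypothetical example: "notes", CreateNoteRequest, and NoteResponse are
# illustrative names. In the real repo these pieces would live in
# app/schemas/<module>.py, app/api/<module>.py, and app/main.py respectively.
from uuid import UUID, uuid4

from fastapi import APIRouter, FastAPI
from pydantic import BaseModel


# 1. Schema (would go in services/api-gateway/app/schemas/notes.py)
class CreateNoteRequest(BaseModel):
    text: str


class NoteResponse(BaseModel):
    id: UUID
    text: str


# 2. Route (would go in services/api-gateway/app/api/notes.py)
router = APIRouter()


@router.post("/notes", response_model=NoteResponse)
async def create_note(request: CreateNoteRequest) -> NoteResponse:
    # Business logic would live here; this stub just echoes the input.
    return NoteResponse(id=uuid4(), text=request.text)


# 3. Registration (would go in services/api-gateway/app/main.py)
app = FastAPI()
app.include_router(router)
```

The remaining two items mirror the new schema on the TypeScript side in `packages/types/src/` and expose a matching method in `packages/api-client/src/`.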
### Adding a Frontend Feature

- Create component in `apps/web-app/src/components/`
- Add hooks in `apps/web-app/src/hooks/`
- Use shared UI from `packages/ui/`
- Use API client from `packages/api-client/`
### Modifying Voice Pipeline

- ThinkerService changes: `thinker_service.py`
- TalkerService changes: `talker_service.py`
- WebSocket protocol: `thinker_talker_websocket_handler.py`
- Frontend hooks: `useThinkerTalkerSession.ts`
## Code Patterns

### Backend (Python/FastAPI)

```python
# Typical endpoint structure
@router.post("/conversations/{id}/messages")
async def create_message(
    id: UUID,
    request: CreateMessageRequest,
    user: User = Depends(get_current_user),
    db: Session = Depends(get_db),
) -> MessageResponse:
    # Business logic
    return MessageResponse(...)
```
### Frontend (React/TypeScript)

```typescript
// Typical hook usage
const { messages, sendMessage, isLoading } = useChatSession(conversationId);

// Voice mode
const { startListening, stopListening, isRecording } = useThinkerTalkerSession({
  onTranscript: (text) => console.log(text),
  onAudio: (audio) => playAudio(audio),
});
```
## Testing

```bash
# Backend tests
cd services/api-gateway
pytest tests/ -v

# Frontend tests
pnpm test

# Type checking
pnpm typecheck

# Linting
pnpm lint
```
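As a concrete starting point, a backend smoke test might look like the sketch below. The test file name, the `app.main` import path, and the `/health` response shape are assumptions; adjust them to match the actual application module.

```python
# tests/test_health.py -- hypothetical smoke test. The import path below
# (app.main) is an assumption about how the FastAPI app is exposed.
from fastapi.testclient import TestClient

from app.main import app

client = TestClient(app)


def test_health_returns_ok():
    # /health is listed under Quick References; its response body is not
    # specified in this guide, so only the status code is asserted.
    response = client.get("/health")
    assert response.status_code == 200
```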
## Environment Setup

Required API keys (in `.env`):

```bash
OPENAI_API_KEY=sk-...          # GPT-4o for Thinker
DEEPGRAM_API_KEY=...           # Speech-to-text
ELEVENLABS_API_KEY=...         # Text-to-speech
DATABASE_URL=postgresql://...  # PostgreSQL
REDIS_URL=redis://...          # Cache
```
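A quick sanity check (not part of the repository) for confirming these variables are present before starting the backend:

```python
# Standalone sanity check: verifies the variables listed above are set.
import os

REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "ELEVENLABS_API_KEY",
    "DATABASE_URL",
    "REDIS_URL",
]

missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
print("All required environment variables are set.")
```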
## Quick References

- OpenAPI Spec: http://localhost:8000/openapi.json
- Swagger UI: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
- Docs Site: https://assistdocs.asimo.io
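For programmatic discovery of the REST surface, an agent can read the OpenAPI spec directly. This sketch assumes the API Gateway is running locally on port 8000 (per the URLs above); `paths` is a standard top-level key in any OpenAPI document.

```python
# Assumes the API Gateway is running locally (see Quick References above).
import requests

spec = requests.get("http://localhost:8000/openapi.json", timeout=10).json()

# List each route and its HTTP methods from the standard "paths" object.
for path, operations in sorted(spec.get("paths", {}).items()):
    print(path, "->", ", ".join(op.upper() for op in operations))
```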
## Related Documentation
- Agent Onboarding - Detailed onboarding guide
- Agent Task Index - Common tasks and relevant docs
- Claude Execution Guide - Claude-specific guidelines