# Part 3: Platform Enhancements - Implementation Plan

**Version:** 1.0
**Date:** 2025-11-26
**Status:** Planning
**Priority:** MEDIUM
**Estimated Duration:** 11-14 weeks

---

## Executive Summary

This document provides a comprehensive implementation plan for platform enhancements that improve the VoiceAssist foundation. These enhancements focus on design consistency, security hardening, search quality, and continuous improvement systems.

**Scope:**

1. **Design System Improvements** (2-3 weeks) - Complete design token system and documentation
2. **Client-Side Security** (2 weeks) - PHI protection and audit enhancements
3. **Advanced RAG Techniques** (4-5 weeks) - Hybrid search, re-ranking, contextual retrieval
4. **Continuous Learning System** (3-4 weeks) - Feedback collection and model improvement

**Total Estimated Effort:** 11-14 weeks with 2 developers

---

## Table of Contents

1. [Current State Analysis](#current-state-analysis)
2. [Design System Improvements](#1-design-system-improvements)
3. [Client-Side Security](#2-client-side-security)
4. [Advanced RAG Techniques](#3-advanced-rag-techniques)
5. [Continuous Learning System](#4-continuous-learning-system)
6. [Implementation Phases](#implementation-phases)
7. [Technical Architecture](#technical-architecture)
8. [Risk Assessment](#risk-assessment)
9. [Success Metrics](#success-metrics)
10. [Appendices](#appendices)

---

## Current State Analysis

### What's Already Implemented

| Component | Status | Location | Notes |
| --- | --- | --- | --- |
| **Design Tokens (Colors)** | ✅ Complete | `packages/design-tokens/src/colors.ts` | Light/dark themes, WCAG AA |
| **Design Tokens (Spacing)** | ✅ Complete | `packages/design-tokens/src/spacing.ts` | 4px base scale |
| **Design Tokens (Typography)** | ✅ Complete | `packages/design-tokens/src/typography.ts` | Font scales defined |
| **Storybook Setup** | ✅ Complete | `packages/ui/.storybook/` | 12 component stories |
| **Theme Provider** | ✅ Complete | `packages/ui/src/providers/ThemeProvider.tsx` | Context-based theming |
| **PHI Detector (Backend)** | ✅ Complete | `services/api-gateway/app/services/phi_detector.py` | Pattern-based detection |
| **PHI Redaction MW** | ✅ Complete | `services/api-gateway/app/middleware/phi_redaction.py` | Request/response filtering |
| **Audit Service** | ✅ Complete | `services/api-gateway/app/services/audit_service.py` | HIPAA-compliant logging |
| **Vector Search** | ✅ Complete | `services/api-gateway/app/services/search_aggregator.py` | Qdrant + OpenAI embeddings |
| **RAG Service** | ✅ Complete | `services/api-gateway/app/services/rag_service.py` | Basic RAG pipeline |
| **Sentry Integration** | ✅ Complete | `services/api-gateway/app/core/sentry.py` | Error tracking configured |

### What's Missing (This Plan)

| Component | Priority | Complexity | Dependencies |
| --- | --- | --- | --- |
| Animation Tokens | MEDIUM | Low | Design tokens |
| Medical UI Components | MEDIUM | Medium | Design tokens, Storybook |
| Component Docs (Storybook) | MEDIUM | Low | Existing components |
| Client-Side PHI Detection | HIGH | Medium | PHI patterns |
| Encrypted Local Storage | HIGH | Medium | Web Crypto API |
| Session Audit Trail (FE) | MEDIUM | Low | Audit service API |
| Hybrid Search (BM25) | HIGH | High | Elasticsearch/Meilisearch |
| Cross-Encoder Re-ranking | HIGH | High | sentence-transformers |
| Medical Synonym Expansion | MEDIUM | Medium | UMLS/SNOMED CT |
| Contextual Retrieval | MEDIUM | Medium | Chunk metadata |
| Feedback Collection | HIGH | Medium | Frontend UI, Backend API |
| A/B Testing Framework | MEDIUM | High | Feature flags, Analytics |
| KB Curation Dashboard | MEDIUM | Medium | Admin panel |

---

## 1. Design System Improvements

### 1.1 Overview

**Objective:** Establish a comprehensive, documented design system that ensures UI consistency across all VoiceAssist applications.

**Current State:** Basic design tokens exist (colors, spacing, typography), with Storybook configured and 12 component stories.

**Target State:** Complete design system with animations, medical-themed components, interactive documentation, and WCAG AAA compliance.
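To keep the tokens the single source of truth across apps, they can be threaded into the shared Tailwind preset. The sketch below is illustrative, not the repo's actual preset: token values are inlined here for self-containment, and the `tokenPreset` name and `@voiceassist/design-tokens` import path it stands in for are assumptions.

```typescript
// Hypothetical sketch: feeding design tokens into the shared Tailwind preset.
// In the repo these consts would come from the design-tokens package rather
// than being inlined.

const durations = { fast: "100ms", normal: "200ms", slow: "300ms" } as const;

const shadows = {
  sm: "0 1px 2px 0 rgb(0 0 0 / 0.05)",
  md: "0 4px 6px -1px rgb(0 0 0 / 0.1), 0 2px 4px -2px rgb(0 0 0 / 0.1)",
} as const;

// Every app consuming the preset gets identical motion and elevation values,
// so a token change propagates without per-app edits.
export const tokenPreset = {
  theme: {
    extend: {
      transitionDuration: durations,
      boxShadow: shadows,
    },
  },
};
```

A per-app `tailwind.config.js` would then list this object in its `presets` array.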
### 1.2 Technical Architecture

```
packages/
├── design-tokens/
│   └── src/
│       ├── colors.ts         ✅ Complete
│       ├── spacing.ts        ✅ Complete
│       ├── typography.ts     ✅ Complete
│       ├── animations.ts     🔲 NEW - Motion tokens
│       ├── shadows.ts        🔲 NEW - Elevation system
│       ├── breakpoints.ts    🔲 NEW - Responsive breakpoints
│       └── index.ts
├── ui/
│   ├── .storybook/           ✅ Configured
│   └── src/
│       ├── components/
│       │   ├── primitives/   ✅ Button, Input, etc.
│       │   └── medical/      🔲 NEW - VitalSignCard, MedicationList, etc.
│       ├── stories/
│       │   ├── *.stories.tsx ✅ 12 stories exist
│       │   └── medical/      🔲 NEW - Medical component stories
│       └── providers/
│           └── ThemeProvider.tsx ✅ Complete
└── tailwind-config/
    └── tailwind.preset.js    ✅ Shared config
```

### 1.3 Component Specifications

#### 1.3.1 Animation Tokens

**File:** `packages/design-tokens/src/animations.ts`

```typescript
/**
 * Animation tokens following medical UI best practices:
 * - Reduced motion support
 * - Subtle, non-distracting transitions
 * - Clear feedback for interactions
 */

export const durations = {
  instant: "0ms",
  fast: "100ms",
  normal: "200ms",
  slow: "300ms",
  slower: "500ms",
} as const;

export const easings = {
  linear: "linear",
  easeIn: "cubic-bezier(0.4, 0, 1, 1)",
  easeOut: "cubic-bezier(0, 0, 0.2, 1)",
  easeInOut: "cubic-bezier(0.4, 0, 0.2, 1)",
  spring: "cubic-bezier(0.175, 0.885, 0.32, 1.275)",
} as const;

export const animations = {
  fadeIn: {
    keyframes: { from: { opacity: 0 }, to: { opacity: 1 } },
    duration: durations.normal,
    easing: easings.easeOut,
  },
  slideUp: {
    keyframes: {
      from: { transform: "translateY(8px)", opacity: 0 },
      to: { transform: "translateY(0)", opacity: 1 },
    },
    duration: durations.normal,
    easing: easings.easeOut,
  },
  pulse: {
    keyframes: {
      "0%, 100%": { opacity: 1 },
      "50%": { opacity: 0.5 },
    },
    duration: durations.slower,
    easing: easings.easeInOut,
    iterationCount: "infinite",
  },
  // Medical-specific: Alert pulse for critical values
  criticalPulse: {
    keyframes: {
      "0%, 100%": {
        boxShadow: "0 0 0 0 rgba(239, 68, 68, 0.4)",
        borderColor: "var(--color-error-500)",
      },
      "50%": {
        boxShadow: "0 0 0 8px rgba(239, 68, 68, 0)",
        borderColor: "var(--color-error-600)",
      },
    },
    duration: "1.5s",
    easing: easings.easeInOut,
    iterationCount: "infinite",
  },
} as const;

// Reduced motion variants
export const reducedMotionAnimations = {
  fadeIn: { ...animations.fadeIn, duration: durations.instant },
  slideUp: { ...animations.fadeIn, duration: durations.instant }, // Fall back to fade
  pulse: null, // Disable pulsing animations
  criticalPulse: null,
} as const;
```

#### 1.3.2 Shadow/Elevation Tokens

**File:** `packages/design-tokens/src/shadows.ts`

```typescript
/**
 * Elevation system for depth and hierarchy
 * Based on Material Design principles, adapted for medical UI
 */

export const shadows = {
  none: "none",
  sm: "0 1px 2px 0 rgb(0 0 0 / 0.05)",
  md: "0 4px 6px -1px rgb(0 0 0 / 0.1), 0 2px 4px -2px rgb(0 0 0 / 0.1)",
  lg: "0 10px 15px -3px rgb(0 0 0 / 0.1), 0 4px 6px -4px rgb(0 0 0 / 0.1)",
  xl: "0 20px 25px -5px rgb(0 0 0 / 0.1), 0 8px 10px -6px rgb(0 0 0 / 0.1)",
  // Medical-specific: Focus ring for accessibility
  focus: "0 0 0 3px var(--color-primary-500 / 0.3)",
  focusError: "0 0 0 3px var(--color-error-500 / 0.3)",
} as const;

export const elevation = {
  surface: shadows.none, // Base level (cards, panels)
  raised: shadows.sm,    // Slightly elevated (buttons)
  overlay: shadows.md,   // Dropdowns, tooltips
  modal: shadows.lg,     // Modals, dialogs
  floating: shadows.xl,  // FABs, floating elements
} as const;
```

#### 1.3.3 Medical UI Components

**File:** `packages/ui/src/components/medical/VitalSignCard.tsx`

```tsx
/**
 * VitalSignCard - Displays a single vital sign with status indication
 *
 * Features:
 * - Color-coded status (normal, warning, critical)
 * - Trend indicator (up, down, stable)
 * - Accessibility: High contrast, screen reader friendly
 * - Animation: Critical pulse for out-of-range values
 */

import React from "react";
import { cn } from "../../utils/cn";

export interface VitalSignCardProps {
  label: string;
  value: number | string;
  unit: string;
  status: "normal" | "warning" | "critical";
  trend?: "up" | "down" | "stable";
  normalRange?: { min: number; max: number };
  timestamp?: Date;
  className?: string;
}

const statusStyles = {
  normal: "bg-success-50 border-success-200 text-success-800",
  warning: "bg-warning-50 border-warning-200 text-warning-800",
  critical: "bg-error-50 border-error-200 text-error-800 animate-critical-pulse",
};

const trendIcons = {
  up: "↑",
  down: "↓",
  stable: "→",
};

export const VitalSignCard: React.FC<VitalSignCardProps> = ({
  label,
  value,
  unit,
  status,
  trend,
  normalRange,
  timestamp,
  className,
}) => {
  return (
    <div
      className={cn("rounded-lg border p-4", statusStyles[status], className)}
      role="status"
      aria-label={`${label}: ${value} ${unit}, ${status}`}
    >
      <div className="flex items-center justify-between">
        <span className="text-sm font-medium">{label}</span>
        {trend && <span aria-hidden="true">{trendIcons[trend]}</span>}
      </div>
      <div className="mt-1 text-2xl font-semibold">
        {value} <span className="text-sm font-normal">{unit}</span>
      </div>
      {normalRange && (
        <div className="mt-1 text-xs opacity-75">
          Normal: {normalRange.min}-{normalRange.max} {unit}
        </div>
      )}
      {timestamp && (
        <div className="mt-1 text-xs opacity-75">{timestamp.toLocaleTimeString()}</div>
      )}
    </div>
  );
};
```

**File:** `packages/ui/src/components/medical/MedicationList.tsx`

```tsx
/**
 * MedicationList - Displays patient medications with interaction warnings
 */

import React from "react";

export interface Medication {
  id: string;
  name: string;
  dosage: string;
  frequency: string;
  route: string;
  startDate: Date;
  endDate?: Date;
  prescriber?: string;
  interactions?: string[];
  contraindications?: string[];
}

export interface MedicationListProps {
  medications: Medication[];
  showInteractions?: boolean;
  onMedicationClick?: (med: Medication) => void;
  className?: string;
}

export const MedicationList: React.FC<MedicationListProps> = ({
  medications,
  showInteractions = true,
  onMedicationClick,
  className,
}) => {
  const hasInteractions = medications.some((m) => m.interactions?.length);

  return (
    <div className={className}>
      {hasInteractions && showInteractions && (
        <div className="mb-2 rounded-md border border-warning-200 bg-warning-50 p-3" role="alert">
          <p className="font-medium text-warning-800">Drug Interactions Detected</p>
          <p className="text-sm text-warning-700">Review potential interactions below</p>
        </div>
      )}
      <ul className="divide-y">
        {medications.map((med) => (
          <li
            key={med.id}
            className="cursor-pointer p-3"
            onClick={() => onMedicationClick?.(med)}
            role="button"
            tabIndex={0}
            onKeyDown={(e) => e.key === "Enter" && onMedicationClick?.(med)}
          >
            <div className="flex items-center justify-between">
              <span className="font-medium">
                {med.name} <span className="text-sm font-normal">{med.dosage}</span>
              </span>
              {med.interactions?.length ? (
                <span className="text-xs text-error-700">
                  {med.interactions.length} interaction{med.interactions.length > 1 ? "s" : ""}
                </span>
              ) : null}
            </div>
            <p className="text-sm text-gray-600">
              {med.frequency} · {med.route}
            </p>
          </li>
        ))}
      </ul>
    </div>
  );
};
```

### 1.4 Implementation Tasks

| Task | Priority | Effort | Dependencies |
| --- | --- | --- | --- |
| Create animation tokens | HIGH | 4h | None |
| Create shadow/elevation tokens | HIGH | 2h | None |
| Create breakpoint tokens | MEDIUM | 2h | None |
| Build VitalSignCard component | HIGH | 4h | Tokens |
| Build MedicationList component | HIGH | 4h | Tokens |
| Build AlertBanner component | MEDIUM | 3h | Tokens |
| Build TimelineEvent component | MEDIUM | 4h | Tokens |
| Build ClinicalNote component | MEDIUM | 4h | Tokens |
| Add Storybook stories for new components | HIGH | 6h | Components |
| Write Storybook MDX documentation | MEDIUM | 8h | Stories |
| Add WCAG AAA contrast validation | HIGH | 4h | Colors |
| Create theme toggle demo page | LOW | 2h | Theme system |
| **Total** | | **47h** | |

### 1.5 Deliverables

1. `packages/design-tokens/src/animations.ts` - Animation token definitions
2. `packages/design-tokens/src/shadows.ts` - Elevation system
3. `packages/design-tokens/src/breakpoints.ts` - Responsive breakpoints
4. `packages/ui/src/components/medical/*` - 5+ medical UI components
5. `packages/ui/src/stories/medical/*` - Storybook stories with docs
6. Updated `packages/ui/README.md` with usage guidelines
7. Storybook deployment at `storybook.voiceassist.dev` (optional)

---

## 2. Client-Side Security

### 2.1 Overview

**Objective:** Extend HIPAA-compliant security to the frontend with PHI detection, encrypted storage, and comprehensive audit trails.

**Current State:** Backend has PHI detection (`phi_detector.py`), redaction middleware, and audit logging. Frontend has no client-side PHI protection.

**Target State:** Client-side PHI detection with warnings, encrypted IndexedDB storage, and session audit trails synced to backend.
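The intended flow is: detect client-side, warn before submission, and leave the backend as the authoritative validator. As a minimal sketch of that gate, the snippet below uses a single stand-in pattern and hypothetical `guardedSubmit`/`warn` names; the real implementation would use the full detector service and warning dialog:

```typescript
// Illustrative sketch of the client-side PHI gate. A single SSN pattern
// stands in for the full detector; backend validation remains authoritative
// regardless of what this check passes.

const SSN_PATTERN = /\b\d{3}[- ]?\d{2}[- ]?\d{4}\b/;

function containsLikelyPhi(text: string): boolean {
  return SSN_PATTERN.test(text);
}

// Hypothetical submit guard: block the send and surface a warning instead of
// silently transmitting possible PHI. Returns true if the text was sent.
function guardedSubmit(
  text: string,
  send: (t: string) => void,
  warn: () => void,
): boolean {
  if (containsLikelyPhi(text)) {
    warn(); // e.g. open a warning dialog offering sanitize / acknowledge
    return false; // submission blocked until the user decides
  }
  send(text);
  return true;
}
```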
### 2.2 Technical Architecture

```
apps/web-app/src/
├── services/
│   ├── phi/
│   │   ├── PhiDetector.ts        🔲 NEW - Client-side PHI detection
│   │   ├── PhiWarningDialog.tsx  🔲 NEW - Warning UI component
│   │   └── patterns.ts           🔲 NEW - PHI regex patterns
│   ├── storage/
│   │   ├── EncryptedStorage.ts   🔲 NEW - Encrypted IndexedDB wrapper
│   │   ├── CryptoUtils.ts        🔲 NEW - Web Crypto API utilities
│   │   └── StorageSchema.ts      🔲 NEW - Schema definitions
│   └── audit/
│       ├── AuditTrail.ts         🔲 NEW - Client-side audit logger
│       ├── SessionRecorder.ts    🔲 NEW - Session activity recorder
│       └── AuditSync.ts          🔲 NEW - Background sync to backend
├── hooks/
│   ├── usePhiDetection.ts        🔲 NEW - PHI detection hook
│   ├── useEncryptedStorage.ts    🔲 NEW - Encrypted storage hook
│   └── useAuditTrail.ts          🔲 NEW - Audit trail hook
└── components/
    └── security/
        ├── PhiWarningBanner.tsx  🔲 NEW - Warning banner component
        └── SessionActivityLog.tsx 🔲 NEW - Activity log viewer
```

### 2.3 Component Specifications

#### 2.3.1 Client-Side PHI Detector

**File:** `apps/web-app/src/services/phi/PhiDetector.ts`

```typescript
/**
 * Client-Side PHI Detection Service
 *
 * Mirrors backend PHI detection for real-time warnings before submission.
 * Uses pattern matching similar to services/api-gateway/app/services/phi_detector.py
 *
 * IMPORTANT: This is a defensive layer. Backend validation is still authoritative.
 */

export interface PhiDetectionResult {
  containsPhi: boolean;
  phiTypes: PhiType[];
  confidence: number;
  matches: PhiMatch[];
}

export interface PhiMatch {
  type: PhiType;
  value: string;
  startIndex: number;
  endIndex: number;
  redacted: string;
}

export type PhiType =
  | "ssn"
  | "phone"
  | "email"
  | "mrn"
  | "account"
  | "ip_address"
  | "dob"
  | "name";

// Pattern definitions matching backend
const PHI_PATTERNS: Record<PhiType, RegExp> = {
  ssn: /\b\d{3}[- ]?\d{2}[- ]?\d{4}\b/g,
  phone: /\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g,
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/g,
  mrn: /\b(?:MRN|mrn|medical record|record number)[\s:-]?\d{6,}\b/gi,
  account: /\b(?:ACCT|acct|account)[\s:-]?\d{6,}\b/gi,
  ip_address: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g,
  dob: /\b(?:born|dob|date of birth|birthday)[\s:]?(?:0?[1-9]|1[0-2])[/-](?:0?[1-9]|[12][0-9]|3[01])[/-](?:19|20)\d{2}\b/gi,
  name: /\b[A-Z][a-z]+ [A-Z][a-z]+\b/g,
};

// Medical terms to exclude from name detection
const MEDICAL_TERMS = new Set([
  "heart disease",
  "blood pressure",
  "diabetes mellitus",
  "atrial fibrillation",
  "chronic kidney",
  "coronary artery",
  "pulmonary embolism",
  "myocardial infarction",
  // ... extend as needed
]);

export class PhiDetector {
  /**
   * Detect PHI in text
   */
  detect(text: string): PhiDetectionResult {
    if (!text) {
      return { containsPhi: false, phiTypes: [], confidence: 1, matches: [] };
    }

    const matches: PhiMatch[] = [];
    const phiTypes = new Set<PhiType>();

    for (const [type, pattern] of Object.entries(PHI_PATTERNS) as [PhiType, RegExp][]) {
      // Reset regex state
      pattern.lastIndex = 0;
      let match: RegExpExecArray | null;
      while ((match = pattern.exec(text)) !== null) {
        // Filter out medical terms for name detection
        if (type === "name" && MEDICAL_TERMS.has(match[0].toLowerCase())) {
          continue;
        }
        matches.push({
          type,
          value: match[0],
          startIndex: match.index,
          endIndex: match.index + match[0].length,
          redacted: this.redactValue(type, match[0]),
        });
        phiTypes.add(type);
      }
    }

    return {
      containsPhi: matches.length > 0,
      phiTypes: Array.from(phiTypes),
      confidence: 0.8, // Pattern matching confidence
      matches,
    };
  }

  /**
   * Sanitize text by redacting detected PHI
   */
  sanitize(text: string): string {
    const result = this.detect(text);
    let sanitized = text;

    // Process matches in reverse order to preserve indices
    const sortedMatches = [...result.matches].sort((a, b) => b.startIndex - a.startIndex);
    for (const match of sortedMatches) {
      sanitized =
        sanitized.slice(0, match.startIndex) + match.redacted + sanitized.slice(match.endIndex);
    }
    return sanitized;
  }

  private redactValue(type: PhiType, value: string): string {
    return `[${type.toUpperCase()}_REDACTED]`;
  }
}

// Singleton instance
export const phiDetector = new PhiDetector();
```

#### 2.3.2 PHI Detection Hook

**File:** `apps/web-app/src/hooks/usePhiDetection.ts`

```typescript
import { useState, useCallback, useEffect } from "react";
import { phiDetector, PhiDetectionResult } from "../services/phi/PhiDetector";
import { useDebounce } from "./useDebounce";

interface UsePhiDetectionOptions {
  debounceMs?: number;
  onPhiDetected?: (result: PhiDetectionResult) => void;
}

export function usePhiDetection(options: UsePhiDetectionOptions = {}) {
  const { debounceMs = 300, onPhiDetected } = options;
  const [text, setText] = useState("");
  const [result, setResult] = useState<PhiDetectionResult | null>(null);
  const [showWarning, setShowWarning] = useState(false);

  const debouncedText = useDebounce(text, debounceMs);

  // Run detection when debounced text changes.
  // (useEffect rather than useMemo: detection updates state, a side effect.)
  useEffect(() => {
    if (debouncedText) {
      const detection = phiDetector.detect(debouncedText);
      setResult(detection);
      if (detection.containsPhi) {
        setShowWarning(true);
        onPhiDetected?.(detection);
      }
    } else {
      setResult(null);
      setShowWarning(false);
    }
  }, [debouncedText, onPhiDetected]);

  const checkText = useCallback((newText: string) => {
    setText(newText);
  }, []);

  const sanitizeText = useCallback(() => {
    return phiDetector.sanitize(text);
  }, [text]);

  const dismissWarning = useCallback(() => {
    setShowWarning(false);
  }, []);

  const acknowledgeAndProceed = useCallback(() => {
    // Log acknowledgment for audit
    console.info("[PHI] User acknowledged PHI warning and proceeded");
    setShowWarning(false);
    return text; // Return original text if user chooses to proceed
  }, [text]);

  return {
    checkText,
    result,
    showWarning,
    dismissWarning,
    sanitizeText,
    acknowledgeAndProceed,
  };
}
```

#### 2.3.3 Encrypted Storage Service

**File:** `apps/web-app/src/services/storage/EncryptedStorage.ts`

```typescript
/**
 * Encrypted IndexedDB Storage
 *
 * Uses Web Crypto API for AES-GCM encryption of sensitive data.
 * Keys are derived from user authentication tokens.
 *
 * Use cases:
 * - Offline voice recordings awaiting sync
 * - Cached clinical context
 * - Session state
 */

import { openDB, DBSchema, IDBPDatabase } from "idb";

interface EncryptedStorageSchema extends DBSchema {
  "encrypted-data": {
    key: string;
    value: {
      id: string;
      encrypted: ArrayBuffer;
      iv: Uint8Array;
      timestamp: number;
      metadata?: Record<string, unknown>;
    };
  };
  "session-audit": {
    key: number;
    value: {
      id: number;
      action: string;
      timestamp: number;
      details: Record<string, unknown>;
      synced: 0 | 1; // 0/1 rather than boolean: IndexedDB cannot index boolean keys
    };
    indexes: { "by-synced": number };
  };
}

export class EncryptedStorage {
  private db: IDBPDatabase<EncryptedStorageSchema> | null = null;
  private encryptionKey: CryptoKey | null = null;

  async init(userToken: string): Promise<void> {
    // Derive encryption key from user token
    this.encryptionKey = await this.deriveKey(userToken);

    // Open IndexedDB
    this.db = await openDB<EncryptedStorageSchema>("voiceassist-secure", 1, {
      upgrade(db) {
        db.createObjectStore("encrypted-data", { keyPath: "id" });
        const auditStore = db.createObjectStore("session-audit", {
          keyPath: "id",
          autoIncrement: true,
        });
        auditStore.createIndex("by-synced", "synced");
      },
    });
  }

  private async deriveKey(token: string): Promise<CryptoKey> {
    const encoder = new TextEncoder();
    const keyMaterial = await crypto.subtle.importKey(
      "raw",
      encoder.encode(token),
      "PBKDF2",
      false,
      ["deriveKey"],
    );
    return crypto.subtle.deriveKey(
      {
        name: "PBKDF2",
        salt: encoder.encode("voiceassist-salt-v1"), // Static salt is OK for this use case
        iterations: 100000,
        hash: "SHA-256",
      },
      keyMaterial,
      { name: "AES-GCM", length: 256 },
      false,
      ["encrypt", "decrypt"],
    );
  }

  async store(id: string, data: unknown, metadata?: Record<string, unknown>): Promise<void> {
    if (!this.db || !this.encryptionKey) {
      throw new Error("EncryptedStorage not initialized");
    }
    const encoder = new TextEncoder();
    const iv = crypto.getRandomValues(new Uint8Array(12));
    const encrypted = await crypto.subtle.encrypt(
      { name: "AES-GCM", iv },
      this.encryptionKey,
      encoder.encode(JSON.stringify(data)),
    );
    await this.db.put("encrypted-data", {
      id,
      encrypted,
      iv,
      timestamp: Date.now(),
      metadata,
    });
  }

  async retrieve<T>(id: string): Promise<T | null> {
    if (!this.db || !this.encryptionKey) {
      throw new Error("EncryptedStorage not initialized");
    }
    const record = await this.db.get("encrypted-data", id);
    if (!record) return null;

    const decrypted = await crypto.subtle.decrypt(
      { name: "AES-GCM", iv: record.iv },
      this.encryptionKey,
      record.encrypted,
    );
    const decoder = new TextDecoder();
    return JSON.parse(decoder.decode(decrypted)) as T;
  }

  async delete(id: string): Promise<void> {
    if (!this.db) throw new Error("EncryptedStorage not initialized");
    await this.db.delete("encrypted-data", id);
  }

  async logAuditEvent(action: string, details: Record<string, unknown>): Promise<void> {
    if (!this.db) throw new Error("EncryptedStorage not initialized");
    await this.db.add("session-audit", {
      id: Date.now(), // Timestamp key (an explicit keyPath value takes precedence over autoIncrement)
      action,
      timestamp: Date.now(),
      details,
      synced: 0,
    });
  }

  async getUnsyncedAuditEvents(): Promise<
    Array<{
      id: number;
      action: string;
      timestamp: number;
      details: Record<string, unknown>;
    }>
  > {
    if (!this.db) throw new Error("EncryptedStorage not initialized");
    return this.db.getAllFromIndex("session-audit", "by-synced", 0);
  }

  async markAuditEventsSynced(ids: number[]): Promise<void> {
    if (!this.db) throw new Error("EncryptedStorage not initialized");
    const tx = this.db.transaction("session-audit", "readwrite");
    for (const id of ids) {
      const event = await tx.store.get(id);
      if (event) {
        await tx.store.put({ ...event, synced: 1 });
      }
    }
    await tx.done;
  }

  async clear(): Promise<void> {
    if (!this.db) throw new Error("EncryptedStorage not initialized");
    await this.db.clear("encrypted-data");
    await this.db.clear("session-audit");
  }
}

export const encryptedStorage = new EncryptedStorage();
```

#### 2.3.4 Session Audit Trail

**File:** `apps/web-app/src/services/audit/AuditTrail.ts`

```typescript
/**
 * Session Audit Trail
 *
 * Tracks user actions for HIPAA compliance and security monitoring.
 * Stores locally and syncs to backend audit service.
*/ import { encryptedStorage } from "../storage/EncryptedStorage"; export type AuditAction = | "session_start" | "session_end" | "message_sent" | "message_received" | "phi_warning_shown" | "phi_warning_acknowledged" | "phi_warning_dismissed" | "clinical_context_set" | "clinical_context_cleared" | "voice_mode_started" | "voice_mode_ended" | "file_uploaded" | "export_requested" | "navigation" | "error"; export interface AuditEvent { action: AuditAction; timestamp: number; sessionId: string; userId?: string; details: Record; } class AuditTrail { private sessionId: string; private userId: string | null = null; private syncInterval: ReturnType | null = null; constructor() { this.sessionId = this.generateSessionId(); } private generateSessionId(): string { return `${Date.now()}-${Math.random().toString(36).substring(2, 9)}`; } init(userId?: string): void { this.userId = userId ?? null; this.log("session_start", { userAgent: navigator.userAgent }); // Sync every 30 seconds this.syncInterval = setInterval(() => this.sync(), 30000); // Sync on page unload window.addEventListener("beforeunload", () => { this.log("session_end", {}); this.sync(); // Best effort sync }); } log(action: AuditAction, details: Record): void { const event: AuditEvent = { action, timestamp: Date.now(), sessionId: this.sessionId, userId: this.userId ?? 
undefined, details, }; // Store locally encryptedStorage .logAuditEvent(action, { ...details, sessionId: this.sessionId, userId: this.userId, }) .catch(console.error); // Also log to console in development if (process.env.NODE_ENV === "development") { console.debug("[Audit]", action, details); } } async sync(): Promise { try { const unsyncedEvents = await encryptedStorage.getUnsyncedAuditEvents(); if (unsyncedEvents.length === 0) return; // POST to backend audit endpoint const response = await fetch("/api/audit/batch", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ events: unsyncedEvents }), credentials: "include", }); if (response.ok) { await encryptedStorage.markAuditEventsSynced(unsyncedEvents.map((e) => e.id)); } } catch (error) { console.error("[Audit] Sync failed:", error); } } destroy(): void { if (this.syncInterval) { clearInterval(this.syncInterval); } } } export const auditTrail = new AuditTrail(); ``` ### 2.4 Implementation Tasks | Task | Priority | Effort | Dependencies | | ------------------------------------- | -------- | ------- | ---------------- | | Create PhiDetector service | HIGH | 6h | None | | Create usePhiDetection hook | HIGH | 3h | PhiDetector | | Build PhiWarningBanner component | HIGH | 4h | Hook | | Integrate PHI warnings in ChatInput | HIGH | 4h | Banner | | Create EncryptedStorage service | HIGH | 8h | idb library | | Create useEncryptedStorage hook | MEDIUM | 3h | EncryptedStorage | | Create AuditTrail service | HIGH | 6h | EncryptedStorage | | Create useAuditTrail hook | MEDIUM | 2h | AuditTrail | | Build SessionActivityLog component | LOW | 4h | AuditTrail | | Add backend /api/audit/batch endpoint | HIGH | 4h | None | | Write unit tests | HIGH | 8h | All components | | Write E2E tests for PHI flow | MEDIUM | 4h | Integration | | **Total** | | **56h** | | ### 2.5 Deliverables 1. `apps/web-app/src/services/phi/*` - PHI detection service and patterns 2. 
`apps/web-app/src/services/storage/*` - Encrypted IndexedDB storage 3. `apps/web-app/src/services/audit/*` - Audit trail service with sync 4. `apps/web-app/src/hooks/usePhi*.ts` - React hooks for security features 5. `apps/web-app/src/components/security/*` - Warning banners and activity log 6. Backend `/api/audit/batch` endpoint for audit sync 7. Unit and E2E tests with >80% coverage --- ## 3. Advanced RAG Techniques ### 3.1 Overview **Objective:** Significantly improve search quality through hybrid search, re-ranking, and medical-domain optimizations. **Current State:** Vector-only search using Qdrant with OpenAI embeddings (`search_aggregator.py`). No lexical search, no re-ranking. **Target State:** Hybrid search (semantic + BM25), cross-encoder re-ranking, medical synonym expansion, and metadata filtering. ### 3.2 Technical Architecture ``` services/api-gateway/app/services/ ├── search/ │ ├── search_aggregator.py ✅ Exists - Vector search only │ ├── hybrid_search.py 🔲 NEW - Combines vector + lexical │ ├── bm25_index.py 🔲 NEW - BM25 lexical search │ ├── cross_encoder.py 🔲 NEW - Re-ranking service │ ├── query_expansion.py 🔲 NEW - Medical synonym expansion │ └── contextual_retrieval.py 🔲 NEW - Chunk context enhancement ├── medical/ │ ├── synonym_database.py 🔲 NEW - UMLS/SNOMED synonyms │ └── abbreviation_expander.py 🔲 NEW - Medical abbreviations └── rag_service.py ✅ Exists - Main RAG orchestration External Dependencies: ├── Meilisearch (or Elasticsearch) - Lexical search engine ├── sentence-transformers - Cross-encoder models └── UMLS API (optional) - Medical synonyms ``` ### 3.3 Hybrid Search Architecture ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ User Query │ │ "What are the contraindications for ASA?" 
└────────────────────────────────┬────────────────────────────────────────┘
                                 │
                                 ▼
                  ┌──────────────────────────────┐
                  │      Query Preprocessor      │
                  │ cleaning → abbreviation      │
                  │ expansion ("ASA" → "aspirin")│
                  │ → synonym expansion          │
                  │   (UMLS/SNOMED)              │
                  └──────────────┬───────────────┘
                  ┌──────────────┴───────────────┐
                  ▼                              ▼
     ┌─────────────────────────┐    ┌─────────────────────────┐
     │     Semantic Search     │    │      Lexical Search     │
     │ OpenAI embeddings       │    │ BM25 via Meilisearch    │
     │ (text-embedding-3-small)│    │ (exact keyword match)   │
     │ → Qdrant vector DB      │    │ → Meilisearch index     │
     │   (cosine similarity)   │    │   (BM25 scoring)        │
     │ → top K=50 results      │    │ → top K=50 results      │
     └────────────┬────────────┘    └────────────┬────────────┘
                  └──────────────┬───────────────┘
                                 ▼
                  ┌──────────────────────────────┐
                  │ Reciprocal Rank Fusion (RRF) │
                  │ score += 1 / (k + rank) per  │
                  │ list; k=60; deduplicates and │
                  │ normalizes scores            │
                  └──────────────┬───────────────┘
                                 ▼
                  ┌──────────────────────────────┐
                  │   Cross-Encoder Re-ranking   │
                  │ model: cross-encoder/        │
                  │   ms-marco-MiniLM-L-6-v2     │
                  │ input: (query, passage) pairs│
                  │ output: relevance scores     │
                  │ top 20 candidates →          │
                  │   re-ranked top 10           │
                  └──────────────┬───────────────┘
                                 ▼
                  ┌──────────────────────────────┐
                  │    Contextual Enrichment     │
                  │ - add surrounding paragraph  │
                  │   context                    │
                  │ - include document metadata  │
                  │   (chapter, section)         │
                  │ - apply metadata filters     │
                  │   (date, source type,        │
                  │   specialty)                 │
                  └──────────────┬───────────────┘
                                 ▼
                  ┌──────────────────────────────┐
                  │    Final Results (Top 10)    │
                  │ [{ content, score, metadata, │
                  │    context, source }]        │
                  └──────────────────────────────┘
```

### 3.4 Component Specifications

#### 3.4.1 Hybrid Search Service

**File:** `services/api-gateway/app/services/search/hybrid_search.py`

```python
"""
Hybrid Search Service

Combines semantic (vector) and lexical (BM25) search using
Reciprocal Rank Fusion (RRF) for optimal retrieval.

Research basis:
- "Reciprocal Rank Fusion outperforms Condorcet and individual
  Rank Learning Methods" (Cormack et al., 2009)
- Anthropic's "Contextual Retrieval" blog post (2024)
"""

from typing import List, Dict, Optional, Any
from dataclasses import dataclass
import asyncio
import logging

from .search_aggregator import SearchAggregator  # Existing semantic search
from .bm25_index import BM25Index
from .cross_encoder import CrossEncoderReranker

logger = logging.getLogger(__name__)


@dataclass
class HybridSearchResult:
    """Result from hybrid search"""

    doc_id: str
    content: str
    score: float
    semantic_rank: Optional[int]
    lexical_rank: Optional[int]
    rerank_score: Optional[float]
    metadata: Dict[str, Any]


class HybridSearchService:
    """
    Hybrid search combining semantic and lexical retrieval.

    Architecture:
    1. Query preprocessing (synonym expansion, abbreviations)
    2. Parallel semantic + lexical search
    3. Reciprocal Rank Fusion
    4. Cross-encoder re-ranking
    5. Contextual enrichment
    """

    def __init__(
        self,
        semantic_search: SearchAggregator,
        lexical_search: BM25Index,
        reranker: CrossEncoderReranker,
        semantic_weight: float = 0.5,
        rrf_k: int = 60,
    ):
        self.semantic_search = semantic_search
        self.lexical_search = lexical_search
        self.reranker = reranker
        self.semantic_weight = semantic_weight
        self.rrf_k = rrf_k

    async def search(
        self,
        query: str,
        top_k: int = 10,
        expand_synonyms: bool = True,
        rerank: bool = True,
        filters: Optional[Dict[str, Any]] = None,
    ) -> List[HybridSearchResult]:
        """
        Execute hybrid search.

        Args:
            query: Search query
            top_k: Number of results to return
            expand_synonyms: Whether to expand medical synonyms
            rerank: Whether to apply cross-encoder re-ranking
            filters: Metadata filters (e.g., {"source_type": "guideline"})

        Returns:
            List of hybrid search results
        """
        # Step 1: Preprocess query
        expanded_query = query
        if expand_synonyms:
            expanded_query = await self._expand_query(query)
            logger.debug(f"Expanded query: {query} -> {expanded_query}")

        # Step 2: Parallel search
        retrieval_k = max(top_k * 5, 50)  # Retrieve more for fusion
        semantic_task = self.semantic_search.search(
            expanded_query,
            top_k=retrieval_k,
            filter_conditions=filters,
        )
        lexical_task = self.lexical_search.search(
            expanded_query,
            top_k=retrieval_k,
            filters=filters,
        )
        semantic_results, lexical_results = await asyncio.gather(
            semantic_task, lexical_task
        )

        # Step 3: Reciprocal Rank Fusion
        fused_results = self._reciprocal_rank_fusion(
            semantic_results,
            lexical_results,
            k=self.rrf_k,
        )

        # Step 4: Re-ranking (optional)
        if rerank and len(fused_results) > 0:
            candidates = fused_results[: min(20, len(fused_results))]
            scores = await self.reranker.rerank(
                query, [r.content for r in candidates]
            )
            for result, score in zip(candidates, scores):
                result.rerank_score = score
            # Sort only the scored candidates and keep the tail in RRF order,
            # so documents that were never re-ranked cannot leapfrog scored ones.
            candidates.sort(key=lambda x: x.rerank_score, reverse=True)
            fused_results = candidates + fused_results[len(candidates):]

        # Step 5: Return top K
        return fused_results[:top_k]
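
    # Worked example of the fusion step (illustrative numbers, k=60, weight=0.5):
    #   doc X ranked #1 semantically and #3 lexically:
    #     score(X) = 0.5/(60+1) + 0.5/(60+3) ≈ 0.00820 + 0.00794 ≈ 0.01613
    #   doc Y ranked #2 semantically, absent from the lexical top-50:
    #     score(Y) = 0.5/(60+2) ≈ 0.00806, so X outranks Y
    # Only ranks enter the formula, so cosine similarities and BM25 scores
    # never need to be calibrated onto a shared scale.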
    def _reciprocal_rank_fusion(
        self,
        semantic_results: List[Any],
        lexical_results: List[Any],
        k: int = 60,
    ) -> List[HybridSearchResult]:
        """
        Combine results using Reciprocal Rank Fusion.

        RRF score = Σ 1 / (k + rank)
        """
        doc_scores: Dict[str, Dict] = {}

        # Process semantic results
        for rank, result in enumerate(semantic_results, 1):
            doc_id = result.doc_id
            rrf_score = 1 / (k + rank)
            if doc_id not in doc_scores:
                doc_scores[doc_id] = {
                    "content": result.content,
                    "metadata": result.metadata,
                    "rrf_score": 0,
                    "semantic_rank": None,
                    "lexical_rank": None,
                }
            doc_scores[doc_id]["rrf_score"] += rrf_score * self.semantic_weight
            doc_scores[doc_id]["semantic_rank"] = rank

        # Process lexical results
        for rank, result in enumerate(lexical_results, 1):
            doc_id = result.doc_id
            rrf_score = 1 / (k + rank)
            if doc_id not in doc_scores:
                doc_scores[doc_id] = {
                    "content": result.content,
                    "metadata": result.metadata,
                    "rrf_score": 0,
                    "semantic_rank": None,
                    "lexical_rank": None,
                }
            doc_scores[doc_id]["rrf_score"] += rrf_score * (1 - self.semantic_weight)
            doc_scores[doc_id]["lexical_rank"] = rank

        # Sort by RRF score and create results
        sorted_docs = sorted(
            doc_scores.items(),
            key=lambda x: x[1]["rrf_score"],
            reverse=True,
        )

        return [
            HybridSearchResult(
                doc_id=doc_id,
                content=data["content"],
                score=data["rrf_score"],
                semantic_rank=data["semantic_rank"],
                lexical_rank=data["lexical_rank"],
                rerank_score=None,
                metadata=data["metadata"],
            )
            for doc_id, data in sorted_docs
        ]

    async def _expand_query(self, query: str) -> str:
        """Expand query with medical synonyms and abbreviations."""
        # Placeholder - implement with synonym_database.py
        return query
```

#### 3.4.2 BM25 Lexical Search

**File:** `services/api-gateway/app/services/search/bm25_index.py`

```python
"""
BM25 Lexical Search using Meilisearch

Meilisearch provides:
- Fast BM25-based full-text search
- Typo tolerance
- Faceted filtering
- Easy deployment (single binary)
"""

from typing import List, Dict, Optional, Any
from dataclasses import dataclass
import httpx
import logging

from ..core.config import settings

logger = logging.getLogger(__name__)


@dataclass
class LexicalSearchResult:
    doc_id: str
    content: str
    score: float
    metadata: Dict[str, Any]


class BM25Index:
    """
    BM25 lexical search via Meilisearch.

    Index structure:
    - id: Document ID
    - content: Searchable text
    - title: Document title
    - source_type: "guideline" | "textbook" | "research"
    - specialty: Medical specialty
    - created_at: Timestamp
    """

    def __init__(
        self,
        host: Optional[str] = None,
        api_key: Optional[str] = None,
        index_name: str = "kb_documents",
    ):
        self.host = host or settings.MEILISEARCH_HOST
        self.api_key = api_key or settings.MEILISEARCH_API_KEY
        self.index_name = index_name
        self.client = httpx.AsyncClient(
            base_url=self.host,
            headers={"Authorization": f"Bearer {self.api_key}"},
            timeout=30.0,
        )

    async def search(
        self,
        query: str,
        top_k: int = 50,
        filters: Optional[Dict[str, Any]] = None,
    ) -> List[LexicalSearchResult]:
        """
        Execute BM25 search.

        Args:
            query: Search query
            top_k: Number of results
            filters: Metadata filters

        Returns:
            List of lexical search results
        """
        # Build Meilisearch filter string
        filter_str = self._build_filter(filters) if filters else None

        payload = {
            "q": query,
            "limit": top_k,
            "attributesToRetrieve": ["id", "content", "title", "metadata"],
            "showRankingScore": True,
        }
        if filter_str:
            payload["filter"] = filter_str

        try:
            response = await self.client.post(
                f"/indexes/{self.index_name}/search",
                json=payload,
            )
            response.raise_for_status()
            data = response.json()

            return [
                LexicalSearchResult(
                    doc_id=hit["id"],
                    content=hit["content"],
                    score=hit.get("_rankingScore", 0),
                    metadata=hit.get("metadata", {}),
                )
                for hit in data.get("hits", [])
            ]
        except Exception as e:
            logger.error(f"Meilisearch search failed: {e}")
            return []

    async def index_document(
        self,
        doc_id: str,
        content: str,
        title: str,
        metadata: Dict[str, Any],
    ) -> bool:
        """Index a document for lexical search."""
        try:
            await self.client.post(
                f"/indexes/{self.index_name}/documents",
                json=[{
                    "id": doc_id,
                    "content": content,
                    "title": title,
                    **metadata,
                }],
            )
            return True
        except Exception as e:
            logger.error(f"Failed to index document {doc_id}: {e}")
            return False

    async def delete_document(self, doc_id: str) -> bool:
        """Delete a document from the index."""
        try:
            await self.client.delete(
                f"/indexes/{self.index_name}/documents/{doc_id}"
            )
            return True
        except Exception as e:
            logger.error(f"Failed to delete document {doc_id}: {e}")
            return False

    def _build_filter(self, filters: Dict[str, Any]) -> str:
        """Build Meilisearch filter string from dict."""
        conditions = []
        for key, value in filters.items():
            if isinstance(value, list):
                # OR condition for list values
                or_conditions = " OR ".join(f'{key} = "{v}"' for v in value)
                conditions.append(f"({or_conditions})")
            else:
                conditions.append(f'{key} = "{value}"')
        return " AND ".join(conditions)

    async def close(self):
        await self.client.aclose()
```

#### 3.4.3 Cross-Encoder Re-ranker

**File:** `services/api-gateway/app/services/search/cross_encoder.py`

```python
"""
Cross-Encoder Re-ranking Service

Uses sentence-transformers cross-encoder models for high-quality
passage re-ranking. Cross-encoders process query-passage pairs
together, enabling better relevance scoring than bi-encoders.

Model choices:
- cross-encoder/ms-marco-MiniLM-L-6-v2 (fast, good quality)
- cross-encoder/ms-marco-MiniLM-L-12-v2 (slower, better quality)
- BAAI/bge-reranker-base (good for general domain)
"""

from typing import List, Optional, Tuple
import logging

import torch
from sentence_transformers import CrossEncoder

logger = logging.getLogger(__name__)


class CrossEncoderReranker:
    """
    Re-ranks search results using a cross-encoder model.
    Architecture:
    - Query and each passage are concatenated and encoded together
    - Model outputs a relevance score for each pair
    - Results are sorted by relevance score
    """

    def __init__(
        self,
        model_name: str = "cross-encoder/ms-marco-MiniLM-L-6-v2",
        device: Optional[str] = None,
        max_length: int = 512,
    ):
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
        self.model = CrossEncoder(model_name, device=self.device, max_length=max_length)
        logger.info(f"Loaded cross-encoder model {model_name} on {self.device}")

    async def rerank(
        self,
        query: str,
        passages: List[str],
        batch_size: int = 16,
    ) -> List[float]:
        """
        Re-rank passages for a query.

        Args:
            query: Search query
            passages: List of passage texts
            batch_size: Batch size for inference

        Returns:
            List of relevance scores (same order as passages)
        """
        if not passages:
            return []

        # Create query-passage pairs
        pairs = [[query, passage] for passage in passages]

        try:
            # Get scores (returns numpy array)
            scores = self.model.predict(
                pairs,
                batch_size=batch_size,
                show_progress_bar=False,
            )
            # Convert to Python floats
            return [float(s) for s in scores]
        except Exception as e:
            logger.error(f"Cross-encoder re-ranking failed: {e}")
            # Return neutral scores on failure
            return [0.5] * len(passages)

    async def rerank_with_indices(
        self,
        query: str,
        passages: List[str],
        top_k: int = 10,
    ) -> List[Tuple[int, float]]:
        """
        Re-rank and return top-k indices with scores.

        Returns:
            List of (original_index, score) tuples, sorted by score
        """
        scores = await self.rerank(query, passages)

        # Pair indices with scores and sort
        indexed_scores = list(enumerate(scores))
        indexed_scores.sort(key=lambda x: x[1], reverse=True)

        return indexed_scores[:top_k]
```

#### 3.4.4 Medical Synonym Expansion

**File:** `services/api-gateway/app/services/medical/synonym_database.py`

```python
"""
Medical Synonym Database

Provides medical term expansion using:
1. Static synonym dictionary (common terms)
2. Abbreviation expansion
3. Optional UMLS API integration

This improves search recall by matching different representations
of the same medical concept.
"""

from typing import List, Set, Dict, Optional
import logging
import re

logger = logging.getLogger(__name__)


class MedicalSynonymDatabase:
    """
    Medical synonym and abbreviation expansion.
    """

    def __init__(self, umls_api_key: Optional[str] = None):
        self.umls_api_key = umls_api_key

        # Static synonym dictionary (extensible)
        self.synonyms: Dict[str, Set[str]] = {
            # Cardiovascular
            "heart attack": {"myocardial infarction", "MI", "STEMI", "NSTEMI"},
            "myocardial infarction": {"heart attack", "MI", "STEMI", "NSTEMI"},
            "high blood pressure": {"hypertension", "HTN", "elevated BP"},
            "hypertension": {"high blood pressure", "HTN", "elevated BP"},
            "afib": {"atrial fibrillation", "AF", "a-fib"},
            "atrial fibrillation": {"afib", "AF", "a-fib"},
            # Medications
            "aspirin": {"ASA", "acetylsalicylic acid", "Bayer"},
            "asa": {"aspirin", "acetylsalicylic acid"},
            "metformin": {"glucophage", "metformin hydrochloride"},
            "lisinopril": {"zestril", "prinivil", "ACE inhibitor"},
            # Conditions
            "diabetes": {"diabetes mellitus", "DM", "type 2 diabetes", "T2DM"},
            "ckd": {"chronic kidney disease", "renal insufficiency"},
            "copd": {"chronic obstructive pulmonary disease", "emphysema"},
            "dvt": {"deep vein thrombosis", "deep venous thrombosis"},
            "pe": {"pulmonary embolism", "pulmonary embolus"},
            # Symptoms
            "shortness of breath": {"dyspnea", "SOB", "breathlessness"},
            "chest pain": {"angina", "chest discomfort"},
            # Labs
            "cbc": {"complete blood count", "blood count"},
            "bmp": {"basic metabolic panel", "chem 7"},
            "cmp": {"comprehensive metabolic panel", "chem 14"},
            "hba1c": {"hemoglobin a1c", "glycated hemoglobin", "a1c"},
        }

        # Common medical abbreviations
        self.abbreviations: Dict[str, str] = {
            "MI": "myocardial infarction",
            "HTN": "hypertension",
            "DM": "diabetes mellitus",
            "CHF": "congestive heart failure",
            "CABG": "coronary artery bypass graft",
            "PCI": "percutaneous coronary intervention",
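            # A few additional common abbreviations (illustrative additions,
            # not in the original list - verify against your clinical review):
            "CAD": "coronary artery disease",
            "GERD": "gastroesophageal reflux disease",
            "ESRD": "end-stage renal disease",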
"CVA": "cerebrovascular accident", "TIA": "transient ischemic attack", "DVT": "deep vein thrombosis", "PE": "pulmonary embolism", "COPD": "chronic obstructive pulmonary disease", "CKD": "chronic kidney disease", "UTI": "urinary tract infection", "BID": "twice daily", "TID": "three times daily", "QID": "four times daily", "PRN": "as needed", "PO": "by mouth", "IV": "intravenous", "IM": "intramuscular", "SC": "subcutaneous", "ASA": "aspirin", "NSAID": "nonsteroidal anti-inflammatory drug", "ACE": "angiotensin converting enzyme", "ARB": "angiotensin receptor blocker", "CBC": "complete blood count", "BMP": "basic metabolic panel", "CMP": "comprehensive metabolic panel", "LFT": "liver function test", "TSH": "thyroid stimulating hormone", "ECG": "electrocardiogram", "EKG": "electrocardiogram", } def expand_query(self, query: str) -> str: """ Expand query with medical synonyms. Returns expanded query with OR-joined synonyms. Example: "ASA contraindications" -> "(aspirin OR ASA OR acetylsalicylic acid) contraindications" """ words = query.lower().split() expanded_parts = [] i = 0 while i < len(words): # Try multi-word matches (up to 3 words) matched = False for n in range(min(3, len(words) - i), 0, -1): phrase = " ".join(words[i:i+n]) if phrase in self.synonyms: synonyms = self.synonyms[phrase] all_terms = {phrase} | synonyms expanded_parts.append(f"({' OR '.join(all_terms)})") i += n matched = True break # Check abbreviations (single word) if n == 1 and phrase.upper() in self.abbreviations: expanded = self.abbreviations[phrase.upper()] expanded_parts.append(f"({phrase} OR {expanded})") i += 1 matched = True break if not matched: expanded_parts.append(words[i]) i += 1 return " ".join(expanded_parts) def get_synonyms(self, term: str) -> Set[str]: """Get synonyms for a specific term.""" term_lower = term.lower() # Check direct synonyms if term_lower in self.synonyms: return self.synonyms[term_lower] # Check abbreviations if term.upper() in self.abbreviations: expanded = 
self.abbreviations[term.upper()] return {expanded} return set() def add_synonym_group(self, terms: List[str]) -> None: """Add a group of synonymous terms.""" term_set = set(t.lower() for t in terms) for term in term_set: existing = self.synonyms.get(term, set()) self.synonyms[term] = existing | (term_set - {term}) ``` ### 3.5 Implementation Tasks | Task | Priority | Effort | Dependencies | | ------------------------------------ | -------- | ------- | --------------------- | | Set up Meilisearch server | HIGH | 4h | Infrastructure | | Create BM25Index service | HIGH | 8h | Meilisearch | | Create HybridSearchService | HIGH | 12h | BM25Index | | Implement RRF fusion algorithm | HIGH | 4h | HybridSearch | | Install sentence-transformers | MEDIUM | 2h | Python env | | Create CrossEncoderReranker | HIGH | 8h | sentence-transformers | | Create MedicalSynonymDatabase | MEDIUM | 6h | None | | Integrate query expansion | MEDIUM | 4h | SynonymDB | | Add contextual chunk metadata | MEDIUM | 6h | DB schema | | Create metadata filtering API | MEDIUM | 4h | HybridSearch | | Index existing KB in Meilisearch | HIGH | 4h | Meilisearch | | Benchmark search quality (MRR, NDCG) | HIGH | 8h | All components | | Write unit tests | HIGH | 12h | All components | | Write integration tests | HIGH | 8h | All components | | Performance tuning (latency < 200ms) | HIGH | 8h | All components | | **Total** | | **98h** | | ### 3.6 Deliverables 1. `services/api-gateway/app/services/search/hybrid_search.py` - Main hybrid search 2. `services/api-gateway/app/services/search/bm25_index.py` - Meilisearch integration 3. `services/api-gateway/app/services/search/cross_encoder.py` - Re-ranking service 4. `services/api-gateway/app/services/medical/synonym_database.py` - Medical synonyms 5. Meilisearch deployment configuration (Docker Compose) 6. KB indexing scripts for Meilisearch 7. Updated RAG service using hybrid search 8. Search quality benchmarks (MRR@10, NDCG@10) 9. 
   API documentation for new search endpoints

---

## 4. Continuous Learning System

### 4.1 Overview

**Objective:** Create infrastructure for collecting user feedback, improving model performance, and enabling data-driven KB curation.

**Current State:** Sentry for error tracking. No feedback collection or A/B testing.

**Target State:** Comprehensive feedback system with thumbs up/down, KB curation dashboard, A/B testing framework, and analytics.

### 4.2 Technical Architecture

```
services/api-gateway/app/
├── services/
│   ├── feedback/
│   │   ├── feedback_service.py      🔲 NEW - Feedback collection
│   │   ├── feedback_analyzer.py     🔲 NEW - Sentiment analysis
│   │   └── feedback_export.py       🔲 NEW - Export for fine-tuning
│   ├── ab_testing/
│   │   ├── experiment_manager.py    🔲 NEW - A/B test management
│   │   ├── variant_selector.py      🔲 NEW - User variant assignment
│   │   └── metrics_collector.py     🔲 NEW - Experiment metrics
│   └── analytics/
│       ├── search_analytics.py      🔲 NEW - Search quality metrics
│       ├── usage_analytics.py       🔲 NEW - Usage patterns
│       └── dashboard_service.py     🔲 NEW - Analytics API

apps/admin-panel/src/
├── pages/
│   ├── KBCurationDashboard.tsx      🔲 NEW - KB curation UI
│   ├── FeedbackReview.tsx           🔲 NEW - Feedback review UI
│   └── ABTestingDashboard.tsx       🔲 NEW - A/B test management

Database:
├── feedback table                   🔲 NEW
├── experiments table                🔲 NEW
├── experiment_assignments table     🔲 NEW
├── search_metrics table             🔲 NEW
└── usage_events table               🔲 NEW
```

### 4.3 Database Schema

```sql
-- Feedback collection
-- (Types must be created before the tables that reference them.)
CREATE TYPE feedback_type AS ENUM ('positive', 'negative', 'neutral');

CREATE TABLE feedback (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id),
    message_id UUID REFERENCES messages(id),
    conversation_id UUID REFERENCES conversations(id),

    -- Feedback data
    rating feedback_type NOT NULL,   -- 'positive', 'negative', 'neutral'
    category VARCHAR(50),            -- 'accuracy', 'relevance', 'clarity', 'other'
    comment TEXT,

    -- Context
    query TEXT,
    response_snippet TEXT,
    search_results JSONB,            -- What was retrieved
    model_used VARCHAR(100),

    -- Metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    processed BOOLEAN DEFAULT FALSE,
    processed_at TIMESTAMPTZ
);

CREATE INDEX idx_feedback_rating ON feedback(rating);
CREATE INDEX idx_feedback_unprocessed ON feedback(processed) WHERE processed = FALSE;

-- A/B Testing experiments
CREATE TYPE experiment_status AS ENUM ('draft', 'running', 'paused', 'completed', 'archived');

CREATE TABLE experiments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL UNIQUE,
    description TEXT,

    -- Variants
    variants JSONB NOT NULL,         -- [{"id": "control", "weight": 50}, {"id": "treatment", "weight": 50}]

    -- Configuration
    target_metric VARCHAR(100),      -- 'search_mrr', 'feedback_positive_rate'
    min_sample_size INTEGER DEFAULT 1000,

    -- Status
    status experiment_status NOT NULL DEFAULT 'draft',
    started_at TIMESTAMPTZ,
    ended_at TIMESTAMPTZ,

    -- Results
    results JSONB,
    winner_variant VARCHAR(100),

    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- User variant assignments
CREATE TABLE experiment_assignments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    experiment_id UUID REFERENCES experiments(id) ON DELETE CASCADE,
    user_id UUID NOT NULL,           -- Can be anonymous user ID
    variant_id VARCHAR(100) NOT NULL,
    assigned_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE(experiment_id, user_id)
);

-- Search quality metrics
CREATE TABLE search_metrics (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    query TEXT NOT NULL,
    query_hash VARCHAR(64) NOT NULL, -- For aggregation

    -- Retrieval metrics
    results_count INTEGER,
    top_result_score FLOAT,
    mrr FLOAT,                       -- Reciprocal rank of first relevant result (MRR when averaged)
    ndcg FLOAT,                      -- Normalized Discounted Cumulative Gain

    -- User interaction
    clicked_result_position INTEGER,
    time_to_click_ms INTEGER,

    -- Context
    user_id UUID,
    experiment_id UUID REFERENCES experiments(id),
    variant_id VARCHAR(100),

    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_search_metrics_query_hash ON
search_metrics(query_hash);
CREATE INDEX idx_search_metrics_experiment ON search_metrics(experiment_id, variant_id);
```

### 4.4 Component Specifications

#### 4.4.1 Feedback Service

**File:** `services/api-gateway/app/services/feedback/feedback_service.py`

```python
"""
Feedback Collection Service

Collects user feedback on AI responses for:
1. Quality monitoring
2. Fine-tuning data preparation
3. KB content curation
"""

from typing import Optional, List, Dict, Any
from uuid import UUID
from datetime import datetime
import logging

from sqlalchemy.orm import Session
from sqlalchemy import func

from ...models.feedback import Feedback, FeedbackType
from ...core.database import get_db

logger = logging.getLogger(__name__)


class FeedbackService:
    """
    Manages user feedback collection and analysis.
    """

    async def submit_feedback(
        self,
        db: Session,
        user_id: Optional[UUID],
        message_id: UUID,
        conversation_id: UUID,
        rating: FeedbackType,
        category: Optional[str] = None,
        comment: Optional[str] = None,
        context: Optional[Dict[str, Any]] = None,
    ) -> Feedback:
        """
        Submit user feedback for a message.

        Args:
            user_id: User who submitted feedback (optional for anonymous)
            message_id: Message being rated
            conversation_id: Parent conversation
            rating: positive, negative, or neutral
            category: Feedback category (accuracy, relevance, clarity, other)
            comment: Optional text comment
            context: Additional context (query, search results, etc.)
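
                Example context (illustrative values):
                    {"query": "ASA dosing in CKD", "model_used": "gpt-4o",
                     "search_results": [...], "response_snippet": "..."}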

        Returns:
            Created Feedback object
        """
        feedback = Feedback(
            user_id=user_id,
            message_id=message_id,
            conversation_id=conversation_id,
            rating=rating,
            category=category,
            comment=comment,
            query=context.get("query") if context else None,
            response_snippet=context.get("response_snippet") if context else None,
            search_results=context.get("search_results") if context else None,
            model_used=context.get("model_used") if context else None,
        )
        db.add(feedback)
        db.commit()
        db.refresh(feedback)

        logger.info(
            f"Feedback submitted: {rating.value} for message {message_id}"
        )
        return feedback

    async def get_feedback_stats(
        self,
        db: Session,
        start_date: Optional[datetime] = None,
        end_date: Optional[datetime] = None,
    ) -> Dict[str, Any]:
        """Get aggregated feedback statistics."""
        query = db.query(Feedback)
        if start_date:
            query = query.filter(Feedback.created_at >= start_date)
        if end_date:
            query = query.filter(Feedback.created_at <= end_date)

        total = query.count()

        # Count by rating
        rating_counts = (
            query
            .with_entities(Feedback.rating, func.count(Feedback.id))
            .group_by(Feedback.rating)
            .all()
        )

        # Count by category
        category_counts = (
            query
            .filter(Feedback.category.isnot(None))
            .with_entities(Feedback.category, func.count(Feedback.id))
            .group_by(Feedback.category)
            .all()
        )

        return {
            "total": total,
            "by_rating": {r.value: c for r, c in rating_counts},
            "by_category": dict(category_counts),
            "positive_rate": (
                next((c for r, c in rating_counts if r == FeedbackType.POSITIVE), 0) / total
                if total > 0
                else 0
            ),
        }

    async def get_negative_feedback(
        self,
        db: Session,
        limit: int = 100,
        unprocessed_only: bool = True,
    ) -> List[Feedback]:
        """
        Get negative feedback for review.

        Used by KB curation dashboard to identify content issues.
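
        Example (illustrative):
            items = await feedback_service.get_negative_feedback(db, limit=25)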
""" query = ( db.query(Feedback) .filter(Feedback.rating == FeedbackType.NEGATIVE) .order_by(Feedback.created_at.desc()) ) if unprocessed_only: query = query.filter(Feedback.processed == False) return query.limit(limit).all() async def mark_processed( self, db: Session, feedback_ids: List[UUID], ) -> int: """Mark feedback as processed after review.""" updated = ( db.query(Feedback) .filter(Feedback.id.in_(feedback_ids)) .update( {"processed": True, "processed_at": datetime.utcnow()}, synchronize_session=False, ) ) db.commit() return updated async def export_for_fine_tuning( self, db: Session, min_rating: FeedbackType = FeedbackType.POSITIVE, limit: int = 10000, ) -> List[Dict[str, Any]]: """ Export feedback data formatted for fine-tuning. Returns data in OpenAI fine-tuning format: {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]} """ feedbacks = ( db.query(Feedback) .filter(Feedback.rating == min_rating) .filter(Feedback.query.isnot(None)) .filter(Feedback.response_snippet.isnot(None)) .limit(limit) .all() ) return [ { "messages": [ {"role": "user", "content": f.query}, {"role": "assistant", "content": f.response_snippet}, ] } for f in feedbacks ] ``` #### 4.4.2 A/B Testing Manager **File:** `services/api-gateway/app/services/ab_testing/experiment_manager.py` ```python """ A/B Testing Experiment Manager Enables controlled experiments for: - Search algorithm variants - Model versions - Prompt variations - UI changes """ from typing import Optional, List, Dict, Any from uuid import UUID from datetime import datetime import hashlib import logging from sqlalchemy.orm import Session from ...models.experiment import Experiment, ExperimentAssignment, ExperimentStatus from ...core.database import get_db logger = logging.getLogger(__name__) class ExperimentManager: """ Manages A/B testing experiments. 
Features: - Consistent user-to-variant assignment (sticky) - Weighted variant distribution - Statistical significance tracking """ async def create_experiment( self, db: Session, name: str, description: str, variants: List[Dict[str, Any]], target_metric: str, min_sample_size: int = 1000, ) -> Experiment: """ Create a new experiment. Args: name: Unique experiment name description: Experiment description variants: List of variants with weights [{"id": "control", "weight": 50, "config": {...}}] target_metric: Primary metric to track min_sample_size: Minimum samples before significance Returns: Created Experiment object """ experiment = Experiment( name=name, description=description, variants=variants, target_metric=target_metric, min_sample_size=min_sample_size, status=ExperimentStatus.DRAFT, ) db.add(experiment) db.commit() db.refresh(experiment) logger.info(f"Created experiment: {name}") return experiment async def start_experiment( self, db: Session, experiment_id: UUID, ) -> Experiment: """Start an experiment.""" experiment = db.query(Experiment).get(experiment_id) if not experiment: raise ValueError(f"Experiment {experiment_id} not found") experiment.status = ExperimentStatus.RUNNING experiment.started_at = datetime.utcnow() db.commit() logger.info(f"Started experiment: {experiment.name}") return experiment async def get_variant_for_user( self, db: Session, experiment_id: UUID, user_id: str, ) -> Optional[Dict[str, Any]]: """ Get or assign variant for a user. Uses consistent hashing for sticky assignment. 
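
        Example (illustrative names):
            variant = await manager.get_variant_for_user(db, exp_id, user_id)
            use_hybrid = variant and variant.get("config", {}).get("hybrid_search")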

        Args:
            experiment_id: Experiment ID
            user_id: User identifier (can be anonymous)

        Returns:
            Variant configuration or None if experiment not running
        """
        experiment = db.query(Experiment).get(experiment_id)
        if not experiment or experiment.status != ExperimentStatus.RUNNING:
            return None

        # Check existing assignment
        assignment = (
            db.query(ExperimentAssignment)
            .filter(
                ExperimentAssignment.experiment_id == experiment_id,
                ExperimentAssignment.user_id == user_id,
            )
            .first()
        )
        if assignment:
            # Return existing variant
            return self._get_variant_config(experiment, assignment.variant_id)

        # Assign new variant using consistent hashing
        variant_id = self._select_variant(experiment, user_id)
        new_assignment = ExperimentAssignment(
            experiment_id=experiment_id,
            user_id=user_id,
            variant_id=variant_id,
        )
        db.add(new_assignment)
        db.commit()

        return self._get_variant_config(experiment, variant_id)

    def _select_variant(self, experiment: Experiment, user_id: str) -> str:
        """
        Select variant using consistent hashing.

        Ensures same user always gets same variant.
        """
        # Hash user_id + experiment_id for consistent assignment
        hash_input = f"{experiment.id}:{user_id}"
        hash_value = int(hashlib.sha256(hash_input.encode()).hexdigest(), 16)

        # Calculate bucket (0-99)
        bucket = hash_value % 100

        # Assign based on cumulative weights
        cumulative = 0
        for variant in experiment.variants:
            cumulative += variant["weight"]
            if bucket < cumulative:
                return variant["id"]

        # Fallback to last variant
        return experiment.variants[-1]["id"]

    def _get_variant_config(
        self, experiment: Experiment, variant_id: str
    ) -> Dict[str, Any]:
        """Get variant configuration by ID."""
        for variant in experiment.variants:
            if variant["id"] == variant_id:
                return variant
        return {"id": variant_id}

    async def record_metric(
        self,
        db: Session,
        experiment_id: UUID,
        user_id: str,
        metric_name: str,
        metric_value: float,
    ) -> None:
        """Record a metric for an experiment."""
        # Get user's variant
        assignment = (
            db.query(ExperimentAssignment)
            .filter(
                ExperimentAssignment.experiment_id == experiment_id,
                ExperimentAssignment.user_id == user_id,
            )
            .first()
        )
        if not assignment:
            logger.warning(
                f"No assignment found for user {user_id} in experiment {experiment_id}"
            )
            return

        # Record metric (implementation depends on metrics storage)
        logger.debug(
            f"Recorded metric {metric_name}={metric_value} "
            f"for variant {assignment.variant_id}"
        )

    async def get_experiment_results(
        self,
        db: Session,
        experiment_id: UUID,
    ) -> Dict[str, Any]:
        """
        Get experiment results with statistical analysis.

        Returns:
            {
                "variants": [
                    {"id": "control", "sample_size": 500, "metric_mean": 0.65, ...},
                    {"id": "treatment", "sample_size": 520, "metric_mean": 0.72, ...},
                ],
                "p_value": 0.023,
                "significant": True,
                "winner": "treatment",
            }
        """
        # Implementation would include statistical significance calculation
        # using scipy.stats for t-test or chi-squared test
        pass
```

#### 4.4.3 KB Curation Dashboard (Frontend)

**File:** `apps/admin-panel/src/pages/KBCurationDashboard.tsx`

```tsx
/**
 * KB Curation Dashboard
 *
 * Allows admins to:
 * - Review negative feedback
 * - Identify problematic content
 * - Update/remove KB entries
 * - Track content quality metrics
 */

import React, { useState } from "react";
import { useQuery, useMutation } from "@tanstack/react-query";
import { Card, Table, Badge, Button, Tabs } from "@voiceassist/ui";

interface FeedbackItem {
  id: string;
  rating: "positive" | "negative" | "neutral";
  category: string;
  comment: string;
  query: string;
  responseSnippet: string;
  searchResults: Array<{ docId: string; content: string; score: number }>;
  createdAt: string;
  processed: boolean;
}

export function KBCurationDashboard() {
  const [activeTab, setActiveTab] = useState<"feedback" | "metrics" | "content">("feedback");

  const { data: feedback, isLoading } = useQuery({
    queryKey: ["feedback", "negative"],
    queryFn: () =>
      fetch("/api/admin/feedback?rating=negative&unprocessed=true").then((r) => r.json()),
  });

  const { data: stats } = useQuery({
    queryKey: ["feedback", "stats"],
    queryFn: () => fetch("/api/admin/feedback/stats").then((r) => r.json()),
  });

  const markProcessed = useMutation({
    mutationFn: (ids: string[]) =>
      fetch("/api/admin/feedback/mark-processed", {
        method: "POST",
        body: JSON.stringify({ ids }),
        headers: { "Content-Type": "application/json" },
      }),
  });

  return (
    <div className="kb-curation-dashboard">
      {/* Markup reconstructed; @voiceassist/ui prop names are illustrative */}
      <h1>KB Curation Dashboard</h1>

      {/* Stats Overview */}
      <div className="stats-grid">
        <Card title="Total Feedback">{stats?.total || 0}</Card>
        <Card title="Positive Rate">{((stats?.positive_rate || 0) * 100).toFixed(1)}%</Card>
        <Card title="Unprocessed">{stats?.unprocessed || 0}</Card>
        <Card title="This Week">{stats?.this_week || 0}</Card>
      </div>

      {/* Tabs */}
      <Tabs activeTab={activeTab} onChange={setActiveTab}>
        <Tabs.Tab id="feedback">Negative Feedback</Tabs.Tab>
        <Tabs.Tab id="metrics">Search Metrics</Tabs.Tab>
        <Tabs.Tab id="content">Content Issues</Tabs.Tab>
      </Tabs>

      {activeTab === "feedback" && (
        <Table>
          <thead>
            <tr>
              <th>Query</th>
              <th>Category</th>
              <th>Comment</th>
              <th>Date</th>
              <th>Actions</th>
            </tr>
          </thead>
          <tbody>
            {feedback?.items?.map((item: FeedbackItem) => (
              <tr key={item.id}>
                <td>{item.query}</td>
                <td>
                  <Badge>{item.category}</Badge>
                </td>
                <td>{item.comment || "-"}</td>
                <td>{new Date(item.createdAt).toLocaleDateString()}</td>
                <td>
                  <Button onClick={() => markProcessed.mutate([item.id])}>
                    Mark Processed
                  </Button>
                </td>
              </tr>
            ))}
          </tbody>
        </Table>
      )}

      {activeTab === "metrics" && (
        <Card>
          <h2>Search Quality Metrics</h2>
          {/* Charts for MRR, NDCG, click-through rates */}
          <p>Charts coming soon - integrate with your preferred charting library</p>
        </Card>
      )}

      {activeTab === "content" && (
        <Card>
          <h2>Detected Content Issues</h2>
          <p>AI-detected patterns in negative feedback pointing to specific KB content</p>
        </Card>
      )}
    </div>
  );
}
```

### 4.5 Implementation Tasks

| Task                               | Priority | Effort  | Dependencies      |
| ---------------------------------- | -------- | ------- | ----------------- |
| Create feedback database schema    | HIGH     | 2h      | None              |
| Create FeedbackService             | HIGH     | 6h      | Schema            |
| Create feedback API endpoints      | HIGH     | 4h      | Service           |
| Build feedback UI component        | HIGH     | 6h      | API               |
| Create experiment database schema  | MEDIUM   | 2h      | None              |
| Create ExperimentManager           | MEDIUM   | 8h      | Schema            |
| Create variant selection logic     | MEDIUM   | 4h      | ExperimentManager |
| Build A/B testing dashboard        | MEDIUM   | 8h      | ExperimentManager |
| Create search metrics collection   | HIGH     | 6h      | Search service    |
| Build KBCurationDashboard          | HIGH     | 12h     | Feedback API      |
| Create fine-tuning export endpoint | LOW      | 4h      | FeedbackService   |
| Integrate feedback into chat UI    | HIGH     | 4h      | Feedback UI       |
| Write unit tests                   | HIGH     | 8h      | All services      |
| Write integration tests            | MEDIUM   | 6h      | All services      |
| **Total**                          |          | **80h** |                   |

### 4.6 Deliverables

1. Database migrations for feedback, experiments, metrics tables
2. `services/api-gateway/app/services/feedback/*` - Feedback service
3. `services/api-gateway/app/services/ab_testing/*` - A/B testing framework
4. `services/api-gateway/app/services/analytics/*` - Search/usage analytics
5. `apps/admin-panel/src/pages/KBCurationDashboard.tsx` - Curation UI
6. `apps/admin-panel/src/pages/ABTestingDashboard.tsx` - A/B test management
7. `apps/web-app/src/components/FeedbackButton.tsx` - In-chat feedback
8. API documentation for feedback and experiments
9.
---

## Implementation Phases

### Phase 1: Foundation (Weeks 1-3)

**Focus:** Design system and security foundations

| Week | Tasks                                                    |
| ---- | -------------------------------------------------------- |
| 1    | Animation/shadow tokens, encryption storage setup        |
| 2    | Medical UI components, PHI detector, audit trail         |
| 3    | Storybook docs, PHI warnings integration, security tests |

**Deliverables:**

- Complete design token system
- Client-side PHI detection with warnings
- Encrypted IndexedDB storage
- Session audit trail

### Phase 2: Advanced Search (Weeks 4-7)

**Focus:** Hybrid search and re-ranking

| Week | Tasks                                         |
| ---- | --------------------------------------------- |
| 4    | Meilisearch setup, BM25 index service         |
| 5    | Hybrid search service, RRF fusion             |
| 6    | Cross-encoder re-ranker, medical synonyms     |
| 7    | Integration, benchmarking, performance tuning |

**Deliverables:**

- Hybrid search (semantic + BM25)
- Cross-encoder re-ranking
- Medical synonym expansion
- Search quality benchmarks

### Phase 3: Continuous Learning (Weeks 8-11)

**Focus:** Feedback and analytics

| Week | Tasks                                     |
| ---- | ----------------------------------------- |
| 8    | Feedback schema, service, API             |
| 9    | Feedback UI, chat integration             |
| 10   | A/B testing framework, experiment manager |
| 11   | KB curation dashboard, analytics          |

**Deliverables:**

- Feedback collection system
- A/B testing framework
- KB curation dashboard
- Search analytics

### Phase 4: Polish & Documentation (Weeks 12-14)

**Focus:** Testing, optimization, documentation

| Week | Tasks                                  |
| ---- | -------------------------------------- |
| 12   | End-to-end testing, bug fixes          |
| 13   | Performance optimization, load testing |
| 14   | Documentation, deployment guides       |

**Deliverables:**

- Comprehensive test coverage (>80%)
- Performance targets met (<200ms search)
- Complete documentation

---

## Technical Architecture

### System Integration Diagram

```
┌──────────────────────────────────────────────────────────┐
│ Frontend                                                 │
│   Design System (tokens)    PHI Detection (warnings)     │
│   Encrypted Storage (IndexedDB)   Feedback (thumbs)      │
└───────────────────────┬─────────────────────┬────────────┘
                        │ /api/audit/batch    │ /api/feedback
                        ▼                     ▼
┌──────────────────────────────────────────────────────────┐
│ API Gateway                                              │
│   Audit Service   A/B Testing Manager   Feedback Service │
│                                                          │
│   Hybrid Search Service                                  │
│     Semantic (Qdrant) ──┐                                │
│                         ├─► RRF Fusion ─► Cross-Encoder  │
│     Lexical (Meili) ────┘                 Re-ranker      │
└───────────────────────────┬──────────────────────────────┘
                            ▼
┌──────────────────────────────────────────────────────────┐
│ Data Layer                                               │
│   PostgreSQL (feedback, audit, etc)   Qdrant (vectors)   │
│   Meilisearch (BM25)                                     │
└───────────────────────────┬──────────────────────────────┘
                            ▼
┌──────────────────────────────────────────────────────────┐
│ Admin Panel                                              │
│   KB Curation Dashboard   A/B Testing Dashboard          │
│   Analytics Dashboard                                    │
└──────────────────────────────────────────────────────────┘
```

---

## Risk Assessment

| Risk                            | Likelihood | Impact | Mitigation                               |
| ------------------------------- | ---------- | ------ | ---------------------------------------- |
| Meilisearch performance issues  | Medium     | High   | Load testing, fallback to vector-only    |
| Cross-encoder latency too high  | Medium     | Medium | GPU inference, model distillation        |
| PHI false positives annoy users | High       | Medium | Tunable sensitivity, user acknowledgment |
| A/B test statistical errors     | Low        | High   | Proper sample sizes, multiple metrics    |
| IndexedDB encryption key loss   | Low        | Medium | Key derivation from auth, recovery flow  |
| Search quality regression       | Medium     | High   | Continuous benchmarking, rollback plan   |

---

## Success Metrics

### Design System

- **Component coverage:** 100% of UI components use design tokens
- **Storybook docs:** All components documented with examples
- **Theme consistency:** Zero visual inconsistencies between light/dark

### Security

- **PHI detection rate:** >95% of PHI patterns caught
- **Audit coverage:** 100% of sensitive actions logged
- **Storage encryption:** All offline data encrypted

### Search Quality

- **MRR@10:** >0.65 (baseline: ~0.50 with vector-only)
- **NDCG@10:** >0.70 (baseline: ~0.55)
- **Latency P95:** <200ms (including re-ranking)

### Continuous Learning

- **Feedback collection rate:** >10% of conversations get feedback
- **A/B test velocity:** Ability to run 2+ experiments simultaneously
- **KB improvement cycle:** <1 week from feedback to content update

---

## Appendices

### A. Meilisearch Deployment

```yaml
# docker-compose.meilisearch.yml
version: "3.8"

services:
  meilisearch:
    image: getmeili/meilisearch:v1.6
    ports:
      - "7700:7700"
    volumes:
      - meilisearch_data:/meili_data
    environment:
      - MEILI_ENV=production
      - MEILI_MASTER_KEY=${MEILISEARCH_MASTER_KEY}
      - MEILI_NO_ANALYTICS=true
    restart: unless-stopped

volumes:
  meilisearch_data:
```

### B. Cross-Encoder Model Comparison

| Model                                 | Latency (20 passages) | Quality (MS MARCO) |
| ------------------------------------- | --------------------- | ------------------ |
| cross-encoder/ms-marco-MiniLM-L-6-v2  | ~50ms (CPU)           | 0.373 MRR          |
| cross-encoder/ms-marco-MiniLM-L-12-v2 | ~100ms (CPU)          | 0.388 MRR          |
| BAAI/bge-reranker-base                | ~80ms (CPU)           | 0.385 MRR          |
| BAAI/bge-reranker-large               | ~150ms (CPU)          | 0.392 MRR          |

**Recommendation:** Start with MiniLM-L-6-v2 for latency, upgrade if quality insufficient.

### C. Feedback Categories

| Category     | Description                          | Action                         |
| ------------ | ------------------------------------ | ------------------------------ |
| accuracy     | Factually incorrect information      | Review source KB, flag content |
| relevance    | Answer not relevant to question      | Improve search, prompt tuning  |
| clarity      | Answer unclear or confusing          | Prompt engineering             |
| completeness | Answer missing important information | Expand KB content              |
| other        | General feedback                     | Manual review                  |

---

_Last updated: November 26, 2025_
_Based on VoiceAssist main branch post-Phase 12_
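The RRF Fusion stage in the system integration diagram merges the Qdrant and Meilisearch rankings by summed reciprocal rank. A minimal sketch — the `fuse` helper and its input format are illustrative, using the conventional k = 60 constant:

```python
def fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked highly by multiple retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


semantic = ["doc-a", "doc-b", "doc-c"]  # e.g. from Qdrant
lexical = ["doc-b", "doc-a", "doc-d"]   # e.g. from Meilisearch
fused = fuse([semantic, lexical])
# doc-a and doc-b (ranked highly by both) land ahead of doc-c and doc-d
```

Because RRF uses only ranks, it needs no score normalization between the semantic and lexical backends, which is why it is a common default for hybrid search.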