Frontend Phase 3 Plan - Web App UX & Voice Enhancements
Date: 2025-11-25
Branch: feature/frontend-phase3D-voice-transcript-preview
Status: Phase 3A-D Complete
Scope: Web App (apps/web-app) frontend-focused improvements
Goals for Phase 3
Phase 3 focuses on Voice/Realtime UX polish, advanced chat controls, and evidence/context UX enhancements. The primary objective is to elevate the user experience from functional to polished and professional.
Success Criteria
- Voice mode feels seamless with clear status indicators and metrics visibility
- Message actions (edit, regenerate, branch) are discoverable and intuitive
- Citations and clinical context are easily accessible and useful
- All new features have comprehensive test coverage
Current Implementation Status (Phase 2 Complete)
Voice Mode (Existing)
- ✅ useRealtimeVoiceSession hook with OpenAI Realtime API integration
- ✅ VoiceModePanel with waveform visualization
- ✅ VoiceModeSettings for voice/language selection
- ✅ Voice metrics tracking (connection time, STT latency, response latency)
- ⚠️ Metrics are tracked but not prominently displayed in UI
- ⚠️ Mic permission handling could be more graceful
Chat Interface (Existing)
- ✅ MessageBubble with markdown, code blocks, citations
- ✅ MessageActionMenu with edit, regenerate, delete, copy, branch
- ✅ Message editing with save/cancel
- ✅ Copy-to-clipboard with toast feedback
- ⚠️ Branch UI exists but is sidebar-based, not inline
- ⚠️ Edit/regenerate UX could be more discoverable
Citations & Context (Existing)
- ✅ CitationDisplay with expandable details and copy
- ✅ CitationSidebar for browsing all citations
- ✅ ClinicalContextSidebar and ClinicalContextPanel
- ⚠️ Citation filtering could be enhanced
- ⚠️ Clinical context presets not implemented
Phase 3 Backlog (Prioritized)
P0 - Must Have (Critical Path)
1. Voice Metrics Dashboard in Voice Panel
Status: ✅ Implemented (feature/frontend-phase2-polish, PR #66)
Effort: 1-2 days
Files: VoiceModePanel.tsx, VoiceMetricsDisplay.tsx
Description: Display voice metrics prominently in the VoiceModePanel so users can see connection health and latency.
Features:
- Show connection time, STT latency, response latency in real-time
- Color-coded indicators (green/yellow/red) for latency thresholds
- Expandable/collapsible metrics panel
- Time to first transcript display
- User/AI message counts and reconnect tracking
- Accessibility: sr-only legend text, aria-expanded, aria-controls
Acceptance Criteria:
- Metrics visible during active voice session
- Latency thresholds: <500ms green, 500-1000ms yellow, >1000ms red
- Tests for VoiceMetricsDisplay component (25 tests)
- Integration tests for VoiceModePanel metrics wiring (8 tests)
- Accessible legend with screen reader support
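The latency thresholds in the acceptance criteria can be sketched as a small pure function. This is a minimal illustration, not the code in VoiceMetricsDisplay.tsx: the names `LatencyLevel`, `latencyLevel`, and `formatLatency` are hypothetical, and the handling of a not-yet-measured metric (`null`) is an assumption.

```typescript
// Hypothetical names for illustration; not taken from VoiceMetricsDisplay.tsx.
type LatencyLevel = "green" | "yellow" | "red";

// Thresholds from the acceptance criteria:
// <500ms green, 500-1000ms yellow, >1000ms red.
function latencyLevel(latencyMs: number): LatencyLevel {
  if (latencyMs < 500) return "green";
  if (latencyMs <= 1000) return "yellow";
  return "red";
}

// Assumption: a metric that has not been measured yet is null
// and renders as a placeholder instead of a colored value.
function formatLatency(latencyMs: number | null): string {
  if (latencyMs === null) return "-";
  return `${Math.round(latencyMs)}ms (${latencyLevel(latencyMs)})`;
}
```

Keeping the threshold logic separate from rendering makes the boundary values (500ms and 1000ms) straightforward to unit test.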
2. Mic Permission Error Handling UX
Status: ✅ Implemented (feature/frontend-phase2-polish, PR #66)
Effort: 0.5-1 day
Files: VoiceModePanel.tsx, useRealtimeVoiceSession.ts
Description: Improve the user experience when microphone permission is denied or unavailable.
Features:
- Clear error message when mic permission denied
- Link to browser settings instructions
- Retry button after granting permission
- Graceful fallback ("Use text-only mode" button)
- State hygiene (micPermissionDenied reset on disconnect/reconnect)
Acceptance Criteria:
- Permission denied shows helpful UI instead of error
- User can recover without refreshing page
- "Use text-only mode" fallback button available
- Tests for permission error states (14 tests)
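One way to separate "permission denied" from other failures is to inspect the `name` of the error that `navigator.mediaDevices.getUserMedia` rejects with (`NotAllowedError` for a denied permission, `NotFoundError` when no input device exists). The sketch below assumes that approach; `MicErrorKind` and `classifyMicError` are hypothetical names, and the actual hook may classify errors differently.

```typescript
// Hypothetical helper; names are not taken from useRealtimeVoiceSession.ts.
type MicErrorKind = "permission-denied" | "no-device" | "other";

// getUserMedia rejects with a DOMException whose `name` identifies the cause.
// Only `name` is read here, so any error-like object can be classified.
function classifyMicError(error: unknown): MicErrorKind {
  const name =
    typeof error === "object" && error !== null && "name" in error
      ? String((error as { name: unknown }).name)
      : "";
  if (name === "NotAllowedError") return "permission-denied";
  if (name === "NotFoundError") return "no-device";
  return "other";
}
```

A caller could set micPermissionDenied only for the `"permission-denied"` result and show the generic retry path for everything else.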
3. Enhanced Message Action Menu UX
Effort: 1-2 days
Files: MessageActionMenu.tsx, MessageBubble.tsx
Description: Make message actions more discoverable and polish the interaction patterns.
Features:
- Show action icons on hover (not hidden in dropdown)
- Add tooltips with keyboard shortcuts
- Confirmation dialogs for destructive actions (delete)
- Optimistic UI updates for edit/regenerate
- Loading states during async operations
Acceptance Criteria:
- Actions visible on hover without opening menu
- Delete requires confirmation
- Loading spinners during async ops
- Tests for all action states
P1 - Important (Near-term)
4. Inline Branch Creation UI
Status: ✅ Implemented (feature/frontend-phase3C-branches-citations)
Effort: 2-3 days
Files: MessageBubble.tsx, useBranching.ts, new BranchPreview.tsx
Description: Allow users to "fork" a conversation from any message with inline preview.
Features:
- "Branch from here" action in message menu
- Inline preview showing where branch will start
- Navigate to new branch or stay in current
- Visual indicator for messages that have branches
Acceptance Criteria:
- Can create branch from any message
- Visual feedback for branched messages
- Branch preview before confirming
- Tests for branch creation flow (16 tests)
5. Voice Transcript Preview During Speech
Status: ✅ Implemented (feature/frontend-phase3D-voice-transcript-preview)
Effort: 1-2 days
Files: VoiceModePanel.tsx, useRealtimeVoiceSession.ts, new VoiceTranscriptPreview.tsx
Description: Show real-time transcript preview as user speaks, before finalizing.
Features:
- Live partial transcript display (streaming text)
- Visual distinction between partial and final transcripts
- Auto-clear partial on new utterance
- Smooth animation for transcript updates
Acceptance Criteria:
- Partial transcripts appear as user speaks
- Clear visual distinction (e.g., italic/faded)
- Smooth transitions to final transcript
- Tests for component and hook (14 + 2 tests)
6. Citation Sidebar Filters
Status: ✅ Implemented (feature/frontend-phase3C-branches-citations)
Effort: 1-2 days
Files: CitationSidebar.tsx
Description: Add filtering and search capabilities to the citation sidebar.
Features:
- Filter by source type (KB, PubMed, guidelines)
- Filter by message (show citations for selected message)
- Search citations by text
- Sort by relevance/date
- "Jump to source" in message
Acceptance Criteria:
- Can filter citations by type
- Can search citation text
- "Jump to" scrolls to citation in message
- Tests for filter/search functionality (18 new tests in Phase 3C)
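Because the filtering is client-side, the combined type, message, and text filters can be expressed as one pure function over the citation list. The sketch below is illustrative only: the `Citation` and `CitationFilter` shapes and the `filterCitations` name are assumptions, and the real CitationSidebar.tsx categorization may differ.

```typescript
// Hypothetical shapes for illustration; the real Citation type may differ.
type SourceType = "kb" | "pubmed" | "guideline";

interface Citation {
  id: string;
  messageId: string;
  sourceType: SourceType;
  title: string;
  snippet: string;
}

interface CitationFilter {
  sourceType?: SourceType; // undefined means "All"
  messageId?: string; // undefined means every message
  query?: string; // case-insensitive text search
}

// All active filters must match for a citation to be kept.
function filterCitations(citations: Citation[], filter: CitationFilter): Citation[] {
  const query = filter.query?.trim().toLowerCase() ?? "";
  return citations.filter((c) => {
    if (filter.sourceType !== undefined && c.sourceType !== filter.sourceType) return false;
    if (filter.messageId !== undefined && c.messageId !== filter.messageId) return false;
    if (query === "") return true;
    return `${c.title} ${c.snippet}`.toLowerCase().includes(query);
  });
}
```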
P2 - Nice to Have (Future)
7. Clinical Context Presets
Effort: 2-3 days
Files: ClinicalContextPanel.tsx, new ClinicalContextPresets.tsx
Description: Allow users to save and load clinical context presets for common scenarios.
Features:
- Save current context as named preset
- Load preset to populate fields
- Built-in presets for common scenarios (pediatric, cardiac, etc.)
- Export/import presets
Acceptance Criteria:
- Can save custom presets
- Can load presets
- Built-in presets available
- Tests for preset save/load
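Whether presets live server-side or in localStorage is still an open question in this plan. If the localStorage option were chosen, save and load could look roughly like the sketch below. Everything here is hypothetical: the `ClinicalContextPreset` shape, the `PRESETS_KEY` storage key, and the function names. The store is passed in as a small interface that `window.localStorage` satisfies, so the logic can be tested with an in-memory stand-in.

```typescript
// Hypothetical shape; the real clinical context fields are not listed in this plan.
interface ClinicalContextPreset {
  name: string;
  context: Record<string, string>;
}

// Structural subset of the Web Storage API, so window.localStorage fits.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const PRESETS_KEY = "clinical-context-presets"; // hypothetical storage key

function loadPresets(store: KeyValueStore): ClinicalContextPreset[] {
  const raw = store.getItem(PRESETS_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as ClinicalContextPreset[]) : [];
  } catch {
    return []; // corrupted data falls back to no custom presets
  }
}

// Saving a preset with an existing name replaces the earlier one.
function savePreset(store: KeyValueStore, preset: ClinicalContextPreset): ClinicalContextPreset[] {
  const others = loadPresets(store).filter((p) => p.name !== preset.name);
  const next = [...others, preset];
  store.setItem(PRESETS_KEY, JSON.stringify(next));
  return next;
}
```

Since the stored value is plain JSON, the same serialized string could also back the export/import feature.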
8. Voice Interruption (Barge-in) Indicator
Effort: 1-2 days
Files: VoiceModePanel.tsx, useRealtimeVoiceSession.ts
Description: Visual feedback when user interrupts AI response with new speech.
Features:
- Visual indicator when barge-in detected
- Show which part of AI response was interrupted
- Smooth transition from AI speaking to user speaking
Acceptance Criteria:
- Barge-in visually indicated
- AI audio stops gracefully
- Tests for barge-in detection
9. Message Regeneration Options
Effort: 1-2 days
Files: MessageBubble.tsx, new RegenerateOptionsDialog.tsx
Description: Allow users to customize regeneration with options (temperature, length, etc.).
Features:
- "Regenerate with options" menu item
- Temperature slider (more creative vs more focused)
- Length preference (shorter/longer)
- Keep or clear clinical context
Acceptance Criteria:
- Can regenerate with options
- Options affect response
- Tests for regeneration options
10. E2E Tests with Playwright
Effort: 3-5 days
Files: New e2e/ directory
Description: Set up Playwright for critical user flow E2E testing.
Features:
- Login → chat → send message flow
- Voice mode activation and basic interaction
- Citation display and expansion
- Conversation management (rename, archive, delete)
Acceptance Criteria:
- Playwright configured and running
- 5+ critical path E2E tests passing
- CI integration for E2E tests
Dependencies on Backend
Most Phase 3 items are frontend-only. Potential backend dependencies:
| Feature | Backend Dependency |
|---|---|
| Voice metrics | None (already tracked client-side) |
| Mic permission | None (browser API) |
| Message actions | Existing APIs sufficient |
| Branching | POST /api/conversations/:id/branch (exists) |
| Transcript preview | WebSocket events (already sent) |
| Citation filters | None (client-side filtering) |
| Clinical presets | May need POST /api/clinical-context/presets |
| Barge-in | OpenAI Realtime API (already supported) |
| Regeneration options | May need API params for temp/length |
Estimated Timeline
| Priority | Items | Estimated Effort |
|---|---|---|
| P0 | Voice Metrics, Mic UX, Message Actions | 3-5 days |
| P1 | Inline Branch, Transcript Preview, Citation Filters | 4-7 days |
| P2 | Presets, Barge-in, Regeneration, E2E | 7-12 days |
| Total | 10 items | 14-24 days |
Testing Strategy
Unit Tests
- All new components tested with Vitest + React Testing Library
- Mock voice session hook for VoiceMetricsDisplay tests
- Test filter/search logic in CitationSidebar
Integration Tests
- Message action flows (edit → save → verify)
- Branch creation flow
- Citation filter interactions
E2E Tests (P2)
- Critical path flows with Playwright
- Voice mode activation (if possible with mocked audio)
Files to Create/Modify
New Files
- src/components/voice/VoiceMetricsDisplay.tsx
- src/components/chat/BranchPreview.tsx
- src/components/chat/RegenerateOptionsDialog.tsx
- src/components/clinical/ClinicalContextPresets.tsx
- e2e/ directory with Playwright tests
Modified Files
- src/components/voice/VoiceModePanel.tsx (metrics, transcript preview)
- src/components/chat/MessageActionMenu.tsx (UX enhancements)
- src/components/chat/MessageBubble.tsx (action visibility)
- src/components/citations/CitationSidebar.tsx (filters)
- src/hooks/useRealtimeVoiceSession.ts (mic error handling)
Open Questions
- Clinical context presets API: Should presets be stored server-side per user, or just in localStorage?
- Regeneration options: Does the backend support temperature/length params for regeneration?
- E2E voice testing: Can we mock audio APIs in Playwright, or should voice E2E be manual?
Phase 3A Summary – Voice UX & Observability (Completed)
Phase 3A focused on voice mode polish and observability. This work was completed as part of the Phase 2 polish effort (PR #66).
Implemented Features
- VoiceMetricsDisplay Component
  - Collapsible metrics panel with real-time latency display
  - Color-coded indicators (green <500ms, yellow 500-1000ms, red >1000ms)
  - Displays: connection time, STT latency, response latency, time to first transcript
  - Shows user/AI message counts and reconnect count
  - Accessible legend with sr-only text for screen readers
  - Robust header that handles narrow viewport widths
- Mic Permission UX
  - Contextual error messages for permission denied vs generic errors
  - Browser settings instructions for granting mic access
  - "Use text-only mode" fallback button
  - State properly resets on disconnect/reconnect
  - Retry button for non-permission connection errors
- Voice Metrics Logging
  - Console logging for observability: voice_session_connect_ms, voice_stt_latency_ms, voice_first_reply_ms, voice_session_duration_ms
  - onMetricsUpdate callback for parent component integration
Test Coverage
- VoiceMetricsDisplay.test.tsx: 25 tests (visibility, collapsible, metrics display, formatting, color coding, accessibility)
- VoiceModePanel-metrics.test.tsx: 8 tests (integration wiring)
- VoiceModePanel-permissions.test.tsx: 14 tests (permission handling, state, connection status)
Upcoming (Phase 3D+)
- Voice transcript preview during speech
- Barge-in indicator
- Message action menu enhancements
Phase 3B Summary – Keyboard-driven Voice UX & Responsive Layout (Completed)
Phase 3B focused on keyboard accessibility and responsive design for voice mode. This work was completed as PR #67.
Implemented Features
- Keyboard-driven Voice Mode Control
  - Global hotkey Ctrl+Shift+V to toggle voice mode
  - Push-to-talk mode (hold Space to talk)
  - Escape to disconnect voice session
  - Full keyboard navigation within voice panel
- Responsive Voice Panel Layout
  - Stacked layout on narrow screens (< 640px)
  - Touch-friendly buttons meeting 44px minimum tap targets
  - Metrics legend wraps appropriately on mobile
  - Waveform scales to viewport width
Test Coverage
- Multiple tests for keyboard interactions and responsive behavior
Phase 3C Summary – Advanced Branching & Citations (Completed)
Phase 3C focused on conversation branching preview and citation filtering enhancements. This work was completed as part of feature/frontend-phase3C-branches-citations.
Implemented Features
- BranchPreview Component
  - Confirmation dialog before creating branch
  - Shows parent message preview with truncation
  - Displays message position (e.g., "message 2 of 4")
  - Shows count of messages that will be excluded from branch
  - Loading state with spinner during branch creation
  - Proper ARIA attributes for accessibility
- Visual Branch Indicator
  - Messages that have branches show "Branched" badge
  - Badge styled differently for user vs assistant messages
  - Uses branchedMessageIds Set for efficient lookup
- Citation Sidebar Filters
  - Type filter pills: All, Knowledge Base, PubMed/DOI, Guidelines
  - Message filter dropdown (when multiple messages have citations)
  - Filters combine with existing text search
  - Smart categorization based on source and sourceType
- Jump-to-Message Functionality
  - "Jump to message #N" button on each citation
  - Smooth scroll to message with highlight effect
  - 2-second highlight ring animation
  - Uses data-message-id attribute for targeting
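The values the branch preview displays (message position, excluded count, truncated parent text) can all be derived from the message list and the chosen parent message. The sketch below shows one way to compute them. It is not the BranchPreview.tsx implementation: `summarizeBranch` and its field names are hypothetical, and it assumes "excluded" means every message after the branch point.

```typescript
// Hypothetical helper; field and function names are illustrative.
interface BranchPreviewSummary {
  position: string; // e.g. "message 2 of 4"
  excludedCount: number; // assumption: messages after the branch point
  preview: string; // truncated parent message text
}

function summarizeBranch(
  messages: { id: string; content: string }[],
  parentId: string,
  maxPreviewLength = 80,
): BranchPreviewSummary | null {
  const index = messages.findIndex((m) => m.id === parentId);
  const parent = messages[index];
  if (parent === undefined) return null; // unknown parent message

  const preview =
    parent.content.length > maxPreviewLength
      ? parent.content.slice(0, maxPreviewLength) + "..."
      : parent.content;

  return {
    position: `message ${index + 1} of ${messages.length}`,
    excludedCount: messages.length - index - 1,
    preview,
  };
}
```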
Test Coverage
- BranchPreview.test.tsx: 16 tests (rendering, actions, creating state, edge cases, accessibility)
- CitationSidebar-Phase8.test.tsx: 18 new tests for Phase 3C features (type filters, message filters, jump-to, combined filters)
Files Created/Modified
New Files:
- src/components/chat/BranchPreview.tsx
- src/components/chat/__tests__/BranchPreview.test.tsx
Modified Files:
- src/pages/ChatPage.tsx (branch preview state, onJumpToMessage callback)
- src/components/chat/MessageList.tsx (branchedMessageIds prop)
- src/components/chat/MessageBubble.tsx (hasBranch prop, visual indicator)
- src/components/citations/CitationSidebar.tsx (type/message filters, jump-to)
- src/components/citations/__tests__/CitationSidebar-Phase8.test.tsx (new tests)
Phase 3D Summary – Voice Transcript Preview (Completed)
Phase 3D focused on implementing live speech-to-text preview while the user is speaking. This work was completed on branch feature/frontend-phase3D-voice-transcript-preview.
Implemented Features
- Hook-level Partial Transcript Support
  - Extended useRealtimeVoiceSession hook with partialTranscript state
  - Added handler for conversation.item.input_audio_transcription.delta events
  - Accumulates partial text as speech is recognized
  - Clears partial transcript on speech start and when final transcript arrives
  - Partial transcripts count toward "time to first transcript" metrics
- VoiceTranscriptPreview Component
  - Shows "Listening" indicator with animated pulsing dot
  - Displays partial transcript text in italic blue styling
  - Blinking cursor indicates more text is expected
  - Only visible when speaking AND partial text exists
  - Accessible with aria-live="polite" and aria-atomic="false"
  - Decorative elements hidden from screen readers
- VoiceModePanel Integration
  - VoiceTranscriptPreview appears after waveform, before final transcript
  - "Speaking..." indicator hidden when partial transcript is displayed
  - Smooth transition from partial to final transcript display
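The accumulate-and-clear behavior described for the partial transcript can be modeled as a small reducer over transcription events. This is a sketch under assumptions rather than the hook's actual code: `reduceTranscript`, `TranscriptState`, and `shouldShowPreview` are hypothetical names, the real hook keeps this state in React, and only the delta event name comes from this document (the speech-start and completed event names and their payload fields are assumed from the OpenAI Realtime API).

```typescript
// Hypothetical state shape; the real hook manages this with React state.
interface TranscriptState {
  partialTranscript: string;
  finalTranscript: string;
}

// Assumed subset of Realtime server events relevant to the preview.
type TranscriptEvent =
  | { type: "input_audio_buffer.speech_started" }
  | { type: "conversation.item.input_audio_transcription.delta"; delta: string }
  | { type: "conversation.item.input_audio_transcription.completed"; transcript: string };

function reduceTranscript(state: TranscriptState, event: TranscriptEvent): TranscriptState {
  switch (event.type) {
    case "input_audio_buffer.speech_started":
      // A new utterance clears any leftover partial text.
      return { ...state, partialTranscript: "" };
    case "conversation.item.input_audio_transcription.delta":
      // Accumulate partial text as speech is recognized.
      return { ...state, partialTranscript: state.partialTranscript + event.delta };
    case "conversation.item.input_audio_transcription.completed":
      // The final transcript replaces the partial one.
      return { partialTranscript: "", finalTranscript: event.transcript };
  }
}

// Preview is only visible when speaking AND partial text exists.
function shouldShowPreview(isSpeaking: boolean, partialTranscript: string): boolean {
  return isSpeaking && partialTranscript.length > 0;
}
```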
Test Coverage
- VoiceTranscriptPreview.test.tsx: 14 tests (rendering, visual indicators, accessibility, content updates, edge cases)
- useRealtimeVoiceSession.test.ts: 2 additional tests (partialTranscript initialization, disconnect cleanup)
Files Created/Modified
New Files:
- src/components/voice/VoiceTranscriptPreview.tsx
- src/components/voice/__tests__/VoiceTranscriptPreview.test.tsx
Modified Files:
- src/hooks/useRealtimeVoiceSession.ts (partialTranscript state, delta event handler)
- src/hooks/__tests__/useRealtimeVoiceSession.test.ts (2 new tests)
- src/components/voice/VoiceModePanel.tsx (VoiceTranscriptPreview integration)
Additional P1 Backlog Items (Suggested)
Keyboard-driven Voice UX
Effort: 1-2 days
Files: VoiceModePanel.tsx, MessageInput.tsx
Description: Add keyboard shortcuts for voice mode control.
Features:
- Global hotkey to toggle voice mode (e.g., Ctrl+Shift+V)
- Push-to-talk mode option (hold Space to talk)
- Keyboard navigation within voice panel
- Escape to disconnect
Acceptance Criteria:
- Can toggle voice mode with keyboard shortcut
- Push-to-talk mode available in settings
- Tests for keyboard interactions
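The shortcut rules above can be isolated in a matcher that takes a keyboard-event-like object and returns the intended action, which keeps the key handling testable without a DOM. The sketch is illustrative: `matchVoiceShortcut`, `KeyLike`, and `ShortcutContext` are hypothetical names, and suppressing push-to-talk while the user is typing in a text field is an assumption, not a documented behavior.

```typescript
// Structural subset of KeyboardEvent, so a real DOM event or a plain
// test object can be passed in.
interface KeyLike {
  key: string;
  ctrlKey: boolean;
  shiftKey: boolean;
  altKey: boolean;
  metaKey: boolean;
}

type VoiceShortcut = "toggle-voice" | "disconnect" | "push-to-talk";

// Hypothetical context flags supplied by the caller.
interface ShortcutContext {
  voiceActive: boolean;
  pushToTalkEnabled: boolean;
  typingInField: boolean; // assumption: Space must still type in inputs
}

function matchVoiceShortcut(event: KeyLike, ctx: ShortcutContext): VoiceShortcut | null {
  const noModifiers = !event.ctrlKey && !event.shiftKey && !event.altKey && !event.metaKey;

  // With Shift held, `key` is reported as "V", so compare case-insensitively.
  if (
    event.ctrlKey &&
    event.shiftKey &&
    !event.altKey &&
    !event.metaKey &&
    event.key.toLowerCase() === "v"
  ) {
    return "toggle-voice";
  }

  // The remaining shortcuts only apply during an active voice session.
  if (!ctx.voiceActive) return null;
  if (event.key === "Escape" && noModifiers) return "disconnect";
  if (event.key === " " && noModifiers && ctx.pushToTalkEnabled && !ctx.typingInField) {
    return "push-to-talk";
  }
  return null;
}
```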
Responsive Voice Panel & Metrics Layout
Effort: 1 day
Files: VoiceModePanel.tsx, VoiceMetricsDisplay.tsx
Description: Ensure voice panel and metrics display work well on mobile and narrow viewports.
Features:
- Stacked layout on narrow screens
- Touch-friendly buttons (minimum 44px tap targets)
- Metrics legend collapses or wraps on mobile
- Waveform scales appropriately
Acceptance Criteria:
- Usable on 320px viewport width
- All interactive elements meet touch target guidelines
- Tests for responsive behavior (if feasible)
Created: 2025-11-25
Last Updated: 2025-11-26
Author: Claude (AI Assistant)
Status: Phase 3A-D Complete
🤖 Generated with Claude Code