# Unified Conversation Memory
Voice Mode v4.1 introduces unified conversation memory that maintains context across voice and text interactions, enabling seamless mode switching.
## Overview
The unified memory system provides:
- **Cross-modal context**: Conversation history shared between voice and text
- **Language switching events**: Tracks when users switch languages
- **Mode transition handling**: Preserves context when switching voice ↔ text
- **Session persistence**: Maintains memory across browser refreshes
- **Privacy controls**: User-controlled memory retention
```
┌─────────────────────────────────────────────────────────┐
│                  Unified Memory Store                   │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌────────────┐                      ┌────────────┐     │
│  │ Voice Mode │◄────── Shared ──────►│ Text Mode  │     │
│  │            │        Memory        │            │     │
│  └────────────┘                      └────────────┘     │
│        │                                   │            │
│        ▼                                   ▼            │
│  ┌───────────────────────────────────────────────┐      │
│  │             Conversation Context              │      │
│  ├───────────────────────────────────────────────┤      │
│  │ • Message history (last 50 messages)          │      │
│  │ • Language preferences & switches             │      │
│  │ • RAG context (retrieved passages)            │      │
│  │ • User preferences                            │      │
│  │ • Session metadata                            │      │
│  └───────────────────────────────────────────────┘      │
│                                                         │
└─────────────────────────────────────────────────────────┘
```
### Thinker-Talker Pipeline Integration
```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant Memory as Unified Memory
    participant Thinker
    participant RAG
    participant Talker

    User->>Frontend: Voice or Text input
    Frontend->>Memory: add_entry(role="user", mode, content)
    Note over Memory: Store with mode tag (voice/text)
    Memory->>Thinker: get_context(max_messages=10)
    Thinker->>RAG: retrieve_passages(query)
    RAG-->>Thinker: relevant_passages
    Note over Thinker: Build LLM context with history + RAG
    Thinker-->>Memory: add_entry(role="assistant")
    Thinker-->>Talker: response_stream
    Talker-->>Frontend: audio_chunks
    Note over Memory: Context preserved across mode switches
```
### Memory Flow on Mode Switch
```mermaid
flowchart TD
    subgraph Voice Mode
        VA[🎤 Voice Input]
        VT[Voice Transcript]
        VM[Voice Message Entry]
    end

    subgraph Text Mode
        TA[⌨️ Text Input]
        TM[Text Message Entry]
    end

    subgraph Unified Memory
        MC[Message Context]
        LC[Language Events]
        RC[RAG Context]
        ME[Mode Events]
    end

    subgraph Thinker-Talker
        TH[Thinker LLM]
        TK[Talker TTS]
    end

    VA --> VT --> VM --> MC
    TA --> TM --> MC
    VM --> ME
    TM --> ME
    MC --> TH
    LC --> TH
    RC --> TH
    TH --> TK

    style MC fill:#FFD700
```
### Environment Variable for Data Directory
When customizing lexicon paths, use the `_resolve_data_dir()` helper:
```python
from app.core.config import _resolve_data_dir
# Returns VOICEASSIST_DATA_DIR env var or default ./data
data_dir = _resolve_data_dir()
# Lexicon paths relative to data dir
lexicons_path = data_dir / "lexicons" / "medical_terms.txt"
```
**Environment variable**: `VOICEASSIST_DATA_DIR=/path/to/data`
If not set, defaults to `./data` relative to the working directory.
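The helper's lookup order can be sketched as follows (`resolve_data_dir` is an illustrative stand-in, not the internal `_resolve_data_dir` itself):

```python
import os
from pathlib import Path

def resolve_data_dir() -> Path:
    """Illustrative stand-in for _resolve_data_dir: return the
    VOICEASSIST_DATA_DIR env var if set, else ./data."""
    env = os.environ.get("VOICEASSIST_DATA_DIR")
    return Path(env) if env else Path("./data")

os.environ["VOICEASSIST_DATA_DIR"] = "/srv/voiceassist/data"
lexicons_path = resolve_data_dir() / "lexicons" / "medical_terms.txt"
```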
## Memory Architecture
### Memory Layers
| Layer | Scope | Retention | Storage |
| ---------- | --------------- | ------------ | ---------- |
| Session | Current session | Until close | Redis |
| Short-term | Last 24 hours | 24h TTL | Redis |
| Long-term | User history | Configurable | PostgreSQL |
| Episodic | Key moments | Indefinite | PostgreSQL |
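The layer table can be expressed as configuration. The constants and field names below are assumptions for illustration, not the shipped config schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MemoryLayer:
    name: str
    storage: str                 # "redis" or "postgres"
    ttl_seconds: Optional[int]   # None = no automatic expiry

# One constant per row of the table above (names are illustrative)
SESSION = MemoryLayer("session", "redis", ttl_seconds=None)          # cleared on close
SHORT_TERM = MemoryLayer("short_term", "redis", ttl_seconds=24 * 3600)
LONG_TERM = MemoryLayer("long_term", "postgres", ttl_seconds=None)   # configurable
EPISODIC = MemoryLayer("episodic", "postgres", ttl_seconds=None)     # indefinite
```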
### Memory Entry Structure
```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Literal, Optional

@dataclass
class MemoryEntry:
    """Single memory entry in the conversation."""
    id: str
    session_id: str
    user_id: str
    timestamp: datetime

    # Content
    role: Literal["user", "assistant", "system"]
    content: str
    mode: Literal["voice", "text"]

    # Context
    language: str
    detected_language: str
    language_switched: bool

    # RAG context
    retrieved_passages: List[str]
    sources: List[Dict]

    # Metadata
    latency_ms: Optional[float]
    degradations: List[str]
    phi_detected: bool
```
## Implementation
### UnifiedMemoryService
```python
from app.services.unified_memory import UnifiedMemoryService

memory_service = UnifiedMemoryService()

# Add voice message to memory
await memory_service.add_entry(
    session_id="session_123",
    user_id="user_456",
    entry=MemoryEntry(
        role="user",
        content="What is metformin used for?",
        mode="voice",
        language="en",
        detected_language="en",
        language_switched=False
    )
)

# Get context for LLM
context = await memory_service.get_context(
    session_id="session_123",
    max_messages=10,
    include_rag=True
)
```
### Cross-Modal Context
When switching from voice to text (or vice versa):
```python
async def handle_mode_switch(
    session_id: str,
    from_mode: str,
    to_mode: str
) -> ConversationContext:
    """Handle mode switch while preserving context."""
    # Get existing conversation context
    context = await memory_service.get_context(session_id)

    # Add mode switch event
    await memory_service.add_event(
        session_id=session_id,
        event_type="mode_switch",
        data={
            "from_mode": from_mode,
            "to_mode": to_mode,
            "timestamp": datetime.utcnow().isoformat()
        }
    )

    # Return context for new mode
    return context
```
### Language Switching Events
Track language changes for multilingual users:
```python
async def track_language_switch(
    session_id: str,
    from_language: str,
    to_language: str,
    trigger: str  # "user_request" | "auto_detected" | "explicit_setting"
):
    """Track when user switches languages."""
    await memory_service.add_event(
        session_id=session_id,
        event_type="language_switch",
        data={
            "from_language": from_language,
            "to_language": to_language,
            "trigger": trigger,
            "timestamp": datetime.utcnow().isoformat()
        }
    )

    # Update session language preference
    await session_service.update_language(
        session_id=session_id,
        language=to_language
    )
```
## Context Building
### Building LLM Context
```python
async def build_llm_context(
    session_id: str,
    current_query: str,
    rag_results: List[Dict]
) -> List[Dict]:
    """Build context for LLM including memory."""
    # Get conversation history
    history = await memory_service.get_history(
        session_id=session_id,
        max_messages=10
    )

    # Get language switches (for context awareness)
    language_events = await memory_service.get_events(
        session_id=session_id,
        event_type="language_switch",
        limit=5
    )

    # Build messages array
    messages = []

    # System prompt with context
    system_prompt = build_system_prompt(
        language_history=language_events,
        rag_context=rag_results
    )
    messages.append({"role": "system", "content": system_prompt})

    # Add conversation history
    for entry in history:
        messages.append({
            "role": entry.role,
            "content": entry.content
        })

    # Add current query
    messages.append({
        "role": "user",
        "content": current_query
    })
    return messages
```
### Context Truncation
When context exceeds token limits:
```python
async def truncate_context(
    messages: List[Dict],
    max_tokens: int = 4000
) -> List[Dict]:
    """Truncate context while preserving important information."""
    # Within budget: nothing to do
    if count_tokens(messages) <= max_tokens:
        return messages

    # Always keep the system prompt and the last 3 messages;
    # summarize everything in between
    middle = messages[1:-3]
    if middle:
        summary = await summarize_messages(middle)
        summary_message = {
            "role": "system",
            "content": f"[Previous conversation summary: {summary}]"
        }
        return [messages[0], summary_message] + messages[-3:]
    return messages[:1] + messages[-3:]
```
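`count_tokens` and `summarize_messages` are left undefined above. A rough, synchronous sketch of the counting side and the no-summarizer fallback (the ~4 chars/token heuristic is an assumption; production code would use the model's tokenizer):

```python
from typing import Dict, List

def count_tokens(messages: List[Dict]) -> int:
    """Crude token estimate: ~4 characters per token."""
    return sum(len(m["content"]) // 4 + 1 for m in messages)

def truncate_without_summary(messages: List[Dict], max_tokens: int = 4000) -> List[Dict]:
    """Fallback truncation when no summarizer is available:
    keep the system prompt plus the last three messages."""
    if count_tokens(messages) <= max_tokens:
        return messages
    return messages[:1] + messages[-3:]

history = [{"role": "system", "content": "x" * 40}] + [
    {"role": "user", "content": "y" * 400} for _ in range(10)
]
trimmed = truncate_without_summary(history, max_tokens=100)
```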
## Session Persistence
### Redis Session Storage
```python
import json
from typing import List

from redis.asyncio import Redis

class RedisMemoryStore:
    """Redis-backed memory store for sessions."""

    def __init__(self, redis_client: Redis):
        self.redis = redis_client
        self.ttl = 86400  # 24 hours

    async def save_session(
        self,
        session_id: str,
        memory: List[MemoryEntry]
    ):
        key = f"memory:{session_id}"
        data = json.dumps([entry.to_dict() for entry in memory])
        await self.redis.set(key, data, ex=self.ttl)

    async def load_session(
        self,
        session_id: str
    ) -> List[MemoryEntry]:
        key = f"memory:{session_id}"
        data = await self.redis.get(key)
        if data:
            entries = json.loads(data)
            return [MemoryEntry.from_dict(e) for e in entries]
        return []

    async def extend_ttl(self, session_id: str):
        key = f"memory:{session_id}"
        await self.redis.expire(key, self.ttl)
```
### Long-term Storage
For persistent memory across sessions:
```python
class PostgresMemoryStore:
    """PostgreSQL-backed long-term memory store."""

    async def save_conversation(
        self,
        user_id: str,
        session_id: str,
        entries: List[MemoryEntry]
    ):
        """Save conversation to long-term storage."""
        async with self.db.transaction():
            # Save conversation record (fetchval returns the RETURNING value)
            conversation_id = await self.db.fetchval(
                """
                INSERT INTO conversations (user_id, session_id, created_at)
                VALUES ($1, $2, NOW())
                RETURNING id
                """,
                user_id, session_id
            )

            # Save entries
            for entry in entries:
                await self.db.execute(
                    """
                    INSERT INTO conversation_entries
                        (conversation_id, role, content, mode, language, timestamp)
                    VALUES ($1, $2, $3, $4, $5, $6)
                    """,
                    conversation_id,
                    entry.role,
                    entry.content,
                    entry.mode,
                    entry.language,
                    entry.timestamp
                )
```
## Privacy Controls
### User Memory Settings
```python
from dataclasses import dataclass

@dataclass
class MemorySettings:
    """User's memory and privacy preferences."""
    enabled: bool = True
    retention_days: int = 30
    cross_session: bool = True
    save_voice_transcripts: bool = True
    save_rag_context: bool = True
    anonymize_phi: bool = True
```
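These settings can be enforced at write time. `should_persist` below is a hypothetical helper (its name and exact policy are assumptions) showing how the flags might gate persistence:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Settings:
    """Trimmed stand-in for MemorySettings."""
    enabled: bool = True
    retention_days: int = 30
    save_voice_transcripts: bool = True

def should_persist(settings: Settings, mode: str, timestamp: datetime) -> bool:
    """Gate persistence on the user's memory settings (illustrative policy)."""
    if not settings.enabled:
        return False
    if mode == "voice" and not settings.save_voice_transcripts:
        return False
    # Entries older than the retention window are not persisted
    cutoff = datetime.now(timezone.utc) - timedelta(days=settings.retention_days)
    return timestamp >= cutoff
```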
### Memory Deletion
```python
async def delete_user_memory(
    user_id: str,
    scope: Literal["session", "day", "all"]
):
    """Delete user's conversation memory."""
    if scope == "session":
        await redis_store.delete_session(user_id)
    elif scope == "day":
        await postgres_store.delete_today(user_id)
    elif scope == "all":
        await redis_store.delete_all(user_id)
        await postgres_store.delete_all(user_id)
    logger.info(f"Deleted memory for user {user_id}, scope: {scope}")
```
## Frontend Integration
### Memory Hook
```tsx
import { useUnifiedMemory } from "@/hooks/useUnifiedMemory";

const ChatContainer = () => {
  const { messages, addMessage, clearMemory, mode, switchMode } = useUnifiedMemory();

  const handleSend = async (content: string) => {
    // Add to unified memory
    await addMessage({
      role: "user",
      content,
      mode: mode, // "voice" or "text"
      language: currentLanguage,
    });

    // Get AI response
    const response = await fetchResponse(content);

    // Add response to memory
    await addMessage({
      role: "assistant",
      content: response.text,
      mode: mode,
      language: response.language,
    });
  };

  // ChatView stands in for the app's message list + input component
  return (
    <ChatView
      messages={messages}
      mode={mode}
      onSend={handleSend}
      onSwitchMode={switchMode}
      onClearMemory={clearMemory}
    />
  );
};
```