# Multilingual RAG Architecture

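The multilingual RAG service supports voice interactions in multiple languages using a translate-then-retrieve pattern: non-English queries are translated to English, retrieval runs against an English-only knowledge base, and the LLM answers in the user's language. Each stage degrades gracefully on failure.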

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                     Multilingual RAG Pipeline                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────┐   ┌─────────────┐   ┌───────────────┐             │
│  │  User    │──▶│  Language   │──▶│  Translation  │             │
│  │  Query   │   │  Detection  │   │  (if needed)  │             │
│  └──────────┘   └─────────────┘   └───────────────┘             │
│                        │                   │                     │
│                        ▼                   ▼                     │
│              ┌─────────────────────────────────┐                │
│              │    English Query for RAG        │                │
│              └─────────────────────────────────┘                │
│                              │                                   │
│                              ▼                                   │
│              ┌─────────────────────────────────┐                │
│              │      RAG Knowledge Base         │                │
│              │   (English embeddings only)     │                │
│              └─────────────────────────────────┘                │
│                              │                                   │
│                              ▼                                   │
│              ┌─────────────────────────────────┐                │
│              │      LLM Response Generation    │                │
│              │   (with language instruction)   │                │
│              └─────────────────────────────────┘                │
│                              │                                   │
│                              ▼                                   │
│              ┌─────────────────────────────────┐                │
│              │     Response in User Language   │                │
│              └─────────────────────────────────┘                │
└─────────────────────────────────────────────────────────────────┘
```

## Translation Service

### Multi-Provider Fallback

```python
from app.services.translation_service import TranslationService

# Initialize with providers
service = TranslationService(
    primary_provider="google",
    fallback_provider="deepl"
)

# Translate with automatic fallback
result = await service.translate_with_fallback(
    text="¿Cuáles son los síntomas de la diabetes?",
    source="es",
    target="en"
)

if result.failed:
    # Graceful degradation - use original query
    print(f"Translation failed: {result.error_message}")
else:
    print(f"Translated: {result.text}")
    if result.used_fallback:
        print("Used fallback provider")
```
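To make the fallback behaviour concrete, here is a minimal sketch of how a `translate_with_fallback` call could be structured. It is not the service's actual implementation: the `Provider` signature, the `TranslationResult` dataclass, and the two stub providers are assumptions chosen to mirror the fields used above (`failed`, `used_fallback`, `error_message`, `text`).

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional

# Hypothetical provider signature: (text, source, target) -> translated text.
Provider = Callable[[str, str, str], Awaitable[str]]


@dataclass
class TranslationResult:
    text: str  # translated text, or the original text when translation failed
    failed: bool = False
    used_fallback: bool = False
    error_message: Optional[str] = None


async def translate_with_fallback(
    text: str, source: str, target: str, primary: Provider, fallback: Provider
) -> TranslationResult:
    """Try the primary provider, then the fallback; never raise to the caller."""
    try:
        return TranslationResult(text=await primary(text, source, target))
    except Exception as primary_error:
        try:
            translated = await fallback(text, source, target)
            return TranslationResult(text=translated, used_fallback=True)
        except Exception as fallback_error:
            # Both providers failed: return the original text so the caller
            # can continue with the untranslated query.
            return TranslationResult(
                text=text,
                failed=True,
                error_message=f"primary: {primary_error}; fallback: {fallback_error}",
            )


# Stub providers used only for this illustration.
async def broken_provider(text: str, source: str, target: str) -> str:
    raise RuntimeError("provider unavailable")


async def echo_provider(text: str, source: str, target: str) -> str:
    return f"[{target}] {text}"


result = asyncio.run(
    translate_with_fallback("hola", "es", "en", broken_provider, echo_provider)
)
print(result.used_fallback, result.text)
```

Returning a result object instead of raising keeps the decision about degradation with the caller, which matches the `if result.failed` check in the example above.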

### Caching Strategy

Translations are cached in Redis with a 7-day TTL:

```python
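import hashlib

# Cache key format. A stable digest is used here because Python's built-in
# hash() is salted per process for strings, so its values cannot be shared
# between workers through Redis.
text_digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
cache_key = f"trans:{source}:{target}:{text_digest}"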

# TTL
TTL_DAYS = 7

# Cache hit rate typically >80% for common queries
```
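The cache-aside pattern behind this strategy can be sketched as follows. This is an illustration under stated assumptions: `InMemoryTTLCache` is a hypothetical stand-in for Redis so the example runs without a server, and `make_cache_key` and `cached_translate` are illustrative names, not part of the service.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple

TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days, matching the documented TTL


def make_cache_key(source: str, target: str, text: str) -> str:
    # SHA-256 is stable across processes, unlike the salted built-in hash().
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"trans:{source}:{target}:{digest}"


class InMemoryTTLCache:
    """Stand-in for Redis: stores (value, expiry time) pairs in a dict."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries are evicted lazily
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: int = TTL_SECONDS) -> None:
        self._store[key] = (value, self._clock() + ttl_seconds)


def cached_translate(
    cache: InMemoryTTLCache,
    translate: Callable[[str, str, str], str],
    text: str,
    source: str,
    target: str,
) -> str:
    """Return a cached translation, calling the provider only on a miss."""
    key = make_cache_key(source, target, text)
    hit = cache.get(key)
    if hit is not None:
        return hit
    translated = translate(text, source, target)
    cache.set(key, translated)
    return translated
```

With a real Redis client the `get` and `set` calls would map onto the client's own commands with an expiry argument; the key construction stays the same.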

### Supported Languages

| Code | Language   | Status      |
| ---- | ---------- | ----------- |
| en   | English    | Native      |
| es   | Spanish    | Full        |
| fr   | French     | Full        |
| de   | German     | Full        |
| it   | Italian    | Full        |
| pt   | Portuguese | Full        |
| ar   | Arabic     | Full        |
| zh   | Chinese    | Full        |
| hi   | Hindi      | Full        |
| ur   | Urdu       | Full        |
| ja   | Japanese   | Placeholder |
| ko   | Korean     | Placeholder |
| ru   | Russian    | Placeholder |
| pl   | Polish     | Placeholder |
| tr   | Turkish    | Placeholder |

## Language Detection

### Code-Switching Detection

The language detection service identifies when users mix languages:

```python
from app.services.multilingual_rag_service import LanguageDetectionService

detector = LanguageDetectionService()

# Detect primary language
result = await detector.detect("Tell me about مرض السكري please")
# result.primary_language = "en"
# result.secondary_languages = ["ar"]
# result.is_code_switched = True
```

### Detection Algorithm

1. **Fast detection**: Use `langdetect` for an initial guess
2. **Confidence check**: Accept the guess only if its confidence is above 0.7
3. **Code-switching scan**: Check for embedded phrases in other languages
4. **Fallback**: If the confidence check fails, default to the user's preferred language
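The four steps above can be sketched as follows. This is a simplified illustration, not the service's code: the initial guess is passed in as a `guess` callable returning `(language, confidence)` (in practice this could wrap `langdetect`), and the code-switching scan is reduced to a coarse Unicode-script heuristic. `SCRIPT_TO_LANGUAGE`, `scan_scripts`, and `detect` are hypothetical names.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

CONFIDENCE_THRESHOLD = 0.7

# Assumed mapping from Unicode character-name prefixes to language codes.
# A script does not identify a language uniquely; this is only a heuristic.
SCRIPT_TO_LANGUAGE = {"ARABIC": "ar", "DEVANAGARI": "hi", "CJK": "zh"}


@dataclass
class DetectionResult:
    primary_language: str
    secondary_languages: List[str] = field(default_factory=list)
    is_code_switched: bool = False


def scan_scripts(text: str) -> List[str]:
    """Return language codes implied by non-Latin scripts found in the text."""
    found: List[str] = []
    for char in text:
        if not char.isalpha():
            continue
        name = unicodedata.name(char, "")
        for prefix, code in SCRIPT_TO_LANGUAGE.items():
            if name.startswith(prefix) and code not in found:
                found.append(code)
    return found


def detect(
    text: str,
    preferred_language: str,
    guess: Callable[[str], Tuple[str, float]],
) -> DetectionResult:
    language, confidence = guess(text)              # 1. fast detection
    confident = confidence > CONFIDENCE_THRESHOLD   # 2. confidence check
    scripts = scan_scripts(text)                    # 3. code-switching scan
    primary = language if confident else preferred_language  # 4. fallback
    secondary = [code for code in scripts if code != primary]
    return DetectionResult(primary, secondary, bool(secondary))


result = detect(
    "Tell me about مرض السكري please",
    preferred_language="en",
    guess=lambda text: ("en", 0.9),  # stubbed guess for the illustration
)
print(result)
```

A script scan cannot see code-switching between languages that share a script (for example English and Spanish), so a production detector would need a finer-grained check for that case.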

## RAG Integration

### Query Flow

```python
from app.services.multilingual_rag_service import MultilingualRAGService

service = MultilingualRAGService()

response = await service.query_multilingual(
    query="¿Qué medicamentos se usan para la diabetes?",
    user_language="es"
)

# Response structure
{
    "answer": "Los medicamentos más comunes para...",
    "language": "es",
    "sources": [...],
    "original_query": "¿Qué medicamentos se usan para...",
    "translated_query": "What medications are used for...",
    "translation_warning": None,  # or "Translation used fallback"
    "latency_ms": 523.4,
    "degradation_applied": []
}
```
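The translate-then-retrieve flow that produces this response can be sketched end to end. The sketch below is an assumption-laden outline rather than the real `MultilingualRAGService`: the three collaborators (`translate`, `retrieve`, `generate`) are injected as plain async callables, the stubs return canned values, and no failure handling is shown.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical collaborator signatures for this sketch.
Translate = Callable[[str, str, str], Awaitable[str]]
Retrieve = Callable[[str], Awaitable[List[str]]]
Generate = Callable[[str, List[str], str], Awaitable[str]]


async def query_multilingual(
    query: str,
    user_language: str,
    translate: Translate,
    retrieve: Retrieve,
    generate: Generate,
) -> Dict[str, Any]:
    started = time.perf_counter()
    # English queries bypass translation entirely.
    if user_language == "en":
        english_query = query
    else:
        english_query = await translate(query, user_language, "en")
    # Retrieval always runs on the English query (English-only embeddings).
    sources = await retrieve(english_query)
    # Assumption: the LLM sees the original query plus the retrieved context
    # and is told which language to answer in.
    answer = await generate(query, sources, user_language)
    return {
        "answer": answer,
        "language": user_language,
        "sources": sources,
        "original_query": query,
        "translated_query": english_query,
        "translation_warning": None,
        "latency_ms": (time.perf_counter() - started) * 1000,
        "degradation_applied": [],
    }


# Stubs standing in for the real translation, retrieval, and LLM services.
async def stub_translate(text: str, source: str, target: str) -> str:
    return "What medications are used for diabetes?"


async def stub_retrieve(english_query: str) -> List[str]:
    return ["doc-1"]


async def stub_generate(query: str, sources: List[str], language: str) -> str:
    return f"({language}) answer based on {len(sources)} source(s)"


response = asyncio.run(
    query_multilingual(
        "¿Qué medicamentos se usan para la diabetes?",
        "es",
        stub_translate,
        stub_retrieve,
        stub_generate,
    )
)
print(response["translated_query"])
```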

### LLM Prompting for Multilingual Response

The LLM is instructed to respond in the user's language:

```python
system_prompt = f"""You are a helpful medical assistant.
Respond to the user's question using the provided context.
IMPORTANT: Respond entirely in {language_name}.
Do not mix languages unless the user's query contains specific terms
that should remain in their original language (e.g., medication names).
Be accurate, helpful, and cite your sources when providing information."""
```
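The `language_name` placeholder has to be derived from the user's language code. One possible way to do that is sketched below; the `LANGUAGE_NAMES` mapping is taken from the Supported Languages table, while `build_system_prompt` and the choice to default to English for unknown codes are assumptions.

```python
# Names taken from the Supported Languages table ("Native" and "Full" rows).
LANGUAGE_NAMES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ar": "Arabic",
    "zh": "Chinese",
    "hi": "Hindi",
    "ur": "Urdu",
}


def build_system_prompt(language_code: str) -> str:
    """Build the system prompt for a language code (hypothetical helper)."""
    # Assumption: unknown codes fall back to English instead of raising.
    language_name = LANGUAGE_NAMES.get(language_code, "English")
    return (
        "You are a helpful medical assistant.\n"
        "Respond to the user's question using the provided context.\n"
        f"IMPORTANT: Respond entirely in {language_name}.\n"
        "Do not mix languages unless the user's query contains specific terms\n"
        "that should remain in their original language (e.g., medication names).\n"
        "Be accurate, helpful, and cite your sources when providing information."
    )


print(build_system_prompt("es"))
```

Passing the language name rather than the bare code tends to be less ambiguous for the model, which is presumably why the prompt uses `language_name`.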

## Graceful Degradation

When translation or retrieval fails, or translation exceeds its latency budget, the system degrades gracefully instead of returning an error:

### Degradation Levels

| Scenario                  | Action                   | DegradationType               |
| ------------------------- | ------------------------ | ----------------------------- |
| Primary translation fails | Use fallback provider    | `translation_used_fallback`   |
| All translation fails     | Use original query + LLM | `translation_failed`          |
| Translation too slow      | Skip translation         | `translation_budget_exceeded` |
| RAG retrieval fails       | Return empty results     | `rag_retrieval_failed`        |
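
As an illustration of the last three rows of the table, the sketch below wraps the translation and retrieval steps so that failures and timeouts are recorded as degradation types instead of propagating. The helper names and the `budget_seconds` parameter are assumptions; the 200 ms default reflects the translation budget listed under Latency Impact.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

TRANSLATION_BUDGET_SECONDS = 0.2  # 200 ms translation budget


async def translate_or_degrade(
    query: str,
    source: str,
    translate: Callable[[str, str, str], Awaitable[str]],
    budget_seconds: float = TRANSLATION_BUDGET_SECONDS,
) -> Tuple[str, List[str]]:
    """Return (query to use for retrieval, degradation types applied)."""
    try:
        english = await asyncio.wait_for(
            translate(query, source, "en"), timeout=budget_seconds
        )
        return english, []
    except asyncio.TimeoutError:
        # Translation too slow: skip it and continue with the original query.
        return query, ["translation_budget_exceeded"]
    except Exception:
        # All translation failed: use the original query and rely on the LLM.
        return query, ["translation_failed"]


async def retrieve_or_degrade(
    query: str,
    retrieve: Callable[[str], Awaitable[List[str]]],
) -> Tuple[List[str], List[str]]:
    """Return (sources, degradation types applied)."""
    try:
        return await retrieve(query), []
    except Exception:
        # Retrieval failed: continue with empty results.
        return [], ["rag_retrieval_failed"]
```

The `asyncio.TimeoutError` handler must come before the generic `Exception` handler, because the timeout error is itself an `Exception` subclass and would otherwise be reported as `translation_failed`.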

### Error Messages by Language

```python
FALLBACK_MESSAGES = {
    "en": "I apologize, but I'm unable to process your request. Please try again.",
    "es": "Lo siento, no puedo procesar su solicitud. Por favor, inténtelo de nuevo.",
    "fr": "Je m'excuse, je ne peux pas traiter votre demande. Veuillez réessayer.",
    "de": "Es tut mir leid, ich kann Ihre Anfrage nicht bearbeiten. Bitte versuchen Sie es erneut.",
    "ar": "عذراً، لا أستطيع معالجة طلبك. يرجى المحاولة مرة أخرى.",
    "zh": "抱歉，我目前无法处理您的请求。请重试。",
    # ... more languages
}
```
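A lookup over this table needs a rule for languages that have no entry. The sketch below assumes an English default and also normalises regional codes such as `es-MX`; `fallback_message` is a hypothetical helper, and only two of the messages are repeated here to keep the example short.

```python
FALLBACK_MESSAGES = {
    "en": "I apologize, but I'm unable to process your request. Please try again.",
    "es": "Lo siento, no puedo procesar su solicitud. Por favor, inténtelo de nuevo.",
}


def fallback_message(language_code: str) -> str:
    """Return the localized error message, defaulting to English."""
    # Assumption: regional variants ("es-MX", "es_ES") share the base message.
    base = language_code.replace("_", "-").split("-")[0].lower()
    return FALLBACK_MESSAGES.get(base, FALLBACK_MESSAGES["en"])


print(fallback_message("es-MX"))
```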

## Performance Considerations

### Latency Impact

| Stage              | Typical Latency | Budget    |
| ------------------ | --------------- | --------- |
| Language detection | 10-30ms         | 50ms      |
| Translation        | 100-180ms       | 200ms     |
| RAG retrieval      | 150-250ms       | 300ms     |
| **Total impact**   | **~300ms**      | **550ms** |

### Optimization Strategies

1. **Translation caching**: 7-day Redis cache
2. **Async detection**: Run language detection in parallel with audio processing
3. **Skip translation for English**: Detect English early and bypass translation
4. **Budget-aware skipping**: Skip translation when budget is tight
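
Strategies 3 and 4 can be combined into a single decision, sketched below under stated assumptions. `LatencyBudget` and `should_translate` are hypothetical names, the stage budgets come from the Latency Impact table, and the rule "translate only if both the translation and retrieval budgets still fit" is one possible policy, not necessarily the one the service uses.

```python
import time
from typing import Callable

# Per-stage budgets in milliseconds, from the Latency Impact table.
STAGE_BUDGETS_MS = {"language_detection": 50, "translation": 200, "rag_retrieval": 300}


class LatencyBudget:
    """Tracks how much of a total latency budget is left for a request."""

    def __init__(
        self, total_ms: float, clock: Callable[[], float] = time.perf_counter
    ) -> None:
        self._total_ms = total_ms
        self._clock = clock
        self._started = clock()

    def remaining_ms(self) -> float:
        elapsed_ms = (self._clock() - self._started) * 1000
        return self._total_ms - elapsed_ms

    def can_afford(self, cost_ms: float) -> bool:
        return self.remaining_ms() >= cost_ms


def should_translate(language: str, budget: LatencyBudget) -> bool:
    # Strategy 3: English queries never need translation.
    if language == "en":
        return False
    # Strategy 4: translate only if retrieval would still fit afterwards.
    needed_ms = STAGE_BUDGETS_MS["translation"] + STAGE_BUDGETS_MS["rag_retrieval"]
    return budget.can_afford(needed_ms)


budget = LatencyBudget(total_ms=550)
print(should_translate("es", budget))
```

Injecting the clock makes the policy easy to unit test without sleeping.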

## Configuration

### Environment Variables

```bash
# Primary translation provider
TRANSLATION_PROVIDER=google

# API keys (store in secrets manager)
GOOGLE_TRANSLATE_API_KEY=xxx
DEEPL_API_KEY=xxx

# Cache settings
TRANSLATION_CACHE_TTL_DAYS=7
TRANSLATION_CACHE_PREFIX=trans

# Feature flag
VOICE_V4_MULTILINGUAL_RAG=true
```
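A sketch of how these variables might be read at startup is shown below. `TranslationSettings` and `load_settings` are hypothetical names, and the defaults used when a variable is unset are assumptions that mirror the values shown above. API keys are deliberately left out because the note above says they belong in a secrets manager.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class TranslationSettings:
    provider: str
    cache_ttl_days: int
    cache_prefix: str
    multilingual_rag_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> TranslationSettings:
    """Read translation settings from an environment-like mapping."""
    return TranslationSettings(
        provider=env.get("TRANSLATION_PROVIDER", "google"),
        cache_ttl_days=int(env.get("TRANSLATION_CACHE_TTL_DAYS", "7")),
        cache_prefix=env.get("TRANSLATION_CACHE_PREFIX", "trans"),
        # Assumption: the flag is off unless explicitly set to "true".
        multilingual_rag_enabled=env.get("VOICE_V4_MULTILINGUAL_RAG", "false").lower()
        == "true",
    )


settings = load_settings({"TRANSLATION_PROVIDER": "deepl", "VOICE_V4_MULTILINGUAL_RAG": "true"})
print(settings)
```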

### Feature Flag

```python
from app.core.feature_flags import is_enabled

if is_enabled("voice_v4_multilingual_rag", user_id=user.id):
    service = MultilingualRAGService()
    response = await service.query_multilingual(query, user_language)
else:
    # Fall back to English-only RAG
    response = await rag_service.query(query)
```

## Testing

```bash
# Test translation fallback
pytest tests/services/test_voice_v4_services.py::TestTranslationFailureHandling -v

# Test multilingual RAG
pytest tests/services/test_voice_v4_services.py::TestMultilingualRAG -v
```

## Related Documentation

- [Voice Mode v4.1 Overview](./voice-mode-v4-overview.md)
- [API Reference](../api-reference/rest-api.md)
- [Latency Budgets Guide](./latency-budgets-guide.md)