# VoiceAssist Load Testing Guide

## Overview

This comprehensive guide covers load testing for VoiceAssist, including when to run tests, how to interpret results, choosing between tools (k6 vs Locust), understanding test scenarios, troubleshooting issues, CI/CD integration, and best practices.

## Table of Contents

- [When to Run Load Tests](#when-to-run-load-tests)
- [Load Testing Tools](#load-testing-tools)
- [Test Scenarios](#test-scenarios)
- [Running Load Tests](#running-load-tests)
- [Interpreting Results](#interpreting-results)
- [Troubleshooting](#troubleshooting)
- [CI/CD Integration](#cicd-integration)
- [Best Practices](#best-practices)

---

## When to Run Load Tests

### Regular Testing Schedule

#### Pre-Release Testing (Required)

Run comprehensive load tests before **every production release**:

- **Scope**: Full test suite (50, 100, 200 users)
- **Duration**: 30 minutes per scenario
- **Goal**: Validate no performance regressions
- **Action**: Block the release if critical thresholds are exceeded

#### Weekly Baseline (Recommended)

Run baseline tests **every Monday morning**:

- **Scope**: 100-user scenario (production simulation)
- **Duration**: 15 minutes
- **Goal**: Establish performance trends
- **Action**: Investigate significant deviations (>10%)

#### Monthly Stress Testing (Optional)

Run stress tests on the **last Friday of each month**:

- **Scope**: 200-500 user scenarios
- **Duration**: 60 minutes
- **Goal**: Validate capacity limits
- **Action**: Update capacity planning

### Trigger-Based Testing

#### After Major Changes

Run load tests after:

1. **Database Schema Changes**
   - New indexes
   - Table modifications
   - Migration scripts

2. **Cache Strategy Changes**
   - TTL adjustments
   - New cache layers
   - Eviction policy changes

3. **Infrastructure Changes**
   - Kubernetes configuration
   - Resource limits
   - HPA threshold adjustments

4. **Code Optimizations**
   - Query optimizations
   - Algorithm improvements
   - Caching implementations
5. **Dependency Updates**
   - Major library upgrades
   - Framework updates
   - Database version changes

#### Before Capacity Changes

Run load tests before:

- Adding/removing nodes
- Changing instance types
- Modifying autoscaling policies
- Database scaling operations

### Ad-Hoc Testing

Run load tests when:

- Investigating performance issues
- Validating optimization hypotheses
- Responding to user complaints
- Capacity planning exercises
- Training new team members

---

## Load Testing Tools

### Tool Comparison: k6 vs Locust

We use both tools for different purposes. Here's when to use each:

#### k6 - Recommended for:

**Strengths**:

- Fast execution (written in Go)
- Low resource overhead
- JavaScript-based scripts (familiar syntax)
- Excellent CLI integration
- Cloud service available
- Great for CI/CD pipelines
- Built-in metrics and thresholds
- Protocol-level testing (HTTP/2, gRPC)

**Best For**:

- Quick smoke tests
- CI/CD integration
- Simple API endpoint testing
- Protocol-specific testing
- Resource-constrained environments
- Automated regression testing

**Example Use Cases**:

```bash
# Quick smoke test
k6 run --vus 10 --duration 30s smoke-test.js

# CI/CD integration
k6 run --out json=results.json ci-test.js

# Protocol testing
k6 run --http2 http2-test.js
```

#### Locust - Recommended for:

**Strengths**:

- Python-based (easy to customize)
- Web UI for real-time monitoring
- Distributed load generation
- Complex user behavior modeling
- Task weights and think times
- WebSocket support
- Extensible architecture
- Better for long-running tests

**Best For**:

- Complex user scenarios
- Long-duration tests (hours)
- Real-time monitoring needs
- Distributed testing
- Custom behavior modeling
- WebSocket testing
- Exploratory testing

**Example Use Cases**:

```bash
# Web UI with real-time monitoring
locust -f locustfile.py --web-port 8089

# Distributed testing
locust -f locustfile.py --master
locust -f locustfile.py --worker --master-host=master-ip

# Headless with specific targets
locust -f locustfile.py --headless -u 100 -r 10 -t 30m
```

### Decision Matrix

| Criteria                 | Use k6       | Use Locust |
| ------------------------ | ------------ | ---------- |
| **Test Duration**        | <30 min      | >30 min    |
| **Script Complexity**    | Simple       | Complex    |
| **CI/CD Integration**    | Yes          | Optional   |
| **Real-time Monitoring** | Not critical | Required   |
| **Distributed Testing**  | Not needed   | Required   |
| **WebSocket Testing**    | No           | Yes        |
| **Team Familiarity**     | JavaScript   | Python     |
| **Resource Constraints** | Limited      | Abundant   |

### Hybrid Approach (Recommended)

Use **both** tools in your testing strategy:

1. **k6 for CI/CD**:
   - Quick regression tests
   - Automated on every PR
   - 5-10 minute tests
   - Pass/fail criteria

2. **Locust for Deep Testing**:
   - Pre-release validation
   - Capacity planning
   - Performance investigations
   - 30-60 minute tests

---

## Test Scenarios

### Overview

We have 4 standard test scenarios with increasing load (all share the same user mix, sketched below):

1. **Smoke Test** (10 users, 5 min)
2. **Baseline Test** (50 users, 15 min)
3. **Load Test** (100 users, 30 min)
4. **Stress Test** (200-500 users, 60 min)
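Every scenario below uses the same 70/20/10 user distribution. A minimal sketch of how that mix could be expressed in a Locust file (the class names, endpoints, and payloads are illustrative assumptions, not the project's actual `locustfile.py`):

```python
# Illustrative only: endpoints and class names are assumptions,
# not the real VoiceAssist locustfile.
from locust import HttpUser, task, between


class RegularUser(HttpUser):
    weight = 7                  # 70% of spawned users
    wait_time = between(3, 10)  # think time in seconds

    @task
    def simple_query(self):
        # hypothetical endpoint; substitute a real API route
        self.client.post("/api/query", json={"q": "simple question"})


class PowerUser(HttpUser):
    weight = 2                  # 20% of spawned users
    wait_time = between(3, 10)

    @task
    def complex_query(self):
        self.client.post("/api/query", json={"q": "complex, multi-part question"})


class AdminUser(HttpUser):
    weight = 1                  # 10% of spawned users
    wait_time = between(3, 10)

    @task
    def document_operation(self):
        self.client.get("/api/documents")
```

Locust spawns users in proportion to `weight`, so a 100-user run yields roughly 70/20/10 across the three classes.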
### Smoke Test

**Purpose**: Verify system functionality under minimal load

**Configuration**:

```yaml
virtual_users: 10
duration: 5 minutes
ramp_up: 1 minute
think_time: 5-10 seconds
```

**User Distribution**:

- 70% Regular Users (simple queries)
- 20% Power Users (complex queries)
- 10% Admin Users (document operations)

**When to Use**:

- After deployments
- Quick sanity checks
- Before longer tests
- CI/CD pipelines

**Success Criteria**:

- Error rate <0.5%
- P95 response time <500ms
- No crashes or errors

### Baseline Test (50 Users)

**Purpose**: Establish performance baseline under light load

**Configuration**:

```yaml
virtual_users: 50
duration: 15 minutes
ramp_up: 5 minutes
steady_state: 10 minutes
think_time: 3-10 seconds
```

**User Distribution**:

- 70% Regular Users
- 20% Power Users
- 10% Admin Users

**When to Use**:

- Weekly baseline tests
- After optimizations
- Regression detection
- Performance trending

**Success Criteria**:

- P95 response time <500ms
- Error rate <1%
- CPU utilization <60%
- Cache hit rate >80%

### Load Test (100 Users)

**Purpose**: Simulate production load

**Configuration** (a Locust shape implementing this profile is sketched at the end of this section):

```yaml
virtual_users: 100
duration: 30 minutes
ramp_up: 10 minutes
steady_state: 20 minutes
think_time: 3-10 seconds
```

**User Distribution**:

- 70% Regular Users
- 20% Power Users
- 10% Admin Users

**When to Use**:

- Pre-release testing
- Capacity validation
- SLO verification
- Monthly reviews

**Success Criteria**:

- P95 response time <800ms
- Error rate <1%
- CPU utilization <70%
- Throughput >80 req/s

### Stress Test (200-500 Users)

**Purpose**: Test system limits and breaking points

**Configuration**:

```yaml
virtual_users: 200-500 (incremental)
duration: 60 minutes
ramp_up: 20 minutes
steady_state: 40 minutes
think_time: 3-10 seconds
```

**User Distribution**:

- 70% Regular Users
- 20% Power Users
- 10% Admin Users

**When to Use**:

- Capacity planning
- Breaking point analysis
- Quarterly reviews
- Before major events

**Success Criteria**:

- System remains stable
- Error rate <5%
- Graceful degradation
- No cascading failures
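The ramp-up and steady-state timings in the configurations above can also be encoded programmatically. A sketch using Locust's `LoadTestShape`, with the 100-user load-test profile as the example (timings mirror the YAML above; the class name is arbitrary):

```python
# Sketch of the 100-user profile (10 min ramp-up, 20 min steady state)
# as a Locust shape class; place it in the locustfile alongside the users.
from locust import LoadTestShape


class LoadTest100(LoadTestShape):
    RAMP_UP = 600      # seconds (10-minute ramp-up)
    STEADY_END = 1800  # seconds (30 minutes total)

    def tick(self):
        run_time = self.get_run_time()
        if run_time < self.RAMP_UP:
            # ramp linearly from 0 to 100 users over the first 10 minutes
            return (int(100 * run_time / self.RAMP_UP), 10)
        if run_time < self.STEADY_END:
            return (100, 10)  # hold 100 users for 20 minutes
        return None  # returning None ends the test
```

When a shape class is present, Locust drives user count from `tick()` and ignores `-u`/`-r`/`-t`, so the run reduces to `locust -f locustfile.py --headless`.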
---

## Running Load Tests

### Prerequisites

1. **Environment Setup**:

   ```bash
   # Install Locust
   pip install locust locust-plugins

   # Install k6
   brew install k6  # macOS
   # or download from https://k6.io/
   ```

2. **Configuration**:

   ```bash
   # Copy and configure environment
   cp load-tests/locust/.env.example load-tests/locust/.env

   # Update BASE_URL, credentials, etc.
   vim load-tests/locust/.env
   ```

3. **Test Environment**:
   - Use a staging/test environment
   - Never run against production without approval
   - Ensure monitoring is enabled
   - Clear caches before testing

### Running Locust Tests

#### Web UI Mode (Interactive)

```bash
# Navigate to test directory
cd load-tests/locust

# Start Locust with web UI
locust -f locustfile.py --web-port 8089

# Open browser to http://localhost:8089
# Configure:
# - Number of users
# - Spawn rate
# - Host (if not in config)
# Click "Start Swarming"
```

**Advantages**:

- Real-time monitoring
- Dynamic control (pause, stop, adjust)
- Visual charts
- Best for exploratory testing

#### Headless Mode (Automated)

```bash
# Baseline test (50 users)
locust -f locustfile.py \
  --headless \
  -u 50 \
  -r 5 \
  -t 15m \
  --html report-50users.html \
  --csv report-50users

# Load test (100 users)
locust -f locustfile.py \
  --headless \
  -u 100 \
  -r 10 \
  -t 30m \
  --html report-100users.html \
  --csv report-100users

# Stress test (200 users)
locust -f locustfile.py \
  --headless \
  -u 200 \
  -r 10 \
  -t 60m \
  --html report-200users.html \
  --csv report-200users
```

**Parameters**:

- `-u`: Number of users (peak)
- `-r`: Spawn rate (users/second)
- `-t`: Test duration
- `--html`: Generate HTML report
- `--csv`: Generate CSV results

#### Distributed Mode (High Load)

For tests with >500 users, or when the load generator is resource-constrained:

```bash
# Terminal 1: Start master
locust -f locustfile.py \
  --master \
  --expect-workers 4 \
  --web-port 8089

# Terminals 2-5: Start workers
locust -f locustfile.py \
  --worker \
  --master-host localhost

# Use web UI or headless mode as above
```

### Running k6 Tests

#### Smoke Test

```bash
# Navigate to test directory
cd load-tests/k6

# Run smoke test
k6 run --vus 10 --duration 5m smoke-test.js
```

#### Load Test

```bash
# Run with staged load
k6 run \
  --stage 5m:50 \
  --stage 10m:50 \
  --stage 5m:0 \
  load-test.js
```

#### Stress Test

```bash
# Run stress test with thresholds
k6 run \
  --stage 10m:100 \
  --stage 20m:100 \
  --stage 10m:200 \
  --stage 20m:200 \
  --stage 10m:0 \
  stress-test.js
```

#### With Cloud Output

```bash
# Send results to k6 Cloud
k6 run --out cloud load-test.js

# Or to other backends
k6 run --out influxdb=http://localhost:8086/k6 load-test.js
```

### Monitoring During Tests

#### Grafana Dashboards

Open these dashboards before starting tests:

1. **Load Testing Overview**:

   ```
   http://grafana:3000/d/voiceassist-load-testing
   ```

   - Test status and VUs
   - Request rate and errors
   - Response time percentiles

2. **System Performance**:

   ```
   http://grafana:3000/d/voiceassist-system-performance
   ```

   - Request throughput
   - Resource utilization
   - Database and cache performance

3. **Autoscaling Monitoring**:

   ```
   http://grafana:3000/d/voiceassist-autoscaling
   ```

   - HPA status
   - Pod count
   - Scaling events

#### Real-Time Metrics

```bash
# Watch pod metrics
watch kubectl top pods -n voiceassist

# Watch HPA status
watch kubectl get hpa -n voiceassist

# Watch pod count
watch kubectl get pods -n voiceassist

# Stream pod logs
kubectl logs -f -l app=voiceassist-api -n voiceassist
```
---

## Interpreting Results

### Key Metrics to Analyze

#### 1. Response Time

**What to Look For**:

- P50 (median): Representative user experience
- P95: What 95% of users experience
- P99: Edge cases and outliers
- Trend over time: Stability vs degradation

**Good**:

```
P50: 180ms (stable throughout test)
P95: 520ms (no spikes)
P99: 950ms (within target)
```

**Bad**:

```
P50: 320ms (increasing over time)
P95: 1850ms (frequent spikes)
P99: 5200ms (extreme outliers)
```

**Analysis**:

- Increasing trend → Resource exhaustion or memory leak
- Periodic spikes → Garbage collection or batch jobs
- High variance → Inconsistent performance (investigate)

#### 2. Throughput

**What to Look For**:

- Requests per second (sustained)
- Consistency throughout test
- Correlation with user count (a quick sanity check is sketched after metric 4)

**Good**:

```
Target: 100 users
Throughput: 90 req/s (consistent)
```

**Bad**:

```
Target: 100 users
Throughput: 45 req/s (declining)
Or: 150 req/s (users waiting, not thinking)
```

**Analysis**:

- Lower than expected → Bottleneck (DB, CPU, network)
- Higher than expected → Unrealistic think times
- Declining → System degradation under load

#### 3. Error Rate

**What to Look For**:

- Percentage of failed requests
- Error types (4xx vs 5xx)
- When errors occur (start, middle, end)

**Good**:

```
Total Requests: 27,000
Failed Requests: 81 (0.3%)
Error Type: Mostly 4xx (validation)
```

**Bad**:

```
Total Requests: 27,000
Failed Requests: 1,350 (5%)
Error Type: 5xx (server errors)
Trend: Increasing over time
```

**Analysis**:

- <1%: Acceptable (expected transient errors)
- 1-3%: Warning (investigate if sustained)
- >3%: Critical (system under stress)
- Increasing: System failing under load

#### 4. Resource Utilization

**What to Look For**:

- CPU and memory utilization
- Pod count (autoscaling)
- Database connections
- Cache hit rates

**Good**:

```
CPU: 60-70% (stable, room for spikes)
Memory: 55-65% (stable)
Pods: 4-5 (scaled appropriately)
DB Connections: 30/50 (60%, comfortable)
Cache Hit Rate: 83% (effective)
```

**Bad**:

```
CPU: 85-95% (saturated, no headroom)
Memory: 88-92% (risk of OOM)
Pods: 10 (max, still struggling)
DB Connections: 49/50 (98%, bottleneck)
Cache Hit Rate: 45% (ineffective)
```

**Analysis**:

- High CPU → Computation bottleneck
- High memory → Memory leak or inefficient data structures
- Max pods → Need vertical or horizontal scaling
- DB connections saturated → Need connection pooling or replicas
- Low cache hit rate → Poor cache strategy
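The expected-throughput figures above follow from simple arithmetic: each virtual user completes one iteration roughly every think-time-plus-response-time seconds. A back-of-envelope check (the requests-per-iteration figure is an assumption, chosen here to match the 90 req/s example):

```python
def expected_rps(users: int, avg_think_s: float, avg_response_s: float,
                 requests_per_iteration: float = 1) -> float:
    """Sustained throughput is roughly users / (think time + response time)."""
    return users / (avg_think_s + avg_response_s) * requests_per_iteration


# 100 users, ~6.5s mean think time (3-10s range), ~0.2s responses:
print(f"{expected_rps(100, 6.5, 0.2):.1f} iterations/s")  # ~14.9
# if each iteration issues ~6 requests, that is ~90 req/s:
print(f"{expected_rps(100, 6.5, 0.2, 6):.1f} req/s")      # ~89.6
```

If measured throughput falls well below this estimate while the error rate stays flat, suspect a bottleneck rather than a think-time misconfiguration.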
### Locust Report Analysis

#### HTML Report Sections

1. **Statistics Table**:
   - Shows per-endpoint performance
   - Look for outliers (slow endpoints)
   - Check failure rates per endpoint

2. **Response Time Chart**:
   - Visualize P50/P95/P99 over time
   - Look for trends and spikes
   - Correlate with events (scaling, errors)

3. **Users Chart**:
   - Verify ramp-up pattern
   - Ensure smooth increase
   - Check if target reached

4. **Requests per Second**:
   - Verify throughput expectations
   - Look for correlation with users
   - Check for plateaus (bottlenecks)

#### CSV Data Analysis

```python
# Example: Analyze Locust CSV output
import pandas as pd

# Load results
stats = pd.read_csv('report-100users_stats.csv')
history = pd.read_csv('report-100users_stats_history.csv')

# Calculate key metrics
print(f"Median Response Time: {stats['Median Response Time'].median():.0f}ms")
print(f"95th Percentile: {stats['95%'].median():.0f}ms")
print(f"99th Percentile: {stats['99%'].median():.0f}ms")
print(f"Total Requests: {stats['Request Count'].sum()}")
print(f"Total Failures: {stats['Failure Count'].sum()}")
print(f"Error Rate: {(stats['Failure Count'].sum() / stats['Request Count'].sum() * 100):.2f}%")

# Identify slowest endpoints
slowest = stats.nlargest(5, '95%')[['Name', '95%', 'Request Count']]
print("\nSlowest Endpoints (P95):")
print(slowest)
```

### k6 Results Analysis

#### Terminal Output

```
  execution: local
     script: load-test.js
     output: -

  scenarios: (100.00%) 1 scenario, 100 max VUs, 30m30s max duration (incl. graceful stop):
           * default: 100 looping VUs for 30m0s (gracefulStop: 30s)

  ✓ status was 200
  ✓ response time < 500ms

  checks.........................: 99.67% ✓ 26890 ✗ 89
  data_received..................: 45 MB  25 kB/s
  data_sent......................: 3.2 MB 1.8 kB/s
  http_req_blocked...............: avg=1.23ms min=1µs   med=5µs   max=234ms p(90)=8µs   p(95)=11µs
  http_req_connecting............: avg=487µs  min=0s    med=0s    max=89ms  p(90)=0s    p(95)=0s
  http_req_duration..............: avg=182ms  min=23ms  med=156ms max=2.1s  p(90)=289ms p(95)=398ms
    { expected_response:true }...: avg=181ms  min=23ms  med=156ms max=1.8s  p(90)=288ms p(95)=396ms
  http_req_failed................: 0.33%  ✓ 89    ✗ 26890
  http_req_receiving.............: avg=156µs  min=18µs  med=98µs  max=12ms  p(90)=245µs p(95)=389µs
  http_req_sending...............: avg=38µs   min=6µs   med=25µs  max=8ms   p(90)=58µs  p(95)=89µs
  http_req_tls_handshaking.......: avg=645µs  min=0s    med=0s    max=156ms p(90)=0s    p(95)=0s
  http_req_waiting...............: avg=182ms  min=23ms  med=156ms max=2.1s  p(90)=289ms p(95)=398ms
  http_reqs......................: 26979  14.98/s
  iteration_duration.............: avg=6.65s  min=5.02s med=6.45s max=15.2s p(90)=7.89s p(95)=8.76s
  iterations.....................: 26979  14.98/s
  vus............................: 100    min=100 max=100
  vus_max........................: 100    min=100 max=100

running (30m00.0s), 000/100 VUs, 26979 complete and 0 interrupted iterations
default ✓ [======================================] 100 VUs  30m0s
```

**Key Points**:

- ✓ checks: 99.67% → Good (most checks passed)
- http_req_duration: P95 = 398ms → Good (within target)
- http_req_failed: 0.33% → Good (low error rate)
- http_reqs: 14.98/s → Throughput (with 100 VUs and a 6.65s mean iteration, ~15 iterations/s is expected)
---

## Troubleshooting

### Common Issues and Solutions

#### Issue 1: High Error Rate (>3%)

**Symptoms**:

- Many 5xx errors
- Error rate increasing over time
- Specific endpoints failing

**Diagnosis**:

```bash
# Check pod logs
kubectl logs -l app=voiceassist-api -n voiceassist --tail=100

# Check error distribution
# In Locust UI: Look at failures tab
# In Grafana: Check error rate by endpoint

# Check resource utilization
kubectl top pods -n voiceassist
```

**Common Causes**:

1. **Database connection pool exhausted**
   - Solution: Increase pool size or add replicas
2. **CPU/Memory saturation**
   - Solution: Scale horizontally or vertically
3. **Timeouts**
   - Solution: Increase timeout values or optimize slow queries
4. **Rate limiting**
   - Solution: Adjust rate limits or distribute load

#### Issue 2: Poor Response Times

**Symptoms**:

- P95/P99 exceeding targets
- Response time increasing over test duration
- Inconsistent performance

**Diagnosis**:

```bash
# Check slow queries
kubectl exec -it postgres-pod -- psql -U user -d voiceassist
SELECT * FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;

# Check cache hit rate
# In Grafana: Cache Performance dashboard

# Check autoscaling
kubectl get hpa -n voiceassist
```

**Common Causes**:

1. **Database queries not optimized**
   - Solution: Add indexes, optimize queries
2. **Cache misses**
   - Solution: Warm cache, adjust TTLs
3. **Insufficient resources**
   - Solution: Scale up
4. **Network latency**
   - Solution: Check network configuration

#### Issue 3: Autoscaling Not Working

**Symptoms**:

- Pods not scaling up despite high load
- Scaling too slowly or too aggressively
- Pods scaling down during active test

**Diagnosis**:

```bash
# Check HPA status
kubectl describe hpa voiceassist-api -n voiceassist

# Check metrics server
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods

# Check HPA events
kubectl get events -n voiceassist --sort-by='.lastTimestamp'
```

**Common Causes**:

1. **Metrics server not running**
   - Solution: Install/restart metrics server
2. **Incorrect HPA configuration**
   - Solution: Review and adjust thresholds
3. **Resource requests not set**
   - Solution: Set CPU/memory requests in pod spec
4. **Scaling too conservative**
   - Solution: Adjust scale-up/scale-down policies

#### Issue 4: Memory Leaks

**Symptoms**:

- Memory usage increasing over time
- Pods being OOMKilled
- Performance degrading over test duration

**Diagnosis** (a polling sketch appears at the end of this section):

```bash
# Monitor memory over time
kubectl top pods -n voiceassist --watch

# Check for OOMKilled pods
kubectl get pods -n voiceassist | grep OOMKilled

# Get detailed pod memory usage
kubectl describe pod -n voiceassist
```

**Common Causes**:

1. **Unclosed connections**
   - Solution: Ensure proper connection cleanup
2. **Caching too aggressively**
   - Solution: Implement cache size limits, eviction
3. **Large response objects**
   - Solution: Implement pagination, streaming
4. **Circular references**
   - Solution: Review object lifecycle, use weak references

#### Issue 5: Load Test Itself Failing

**Symptoms**:

- Locust/k6 crashing
- Cannot reach target user count
- Inconsistent results

**Diagnosis**:

```bash
# Check load test machine resources
top
htop

# Check network connectivity (replace <target-host> with your BASE_URL host)
ping <target-host>
curl -v https://<target-host>/health

# Verify test configuration
cat load-tests/locust/config.py
```

**Common Causes**:

1. **Load test machine under-resourced**
   - Solution: Use a more powerful machine or distributed mode
2. **Network issues**
   - Solution: Check firewall, DNS, routing
3. **Test script errors**
   - Solution: Review and debug test code
4. **Unrealistic think times**
   - Solution: Adjust to realistic values
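For Issue 4, a small polling script can make steady memory growth easier to spot than watching `kubectl top` by eye. A sketch (the namespace, sample count, and interval are assumptions):

```python
# Poll `kubectl top pods` and flag pods whose memory grows every sample.
import subprocess
import time
from collections import defaultdict

samples = defaultdict(list)

for _ in range(10):  # ten samples, 30s apart (~5 minutes)
    out = subprocess.run(
        ["kubectl", "top", "pods", "-n", "voiceassist", "--no-headers"],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        # columns: NAME  CPU(cores)  MEMORY(bytes), e.g. "api-abc 250m 412Mi"
        name, _cpu, mem = line.split()
        samples[name].append(int(mem.removesuffix("Mi")))
    time.sleep(30)

for pod, usage in samples.items():
    if len(usage) >= 5 and all(b > a for a, b in zip(usage, usage[1:])):
        print(f"WARNING: {pod} memory grew on every sample: {usage} Mi")
```

Monotonic growth across every sample during steady-state load is a strong leak signal; sawtooth patterns usually just reflect garbage collection.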
---

## CI/CD Integration

### GitHub Actions Example

```yaml
# .github/workflows/load-test.yml
name: Load Test

on:
  schedule:
    # Run every Monday at 8 AM UTC
    - cron: "0 8 * * 1"
  workflow_dispatch:
    inputs:
      users:
        description: "Number of users"
        required: true
        default: "50"
      duration:
        description: "Test duration (e.g., 15m)"
        required: true
        default: "15m"

jobs:
  load-test-k6:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install k6
        run: |
          sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
          echo "deb https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
          sudo apt-get update
          sudo apt-get install k6

      - name: Run k6 load test
        run: |
          k6 run \
            --vus ${{ github.event.inputs.users || '50' }} \
            --duration ${{ github.event.inputs.duration || '15m' }} \
            --out json=results.json \
            load-tests/k6/baseline-test.js
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          API_KEY: ${{ secrets.STAGING_API_KEY }}

      - name: Upload results
        uses: actions/upload-artifact@v3
        with:
          name: k6-results
          path: results.json

      - name: Check thresholds
        run: |
          # Parse results and check if thresholds passed
          # Fail job if critical metrics exceeded
          python scripts/check-thresholds.py results.json

  load-test-locust:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: |
          pip install locust locust-plugins

      - name: Run Locust load test
        run: |
          cd load-tests/locust
          locust -f locustfile.py \
            --headless \
            -u ${{ github.event.inputs.users || '50' }} \
            -r 5 \
            -t ${{ github.event.inputs.duration || '15m' }} \
            --html report.html \
            --csv report
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}

      - name: Upload reports
        uses: actions/upload-artifact@v3
        with:
          name: locust-results
          path: |
            load-tests/locust/report.html
            load-tests/locust/report_*.csv

      - name: Post results to Slack
        if: always()
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {
              "text": "Load Test Results",
              "blocks": [
                {
                  "type": "section",
                  "text": {
                    "type": "mrkdwn",
                    "text": "*Load Test Completed*\nUsers: ${{ github.event.inputs.users || '50' }}\nDuration: ${{ github.event.inputs.duration || '15m' }}"
                  }
                }
              ]
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
```

### GitLab CI Example

```yaml
# .gitlab-ci.yml
stages:
  - test
  - report

variables:
  BASE_URL: $STAGING_URL

load-test-baseline:
  stage: test
  image: grafana/k6:latest
  script:
    - k6 run --vus 50 --duration 15m --out json=results.json load-tests/k6/baseline-test.js
  artifacts:
    paths:
      - results.json
    expire_in: 30 days
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
    - if: $CI_PIPELINE_SOURCE == "web"

load-test-full:
  stage: test
  image: locustio/locust:latest
  script:
    - cd load-tests/locust
    - locust -f locustfile.py --headless -u 100 -r 10 -t 30m --html report.html --csv report
  artifacts:
    paths:
      - load-tests/locust/report.*
    expire_in: 30 days
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
    - if: $CI_PIPELINE_SOURCE == "schedule"

analyze-results:
  stage: report
  image: python:3.11
  script:
    - pip install pandas matplotlib
    - python scripts/analyze-results.py
    - python scripts/generate-report.py
  artifacts:
    paths:
      - performance-report.pdf
    expire_in: 90 days
  dependencies:
    - load-test-baseline
    - load-test-full
```
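Both pipelines reference `scripts/check-thresholds.py` without showing it. One possible shape for it, assuming k6's `--out json` line-delimited output format and the load-test targets from this guide (P95 <800ms, error rate <1%):

```python
#!/usr/bin/env python3
# Sketch of a threshold checker for k6's `--out json` NDJSON output.
# Not the actual scripts/check-thresholds.py; thresholds are this
# guide's load-test targets.
import json
import statistics
import sys

durations, failures = [], []

with open(sys.argv[1]) as fh:
    for line in fh:
        point = json.loads(line)
        if point.get("type") != "Point":
            continue  # skip metric definitions, keep only samples
        if point["metric"] == "http_req_duration":
            durations.append(point["data"]["value"])
        elif point["metric"] == "http_req_failed":
            failures.append(point["data"]["value"])  # 1 = failed, 0 = ok

p95 = statistics.quantiles(durations, n=100)[94]  # 95th percentile, ms
error_rate = 100 * sum(failures) / len(failures)

print(f"P95: {p95:.0f}ms, error rate: {error_rate:.2f}%")
if p95 > 800 or error_rate > 1.0:
    sys.exit(1)  # non-zero exit fails the CI job
```

Exiting non-zero is what lets the "Check thresholds" step fail the pipeline when a critical metric is exceeded.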
---

## Best Practices

### Test Design

1. **Realistic User Behavior**
   - Use appropriate think times (3-10s)
   - Model different user types (regular, power, admin)
   - Include realistic data (varied query complexity)
   - Simulate real workflows (multi-step operations)

2. **Gradual Ramp-Up**
   - Don't start at peak load
   - Ramp up gradually (10% of duration)
   - Allow the system to stabilize
   - Observe autoscaling behavior

3. **Sufficient Duration**
   - Minimum 15 minutes for baseline
   - 30+ minutes for load tests
   - 60+ minutes for stress tests
   - Include a cooldown period

4. **Appropriate Load Levels**
   - Start with 50% of expected production load
   - Gradually increase to 100%, 150%, 200%
   - Don't jump directly to stress levels
   - Document your reasoning

### Environment Management

1. **Dedicated Test Environment**
   - Don't test in production (unless explicitly approved)
   - Use production-like configuration
   - Same resource limits and constraints
   - Isolated from development

2. **Clean State**
   - Clear caches before tests
   - Reset the database to a known state
   - Restart services if needed
   - Document starting conditions

3. **Monitoring Setup**
   - Ensure all monitoring is active
   - Configure alerts appropriately
   - Set up dashboards beforehand
   - Enable detailed logging

### Data Management

1. **Test Data**
   - Use realistic test data
   - Sufficient variety (avoid cache saturation)
   - Anonymized production data (if possible)
   - Documented and reproducible

2. **Results Storage**
   - Save all test results
   - Include environment configuration
   - Store raw and analyzed data
   - Use consistent naming (date-load-duration)

3. **Result Analysis**
   - Compare against baselines (a comparison sketch follows this section)
   - Look for trends over time
   - Investigate anomalies
   - Document findings

### Communication

1. **Before Testing**
   - Notify the team of the testing window
   - Coordinate with the ops team
   - Reserve resources if needed
   - Set expectations

2. **During Testing**
   - Monitor actively
   - Be ready to abort if issues arise
   - Document observations
   - Take screenshots of dashboards

3. **After Testing**
   - Share results with the team
   - Summarize key findings
   - Create action items
   - Update documentation

### Continuous Improvement

1. **Regular Reviews**
   - Weekly baseline comparisons
   - Monthly trend analysis
   - Quarterly comprehensive review
   - Update targets as the system evolves

2. **Automation**
   - Automate routine tests
   - Automatic threshold checking
   - Automated reporting
   - Alert on regressions

3. **Documentation**
   - Document test procedures
   - Record configuration changes
   - Maintain the runbook
   - Share lessons learned
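To make "compare against baselines" concrete, a sketch that checks a run's aggregated P95 against a stored baseline and flags the >10% deviations called out in the weekly-baseline policy (file locations and names are assumptions):

```python
# Compare the current Locust stats CSV against a stored baseline run.
import pandas as pd

THRESHOLD = 0.10  # investigate deviations greater than 10%


def aggregated_p95(path: str) -> float:
    stats = pd.read_csv(path)
    # recent Locust versions label the summary row "Aggregated"
    return float(stats.loc[stats["Name"] == "Aggregated", "95%"].iloc[0])


baseline = aggregated_p95("baselines/report-50users_stats.csv")
current = aggregated_p95("report-50users_stats.csv")

change = (current - baseline) / baseline
print(f"Baseline P95: {baseline:.0f}ms, current: {current:.0f}ms ({change:+.1%})")
if change > THRESHOLD:
    print("Regression exceeds 10%: investigate before accepting the run")
```

The same pattern extends to error rate and throughput; keeping baseline CSVs under version control makes week-over-week trends reproducible.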
---

## Quick Reference

### Common Commands

```bash
# Locust

## Web UI
locust -f locustfile.py --web-port 8089

## Headless (50 users, 15 min)
locust -f locustfile.py --headless -u 50 -r 5 -t 15m --html report.html

## Distributed
locust -f locustfile.py --master
locust -f locustfile.py --worker --master-host=localhost

# k6

## Simple run
k6 run script.js

## With load profile
k6 run --stage 5m:50 --stage 10m:50 --stage 5m:0 script.js

## With output
k6 run --out json=results.json script.js

# Monitoring

## Watch pods
watch kubectl top pods -n voiceassist

## Watch HPA
watch kubectl get hpa -n voiceassist

## Stream logs
kubectl logs -f -l app=voiceassist-api -n voiceassist
```

### Useful Links

- **Dashboards**:
  - Load Testing: http://grafana:3000/d/voiceassist-load-testing
  - System Performance: http://grafana:3000/d/voiceassist-system-performance
  - Autoscaling: http://grafana:3000/d/voiceassist-autoscaling
- **Documentation**:
  - Performance Benchmarks: `/docs/PERFORMANCE_BENCHMARKS.md`
  - Performance Tuning: `/docs/PERFORMANCE_TUNING_GUIDE.md`
- **External Resources**:
  - Locust Docs: https://docs.locust.io/
  - k6 Docs: https://k6.io/docs/

---

## Support

For questions or issues:

1. Check the troubleshooting section
2. Review Grafana dashboards for insights
3. Check the #performance Slack channel
4. Contact the DevOps team
5. Create a GitHub issue

**Remember**: Load testing is an iterative process. Start small, learn, and gradually increase complexity and load levels.