Will Percey — Portfolio

Model Deployment

Updated Dec 2025

Deployment Strategies

Blue-Green Deployment

Maintain two identical production environments (blue and green), switching traffic instantly between them for zero-downtime deployments.

Key Features

Two identical production environments
Instant traffic switching
Complete environment isolation
Fast rollback capability
Zero-downtime deployments
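
The switching behaviour described above can be sketched in a few lines. This is a minimal in-process illustration, not a production load balancer: `BlueGreenRouter`, the `Model` alias, and the two lambda "environments" are hypothetical names introduced for the example.

```python
import threading
from typing import Callable, Dict

# Assumption: a "model environment" is any callable that maps a request to a response.
Model = Callable[[dict], dict]


class BlueGreenRouter:
    """Sends 100% of traffic to one of two environments and flips between them."""

    def __init__(self, blue: Model, green: Model, active: str = "blue") -> None:
        self._envs: Dict[str, Model] = {"blue": blue, "green": green}
        self._active = active
        self._lock = threading.Lock()

    @property
    def active(self) -> str:
        return self._active

    def predict(self, request: dict) -> dict:
        # Look up the active environment once per request.
        return self._envs[self._active](request)

    def switch(self) -> str:
        """Flip all traffic to the idle environment; calling it again rolls back."""
        with self._lock:
            self._active = "green" if self._active == "blue" else "blue"
            return self._active


router = BlueGreenRouter(
    blue=lambda req: {"version": "v1"},   # current production model
    green=lambda req: {"version": "v2"},  # new model, fully deployed but idle
)
```

Because both environments stay deployed, rollback is the same operation as cutover: a second call to `switch()`.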

Use Cases

Critical production updates
Major model version changes
Infrastructure migrations
Scenarios requiring instant rollback

Related Patterns

Canary, Rolling, Shadow

Canary Deployment

Gradually roll out a new model to an increasing percentage of traffic, monitoring metrics at each stage before proceeding to full deployment.

Key Features

Gradual traffic increase (5% → 25% → 50% → 100%)
Automated rollback on metric degradation
Risk mitigation through phased rollout
Real user feedback at each stage
Configurable promotion criteria
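
The staged promotion with automated rollback can be sketched as a simple loop. This is an illustrative sketch only: `run_canary`, the `error_rate_at` callback, and the 2% error threshold are assumptions made for the example; a real system would read metrics from its monitoring stack.

```python
from typing import Callable, Sequence

# Traffic schedule taken from the feature list above (percent of traffic).
STAGES = (5, 25, 50, 100)


def run_canary(
    stages: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.02,  # assumed promotion criterion
) -> int:
    """Walk through the stages and return the final canary traffic percentage.

    Returns 0 if any stage breaches the threshold (rollback), otherwise the
    last stage in the schedule.
    """
    for percent in stages:
        observed = error_rate_at(percent)  # hypothetical metric lookup
        if observed > max_error_rate:
            return 0  # roll back: all traffic returns to the stable model
    return stages[-1]


# A healthy canary is promoted all the way; a degraded one is rolled back.
healthy = run_canary(STAGES, lambda pct: 0.01)
degraded = run_canary(STAGES, lambda pct: 0.01 if pct < 50 else 0.05)
```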

Use Cases

Production ML model updates
New feature releases
Performance-sensitive changes
Risk-averse deployments

Related Patterns

Blue-Green, A/B Testing, Progressive Delivery

Shadow Deployment

Deploy a new model alongside production and send it a duplicate of live traffic. Users only ever see the production model's responses, so the new model can be tested safely.

Key Features

Zero user impact
Real production traffic patterns
Side-by-side performance comparison
Production environment validation
No rollback needed (not serving users)
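
A minimal sketch of the mirroring idea, assuming models are plain callables that return a score. `ShadowDeployment` and its fields are hypothetical names. The shadow call here is synchronous for simplicity; in practice mirroring is usually done asynchronously so it cannot add latency.

```python
from typing import Callable, List, Tuple

# Assumption: a model is a callable that maps a request dict to a score.
Model = Callable[[dict], float]


class ShadowDeployment:
    """Serves production responses while recording what the shadow model would say."""

    def __init__(self, production: Model, shadow: Model) -> None:
        self.production = production
        self.shadow = shadow
        self.comparisons: List[Tuple[float, float]] = []  # (production, shadow) pairs
        self.shadow_errors = 0

    def predict(self, request: dict) -> float:
        prod_out = self.production(request)
        try:
            # Give the shadow its own copy so it cannot mutate the live request.
            shadow_out = self.shadow(dict(request))
            self.comparisons.append((prod_out, shadow_out))
        except Exception:
            # A failing shadow must never affect the user-facing response.
            self.shadow_errors += 1
        return prod_out


deployment = ShadowDeployment(
    production=lambda req: req["x"] * 2.0,
    shadow=lambda req: req["x"] * 2.5,
)
result = deployment.predict({"x": 2.0})
```

The recorded pairs in `comparisons` are what a side-by-side comparison would be computed from.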

Use Cases

New model validation
Performance benchmarking
Regression testing in production
Pre-launch confidence building

Related Patterns

A/B Testing, Canary, Offline evaluation

A/B Testing

Split traffic between two model versions to compare performance, business metrics, and user experience statistically.

Key Features

Statistical significance testing
Configurable traffic splits
Business metric tracking
User segmentation support
Multi-variant support (A/B/C)
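
Two pieces of an A/B test lend themselves to a short sketch: sticky traffic assignment and a significance check. The example below assumes a binary conversion metric and uses a two-proportion z-test, which is one common choice rather than the only valid one. The function names and the experiment key are hypothetical.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to 'A' or 'B' so the assignment is sticky."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # roughly uniform in [0, 1)
    return "B" if bucket < treatment_share else "A"


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / se
    # For a standard normal, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


variant = assign_variant("user-42", "ranking-model-v2")
p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
```

Hashing the experiment name together with the user ID keeps assignments independent across experiments while guaranteeing a user always sees the same variant within one experiment.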

Use Cases

Model performance comparison
Algorithm experimentation
User experience optimization
ROI-focused deployments

Related Patterns

Multi-Armed Bandit, Canary, Champion/Challenger

Champion/Challenger

Continuously compare one or more new models (challengers) against the current best model (champion), automatically promoting a challenger when it outperforms the champion.

Key Features

Ongoing performance comparison
Automatic promotion on success
Multiple challenger support
Metric-based decision making
Built-in experimentation
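
The promotion rule can be sketched as a single comparison. This assumes one scalar metric where higher is better, and adds a minimum-gain margin (`min_gain`, a hypothetical parameter) so that noise alone does not trigger a promotion.

```python
from typing import Dict


def select_champion(champion: str, scores: Dict[str, float], min_gain: float = 0.01) -> str:
    """Return the model that should be champion after this evaluation round.

    `scores` maps every model name (champion included) to its metric value.
    A challenger is promoted only if it beats the champion by at least `min_gain`.
    """
    best = max(scores, key=scores.get)
    if best != champion and scores[best] - scores[champion] >= min_gain:
        return best
    return champion


# Hypothetical evaluation round with two challengers.
new_champion = select_champion(
    "champion_v3",
    {"champion_v3": 0.80, "challenger_a": 0.85, "challenger_b": 0.78},
)
```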

Use Cases

Continuous model improvement
AutoML model selection
Algorithm optimization
Adaptive systems

Related Patterns

A/B Testing, Multi-Armed Bandit, Canary

Multi-Armed Bandit

Dynamically allocate traffic based on real-time performance, automatically shifting more traffic to the best-performing model variant.

Key Features

Dynamic traffic allocation
Exploration vs exploitation balance
Real-time optimization
Adaptive to performance changes
Minimizes opportunity cost
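
One simple way to balance exploration and exploitation is an epsilon-greedy policy, sketched below. It is one of several bandit algorithms (Thompson sampling and UCB are common alternatives). `EpsilonGreedyRouter` and the variant names are hypothetical.

```python
import random
from typing import Dict, List


class EpsilonGreedyRouter:
    """Routes most traffic to the best variant so far, exploring with probability epsilon."""

    def __init__(self, variants: List[str], epsilon: float = 0.1, seed: int = 0) -> None:
        self.epsilon = epsilon
        self.counts: Dict[str, int] = {v: 0 for v in variants}
        self.mean_reward: Dict[str, float] = {v: 0.0 for v in variants}
        self._rng = random.Random(seed)

    def choose(self) -> str:
        if self._rng.random() < self.epsilon:
            return self._rng.choice(list(self.counts))  # explore: pick any variant
        return max(self.mean_reward, key=self.mean_reward.get)  # exploit: best so far

    def update(self, variant: str, reward: float) -> None:
        """Fold one observed reward into the variant's running mean."""
        self.counts[variant] += 1
        n = self.counts[variant]
        self.mean_reward[variant] += (reward - self.mean_reward[variant]) / n


# Simulated traffic: rewards are fixed per variant here purely for illustration.
simulated_reward = {"model_a": 0.2, "model_b": 0.8}
router = EpsilonGreedyRouter(list(simulated_reward), epsilon=0.1)
for _ in range(1000):
    chosen = router.choose()
    router.update(chosen, simulated_reward[chosen])
```

With `epsilon=0.1`, roughly one request in ten is routed at random; the rest go to whichever variant currently has the highest observed mean reward.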

Use Cases

Recommendation optimization
Content ranking
Ad serving optimization
Dynamic model selection

Related Patterns

A/B Testing, Champion/Challenger, Contextual Bandits

Model Testing Strategies

Strategy	Traffic Split	Risk Level	Rollback	Use Case
Shadow Deployment	Mirrored copy of traffic; 0% of responses served by new model	Very Low	N/A (not serving users)	Test new model without user-facing risk
A/B Testing	50/50 or custom split	Medium	Route traffic to old model	Compare model performance
Canary	Starts at 5-20% to new model, increased gradually	Low-Medium	Fast - reduce traffic	Gradual rollout with safety
Blue-Green	0% or 100%	High	Fast - switch back	Quick cutover with instant rollback
Multi-Armed Bandit	Dynamic based on performance	Medium	Automatic reduction	Optimize performance automatically

Progressive Rollout Phases

Phase 1: Internal

Audience: Dev/QA team

Traffic: 0%

Duration: Days

Test basic functionality

Phase 2: Alpha

Audience: Power users

Traffic: 1-5%

Duration: 1-2 weeks

Real-world validation

Phase 3: Beta

Audience: Select regions

Traffic: 10-25%

Duration: 1-2 weeks

Performance at scale

Phase 4: GA

Audience: All users

Traffic: 100%

Duration: Ongoing

Full production rollout

Pre-Deployment Checklist

Model Validation

✓ Offline metrics meet thresholds
✓ Model size appropriate for serving
✓ Latency requirements validated
✓ Bias and fairness checks passed
✓ Security scanning completed
✓ Input validation tested

Infrastructure

✓ Resource limits configured
✓ Auto-scaling policies set
✓ Health checks implemented
✓ Monitoring dashboards ready
✓ Alerting rules configured
✓ Rollback plan documented

Observability

✓ Metrics collection enabled
✓ Logging properly configured
✓ Distributed tracing set up
✓ Error tracking integrated
✓ Performance profiling ready
✓ Cost tracking enabled

Process

✓ Change request approved
✓ Stakeholders notified
✓ Deployment window scheduled
✓ Runbook updated
✓ On-call rotation set
✓ Communication plan ready
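
A checklist like the one above can also be enforced as an automated gate. The sketch below assumes each item has been reduced to a boolean by some upstream check; `failing_checks`, `ready_to_deploy`, and the sample data are hypothetical.

```python
from typing import Dict, List

# Assumption: category -> {checklist item -> whether it passed}.
Checklist = Dict[str, Dict[str, bool]]


def failing_checks(checklist: Checklist) -> List[str]:
    """List every item that has not passed, prefixed with its category."""
    return [
        f"{category}: {item}"
        for category, items in checklist.items()
        for item, passed in items.items()
        if not passed
    ]


def ready_to_deploy(checklist: Checklist) -> bool:
    """Deployment is allowed only when no item is failing."""
    return not failing_checks(checklist)


# Sample data using a few items from the checklist; the pass/fail values are made up.
checklist: Checklist = {
    "Model Validation": {"Offline metrics meet thresholds": True},
    "Infrastructure": {"Health checks implemented": True, "Rollback plan documented": False},
    "Process": {"Runbook updated": True},
}
blockers = failing_checks(checklist)
```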

Rollback Strategies

Traffic Shifting

Speed: Immediate

Method: Adjust load balancer weights to route traffic back to stable model version.

Pros: Instant, no downtime

Cons: Requires both versions running
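
A minimal sketch of weight-based rollback, assuming an in-process stand-in for the load balancer. `WeightedBalancer` and the version names are hypothetical; a real load balancer or service mesh exposes its own weight configuration.

```python
import random
from typing import Dict


class WeightedBalancer:
    """Picks a model version for each request in proportion to its weight."""

    def __init__(self, weights: Dict[str, float], seed: int = 0) -> None:
        self.weights = dict(weights)
        self._rng = random.Random(seed)

    def pick(self) -> str:
        versions = list(self.weights)
        return self._rng.choices(versions, weights=[self.weights[v] for v in versions])[0]

    def rollback_to(self, stable: str) -> None:
        """Send all traffic to the stable version; the others stay deployed at weight 0."""
        if stable not in self.weights:
            raise KeyError(f"unknown version: {stable}")
        for version in self.weights:
            self.weights[version] = 100.0 if version == stable else 0.0


balancer = WeightedBalancer({"v1-stable": 50.0, "v2-candidate": 50.0})
balancer.rollback_to("v1-stable")
```

Only the weights change, which is why this method is immediate and also why both versions must already be running.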

Model Registry Pointer

Speed: Fast (seconds)

Method: Update model registry to point to previous version, reload endpoints.

Pros: Simple, version controlled

Cons: Brief service interruption
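
The pointer-based rollback can be sketched with a toy in-memory registry. `ModelRegistry`, its methods, and the example URIs are hypothetical and do not correspond to any particular registry product; serving endpoints would still need to reload after the pointer moves.

```python
from typing import Dict, List


class ModelRegistry:
    """Tracks registered model artifacts and which version is currently promoted."""

    def __init__(self) -> None:
        self._artifacts: Dict[str, str] = {}  # version -> artifact URI
        self._history: List[str] = []         # promoted versions, oldest first

    def register(self, version: str, uri: str) -> None:
        self._artifacts[version] = uri

    def promote(self, version: str) -> None:
        if version not in self._artifacts:
            raise KeyError(f"unregistered version: {version}")
        self._history.append(version)

    @property
    def current(self) -> str:
        return self._history[-1]

    def rollback(self) -> str:
        """Move the pointer back to the previously promoted version."""
        if len(self._history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self._history.pop()
        return self._history[-1]


registry = ModelRegistry()
registry.register("v1", "models/example/v1")  # placeholder URIs
registry.register("v2", "models/example/v2")
registry.promote("v1")
registry.promote("v2")
restored = registry.rollback()
```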

Infrastructure Rollback

Speed: Slower (minutes)

Method: Redeploy previous container/pod configuration via IaC or K8s.

Pros: Full infrastructure revert

Cons: Takes longer, more complex