Will Percey - Knowledge Base
Version: 2.0.0
Deep Learning & MLOps
Deep dive into neural network architectures including transformers, CNNs, RNNs, attention mechanisms, and modern foundation models.
Internal mechanics of the transformer architecture including self-attention, Q/K/V projections, positional encoding, encoder-decoder variants, and fine-tuning approaches.
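The Q/K/V projections mentioned above can be sketched in a few lines. This is a minimal single-head scaled dot-product attention, with randomly initialized projection matrices standing in for learned weights (all shapes and seeds here are illustrative assumptions, not from any particular model):

```python
import numpy as np

def self_attention(x, d_k=8, seed=0):
    """Single-head scaled dot-product self-attention (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    d_model = x.shape[-1]
    # In a real transformer these projections are learned; random here.
    W_q = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    W_k = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    W_v = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq, seq) token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V, weights                     # context vectors, attention map

x = np.random.default_rng(1).standard_normal((5, 16))  # 5 tokens, d_model=16
out, attn = self_attention(x)
```

Each output row is a convex combination of the value vectors, weighted by how strongly that token attends to every other token.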
Bidirectional Encoder Representations from Transformers. Architecture fundamentals, MLM and NSP pre-training, fine-tuning patterns, and the BERT family of variants.
Generative Pre-trained Transformers. Decoder-only architecture, next-token prediction, scaling laws, RLHF alignment, and the GPT lineage from GPT-1 to GPT-4.
Mixture of Experts architecture. Sparse activation, expert routing mechanisms, load balancing, and notable MoE models including Mixtral and DeepSeek.
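The sparse activation and routing described above can be sketched as top-k gating: each token's gate logits pick a small subset of experts, and only those experts run for that token. The experts below are hypothetical plain linear layers, and the expert count, top-k, and shapes are illustrative assumptions:

```python
import numpy as np

def moe_route(x, n_experts=4, top_k=2, seed=0):
    """Top-k mixture-of-experts routing with softmax gating (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    d = x.shape[-1]
    W_gate = rng.standard_normal((d, n_experts))
    # Hypothetical experts: simple linear maps standing in for expert FFNs.
    experts = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_experts)]
    logits = x @ W_gate                              # (tokens, n_experts) gate scores
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]
        gates = np.exp(sel - sel.max())
        gates /= gates.sum()                         # softmax over selected experts only
        for g, e in zip(gates, top[t]):
            out[t] += g * (x[t] @ experts[e])        # only top-k experts compute per token
    return out, top

x = np.random.default_rng(1).standard_normal((6, 8))
y, chosen = moe_route(x)
```

Load balancing (not shown) adds an auxiliary loss so tokens spread across experts rather than collapsing onto one.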
Building custom frontier models with Amazon Nova Forge and distributed training infrastructure with SageMaker HyperPod.
CNNs, vision transformers, object detection, segmentation, image classification, and modern computer vision architectures.
Collaborative filtering, content-based filtering, hybrid approaches, ranking algorithms, and personalization strategies.
Classical methods (ARIMA, Prophet) and deep learning approaches (LSTM, Temporal CNNs) for time series prediction.
LoRA, QLoRA, PEFT techniques, model adaptation, transfer learning, and parameter-efficient fine-tuning methods.
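The core LoRA idea is small enough to sketch: freeze the pretrained weight W and learn only a low-rank update BA, scaled by alpha/r. The rank, alpha, and init scheme below follow the common convention (B zero-initialized so the adapter starts as a no-op), but the class itself is an illustrative sketch, not any library's API:

```python
import numpy as np

class LoRALinear:
    """Frozen dense layer plus a low-rank trainable update (illustrative sketch)."""
    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                   # frozen pretrained weight (d_out, d_in)
        d_out, d_in = W.shape
        self.A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
        self.B = np.zeros((d_out, r))                   # trainable up-projection, zero-init
        self.scale = alpha / r

    def __call__(self, x):
        # y = x W^T + (alpha/r) * x A^T B^T; only A and B receive gradients.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

W = np.random.default_rng(1).standard_normal((16, 32))
layer = LoRALinear(W)
x = np.ones((2, 32))
y = layer(x)  # identical to the frozen layer until B is trained
```

With rank r much smaller than the weight dimensions, the trainable parameter count drops from d_out*d_in to r*(d_out+d_in).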
MLflow, Weights & Biases, experiment management, reproducibility, hyperparameter tracking, and model comparison.
GPU types, cluster management, resource allocation, scheduling, and infrastructure for training and inference workloads.
TensorRT optimization pipeline, CUDA memory management, stream concurrency, custom kernels, and GPU inference best practices.
Model quantization, TensorFlow Lite, ONNX Runtime, edge hardware (TPUs, NPUs), and deployment patterns for edge devices.
ML lifecycle management, CI/CD for ML, model versioning, experiment tracking, and production ML pipelines.
Inference servers, deployment patterns, model optimization, batching, and scaling strategies for serving ML models in production.
Feature stores, model registries, training infrastructure, MLOps automation, and platform architecture patterns.
Feature engineering platforms for managing, storing, and serving ML features with consistency across training and inference.
Data pipelines, storage layers, data versioning, lineage tracking, and architecture patterns for ML data platforms.
Latency optimization, throughput tuning, quantization, pruning, and performance engineering for ML systems.
Model optimization techniques including quantization, pruning, distillation, batching, caching, and performance tuning for inference.
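Of the techniques listed above, quantization is the most mechanical to illustrate. This is a symmetric per-tensor int8 scheme (one scale for the whole tensor, mapping the largest magnitude to 127); real toolchains typically add per-channel scales and calibration, which are omitted here:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (illustrative sketch)."""
    scale = np.abs(w).max() / 127.0                      # max magnitude maps to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal(1000).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()  # rounding error is bounded by half a step
```

The payoff is 4x smaller weights than float32 and integer matmuls on hardware that supports them, at the cost of the bounded rounding error above.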
Unit testing, integration testing, behavioral testing, performance testing, and validation strategies for ML models.
Model monitoring, drift detection, governance, quality gates, observability, and compliance for production ML systems.
ML model deployment strategies including A/B testing, shadow deployment, canary releases, and rollout patterns for production ML.
Production ML monitoring including drift detection, model performance tracking, data quality monitoring, and automated retraining triggers.
Governance frameworks, model documentation, risk management, compliance requirements, and AI policy development.
Differential privacy, federated learning, secure multi-party computation, and privacy-preserving techniques for ML.
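As a concrete anchor for differential privacy, here is the standard Gaussian mechanism for a scalar query: calibrate the noise scale to the query's sensitivity and the (epsilon, delta) budget, then add the noise to the true answer. The specific query value and budget below are illustrative assumptions:

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, seed=0):
    """(epsilon, delta)-DP Gaussian mechanism for a scalar query (illustrative sketch)."""
    # Classic calibration: sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    rng = np.random.default_rng(seed)
    return value + rng.normal(0.0, sigma), sigma

# Hypothetical query: a count with sensitivity 1, budget epsilon=1, delta=1e-5.
noisy, sigma = gaussian_mechanism(42.0, sensitivity=1.0, epsilon=1.0, delta=1e-5)
```

Tighter epsilon or delta forces larger sigma, which is the privacy/utility trade-off at the heart of DP training methods like DP-SGD.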