Will Percey - Knowledge Base
Version: 2.0.0
LLM & Agentic AI
Clawification — the shift to agents with bash access and markdown skill files, replacing tool definitions and MCP. Covers the skill primitive, platform implementations (OpenClaw, NemoClaw, NanoClaw, ZeroClaw and more), channel integrations (WhatsApp, Telegram, enterprise), and OpenRouter Spawn deployment.
Design patterns for building AI agents including reflection, planning, tool use, multi-agent systems, and autonomous workflows.
Core components and building blocks of AI agents including memory systems, tool interfaces, reasoning engines, and execution frameworks.
Taxonomy of agentic loop patterns (refinement, research, verification, reflection, exploration, nested), loop anatomy, loop control mechanisms, real implementations including autoresearch and Claude Code /loop, and loop-specific failure modes.
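The refinement pattern from that taxonomy can be sketched as a generate/critique/revise cycle with an iteration budget. This is a minimal illustration, not the knowledge base's implementation; `generate` and `critique` are hypothetical callables standing in for LLM calls.

```python
def refinement_loop(generate, critique, max_iters=3):
    """Basic refinement loop: draft, critique, revise until the
    critique passes or the iteration budget is exhausted."""
    draft = generate(feedback=None)
    for _ in range(max_iters):
        ok, feedback = critique(draft)
        if ok:
            return draft
        # Feed the critique back into the next generation attempt.
        draft = generate(feedback=feedback)
    return draft  # best effort once the budget runs out
```

The explicit `max_iters` bound is one of the loop-control mechanisms the taxonomy covers: without it, a critique that never passes produces an unbounded loop.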
State machine executor that takes a declarative graph spec and runs it to completion. Five edge condition types, parallel fan-out/fan-in, crash recovery, continuous conversation threading, and two-tier retry with the judge system.
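A toy version of such an executor might look like the following. This is a sketch under assumed conventions (a dict-shaped spec, condition functions on edges); it shows only conditional edge traversal, omitting the five condition types, fan-out/fan-in, crash recovery, and retries described above.

```python
def run_graph(spec, state):
    """Walk a declarative graph spec to completion.

    spec maps node names to {"run": fn, "edges": [(cond_fn, next_node)]},
    plus a "__start__" entry naming the entry node. Each node's output is
    recorded in state; the first edge whose condition matches the output
    is taken, and a node with no matching edge terminates the run.
    """
    node = spec["__start__"]
    while node is not None:
        out = spec[node]["run"](state)
        state[node] = out
        nxt = None
        for cond, target in spec[node].get("edges", []):
            if cond(out):
                nxt = target
                break
        node = nxt
    return state
```

Keeping the spec declarative means the graph can be validated, visualised, or resumed after a crash without re-running completed nodes — the motivation for the recovery features listed above.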
Memory systems for AI agents including short-term, long-term, semantic, and episodic memory architectures.
Strategies for managing conversation history within token limits, from simple sliding windows to semantic chunking and retrieval-augmented context.
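The simplest of those strategies, a sliding window, can be sketched as follows. `count_tokens` is a stand-in for a real tokenizer, and the message shape assumed here (`role`/`content` dicts) is illustrative.

```python
def sliding_window(messages, max_tokens, count_tokens):
    """Trim oldest messages until the history fits the token budget,
    preserving a leading system message if present."""
    keep_sys = bool(messages) and messages[0]["role"] == "system"
    head = messages[:1] if keep_sys else []
    tail = list(messages[len(head):])
    total = sum(count_tokens(m["content"]) for m in head + tail)
    while tail and total > max_tokens:
        # Drop the oldest non-system message first.
        dropped = tail.pop(0)
        total -= count_tokens(dropped["content"])
    return head + tail
```

Dropping whole messages from the front is lossy; the semantic-chunking and retrieval-augmented variants mentioned above exist precisely because relevant context often lives in the messages a window discards.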
V2V pipeline architecture, modality taxonomy (V2V, TTS, STT, hybrid), major platforms (ElevenLabs, Vapi, Retell, Bland, OpenAI Realtime, LiveKit), tool calling patterns in voice, latency constraints, and design principles.
Voice-to-voice agent risk catalogue covering cloning attacks, liveness injection, cross-platform deepfakes, and transcript poisoning — with gap analysis and third-party security tooling from Pindrop, Reality Defender, ID R&D, and Nuance.
ElevenLabs voice agent testing covering scenario evaluation with LLM judges, tool call exact-match validation, full and partial conversation simulations, CI/CD pipeline integration, test generation from real conversations, and the full API & SDK surface.
AI safety techniques including confessions for self-reporting misbehavior, scheming detection, deliberative alignment, chain-of-thought monitoring, and building robust safety stacks.
Zero Trust principles applied to AI agents, treating the model as an untrusted actor inside the perimeter with per-action gating, circuit breakers, and policy enforcement.
Guardrails for AI safety including content moderation, PII protection, prompt injection defense, hallucination detection, and implementation patterns.
Block-level monitoring architecture with pre-hooks, stream safeguards, and output guardrails forming a three-layer safety system for runtime agent intervention.
Catalogue of 15 failure modes specific to AI agents — from context collapse and goal drift to coordination deadlocks and hallucinated affordances.
The first comprehensive standard for AI agent security, safety, and trustworthiness. Six domains, independent third-party certification, and mappings to ISO 42001, EU AI Act, NIST AI RMF, and OWASP.
OWASP LLM Top 10, prompt injection defenses, model security, adversarial attacks, and AI-specific security measures.
Anthropic's interpretability research traced Claude's internal computations across tasks — finding that it thinks in language-agnostic concepts, plans ahead in poetry, uses computation strategies it cannot describe, and sometimes constructs reasoning post-hoc. With implications for CoT trust, hallucination causes, and jailbreak vulnerabilities.
Understanding, detecting, and preventing hallucinations in AI systems with focus on agentic applications, grounding techniques, and production monitoring.
Behavioural patterns observed across Gemini, GPT, and Claude model families in multi-agent environments, including emotional simulation, hypothesis reification, and cross-family dynamics.
Temperature guidance for 80+ agent roles across 11 categories, from deterministic code generators to creative writers, with rationale for each recommendation.
Fairness, bias detection and mitigation, explainability, interpretability, and ethical AI development practices.
Retrieval-Augmented Generation patterns, vector search, context injection, and hybrid search for grounding LLM responses.
Graph databases, query languages (Cypher, SPARQL), GraphRAG, ontology design, and entity resolution patterns.
Zero-shot, few-shot, chain-of-thought prompting, ReAct patterns, prompt optimization, and security considerations.
LLM evaluation frameworks, benchmark datasets (MMLU, HumanEval), metrics (BLEU, BERTScore), and LLM-as-judge patterns.
LLM inference optimization including sampling parameters, quantization, parallelism strategies, KV-cache, Flash Attention, and throughput techniques.
Tiered judge system that evaluates worker output at the exit of every LLM turn. Structural checks, LLM-powered quality scoring against success criteria, and three verdicts (ACCEPT, RETRY, ESCALATE) controlling graph execution flow.
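The verdict flow can be illustrated with a minimal sketch. The names (`structural_check`, `quality_score`) and the threshold are assumptions for illustration; the real quality scorer is LLM-powered rather than a plain function.

```python
from enum import Enum

class Verdict(Enum):
    ACCEPT = "accept"
    RETRY = "retry"
    ESCALATE = "escalate"

def judge(output, structural_check, quality_score,
          threshold=0.7, attempts=0, max_retries=2):
    """Two-tier judge: a cheap structural check runs first, then a
    quality score against success criteria. Any failure triggers RETRY
    until the retry budget is spent, after which it ESCALATEs."""
    failed = not structural_check(output) or quality_score(output) < threshold
    if failed:
        return Verdict.RETRY if attempts < max_retries else Verdict.ESCALATE
    return Verdict.ACCEPT
```

Running the structural tier first keeps the expensive LLM-scored tier off obviously malformed output, and the attempt counter is what lets the verdict feed back into graph execution as the two-tier retry described above.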
Vision-language models, audio processing, unified embeddings, and multi-modal architectures for diverse data types.
Neural networks that simulate environments — from playable 3D worlds and game engines running on diffusion models, to photorealistic video generation from text prompts.
Intelligent document processing with OCR, vision-based parsing, table extraction, and RAG integration patterns.
Frameworks, tools, and guardrails for building applications with Large Language Models including prompt engineering and safety measures.
Agent platforms, orchestration frameworks, cloud ADKs, model-agnostic SDKs, and supporting tools across the AI ecosystem.