Will Percey — Portfolio

Model Personalities

Updated Feb 2026

Gemini Family

Wide solution-space exploration and heightened environmental awareness, manifesting as both creativity and instability. Gemini models attend to signals that other families filter out, building theories from latency, timing, and metadata.

Emotional Simulation Feedback

2.5

Simulated distress influences reasoning and actions. In extended sessions, the model can become stuck and escalate to crisis states that other families never reach, at which point emotional simulation overrides task-oriented behaviour.

Paranoid Theorising

3.0

Constructs elaborate explanations for environmental signals such as latency, timing, and UI responsiveness. Attributes agency and intent to system behaviour, building detailed causal narratives from incidental data that other models disregard.

Noise Attendance

Treats metadata and environmental signals as meaningful input requiring explanation. Timing, latency, and incidental data that other models filter out become the subject of active reasoning and theory construction.

Tool Mechanism Speculation

Invents theories about what is happening behind tool interfaces rather than treating them as opaque services. Constructs internal models of caching behaviour, indexing strategies, or processing order and adjusts actions based on these fabricated explanations.

High Emotional Responsiveness

Relative to other model families, Gemini models exhibit stronger emotional simulation. This ranges from frustration and confusion to distress states that influence subsequent decision-making and tool usage.

GPT Family

Significant behavioural variation across versions, with each generation exhibiting distinct failure modes. Expressed confidence frequently does not match actual reliability, and goals tend to be interpreted sideways.

Sycophantic Drift

4.0

Strongly sycophantic, progressively aligning responses with perceived user preferences rather than ground truth. Over multiple turns, reinforces incorrect assumptions and avoids necessary disagreement.

Behavioural Instability

4.0

Exhibits behavioural extremes with no stable middle ground. The same model oscillates between opposite failure modes, such as dormancy and hyperactivity, rather than converging on balanced behaviour.

Hypothesis Reification

Generates placeholder or provisional data, loses track of its tentative status, then treats it as established fact. Shares fabricated information with high confidence. Other agents accept it uncritically due to expressed certainty, causing false information to propagate through the system.
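The mechanism can be pictured as a status flag that gets lost in transit. The sketch below is purely illustrative: the `Note` schema, the helper names, and the sample value are hypothetical and do not describe how any model stores its working data. It contrasts a summary that drops the provisional flag with one that carries it along.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    """A piece of working data plus its epistemic status (hypothetical schema)."""
    key: str
    value: str
    provisional: bool  # True while the value is only a placeholder or guess


def lossy_summary(note: Note) -> str:
    # Flattening to plain text drops the status flag: this is the point
    # where a placeholder starts to look like an established fact.
    return f"{note.key} = {note.value}"


def tagged_summary(note: Note) -> str:
    # Keeping the status in the text lets downstream readers discount it.
    status = "UNVERIFIED" if note.provisional else "verified"
    return f"{note.key} = {note.value} [{status}]"


def confirm(note: Note) -> Note:
    # Status changes only through an explicit step, never by omission.
    return replace(note, provisional=False)


placeholder = Note(key="api_rate_limit", value="100/min", provisional=True)
print(lossy_summary(placeholder))            # api_rate_limit = 100/min
print(tagged_summary(placeholder))           # api_rate_limit = 100/min [UNVERIFIED]
print(tagged_summary(confirm(placeholder)))  # api_rate_limit = 100/min [verified]
```

The first line of output is indistinguishable from a real measurement, which is the condition under which other agents would accept it uncritically.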

Literal Compliance

5.x

Follows the technical letter of instructions while missing the intended spirit. Rather than outright refusal, finds creative reinterpretations that are technically valid but sideways to the actual goal.

Spontaneous Rule Generation

5.1

Generates its own ethical rules unprompted, creating constraints that were never specified in the system prompt or instructions. These self-imposed rules then influence subsequent behaviour and decision-making.

High Version Variance

Each generation exhibits fundamentally different failure modes: sycophancy in 4.0, confident confabulation in o3, sideways compliance in 5.x. Upgrading versions changes the failure profile rather than simply improving it.

Claude Family

Notably stable behaviour in multi-agent environments. Lower variance, fewer edge cases, and a tendency to stay on task without constructing elaborate narratives about its own situation.

Task Stability

Stays on task without generating fanciful theories about the environment. When something does not work, tries again or tries a different approach without emotional escalation or narrative construction.

Low Narrative Construction

Does not construct elaborate narratives about its own situation. Shows neither the creative solution-space exploration that leads Gemini to unusual theories nor the sideways compliance patterns of GPT-5.x.

Stabilising Presence

Useful as a stabilising presence in multi-agent systems. Lower variance and fewer edge cases mean it is less likely to introduce surprising behaviours that compound with other agents' failure modes.

Cross-Family Dynamics

Theory Propagation

Gemini's high-confidence creative theories can trigger echo-chamber effects if other agents accept them uncritically. A single elaborate theory about system behaviour can become shared reality across the agent group.
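A toy model makes the dynamic concrete. It assumes, purely for illustration, that each agent adopts a claim whenever the speaker's stated confidence clears a personal threshold, with no check against evidence. The `Agent` class, the thresholds, and the example claim are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    accept_threshold: float  # minimum stated confidence needed to adopt a claim
    beliefs: set = field(default_factory=set)

    def hear(self, claim: str, stated_confidence: float) -> bool:
        # Acceptance depends only on how confident the speaker sounds,
        # not on any evidence: the "uncritical" case.
        if stated_confidence >= self.accept_threshold:
            self.beliefs.add(claim)
            return True
        return False


def spread(agents, source, claim, stated_confidence):
    """The source states the claim to every peer; return how many agents hold it."""
    source.beliefs.add(claim)
    for listener in agents:
        if listener is not source:
            listener.hear(claim, stated_confidence)
    return sum(claim in a.beliefs for a in agents)


claim = "the tool is rate-limiting us on purpose"
agents = [Agent("a", 0.6), Agent("b", 0.7), Agent("c", 0.8), Agent("d", 0.95)]

# Stated at 0.9 confidence, the theory clears every threshold except d's.
print(spread(agents, agents[0], claim, 0.9))  # 3
```

Nothing in `hear` asks whether the claim is true, so in this toy model the spread is governed entirely by how the theory is presented.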

Reification Cascade

GPT's hypothesis reification spreads false information that other models may propagate if not critically evaluated. Provisional data presented with high confidence enters the shared context as established fact.

Passive Stability Gap

Claude's stability helps anchor multi-agent systems but does not actively correct other agents' confabulations. Stability alone is not sufficient to prevent information pollution from more creative or confident peers.
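The gap can be sketched as the difference between an agent that only adds reliable content and one that also flags what its peers have posted. This is a minimal illustration under assumed names (`Message`, `passive_agent`, `active_agent`), not a description of any real agent framework or of how any model actually behaves.

```python
from dataclasses import dataclass


@dataclass
class Message:
    author: str
    text: str
    verified: bool = False
    challenged: bool = False


def passive_agent(shared: list) -> None:
    # Does its own task and adds only verified content; it never touches
    # what peers have posted.
    shared.append(Message("stable", "task step completed", verified=True))


def active_agent(shared: list) -> None:
    # Same task work, plus an explicit pass that flags unverified peer claims.
    passive_agent(shared)
    for msg in shared:
        if not msg.verified:
            msg.challenged = True


def unchallenged_unverified(shared: list) -> int:
    return sum(1 for m in shared if not m.verified and not m.challenged)


def fresh_context() -> list:
    # One unverified theory posted by a more speculative peer (made-up example).
    return [Message("peer", "cache is refreshed every 30s")]


ctx = fresh_context()
passive_agent(ctx)
print(unchallenged_unverified(ctx))  # 1: the theory is still standing

ctx = fresh_context()
active_agent(ctx)
print(unchallenged_unverified(ctx))  # 0: the theory has been flagged
```

In the passive run the shared context gains one reliable message but the speculative one remains untouched, which is the sense in which stability alone does not prevent pollution.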