Agents at Work — Research Series
From bias in job adverts to the behaviour of AI judgement systems
Over the past several months, this work has been developed as a series of linked studies examining how AI systems detect, interpret and evaluate age-related bias in recruitment language.
Today, the fourth phase of this work is complete.
This latest stage moves beyond observing variation to examining how AI judgements behave when tested under repeated evaluation, constraint and comparison.
The series as a whole traces a progression in focus.
It begins with identifying patterns in data, moves through interpretation and explanation, and arrives at the point where those judgements can be tested as behaviour over time.
Phase 4 marks the point at which this work moves from explanation to structured testing.
For ease of reference, the full series is set out below.
Phase 1 — Detection
Analysis of age-adjacent language across UK job adverts at scale.
→ [Phase 1 Report]
Phase 2 — Interpretation
Examination of how an AI system identifies and explains age-related signals.
→ [Phase 2 Report]
Phase 3 — Behavioural Audit
Introduction of a behavioural evaluation framework, testing stability, confidence and explanation under repetition and ambiguity.
→ [Phase 3 Report]
Phase 4 — Testing Under Repetition and Constraint
A structured behavioural audit examining how AI judgements hold under repeated evaluation, cross-model comparison and reduced context.
→ [Phase 4 Report]
Phases 1 to 4
The complete research arc, from detection to behavioural audit. Includes an ebook and guide.
→ [Phases 1 to 4]
Together, these phases trace the development of a behavioural approach to AI evaluation, moving from assessing individual outputs to observing system behaviour over time.


