Me in 10 seconds
I'm a MATS fellow in the MATS extension, working with David Africa, Sid Black and, more recently, Daniel Tan. I'm also collaborating with the Center on Long-Term Risk through their summer fellowship programme. I'm currently based in London. I'm interested in personas and in understanding phenomena like emergent misalignment in large language models, as well as how models think about themselves.
I'm also strongly involved with AI Safety South Africa as cofounder and strategic director, growing the AI safety community in Cape Town.
I welcome feedback on how I'm doing! If you'd like to share, please feel encouraged to do so using this feedback form.
Me in many seconds
link to my about page
Now Now Now
What I'm doing now
Writing
A Case for Persona Robustness as a Research Area
Contributing to Technical Research in the AI Safety End Game
Lessons from my first 10 day Vipassana
Do Models Know They're Lying When Claiming Fake Identities?
Whole Brain Emulation as an Anchor for AI Welfare
How South Africa's Electricity Catastrophe Was (Mostly) Fixed
Revisiting GSM-Symbolic: Do 2026 Frontier Models Still Fail at Confounded Grade School Math?
Projects
Building and training a word embedding system
Creating a small GPT from scratch in PyTorch
Adding vision and navigation to an autonomous farm robot
Developing toy models of agency: a mechanistic interpretability project
Talks
Papers
Investigating Factored Cognition in Large Language Models For Answering Ethically Nuanced Questions
A Security Analysis of the Linux RNG Protocol in Virtual Machines
HumanAgencyBench: Do Language Models Support Human Agency?