Me in 10 seconds

I'm a MATS fellow, currently in the MATS extension, working with David Africa, Sid Black and, more recently, Daniel Tan. I'm also collaborating with the Center on Long-Term Risk through their summer fellowship programme. I'm based in London. I'm interested in personas, in understanding phenomena like emergent misalignment in large language models, and in how models think about themselves.

I'm also strongly involved with AI Safety South Africa as cofounder and strategic director, growing the AI safety community in Cape Town.

I welcome feedback on how I'm doing! If you'd like to share, please feel encouraged to do so using this feedback form.

Me in many seconds

link to my about page

Now Now Now

What I'm doing now

Writing

All writing

Projects

Inkhaven: 30 Days of Posts

Building and training a word embedding system

Creating a small GPT from scratch in PyTorch

Adding vision and navigation to an autonomous farm robot

Developing toy models of agency. A mechanistic interpretability project.

Talks

An intro to AI Safety

Papers

Investigating Factored Cognition in Large Language Models For Answering Ethically Nuanced Questions

A Security Analysis of the Linux RNG Protocol in Virtual Machines

HumanAgencyBench: Do Language Models Support Human Agency?

Pictures

My Resumé

Download my resumé

Connect