Computational Superposition in a Toy Model of the U-AND Problem

2025-03-27 | Boris

I’ve been working on some AI Safety research. It’s kinda dense for a blog, so I’m hosting it elsewhere.

It’s an investigation into how ML models compute boolean operations at the most fundamental level. Under an assumption of feature sparsity, which is common for large models, certain patterns appear.

Read on LessWrong

