No One Knows How AI Works
Don’t believe me? Ask ChatGPT
I. A black box we can’t seem to open
There’s a fascinating research area in AI the press doesn’t talk about: mechanistic interpretability. A more marketable name would be “how AI works” or, to be rigorous, “how neural networks work.”
I took a peek at recent discoveries from the leading labs (Anthropic and OpenAI). What I’ve found intrigues and unsettles me.
To answer how neural nets work, we first need to know what they are. Here’s my boring definition: a brain-inspired algorithm that learns by itself from data. Its synapses (parameters) change value during training to model the data and adapt the network to a target task. One typical task is next-word prediction (language models like GPT-4); another is image classification, like recognizing cat breeds.
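To make “parameters change value during training” concrete, here’s a toy sketch in Python: a single “neuron” with one parameter, nudged by gradient descent to fit made-up data. This is a hypothetical illustration of the idea, not a real language model or anything from the labs mentioned above.

```python
# Toy illustration: one "neuron" with a single parameter w, trained to fit
# the made-up relationship y = 3x. Training = repeatedly nudging w so the
# prediction error shrinks. Real networks do this with millions of parameters.

def train(data, steps=1000, lr=0.01):
    w = 0.0  # the parameter starts at an arbitrary value
    for _ in range(steps):
        for x, y in data:
            pred = w * x               # the network's prediction
            grad = 2 * (pred - y) * x  # gradient of squared error w.r.t. w
            w -= lr * grad             # nudge w to reduce the error
    return w

data = [(1, 3), (2, 6), (3, 9)]  # samples of y = 3x
w = train(data)
print(round(w, 3))  # w converges toward 3.0
```

Nobody wrote “multiply by 3” into the program; the parameter drifted there on its own. That self-adjustment, scaled up billions of times, is all “learning” means here.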
A neural net isn’t magic, just a program stored as files on your PC (or in the cloud, which is slightly magical). You can go and look inside those files. You’ll find decimal numbers (the parameters), millions of them. But how do they recognize cats? The answer hides in plain sight, in numeric patterns we can’t comprehend. Humans can’t decode how they cause behavior. Not even our best…
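You can see the “just files full of decimals” point for yourself. A minimal sketch, using a hypothetical handful of parameters (a real model has millions, saved in binary formats rather than JSON, but the principle is the same):

```python
# Sketch of the "look inside the files" point: save a (pretend) trained
# model's parameters to disk, then open the file and see what's there.

import json
import os
import tempfile

# Hypothetical parameters standing in for a trained network's weights.
parameters = [0.4173, -1.2088, 0.0051, 2.7734]

path = os.path.join(tempfile.gettempdir(), "tiny_model.json")
with open(path, "w") as f:
    json.dump(parameters, f)

# "Go and look inside the file": nothing but opaque decimal numbers.
# No rule that says "whiskers", no cat-recognition logic anywhere.
with open(path) as f:
    print(f.read())
```

The numbers are all there is. What’s missing is any human-readable account of *why* those particular values produce the behavior they do.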