Transformers Create Shapes of the Universe

2024-10-1 05:6:32 Author: danielmiessler.com

I was talking to a very smart friend on Friday about AI when he offhandedly said:

AIs don’t actually understand things.

I strongly disagree, and over the next couple of hours we got fully into it.

What follows are the arguments I gave him, along with the hour-long conversation I had with OpenAI’s Advanced Mode (mine is named DARSA) on the drive home, where I refined and red-teamed my ideas and crystallized them into what you see below.

My hope is that if you get to the bottom of this piece, you’ll see AI in a completely different way.

Argument 1: “AIs do understand.”
Argument 2: “Humans function like LLMs.”
Frame 1: “Token prediction is answer prediction.”
Frame 2: “LLMs aren’t just for language.”
Frame 3: “Model weights are models of reality.”
Summary
The Full AI Conversation with DARSA

Argument 1: “AIs do understand.”

We started on the topic of understanding, which my friend said humans have but AIs do not. My point on this was that the argument hinges on the definition of understanding. And basically, if you consider it to be processing for the purpose of accomplishing a goal, then we have it and AIs have it.

My argument here is that the “something extra” that people are reaching for when they say AIs don’t have “true” understanding is actually self-awareness. Which I don’t think should be included in the equation.

Here’s a snippet from my conversation with DARSA about this piece:

And now for the important bit! This is where I conclude my argument using everything we’ve built up so far.

So I think that’s pretty good. All of that for a definition of understanding. lol. And one that doesn’t try to smuggle in consciousness as a requirement.

And keep in mind I could be wrong about the importance of consciousness in the definition of understanding. But based on my current thinking, and this dialogue, I don’t think I am?

Argument 2: “Humans function like LLMs.”

I didn’t get to nail down that argument with my friend the way I did with DARSA because I switched tactics.

Instead of arguing that AIs were like humans, I switched to arguing and demonstrating that humans think like LLMs.

ME: Ok, let’s try this. Give me a list of your top 10 favorite restaurants.

HIM: (looking up and off to the side before rattling off a few restaurants)

(long pause)

ME: Ok, so where did those restaurants come from? Like, what just happened when you started making that list?

This was super funny. He literally started smiling and said.

Like he was completely tracking…before I even got through my question.

So I asked him to tell me where he thought that list came from. I forget what he said, but it was a perfect sentence.

HIM: (a flawless grammatical sentence)

ME: (staring at him like he just walked into a trap that he’s about to figure out)

ME: (pause)

ME: How did you make that sentence? Where did that sentence come from?

HIM: I fucking hate you. I don’t know. Fuck. Fuck.

ME: (smiling like an asshole) Would you say that sentence streamed out of you? Like one word after another? With no idea what the next word would be?

HIM: Fuck you. It’s exactly like an LLM.

ME: Yep.

We ended up getting interrupted at that point with another friend joining, and we changed the topic off of AI.

So now I’ll continue on with my latest and most recommended frames for this technology, escalating in “holy crap” as we go.

Frame 1: “Token prediction is answer prediction.”

This X post by Eliezer broke my brain a while back. So much so that I wrote a whole piece about it.

❝

Literally any well-posed problem is isomorphic to ‘predict the next token of the answer’.

Eliezer Yudkowsky

Basically, “just predicting the next token” isn’t anywhere near as trivial or silly as it sounds. Why? Because it all depends on what the sequence is!

Predicting the next token of a stupid poem about spaghetti is, in fact, silly. But predicting the next token in a sentence that describes the meaning of life, or the next action someone is about to take, is world-altering.

❝

LLMs “just” predict next tokens the same way Love is “just” chemicals in the brain. True, but very wrong.

So the question isn’t whether LLMs are predicting next tokens. They are.

The question is what sequences they can complete. Because those are answers.

🤔
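To make this frame concrete, here's a minimal Python sketch of "answering" as nothing more than repeated next-token prediction. Everything in it is an assumption for illustration: the tiny `CORPUS`, and the hypothetical helpers `train`, `predict_next`, and `complete`. It is a lookup over memorized contexts, not a Transformer, so it can't generalize the way a real model does. The only point is the loop: the answer comes out one token at a time.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only).
CORPUS = [
    "what is two plus two ? four .",
    "what is two plus three ? five .",
    "what is one plus one ? two .",
]

def train(corpus):
    """Count which token follows every context (run of tokens) seen in the corpus."""
    counts = defaultdict(Counter)
    for line in corpus:
        tokens = line.split()
        for i in range(1, len(tokens)):
            for start in range(i):
                context = tuple(tokens[start:i])
                counts[context][tokens[i]] += 1
    return counts

def predict_next(counts, tokens):
    """Pick the most common next token for the longest trailing context seen in training."""
    for start in range(len(tokens)):
        context = tuple(tokens[start:])
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return "."  # nothing matched, so end the sequence

def complete(counts, prompt, max_tokens=10):
    """Generate the answer one token at a time until the end marker appears."""
    tokens = prompt.split()
    answer = []
    for _ in range(max_tokens):
        next_token = predict_next(counts, tokens)
        tokens.append(next_token)
        answer.append(next_token)
        if next_token == ".":
            break
    return " ".join(answer)

model = train(CORPUS)
print(complete(model, "what is two plus three ?"))  # five .
```

The generation loop never changes. What changes is how good `predict_next` is, and that is where a real model's billions of weights come in.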

Frame 2: “LLMs aren’t just for language.”

One of the things I learned from Karpathy on a recent episode of the No Priors Podcast is that “LLM” is actually a pretty bad acronym for what modern AI does. Karpathy explains that Transformers are generalized to whatever you put through them.

It doesn’t have to be text. They learn from whatever you feed them. And that can be text, but it can also be images, sound, or whatever. And the more modes you give them (models are now becoming multi-modal), the more ways they’re learning about the world.
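Here's a rough Python sketch of why the input doesn't have to be text: once each kind of data is turned into a sequence of integers, the training task looks the same. The function names (`text_to_tokens`, `image_to_tokens`, `audio_to_tokens`, `training_pairs`) are hypothetical, and the encodings are deliberately crude assumptions. Real systems use learned tokenizers, image patches, and audio codecs rather than raw bytes and pixels.

```python
def text_to_tokens(text):
    """Text as a sequence of UTF-8 byte values (0-255)."""
    return list(text.encode("utf-8"))

def image_to_tokens(pixels):
    """A grayscale image (rows of 0-255 values) flattened row by row."""
    return [value for row in pixels for value in row]

def audio_to_tokens(samples, levels=256):
    """Audio samples in [-1.0, 1.0] quantized into `levels` integer buckets."""
    tokens = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        bucket = int((clipped + 1.0) / 2.0 * levels)
        tokens.append(min(levels - 1, bucket))
    return tokens

def training_pairs(tokens):
    """Every prefix paired with the token that follows it: the same task for any modality."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

print(training_pairs(text_to_tokens("hi")))                 # [([104], 105)]
print(image_to_tokens([[0, 255], [128, 64]]))               # [0, 255, 128, 64]
print(audio_to_tokens([-1.0, 0.0, 1.0]))                    # [0, 128, 255]
```

In every case the model is asked the same thing: given the sequence so far, what comes next?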

So then the question is, “What happens when it learns?” Like what are the models actually doing?

Turns out, they’re building representations of reality.

Frame 3: “Model weights are models of reality.”

One of the most insane things to me about AI models is that they’re basically just files. A model is basically a giant file full of numbers (usually stored in a binary format rather than as literal text), and those numbers are the resulting model weights after all the training.
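A minimal Python sketch of that idea, under loud assumptions: the "model" here is a made-up four-number linear function, and `to_bytes`, `from_bytes`, and `predict` are hypothetical helpers, not any real framework's API. Real weight files hold billions of numbers plus metadata, but the principle is the same. The bytes you would write to disk are just the numbers, and loading them back gives you the same behavior.

```python
import struct

# A made-up "model": three weights followed by one bias (illustrative values only).
params = [0.5, -1.25, 2.0, 0.75]

def to_bytes(numbers):
    """Pack floats as little-endian 32-bit values, one after another."""
    return struct.pack(f"<{len(numbers)}f", *numbers)

def from_bytes(data):
    """Unpack a byte string of 32-bit floats back into a list of numbers."""
    return list(struct.unpack(f"<{len(data) // 4}f", data))

def predict(model_params, inputs):
    """Weighted sum of the inputs plus the bias (the last parameter)."""
    *weights, bias = model_params
    return sum(w * x for w, x in zip(weights, inputs)) + bias

blob = to_bytes(params)        # this is what would be written to a weights file
restored = from_bytes(blob)    # this is what loading the file gives back

print(len(blob))                             # 16
print(predict(restored, [1.0, 2.0, 3.0]))    # 4.75
```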

But it’s not so much the file format that trips me out; it’s what the file represents.

The way I’ve come to understand this is kind of weird, but mind-blowing. It’s a multi-dimensional shape. Like Einstein’s curved space. But instead of curved in three or four dimensions, it’s curved in billions of dimensions.

So imagine something like that above, but instead of just x and y, and then z for up and down, now add time. And then billions more dimensions.

It’s really hard to imagine x, y, z, but then millions or billions more dimensions.

And just like Einstein’s space-time, the trick is how things interact with it. Einstein describes objects—and light—falling into the curve of space-time.

So the way I’m thinking about AI now is that the questions we ask it pour into multi-dimensional space and get routed through all its billions or trillions of multi-dimensional pathways.

Imagine the liquid of a question being poured into this, and then being routed down billions of multi-dimensional paths that represent the shape of reality.

A conceptual shape of reality, represented by an AI model

Imagine this level of branching, but not just left to right. But in all those billions of dimensions.

Imagine this, but with millions of dimensions instead of two

And this brings me to the point I next brought up when talking to DARSA.

This is describing how new content is incorporated into a model’s representation of reality.

Watch this bit. It’s not about any particular piece of work, or knowledge. It’s about its integration into the whole.

This is the key piece (my emphasis).

The model integrates Viktor Frankl's essay into a vast web of knowledge, understanding it in relation to everything else it has learned. This context allows the model to grasp the unique aspects of Frankl's ideas and how they fit into broader human experiences, literature, psychology, and more. It doesn't store the essay in isolation but embeds it within a rich, interconnected framework of knowledge.

gpt-4o

!!!

It’s not about storing or retrieving knowledge; it’s about understanding the relationships between everything it already understands, and then incorporating new knowledge into that model!

This is how it’s so wise, and how it “understands” what it’s producing content about. And this is why I—even more than before—see model weights as representations of reality.

When models learn something new, they’re not updating a database with a new row. They’re modifying their understanding of the universe, which is represented by that file full of numbers.

And when that update happens, it’s updating relationships, mappings, and connections. It’s updating its understanding of how everything in the world affects everything else.

It’s literally upgrading how it understands reality.

Absolutely mind-blowing.
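The difference between adding a database row and updating weights can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any production model is trained: a two-weight linear model, a single gradient step on the squared error 0.5 * (prediction - target)^2, and hypothetical names (`predict`, `sgd_step`, `facts`). What it shows is that learning one new example shifts the output for a different input that shares a feature, while a database lookup for the old key stays exactly the same.

```python
def predict(weights, x):
    """Linear model: weighted sum of the input features."""
    return sum(w * xi for w, xi in zip(weights, x))

def sgd_step(weights, x, target, lr=0.1):
    """One gradient step on 0.5 * (prediction - target)^2 for a single example."""
    error = predict(weights, x) - target
    return [w - lr * error * xi for w, xi in zip(weights, x)]

# Database-style learning: a new row leaves every other row untouched.
facts = {"old_input": 0.0}
facts["new_input"] = 1.0
print(facts["old_input"])  # 0.0

# Weight-style learning: the new example moves shared weights,
# so the answer for a different input moves too.
old_input = [1.0, 0.0]
new_input = [1.0, 1.0]  # shares its first feature with old_input

weights = [0.0, 0.0]
before = predict(weights, old_input)
weights = sgd_step(weights, new_input, target=1.0)
after = predict(weights, old_input)

print(before, after)  # 0.0 0.1
```

Nothing about `old_input` was ever shown to the model in that step, yet its prediction changed, because both inputs run through the same numbers.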

Summary

If you define understanding as “information processing across a massive context of interrelated concepts that helps something see patterns and function effectively”—modern AIs definitely have it.
The only arguable part of understanding that humans have that AIs don’t seems to be the ability to consciously reflect on things.
Since humans and machines are both mechanistic in nature, conscious reflection is just another type of data processing, and is not fundamentally different. The experience of consciousness may simply be an “after the fact” rubber stamp that’s added to decision outputs as they stream out of the black box of human decision-making.
We can see evidence of this black box by trying to recall our favorite 10 books or restaurants by heart. If you pay attention, you’ll notice you’re sending a query into a void and hoping for something to come back. And the thing that comes back won’t always be the same, or in the same order. You’re just as surprised as anyone. As any meditator will tell you, our thoughts are non-deterministic and come from a very mysterious place.
Even more LLM-like, when we speak a sentence in casual conversation, the words literally stream out of us—and we have no idea where they’re coming from. We don’t know why one word is used when another is not, or why we phrased it a certain way or not, etc. (Credit to Sam Harris for showing me this first in the meditation context).
In short, both we humans and these alien AIs are mechanistic information processors, and they both present us with inscrutable black boxes with which to work. In both cases we’re sending in queries and getting back something we don’t have much control over.
LLMs aren’t “just” predicting next tokens. Because predicting next tokens, if you have a complex enough model of the world, and a sufficiently smart question, is the same as predicting answers.
“LLMs” is actually a pretty bad name for these things in hindsight, because they’re not limited to just language. They work on sequences. Sequences of anything you can feed them. And the more types of sequences you feed them, the more they can learn about the world.
The output of the AIs—based on all that training—turns out to be complex representations of reality. Rather than being stupid databases that regurgitate facts, they’re actually world model creators (which is also probably similar to humans).
So, model weights are representations of reality stored as files full of numbers, and when models learn something new, they incorporate that knowledge into their existing mapping of concepts and the relationships between them. Updating model weights means updating their understanding of reality.
This is why bigger models—trained on even more data—are so exciting (and scary). The more we feed them, across multiple types of input, the more complex and representative their understanding of the world becomes.
Which means they get better and better at predicting the next token of answers. Answers to questions like: “How do we solve human aging?” “What is the meaning of life, and how best can I pursue it given my current skills and situation?”

My recommendation to you is to update your personal understanding of AI to incorporate these frames.

And if you like talking about this kind of stuff, we should connect.

—

The Full AI Conversation with DARSA

Ok, here’s the full conversation I referenced above.

I’m sharing the entire conversation I had with my AI, DARSA, here so you can hopefully see the arguments made above but in a more drawn-out and conversational structure. It’s gpt-4o Advanced Mode, for anyone interested.

—

“So I want to talk about how neural net models work, specifically the transformer model.”