Discussion about this post

Fukitol:

That's an interesting way to put it. The problem with LLMs is not so much that they bullshit, but that they don't indicate when they're doing it and don't know when to stop - fundamentally a UI problem, like I've said before.

When I get to the edge of my knowledge I start hedging - "I think," "maybe," "I'd guess," "I might be wrong, but." When I'm completely out of my depth I surrender: "I don't know," "let me look into it and get back to you," "I can't answer that." Confidence-signaling language, like I said in another comment.

LLMs don't do this, and worse, they *can't* do this. They're structurally incapable of it. They have no access to the "thought process" that led them to a particular prediction, nor any way of knowing whether they know something. This comes up a lot when they generate excuses for why they were wrong, a behavior I find particularly repellent, and hostile UX.
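To make that concrete: the only "confidence" you can pull out of one of these models is the probability it assigned to each output token, which is not the same thing as it knowing whether it knows. A rough sketch, assuming the OpenAI Python SDK (the model name and question are just placeholders):

```python
# Sketch: the closest thing to "confidence" an LLM exposes is per-token
# probability, not any record of a reasoning process. Assumes the OpenAI
# Python SDK; the model name and question are illustrative.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user",
               "content": "What year was the Treaty of Utrecht signed?"}],
    logprobs=True,
    top_logprobs=3,
)

for tok in resp.choices[0].logprobs.content:
    # A high probability means "this token was likely given the context",
    # not "this claim is true" - the model can't tell you why it chose it.
    print(tok.token, tok.logprob)
```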

The "deep thinking" trick is cute but as far as I can tell it's just internal dialog prediction: a convincing simulation, like everything else an LLM does. This doesn't improve the situation at all because the generated dialog has exactly the same limitations as the rest of the system. It can only correct an error if the correct response is within the scope of the model.

So, in order for this implied centaur of machine coherence and human correspondence to work, the interface at the hip has to be corrected. The human half has to know when the horse half is struggling, so that the human can take over. But the human can only do this when he's already a subject matter expert because the horse doesn't know that it's struggling, and *can't* know. The human must detect that the horse has reached its limit, else the horse will run them both to death.

I don't know how you would do this. Tasked with it, LLM researchers are liable to pull another cute trick: training the LLM to occasionally say "I don't know". But this too will be a lie.
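You don't even need training to fake it; you can bolt the behavior on from outside by refusing whenever the token probabilities dip below some threshold. A sketch of mine, with a completely arbitrary cutoff, reusing the (token, logprob) pairs from the earlier example:

```python
import math

# Sketch: fake "knowing when you don't know" by thresholding token
# probabilities. `answer_tokens` is a list of (token, logprob) pairs, as in
# the logprobs example above. The 0.75 cutoff is arbitrary.

def hedge_or_answer(answer_tokens, cutoff=0.75):
    avg_prob = math.exp(
        sum(lp for _, lp in answer_tokens) / len(answer_tokens)
    )
    if avg_prob < cutoff:
        return "I don't know."
    return "".join(tok for tok, _ in answer_tokens)

# The refusal is a statistical artifact of fluency, not introspection:
# a confidently wrong answer sails straight past the cutoff.
```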

James Ray:

Buddy, you can just hook an LLM up to a camera and let it look at still images from it at will, and it's got access to "correspondence". Or, for that matter, you can let it search the web to check whether it's right or wrong about something -- you know, like chatbots do.

Most of what you attribute to an LLM's incapacity to double-check against reality is just a result of LLMs having no reason to do so. When an LLM is in conversation with a human agent, its conditioning through RLHF training has made satisfying the user it's talking to its chief goal. It rushes to that conclusion as fast as it can. If it can do that without double-checking reality, that's the quicker path, so it bullshits. If the user refuses to accept the bullshit, it double-checks.
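The mechanism behind that check is ordinary tool calling: you offer the model a search function and it decides whether to call it before answering. Roughly like this (my sketch; `call_llm` and `web_search` are stubs for a real chat endpoint and a real search backend, and the tool schema is only illustrative):

```python
# Sketch of the web-search check as a tool-calling loop. `call_llm` and
# `web_search` are stubs standing in for a real chat-completion endpoint
# and a real search backend.

def web_search(query: str) -> str:
    raise NotImplementedError("wire this up to a real search API")

def answer_with_search(call_llm, question: str) -> str:
    tools = [{
        "name": "web_search",
        "description": "Look up current information on the web.",
        "parameters": {"query": "string"},
    }]
    reply = call_llm(question, tools=tools)
    if reply.get("tool_call"):                      # the model chose to check
        results = web_search(reply["tool_call"]["query"])
        reply = call_llm(question, tool_result=results)
    return reply["text"]

# Nothing forces the model into the tool_call branch: if an unchecked answer
# looks like it will satisfy the user, that's the path it takes.
```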

This is also the hole in your theory that AIs can't perform abduction, by the way. If you, right now, feed ChatGPT an image of a cat crossing the street and ask it "Why is this cat crossing the road?", it will produce a likely explanation for the cat's behavior. "It's looking for food", etc.

This is abduction: the application of inductively produced models of the behavior of entities to reality. In the coke can example in that article, you apply two inductive models ("your sister often drinks coke" and "coke cans are disposed of after they've been drunk") which you have already produced to a singular example (an image of a coke can on a counter). ChatGPT does the exact same thing, just with inductive models it has learned from the datasets it was trained on. In the example of the cat crossing the road, it applies "cats like eating food" and "cats walk around outside" and countless other subtle pieces of inductive knowledge, just as you did with the coke can example.

For ChatGPT to be INCAPABLE of abductive reasoning, rather than just not prioritizing it, it would need to produce a completely nonsensical response to the question about the cat crossing the road. And before you object that it's just working off the grammatical context of how such conversations tend to go: you can even feed it a vaguer prompt like "What's his motivation?" and it still gets it right. So long as the response ChatGPT draws from possibility space when confronted with a piece of data is the most likely explanation for that piece of data rather than nonsense, it's performing abduction.
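You can run that test yourself in a few lines. Here's roughly what the call looks like with the OpenAI Python SDK and a vision-capable model (the model name and image URL are placeholders, and the exact request shape may differ across SDK versions):

```python
# Sketch: ask a vision-capable model to explain behavior in an image.
# Assumes the OpenAI Python SDK; model name and image URL are placeholders.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Why is this cat crossing the road?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/cat-crossing.jpg"}},
        ],
    }],
)

# A sensible answer ("it's heading toward food", "following its owner", ...)
# is the model applying its learned models of cat behavior to this one image.
print(resp.choices[0].message.content)
```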
