26 Comments
Mar 14·edited Mar 14 · Liked by Erik J Larson

Really enjoyed this. Don't know if this will contribute or not, but the problem I've always had with the Black Swan metaphor is that it's an academic exercise quite unrelated to the real world. If swans were the only thing ever seen, and they had always been white, it would be somewhat understandable, though foolish, for someone to lay down a law that "all swans are white" (and therefore - on Taleb's thesis - a black swan represents something that "shocks the system"). But swans aren't "all there are." I've seen black dogs and white dogs, black horses and white horses, black sheep and white sheep, etc ... so upon seeing a lot of swans, I would never assume that they must all be white, no matter how many I'd seen and no matter how often they came up white, because there is nothing like a mathematical proof that compels swans to be white. All we would know (as I think you point out) is they have all come up white - voice of Homer Simpson here - *so far ...*

(This, btw, is part of CS Lewis's critique of Hume's argument against the miraculous based on 'uniform human experience.' Even if human experience on this matter was uniform, it would be no proof against the miraculous and, btw, the experience is *hardly* uniform.)

So the appearance of even one black swan that immediately and irrevocably wrecks that 'law' is only "shocking" to the degree one foolishly put one's confidence in a causal statement that wasn't logically compulsory. This would, I think, constitute another example of "us[ing] background knowledge to piece together a constantly moving and updating picture of the world."

author

Very good point. Let me think how I want to address this...

author

If examples accumulate, confidence grows. But I think you're 100% correct that there's a fundamental failure to see that the "facts aren't in" with inductive examples like swans. But who does this argue against? AI scientists who want to use enumeration to compute confidence, or their detractors? Thanks, Longway, this is a great way to get a handle on induction.
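
One rough way to make that "confidence grows, but the facts are never in" point precise (my own gloss here, not anything from Taleb or the post): with a uniform prior, Laplace's rule of succession says that after observing n white swans and no others, the probability that the next swan is white is

$$P(\text{white}_{n+1} \mid n\ \text{white so far}) = \frac{n+1}{n+2},$$

which climbs toward 1 as n grows but never reaches it, and which says nothing at all about possibilities the enumeration never contemplated.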

Mar 14·edited Mar 14

I guess what I'm hoping for is less hype from the full-mooners ("none" would really be nice) and a recognition of difficulties like this. There seems to be a genuine inability to understand what has been proven and/or done, and what hasn't. And plain, honest statements thereof. It used to be "everyone is conservative about what he knows best." Then "Publish or Perish" and gov't grants got involved. Eisenhower was quite right.

Mar 13 · Liked by Erik J Larson

Important stuff!

The questions that become salient now, of course, are these: What is it for an intelligence to "have a world-model"? And how does a world-model-having intelligence come to have that world-model?

Roughly, those who answer the first question along one dimension will answer the second by insisting we can program world-models, and those who answer the first question along a different dimension will answer the second by insisting world-models cannot be programmed. At those extremes, and in between them, much thinking has taken place and still needs to take place.

Looking forward to the upcoming posts!

author
Mar 15·edited Mar 16

You're a smart man, Eric.... I think what you said is exactly the future landscape for AI...


Erik - That’s a great cliff-notes version of inference and abduction. Keep on sharing the knowledge.

Mar 16 · Liked by Erik J Larson

Another way to draw that intellectual landscape: top-to-bottom thinking (CS/Math/Symbolism) or bottom-to-top thinking (IT/Reverse engineering/Connectionism).

To answer the question of the 'world-model', we cannot keep ignoring nature and the evolutionary process: how intelligence grew in animals along that process, what cortical columns are for, and how the cortical sheet is organised.

Connectionism is only the first step (cortical columns).

Mar 13 · Liked by Erik J Larson

Thank you Erik for the insights! Loved the Sherlock Holmes example for abduction, most understandable for me so far.

I'm looking forward to your future LLM articles. In case you'd be interested in some suggestions, here are a few:

1) Are the differences between the LLMs (GPT, Gemini, Claude) “just” due to differences in the fine-tuning process?

2) A related question: are the moments when the model appears smart/intelligent basically “just” the result of clever fine-tuning, or has the model really “learned” something?

3) I read an interesting thought somewhere that since machine learning is about finding patterns, marveling at “emergent” capabilities is like teaching a calculator that 1+1=2 and then being surprised that it can calculate 2+2=4, when we didn’t exactly teach it that. Is this parallel a good one? Or can anything indeed emerge in an LLM?

4) The idea of a multimodal LLM getting better and better with scale thanks to learning seems to me like expecting my computer to learn to think as soon as I store enough photos from around the world on my hard drive (because I would be giving the machine enough examples). Of course my parallel is exaggerated, but is it comparable in principle, or am I missing some crucial point about why the machine-learning approach is really so different from the hard-drive example?

Thanks!

author

Hi Ondrej,

I think you're right here that there's an emergent quality in the embeddings. The embeddings--I didn't cover this in my last post but will eventually--are the idea that "submarine" and "torpedo" are closer than "submarine" and "notebook." It gets tricky because "submarine" will also be "close" to "sandwich," and the very deep neural networks with attention get these embeddings right more often than not. The problem is (spoiler), they're not learning anything new, and they can't deal with novelty like different forms of inference can. It's a huge topic, Ondrej, and I appreciate these suggestions; they're much in line with what I have planned to write about.
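
A toy sketch of that "closeness" idea, using made-up 3-d vectors rather than real model embeddings, just to show what the geometry is doing:

```python
import numpy as np

# Hypothetical 3-d embeddings (real models use hundreds or thousands of dimensions).
embeddings = {
    "submarine": np.array([0.9, 0.8, 0.1]),
    "torpedo":   np.array([0.8, 0.9, 0.2]),
    "notebook":  np.array([0.1, 0.2, 0.9]),
}

def cosine(a, b):
    # Cosine similarity: near 1.0 means same direction, near 0.0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

for word in ("torpedo", "notebook"):
    print(word, round(cosine(embeddings["submarine"], embeddings[word]), 3))
# "torpedo" scores higher: it sits closer to "submarine" in the space.
```

The geometry encodes statistical closeness, not an inference about what submarines are, which is the point above about novelty.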

Mar 13 · Liked by Erik J Larson

(Fwiw, one of my duties is teaching symbolic logic, and I would be remiss if I didn't register a complication. "All humans are mortal; Socrates is human; therefore, Socrates is mortal," as it stands, is not an instance of modus ponens. It's a valid Aristotelian syllogism in the Barbara mood. Its validity can't be shown in propositional logic, but it can be shown in first-order quantificational logic, if you translate it into quantificational form. ("For all x, if x is human, then x is mortal," etc.) Therein, it can be shown valid by using modus ponens, but you have to apply universal instantiation first.)
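
For readers who want the regimented version, here is a minimal sketch of that two-step structure (my formalization, not part of the original comment), written in Lean:

```lean
-- Premises: every human is mortal, and Socrates is human.
-- Step 1: instantiate the universal at Socrates (universal instantiation).
-- Step 2: apply the resulting conditional to the second premise (modus ponens).
variable (Person : Type) (Human Mortal : Person → Prop) (socrates : Person)

example (h1 : ∀ x, Human x → Mortal x) (h2 : Human socrates) : Mortal socrates :=
  (h1 socrates) h2
```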

author
Mar 15·edited Mar 15

Yeeps. You are correct, Eric. I studied mathematical logic and clearly you are more careful than I am! Syllogisms work on categorical propositions; the existential and universal quantification that lets us prove it formally came later. I'm not trying to teach a class on the differences between propositional logic and quantificational/first-order logic, but I appreciate the clarity. Shall we quantify over the sets, then? :)

author

Was it Frege or Boole? I don't recall.


I take it you're asking about the one whose project fell prey to what came to be known as Russell's paradox? If so, that'd be Frege. To my mind, though, he's the more interesting mathematician/logician.

author

Thanks, Eric. I was less than clear. I was referring to the inclusion of existential and universal quantifiers, and the introduction of set theory, to handle proofs like the one we're discussing, the "Socrates" proof. I think it was Frege, yes, if in a different syntax than the one we settled on. I guess I could ask GPT-4....

Mar 16 · Liked by Erik J Larson

Well, we'd all be missing out if you were teaching logic instead of doing this!

author

Thanks, Eric.

Mar 13 · Liked by Erik J Larson

'I' is abducted me (consciousness)


Apart from abduction, there is also Plato’s concept of learning as remembering, which is worth exploring 😊

author

Great point, Jana. In computational systems I think this sort of gets no traction. But it points out our differences from computational systems. Thanks for sharing that.


Thanks for the insightful post! Perhaps you’d find something of interest in my two posts about AI, imagination, and intuition: https://open.substack.com/pub/unexaminedtechnology/p/the-two-is-we-need-to-include-in?r=2xhhg0&utm_medium=ios

author

I like Iain McGilchrist's notion that some of our deficits with "AI" stem from left-brain rules. Great stuff!

May 8·edited May 8

Found this ... interesting assertion today on "X" (Twitter):

"Steve Patterson

I don't know who needs to hear this, but LLMs have killed nominalism, and maybe moderate realism too.

Platonism is the conclusion.

Abstract patterns are so real that mindless machines can identify them.

Concepts have been mapped in high-dimensional space. They're real as the stars."


I found this very enlightening - where I must say I have only superficial knowledge about how LLMs work - but clear ideas on how human intelligence works, viz. via models of the world and of the actors significant to our lives - what you call "abduction" (I hadn't heard of it). In the context of information flow (especially, of course, disinformation) and interpretation (the higher form of perception), this now has to include models of how that information was created, i.e. with what intent, based on models of the sources. It's hard to envision AI developing that ability anytime soon - and becoming AGI - a term I was also unaware of - and I found the definition on Wikipedia quite laughable in terms of the tests envisioned there to differentiate it from "lower AI".


Mistakes, blind spots … or just points of failure (mathematical singularities) that are presented as plausible but upon verification are just undefined outputs.

author

Hi Jana,

I think this is a really important point. It's not even a "mistake"; it's, as you put it, an undefined output. There are different ways of putting this, but I think it captures the important idea that they're not making "mistakes"; they literally just fail because of a sequence of probabilities.
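
A toy sketch of what "a sequence of probabilities" means in practice, with a hypothetical hand-written next-token table standing in for a real model:

```python
import random

# Hypothetical next-token probabilities keyed by the last two tokens.
# A real LLM learns something like this implicitly, over an enormous vocabulary.
next_token_probs = {
    ("The", "capital"): {"of": 1.0},
    ("capital", "of"): {"Australia": 1.0},
    ("of", "Australia"): {"is": 1.0},
    ("Australia", "is"): {"Sydney": 0.6, "Canberra": 0.4},  # fluent beats correct
}

def generate(prompt, steps=4):
    tokens = list(prompt)
    for _ in range(steps):
        context = tuple(tokens[-2:])
        probs = next_token_probs.get(context)
        if probs is None:  # outside the learned distribution: an undefined output
            break
        choices, weights = zip(*probs.items())
        tokens.append(random.choices(choices, weights=weights)[0])
    return " ".join(tokens)

print(generate(["The", "capital"]))  # often prints "The capital of Australia is Sydney"
```

Nothing in the loop checks truth; a fluent wrong answer is just one more high-probability path, which is why "undefined output" fits better than "mistake."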
